-
-
Notifications
You must be signed in to change notification settings - Fork 798
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BytesToNameCanonicalizer can mishandle leading NUL #148
Comments
Hmmh. I hate the fact that null Unicode character is technically valid. Almost enough to rather throw an exception if one is seen, and indicate Jackson will not allow it... Now: I am guessing this is found via generated test combinations, and not with actual usage? Just want to figure out likelihood of this being something found in the wild. |
That guess is correct. I'm not actually deliberately testing Jackson directly; I'm testing my own code and coming up with places where the test fails because the property being checked is Of course, it's possible come up with a situation where it's a potential security issue, due to allowing a user to change the interpretation of a chunk of JSON depending on what kind of parser is used. I admit it'd be hard to come up with a real case, but a super-contrived one's like this: say there's a service processor that operates in two phases. Phase one checks the credentials stored in a request, and if they're good, passes it unmodified to a second service that assumes the creds are good (because it knows the first layer has already handled it) and uses the username to decide whether to allow or deny a certain action. A valid but malicious user called "unprivileged" sends |
0xff can't occur in well-formed UTF-8, so it will not cause false-positives. Might resolve FasterXML#148
I'm not sure that patch is correct, but it seems to solve the immediate issue and pass the tests. I don't much like it; it feels to me like the padding should be handled in the BytesToNameCanonicalizer itself rather than in the user of the BTNC. |
Right, this is problematic, and good catch no matter how things end up. I started thinking that there may actually be relatively easy and inexpensive fix by using different padding value: since UTF-8 encoded content can not have couple of highest-valued bytes (for example, |
Heh. I should have read your update before adding a comment. Great minds think alike etc. :-) I see why you feel conflicted about patch: I agree that ideally it'd be hidden in symbol table. But then again, patch is delightfully small. One potential improvement could be to try to initialize quad value with padding, instead of applying patch in the end; but then again that'd actually do bit more work. And I suspect performance-wise difference probably couldn't even be measured. I'll think about this for a bit; I checked in (failing) unit test. I am still wondering if there is any real risk in adding this in patch version, but your patch seems so elegantly small that it seems unlikely there could be any risk. |
Actually looking at patch again, I think it is pretty optimal (from efficiency POV). No point in tweaking, at most could inline if |
@rjmac Do you think patch is complete, that is, can I just apply it as is? |
I think so. It passes Jackson's own test suite, and the snapshot jar I built also looks like it passes my tests that showed this up in the first place. |
Ok thanks. My main concern really is with patch release; 2.4.2 is finally resolving couple of regressions 2.4 had (due to various low-level performance optimizations). One possibility would be to do this for 2.5.0, just to eliminate even the small risk. |
I switched master branch to be 2.5.0-SNAPSHOT, and will target fix for 2.5, just to be extra sure. |
More from randomized testing: deserializing the document
{ "\u0000abc" : "a", "abc" : "b" }
via the UTF8StreamJsonParser will produce the same text for the second field name as the first. This is because the BytesToNameCanonicalizer will left-pad "abc" with a 0 byte to fit it in an int, which is indistinguishable from a string which really has a leading zero char.Here's a standalone test case for it:
The text was updated successfully, but these errors were encountered: