fix: do not parse base62 strings of unexpected length #8

nilium · 2024-02-23T17:01:47Z

Motivation

I noticed that from_base62 accepted inputs of arbitrary length and decoded them from base62 without validating the length first. Further, if a KSUID string decoded to more than 20 bytes, it would quietly accept this and copy only the last 20 bytes of data. As a result, rust-ksuid accepted strings of invalid length instead of rejecting them like the Segment KSUID implementation does.

I'm unclear if this is a behavior that rust-ksuid users depend on, so maybe there should be a default-on strict crate feature (or, preferably, a nonstrict feature that's off by default) so users have to opt into parsing invalid KSUIDs. If that's the right path forward, users should probably also be made aware that rust-ksuid might allocate much more memory than is needed for decoding a KSUID. I have a feeling this isn't a great way to OOM code that uses the crate, but sending enough unusually long KSUIDs might pose a risk to those applications, either from memory allocation or from time spent decoding long base62 strings.

Solution

Before decoding any base62 string, check that the input string's length is 27 bytes, and add tests to verify that long and short strings are rejected.

Reject base62 strings that would result in unexpectedly long payloads and potentially large allocations. Without this, it is possible to use a base62 string to pass in extra bytes of data that are unused (but do consume memory for a time). I don't know if this is something people rely on, but it seems unlikely that anyone would find this desirable. This might represent a possible way to OOM any code accepting KSUIDs that didn't already limit incoming payload sizes, but there are probably better options for that. Check the length before decoding to avoid allocating anything for strings of the wrong length, and add a test for both long and short base62 strings. Previously, length checks only happened after decoding, and decoding would only take the tail end of the decoded base62 string.
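
For illustration, the up-front check could look roughly like the sketch below. This is not the code from the PR: the names `BASE62_LEN`, `ParseError`, and `check_base62_len` are hypothetical and chosen only for this example. The only thing it assumes from the description above is that a valid base62 KSUID string is exactly 27 bytes long.

```rust
/// Length in bytes of a base62-encoded KSUID string.
const BASE62_LEN: usize = 27;

/// Hypothetical error type for this sketch (not the crate's real error type).
#[derive(Debug, PartialEq)]
enum ParseError {
    /// The input had the wrong length; carries the length that was seen.
    InvalidLength(usize),
}

/// Reject inputs of the wrong length before any decoding work or
/// allocation happens. Only the length is checked here; the actual
/// base62 decoding would run after this returns `Ok`.
fn check_base62_len(s: &str) -> Result<(), ParseError> {
    if s.len() != BASE62_LEN {
        return Err(ParseError::InvalidLength(s.len()));
    }
    Ok(())
}
```

Because `str::len` is a constant-time byte count, an over-long input is rejected without being scanned or decoded.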

svix-james

Thanks for the PR -- looks good to me!

I don't think there's any need to maintain compatibility for invalid lengths, but will defer to others. @tasn any thoughts there?

svix-jplatte · 2024-02-28T12:29:09Z

Hm, right below, after decoding the base62 string, we have some logic for ignoring extra bytes though. If we're going to error for over-long IDs anyway, we should at least remove that code. But it being there suggests that there is a compatibility concern.

nilium · 2024-02-28T17:59:10Z

I think the two later checks have different purposes now and can't really be removed, but for different reasons than they might have originally had:

Handling excess bytes is necessary because base_encode will sometimes produce vectors over 20 bytes in length (there are a few in the test cases JSON file that have 22 bytes; I assume this is just a side effect of 160 bits not mapping evenly to base62, so some extra padding bits are left over). So the function still has to get the result back down to 20 bytes for use as a KSUID. Right now this is done with a copy. It could also be done with try_into after slicing, but I have a feeling there isn't much difference in how it gets there.
The check for fewer bytes (the else branch when checking the decoded vec's length) is still needed because base_encode could, in case of a bug, return a shorter vec than can be handled. I don't know of any such bug; this is just about handling the case safely. At minimum it's needed as a bounds check. It should be incredibly unlikely, but it's better than panicking.
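
As a rough sketch of the two checks described above (not the crate's actual code; `KSUID_LEN` and `to_ksuid_bytes` are hypothetical names used only here), slicing off the tail and converting with `try_from` covers both the over-long and the too-short case without panicking:

```rust
use std::convert::TryFrom;

/// Size in bytes of a decoded KSUID.
const KSUID_LEN: usize = 20;

/// Reduce a decoded byte vector to exactly 20 bytes.
/// Longer inputs keep only their last 20 bytes; shorter inputs
/// yield `None` instead of panicking on an out-of-bounds slice.
fn to_ksuid_bytes(decoded: &[u8]) -> Option<[u8; KSUID_LEN]> {
    // `checked_sub` returns `None` when fewer than 20 bytes were decoded.
    let start = decoded.len().checked_sub(KSUID_LEN)?;
    // The slice is exactly 20 bytes long here, so the conversion succeeds.
    <[u8; KSUID_LEN]>::try_from(&decoded[start..]).ok()
}
```

With this shape, a 22-byte decode result yields its final 20 bytes, and a 19-byte result is reported as a failure the caller can turn into an error.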

svix-jplatte

Explanation sounds reasonable to me, but I don't have that much experience in this area, so I will leave merging up to @svix-james.

svix-james approved these changes Feb 23, 2024

svix-jplatte approved these changes Feb 29, 2024

tasn merged commit f71ac1d into svix:main Feb 29, 2024
3 checks passed

nilium deleted the check-input-base62-length branch February 29, 2024 16:11
