basenc: emit partial output on invalid input #6008
Partial output on GNU

It looks like GNU basenc writes the bytes it has decoded so far before reporting invalid input:

$ echo -n 'aGVsbG8>' | LC_ALL=C basenc --base64url -d | hd
basenc: invalid input
00000000 68 65 6c 6c 6f |hello|
00000005

However, it does not create any output if a special base64-only character is detected:

$ echo -n 'aGVsbG8+' | LC_ALL=C basenc --base64url -d | hd
basenc: invalid input
$ echo -n 'aGVsbG8/' | LC_ALL=C basenc --base64url -d | hd
basenc: invalid input

… unless it's late enough in the stream:

$ cat <(yes | tr $'\ny' a | head -c5599) <(echo -n '.') | LC_ALL=C basenc --base64url -d | wc
basenc: invalid input
0 1 4199
$ cat <(yes | tr $'\ny' a | head -c5599) <(echo -n '+') | LC_ALL=C basenc --base64url -d | wc
basenc: invalid input
0 0 0
$ cat <(yes | tr $'\ny' a | head -c5599) <(echo -n 'a') | LC_ALL=C basenc --base64url -d | wc
0 1 4200
$ cat <(yes | tr $'\ny' a | head -c5599) <(echo -n 'a+') | LC_ALL=C basenc --base64url -d | wc
basenc: invalid input
0 1 4200

This is cursed. Replicating this behavior would require hard-coding this look-ahead, which seems like a bad idea ("surprising behavior is a bug"), and it is not documented in the help. (Note that
What I think is happening is that it loads a chunk of the input into a buffer and then, depending on the encoding, optionally checks that buffer before converting it. Then it loads the next part into the buffer and repeats. Chunking, at least, is probably a good idea, because we can handle larger inputs without storing the entire input in memory. The differences between the encodings might be a bug on GNU's side. We could try to bring this to their attention.
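
To make the chunking hypothesis concrete, here is a rough Python model of it. This is only a sketch, not GNU's actual code. The 5600-character chunk size is an assumption inferred from the transcript (a '+' as character 5600 suppresses all output, a '+' as character 5601 does not), and the names `decode_prefix`, `decode_stream` and `precheck` are made up for illustration. Padding ('=') is not handled, for brevity.

```python
# Hypothetical model of chunked base64url decoding with a per-chunk pre-check.
# Not GNU's implementation; CHUNK is an assumption inferred from the transcript.
CHUNK = 5600  # encoded characters per chunk (a multiple of 4)

ALPHABET = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"
INDEX = {c: i for i, c in enumerate(ALPHABET)}


def decode_prefix(chunk: bytes) -> tuple[bytes, bool]:
    """Decode chunk up to the first character outside the base64url alphabet.

    Returns (decoded bytes, True if the whole chunk was valid).
    """
    bits = 0
    nbits = 0
    out = bytearray()
    for c in chunk:
        value = INDEX.get(c)
        if value is None:
            return bytes(out), False  # keep what was decoded so far
        bits = (bits << 6) | value
        nbits += 6
        if nbits >= 8:
            nbits -= 8
            out.append((bits >> nbits) & 0xFF)
            bits &= (1 << nbits) - 1
    return bytes(out), True


def decode_stream(data: bytes, chunk_size: int = CHUNK,
                  precheck: bool = True) -> tuple[bytes, bool]:
    """Decode data chunk by chunk; returns (output written, success flag)."""
    out = bytearray()
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        # Assumed base64url-only step: reject the whole chunk up front if it
        # contains a character that is valid only in standard base64.
        if precheck and any(c in b"+/" for c in chunk):
            return bytes(out), False
        decoded, ok = decode_prefix(chunk)
        out += decoded
        if not ok:
            return bytes(out), False
    return bytes(out), True
```

Under these assumptions the model reproduces every case in the transcript: `decode_stream(b"aGVsbG8>")` yields `b"hello"` plus a failure flag, `b"aGVsbG8+"` yields no output at all, and a '+' only leaves the earlier 4200 bytes intact once it falls into the second chunk. That would explain the "look-ahead" as a side effect of the chunk boundary rather than a deliberate feature.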
This is a GNU behavior bug (i.e. this is a bug because GNU behaves differently, even though uutils' current behavior could be considered reasonable, too).
Example:
I would prefer to first land #6007 before starting work on this issue.