Currently, UTF-8 decoding of WebSocket text messages is handled entirely on the Java side. This is fine for most uses, but some users may want to work with the bytes directly (e.g. with `utf8`) without Java meddling with codepoints outside CC's range.

This PR adds a binary flag to `http.websocket` that disables UTF-8 decoding of text messages on the Java side. When enabled, the binary flag of each message remains intact, but the data received will always be interpreted as binary data: no more `?`s replacing out-of-range codes. Sending text messages on a binary handle will cause Java to reinterpret the bytes as UTF-8 before sending, preserving manually-encoded codepoints. (This is an implementation detail, since `TextWebSocketFrame` takes a UTF-16 `String`, so we need to convert it from UTF-8 ourselves.) If that fails for some reason, the sender will fall back to normal bytewise strings.

The syntax for this flag matches `http.get`'s argument list, so there should be no confusion between the two lists, and no compatibility issues should arise: WebSockets still default to normal/text handles. I've also added a test suite (which duplicates the original WS test) that ensures that UTF-8 bytes are preserved through the socket.
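
For reviewers who want a picture of the send-side conversion described above, here is a minimal sketch. It is not the code in this PR: the class and method names (`Utf8Reinterpret`, `reinterpretAsUtf8`) are hypothetical, and it assumes the Lua string reaches Java as a `String` with one char per byte (values 0 to 255). It decodes those bytes strictly as UTF-8 and falls back to the original bytewise string if decoding fails.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not taken from the PR.
final class Utf8Reinterpret {
    /**
     * Treats each char of raw as a single byte and tries to decode the
     * resulting byte sequence as UTF-8. Returns raw unchanged if any char
     * does not fit in a byte or the bytes are not valid UTF-8.
     */
    static String reinterpretAsUtf8(String raw) {
        byte[] bytes = new byte[raw.length()];
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c > 0xFF) {
                return raw; // not a bytewise string, leave it alone
            }
            bytes[i] = (byte) c;
        }
        try {
            // REPORT makes the decoder throw instead of substituting
            // replacement characters for malformed input.
            return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes))
                .toString();
        } catch (CharacterCodingException e) {
            return raw; // fall back to the normal bytewise string
        }
    }

    public static void main(String[] args) {
        // Bytes E2 82 AC are the UTF-8 encoding of the euro sign (U+20AC).
        String encoded = "\u00E2\u0082\u00AC";
        System.out.println(reinterpretAsUtf8(encoded).equals("\u20AC"));

        // FF FE is not valid UTF-8, so the input comes back unchanged.
        String invalid = "\u00FF\u00FE";
        System.out.println(reinterpretAsUtf8(invalid).equals(invalid));
    }
}
```

The resulting `String` is what would then be handed to a text frame. Decoding with `REPORT` rather than the default replacement behaviour is what makes the fallback path detectable.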