Skip UTF-8 BOM mark in EncodingDetectingInputStream
and default to UTF-8 in RewriteTest
#4546
As `EncodingDetectingInputStream` is only used as input for the parsers, we typically don't want to see any UTF-8 BOM. Additionally, platforms like .NET remove the BOM as well, so this change improves compatibility.

`EncodingDetectingInputStream` now also has less runtime overhead, especially when the charset was either already detected or specified by the caller.

Finally, `RewriteTest` now defaults to parsing source files as UTF-8, whereas it previously let `EncodingDetectingInputStream` try to detect the encoding. When another encoding is required (or the test explicitly wants the encoding to be detected), the test can use `RecipeSpec#executionContext(ExecutionContext)` together with `ParsingExecutionContextView#setCharset()`.
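
For reviewers unfamiliar with BOM handling, here is a minimal, standalone sketch of the general idea of dropping a leading UTF-8 BOM (bytes `EF BB BF`) before the content reaches a parser. It is not the code in this PR: `BomSkippingSketch`, `skipUtf8Bom` and `readUtf8` are hypothetical names used only for illustration, and the sketch assumes Java 9+ for `readNBytes` and `readAllBytes`.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not the actual EncodingDetectingInputStream.
final class BomSkippingSketch {

    // Length of the UTF-8 byte order mark (EF BB BF).
    private static final int BOM_LENGTH = 3;

    // Returns a stream positioned after a leading UTF-8 BOM, if one is present.
    static InputStream skipUtf8Bom(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, BOM_LENGTH);
        byte[] head = new byte[BOM_LENGTH];
        int n = pushback.readNBytes(head, 0, BOM_LENGTH);
        boolean hasBom = n == BOM_LENGTH
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;
        if (!hasBom && n > 0) {
            // Not a BOM: push the bytes back so the caller still sees them.
            pushback.unread(head, 0, n);
        }
        return pushback;
    }

    // Decodes the bytes as UTF-8, dropping a leading BOM if there is one.
    static String readUtf8(byte[] bytes) {
        try (InputStream in = skipUtf8Bom(new ByteArrayInputStream(bytes))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        byte[] withoutBom = {'h', 'i'};
        System.out.println(readUtf8(withBom).length());
        System.out.println(readUtf8(withoutBom).length());
    }
}
```

With the BOM consumed up front, both inputs decode to the same two-character string, so downstream consumers never see a stray `U+FEFF` at the start of the source.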