-
Notifications
You must be signed in to change notification settings - Fork 230
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(chunker): Fix chunk length calculation for unicode characters #726
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ssr/src/utils/chunker.ts createChunks() in the diffs of this Pull Request should be updated as follows, due to TypeScript constraints:
export function createChunks(key: string, value: string, chunkSize?: number): Chunk[] {
const resolvedChunkSize = chunkSize ?? MAX_CHUNK_SIZE;
let encodedValue = encodeURIComponent(value);
if (encodedValue.length <= resolvedChunkSize) {
return [{ name: key, value }];
}
const chunks: Chunk[] = [];
while (encodedValue.length > 0) {
let encodedChunkHead = encodedValue.slice(0, resolvedChunkSize);
const lastEscapePos = encodedChunkHead.lastIndexOf('%');
// Check if the last escaped character is truncated.
if (lastEscapePos > resolvedChunkSize - 3) {
// If so, reslice the string to exclude the whole escape sequence.
// We only reduce the size of the string as the chunk must
// be smaller than the chunk size.
encodedChunkHead = encodedChunkHead.slice(0, lastEscapePos);
}
let valueHead;
// Check if the chunk was split along a valid unicode boundary.
while (encodedChunkHead.length > 0) {
try {
// Try to decode the chunk back and see if it is valid.
// Stop when the chunk is valid.
valueHead = decodeURIComponent(encodedChunkHead);
break;
} catch (error) {
if (
error instanceof URIError &&
encodedChunkHead.at(-3) === '%' &&
encodedChunkHead.length > 3
) {
encodedChunkHead = encodedChunkHead.slice(0, encodedChunkHead.length - 3);
} else {
throw error;
}
}
}
chunks.push({
name: `${key}.${chunks.length}`,
value: valueHead as string,
});
encodedValue = encodedValue.slice(encodedChunkHead.length);
}
return chunks.map((value, i) => ({
name: `${key}.${i}`,
value: value.value as string,
}));
}
@SyntheticGoop |
I've updated the types so that the package build on my system with |
const lastEscapePos = encodedChunkHead.lastIndexOf('%'); | ||
|
||
// Check if the last escaped character is truncated. | ||
if (lastEscapePos > resolvedChunkSize - 3) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just to clarify, we subtract 3 here because an escaped character would take up 3 characters in the string right (a % followed by 2 hexadecimal characters)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that's exactly it.
What kind of change does this PR introduce?
Bug fix
What is the current behavior?
The length of cookies with URI escaped characters can exceed the valid cookie length of 4096, causing parts of the split cookies to be rejected by the browser.
What is the new behavior?
Chunked cookies after URI encoding now fit within the chunk size specified.
Additional context
The reason for this issue occurring is that we are chunking base on lengths before escaping. While we have some buffer in the
MAX_CHUNK_SIZE
for longer unicode characters, this does not always work.For example
﷽
has a length in javascript of1
as well as match/(.{1})/g.exec("﷽")
to be$1 = ﷽
, but it will escape to%EF%B7%BD
which is 9 characters long.The current approach of chunking with a buffer for some unicode characters will never be "correct" unless we implement a buffer that accounts for every character being a maximum size unicode.
I've reimplemented the createChunks function to do the following:
There are currently other related solutions and issues:
PRs
Issues