Make the normalizer work with new Unicode 16 normalization behaviors #4860

Merged
9 commits merged into unicode-org:main from the norm16 branch on Oct 28, 2024

Conversation

hsivonen
Member

@hsivonen hsivonen commented May 2, 2024

Closes #4859

ICU4C PR coming up.

@hsivonen hsivonen requested a review from echeran as a code owner May 2, 2024 15:33
@hsivonen
Member Author

hsivonen commented May 2, 2024

ICU4C PR: unicode-org/icu#2994

@hsivonen
Member Author

hsivonen commented May 2, 2024

This reverts #4538, which turned out to be a bad idea.

@hsivonen
Member Author

hsivonen commented May 2, 2024

I've tested this with Unicode 16 data, but this PR doesn't include the new data.

@sffc
Member

sffc commented May 30, 2024

Conclusion: spend the time to benchmark these changes, in conjunction with #4967. Do this before 2.0 because it might involve a data struct change.

@robertbastian robertbastian added this to the ICU4X 2.0 ⟨P1⟩ milestone Aug 7, 2024
@robertbastian robertbastian added the waiting-on-author PRs waiting for action from the author for >7 days label Aug 13, 2024
@hsivonen hsivonen removed the waiting-on-author PRs waiting for action from the author for >7 days label Sep 18, 2024
@hsivonen
Member Author

We should merge this and #4878 now, regardless of the outcome of #4967. If that investigation shows that it makes sense to rearrange the bits, let's land that change on top of this one.

@hsivonen hsivonen added waiting-on-reviewer PRs waiting for action from the reviewer for >7 days C-collator Component: Collation, normalization labels Sep 20, 2024
@robertbastian robertbastian mentioned this pull request Oct 25, 2024
Member

@Manishearth Manishearth left a comment

Yes, this seems like a minor optimization at best. Let's land this; we should have Unicode 16 support for 2.0.

@robertbastian robertbastian merged commit c3da0ca into unicode-org:main Oct 28, 2024
28 checks passed
Manishearth pushed a commit that referenced this pull request Oct 28, 2024
This is the collator counterpart of #4860.

Co-authored-by: Robert Bastian
@hsivonen hsivonen deleted the norm16 branch October 29, 2024 15:44
Labels
C-collator Component: Collation, normalization waiting-on-reviewer PRs waiting for action from the reviewer for >7 days
Development

Successfully merging this pull request may close these issues.

Make the normalizer work with new Unicode 16 normalization behaviors
4 participants