Ucdxml 17v1 #1104

Draft: wants to merge 26 commits into base: main

@jowilco (Contributor) commented Apr 25, 2025

  1. Addresses https://www.unicode.org/L2/L2019/19270.htm#160-A10 by consuming the RegEx syntax for Unihan directly from UAX38.
  2. Resolved https://www.unicode.org/L2/L2024/24221.htm#181-A43 by removing kGB7.
  3. Resolved https://www.unicode.org/L2/L2024/24221.htm#181-A131 by updating the RegEx for the na attribute.
  4. Addresses https://www.unicode.org/L2/L2024/24221.htm#181-A133 by adding support for Unicode Version 17.0.
  5. Resolved https://www.unicode.org/L2/L2025/25085.htm#183-A100 by removing kJa.
  6. Resolved https://www.unicode.org/L2/L2025/25085.htm#183-A127 by adding kTayNumeric.
  7. Resolved https://www.unicode.org/L2/L2025/25085.htm#183-A180. All Unihan syntaxes are sourced directly from UAX38.
  8. Addresses "UCDXML: add Unikemet properties" (#921) by adding support for Unikemet.
  9. Resolved "Update UAX42 to document that common Unihan attributes are grouped starting with Unicode 17" (#1071) by adding a comment indicating that "Unihan attributes are applied at the group where applicable, similar to how non-Unihan attributes are applied at the group."
  10. Partially addresses "Update UAX42 to document which UCDXML fields correspond to UCD properties (UAX44) vs. which are “just data” corresponding to various UCD files" (#1049) by removing Deprecated properties, Normalization Corrections, and Emoji Sources.

@jowilco marked this pull request as draft April 25, 2025 00:17
@jowilco requested a review from markusicu April 25, 2025 00:17
@markusicu self-assigned this Apr 25, 2025
@markusicu (Member):

Hi @jowilco the CI failures suggest that you need to run GenerateEnums again, and also mvn spotless:apply.

@jowilco (Contributor Author) commented Apr 25, 2025

> Hi @jowilco the CI failures suggest that you need to run GenerateEnums again, and also mvn spotless:apply.

Agreed. This is still a draft pending the fixes for TR57. I'll definitely clean everything up before removing the draft status.

Comment on lines 310 to 314
//Deprecated
// createPropertyFragment(
// UcdProperty.ISO_Comment,
// SCHEMA.PROPERTIES,
// getFormattedSyntax(UcdProperty.ISO_Comment));
@markusicu (Member):
I suggest you just delete this altogether.

@jowilco (Contributor Author):
Done

Comment on lines 595 to 596
//Deprecated
// case FC_NFKC_Closure:
@markusicu (Member):
I suggest you just delete this altogether.

@jowilco (Contributor Author):
Done.

Comment on lines 1016 to 1026
//Deprecated
// + TRIPLELINE
// + getFormattedBoolean(UcdProperty.Expands_On_NFC)
// + DOUBLELINE
// + getFormattedBoolean(UcdProperty.Expands_On_NFD)
// + DOUBLELINE
// + getFormattedBoolean(UcdProperty.Expands_On_NFKC)
// + DOUBLELINE
// + getFormattedBoolean(UcdProperty.Expands_On_NFKD)
// + TRIPLELINE
// + getFormattedSyntax(UcdProperty.FC_NFKC_Closure);
@markusicu (Member):
I suggest you just delete this altogether.

@jowilco (Contributor Author):
Done

Comment on lines 1131 to 1133
//Deprecated
// + getFormattedBoolean(UcdProperty.Hyphen)
// + DOUBLELINE
@markusicu (Member):
Ditto.

@jowilco (Contributor Author):
Done

Comment on lines 1183 to 1185
//Deprecated
// + getFormattedBoolean(UcdProperty.Grapheme_Link)
// + DOUBLELINE
@markusicu (Member):
Ditto.

@jowilco (Contributor Author):
Done

Comment on lines 322 to 332
// Deprecated
// public static UCDPropertyDetail Hyphen_Detail =
// new UCDPropertyDetail(
// UcdProperty.Hyphen,
// VersionInfo.getInstance(2, 0, 0),
// 32,
// true,
// false,
// false,
// true,
// false);
@markusicu (Member):
I suggest you just delete this altogether.

@jowilco (Contributor Author):
I screwed up here - I should have added values for maxVersion, which I have done now.

Comment on lines 593 to 603
// Deprecated
// public static UCDPropertyDetail Grapheme_Link_Detail =
// new UCDPropertyDetail(
// UcdProperty.Grapheme_Link,
// VersionInfo.getInstance(3, 2, 0),
// 59,
// true,
// false,
// false,
// true,
// false);
@markusicu (Member):
I suggest you just delete this altogether.

@jowilco (Contributor Author):
I screwed up here - I should have added values for maxVersion, which I have done now.

true);
public static UCDPropertyDetail Expands_On_NFD_Detail =
false);
// Deprecated
@markusicu (Member):
I suggest you just delete this altogether.

@jowilco (Contributor Author):
I screwed up here - I should have added values for maxVersion, which I have done now.

false,
true);
false);
// public static UCDPropertyDetail kGB7_Detail =
@markusicu (Member):
I suggest you just delete this altogether.

false,
true);
false);
// Deprecated
@markusicu (Member):
I suggest you just delete this altogether.

@jowilco (Contributor Author):
I screwed up here - I should have added values for maxVersion, which I have done now.

@jowilco changed the base branch from main to trunk April 29, 2025 18:08
@jowilco changed the base branch from trunk to main April 29, 2025 18:08
Development

Successfully merging this pull request may close these issues.

Update UAX42 to document that common Unihan attributes are grouped starting with Unicode 17
2 participants