Fix PUA characters
Related: #6 (not in effect yet, planned for Qieyun.js v0.15)
syimyuzya committed Jul 8, 2024
1 parent 552fb69 commit 6ad4990
Showing 2 changed files with 14 additions and 5 deletions.
13 changes: 11 additions & 2 deletions build.py
@@ -25,6 +25,15 @@ def process_音韻地位(row: list[str]) -> str:
     return + + 等類 + +


+def fix_pua(s: str) -> str:
+    fixed = s.replace('\uee42', '𧞬').replace('\uece0', '勳')
+    for ch in fixed:
+        assert not (
+            0xE000 <= ord(ch) <= 0xF8FF
+        ), f'PUA character U+{ord(ch):04x} in {repr(s)}'
+    return fixed
+
+
 def main():
     小韻_data: dict[str, list[str]] = {}
     with open('src/rime-table-bfa9b50.tsv') as fin:
@@ -147,8 +156,8 @@ def main():
                 last_原小韻號 = 原小韻號
                 小韻內字序 = 0
             小韻內字序 += 1
-            row[1] = 小韻內字序
-            print(*row, sep=',', file=fout)
+            row[1] = str(小韻內字序)
+            print(fix_pua(','.join(row)), file=fout)


 if __name__ == '__main__':
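
The change to the output step can be tried in isolation. The sketch below copies `fix_pua` from this commit and applies it to a hypothetical row. The field values are reconstructed for illustration, with `\uee42` standing in for the unfixed character, and the variable names `row` and `line` are not from `build.py`. It also shows why `row[1]` now needs `str()`: `','.join()` accepts only strings, whereas the old `print(*row, sep=',')` converted each field itself.

```python
def fix_pua(s: str) -> str:
    """Replace the two known PUA characters, then reject any that remain."""
    fixed = s.replace('\uee42', '𧞬').replace('\uece0', '勳')
    for ch in fixed:
        assert not (
            0xE000 <= ord(ch) <= 0xF8FF
        ), f'PUA character U+{ord(ch):04x} in {repr(s)}'
    return fixed


# Hypothetical row for illustration; the counter starts out as an int.
row = ['130', '', '支', '生開三支平', '所宜', '襹', '', '\uee42襹毛羽衣皃', '']
row[1] = str(4)  # without str(), ','.join(row) would raise TypeError

line = fix_pua(','.join(row))
print(line)

# An unmapped PUA character makes the build fail instead of leaking through.
try:
    fix_pua('\ue000')
except AssertionError as exc:
    print('rejected:', exc)
```

Because the check is an `assert`, it is skipped when Python runs with `-O`; a build that must always enforce it would need an explicit `raise` instead.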
6 changes: 3 additions & 3 deletions 韻書/廣韻.csv
@@ -1220,8 +1220,8 @@
 130,1,支,生開三支平,所宜,釃,,下酒所宜切又山爾切七,
 130,2,支,生開三支平,所宜,簁,,下物竹器又所綺切,
 130,3,支,生開三支平,所宜,欐,,梁棟別名又禮麗二音,
-130,4,支,生開三支平,所宜,襹,,襹毛羽衣皃,
-130,5,支,生開三支平,所宜,褷,,上同,襹毛羽衣皃
+130,4,支,生開三支平,所宜,襹,,𧞬襹毛羽衣皃,
+130,5,支,生開三支平,所宜,褷,,上同,𧞬襹毛羽衣皃
 130,6,支,生開三支平,所宜,𧕯,,蚰蜒別名,
 130,7,支,生開三支平,所宜,籭,,𥂖也又山佳切,
 131,1,支,生合三支平,山垂,䪎,,鞍鞘一曰垂皃山垂切二,
@@ -16031,7 +16031,7 @@
 2368,9,霽,疑開四齊去,五計,甈,,破罌,
 2368,10,霽,疑開四齊去,五計,堄,,埤堄女牆也見博雅,
 2368,11,霽,疑開四齊去,五計,霓,,虹又音倪,
-2369,1,霽,見開四齊去,古詣,計,,籌計說文會也筭也又姓後漢有計子古詣切十二,
+2369,1,霽,見開四齊去,古詣,計,,籌計說文會也筭也又姓後漢有計子勳古詣切十二,
 2369,2,霽,見開四齊去,古詣,係,,連係,
 2369,3,霽,見開四齊去,古詣,繼,,紹繼俗作継,
 2369,4,霽,見開四齊去,古詣,繫,,縛繫又口奚胡計二切,
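
The assertion in `fix_pua` covers the BMP Private Use Area (U+E000 to U+F8FF). Unicode also reserves planes 15 and 16 for private use, and all three ranges share the general category `Co`. The sketch below is not part of this commit: it is a hypothetical helper, `find_private_use`, that reports where such characters occur in CSV lines rather than asserting, which may help when deciding what replacement to add. The sample lines are reconstructed for illustration.

```python
import unicodedata


def find_private_use(lines):
    """Yield (line number, column, code point) for each private-use character."""
    for line_no, line in enumerate(lines, start=1):
        for col, ch in enumerate(line, start=1):
            # 'Co' is the Unicode general category for private-use code points.
            if unicodedata.category(ch) == 'Co':
                yield line_no, col, f'U+{ord(ch):04X}'


# Hypothetical input: the first line still carries the unfixed U+EE42.
sample = [
    '130,4,支,生開三支平,所宜,襹,,\uee42襹毛羽衣皃,',
    '130,6,支,生開三支平,所宜,𧕯,,蚰蜒別名,',
]
hits = list(find_private_use(sample))
for line_no, col, code_point in hits:
    print(f'line {line_no}, column {col}: {code_point}')
```

To scan the real file, the same function could be given an open file object, for example `open('韻書/廣韻.csv', encoding='utf-8')`.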