Reorder 廣韻字頭 and correct some 字 & 釋義 (WIP) #10

Draft. Wants to merge 20 commits into base: main
22 changes: 14 additions & 8 deletions DEVELOP.md
@@ -2,18 +2,24 @@

## Sources

- 廣韻(20170209).csv: From [廣韻字音表](https://zhuanlan.zhihu.com/p/20430939), created by poem.
- rime-table-0b69606.tsv: From [切韻新韻圖](https://phesoca.com/rime-table/) by unt, built from git commit `0b69606`.
- split.csv: Maintained here, ultimately also from 切韻新韻圖.
_poem_'s 廣韻 data:

- 廣韻(20170209).csv: From [廣韻字音表](https://zhuanlan.zhihu.com/p/20430939), created by _poem_

Maintained by NK2028:

- 小韻表.csv: 音韻地位 and 反切
- split.csv: Details of 小韻s with multiple 音韻地位s
- 字序表: Correct order of 廣韻's entries
  - `poem_*` fields refer to _poem_'s 廣韻字音表
  - `sbgy_*` fields refer to [宋本廣韻データ](https://kanji-database.sourceforge.net/dict/sbgy/index.html)
  - `ytenx_*` fields refer to [韻典網](https://ytenx.org/)
    - Data is taken from commit `d95d247` (2023-12-21), which differs from the current (as of Jan. 2025) deployed version (commit `3666370` 2020-03-23) by two 字頭s (小韻 1326 茅→芧, 小韻 2882 匕→𠤎)
- patches.csv: Corrections to _poem_'s data
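
The field prefixes listed above suggest a simple way to see which source each column of 字序表 comes from. The following is a minimal sketch, not part of the repository: the column names in the sample header are hypothetical, and only the three prefixes `poem_`, `sbgy_` and `ytenx_` are taken from the list above.

```python
import csv
import io
from collections import defaultdict

# Prefixes named in DEVELOP.md, mapped to the source each one refers to.
SOURCES = {
    "poem": "poem's 廣韻字音表",
    "sbgy": "宋本廣韻データ",
    "ytenx": "韻典網",
}


def group_columns(header):
    """Group column names by source prefix; anything else goes under 'other'."""
    groups = defaultdict(list)
    for name in header:
        prefix, sep, _rest = name.partition("_")
        key = prefix if sep and prefix in SOURCES else "other"
        groups[key].append(name)
    return dict(groups)


# Hypothetical header row: the real column names are an assumption here.
sample = "小韻號,poem_字頭,sbgy_字頭,ytenx_字頭\n"
header = next(csv.reader(io.StringIO(sample)))

for key, names in group_columns(header).items():
    print(SOURCES.get(key, "unprefixed"), names)
```

To run it against the real table, replace `io.StringIO(sample)` with an open file handle for 字序表, opened with `encoding="utf-8"` and `newline=""`.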

## Build

```sh
python build.py
python check.py
```

## Remarks

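- Entries that poem's table annotates 「應補」 (to be added): those given with a Unicode 字頭 all appear at the end of the original table (with .5 in the sequence number within the 小韻); those without one (字頭 described by IDS or in words) are still not recorded
- Entries that poem's table annotates 「應換序」 (order should be swapped) or 「順序應爲」 (order should be): none have been corrected, and the 釋義補充 field is also problematic (apparently due to differences in the base text of the earlier 《廣韻全字表》 by 有女同車)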
32 changes: 25 additions & 7 deletions README.md
@@ -3,12 +3,30 @@
A database of the Qieyun phonological system.

- 韻書
  - 王一:`王一.csv` (not completed)
  - 王三:`王三.csv` (entries within each 小韻 not yet proofread)
  - 廣韻澤存堂本:`廣韻.csv`
  - 王一:`王一.csv` (not completed)
  - 王三:`王三.csv` (entries within each 小韻 not yet proofread)
  - 廣韻 (澤存堂本, with corrections from 廣韻校本):`廣韻.csv`
- 韻圖
  - 韻鏡(嘉吉本):`韻鏡(嘉吉本).csv` (not completed)
  - 韻鏡(古逸叢書本):`韻鏡(古逸叢書本).csv`
  - 韻鏡(嘉吉本):`韻鏡(嘉吉本).csv` (not completed)
  - 韻鏡(古逸叢書本):`韻鏡(古逸叢書本).csv`
- 反切音韻地位
  - 王三:`王三反切音韻地位表.csv` (rev. Ayaka & unt)
  - 廣韻:`廣韻反切音韻地位表.csv` (beta)
  - 王三:`王三反切音韻地位表.csv` (rev. Ayaka & unt)
  - 廣韻:`廣韻反切音韻地位表.csv` (beta)

## About fields in 韻書/廣韻.csv

- 小韻號: May contain -a/-b/-c if a 小韻 has multiple 音韻地位s
- 小韻字號: May contain -a1, -a2, etc. for entries not present in 澤存堂本 but added back according to 廣韻校本
- 反切: May contain annotations:
  - Missing character (脫字): `[徒]候` (小韻 #3067 豆)
  - Erroneous character (訛字): `士<七>演` (小韻 #1625 淺)
  - 音韻地位 taken from another source: `姊宜⦉規⦊` (小韻 #133 厜)
  - Replaced with a nearly equivalent character, changing the 反切 result: `符咸(䒦)` (小韻 #1155 凡)
  - Replaced with a similar-sounding character, changing the 反切 result: `式之(脂)` (小韻 #157 尸)
  - Replaced with an equivalent character, leaving the 反切 result unchanged: `甫⦅府⦆妄` (小韻 #2918 放)
  - Replaced with a homophone, leaving the 反切 result unchanged: `呼東⦅紅⦆` (小韻 #32 烘)
  - Combined use: `以沼⦅小⦆<水>` (小韻 #1692a 鷕)
- 字頭當刪: If nonempty, indicates that this entry in 澤存堂本 is erroneous and should be removed according to 廣韻校本
- 釋義參照:
  - `上` if 釋義 refers to the entry above ("同上", "俗", "古文" etc.)
  - `下` if it shares 釋義 with the entry below ("並上同", "並古文" etc.)
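
The 反切 annotations listed above can be split into printed characters and editorial notes with a small parser. This is a sketch under assumptions rather than the project's own code: the README gives examples but no formal grammar, so the sketch assumes that `[X]` supplies a character missing from the print, that every other bracket pair annotates the character immediately before it, and that one character may carry several annotations. The function name and the label strings are made up for illustration.

```python
from dataclasses import dataclass, field

# Opening bracket -> (closing bracket, label). The labels are this sketch's own
# shorthand for the annotation kinds listed in the README.
BRACKETS = {
    "<": (">", "corrected"),               # 訛字
    "⦉": ("⦊", "other-source"),            # 音韻地位 from another source
    "(": (")", "substitute-changed"),      # 反切 result changes
    "⦅": ("⦆", "substitute-unchanged"),    # 反切 result unchanged
}


@dataclass
class Speller:
    char: str                                   # character as printed ("" if missing)
    notes: list = field(default_factory=list)   # (label, annotated character) pairs


def parse_fanqie(text):
    """Split an annotated 反切 string into spellers with their annotations."""
    spellers = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "[":
            # Assumed meaning: a character absent from the print, supplied by the editor.
            end = text.index("]", i)
            spellers.append(Speller("", [("missing", text[i + 1:end])]))
            i = end + 1
        elif ch in BRACKETS:
            close, label = BRACKETS[ch]
            end = text.index(close, i)
            if not spellers:
                raise ValueError(f"annotation without a preceding character: {text!r}")
            spellers[-1].notes.append((label, text[i + 1:end]))
            i = end + 1
        else:
            spellers.append(Speller(ch))
            i += 1
    return spellers


def printed_form(text):
    """Return the 反切 with all annotations stripped."""
    return "".join(s.char for s in parse_fanqie(text))


for example in ["[徒]候", "士<七>演", "甫⦅府⦆妄", "以沼⦅小⦆<水>"]:
    print(example, parse_fanqie(example))
```

Keeping the notes attached to each character, instead of resolving them to a single reading, leaves it to the caller to decide which annotation kinds should override the printed character.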