libpinyin TODOs
Peng Wu edited this page Jul 3, 2012
- Take a keystroke sequence as input and output a list of candidate words. (Maybe in the input method engine.)
- Berkeley database replacement. (Optional.)
- Tri-gram support.
Note: the task order is unimportant.
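As a rough illustration of the first task (keystroke sequence in, candidate list out), here is a minimal sketch. It is not libpinyin code: the `SYLLABLES` and `PHRASES` tables and the `segment` and `candidates` functions are hypothetical names made up for this example, and a real engine would rank candidates with the language model instead of listing them in table order.

```python
# Hypothetical toy tables; a real engine loads these from its dictionaries.
SYLLABLES = {"ni", "hao", "men", "ma"}
PHRASES = {
    ("ni", "hao"): ["你好"],
    ("ni",): ["你", "尼"],
    ("hao",): ["好", "号"],
}


def segment(keys):
    """Return every way to split the keystrokes into known syllables."""
    if not keys:
        return [[]]
    results = []
    for end in range(1, len(keys) + 1):
        head = keys[:end]
        if head in SYLLABLES:
            for rest in segment(keys[end:]):
                results.append([head] + rest)
    return results


def candidates(keys):
    """List words matching the leading syllables, longest match first."""
    found = []
    for syllables in segment(keys):
        for length in range(len(syllables), 0, -1):
            for word in PHRASES.get(tuple(syllables[:length]), []):
                if word not in found:
                    found.append(word)
    return found
```

Calling `candidates("nihao")` on these toy tables lists the two-syllable phrase first, followed by the single-syllable words for `ni`.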
- Training on a large raw web corpus. (In progress.)
- Try to support fuzzy pinyin segmentation. (Initial support done.)
- Support the back-off weight (bow) value of the back-off model, using either:
  a. a class inherited from the n-gram class; or
  b. a new n-gram class rewritten for the back-off model.
- Compute the bow value and store it in a sub-class of n-gram.
- Back-off pinyin lookup algorithms.
- Entropy-based pruning. (Maybe optional.)
Note: the above tasks are for backward compatibility with sunpinyin. Some items are already partially implemented in sunpinyin, and we will try to port them to libpinyin.
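To make the bow-related tasks above more concrete, here is a minimal sketch of a back-off bigram model: it computes a bow value per context and uses it during lookup. This is an illustration only, not sunpinyin or libpinyin code. The function names (`train_backoff_bigram`, `backoff_prob`) are hypothetical, and the use of simple absolute discounting with `discount=0.5` is an assumption chosen for brevity.

```python
from collections import Counter, defaultdict


def train_backoff_bigram(sentences, discount=0.5):
    """Build unigram probs, discounted bigram probs, and bow per context.

    Assumption: absolute discounting; the real projects may smooth differently.
    """
    unigram_counts = Counter()
    bigram_counts = defaultdict(Counter)
    for words in sentences:
        unigram_counts.update(words)
        for prev, cur in zip(words, words[1:]):
            bigram_counts[prev][cur] += 1

    total = sum(unigram_counts.values())
    unigram = {w: c / total for w, c in unigram_counts.items()}

    bigram, bow = {}, {}
    for prev, followers in bigram_counts.items():
        context_total = sum(followers.values())
        bigram[prev] = {
            w: (c - discount) / context_total for w, c in followers.items()
        }
        # Probability mass freed by discounting in this context.
        left_over = 1.0 - sum(bigram[prev].values())
        # Unigram mass of the words NOT seen after this context.
        unseen_unigram = 1.0 - sum(unigram[w] for w in followers)
        # bow rescales unigram probs so the context distribution sums to 1.
        bow[prev] = left_over / unseen_unigram if unseen_unigram > 1e-12 else 0.0
    return unigram, bigram, bow


def backoff_prob(model, prev, cur):
    """P(cur | prev): use the bigram if seen, else bow(prev) * P(cur)."""
    unigram, bigram, bow = model
    seen = bigram.get(prev, {})
    if cur in seen:
        return seen[cur]
    # A context never seen in training backs off with weight 1.
    return bow.get(prev, 1.0) * unigram.get(cur, 0.0)
```

A small usage example with made-up training data:

```python
model = train_backoff_bigram([["ni", "hao"], ["ni", "men"], ["hao", "ma"]])
p_seen = backoff_prob(model, "ni", "hao")    # discounted bigram estimate
p_unseen = backoff_prob(model, "hao", "ni")  # bow("hao") * P("ni")
```

Storing `bow` next to the bigram table mirrors the idea of keeping the value in a sub-class of the n-gram class, so the lookup needs only one extra multiplication when it backs off.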