Pinyin Parser Survey

epico edited this page Dec 8, 2010 · 1 revision

Pinyin Parser Survey in Open Source World

1. Feature comparison of existing pinyin parsers (listed alphabetically).

a. android pinyin:

  1. also called SCIM-GooglePinyin.

  2. seems to be unigram-based.

b. ibus-pinyin:

  1. supports single pinyin and double pinyin.

  2. supports pinyin correction.

  3. fuzzy pinyin support.

  4. uses a forward maximum matching algorithm for pinyin parsing.
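
A minimal sketch of forward maximum matching, for illustration only. It is not taken from the ibus-pinyin source: the function name `forward_max_match` and the tiny `SYLLABLES` table are assumptions, and a real parser would use the complete table of valid pinyin syllables.

```python
# Toy syllable table (assumption); a real parser loads every valid syllable.
SYLLABLES = {"ni", "hao", "zhong", "guo", "xi", "an", "xian", "xiang", "gu"}
MAX_LEN = max(len(s) for s in SYLLABLES)


def forward_max_match(text):
    """Scan left to right, always taking the longest syllable that matches."""
    keys = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink.
        for length in range(min(MAX_LEN, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if piece in SYLLABLES:
                keys.append(piece)
                i += length
                break
        else:
            # Nothing matched: keep the letter as an unparsed key and move on.
            keys.append(text[i])
            i += 1
    return keys


print(forward_max_match("zhongguo"))  # ['zhong', 'guo']
print(forward_max_match("xiangu"))    # ['xiang', 'u']
```

The second call shows the usual weakness of a greedy scan: taking "xiang" first leaves "u" unparsed, although "xian" + "gu" would cover the whole input.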

c. novel-pinyin:

  1. supports single pinyin.

  2. supports double pinyin, with multiple double pinyin schemes.

  3. fuzzy pinyin support.

  4. uses a dynamic-programming-like algorithm that chooses the parse covering the longest stretch of input with the fewest pinyin keys.
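
One way to read "longest pinyin sequence with least pinyin keys" is sketched below. This is an interpretation, not the novel-pinyin code: `dp_parse` and the toy syllable set are hypothetical, and the tie-breaking rule (fewest keys for each prefix, then the longest parsable prefix) is an assumption.

```python
def dp_parse(text, syllables):
    """Return (keys, rest): the fewest-key parse of the longest parsable prefix."""
    n = len(text)
    max_len = max(len(s) for s in syllables)
    # best[i] holds the fewest-key parse of text[:i], or None if unparsable.
    best = [None] * (n + 1)
    best[0] = []
    for i in range(1, n + 1):
        for j in range(max(0, i - max_len), i):
            if best[j] is not None and text[j:i] in syllables:
                candidate = best[j] + [text[j:i]]
                if best[i] is None or len(candidate) < len(best[i]):
                    best[i] = candidate
    # Prefer the longest prefix that has any valid parse.
    for i in range(n, -1, -1):
        if best[i] is not None:
            return best[i], text[i:]


# Toy syllable table (assumption), not the full pinyin table.
TOY = {"ni", "hao", "xi", "an", "xian", "xiang", "gu"}

print(dp_parse("xian", TOY))    # (['xian'], '')
print(dp_parse("xiangu", TOY))  # (['xian', 'gu'], '')
print(dp_parse("nihaoq", TOY))  # (['ni', 'hao'], 'q')
```

For "xian" the single key wins over "xi" + "an" because it uses fewer keys. For "xiangu" the parse "xian" + "gu" wins because it covers the whole input.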

d. scim-pinyin:

  1. novel-pinyin uses a slightly modified version of the pinyin parser inherited from scim-pinyin, so see the novel-pinyin section.

e. sunpinyin:

  1. supports single pinyin and double pinyin (with multiple double pinyin schemes).

  2. supports pinyin correction.

  3. fuzzy pinyin support.

  4. limited fuzzy pinyin segmentation.
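
Fuzzy segmentation is taken here to mean keeping more than one way to split an ambiguous input, such as "fangan" read as "fang'an" or "fan'gan". The sketch below only illustrates that idea by enumerating every valid split. It does not reflect how sunpinyin implements or limits the feature, and `all_segmentations` and the toy syllable set are hypothetical.

```python
def all_segmentations(text, syllables):
    """List every way to split text into syllables from the given set."""
    if not text:
        return [[]]
    results = []
    for k in range(1, len(text) + 1):
        head = text[:k]
        if head in syllables:
            # Extend each segmentation of the remainder with this head.
            for rest in all_segmentations(text[k:], syllables):
                results.append([head] + rest)
    return results


# Toy syllable table (assumption), not the full pinyin table.
TOY = {"fang", "fan", "gan", "an", "xi", "xian"}

print(all_segmentations("fangan", TOY))  # [['fan', 'gan'], ['fang', 'an']]
print(all_segmentations("xian", TOY))    # [['xi', 'an'], ['xian']]
```

A practical parser would cap or rank these alternatives rather than enumerate them all, since the number of splits can grow quickly with input length.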

Desired feature list for libpinyin:

  1. support single pinyin.

  2. support double pinyin, with multiple double pinyin schemes.

  3. fuzzy pinyin support.

  4. support pinyin correction.

  5. fuzzy pinyin segmentation.
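
Fuzzy pinyin support usually means treating commonly confused sounds, such as zh/z or an/ang, as equivalent when matching. A minimal sketch under that assumption follows. The pair tables and the names `canonical` and `fuzzy_equal` are illustrative and not part of libpinyin, and real input methods let the user enable each pair separately.

```python
# Illustrative fuzzy pairs (assumption): each variant maps to one canonical form.
FUZZY_INITIALS = {"zh": "z", "ch": "c", "sh": "s", "l": "n"}
FUZZY_FINALS = {"ang": "an", "eng": "en", "ing": "in"}


def canonical(syllable):
    """Rewrite a syllable so that fuzzy variants share the same spelling."""
    for src, dst in FUZZY_INITIALS.items():
        if syllable.startswith(src):
            syllable = dst + syllable[len(src):]
            break
    for src, dst in FUZZY_FINALS.items():
        if syllable.endswith(src):
            syllable = syllable[:-len(src)] + dst
            break
    return syllable


def fuzzy_equal(a, b):
    """True when two syllables differ only by the fuzzy pairs above."""
    return canonical(a) == canonical(b)


print(fuzzy_equal("zhang", "zan"))  # True
print(fuzzy_equal("shi", "si"))     # True
print(fuzzy_equal("ni", "hao"))     # False
```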