TextCleaner could benefit from templating #5

jirkamarsik · 2011-03-23T04:50:00Z

TextCleaner could send its output in UTF-32 so that the RoughTokenizer doesn't have to re-decode it from UTF-8. Since the TextCleaner must also be able to output UTF-8 for the Classifier stage (reading annotated data and aligning), the TextCleaner class would have to be heavily templated. The performance gain probably wouldn't be too high.
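
A rough sketch of what the templating might look like, not the actual TextCleaner code. Everything here is hypothetical: the names `Utf8Output`, `Utf32Output` and `TextCleanerSketch`, the assumption that the input is already decoded into code points, and the placeholder cleaning rule (dropping ASCII control characters). The point is only that the output encoding becomes a template parameter, so the same cleaning loop can feed either stage.

```cpp
#include <string>

// Hypothetical output policy: appends one code point encoded as UTF-8.
struct Utf8Output {
    using buffer_type = std::string;
    static void append(buffer_type& out, char32_t cp) {
        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
};

// Hypothetical output policy: stores the code point as-is (UTF-32),
// so a downstream consumer needs no decoding step.
struct Utf32Output {
    using buffer_type = std::u32string;
    static void append(buffer_type& out, char32_t cp) { out.push_back(cp); }
};

// Sketch of a cleaner templated on its output encoding.
// Assumption: the input has already been decoded to code points.
template <typename Output>
class TextCleanerSketch {
public:
    typename Output::buffer_type clean(const std::u32string& code_points) const {
        typename Output::buffer_type out;
        for (char32_t cp : code_points) {
            // Placeholder cleaning rule: drop ASCII control characters
            // other than newline and tab.
            if (cp < 0x20 && cp != U'\n' && cp != U'\t') continue;
            Output::append(out, cp);
        }
        return out;
    }
};
```

With this shape, `TextCleanerSketch<Utf32Output>` would feed the RoughTokenizer and `TextCleanerSketch<Utf8Output>` would feed the Classifier stage. The cost is that every member touching the output buffer has to live in the template.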

jirkamarsik · 2011-05-08T00:49:51Z

During a potential rewrite session (not too likely in the coming weeks), it could be useful to actually switch from UTF-8 to UTF-32 for most of the application.
