Add support for custom tokenizer for BLEU #73

devrimcavusoglu · 2022-02-10T22:22:40Z

Because the Jury API requires every input string to be whole (not tokenized), the current BLEU implementation tokenizes on whitespace. However, one might want scores over smaller tokens, morphemes, or even characters rather than word-level BLEU. It would be great to support this by adding a tokenizer option to the BLEU score computation.
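To illustrate the request, here is a minimal, self-contained sketch of a sentence-level BLEU that accepts a tokenizer callable. It is not Jury's implementation or API: the names `sentence_bleu`, `whitespace_tokenizer`, `char_tokenizer`, and the `tokenizer` parameter are hypothetical, and the scoring is plain unsmoothed BLEU written only to show where a custom tokenizer would plug in.

```python
import math
from collections import Counter
from typing import Callable, List, Sequence

# Hypothetical tokenizer type: any callable mapping a raw string to tokens.
Tokenizer = Callable[[str], List[str]]


def whitespace_tokenizer(text: str) -> List[str]:
    """Word-level tokens (the behavior described in this issue)."""
    return text.split()


def char_tokenizer(text: str) -> List[str]:
    """Character-level tokens, ignoring whitespace."""
    return [ch for ch in text if not ch.isspace()]


def _ngrams(tokens: List[str], n: int) -> Counter:
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(
    prediction: str,
    references: Sequence[str],
    tokenizer: Tokenizer = whitespace_tokenizer,
    max_order: int = 4,
) -> float:
    """Unsmoothed BLEU for one prediction; inputs stay as whole strings."""
    pred = tokenizer(prediction)
    refs = [tokenizer(ref) for ref in references]
    if not pred or not refs:
        return 0.0

    log_precision_sum = 0.0
    for n in range(1, max_order + 1):
        pred_counts = _ngrams(pred, n)
        # Clip each n-gram count by its maximum count in any reference.
        max_ref_counts: Counter = Counter()
        for ref in refs:
            max_ref_counts |= _ngrams(ref, n)
        clipped = sum((pred_counts & max_ref_counts).values())
        total = sum(pred_counts.values())
        if clipped == 0 or total == 0:
            return 0.0  # no smoothing: any empty order zeroes the score
        log_precision_sum += math.log(clipped / total)

    # Brevity penalty against the reference length closest to the prediction.
    closest = min((len(r) for r in refs), key=lambda l: (abs(l - len(pred)), l))
    bp = 1.0 if len(pred) > closest else math.exp(1 - closest / len(pred))
    return bp * math.exp(log_precision_sum / max_order)


word_score = sentence_bleu("the cat sat on the mat", ["the cat sat on a mat"])
char_score = sentence_bleu(
    "the cat sat on the mat", ["the cat sat on a mat"], tokenizer=char_tokenizer
)
```

With this shape the caller still passes whole strings, and only the `tokenizer` argument changes the granularity. Note that a two-word input has no word-level 4-grams, so unsmoothed word-level BLEU is zero there, while the character-level score is not.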

devrimcavusoglu added the enhancement (New feature or request) and help wanted (Extra attention is needed) labels Feb 10, 2022
