Training set sometimes required for parsing #1

tdozat · 2017-06-18T17:35:50Z

The model saves a list of all the tokens in the vocabulary in save_dir/words.txt. If there's a case mismatch between the character model and the token model (that is, if you want the character model to be cased and the word vocabulary to be caseless), the code reads through the training set again to build up the character vocabulary. This is a problem when you only want to parse and the training set isn't available.

Solution: modify the code to save cased and caseless vocabularies in save_dir/words-cased.txt and save_dir/words-caseless.txt, and at parse time load whichever one is dictated by the cased configuration setting.
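
A minimal sketch of what that could look like, not the repository's actual code: the function names (`save_vocabs`, `load_vocab`, `char_vocab`) and the tab-separated `token<TAB>count` file layout are assumptions made for illustration. Only the two file names and the idea of switching on the cased setting come from the description above.

```python
import os
from collections import Counter


def save_vocabs(tokens, save_dir):
    """Write both vocabularies at training time (hypothetical helper)."""
    cased = Counter(tokens)
    caseless = Counter(tok.lower() for tok in tokens)
    files = (("words-cased.txt", cased), ("words-caseless.txt", caseless))
    for name, counts in files:
        path = os.path.join(save_dir, name)
        with open(path, "w", encoding="utf-8") as f:
            # Assumed layout: one "token<TAB>count" pair per line.
            for tok, count in counts.most_common():
                f.write(f"{tok}\t{count}\n")


def load_vocab(save_dir, cased):
    """Load whichever vocabulary the cased setting dictates."""
    name = "words-cased.txt" if cased else "words-caseless.txt"
    vocab = {}
    with open(os.path.join(save_dir, name), encoding="utf-8") as f:
        for line in f:
            tok, count = line.rstrip("\n").rsplit("\t", 1)
            vocab[tok] = int(count)
    return vocab


def char_vocab(vocab):
    """Derive the character vocabulary from a saved token vocabulary."""
    return sorted({ch for tok in vocab for ch in tok})
```

With both files on disk, a cased character model can call `char_vocab(load_vocab(save_dir, cased=True))` while the word model loads the caseless file, so parsing never has to touch the training set.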

tdozat added the bug label Jun 18, 2017

tdozat self-assigned this Jun 18, 2017
