Adjustable multibucket sizes #7

tdozat · 2017-06-18T21:40:45Z

Currently, the model raises an error if not all buckets can be filled. This is rarely a problem when training, or when parsing a sequence of suitably large files. It does cause problems when the list of files to parse mixes large and small files: to parse the large files quickly without large memory consumption, you need to sort their sentences into multiple buckets, but a file with only one or two sentences cannot fill more than one or two buckets.

In order to handle a mix of large and small files, the system needs a way of setting up empty buckets.
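
As a rough illustration of the idea (not the parser's actual bucketing code), here is a minimal sketch in which buckets that cannot be filled are simply left empty instead of raising an error. The function name `assign_buckets`, its signature, and the even-step choice of length bounds are all assumptions made for this example.

```python
from typing import List, Tuple


def assign_buckets(lengths: List[int], n_buckets: int) -> Tuple[List[int], List[List[int]]]:
    """Hypothetical helper: split sentence indices into length-based buckets.

    Returns (max_lengths, buckets). Buckets that cannot be filled are left
    empty rather than triggering an error.
    """
    if n_buckets < 1:
        raise ValueError("n_buckets must be at least 1")
    unique = sorted(set(lengths))
    # Use no more buckets than there are distinct sentence lengths.
    n_used = min(n_buckets, len(unique))
    # Upper length bound of each used bucket, taken at even steps
    # through the sorted distinct lengths (an assumption of this sketch).
    max_lengths = [unique[((i + 1) * len(unique)) // n_used - 1] for i in range(n_used)]
    buckets: List[List[int]] = [[] for _ in range(n_buckets)]
    for idx, length in enumerate(lengths):
        # Place the sentence in the first bucket whose bound can hold it.
        b = next(i for i, m in enumerate(max_lengths) if length <= m)
        buckets[b].append(idx)
    return max_lengths, buckets
```

With four buckets requested, a two-sentence file such as `assign_buckets([5, 12], 4)` fills two buckets and leaves the other two empty, while a larger file can use every bucket, so the same bucket count works for both.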

tdozat added the enhancement label Jun 18, 2017

tdozat self-assigned this Jun 18, 2017
