
A bunch of highly-requested features

Released by @minimaxir on 20 May 03:53 · commit 7dc8210

Adapted a few functions from Neil Shepperd's fork:

  • Nucleus sampling (top_p) when generating text, which can produce surprisingly different output; setting top_p=0.9 works well. Supersedes top_k when both are set (see the sketch after this list). (#51)
  • An encode_dataset() function to pre-encode and compress a large dataset before loading it for finetuning. (#19, #54)
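
A minimal sketch of both features, assuming a model has already been finetuned under run_name="run1"; the file names and the out_path value are placeholders:

```python
import gpt_2_simple as gpt2

# Pre-encode a large dataset into a compressed .npz once, so later
# finetuning runs can skip the slow text-encoding step.
gpt2.encode_dataset("large_dataset.txt", out_path="large_dataset.npz")

sess = gpt2.start_tf_sess()
gpt2.load_gpt2(sess, run_name="run1")

# Nucleus sampling: sample only from the smallest set of tokens whose
# cumulative probability exceeds top_p; overrides top_k when both are set.
gpt2.generate(sess, run_name="run1", top_p=0.9)
```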

Improvements to continuing model training

  • overwrite argument for finetune: with restore_from="latest", training continues in place without creating a duplicate copy of the model, which is useful for transfer learning across multiple datasets (see the sketch after this list). (#20)
  • You can continue to finetune a model without having the original GPT-2 model present.
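
A minimal sketch of resuming training in place; the dataset name, run_name, and step count are assumptions:

```python
import gpt_2_simple as gpt2

sess = gpt2.start_tf_sess()

# Resume from the newest checkpoint in checkpoint/run1 and overwrite it
# in place instead of writing a duplicate copy of the model to disk.
gpt2.finetune(sess,
              "second_dataset.txt",  # a new dataset, e.g. for transfer learning
              restore_from="latest",
              run_name="run1",
              overwrite=True,
              steps=500)
```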

Improvements to I/O involving Colaboratory

  • Checkpoint folders are now packaged into a .tar file when copying to Google Drive, and the .tar file is automatically unpackaged into the correct checkpoint format when copying from Google Drive. (You can pass copy_folder=True to either copy function to revert to the old behavior.) (#37: thanks @woctezuma!)
  • copy_checkpoint_to_gdrive and copy_checkpoint_from_gdrive now take a run_name argument instead of a checkpoint_folder argument (see the sketch after this list).
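
A minimal sketch of the round trip inside a Colaboratory notebook; run_name="run1" is an assumption:

```python
import gpt_2_simple as gpt2

gpt2.mount_gdrive()  # mount Google Drive in the Colaboratory runtime

# Package checkpoint/run1 into a .tar file and copy it to Google Drive.
gpt2.copy_checkpoint_to_gdrive(run_name="run1")

# In a later session: copy the .tar back from Google Drive and unpack it
# into the expected checkpoint folder layout.
gpt2.copy_checkpoint_from_gdrive(run_name="run1")
```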

Miscellaneous

  • Added CLI arguments for top_k, top_p, overwrite.
  • Cleaned up redundant function parameters. (#39)