Handle 774M (large)

minimaxir released this 28 Aug 17:11
· 50 commits to master since this release
e6afb28
  • Fine-tuning 774M is explicitly blocked and triggers an assert if attempted. The ability to fine-tune it will be restored if a way to do so without being super-painful is added.
  • Allow generating text from the default pretrained models by passing model_name to gpt2.load_gpt2() and gpt2.generate() (this works with 774M).
  • Add sgd as an optimizer parameter to finetune (default: adam).
  • Support the changed model names, with the changes made more prominent in the README.
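
The changes above can be sketched roughly as follows. This is a minimal illustration, not the library's own code: `finetune_options` is a hypothetical helper that only mirrors the rules described in these notes (774M is blocked from fine-tuning, and the optimizer is either adam or sgd), and `generate_from_pretrained` assumes the `gpt_2_simple` package is installed and that `download_gpt2()` and `start_tf_sess()` behave as they do in the project README.

```python
def finetune_options(model_name, optimizer="adam"):
    """Hypothetical helper: check options before passing them to gpt2.finetune().

    Mirrors the release notes only; it is not part of gpt_2_simple.
    """
    # 774M is blocked from fine-tuning in this release.
    if model_name == "774M":
        raise ValueError("774M cannot be fine-tuned in this release")
    # The notes mention adam (default) and sgd as optimizer values.
    if optimizer not in ("adam", "sgd"):
        raise ValueError("optimizer must be 'adam' or 'sgd'")
    return {"model_name": model_name, "optimizer": optimizer}


def generate_from_pretrained(model_name, **generate_args):
    """Generate text from a default pretrained model, with no fine-tuning.

    Assumes gpt_2_simple and TensorFlow are installed; the import is deferred
    so that the helper above can be used without them.
    """
    import gpt_2_simple as gpt2

    # Fetch the pretrained weights (assumed to land in the default models dir).
    gpt2.download_gpt2(model_name=model_name)
    sess = gpt2.start_tf_sess()
    # Passing model_name loads the pretrained model instead of a checkpoint.
    gpt2.load_gpt2(sess, model_name=model_name)
    return gpt2.generate(
        sess, model_name=model_name, return_as_list=True, **generate_args
    )
```

Calling `generate_from_pretrained("774M", length=50)` would download the large model before generating, so expect a sizeable download on first use. The dictionary returned by `finetune_options` could be unpacked into a `gpt2.finetune(sess, dataset, **options)` call, where `sess` and `dataset` are supplied by the caller.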