immindich/lm

teaching myself about transformer language models


This repo contains code for training and running autoregressive language models based on the transformer architecture. I mainly wrote this code to teach myself PyTorch and learn more about how large language models work. It should not be used for anything serious. It is heavily inspired by Andrej Karpathy's nanoGPT.
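To make the scope concrete, the sketch below shows the kind of decoder-only, autoregressive transformer such a project trains. It is not the code from this repo: the names and hyperparameters (TinyLM, d_model, n_heads, and so on) are illustrative assumptions, and it leans on PyTorch's built-in nn.MultiheadAttention rather than a hand-rolled attention implementation.

```python
# Illustrative sketch only -- not code from this repo.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Block(nn.Module):
    """One pre-norm transformer decoder block with causal self-attention."""
    def __init__(self, d_model=128, n_heads=4):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model)
        )

    def forward(self, x):
        # Boolean causal mask: True entries are positions a token may NOT attend to.
        t = x.size(1)
        mask = torch.triu(torch.ones(t, t, dtype=torch.bool, device=x.device), diagonal=1)
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask)
        x = x + attn_out
        x = x + self.mlp(self.ln2(x))
        return x

class TinyLM(nn.Module):
    """Minimal GPT-style language model: token + position embeddings, blocks, LM head."""
    def __init__(self, vocab_size=256, d_model=128, n_layers=2, max_len=256):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        self.blocks = nn.ModuleList([Block(d_model) for _ in range(n_layers)])
        self.ln_f = nn.LayerNorm(d_model)
        self.head = nn.Linear(d_model, vocab_size, bias=False)

    def forward(self, idx):
        # idx: (batch, time) token ids -> (batch, time, vocab_size) logits
        pos = torch.arange(idx.size(1), device=idx.device)
        x = self.tok_emb(idx) + self.pos_emb(pos)
        for block in self.blocks:
            x = block(x)
        return self.head(self.ln_f(x))

    @torch.no_grad()
    def generate(self, idx, max_new_tokens=20):
        # Autoregressive sampling: predict the next token, append it, repeat.
        for _ in range(max_new_tokens):
            idx_cond = idx[:, -self.pos_emb.num_embeddings:]  # crop to context window
            logits = self(idx_cond)[:, -1, :]
            probs = F.softmax(logits, dim=-1)
            idx = torch.cat([idx, torch.multinomial(probs, num_samples=1)], dim=1)
        return idx
```

Training a model like this amounts to minimizing cross-entropy between the logits at each position and the token that actually comes next, e.g. F.cross_entropy(logits[:, :-1].reshape(-1, vocab_size), idx[:, 1:].reshape(-1)).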

Current features:

Planned:

  • Hybrid architectures like Jamba
  • Mixture of Experts (a rough sketch of one common formulation follows this list)
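
As a point of reference for the mixture-of-experts item, the sketch below shows one common formulation: a learned router scores each token against a set of expert MLPs, keeps the top-k scores, and mixes the selected experts' outputs. This is purely illustrative, not a design this repo has committed to; every name and size (MoEFeedForward, n_experts, top_k) is an assumption, and it runs all experts densely for clarity rather than dispatching tokens sparsely.

```python
# Illustrative sketch of a top-k routed mixture-of-experts MLP; not code from this repo.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoEFeedForward(nn.Module):
    def __init__(self, d_model=128, n_experts=4, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):
        # x: (batch, time, d_model). Route each token to its top_k experts and
        # combine their outputs, weighted by the renormalized router scores.
        scores = F.softmax(self.router(x), dim=-1)           # (B, T, n_experts)
        weights, indices = scores.topk(self.top_k, dim=-1)   # (B, T, top_k)
        weights = weights / weights.sum(dim=-1, keepdim=True)
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            # Per-token weight for expert e (zero where e was not selected).
            w = (weights * (indices == e)).sum(dim=-1, keepdim=True)  # (B, T, 1)
            out = out + w * expert(x)  # dense compute for clarity; real MoE dispatches sparsely
        return out
```

In a full model, a layer like this would replace the dense MLP inside each transformer block, usually alongside an auxiliary load-balancing loss so the router spreads tokens across experts.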
