This repo contains code for training and running autoregressive language models based on the transformer architecture. I mainly wrote this code to teach myself PyTorch and learn more about how large language models work. It should not be used for anything serious. It is heavily inspired by Andrej Karpathy's nanoGPT.
Current features:
- Basic decoder-only transformer architecture with learned positional embeddings in the style of GPT-2
- RMSNorm
- Gated feedforward layers
- Rotary Positional Embeddings (RoPE); a minimal sketch of these building blocks follows below
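
For reference, here is a minimal PyTorch sketch of what the RMSNorm, gated feedforward (SwiGLU-style), and rotary-embedding pieces typically look like. The class and function names (`RMSNorm`, `GatedFeedForward`, `apply_rope`) and the exact conventions (bias-free linear layers, the "rotate-half" RoPE variant) are illustrative assumptions and may not match the actual modules in this repo.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class RMSNorm(nn.Module):
    """RMSNorm: scale activations by their root-mean-square over the feature
    dimension, with a learned per-channel gain (no mean subtraction, no bias)."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        inv_rms = x.pow(2).mean(dim=-1, keepdim=True).add(self.eps).rsqrt()
        return x * inv_rms * self.weight


class GatedFeedForward(nn.Module):
    """Gated feedforward (SwiGLU-style): the hidden activation is the
    elementwise product of a SiLU gate branch and a linear branch."""
    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        self.w_gate = nn.Linear(dim, hidden_dim, bias=False)
        self.w_up = nn.Linear(dim, hidden_dim, bias=False)
        self.w_down = nn.Linear(hidden_dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w_down(F.silu(self.w_gate(x)) * self.w_up(x))


def apply_rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Rotary positional embeddings (rotate-half variant): rotate channel pairs
    by a position-dependent angle so attention scores depend on relative offsets.
    x has shape (batch, seq_len, n_heads, head_dim) with an even head_dim."""
    _, t, _, d = x.shape
    half = d // 2
    freqs = base ** (-torch.arange(half, dtype=torch.float32, device=x.device) / half)
    angles = torch.arange(t, dtype=torch.float32, device=x.device)[:, None] * freqs[None, :]
    cos = angles.cos()[None, :, None, :]  # (1, t, 1, half)
    sin = angles.sin()[None, :, None, :]
    x1, x2 = x[..., :half], x[..., half:]
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)
```

As a quick smoke test, `apply_rope(torch.randn(2, 16, 4, 64))` should return a tensor of the same shape, and `RMSNorm(64)(torch.randn(2, 16, 64))` should preserve shape as well.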
Planned:
- Hybrid attention/state-space architectures in the style of Jamba
- Mixture of Experts