SIGPT 🤖

My attempt at training a transformer-based LLM. The environment for working with this code base is managed by uv.

Inference

The model can be queried through the package's FastAPI server. To run the server locally, first fetch the ONNX artefact from the GCP bucket. Then start the server with:

uv run uvicorn sigpt.serve:app --port 8000 --host 0.0.0.0

The server then accepts POST requests, for example:

curl -X POST 'http://localhost:8000/sigpt/?prompt=Hello%20world!&batches=3'
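The same request can be sent from Python. The following is a minimal sketch using only the standard library, not part of the package itself. It assumes the server is reachable at `localhost:8000` and that the `/sigpt/` endpoint takes `prompt` and `batches` as query parameters, as in the curl example. The helper names `build_url` and `query_sigpt` are hypothetical, and since the response format is not documented here, the body is returned as raw text.

```python
import urllib.parse
import urllib.request

# Assumption: the server was started locally on port 8000 as shown above.
BASE_URL = "http://localhost:8000/sigpt/"


def build_url(prompt: str, batches: int, base_url: str = BASE_URL) -> str:
    """Percent-encode the query parameters and append them to the endpoint URL."""
    query = urllib.parse.urlencode(
        {"prompt": prompt, "batches": batches},
        quote_via=urllib.parse.quote,  # encode spaces as %20 rather than '+'
    )
    return f"{base_url}?{query}"


def query_sigpt(prompt: str, batches: int = 3, timeout: float = 60.0) -> str:
    """Send a POST request to the server and return the raw response body."""
    request = urllib.request.Request(build_url(prompt, batches), method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example usage (requires a running server):
#   print(query_sigpt("Hello world!", batches=3))
```

Building the URL with `urlencode` avoids the shell-quoting issues that come with characters such as `!` and `&` on the command line.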

Training

The model was trained on 8x A100 GPUs at LambdaLabs, using the allenai/c4-en dataset. On a single node with 8 GPUs, the training script can be launched with:

uv run --group train torchrun --standalone --nproc_per_node=8 scripts/train.py
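With `--nproc_per_node=8`, torchrun starts eight worker processes on the node and passes each one its position through the `RANK`, `LOCAL_RANK`, and `WORLD_SIZE` environment variables. The sketch below shows one way a training script might read them. It is an illustration only and is not taken from `scripts/train.py`. The names `DistributedContext` and `read_torchrun_env` are hypothetical, and the fallback values assume a plain single-process run without torchrun.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class DistributedContext:
    """Position of the current process within a torchrun launch."""

    rank: int        # global index of this process across all nodes
    local_rank: int  # index of this process on its node (typically the GPU index)
    world_size: int  # total number of processes in the job

    @property
    def is_main_process(self) -> bool:
        # By convention, rank 0 handles logging and checkpointing.
        return self.rank == 0


def read_torchrun_env(env: Mapping[str, str] = os.environ) -> DistributedContext:
    """Read the variables set by torchrun, falling back to a single-process setup."""
    return DistributedContext(
        rank=int(env.get("RANK", "0")),
        local_rank=int(env.get("LOCAL_RANK", "0")),
        world_size=int(env.get("WORLD_SIZE", "1")),
    )
```

A script would typically use `local_rank` to select its GPU and `is_main_process` to restrict logging and checkpoint writes to a single worker.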
