Skip to content

siranipour/sigpt

Repository files navigation

SIGPT 🤖

My attempt at training a transformer based LLM model. The environment for interacting with this code base is managed by uv.

Inference

The model can be interacted with using the package's fastAPI server. The onnx artefact must be fetched from the GCP bucket in order to run the server locally. To start the server simply run:

uv run uvicorn sigpt.serve:app --port 8000 --host 0.0.0.0

which will accept POST requests

curl -X POST "http://0.0.0.0:8000/sigpt/?prompt=Hello%20world\!&batches=3"

Training

The model was trained on 8x A100 GPUs using LambdaLabs using the allenai/c4-en dataset. The training script can be invoked (assuming distributed workloads) using:

uv run --group train torchrun --standalone --nproc_per_node=8 scripts/train.py

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published