[Example] NanoGPT support in DLrover #516
Conversation
A big step towards #350 completion in Allreduce mode.
The PR needs to merge the master branch and format the code to pass pre-commit.
I tried to run pre-commit locally with Docker.
What does NanoGPT mean?
NanoGPT is a GPT built from scratch by setting the n_layer, n_head, and n_embedding of the transformer model. Users can test DLrover's ability to scale GPT from 6M parameters up to 1.5B parameters (GPT2-xl size) or even larger.
We can submit a doc explaining how to scale nanoGPT to test the elasticity ability of DLrover.
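To illustrate the scaling range mentioned above, the parameter count of a GPT-2-style model can be estimated from n_layer and n_embedding. The formula and the example configs below are a rough sketch for illustration, not DLrover's or nanoGPT's actual code:

```python
def gpt2_param_count(n_layer: int, n_embd: int,
                     vocab_size: int = 50257, block_size: int = 1024) -> int:
    """Rough parameter count for a GPT-2-style decoder (biases and
    layer-norm weights omitted for simplicity)."""
    # token embedding table + learned position embeddings
    embeddings = vocab_size * n_embd + block_size * n_embd
    # each transformer block: attention QKV + output projection (4 * n_embd^2)
    # plus a 4x-wide MLP (8 * n_embd^2), i.e. 12 * n_embd^2 in total
    per_block = 12 * n_embd * n_embd
    return embeddings + n_layer * per_block

# hypothetical small config vs. a GPT2-xl-sized config
small = gpt2_param_count(n_layer=4, n_embd=128)   # a few million params
xl = gpt2_param_count(n_layer=48, n_embd=1600)    # roughly 1.5B params
print(f"small: {small / 1e6:.1f}M, xl: {xl / 1e9:.2f}B")
```

Varying only these two hyperparameters is enough to sweep the model size across several orders of magnitude, which is what makes nanoGPT a convenient workload for elasticity tests.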
to add a parameter interpreter.
Does this source refer to another open-source implementation of GPT-2?
Yes, and the reference will be added to the doc.
LGTM
This will be resolved in #526.
LGTM
Merged b014272 into intelligent-machine-learning:master
Add nanogpt job support.
UT result: No modification to main code. This PR adds nanoGPT (GPT2) code support.
Job test result:
kubectl get pod
worker0 log
worker1 log