
Memory-saving weight loading for non-quant models #56

Closed · wants to merge 3 commits

Conversation


@KaneGreen KaneGreen commented Apr 5, 2024

Trying to fix #51.
This also increases the speed of loading weights (on my machine, about 1 min vs. 2 min).
Tested on the 1.1-7b-it and 7b-it models.

but:

  1. This method is not suitable for the int8 data type, so the original loading path is still used for the quant models.
  2. The new loading method automatically resets requires_grad of the nn.Parameters in Linear and Embedding to True after loading completes. (I don't know why some nn.Parameters in model.py have requires_grad set to False while others keep the default True.) But I think this shouldn't matter, since the forward function of GemmaForCausalLM is decorated with @torch.no_grad().


google-cla bot commented Apr 5, 2024

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@KaneGreen
Author

@pengchongjin @michaelmoynihan Any idea on this PR?

@pengchongjin
Collaborator

Thanks, @KaneGreen! Could you please sign the CLA in order to pass the pre-check?

@KaneGreen
Author

@pengchongjin I've signed it, but the check hasn't updated. Is there any way to re-run this check?

@KaneGreen
Author

@pengchongjin CLA has been signed

Development

Successfully merging this pull request may close these issues.

How to save memory when loading weights?