refactor(Jetstream Pt): avoid duplicating Llama modeling
Duplicating the Llama modeling code is error-prone, so a better solution is to use the Jetstream Pytorch implementation directly. This hadn't been done before mainly because the model config no longer carries some of the parameters (ffn_dim_multiplier and multiple_of). It does carry intermediate_size, though, and that is enough to reconstruct parameters that produce the same calculation. This refactor should make it easier for future code to follow Jetstream/Pytorch changes.
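To illustrate the reconstruction idea, here is a minimal sketch. It is not the code from this commit. It assumes the feed-forward width is derived as in the reference Llama implementation (two thirds of 4 * dim, optionally scaled by ffn_dim_multiplier, then rounded up to a multiple of multiple_of). The function names `llama_ffn_hidden_dim` and `reconstruct_ffn_params` are hypothetical, and the choice multiple_of = intermediate_size is just one of several parameter pairs that yield the same width.

```python
from typing import Optional, Tuple


def llama_ffn_hidden_dim(
    dim: int, multiple_of: int, ffn_dim_multiplier: Optional[float] = None
) -> int:
    """Feed-forward width as computed in the reference Llama code (assumed)."""
    hidden_dim = int(2 * (4 * dim) / 3)
    if ffn_dim_multiplier is not None:
        hidden_dim = int(ffn_dim_multiplier * hidden_dim)
    # Round up to the nearest multiple of `multiple_of`.
    return multiple_of * ((hidden_dim + multiple_of - 1) // multiple_of)


def reconstruct_ffn_params(
    dim: int, intermediate_size: int
) -> Tuple[int, Optional[float]]:
    """Hypothetical helper: pick (multiple_of, ffn_dim_multiplier) so that
    llama_ffn_hidden_dim(dim, ...) returns intermediate_size."""
    base = int(2 * (4 * dim) / 3)
    # If the unscaled width does not exceed the target, no multiplier is
    # needed; otherwise scale it down so it lands at or just below the target.
    multiplier = None if base <= intermediate_size else intermediate_size / base
    # Rounding up to a multiple of intermediate_size then yields exactly
    # intermediate_size for any positive width that does not exceed it.
    multiple_of = intermediate_size
    if llama_ffn_hidden_dim(dim, multiple_of, multiplier) != intermediate_size:
        raise ValueError("could not reconstruct feed-forward parameters")
    return multiple_of, multiplier


# Example: hidden size 4096 with intermediate_size 11008.
multiple_of, multiplier = reconstruct_ffn_params(4096, 11008)
print(llama_ffn_hidden_dim(4096, multiple_of, multiplier))
```

The reconstructed values need not match the ones the checkpoint was originally trained with; they only need to produce the same intermediate_size, which is all the model construction depends on.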
1 parent d4e7310, commit fa50f3b
Showing 1 changed file with 18 additions and 294 deletions.