This repository has been archived by the owner on Aug 7, 2024. It is now read-only.
delayed scaling: stop syncing weight amax values across ranks (#277)
Summary:
Pull Request resolved: #277

FSDP already ensures that each rank receives the same weight, so the weight amaxes are identical on every rank and do not need to be synced. I checked performance before and after on the multi-GPU benchmark and did not see a significant impact on the toy model, but less communication is still better.

Reviewed By: drisspg

Differential Revision: D58396925

fbshipit-source-id: 9dc1253bdd49de4c1cf61843c1d778956981aa0e
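The reasoning in the summary can be illustrated with a small simulation. This is a minimal sketch, not the repository's code: the "ranks" are plain Python lists rather than real processes, and `amax`, `all_reduce_max`, and `WORLD_SIZE` are hypothetical names introduced only for illustration. It assumes, as the commit message states, that every rank holds the same weight, and it further assumes that activations differ per rank because each rank sees a different slice of the batch.

```python
# Simulated sketch: "ranks" are plain Python lists, not real processes.
# All names here are hypothetical and are not part of any library API.

def amax(values):
    """Largest absolute value in a flat list of floats."""
    return max(abs(v) for v in values)


def all_reduce_max(per_rank_values):
    """Stand-in for an all-reduce with MAX: every rank gets the global max."""
    global_max = max(per_rank_values)
    return [global_max for _ in per_rank_values]


WORLD_SIZE = 4

# Assumption from the commit message: each rank receives the same weight.
weight = [0.5, -2.0, 1.25]
weight_per_rank = [list(weight) for _ in range(WORLD_SIZE)]

# Assumption: each rank sees different data, so activations differ.
activations_per_rank = [
    [0.1 * (rank + 1), -0.3 * (rank + 1)] for rank in range(WORLD_SIZE)
]

local_weight_amax = [amax(w) for w in weight_per_rank]
local_act_amax = [amax(a) for a in activations_per_rank]

synced_weight_amax = all_reduce_max(local_weight_amax)
synced_act_amax = all_reduce_max(local_act_amax)

# Syncing changes nothing for weights, but it does change activation amaxes.
weight_sync_is_noop = synced_weight_amax == local_weight_amax
act_sync_is_noop = synced_act_amax == local_act_amax
```

In this toy setup the reduction leaves the weight amaxes unchanged on every rank, which is why skipping it saves communication without altering the resulting scales, while the activation amaxes do change under the same reduction.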
1 parent 323fb48, commit 5d293a7
Showing 3 changed files with 13 additions and 11 deletions.