forked from NVIDIA/TransformerEngine
[pull] main from NVIDIA:main #32
Merged
* Align RNG tracker with megatron
Signed-off-by: Robin Zhang <[email protected]>
Co-authored-by: Yifei Song <[email protected]>

* Fix module_params order and warmup bug in cudagraph
Signed-off-by: Robin Zhang <[email protected]>
Co-authored-by: Yifei Song <[email protected]>

* Add fp8_group argument and fix fp8 accuracy issue for cudagraph
Signed-off-by: Robin Zhang <[email protected]>
Co-authored-by: Yifei Song <[email protected]>

* Add TE modules and weights filters to support MoE models
Signed-off-by: Robin Zhang <[email protected]>
Co-authored-by: Yifei Song <[email protected]>

* Revert self.fp8
Signed-off-by: Robin Zhang <[email protected]>

* Use hooks to filter module params
Signed-off-by: Robin Zhang <[email protected]>

* Filter all TE modules in hooks
Signed-off-by: Robin Zhang <[email protected]>
Co-authored-by: Yifei Song <[email protected]>

* Format code
Signed-off-by: Robin Zhang <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci

* Update graph.py
Signed-off-by: Xin Yao <[email protected]>

* Revert CudaRNGStatesTracker
Signed-off-by: Robin Zhang <[email protected]>

* Format Update
Signed-off-by: Yifei Song <[email protected]>

* Revert "Use hooks to filter module params"
This reverts commit 73a22e2.
Signed-off-by: Yifei Song <[email protected]>

* Remove filtering module params
Signed-off-by: Robin Zhang <[email protected]>

---------

Signed-off-by: Robin Zhang <[email protected]>
Signed-off-by: Xin Yao <[email protected]>
Signed-off-by: Yifei Song <[email protected]>
Co-authored-by: Yifei Song <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Xin Yao <[email protected]>
Co-authored-by: Xin Yao <[email protected]>
Co-authored-by: Tim Moon <[email protected]>

Moved framework agnostic THD kernels to common.

---------

Signed-off-by: Michael Goldfarb <[email protected]>

* retain_graph=True for grouped gemm
Signed-off-by: Xiaowei Ren <[email protected]>

* remove an unnecessary retain_graph=True
Signed-off-by: Xiaowei Ren <[email protected]>

* make retain_graph in graph capture configurable
Signed-off-by: Xiaowei Ren <[email protected]>

* typo fix
Signed-off-by: Xiaowei Ren <[email protected]>

---------

Signed-off-by: Xiaowei Ren <[email protected]>

* Update list of CI users
Signed-off-by: Tim Moon <[email protected]>

* Update list of CI users
Signed-off-by: Tim Moon <[email protected]>

---------

Signed-off-by: Tim Moon <[email protected]>

…age (#1308)

* draft implementation
Signed-off-by: Youngeun Kwon <[email protected]>

* compile error fix
Signed-off-by: Youngeun Kwon <[email protected]>

* fix compile error
Signed-off-by: Youngeun Kwon <[email protected]>

* remove print
Signed-off-by: Youngeun Kwon <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci

* Edit comments
Signed-off-by: Youngeun Kwon <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci

* edit the bulk-overlap test case
Signed-off-by: Youngeun Kwon <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci

* add version guard
Signed-off-by: Youngeun Kwon <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci

* add runtime version guard
Signed-off-by: Youngeun Kwon <[email protected]>

* fix the version guard
Signed-off-by: Youngeun Kwon <[email protected]>

---------

Signed-off-by: Youngeun Kwon <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

Scale sequence length in CP tests to avoid tiny sizes.

Signed-off-by: Michael Goldfarb <[email protected]>
See Commits and Changes for more details.
Created by
pull[bot] (v2.0.0-alpha.1)