Pull requests: NVIDIA/TransformerEngine

Pull requests list

Avoid searching unnecessary dirs for shared libs (labels: bug)
#1801 opened May 20, 2025 by timmoon10
4 of 13 tasks
Check cublas/cuda versions for deprecated cublasLt GEMM attributes
#1800 opened May 19, 2025 by ksivaman
5 of 13 tasks
Add support for head_dim > 128
#1797 opened May 18, 2025 by cyanguwa
9 of 13 tasks
fix model parallel encoder to be properly sharded params
#1794 opened May 16, 2025 by sudhakarsingh27
6 of 13 tasks
[PyTorch][MoE] Reduce CPU Overhead By Fuse Torch Empty Calls (labels: performance)
#1793 opened May 16, 2025 by zhongbozhu
1 of 13 tasks
[JAX][Draft] PR #1666 rebase, squash, and debug
#1792 opened May 16, 2025 by huanghua1994
7 of 13 tasks
Add huggingface from_pretrained / save_pretrained tests
#1787 opened May 14, 2025 by pstjohn
Misc 2.4 (labels: 2.4.0)
#1780 opened May 13, 2025 by cyanguwa
9 of 13 tasks
[common] Added support of FP4 data type
#1779 opened May 13, 2025 by Oleg-Goncharov
6 of 13 tasks
[PyTorch] Update PyTorch FSDP2 test to cover all TE layer types (labels: 2.4.0, testing)
#1777 opened May 12, 2025 by denera
8 of 13 tasks
[PyTorch] Draft of new activation offloading API
#1762 opened May 8, 2025 by pggPL Draft
13 tasks
cache sequence chunk ids for reordering
#1757 opened May 7, 2025 by xrennvidia Draft
13 tasks
Zr te doc edits
#1745 opened May 2, 2025 by zredeaux07
12 tasks
[PyTorch] Refactor activation offloading of quantized tensors.
#1738 opened Apr 30, 2025 by pggPL
8 of 13 tasks
Support Context Parallel for Multi Latent Attention (MLA)
#1729 opened Apr 29, 2025 by yuzhongw-nvidia
13 tasks
[JAX] Decouple Recipe and ScalingMode
#1728 opened Apr 29, 2025 by jberchtold-nvidia
8 of 13 tasks
Add variance calculation from FusedAdam optimizer states
#1726 opened Apr 28, 2025 by kwyss-nvidia
7 of 13 tasks
[JAX] GroupedDense v.2 without dynamic shape
#1721 opened Apr 25, 2025 by phu0ngng Draft
13 tasks
[JAX] Updated: unbalanced CP with THD format
#1709 opened Apr 22, 2025 by huanghua1994
8 of 13 tasks