Pull requests: deepspeedai/DeepSpeed
Fix the GPU memory usage of ZeRO-Offload (only update stage_1_and_2.py)
#7309
opened May 23, 2025 by
arminzhu
Fix: Update grad norm calculation for CPU offload
#7302
opened May 22, 2025 by
therealnaveenkamal
docs: fix Adam paper link and correct grammatical error in fused_adam.py
#7294
opened May 19, 2025 by
ishanjmukherjee
Fix AutoTP gathering replaced layer params when bias is not None
#7257
opened Apr 28, 2025 by
HollowMan6
HF2UCP: Converting a pytorch_model.bin or .safetensors checkpoint to UCP
#7212
opened Apr 10, 2025 by
Schwidola0607
Add DataStates-LLM: Asynchronous Checkpointing Engine Support
#7166
opened Mar 21, 2025 by
mauryaavinash95
fixed: Modified the topkgating function and modified the test_moe file for testing
#7163
opened Mar 21, 2025 by
xiongjyu
[bugfix] update results of state_dict loading, embedding resizing to secondary partitions (hpz)
#7130
opened Mar 11, 2025 by
cyr0930
Fix, pipeline model with moe cause error when send grad
#7055
opened Feb 19, 2025 by
wukong1992
Add pyproject.toml with legacy build backend to keep most logic in setup.py
#7033
opened Feb 13, 2025 by
loadams
4 of 5 tasks