Commit
Remove assumption that padding only occurs on last rank (deepspeedai#6974)

As discussed in [PR-6918](deepspeedai#6918), padding can occur on multiple ranks with large DP degrees.

For example, with:
- Flattened tensor size: 266240
- DP degree: 768
- Alignment: 1536
- Required padding: 1024 (1536 * 174 - 266240)
- Per-rank partition size: 348 (1536 * 174 / 768)

Because the required padding (1024) exceeds the per-rank partition size (348), the padding spans the last three ranks.

This PR removes the single-rank padding assumption to handle these more general cases.

---------

Co-authored-by: Sam Foreman <[email protected]>
Co-authored-by: Logan Adams <[email protected]>
Co-authored-by: Olatunji Ruwase <[email protected]>
Signed-off-by: siqi <[email protected]>
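The arithmetic in the example can be checked with a short sketch. This is illustrative only and is not the code changed by the PR: the function name `padding_per_rank` is hypothetical, and it assumes the flattened tensor is padded up to the next multiple of the alignment, that the alignment is a multiple of the DP degree, and that the padded tensor is split into equal contiguous partitions.

```python
import math


def padding_per_rank(numel, dp_degree, alignment):
    """Return (total_padding, partition_size, {rank: padded_elements}).

    Hypothetical helper. Assumes padding is appended at the end of the
    flattened tensor and that alignment is a multiple of dp_degree.
    """
    # Round the flattened size up to the next multiple of the alignment.
    padded_numel = math.ceil(numel / alignment) * alignment
    total_padding = padded_numel - numel
    partition_size = padded_numel // dp_degree

    padded = {}
    for rank in range(dp_degree):
        start = rank * partition_size
        end = start + partition_size
        # Elements of this partition that lie past the real data are padding.
        pad = max(0, end - max(start, numel))
        if pad:
            padded[rank] = pad
    return total_padding, partition_size, padded


total, part, ranks = padding_per_rank(266240, 768, 1536)
print(total, part, ranks)
```

With the values from the example, the padding is spread over ranks 765, 766, and 767: the last two partitions consist entirely of padding and rank 765 holds the remainder, so a check that looks only at the last rank would miss part of it.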