Running the torch_distributed/test_ddp.py test on my 1-GPU machine, I see test_ddp_correctness_large_net failing but test_ddp_correctness_with_gradient_as_bucket_view passing. I will try running it again on a multi-GPU machine.
I didn't realize that test_ddp_correctness_with_gradient_as_bucket_view is actually pretty recent. It was introduced in #8521 (January 2, 2025). The test actually fails there, too.
This is a tracking issue for fixing the tests that had to be skipped in the PR that brought back GPU CI (#8593).
- TestXrtDistributedDataParallel.test_ddp_correctness_with_gradient_as_bucket_view: test_ddp_correctness_with_gradient_as_bucket_view fails for multi-device CUDA #8841
- TritonTest: Re-enable triton tests. #8803