Fix skipped tests on GPU CI. #8706

Closed
1 of 2 tasks
ysiraichi opened this issue Feb 13, 2025 · 3 comments · Fixed by #8803
Assignees: ysiraichi
Labels: CI (CI related change), testing (Testing and coverage related issues.), xla:gpu

Comments

@ysiraichi
Collaborator

ysiraichi commented Feb 13, 2025

This is a tracking issue for fixing the tests that had to be skipped in the PR that brought back GPU CI (#8593).

ysiraichi added the CI, testing, and xla:gpu labels on Feb 13, 2025
ysiraichi self-assigned this on Feb 13, 2025
ysiraichi reopened this on Mar 6, 2025
@ysiraichi
Collaborator Author

Running the torch_distributed/test_ddp.py tests on my 1-GPU machine, I see test_ddp_correctness_large_net failing but test_ddp_correctness_with_gradient_as_bucket_view passing. I will try running them again on a multi-GPU machine.
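
For readers unfamiliar with how a test can be run or skipped depending on the number of GPUs, here is a minimal sketch using Python's standard `unittest.skipIf`. It is only an illustration of the 1-GPU versus multi-GPU distinction mentioned above, not how test_ddp.py is written. The helper `visible_gpu_count`, the environment variable `EXAMPLE_NUM_GPUS`, and the test names are all hypothetical.

```python
import os
import unittest


def visible_gpu_count() -> int:
    # Hypothetical helper: reads a made-up environment variable.
    # This is NOT a torch_xla API; a real suite would query its runtime.
    return int(os.environ.get("EXAMPLE_NUM_GPUS", "1"))


class ExampleDdpTest(unittest.TestCase):

    def test_single_device(self):
        # Runs no matter how many GPUs are visible.
        self.assertGreaterEqual(visible_gpu_count(), 1)

    # The condition is evaluated once, when the class body is executed.
    @unittest.skipIf(visible_gpu_count() < 2, "needs at least 2 GPUs")
    def test_multi_device(self):
        self.assertGreaterEqual(visible_gpu_count(), 2)


def run_suite() -> unittest.TestResult:
    # Load and run the example tests, returning the collected result.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleDdpTest)
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == "__main__":
    outcome = run_suite()
    for test, reason in outcome.skipped:
        print(f"skipped {test.id()}: {reason}")
```

With a gate like this, a skipped test is reported with its reason rather than silently passing, which makes it easier to see which tests still need attention.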

@ysiraichi
Collaborator Author

I didn't realize that test_ddp_correctness_with_gradient_as_bucket_view is pretty recent. It was introduced in #8521 (January 2, 2025). The test fails there, too.

@ysiraichi
Collaborator Author

Closing in favor of #8841
