Figure out why MI210 did not work in nightlies #1064

Open
aprokop opened this issue Apr 11, 2024 · 2 comments
Labels
testing: Anything to do with tests and CI

Comments


aprokop commented Apr 11, 2024

In #1048, we updated the HIP builds to run on MI210 instead of MI100. This worked fine for the continuous builds, but not for the nightly builds.

There are some differences:

  • In continuous, we use Dockerfile.hipcc. In nightly, we just grab the rocm image.
  • In continuous, we use ROCm 5.6 (changed from 5.3 in #1027, "Fix label for AMD GPU in the CI"). In nightly, we use 5.4.

For now, the nightly patch has been reverted in #1060.
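
To make the comparison concrete, the two setups can be laid out side by side and diffed. This is only a minimal sketch: the dictionary keys and the `config_diff` helper are hypothetical names chosen for illustration, and the values are just the ones stated in this comment (with the nightly entry describing the setup while the MI210 patch was applied). It does not read the actual CI configuration.

```python
# Hypothetical summary of the two CI setups. Keys are illustrative;
# values come from the description in this issue, not from the real CI files.
continuous = {"gpu": "MI210", "image_source": "Dockerfile.hipcc", "rocm": "5.6"}
nightly = {"gpu": "MI210", "image_source": "rocm image", "rocm": "5.4"}


def config_diff(a, b):
    """Return {key: (a_value, b_value)} for every key whose values differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


differences = config_diff(continuous, nightly)
for key, (cont, night) in differences.items():
    print(f"{key}: continuous={cont!r}, nightly={night!r}")
```

Anything that shows up in the diff is a candidate to align between the two pipelines, one item at a time, to see which change makes the nightly behave like the continuous build.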

aprokop added the testing label Apr 11, 2024

aprokop commented Apr 19, 2024

Interestingly, sometimes things pass in the nightlies. The latest build passed.


Rombur commented Apr 19, 2024

It was a different machine. One of the machines is more stable than the other.
