Support ROCm 6.0 #2206

Closed
fwyzard opened this issue Dec 17, 2023 · 5 comments · Fixed by #2210

Comments

@fwyzard
Contributor

fwyzard commented Dec 17, 2023

ROCm 6.0 was released a couple of days ago: https://rocm.docs.amd.com/en/docs-6.0.0/about/release-notes.html

@fwyzard
Contributor Author

fwyzard commented Dec 17, 2023

Can we add ROCm 5.6.1, 5.7.1 and 6.0.0 to the CI?

Do the CI jobs run on bare metal or in a container?
If it's a container, what version of ROCm is installed on the host OS? What kernel driver does it use?

According to https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/user-kernel-space-compat-matrix.html , ROCm is a lot less flexible than CUDA with respect to backward and forward compatibility.

@psychocoderHPC
Member

We build our own container on top of the original ROCm container.
The reason is that we need to pre-build some dependencies to reduce the CI time.

I opened an issue in the container repo requesting ROCm 6:

https://codebase.helmholtz.cloud/crp/alpaka-group-container/-/issues/36

@psychocoderHPC
Member

> If it's a container, what version of ROCm is installed on the host OS? What kernel driver does it use?

That's hard to say because we do not maintain the host OS. From time to time we ask the admins to update the driver.
I will ask next year; if I trigger an update this week, it could leave the CI broken for the next few weeks.

@SimeonEhrig
Member

The containers should not be a problem. We use them only for caching. If a container with a specific ROCm version is not available, the CI will use another container and install ROCm each time the test is started.

Actually, you only need to add the version numbers to version.py and the CI will generate the tests. ROCm is a special case: you also need to add the clang version of each SDK version here:

if job[DEVICE_COMPILER][NAME] == HIPCC:
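
As a rough illustration of what such a branch might do, here is a minimal sketch of a lookup from ROCm SDK version to the clang version it ships. Everything except the quoted condition is an assumption made for illustration: the constant values, the `ROCM_CLANG_VERSION` table, the `set_hip_clang_version` helper, and the job layout are hypothetical and not taken from the actual job generator. The clang versions in the table are placeholders that should be verified against each ROCm release.

```python
# Hypothetical sketch; not the project's actual job generator code.
# Key names and string values below are assumed for illustration only.
NAME = "name"
VERSION = "version"
DEVICE_COMPILER = "device_compiler"
HOST_COMPILER = "host_compiler"
HIPCC = "hipcc"

# Assumed mapping: ROCm SDK version -> major version of the bundled clang.
# Verify these values against the ROCm release notes before relying on them.
ROCM_CLANG_VERSION = {
    "5.6": "16",
    "5.7": "17",
    "6.0": "17",
}


def set_hip_clang_version(job):
    """Set the host compiler of a hipcc job to the clang bundled with its ROCm SDK."""
    if job[DEVICE_COMPILER][NAME] == HIPCC:
        rocm_version = job[DEVICE_COMPILER][VERSION]
        if rocm_version not in ROCM_CLANG_VERSION:
            # Fail loudly so a newly added SDK version is not silently skipped.
            raise ValueError(f"no clang version registered for ROCm {rocm_version}")
        job[HOST_COMPILER] = {
            NAME: "clang",
            VERSION: ROCM_CLANG_VERSION[rocm_version],
        }
    return job
```

With a table like this, supporting a new SDK release would mean adding one entry, and a missing entry raises an error instead of producing a job with an undefined host compiler.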

I will open a PR and add the tests, but if a test fails, I do not have time to fix it at the moment.

@SimeonEhrig
Member

The PR is open: #2207
