[rocm6.4_internal_testing] [NAVI32] Skipped sdpa_2 test in test_aot_inductor for Navi32 #1882

Merged

Conversation


@iupaikov-amd iupaikov-amd commented Feb 5, 2025

The test fails with the assertion error "Tensors are not close".

After testing, I can confirm that the failure comes from eager-mode execution on Navi32 during the test_sdpa_2 run. I cross-referenced Navi31, Navi32, and MI300: the AOTInductor results are identical across all three architectures, and only the eager-mode result on Navi32 fails, with a 1.5% difference in the tensor values from the GPU run. I assume this is caused either by fp16-fp32-fp16 conversions in eager mode or by missing Navi32-specific branches (if-statements).
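To make the fp16-fp32-fp16 hypothesis concrete, here is a minimal pure-Python sketch of how the accumulation precision alone can shift a result by a visible fraction. It is not the eager-mode kernel and does not reproduce the Navi32 behaviour. The function names and the input data are hypothetical, and `struct`'s half-precision format stands in for real fp16 hardware arithmetic.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision (fp16) value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def dot_wide_accumulate(a, b):
    """fp16 inputs, wide accumulator, a single rounding to fp16 at the end."""
    acc = 0.0  # Python floats are fp64; this stands in for an fp32 accumulator
    for x, y in zip(a, b):
        acc += to_fp16(x) * to_fp16(y)
    return to_fp16(acc)


def dot_fp16_accumulate(a, b):
    """fp16 inputs, with every product and partial sum rounded back to fp16."""
    acc = 0.0
    for x, y in zip(a, b):
        product = to_fp16(to_fp16(x) * to_fp16(y))
        acc = to_fp16(acc + product)
    return acc


# Hypothetical input: a long reduction over identical small values.
a = [0.1] * 2000
b = [0.1] * 2000

wide = dot_wide_accumulate(a, b)
narrow = dot_fp16_accumulate(a, b)
rel_diff = abs(wide - narrow) / abs(wide)
print(f"wide={wide} narrow={narrow} relative difference={rel_diff:.2%}")
```

Both functions see the same fp16 inputs; they differ only in where rounding happens. If two code paths for the same operator differ in this way, a percent-level mismatch is plausible, but that remains an assumption until the eager path on Navi32 is inspected.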

A simple reproducer for comparing the values from the CPU, GPU, eager, and AOTI runs is attached:
gfx1101_test_sdpa_2_issue_reproducer.txt
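The attachment itself is not reproduced here. As a rough sketch of the kind of comparison described above, the following pure-Python snippet reports the worst absolute and relative differences of each run against a reference, plus an allclose-style verdict. The helper name, the tolerances, and every number are placeholders chosen for illustration, not measurements from any GPU.

```python
def compare(reference, candidate, rtol=1e-3, atol=1e-3):
    """Summarize how far candidate values are from reference values.

    The closeness rule is |c - r| <= atol + rtol * |r| for every element.
    """
    worst_abs = 0.0
    worst_rel = 0.0
    close = True
    for r, c in zip(reference, candidate):
        abs_diff = abs(c - r)
        worst_abs = max(worst_abs, abs_diff)
        if r != 0:
            worst_rel = max(worst_rel, abs_diff / abs(r))
        if abs_diff > atol + rtol * abs(r):
            close = False
    return {"max_abs": worst_abs, "max_rel": worst_rel, "close": close}


# Placeholder values only: the "eager" run is deliberately offset by 1.5%.
cpu_reference = [0.5, -1.25, 2.0]
runs = {
    "aoti_gpu": [0.5, -1.25, 2.0],
    "eager_gpu": [v * 1.015 for v in cpu_reference],
}

results = {name: compare(cpu_reference, values) for name, values in runs.items()}
for name, summary in results.items():
    print(name, summary)
```

Comparing every run against one shared reference makes it easier to see which execution mode is the outlier, rather than only that two of them disagree.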


rocm-repo-management-api bot commented Feb 5, 2025

Jenkins build for 5d647c36630d8d201cfe8a29820943bc4c2191a2 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts


jataylo commented Feb 5, 2025

@iupaikov-amd please add a PR description explaining the justification for skipping the test


rocm-repo-management-api bot commented Feb 5, 2025

Jenkins build for 5d647c36630d8d201cfe8a29820943bc4c2191a2 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

@iupaikov-amd (Author)

Added description


rocm-repo-management-api bot commented Feb 7, 2025

Jenkins build for 5d647c36630d8d201cfe8a29820943bc4c2191a2 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts


rocm-repo-management-api bot commented Feb 8, 2025

Jenkins build for 5d647c36630d8d201cfe8a29820943bc4c2191a2 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts


rocm-repo-management-api bot commented Feb 11, 2025

Jenkins build for 5d647c36630d8d201cfe8a29820943bc4c2191a2 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts


rocm-repo-management-api bot commented Feb 18, 2025

Jenkins build for 5d647c36630d8d201cfe8a29820943bc4c2191a2 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

@pruthvistony pruthvistony merged commit 896c789 into rocm6.4_internal_testing Feb 19, 2025
9 of 13 checks passed
@pruthvistony pruthvistony deleted the iupaikov_test_sdpa_2_skip_rocm6.4 branch February 19, 2025 17:23