-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OMP target implementation #2
base: master
Are you sure you want to change the base?
Conversation
Replaced all std math call with cmath
Default to Ofast
Add workaround for GCC producing runtime error at target data exit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lots of great stuff here. The kernels all look good to me.
Some specific comments throughout that it would be good to look over.
Thanks for the good work here, and the great documentation in the README about building it with different compilers.
Handle CMake <= 3.12 gracefully
Refactor/document buffer synchronisation macros
Only print OMP messages for boss rank
All comments have been addressed. This is ready for merge I think. |
Leave out native opt flags (to be added by the user)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sure you update the README with the new names. We should also probably remove CentOS 7 as a prerequisite, since the application should work everywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the justification for changing the binary name? It's clover_leaf
in all other implementations, so we should just leave that.
Ah, so for |
I think I'll go with the easy path now, which is to rename the kokkos' one to |
We won't change the reference versions, so all ports should match that. I guess the Kokkos one slipped through. |
This PR contains the full OMP target implementation, it has been tested on the following configurations with correct results:
GCC 10.1 with RadeonVII fails at link time with the following:
Note that GCC+RadeonVII worked correctly at 6456a30, it was then broken in d7857cd.
The changes between the two commits are non trivial so investigating this will take some time.
This PR will try to address the error if time allows, but it is good for review as is.