-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feature: Prepare to introduce RAJA for all loops #1018
feature: Prepare to introduce RAJA for all loops #1018
Conversation
36b61e3
to
e1dfc3d
Compare
Codecov Report
@@ Coverage Diff @@
## develop #1018 +/- ##
===========================================
+ Coverage 89.60% 89.61% +0.01%
===========================================
Files 141 142 +1
Lines 11195 11211 +16
===========================================
+ Hits 10031 10047 +16
Misses 1164 1164
📣 Codecov offers a browser extension for seamless coverage viewing on GitHub. Try it in Chrome or Firefox today! |
@white238 , can you take a look at this? |
src/serac/physics/heat_transfer.hpp
Outdated
*/ | ||
ThermalMaterialIntegrand(MaterialType material) : material_(material) {} | ||
|
||
// Due to nvcc's lack of support for generic lambdas (i.e. functions of the form |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this comment is misleading, as nvcc
does support generic lambdas, and it also supports their use in cuda kernels. For example,
https://godbolt.org/z/TM8zacGcM
What it doesn't support is the corner case of generic, extended lambdas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, this should be specified for extended lambdas.
src/serac/physics/heat_transfer.hpp
Outdated
*/ | ||
ThermalMaterialIntegrand(MaterialType material) : material_(material) {} | ||
|
||
// Due to nvcc's lack of support for generic lambdas (i.e. functions of the form |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this comment is misleading, as nvcc
does support generic lambdas, and it also supports their use in cuda kernels. For example,
https://godbolt.org/z/TM8zacGcM
What it doesn't support is the corner case of generic, extended lambdas.
src/serac/physics/heat_transfer.hpp
Outdated
class DummyArgumentType {}; | ||
static_assert(!std::is_invocable<MaterialType, DummyArgumentType&>::value); | ||
static_assert(!std::is_invocable<MaterialType, DummyArgumentType*>::value); | ||
static_assert(!std::is_invocable<MaterialType, DummyArgumentType>::value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not clear to me what this is checking. If we want to detect for generic extended lambdas, then we have a compiler specific intrinsic:
https://gist.github.com/samuelpmish/5679a488c8724b215e96e107569b4fe9#file-main-cu-L49
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this static assert will pass if given a functor but fail if given a generic extended lambda, you can try it out in godbolt if you want. This is not compiler specific, which in my opinion is better in this case because we want the option to be able to compile exclusively with clang or gcc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this static assert will pass if given a functor but fail if given a generic extended lambda, you can try it out in godbolt if you want.
it incorrectly disallows functors with generic operator()
definitions: https://godbolt.org/z/7nhEc1jYY
it incorrectly disallows non-extended lambdas: https://godbolt.org/z/rKrTcovcj
NVIDIA wrote a dedicated compiler intrinsic for checking exactly what we want to know, why not use it?
This is not compiler specific, which in my opinion is better
It's important to remember that the underlying limitation here is compiler specific, so imposing the restriction on all compilers isn't a benefit.
For instance, a user who wants to run their simulations on a CPU can (and probably should) use generic lambdas for their material / traction definitions, where appropriate. It doesn't seem right to take away a useful feature from all users, because one compiler doesn't support it.
we want the option to be able to compile exclusively with clang or gcc
Then we can guard the assertion behind conditional compilation checks that are only enabled for the compiler that cares
#if __NVCC__
static_assert(!__nv_is_extended_device_lambda_closure_type(MaterialType), "error: serac doesn't support extended lambdas with nvcc");
#endif
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NVIDIA wrote a dedicated compiler intrinsic for checking exactly what we want to know, why not use it?
If we guard the static assert behind an ifdef, users will be able to write generic lambdas for CPU versions of serac. Then, if they want to switch to using GPU-enabled serac, their code will require refactoring. In my opinion it is better to simplify the requirements to users, and force them to write templated functors. I suppose the generic functor error is a limitation, but I feel that putting the static assert behind the if guard will only complicate things for users and make serac seem less portable.
Ultimately, we want serac to be portable to CPU and GPU platforms. We want user code to be portable to CPU and GPU platforms as well. By forcing users to write templated functors, we are providing users with simplicity and guarantee of portability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just spoke offline with @jamiebramwell and she said that still allowing generic lambdas for CPU generic lambdas would permit faster development/be more convenient for users. I think this is a great point so we will go for @samuelpmishLLNL 's approach
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should document this difference very thoroughly in Doxygen and in Sphinx. Maybe even providing an example on how to convert from generic lambdas to the functor.
We should also be very careful in our unit testing. Defaulting to the non-lambda version should be the normal, but also add a couple test cases to ensure the generic CPU lambdas continue to work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just commenting here that I agree we should allow generic lambdas for rapid prototyping on non-nvcc compilers. We should just make sure that the vast majority of our examples and tests use the functor syntax as @white238 states.
…ns from other refactoring
534dffe
to
37eb4e1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent work @johnbowen42 !
src/serac/physics/heat_transfer.hpp
Outdated
*/ | ||
ThermalMaterialIntegrand(MaterialType material) : material_(material) {} | ||
|
||
// Due to nvcc's lack of support for generic lambdas (i.e. functions of the form |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, this should be specified for extended lambdas.
9be83ae
to
a1d55a8
Compare
First, here is some context on this PR. This PR is the first step of introducing RAJA support to serac. It is part of the larger PR here #987, and is a subset of that work. This PR does the following
(1) Eliminates multiple variadic template parameters from the enclosing methods of where RAJA::forall loops will be called. Introduces workarounds
(2) Adds static asserts on thermal integrands to prevent users from supplying extended lambdas as an integrand
(3) Adds SERAC_HOST_DEVICE annotations to necessary functions
(4) Adds a functor ThermalMaterialIntegrand instead of the generic extended lambda from before
This is a lightweight PR, and is a prerequisite for the RAJA work mentioned above.