Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature: Prepare to introduce RAJA for all loops #1018

Merged

Conversation

johnbowen42
Copy link
Contributor

First, here is some context on this PR. This PR is the first step of introducing RAJA support to serac. It is part of the larger PR here #987, and is a subset of that work. This PR does the following
(1) Eliminates multiple variadic template parameters from the enclosing methods of where RAJA::forall loops will be called. Introduces workarounds
(2) Adds static asserts on thermal integrands to prevent users from supplying extended lambdas as an integrand
(3) Adds SERAC_HOST_DEVICE annotations to necessary functions
(4) Adds a functor ThermalMaterialIntegrand instead of the generic extended lambda from before

This is a lightweight PR, and is a prerequisite for the RAJA work mentioned above.

@johnbowen42 johnbowen42 force-pushed the feature/bowen/refactor-for-extended-host-device-lambdas branch from 36b61e3 to e1dfc3d Compare October 25, 2023 22:19
@codecov-commenter
Copy link

codecov-commenter commented Oct 26, 2023

Codecov Report

Merging #1018 (a1d55a8) into develop (ea7ab0f) will increase coverage by 0.01%.
Report is 3 commits behind head on develop.
The diff coverage is 98.71%.

@@             Coverage Diff             @@
##           develop    #1018      +/-   ##
===========================================
+ Coverage    89.60%   89.61%   +0.01%     
===========================================
  Files          141      142       +1     
  Lines        11195    11211      +16     
===========================================
+ Hits         10031    10047      +16     
  Misses        1164     1164              
Files Coverage Δ
.../numerics/functional/boundary_integral_kernels.hpp 100.00% <100.00%> (ø)
...serac/numerics/functional/detail/hexahedron_H1.inl 100.00% <100.00%> (ø)
...ac/numerics/functional/detail/hexahedron_Hcurl.inl 100.00% <ø> (ø)
...serac/numerics/functional/detail/hexahedron_L2.inl 100.00% <100.00%> (ø)
src/serac/numerics/functional/detail/qoi.inl 100.00% <100.00%> (ø)
...ac/numerics/functional/detail/quadrilateral_H1.inl 100.00% <100.00%> (ø)
...numerics/functional/detail/quadrilateral_Hcurl.inl 100.00% <ø> (ø)
...ac/numerics/functional/detail/quadrilateral_L2.inl 71.42% <100.00%> (ø)
...rc/serac/numerics/functional/detail/segment_H1.inl 100.00% <100.00%> (ø)
...rc/serac/numerics/functional/detail/segment_L2.inl 68.57% <100.00%> (ø)
... and 9 more

📣 Codecov offers a browser extension for seamless coverage viewing on GitHub. Try it in Chrome or Firefox today!

@jamiebramwell
Copy link
Member

@white238 , can you take a look at this?

*/
ThermalMaterialIntegrand(MaterialType material) : material_(material) {}

// Due to nvcc's lack of support for generic lambdas (i.e. functions of the form
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this comment is misleading, as nvcc does support generic lambdas, and it also supports their use in cuda kernels. For example,

https://godbolt.org/z/TM8zacGcM

What it doesn't support is the corner case of generic, extended lambdas.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, this should be specified for extended lambdas.

src/serac/numerics/functional/domain_integral_kernels.hpp Outdated Show resolved Hide resolved
src/serac/numerics/functional/domain_integral_kernels.hpp Outdated Show resolved Hide resolved
*/
ThermalMaterialIntegrand(MaterialType material) : material_(material) {}

// Due to nvcc's lack of support for generic lambdas (i.e. functions of the form
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this comment is misleading, as nvcc does support generic lambdas, and it also supports their use in cuda kernels. For example,

https://godbolt.org/z/TM8zacGcM

What it doesn't support is the corner case of generic, extended lambdas.

Comment on lines 306 to 310
class DummyArgumentType {};
static_assert(!std::is_invocable<MaterialType, DummyArgumentType&>::value);
static_assert(!std::is_invocable<MaterialType, DummyArgumentType*>::value);
static_assert(!std::is_invocable<MaterialType, DummyArgumentType>::value);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear to me what this is checking. If we want to detect for generic extended lambdas, then we have a compiler specific intrinsic:

https://gist.github.com/samuelpmish/5679a488c8724b215e96e107569b4fe9#file-main-cu-L49

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this static assert will pass if given a functor but fail if given a generic extended lambda, you can try it out in godbolt if you want. This is not compiler specific, which in my opinion is better in this case because we want the option to be able to compile exclusively with clang or gcc

Copy link
Contributor

@samuelpmishLLNL samuelpmishLLNL Nov 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this static assert will pass if given a functor but fail if given a generic extended lambda, you can try it out in godbolt if you want.

it incorrectly disallows functors with generic operator() definitions: https://godbolt.org/z/7nhEc1jYY

it incorrectly disallows non-extended lambdas: https://godbolt.org/z/rKrTcovcj

NVIDIA wrote a dedicated compiler intrinsic for checking exactly what we want to know, why not use it?


This is not compiler specific, which in my opinion is better

It's important to remember that the underlying limitation here is compiler specific, so imposing the restriction on all compilers isn't a benefit.

For instance, a user who wants to run their simulations on a CPU can (and probably should) use generic lambdas for their material / traction definitions, where appropriate. It doesn't seem right to take away a useful feature from all users, because one compiler doesn't support it.


we want the option to be able to compile exclusively with clang or gcc

Then we can guard the assertion behind conditional compilation checks that are only enabled for the compiler that cares

    #if __NVCC__
    static_assert(!__nv_is_extended_device_lambda_closure_type(MaterialType), "error: serac doesn't support extended lambdas with nvcc");
    #endif

see: https://godbolt.org/z/9n9oxWbeb

Copy link
Contributor Author

@johnbowen42 johnbowen42 Nov 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NVIDIA wrote a dedicated compiler intrinsic for checking exactly what we want to know, why not use it?

If we guard the static assert behind an ifdef, users will be able to write generic lambdas for CPU versions of serac. Then, if they want to switch to using GPU-enabled serac, their code will require refactoring. In my opinion it is better to simplify the requirements to users, and force them to write templated functors. I suppose the generic functor error is a limitation, but I feel that putting the static assert behind the if guard will only complicate things for users and make serac seem less portable.

Ultimately, we want serac to be portable to CPU and GPU platforms. We want user code to be portable to CPU and GPU platforms as well. By forcing users to write templated functors, we are providing users with simplicity and guarantee of portability.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just spoke offline with @jamiebramwell and she said that still allowing generic lambdas for CPU generic lambdas would permit faster development/be more convenient for users. I think this is a great point so we will go for @samuelpmishLLNL 's approach

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should document this difference very thoroughly in Doxygen and in Sphinx. Maybe even providing an example on how to convert from generic lambdas to the functor.

We should also be very careful in our unit testing. Defaulting to the non-lambda version should be the normal, but also add a couple test cases to ensure the generic CPU lambdas continue to work.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just commenting here that I agree we should allow generic lambdas for rapid prototyping on non-nvcc compilers. We should just make sure that the vast majority of our examples and tests use the functor syntax as @white238 states.

@johnbowen42 johnbowen42 force-pushed the feature/bowen/refactor-for-extended-host-device-lambdas branch from 534dffe to 37eb4e1 Compare November 1, 2023 22:50
Copy link
Member

@jamiebramwell jamiebramwell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent work @johnbowen42 !

src/serac/numerics/functional/domain_integral_kernels.hpp Outdated Show resolved Hide resolved
*/
ThermalMaterialIntegrand(MaterialType material) : material_(material) {}

// Due to nvcc's lack of support for generic lambdas (i.e. functions of the form
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, this should be specified for extended lambdas.

@johnbowen42 johnbowen42 force-pushed the feature/bowen/refactor-for-extended-host-device-lambdas branch from 9be83ae to a1d55a8 Compare November 2, 2023 23:48
@johnbowen42 johnbowen42 enabled auto-merge November 2, 2023 23:53
@johnbowen42 johnbowen42 merged commit 516f733 into develop Nov 3, 2023
@white238 white238 deleted the feature/bowen/refactor-for-extended-host-device-lambdas branch December 4, 2023 21:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants