feature: Prepare to introduce RAJA for all loops #1018

johnbowen42 · 2023-10-25T22:12:59Z

First, some context on this PR. It is the first step toward introducing RAJA support in serac, and it is a subset of the work in the larger PR #987. This PR does the following:
(1) Eliminates multiple variadic template parameters from the methods that will enclose calls to RAJA::forall loops, and introduces workarounds
(2) Adds static asserts on thermal integrands to prevent users from supplying extended lambdas as an integrand
(3) Adds SERAC_HOST_DEVICE annotations to the functions that need them
(4) Adds a functor, ThermalMaterialIntegrand, to replace the generic extended lambda used before

This is a lightweight PR, and is a prerequisite for the RAJA work mentioned above.
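
Item (4) can be pictured with a small sketch. This is not serac's actual implementation: `SERAC_HOST_DEVICE` is defined locally as a stand-in, and `ThermalMaterialIntegrandSketch`, `LinearConductor`, `example_flux`, and the call-operator signature are hypothetical names chosen for illustration. The idea shown is a named functor with a templated call operator taking the place of a generic lambda that would otherwise capture the material.

```cpp
#include <cassert>

// Stand-in for serac's annotation macro (assumption: it expands to
// __host__ __device__ under a CUDA compiler and to nothing otherwise).
#if defined(__CUDACC__)
#define SERAC_HOST_DEVICE __host__ __device__
#else
#define SERAC_HOST_DEVICE
#endif

// Hypothetical material: heat flux proportional to the temperature gradient.
struct LinearConductor {
  double k;
  SERAC_HOST_DEVICE double operator()(double /*temperature*/, double grad_temperature) const
  {
    return -k * grad_temperature;
  }
};

// Hypothetical wrapper: stores the material by value and forwards its
// arguments, so no lambda closure type is involved.
template <typename MaterialType>
struct ThermalMaterialIntegrandSketch {
  ThermalMaterialIntegrandSketch(MaterialType material) : material_(material) {}

  // A templated call operator plays the role of the lambda's `auto` parameters.
  template <typename T, typename G>
  SERAC_HOST_DEVICE auto operator()(T temperature, G grad_temperature) const
  {
    return material_(temperature, grad_temperature);
  }

 private:
  MaterialType material_;
};

// Usage on the host: wrap a material and evaluate it once.
inline double example_flux()
{
  ThermalMaterialIntegrandSketch<LinearConductor> integrand(LinearConductor{2.0});
  return integrand(300.0, 1.5);  // computes -2.0 * 1.5
}
```

Because the functor is an ordinary class type, it can be passed to code compiled by any compiler without relying on how that compiler treats lambda closure types.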

codecov-commenter · 2023-10-26T17:09:14Z

Codecov Report

Merging #1018 (a1d55a8) into develop (ea7ab0f) will increase coverage by 0.01%.
Report is 3 commits behind head on develop.
The diff coverage is 98.71%.

@@             Coverage Diff             @@
##           develop    #1018      +/-   ##
===========================================
+ Coverage    89.60%   89.61%   +0.01%     
===========================================
  Files          141      142       +1     
  Lines        11195    11211      +16     
===========================================
+ Hits         10031    10047      +16     
  Misses        1164     1164

Files	Coverage Δ
.../numerics/functional/boundary_integral_kernels.hpp	`100.00% <100.00%> (ø)`
...serac/numerics/functional/detail/hexahedron_H1.inl	`100.00% <100.00%> (ø)`
...ac/numerics/functional/detail/hexahedron_Hcurl.inl	`100.00% <ø> (ø)`
...serac/numerics/functional/detail/hexahedron_L2.inl	`100.00% <100.00%> (ø)`
src/serac/numerics/functional/detail/qoi.inl	`100.00% <100.00%> (ø)`
...ac/numerics/functional/detail/quadrilateral_H1.inl	`100.00% <100.00%> (ø)`
...numerics/functional/detail/quadrilateral_Hcurl.inl	`100.00% <ø> (ø)`
...ac/numerics/functional/detail/quadrilateral_L2.inl	`71.42% <100.00%> (ø)`
...rc/serac/numerics/functional/detail/segment_H1.inl	`100.00% <100.00%> (ø)`
...rc/serac/numerics/functional/detail/segment_L2.inl	`68.57% <100.00%> (ø)`
... and 9 more


jamiebramwell · 2023-10-31T20:28:10Z

@white238 , can you take a look at this?

src/serac/numerics/functional/tests/CMakeLists.txt

white238 · 2023-10-31T21:10:10Z


src/serac/numerics/functional/boundary_integral_kernels.hpp

src/serac/numerics/functional/domain_integral_kernels.hpp

samuelpmishLLNL · 2023-11-01T17:26:59Z

src/serac/physics/heat_transfer.hpp

+     */
+    ThermalMaterialIntegrand(MaterialType material) : material_(material) {}
+
+    // Due to nvcc's lack of support for generic lambdas (i.e. functions of the form


This comment is misleading: nvcc does support generic lambdas, and it also supports their use in CUDA kernels. For example,

https://godbolt.org/z/TM8zacGcM

What it doesn't support is the corner case of generic, extended lambdas.

samuelpmishLLNL · 2023-11-01T17:32:29Z

src/serac/physics/heat_transfer.hpp

+    class DummyArgumentType {};
+    static_assert(!std::is_invocable<MaterialType, DummyArgumentType&>::value);
+    static_assert(!std::is_invocable<MaterialType, DummyArgumentType*>::value);
+    static_assert(!std::is_invocable<MaterialType, DummyArgumentType>::value);


It's not clear to me what this is checking. If we want to detect generic extended lambdas, then we have a compiler-specific intrinsic:

https://gist.github.com/samuelpmish/5679a488c8724b215e96e107569b4fe9#file-main-cu-L49

this static assert will pass if given a functor but fail if given a generic extended lambda, you can try it out in godbolt if you want. This is not compiler specific, which in my opinion is better in this case because we want the option to be able to compile exclusively with clang or gcc
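
To make the behavior of this check concrete, here is a minimal sketch of the same `std::is_invocable` probe applied to three kinds of callable. `accepts_dummy_argument`, `ConcreteMaterial`, `TemplatedMaterial`, and `generic_lambda` are hypothetical names, and the callables are deliberately trivial so that the probe can be evaluated. It needs C++17.

```cpp
#include <type_traits>

// Same probing idea as the three assertions quoted above: an argument type
// that no real material is written to accept.
class DummyArgumentType {};

// True when the callable can be invoked with the dummy type in any of the
// three forms used by the assertions.
template <typename MaterialType>
constexpr bool accepts_dummy_argument =
    std::is_invocable<MaterialType, DummyArgumentType&>::value ||
    std::is_invocable<MaterialType, DummyArgumentType*>::value ||
    std::is_invocable<MaterialType, DummyArgumentType>::value;

// Hypothetical functor with a concrete signature.
struct ConcreteMaterial {
  double operator()(double temperature) const { return 2.0 * temperature; }
};

// Hypothetical functor with a templated (generic) call operator.
struct TemplatedMaterial {
  template <typename T>
  T operator()(T temperature) const
  {
    return temperature;
  }
};

// A plain generic lambda (not an extended lambda).
const auto generic_lambda = [](auto temperature) { return temperature; };
using GenericLambda = decltype(generic_lambda);

// The concrete functor passes the check (it rejects the dummy argument).
static_assert(!accepts_dummy_argument<ConcreteMaterial>, "concrete functor passes");

// The generic lambda accepts the dummy argument, so the check would reject it.
static_assert(accepts_dummy_argument<GenericLambda>, "generic lambda is flagged");

// A functor with a templated operator() is flagged in the same way.
static_assert(accepts_dummy_argument<TemplatedMaterial>, "templated functor is flagged");
```

The probe answers "can this be called with an arbitrary type?", which is a property of the call operator and not of how the closure was created.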

> this static assert will pass if given a functor but fail if given a generic extended lambda, you can try it out in godbolt if you want.

It incorrectly disallows functors with generic operator() definitions: https://godbolt.org/z/7nhEc1jYY

It incorrectly disallows non-extended lambdas: https://godbolt.org/z/rKrTcovcj

NVIDIA wrote a dedicated compiler intrinsic for checking exactly what we want to know, why not use it?

> This is not compiler specific, which in my opinion is better

It's important to remember that the underlying limitation here is compiler specific, so imposing the restriction on all compilers isn't a benefit.

For instance, a user who wants to run their simulations on a CPU can (and probably should) use generic lambdas for their material / traction definitions, where appropriate. It doesn't seem right to take away a useful feature from all users, because one compiler doesn't support it.

> we want the option to be able to compile exclusively with clang or gcc

Then we can guard the assertion behind conditional compilation checks that are only enabled for the compiler that cares:

    #if __NVCC__
    static_assert(!__nv_is_extended_device_lambda_closure_type(MaterialType), "error: serac doesn't support extended lambdas with nvcc");
    #endif

See: https://godbolt.org/z/9n9oxWbeb
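
As a sketch of how the guarded assertion could sit inside a template, the snippet below wraps it in a helper and calls it from a function that accepts a material. `check_material_type`, `evaluate`, and `cpu_generic_lambda_example` are hypothetical names, not serac API. The intrinsic is only referenced when `__NVCC__` is defined, so the same file builds unchanged with gcc or clang.

```cpp
#include <cassert>

// Hypothetical helper: the assertion is only compiled when nvcc is the compiler.
template <typename MaterialType>
constexpr bool check_material_type()
{
#if defined(__NVCC__)
  // nvcc-only type trait: true for closure types of extended __device__ lambdas.
  static_assert(!__nv_is_extended_device_lambda_closure_type(MaterialType),
                "error: serac doesn't support extended lambdas with nvcc");
#endif
  return true;
}

// Hypothetical entry point that accepts any callable material.
template <typename MaterialType>
double evaluate(MaterialType material, double temperature)
{
  static_assert(check_material_type<MaterialType>(), "material type check");
  return material(temperature);
}

// With a host compiler the guard is inactive, so a generic lambda is accepted.
inline double cpu_generic_lambda_example()
{
  auto material = [](auto temperature) { return 2.0 * temperature; };
  return evaluate(material, 4.0);  // computes 2.0 * 4.0
}
```

The trade-off discussed in this thread still applies: code that passes this check on a CPU build may need refactoring before it compiles with nvcc.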

> NVIDIA wrote a dedicated compiler intrinsic for checking exactly what we want to know, why not use it?

If we guard the static assert behind an ifdef, users will be able to write generic lambdas for CPU versions of serac. Then, if they want to switch to GPU-enabled serac, their code will require refactoring. In my opinion it is better to simplify the requirements for users and force them to write templated functors. I suppose the generic functor error is a limitation, but I feel that putting the static assert behind the ifdef guard will only complicate things for users and make serac seem less portable.

Ultimately, we want serac to be portable to CPU and GPU platforms. We want user code to be portable to CPU and GPU platforms as well. By forcing users to write templated functors, we are providing users with simplicity and a guarantee of portability.

I just spoke offline with @jamiebramwell, and she said that still allowing generic lambdas on CPU would permit faster development and be more convenient for users. I think this is a great point, so we will go with @samuelpmishLLNL's approach.

We should document this difference very thoroughly in Doxygen and in Sphinx, maybe even providing an example of how to convert from a generic lambda to a functor.

We should also be very careful in our unit testing. Defaulting to the non-lambda version should be the norm, but we should also add a couple of test cases to ensure the generic CPU lambdas continue to work.
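
A conversion example of the kind suggested here might look like the following sketch. The material model and the names `IsotropicConductor`, `flux_from_lambda`, and `flux_from_functor` are hypothetical, and `SERAC_HOST_DEVICE` is defined locally as a stand-in for serac's macro. The two helper functions give a CPU test something to compare, so the lambda form stays covered.

```cpp
#include <cassert>

// Stand-in for serac's annotation macro (assumption, defined here so the
// snippet is self-contained).
#if defined(__CUDACC__)
#define SERAC_HOST_DEVICE __host__ __device__
#else
#define SERAC_HOST_DEVICE
#endif

// Before: a generic lambda, convenient for prototyping on CPU-only builds.
inline double flux_from_lambda(double conductivity, double grad_temperature)
{
  auto material = [conductivity](auto grad_u) { return -conductivity * grad_u; };
  return material(grad_temperature);
}

// After: the capture becomes a data member and the `auto` parameter becomes
// a template parameter of operator().
struct IsotropicConductor {
  double conductivity;

  template <typename GradType>
  SERAC_HOST_DEVICE auto operator()(GradType grad_u) const
  {
    return -conductivity * grad_u;
  }
};

inline double flux_from_functor(double conductivity, double grad_temperature)
{
  IsotropicConductor material{conductivity};
  return material(grad_temperature);
}
```

A unit test can then assert that both forms agree for the same inputs on a host build, while examples and GPU tests use only the functor form.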

Just commenting here that I agree we should allow generic lambdas for rapid prototyping on non-nvcc compilers. We should just make sure that the vast majority of our examples and tests use the functor syntax as @white238 states.


jamiebramwell

Excellent work @johnbowen42 !

src/serac/numerics/functional/boundary_integral_kernels.hpp

src/serac/numerics/functional/domain_integral_kernels.hpp

src/serac/numerics/functional/function_signature.hpp

jamiebramwell · 2023-11-02T17:09:45Z

src/serac/physics/heat_transfer.hpp

+     */
+    ThermalMaterialIntegrand(MaterialType material) : material_(material) {}
+
+    // Due to nvcc's lack of support for generic lambdas (i.e. functions of the form


I agree, this should be specified for extended lambdas.


johnbowen42 force-pushed the feature/bowen/refactor-for-extended-host-device-lambdas branch from 36b61e3 to e1dfc3d on October 25, 2023 22:19

johnbowen42 requested review from samuelpmishLLNL, jamiebramwell and white238 October 25, 2023 22:44

white238 reviewed Oct 31, 2023

src/serac/numerics/functional/tests/CMakeLists.txt (outdated)

white238 reviewed Oct 31, 2023

white238 approved these changes Oct 31, 2023

white238 reviewed Oct 31, 2023

src/serac/numerics/functional/boundary_integral_kernels.hpp (outdated)

samuelpmishLLNL reviewed Nov 1, 2023

johnbowen42 and others added 10 commits November 1, 2023 15:49

Refactor extended lambda and enable RAJA usage inside domain integral kernels

65f8fbe

Add static assert for generic lambdas

3aef667

Add docs string and limit visiblity of class used by static assert.

6fa35e5

Exercise basic CUDA enabled functionality with RAJA

b30e904

Clang format

3a204ea

Cleanup print statements and commented code

cf6125e

Separate out host device declarations and template parameter deductions from other refactoring

830a7df

Fix build issues

55ad49f

Add maybe unused flag to empty trial element

d1fcd28

Add comments explaining trial_elements_tuple and refactor for clarity. Clang format

37eb4e1

johnbowen42 force-pushed the feature/bowen/refactor-for-extended-host-device-lambdas branch from 534dffe to 37eb4e1 on November 1, 2023 22:50

Rename some variables and types

5931dc5

jamiebramwell reviewed Nov 2, 2023


johnbowen42 added 2 commits November 2, 2023 12:40

Move variables back to constexpr

cf090f5

style

a1d55a8

johnbowen42 force-pushed the feature/bowen/refactor-for-extended-host-device-lambdas branch from 9be83ae to a1d55a8 on November 2, 2023 23:48

johnbowen42 enabled auto-merge November 2, 2023 23:53

Merge branch 'develop' into feature/bowen/refactor-for-extended-host-device-lambdas

98a5462

jamiebramwell approved these changes Nov 3, 2023


johnbowen42 merged commit 516f733 into develop Nov 3, 2023

white238 deleted the feature/bowen/refactor-for-extended-host-device-lambdas branch December 4, 2023 21:55

white238 mentioned this pull request Jan 29, 2024

investigate if RAJA works without --extended-lambda flag #659

Closed
