feature: add RAJA kernel launches and basic CUDA support #1026
base: develop
Changes from 74 commits
The first changed file (`.gitignore`, judging by its patterns):

```diff
@@ -13,5 +13,6 @@
 *.orig
 __pycache__/
 view
 *.cache*
 /_serac_build_and_test*
+build-linux-*-*-*
```
The second changed file (an execution-space header; its path was not captured in this view):

```diff
@@ -12,7 +12,7 @@
  */

 #pragma once

+#include "RAJA/RAJA.hpp"
 #if defined(__CUDACC__)
 #define SERAC_HOST_DEVICE __host__ __device__
 #define SERAC_HOST __host__
@@ -72,6 +72,26 @@ enum class ExecutionSpace
   Dynamic  // Corresponds to execution that can "legally" happen on either the host or device
 };

+#ifdef SERAC_USE_CUDA_KERNEL_EVALUATION
+
+/// @brief Alias for parallel threads policy on GPU
+using threads_x = RAJA::LoopPolicy<RAJA::cuda_thread_x_direct>;
+using teams_e = RAJA::LoopPolicy<RAJA::cuda_block_x_direct>;
```
> **Review comment:** Should we use a more descriptive name than […]
>
> **Review comment:** It seems that […]
>
> **Review comment:** It would be better to not have to guess.
```diff
+using launch_policy = RAJA::LaunchPolicy<RAJA::cuda_launch_t<false>>;
+using forall_policy = RAJA::cuda_exec<128>;
+
+#else
+
+/// @brief Alias for parallel threads policy on GPU.
+using threads_x = RAJA::LoopPolicy<RAJA::seq_exec>;
+/// @brief Alias for number of teams for GPU kernel launches.
+using teams_e = RAJA::LoopPolicy<RAJA::seq_exec>;
+/// @brief Alias for GPU kernel launch policy.
+using launch_policy = RAJA::LaunchPolicy<RAJA::seq_launch_t>;
```
> **Review comment:** Suggested change […]
```diff
+using forall_policy = RAJA::seq_exec;
```
> **Review comment:** Have you thought about also defining OpenMP policies to take advantage of multi-threading on normal multi-core CPUs? Or enabling SIMD vectorization with a `RAJA::simd_exec` policy? And what about policies for El Capitan, which doesn't use CUDA?
```diff
+
+#endif
+
 /**
  * @brief The default execution space for serac builds
  */
```
```diff
@@ -88,6 +108,13 @@ struct execution_to_memory {
   static constexpr axom::MemorySpace value = axom::MemorySpace::Dynamic;
 };

+/// @brief This helper is needed to suppress -Werror compilation errors caused by the
+/// explicit captures in the main execution lambdas.
+template <typename... T>
+SERAC_HOST_DEVICE void suppress_unused_capture_warnings(T...)
+{
+}
+
 #ifdef SERAC_USE_UMPIRE
 /// @overload
 template <>
```