CTOParallelFor with BoxND / add AnyCTO (#4109)
## Summary

This PR adds support for BoxND to CTOParallelFor by adding the AnyCTO function, which can be used to implement compile-time options with any kernel-launching function such as ParallelFor, ParallelForRNG, launch, etc.

I'm not sure if AnyCTO is a good name; are there other suggestions?

## Additional background

AnyCTO Examples:

```C++
int A_runtime_option = ...;
int B_runtime_option = ...;
enum A_options : int { A0, A1, A2, A3 };
enum B_options : int { B0, B1 };
AnyCTO(TypeList<CompileTimeOptions<A0,A1,A2,A3>,
                CompileTimeOptions<B0,B1>>{},
    {A_runtime_option, B_runtime_option},
    [&](auto cto_func){
        ParallelForRNG(N, cto_func);
    },
    [=] AMREX_GPU_DEVICE (int i, const RandomEngine& engine,
                          auto A_control, auto B_control)
    {
        ...
        if constexpr (A_control.value == A0) {
            ...
        } else if constexpr (A_control.value == A1) {
            ...
        } else if constexpr (A_control.value == A2) {
            ...
        } else {
            ...
        }
        if constexpr (A_control.value != A3 && B_control.value == B1) {
            ...
        }
        ...
    }
);

constexpr int nthreads_per_block = ...;
int nblocks = ...;
AnyCTO(TypeList<CompileTimeOptions<A0,A1,A2,A3>,
                CompileTimeOptions<B0,B1>>{},
    {A_runtime_option, B_runtime_option},
    [&](auto cto_func){
        launch<nthreads_per_block>(nblocks, Gpu::gpuStream(), cto_func);
    },
    [=] AMREX_GPU_DEVICE (auto A_control, auto B_control){
        ...
    }
);
```

Additionally, .GetOptions() makes the selected compile-time options available in the function that launches the kernel:

```C++
int nthreads_per_block = ...;
AnyCTO(TypeList<CompileTimeOptions<128,256,512,1024>>{},
    {nthreads_per_block},
    [&](auto cto_func){
        constexpr std::array<int, 1> ctos = cto_func.GetOptions();
        constexpr int c_nthreads_per_block = ctos[0];
        ParallelFor<c_nthreads_per_block>(N, cto_func);
    },
    [=] AMREX_GPU_DEVICE (int i, auto){
        ...
    }
);

BoxND<6> box6D = ...;
int dims_needed = ...;
AnyCTO(TypeList<CompileTimeOptions<1,2,3,4,5,6>>{},
    {dims_needed},
    [&](auto cto_func){
        constexpr std::array<int, 1> ctos = cto_func.GetOptions();
        constexpr int c_dims_needed = ctos[0];
        const auto box = BoxShrink<c_dims_needed>(box6D);
        ParallelFor(box, cto_func);
    },
    [=] AMREX_GPU_DEVICE (auto intvect, auto) -> decltype(void(intvect.size())) {
        ...
    }
);
```
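For readers unfamiliar with the pattern, the following is a minimal standalone sketch of how a runtime integer can be mapped onto a compile-time constant that a kernel then tests with `if constexpr`. It is not AMReX's implementation of AnyCTO: the names `Options`, `dispatch_one`, `dispatch_two`, and `run` are hypothetical, there is no GPU launch, and the values assigned to `result` are arbitrary placeholders chosen only so the branches are distinguishable.

```cpp
#include <type_traits>

// Hypothetical stand-in for a list of allowed compile-time values.
template <int... Vs>
struct Options {};

// Compare the runtime value against each candidate V. On the first match,
// call f with std::integral_constant<int, V>, so V is a constant expression
// inside f. Returns false if no candidate matches.
template <class F, int... Vs>
bool dispatch_one(int runtime_value, Options<Vs...>, F&& f)
{
    return ((runtime_value == Vs
                 ? (f(std::integral_constant<int, Vs>{}), true)
                 : false) || ...);
}

// Two independent options: nest the single-option dispatch.
template <class F, int... As, int... Bs>
bool dispatch_two(int a, int b, Options<As...> oa, Options<Bs...> ob, F&& f)
{
    bool matched = false;
    dispatch_one(a, oa, [&](auto A) {
        matched = dispatch_one(b, ob, [&](auto B) { f(A, B); });
    });
    return matched;
}

enum A_options : int { A0, A1, A2, A3 };
enum B_options : int { B0, B1 };

// Placeholder "kernel": each (A, B) pair gets its own instantiation, and
// the branches not taken are discarded at compile time.
int run(int a, int b)
{
    int result = -1;  // stays -1 if (a, b) is not among the listed options
    dispatch_two(a, b, Options<A0, A1, A2, A3>{}, Options<B0, B1>{},
        [&](auto A_control, auto B_control) {
            if constexpr (A_control.value == A0) {
                result = 0;
            } else if constexpr (A_control.value == A1) {
                result = 10;
            } else {
                result = 20;
            }
            if constexpr (A_control.value != A3 && B_control.value == B1) {
                result += 1;
            }
        });
    return result;
}
```

In this sketch the callable is instantiated once per combination of options (here 4 x 2 = 8), which is the usual trade-off of the technique: branch-free kernels at the cost of more generated code. Requires C++17 for fold expressions and `if constexpr`.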
1 parent a31abb5, commit de4dc97
Showing 1 changed file with 174 additions and 87 deletions.