Add ray tracing benchmark #842
The 33MB csv file is too large for merging.
benchmarks/rays/rays.cpp
Outdated
std::pair<Kokkos::View<ArborX::Experimental::Ray[n_rays], MemorySpace>,
          Kokkos::View<double[n_rays], MemorySpace>>
Is there a reason to use non-dynamic views?
benchmarks/rays/rays.cpp
Outdated
Kokkos::View<ArborX::Nearest<ArborX::Experimental::Ray> *, MemorySpace>
    queries(Kokkos::view_alloc("queries", Kokkos::WithoutInitializing),
            rays.size());
Kokkos::parallel_for(
    "Ray::Build_queries",
    Kokkos::RangePolicy<ExecutionSpace>(0, rays.extent(0)),
    KOKKOS_LAMBDA(int i) {
      queries(i) = ArborX::nearest<ArborX::Experimental::Ray>(rays(i), 1);
    });
Maybe use access traits instead?
benchmarks/rays/rays.cpp
Outdated
Kokkos::View<ArborX::Box[n_divisions_x * n_divisions_y * n_divisions_z],
             MemorySpace>
    bounding_boxes(
        Kokkos::view_alloc("bounding_boxes", Kokkos::WithoutInitializing));
You would want to use execution space here too, I think.
benchmarks/rays/rays.cpp
Outdated
constexpr int n_divisions_x = 100;
constexpr int n_divisions_y = 100;
constexpr int n_divisions_z = 100;
Why not simply nx, ny, nz?
benchmarks/rays/rays.cpp
Outdated
MPI_Barrier(comm);

// Build the distributed tree
Kokkos::Profiling::pushRegion("Ray::Setup");
Maybe prefix labels in the file with Example::?
benchmarks/rays/rays.cpp
Outdated
std::ifstream file;
file.open(filename);
Suggested change:
- std::ifstream file;
- file.open(filename);
+ std::ifstream file(filename);
You would want to have a check for failure.
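For illustration, a minimal sketch of what such a check could look like. The helper name `open_or_throw` is hypothetical and not part of the PR; throwing an exception is only one option, and the benchmark may prefer to print an error and abort through MPI instead.

```cpp
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the PR): open a file for reading and
// throw if the stream could not be opened.
inline std::ifstream open_or_throw(std::string const &filename)
{
  std::ifstream file(filename);
  if (!file.is_open())
    throw std::runtime_error("Cannot open file: " + filename);
  return file;
}
```

The caller would then write `std::ifstream file = open_or_throw(filename);` and can assume the stream is usable afterwards.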
benchmarks/rays/rays.cpp
Outdated
std::ifstream file;
file.open(filename);
std::string line;
std::getline(file, line);
Add a comment about why the first line is ignored.
This is a benchmark based on an adamantine use case. We have an IR camera which produces a file of rays and temperatures. We then need to find the first intersection of each ray with the mesh. In adamantine, we are interested in the exact coordinates of the intersections, but I've removed that computation here since it doesn't use ArborX. The image contains about 300,000 rays, some of which miss the mesh. Increasing the number of processors increases the number of cells, while the number of rays stays constant. All the processors read the camera file.
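To make "first intersection" concrete, here is a brute-force, single-process sketch. It is not what the benchmark does: the benchmark runs ArborX nearest queries (k = 1) against a distributed tree, whereas this loops over every box. The `Ray` and `Box` structs and both function names are hypothetical stand-ins, not the ArborX types, and the ray-box test is the standard slab method.

```cpp
#include <algorithm>
#include <array>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical types for illustration only; these are not the ArborX types.
struct Ray
{
  std::array<double, 3> origin;
  std::array<double, 3> direction;
};
struct Box
{
  std::array<double, 3> min_corner;
  std::array<double, 3> max_corner;
};

// Slab test. Returns true if the ray hits the box and, in that case,
// stores the distance along the ray to the entry point in t_entry.
inline bool intersects(Ray const &ray, Box const &box, double &t_entry)
{
  double t_min = 0.;
  double t_max = std::numeric_limits<double>::infinity();
  for (int d = 0; d < 3; ++d)
  {
    if (ray.direction[d] == 0.)
    {
      // Ray parallel to this slab: it misses if the origin lies outside.
      if (ray.origin[d] < box.min_corner[d] ||
          ray.origin[d] > box.max_corner[d])
        return false;
      continue;
    }
    double t0 = (box.min_corner[d] - ray.origin[d]) / ray.direction[d];
    double t1 = (box.max_corner[d] - ray.origin[d]) / ray.direction[d];
    if (t0 > t1)
      std::swap(t0, t1);
    t_min = std::max(t_min, t0);
    t_max = std::min(t_max, t1);
    if (t_min > t_max)
      return false;
  }
  t_entry = t_min;
  return true;
}

// Index of the first box hit by the ray, or -1 if the ray misses them all.
inline int first_hit(Ray const &ray, std::vector<Box> const &boxes)
{
  int best = -1;
  double best_t = std::numeric_limits<double>::infinity();
  for (int i = 0; i < static_cast<int>(boxes.size()); ++i)
  {
    double t = 0.;
    if (intersects(ray, boxes[i], t) && t < best_t)
    {
      best_t = t;
      best = i;
    }
  }
  return best;
}
```

The -1 return value corresponds to the rays that miss the mesh.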
Here is part of the profiling (bottom-up and top-down trees) using 4 processors.