Added new ``drjit.scatter_inc()`` operation for stream compaction
This commit adds a new and relatively advanced Dr.Jit operation named ``drjit.scatter_inc()`` that atomically increments a value within a ``uint32``-typed Dr.Jit array. It works just like the standard ``drjit.scatter_reduce()`` operation for 32-bit unsigned integer operands, but with a fixed ``value=1`` parameter and ``reduce_op=ReduceOp::Add``.

The main difference is that this variant additionally returns the *old* value of the target array prior to the atomic update, in contrast to the more general scatter-reduction, which just returns ``None``. The operation also supports masking: the return value for entries that the mask disables is undefined.

This operation is a building block for stream compaction: threads can scatter-increment a global counter to request a spot in an array and then write their result there. The recipe looks as follows:

```python
import drjit as dr  # UInt32 and Bool come from a Dr.Jit backend module

ctr = UInt32(0)                       # Counter array
active = dr.ones(Bool, len(data_1))   # .. or a more complex condition

# Each active entry reserves one slot and receives the old counter value
my_index = dr.scatter_inc(target=ctr, index=UInt32(0), mask=active)

dr.scatter(
    target=data_compact_1,
    value=data_1,
    index=my_index,
    mask=active
)

dr.scatter(
    target=data_compact_2,
    value=data_2,
    index=my_index,
    mask=active
)
```

When following this approach, be sure to provide the same mask value to the ``dr.scatter_inc()`` and subsequent ``dr.scatter()`` operations.

``dr.scatter_inc()`` exhibits the following unusual behavior compared to normal Dr.Jit operations: the return value references the instantaneous state during a potentially large sequence of atomic operations. This instantaneous state is not reproducible in later kernel evaluations, and Dr.Jit will refuse to reproduce it when the computed index is reused:

```python
my_index = dr.scatter_inc(target=ctr, index=UInt32(0), mask=active)

dr.scatter(
    target=data_compact_1,
    value=data_1,
    index=my_index,
    mask=active
)

dr.eval(data_compact_1)  # Run Kernel #1

dr.scatter(
    target=data_compact_2,
    value=data_2,
    index=my_index,  # <-- oops, reusing my_index in another kernel.
    mask=active      #     This raises an exception.
)
```

To get the above code to work, you will need to evaluate ``my_index`` at the same time to materialize it into a stored (and therefore trivially reproducible) representation. For this, ensure that the size of the ``active`` mask matches ``len(data_*)`` and that it is not the trivial ``True`` default mask (otherwise, the evaluated ``my_index`` will be scalar).

```python
dr.eval(data_compact_1, my_index)
```

Such multi-stage evaluation is potentially inefficient and may defeat the purpose of performing stream compaction in the first place. In general, prefer keeping all scatter operations that involve the computed index in the same kernel, so that this issue does not arise.
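The semantics described in this message can be illustrated with a small pure-Python reference model. This is only a sketch and not Dr.Jit code: the helpers ``scatter_inc`` and ``scatter`` below are hypothetical stand-ins written for illustration, lanes are processed one after another, and ``None`` stands in for the undefined return value of disabled entries. A real kernel performs the increments atomically and in an unspecified order, so the compacted output is not guaranteed to preserve the input order as it does here.

```python
# Pure-Python reference model (NOT Dr.Jit): sequential emulation of the
# scatter-increment recipe. Helper names are hypothetical.

def scatter_inc(target, index, mask):
    """Increment target[index[i]] for each active lane i; return old values.

    Disabled lanes receive None, standing in for an undefined result.
    """
    old = []
    for idx, active in zip(index, mask):
        if active:
            old.append(target[idx])  # value *before* the increment
            target[idx] += 1
        else:
            old.append(None)
    return old


def scatter(target, value, index, mask):
    """Write value[i] to target[index[i]] for each active lane i."""
    for val, idx, active in zip(value, index, mask):
        if active:
            target[idx] = val


data_1 = [10, 11, 12, 13, 14, 15]
data_2 = ["a", "b", "c", "d", "e", "f"]
active = [v % 2 == 0 for v in data_1]  # keep entries with an even data_1 value

ctr = [0]  # single-element counter array
my_index = scatter_inc(ctr, [0] * len(data_1), active)

# Output buffers sized for the worst case (every entry active)
data_compact_1 = [None] * len(data_1)
data_compact_2 = [None] * len(data_1)

# The same mask and index are used for both writes, as the recipe requires
scatter(data_compact_1, data_1, my_index, active)
scatter(data_compact_2, data_2, my_index, active)

count = ctr[0]  # number of slots that were handed out
print(data_compact_1[:count], data_compact_2[:count])
```

After the run, the counter holds the number of surviving entries, so only the first ``count`` slots of each output buffer are meaningful.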
Showing 8 changed files with 158 additions and 6 deletions.
Submodule drjit-core updated 16 files

| Changes | File |
|---|---|
| +6 −1 | include/drjit-core/array.h |
| +22 −3 | include/drjit-core/jit.h |
| +6 −0 | src/api.cpp |
| +66 −5 | src/cuda_eval.cpp |
| +1 −0 | src/eval.cpp |
| +5 −2 | src/internal.h |
| +4 −4 | src/io.cpp |
| +58 −2 | src/llvm_eval.cpp |
| +65 −3 | src/op.cpp |
| +3 −0 | src/op.h |
| +15 −1 | src/var.cpp |
| +19 −13 | tests/basics.cpp |
| +6 −6 | tests/loop.cpp |
| +88 −5 | tests/mem.cpp |
| +2 −2 | tests/reductions.cpp |
| +7 −7 | tests/vcall.cpp |