Compute Id
This parameter of the `compute()` and `task()` methods defines the following:
- For a separable (data-parallel) kernel running on multiple GPUs, the load balancer of `compute()` recognizes this id value and continues balancing from its previous state whenever the same id value is used again. This reduces `compute()` overhead even further by partitioning the kernel fairly across all devices. Using a different value each time resets the load balancer state, even if it is the same kernel and uses the same arrays.
- For each unique kernel name, it duplicates the OpenCL kernel (`cl::kernel`) whenever a new id value is used in the `compute()` or `task()` methods. With "test" as the kernel name and "1" as the compute-id, the kernel is duplicated once; using "2" duplicates it again; using "1" again does not duplicate it but reuses the instance that was created first.
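To make the first point concrete, here is a minimal sketch of load-balancer state that is remembered per compute id. It is a toy model, not the library's implementation: the `BalancerTable` class, its method names, the integer id type, and the damped throughput-proportional update rule are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical model: one set of per-device work shares per compute id.
class BalancerTable {
public:
    explicit BalancerTable(std::size_t deviceCount) : deviceCount_(deviceCount) {
        assert(deviceCount > 0);
    }

    // Current work shares for this id (they sum to 1).
    // An id that has not been seen before starts from an equal split.
    const std::vector<double>& shares(int computeId) {
        return stateFor(computeId);
    }

    // Feeds back the measured run time of each device for the last call made
    // with this id, and moves the shares halfway toward a split that is
    // proportional to each device's observed throughput.
    void update(int computeId, const std::vector<double>& elapsedMs) {
        assert(elapsedMs.size() == deviceCount_);
        std::vector<double>& s = stateFor(computeId);

        std::vector<double> throughput(deviceCount_);
        double total = 0.0;
        for (std::size_t i = 0; i < deviceCount_; ++i) {
            assert(elapsedMs[i] > 0.0);
            throughput[i] = s[i] / elapsedMs[i];  // work done per millisecond
            total += throughput[i];
        }
        if (total <= 0.0) return;
        for (std::size_t i = 0; i < deviceCount_; ++i) {
            s[i] = 0.5 * s[i] + 0.5 * (throughput[i] / total);
        }
    }

private:
    // Looks up the state for an id, creating an equal split on first use.
    std::vector<double>& stateFor(int computeId) {
        auto it = state_.find(computeId);
        if (it == state_.end()) {
            const double equal = 1.0 / static_cast<double>(deviceCount_);
            it = state_.emplace(computeId,
                                std::vector<double>(deviceCount_, equal)).first;
        }
        return it->second;
    }

    std::size_t deviceCount_;
    std::map<int, std::vector<double>> state_;
};
```

In this model, repeated calls under the same id keep refining the same shares, so a faster device gradually receives more work. The first call under a new id starts again from the equal split, which corresponds to the "reset" described above.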
The advantages of having multiple kernels for the same function (by using different compute-id values) are:
- Reduces the number of `clSetKernelArg()` calls on the C++ side whenever the same `compute()` or `task()` is repeated. This increases performance.
- Multiple queues (using the async enqueue mode feature) can overlap executions of the same kernel with different parameters.
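The kernel duplication and the saved argument-setting calls can be pictured with a small table keyed by (kernel name, compute id). This is a sketch under stated assumptions, not the library's code: `KernelTable`, `KernelInstance`, and the idea of skipping a rebind when an argument pointer is unchanged are hypothetical stand-ins, and `setArgCalls()` only counts where a real implementation would call `clSetKernelArg()`.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one OpenCL kernel object and the argument
// pointers that were last bound to it.
struct KernelInstance {
    std::vector<const void*> boundArgs;
};

class KernelTable {
public:
    // Models running kernel `name` under `computeId` with the given arguments.
    void run(const std::string& name, int computeId,
             const std::vector<const void*>& args) {
        assert(!name.empty());
        const auto key = std::make_pair(name, computeId);
        auto it = instances_.find(key);
        if (it == instances_.end()) {
            // New (name, id) pair: create a separate kernel instance.
            it = instances_.emplace(key, KernelInstance{}).first;
        }
        KernelInstance& k = it->second;

        // If the argument count changed, every argument must be bound again.
        const bool rebindAll = k.boundArgs.size() != args.size();
        if (rebindAll) k.boundArgs.resize(args.size());

        for (std::size_t i = 0; i < args.size(); ++i) {
            if (rebindAll || k.boundArgs[i] != args[i]) {
                k.boundArgs[i] = args[i];
                ++setArgCalls_;  // a real implementation would bind here
            }
        }
    }

    std::size_t instanceCount() const { return instances_.size(); }
    std::size_t setArgCalls() const { return setArgCalls_; }

private:
    std::map<std::pair<std::string, int>, KernelInstance> instances_;
    std::size_t setArgCalls_ = 0;
};
```

Running "test" under id 1, then under id 2 with a different buffer, then under id 1 again leaves two instances in the table, and the final call binds nothing because its instance still holds the same arguments. If both parameter sets shared a single id instead, each switch between them would rebind the argument that changed.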
If neither an async enqueue mode nor multiple devices are used, the compute-id value is not important and should stay constant, since the load balancer is not used and all compute operations are executed serially.