Buffer Handling
The hello world page used a sample buffer with the number cruncher like this:
Cekirdekler.ClArray<byte> array = new Cekirdekler.ClArray<byte>(1000);
array.compute(numberCruncher, 1, "hello", 1000, 100);
This usage implicitly copies the whole array to all devices (in effect duplicating it), computes on all devices, then gets partial results back from each device. Results are partial because this library was written with "embarrassingly parallel" algorithms in mind: each device generates its own part of the result, and the parts together make up the full array in the end.
To control which array does which type of copy, one needs to alter some flags.
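As a rough mental model of the default behavior described above, the following self-contained C# sketch imitates the duplicate, compute, merge cycle with plain arrays. It does not use Cekirdekler or OpenCL, and it is not the library's implementation; the class DuplicateThenMerge, the Kernel stand-in and the fixed per-device ranges are hypothetical names chosen only for illustration.

```csharp
using System;

public static class DuplicateThenMerge
{
    // Hypothetical stand-in for a kernel: work item i stores (i mod 256) at index i.
    static void Kernel(byte[] deviceBuffer, int offset, int count)
    {
        for (int i = offset; i < offset + count; i++)
            deviceBuffer[i] = (byte)(i % 256);
    }

    // rangePerDevice[d] is the number of work items given to device d
    // (assumed to sum to host.Length).
    public static void Run(byte[] host, int[] rangePerDevice)
    {
        int devices = rangePerDevice.Length;

        // "read": every simulated device receives a full copy of the host array.
        byte[][] deviceBuffers = new byte[devices][];
        for (int d = 0; d < devices; d++)
            deviceBuffers[d] = (byte[])host.Clone();

        int offset = 0;
        for (int d = 0; d < devices; d++)
        {
            // Each device computes only its own range of work items.
            Kernel(deviceBuffers[d], offset, rangePerDevice[d]);

            // "write": only that range is copied back into the host array.
            Array.Copy(deviceBuffers[d], offset, host, offset, rangePerDevice[d]);
            offset += rangePerDevice[d];
        }
    }
}
```

Calling DuplicateThenMerge.Run(new byte[1000], new[] { 300, 700 }) fills all 1000 elements even though neither simulated device computed the whole array.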
- Whenever the word "read" is written in this wiki, it means data is read from the C# side and written to the OpenCL buffer.
- Whenever "partial read" is written in this wiki, it means the read operation is partial: the C# side array is not duplicated but dynamically distributed across all devices. A device may read only 128 elements while a faster device reads 16k elements at the same time, so PCIe bandwidth is conserved in case of "streaming".
- Whenever the word "write" is written in this wiki, it means the exact opposite of a "partial read" happens for the array: each device copies its own range of results from its OpenCL buffer back to the C# side array.
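To make the bandwidth argument for a "partial read" concrete, here is a small sketch that splits a range of work items between devices and compares how many elements cross the bus in each mode. It is only a model: the class PartialReadModel, the speed weights and the rounding to a step size are assumptions made for this example, not the library's load balancer.

```csharp
using System;

public static class PartialReadModel
{
    // Splits `total` work items between devices in proportion to hypothetical
    // speed weights, rounding down to multiples of `step`; the last device
    // takes the rest. Assumes total is a multiple of step.
    public static int[] Split(int total, int step, double[] weights)
    {
        double sum = 0;
        foreach (double w in weights) sum += w;

        int[] counts = new int[weights.Length];
        int assigned = 0;
        for (int d = 0; d < weights.Length - 1; d++)
        {
            counts[d] = (int)(total * weights[d] / sum) / step * step;
            assigned += counts[d];
        }
        counts[weights.Length - 1] = total - assigned;
        return counts;
    }

    // Elements sent to devices when every device gets the whole array.
    public static long ElementsCopiedFull(int total, int devices)
    {
        return (long)total * devices;
    }

    // Elements sent to devices when every device gets only its own range.
    public static long ElementsCopiedPartial(int[] counts)
    {
        long sum = 0;
        foreach (int c in counts) sum += c;
        return sum;
    }
}
```

With 1000 elements, a step of 100 and weights of 1 and 3, the split is 200 and 800 elements. A full read would send 2000 elements in total, while a partial read sends 1000.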
Cekirdekler.ClArray<byte> array = new Cekirdekler.ClArray<byte>(1000);
array.read = false;
array.compute(numberCruncher, 1, "hello", 1000, 100);
Unsetting the read flag makes the array output-only: the kernel generates some data on each device, then each partial result is copied to the target array ("array" in this example) according to the distribution ranges.
Cekirdekler.ClArray<byte> array = new Cekirdekler.ClArray<byte>(1000);
array.read = false;
array.write = false;
array.compute(numberCruncher, 1, "hello", 1000, 100);
Unsetting both the read and write flags means this compute operation uses each device's own buffer for compute but does not copy anything. This works with a single device but can be problematic with multiple devices, since the load balancer alters distribution percentages and offsets at each .compute() execution.
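The multi-device problem can be illustrated with a small model: if nothing is copied between calls, a device whose range grows after rebalancing ends up working on elements that another device produced, so its own buffer holds stale data there. The sketch below counts such elements. The class NoCopyModel and the example splits are hypothetical, and real distribution ranges are chosen by the library at run time.

```csharp
using System;

public static class NoCopyModel
{
    // Returns how many elements are touched in the second call by a device
    // that did not produce them in the first call (no copies in between).
    // firstSplit[d] and secondSplit[d] are the work item counts of device d.
    public static int StaleElements(int total, int[] firstSplit, int[] secondSplit)
    {
        int devices = firstSplit.Length;

        // owned[d][i] is true if device d computed element i in the first call.
        bool[][] owned = new bool[devices][];
        int offset = 0;
        for (int d = 0; d < devices; d++)
        {
            owned[d] = new bool[total];
            for (int i = offset; i < offset + firstSplit[d]; i++)
                owned[d][i] = true;
            offset += firstSplit[d];
        }

        // Second call: ranges may have moved, buffers were not synchronized.
        int stale = 0;
        offset = 0;
        for (int d = 0; d < devices; d++)
        {
            for (int i = offset; i < offset + secondSplit[d]; i++)
                if (!owned[d][i]) stale++;
            offset += secondSplit[d];
        }
        return stale;
    }
}
```

With a single device, or with a split that never changes, the count is zero. If the split moves from 500 and 500 to 300 and 700, the second device touches 200 elements it never computed.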