Commit
Remove eager synchronization with HtoD copies. (#2625)
We assumed that copies from unpinned memory would always synchronize, but that does not seem to be the case. For some copy sizes (and potentially on some memory architectures, e.g. coherent ones), the copy is fully asynchronous. This change was made to make `CuRef` of a scalar fully async. I considered making the `CuRef` constructor call `memset` instead, which is always asynchronous because the fill value is passed by value; however, `memset` does not support 64-bit floats, whereas a 64-bit `memcpy` is still executed fully asynchronously.
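As a rough illustration of the `memset` limitation mentioned above, here is a minimal host-only sketch. It assumes the limitation comes from `memset` writing a single repeated pattern of at most 32 bits (as with the driver's `cuMemsetD32`); the helper `fits_memset32` is hypothetical and not part of CUDA.jl.

```julia
# Hypothetical helper (not part of CUDA.jl): can `x` be written by a single
# 32-bit memset, i.e. is its bit pattern one 32-bit word repeated?
function fits_memset32(x::T) where {T}
    @assert isbitstype(T) && sizeof(T) % 4 == 0
    words = reinterpret(UInt32, [x])   # view the value as 32-bit words
    return all(==(first(words)), words)
end

fits_memset32(1.0f0)  # true: a Float32 is exactly one 32-bit word
fits_memset32(0.0)    # true: both halves of the Float64 are zero
fits_memset32(1.0)    # false: the two halves differ
```

Under that assumption, a general `Float64` cannot be expressed as one 32-bit fill, which is consistent with keeping `memcpy` for the scalar case.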