Missing a fence on the exec space passed in to Kokkos::deep_copy(..)
Kokkos::deep_copy(execution_space(), PIDList_host, PIDList_view); // needs exec_space.fence() afterwards
// Stash the RemotePIDs. Once remotePIDs is changed to become a Kokkos view, we can remove this and copy directly.
// Note: If Teuchos::Array had a shrink_to_fit like std::vector,
// we'd call it here.
Teuchos::Array<int> PIDList(NumRemoteColGIDs);
for(LO i = 0; i < NumRemoteColGIDs; ++i) {
  PIDList[i] = PIDList_host[i];
}
As part of this bug fix, could someone on the Tpetra team also audit the other three-argument deep copies in Tpetra for correctness? For example, searching for either "Kokkos::deep_copy(e" or "Kokkos::deep_copy(s" (first arguments named like exec or space) shows other spots that use the three-argument version. We should make sure they also fence the exec space passed in where needed.
Thanks,
Yaro
If an ExecutionSpace argument exec_space is provided the call is potentially asynchronous—i.e., the call returns before the copy operation is executed. In that case the copy operation will occur only after any already submitted work to exec_space is finished, and the copy operation will be finished before any work submitted to exec_space after the deep_copy call returns is executed. Note: the copy operation is only synchronous with respect to work in the specific execution space instance, but not necessarily with work in other instances of the same type. This behaves analogous to issuing a cudaMemcpyAsync into a specific CUDA stream, without any additional synchronization.
A list of Tpetra's uses of the 3-argument deep_copy is attached. (I did this on a slightly out-of-date branch, but line numbers should be very close, if not exact.) Attachment: deep_copy_review.txt
Hi,
Reporting a bug. See https://github.com/trilinos/Trilinos/blob/master/packages/tpetra/core/src/Tpetra_Import_Util2.hpp#L1128C1-L1137