This PR adds a zero-copy mode to the C-API. This allows the user to directly assemble matrices into AmgX, thus avoiding duplicate copies of the matrix data in memory and skipping the reordering of that data that would happen in the normal C-API upload routines. The user takes over responsibility for making sure the matrix layout matches what is used by AmgX internally, i.e. the ordering of rows must conform to the AmgX standard (interior, boundary, halo rows).
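To illustrate the row-ordering responsibility described above, here is a minimal sketch of how a caller might renumber locally owned rows so that interior rows come first and boundary rows second, with halo rows keeping their indices after the owned block. It assumes a local CSR matrix in which any column index `>= n_owned` refers to a halo entry, and it assumes "interior" means a row that references no halo column. `zc_order_rows` is a hypothetical helper and not part of AmgX; the exact layout AmgX expects should be checked against its documentation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an AmgX function).
 * Returns 1 if row r of the local CSR matrix references at least one
 * halo column, i.e. a column index >= n_owned (assumed convention). */
int zc_row_touches_halo(int r, int n_owned,
                        const int *row_ptr, const int *col_idx)
{
    for (int k = row_ptr[r]; k < row_ptr[r + 1]; ++k) {
        if (col_idx[k] >= n_owned) {
            return 1;
        }
    }
    return 0;
}

/* Hypothetical helper (not an AmgX function).
 * Fills new_index[old_row] with a numbering that places interior rows
 * first and boundary rows second, preserving relative order within
 * each group. Halo rows (indices >= n_owned) are left untouched.
 * Returns the number of interior rows. */
int zc_order_rows(int n_owned, const int *row_ptr, const int *col_idx,
                  int *new_index)
{
    int n_interior = 0;

    /* First pass: count interior rows to find where boundary rows start. */
    for (int r = 0; r < n_owned; ++r) {
        if (!zc_row_touches_halo(r, n_owned, row_ptr, col_idx)) {
            ++n_interior;
        }
    }

    /* Second pass: assign new indices group by group. */
    int next_interior = 0;
    int next_boundary = n_interior;
    for (int r = 0; r < n_owned; ++r) {
        if (zc_row_touches_halo(r, n_owned, row_ptr, col_idx)) {
            new_index[r] = next_boundary++;
        } else {
            new_index[r] = next_interior++;
        }
    }
    return n_interior;
}
```

The same permutation would also have to be applied to column indices and to the right-hand side and solution vectors before the data is assembled in place.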
In the current implementation, there are additional requirements:
These additional requirements are not fundamental, and the ZC interface could naturally be extended so that it does not depend on them. I did not do so because this is our use case and therefore the only tested configuration.
Additionally, I only tested it with AGGREGATION AMG, but I see no reason why it would not work with CLASSICAL too.
Finally, all data should reside on the GPU, as the interface is completely untested with host matrices.
A zero-copy matrix upload works like this:
Similarly, vectors can be accessed in a zero-copy manner:
When solving with ZC right-hand side and solution vectors, AMGX_ZC_solver_solve replaces the combination of AMGX_vector_upload, AMGX_solver_solve and AMGX_vector_download. In particular, after AMGX_ZC_solver_solve, the halo entries of the solution vector are consistent, i.e. accessing them via the data pointer gives the correct global value.
The zero-copy interface also supports releasing the underlying data in AmgX matrices and vectors without entirely destroying these objects, via AMGX_ZC_(matrix_data)_vec_resize/shrink_to_fit. This may be useful when matrices with the same sparsity structure are created repeatedly and the user wants to keep the GPU memory high-water mark down. To support this, the behavior of the AmgX memory pool can be queried and changed with AMGX_get/set_async_free_to_pool_flag.