-
Notifications
You must be signed in to change notification settings - Fork 987
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rfcs: add proposal on coo sparse encoding #1941
base: rfcs
Are you sure you want to change the base?
Conversation
## Scope of Extended Functionality | ||
|
||
For the extended API, a reference implementation for the matrix multiplication primitive will be introduced that supports input tensors with COO encoding. The coverage of the implementation will be identical to that for the CSR encoding and is listed below: | ||
* Datatype for COO tensor data: `f32` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very cool! Do you plan on adding f16 or bf16 support in the future ? or just sticking to f32 for now ?
And for small matrices, you may find that using even s16 could be of benefit ... If ncols is less than 32k, you could use s16 ... If I recall, many operations from SpMM in AI come from tall skinny matrices, so this might be a possibility for future optimizations ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the comment! We do plan on adding f16 support for the implementation. I have updated the RFC accordingly.
/// @param adims Tensor dimensions. | ||
/// @param adata_type Data precision/type. | ||
/// @param nnz Number of non-zero entries. | ||
/// @param index_dt Data type of indices. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What sort of data types are expected for indices?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi! Keeping the same data considerations as the CSR format, the datatype is expected to be s32 for the indices.
d7a52de
to
aa9feb6
Compare
Because of the compressed row, this format tends to be more efficient for large sparse matrices. | ||
COO, on the other hand, is simpler in implementation in that the encoding comprises of a list of thruples `(values, row_index, column_index)` corresponding to the non-zero values. | ||
This makes the COO less efficient than CSR but it has the advantage of a reduced conversion overhead and better interpretability. | ||
For practical cases, a sorted variant of COO is used wherein the data is encoded as a set of sorted arrays containing the values, row indices and column indices respectively: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we explicitly require users to pass us the memory in the canonical COO format where the entries are sorted by row, then column?
Also, do we allow the values to be zero?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally, the generic COO format does not require the user to explicitly pass the memory in a row-major order. But this will be the case for sorted COO as it determines how the buffers are defined for the data indices.
With the given declaration, the indices for dimension
The value buffers are expected to only hold non-zero elements.
/// assigned numbers (index): | ||
/// - 0: values | ||
/// - 1: dim0_indices | ||
/// - 2: dim1_indices |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to be generic here that is, the number of buffers for the indices should be equal to the number of dimensions of the tensor. For example, potentially, we could support 3D sparse tensors in COO format and in that case we would need 3 buffers with the indices.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the catch! I have updated the RFC accordingly.
char *dim1_indices = malloc(indices_size); | ||
``` | ||
|
||
#### Memory Creation Example: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please also add an example for the C++ API?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. Thanks!
aa9feb6
to
c2fc0e6
Compare
c2fc0e6
to
cd64159
Compare
cd64159
to
cd15b92
Compare
Proposal on extending sparsity support to include Coordinate sparse encoding (COO) format.
Link to rendered document.