Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New memory management policies, new SketchingOperator concept, redesign of DenseDist and SparseDist, lots of docs #108

Merged
merged 57 commits into from
Sep 10, 2024

Conversation

rileyjmurray
Copy link
Contributor

@rileyjmurray rileyjmurray commented Aug 24, 2024

This branch was originally created to expand source and web documentation. However, as I tried to write those docs I found myself questioning many previous design decisions. (It turns out this is a trend.)

Anyway, the PR is now gigantic in scope. I've tried to break down changes into categories.

Changes to source documentation

  • Introduced macros for opaque matrices and vectors in rst-compatable source code comments (\mtxA,\mtxB,\mtxX,\mtxS,\mtxx,\mtxy). Right now those macros render their arguments in bold, but we could easily switch to a different style.
  • Removed most bracket-based indexing notation like \mat(A)[i, j] in favor of mathematically typeset subscripts like \mat(A)_{ij}.
  • Removed (the vast majority of) rst macro directive definitions from source code comments.
  • Complete rewrites of DenseDist, DenseSkOp, SparseDist, and SparseSkOp docs.
  • Complete rewrites of COOMatrix, CSRMatrix, and CSCMatrix source docs to match the style of DenseSkOp and SparseSkOp.
  • Added source docs for the Axis enum (renamed from MajorAxis).
  • Added source docs for the SketchingOperator concept.

New features and identifiers

  • Change the definition of LASOs so they actually implement LESS-Uniform (with Rademachers as the scalar sub-gaussian distribution).
  • Made the SketchingDistribution and SketchingOperator concepts apply meaningful constraints.
  • Extend sample_indices_iid_uniform to support generating Rademachers in addition to the indices themselves, much like how repeated_fisher_yates works. Exposed an overload that matches the original signature.
  • Add a fill_sparse overload that's analogous to the many-argument fill_dense overload.
  • De-overloaded the verbose fill_dense and fill_sparse into new functions: fill_dense_unpacked and fill_sparse_unpacked_nosub. The latter has the nosub suffix because it doesn't have all the parameters needed to select a submatrix, and I'll eventually want a fill_sparse_unpacked that qualitatively matches fill_dense_unpacked.

API changes : types

Adopted a consistent policy on semantics of memory ownership. This included making sure all sketching operator and sparse matrix classes have non-const own_memory members.

Added isometry_scale member to SketchingDistribution, DenseDist, and SparseDist.

Added dim_major and dim_minor to DenseDist and SparseDist. Deliberately keeping this out of the SketchingDistribution concept.

RNGState

  • Defined a DefaultRNG type, equal to r123::Philox4x32. RNGState's default RNG template parameter is now RNG = DefaultRNG.
  • Add static_assert that counter length for RNGs is >= 2. Needed for Box-Muller transforms and many other places.

SparseDist

  • Changed initialization order.
  • Add full_nnz member
  • Added a constructor for SparseDist, and replaced many instances of defining SparseDists with initializer lists. The construct has vec_nnz=4 and major_axis=Short by default.

SparseSkOp

  • Add nnz member
  • Remove known_filled member from SparseSkOp; the old check "!S.known_filled" is equivalent to "S.nnz <= 0".

SparseMatrix classes

  • Remove reserve instance-methods.
  • Add reserve_xyz free-functions to replace the reserve instance-methods.
  • Removed some (mostly) redundant constructors.

Important changes that don't affect the public API:

  • Renamed MajorAxis enum to just "Axis".
  • Removed Axis::Undefined and ScalarDist::BlackBox.
  • Make a (private!) helper BLASFriendlyOperator datatype, so that I don't have to coerce DenseSkOp into supporting submatrices directly.

API changes : free functions

  • LSKGES and RSKGES no longer call fill_sparse on a user-provided unfilled SparseSkOp. Instead, we make a temporary shallow-copy, fill that copy, and delete it before returning. This is in preparation for when LSKGES/RSKGES only generate submatrices that are needed for the operation at hand, and it follows the same memory-allocation/ownership semantics of dense sketching ([L/R]SK[GE/SP]3).
  • Template sample_indicies_iid and sample_indices_iid_uniform to allow general signed integers, rather than only int64_t.
  • Removed the ability of spmm and sketch_sparse to operate on submatrices of sparse data matrices. This basic functionality is still preserved by left_spmm and right_spmm in the former case and lsksp3 and rsksp3 in the latter case.
  • Removed the "val" argument from the overwrite_triangle utility function. I was only using this function with val=0.0.
  • remove has_fixed_nnz_per_col(S), since it wasn't used.
  • Moved several utility functions into the main RandBLAS namespace.
  • Significantly modernized print_colmaj.

Templating conventions

  • Changed templating in sketch_sparse to avoid nested templating. By nested templating I mean something liketemplate <typename T, typename RNG> and having an argument of type DenseSkOp<T,RNG>, while more "flat" templating would use template <typename DenseSkOp, typename T = DenseSkOp::scalar_t> and then have arguments of type T and DenseSkOp.
  • Changed templating functions that only accepted sparse sketching operators but templated as SKOP. They now template as SparseSkOp (which in a technical sense makes no difference, but does make clear that it's supposed to be a SparseSkOp<T,RNG> for some parameters T and RNG.
  • Started templating RNGState in function calls as typename state_t = RNGState<DefaultRNG>.

Web docs

  • Actually wrote the tutorial on selecting parameters for DenseDist and SparseDist!
  • Added a "FAQ and Limitations" page to the web docs. Includes discussion of C++ idioms we do/don't use.

Other

Added a utility script for converting my awkward reStructuredText source code docs of function arguments into doxygen format. I thought the doxygen format might be viable since I tweaked the CSS, but it wasn't. I'm keeping the script in case I want to revisit this later.

Increased the size of allowed error messages in randblas_error_if_msg.

Added tests for LSKSP3 and RSKSP3.

…" wrapper functions and the "sketch_sparse" wrapper functions. This functionality is still available through the left_spmm and right_spmm functions, together with the lsksp3 and rsksp3 functions.
…wn_memory for SparseSkOp so that it only attempts to free a constituent array if its pointer is non-null.
… and SparseSkOp. Add a test that SparseSkOp respects ownership semantics.
…ntation of sample_indices_iid_uniform so that it can also populate an array of Rademachers.
…adding an nnz field to SparseSkOp. More tests and probably some cleanup needed. (For example, I think SparseSkOp.nnz!=0 can be used to equivalently signal SparseSkOp.known_filled.)
…olve failures of the address sanitizer tests on linux.
…SparseSkOp, DenseSkOp, and the sparse matrix classes. Make own_memory non-const for sparse matrix classes.
@rileyjmurray rileyjmurray changed the title WIP: Docs to resolve #107, improved implementation for #84. New memory management policies, new SketchingOperator concept, redesign of DenseDist and SparseDist, lots of docs Sep 3, 2024
@rileyjmurray rileyjmurray marked this pull request as ready for review September 3, 2024 01:10
@rileyjmurray rileyjmurray merged commit 7280c0f into main Sep 10, 2024
5 checks passed
@rileyjmurray rileyjmurray deleted the docs-107 branch September 10, 2024 14:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant