matchms/sparsestack

Memory-efficient stack of multiple 2D sparse arrays.

Installation

Requirements

Python 3.8 or higher

Pip Install

Simply install using pip: pip install sparsestack

First code example

import numpy as np
from sparsestack import StackedSparseArray

# Create some fake data
scores1 = np.random.random((12, 10))
scores1[scores1 < 0.9] = 0  # make "sparse"
scores2 = np.random.random((12, 10))
scores2[scores2 < 0.75] = 0  # make "sparse"
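# Create an empty stack with 12 rows and 10 columns, then add the first layer
sparsestack = StackedSparseArray(12, 10)
sparsestack.add_dense_matrix(scores1, "scores_1")

# Add the second scores; join_type="left" keeps only positions already present in the stack
sparsestack.add_dense_matrix(scores2, "scores_2", join_type="left")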

# Scores can be accessed using (limited) slicing capabilities
sparsestack[3, 4]  # => scores_1 and scores_2 at position row=3, col=4
sparsestack[3, :]  # => tuple with row, col, scores for all entries in row=3
sparsestack[:, 2]  # => tuple with row, col, scores for all entries in col=2
sparsestack[3, :, 0]  # => tuple with row, col, scores_1 for all entries in row=3
sparsestack[3, :, "scores_1"]  # => same as the one before

# Scores can also be converted to a dense numpy array:
scores2_after_merge = sparsestack.to_array("scores_2")

Adding data to a `sparsestack`-array

Sparsestack provides three options to add data to a new layer.

.add_dense_matrix(input_array) Adds the non-zero elements of input_array to the sparsestack. Depending on the chosen join_type, either all such values are added (join_type="outer" or join_type="right"), or only those at positions already present in the underlying layers (join_type="left" or join_type="inner").
.add_sparse_matrix(input_coo_matrix) Expects a COO-style matrix (e.g. a scipy.sparse.coo_matrix) with the attributes .row, .col and .data. The join type can again be specified using join_type.
.add_sparse_data(row, col, data) This essentially does the same as .add_sparse_matrix(input_coo_matrix) but might in some cases be a bit more flexible because row, col and data are separate input arguments.
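
The effect of the join types can be pictured with plain Python dictionaries. The sketch below is not sparsestack's implementation and does not call the library. It assumes the four join types follow the usual database-style semantics, and `join_layer` is a hypothetical helper that keys each entry by its (row, col) position.

```python
def join_layer(existing, new, join_type="left"):
    """Merge a new layer into an existing one.

    Hypothetical helper for illustration only (not part of sparsestack).
    Both layers are dicts of the form {(row, col): value}. The result maps
    each kept position to (existing_value, new_value), using 0.0 where a
    layer has no entry. Database-style join semantics are an assumption.
    """
    if join_type == "left":
        positions = set(existing)               # keep only known positions
    elif join_type == "inner":
        positions = set(existing) & set(new)    # keep positions found in both
    elif join_type == "right":
        positions = set(new)                    # keep all new positions
    elif join_type == "outer":
        positions = set(existing) | set(new)    # keep everything
    else:
        raise ValueError(f"Unknown join_type: {join_type!r}")
    return {
        pos: (existing.get(pos, 0.0), new.get(pos, 0.0))
        for pos in sorted(positions)
    }


scores_1 = {(0, 1): 0.95, (2, 3): 0.91}
scores_2 = {(0, 1): 0.80, (1, 1): 0.77}

left = join_layer(scores_1, scores_2, "left")
outer = join_layer(scores_1, scores_2, "outer")
print(len(left), "positions after left join")
print(len(outer), "positions after outer join")
```

Under these assumptions a "left" join never grows the set of stored positions, which is why it suits filtering a second score by the results of a first one.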

Accessing data from `sparsestack`-array

The collected sparse data can be accessed in multiple ways.

Slicing. sparsestack allows multiple types of slicing (see also code example above).

sparsestack[3, 4]  # => tuple with all scores at position row=3, col=4
sparsestack[3, :]  # => tuple with row, col, scores for all entries in row=3
sparsestack[:, 2]  # => tuple with row, col, scores for all entries in col=2
sparsestack[3, :, 0]  # => tuple with row, col, scores_1 for all entries in row=3
sparsestack[3, :, "scores_1"]  # => same as the one before

.to_array() Creates and returns a dense numpy array of size .shape. Can also be used to create a dense numpy array of only a single layer when used like .to_array(name="layerX").
Careful: converting to a dense array loses the sparse nature of the data, and all empty positions in the stack are filled with zeros.
.to_coo(name="layerX") Returns a scipy sparse COO-matrix of the specified layer.
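
To make the warning about dense conversion concrete, here is a small plain-Python sketch of COO-style storage (parallel row, col and data lists). It does not use sparsestack itself. `row_entries` and `coo_to_dense` are hypothetical helpers that only mimic, in spirit, a row slice and a dense conversion.

```python
def row_entries(row, col, data, wanted_row):
    """Return (rows, cols, values) for all stored entries in one row.

    Hypothetical helper, loosely analogous to slicing a single row.
    """
    hits = [(r, c, v) for r, c, v in zip(row, col, data) if r == wanted_row]
    if not hits:
        return (), (), ()
    rows, cols, values = zip(*hits)
    return rows, cols, values


def coo_to_dense(row, col, data, shape):
    """Build a dense nested list; positions without an entry become 0.0."""
    n_row, n_col = shape
    dense = [[0.0] * n_col for _ in range(n_row)]
    for r, c, v in zip(row, col, data):
        dense[r][c] = v
    return dense


# Three stored values in a 3 x 4 array
row = [0, 2, 2]
col = [1, 0, 3]
data = [0.95, 0.91, 0.99]

entries_row_2 = row_entries(row, col, data, 2)
dense = coo_to_dense(row, col, data, shape=(3, 4))
n_dense_cells = sum(len(r) for r in dense)
print(len(data), "stored values versus", n_dense_cells, "dense cells")
```

The dense form always holds one cell per position, so its size grows with the full shape rather than with the number of stored values.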
