Polinabinder/file extend #477
base: main
Signed-off-by: polinabinder1 <[email protected]>
This PR enables concatenating multiple single-cell memory-mapped datasets without copying the full files, which would double the disk usage. Instead, the row, column, and data arrays are copied over in blocks of bytes, and each source file is deleted after it has been copied, which limits the amount of data stored in duplicate on disk.
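The block-wise approach described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the code in this PR: the function names `append_file_in_blocks` and `concatenate_files`, the `block_size` default, and the `delete_source` flag are all hypothetical, and the sketch treats each array as a raw byte file with no header handling.

```python
import os
from pathlib import Path
from typing import Iterable


def append_file_in_blocks(
    dest: Path,
    src: Path,
    block_size: int = 1024 * 1024,
    delete_source: bool = True,
) -> int:
    """Append the bytes of `src` to `dest` one block at a time.

    Hypothetical helper for illustration only. Returns the number of bytes copied.
    """
    copied = 0
    # Open the destination in append mode so its existing contents are kept.
    with open(dest, "ab") as out, open(src, "rb") as inp:
        while True:
            block = inp.read(block_size)  # at most `block_size` bytes held in memory
            if not block:
                break
            out.write(block)
            copied += len(block)
    if delete_source:
        # Remove the source only after it has been fully copied.
        os.remove(src)
    return copied


def concatenate_files(dest: Path, sources: Iterable[Path], block_size: int = 1024 * 1024) -> int:
    """Append every file in `sources` to `dest`, deleting each source once copied."""
    total = 0
    for src in sources:
        total += append_file_in_blocks(dest, src, block_size=block_size)
    return total
```

In this sketch, the data duplicated on disk at any moment is at most one source file, because each source is removed before the next one is opened. The same pattern would be applied separately to the row, column, and data arrays.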
/build-ci
Looks pretty good; I left a few stylistic comments. I think the docstrings could use examples.
sub-packages/bionemo-scdl/src/bionemo/scdl/io/single_cell_collection.py
sub-packages/bionemo-scdl/src/bionemo/scdl/io/single_cell_memmap_dataset.py
sub-packages/bionemo-scdl/src/bionemo/scdl/util/filecopyutil.py
sub-packages/bionemo-scdl/tests/bionemo/scdl/io/test_single_cell_memmap_dataset.py
…ll_memmap_dataset.py Co-authored-by: Steven Kothen-Hill <[email protected]> Signed-off-by: polinabinder1 <[email protected]>
/build-ci
sub-packages/bionemo-scdl/src/bionemo/scdl/io/single_cell_collection.py
sub-packages/bionemo-scdl/tests/bionemo/scdl/io/test_single_cell_memmap_dataset.py
/build-ci
Approved! Thank you for adding tests.
/build-ci