Adding a caching dataloader for 3D dataset #161

Draft: wants to merge 12 commits into main

edyoshikun (Contributor) commented Sep 12, 2024

Summary

This PR adds a dataloader that caches the datasets in RAM and avoids using the SliceWindowDataset(). The dataset is cached during the first epoch, so later epochs read from memory. Ideally, the datasets fit within the 2 TB of RAM available per node. Resource usage can be inspected with btop.

The z-slicing for the volumes is controlled by the num_z_slices parameter. The slicing itself is left to the configurable MONAI transforms.
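
The caching idea described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: CachingVolumeDataset, random_z_window, and fake_load are hypothetical names, volumes are represented as plain nested lists instead of arrays, and the random Z window stands in for whatever the configured MONAI transforms actually do.

```python
import random
from typing import Callable, Dict, List, Sequence

# Hypothetical stand-in for a 3D volume: a list of Z planes.
Volume = List[List[float]]


class CachingVolumeDataset:
    """Map-style dataset that keeps every loaded volume in RAM.

    The first access to an index calls load_fn (slow path, e.g. disk);
    later accesses return the cached object (fast path).
    """

    def __init__(self, keys: Sequence[str], load_fn: Callable[[str], Volume]):
        self.keys = list(keys)
        self.load_fn = load_fn
        self._cache: Dict[int, Volume] = {}
        self.loads = 0  # number of cache misses, for inspection only

    def __len__(self) -> int:
        return len(self.keys)

    def __getitem__(self, index: int) -> Volume:
        if index not in self._cache:
            self._cache[index] = self.load_fn(self.keys[index])
            self.loads += 1
        return self._cache[index]


def random_z_window(volume: Volume, num_z_slices: int, rng: random.Random) -> Volume:
    """Return num_z_slices consecutive Z planes from a random start position."""
    depth = len(volume)
    if num_z_slices > depth:
        raise ValueError(f"num_z_slices={num_z_slices} exceeds depth={depth}")
    start = rng.randint(0, depth - num_z_slices)  # both bounds inclusive
    return volume[start : start + num_z_slices]


def fake_load(key: str) -> Volume:
    """Hypothetical loader: an 8-plane volume with 4 values per plane."""
    return [[float(z)] * 4 for z in range(8)]


dataset = CachingVolumeDataset(["fov_0", "fov_1"], fake_load)
rng = random.Random(0)
for epoch in range(3):
    for i in range(len(dataset)):
        # Only the first epoch triggers fake_load; later epochs hit the cache.
        window = random_z_window(dataset[i], num_z_slices=5, rng=rng)

print(dataset.loads, len(window))
```

In this sketch the cache is an ordinary in-process dictionary, so it is filled lazily during the first pass over the data. If such a dataset were used with a multi-worker PyTorch DataLoader, each worker process would hold its own copy of the dataset object and therefore its own cache, which is worth keeping in mind when estimating memory use.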

Testing

Working

  • Tested the hcs_ram.CachedDataset
  • Tested hcs_ram.CacheDataModule
  • Tested running UNeXt2 with a small toy dataset locally (M1) and on gpu-sm02 nodes.
  • Tested running the model with dev_mode=True, which runs the model for 1 epoch.

Not Working

  • Running UNeXt2 with the actual datasets (500GB+) using one GPU (H100), one node
  • Running UNeXt2 with the actual datasets (500GB+) using 4 GPUs (H100), one node

ziw-liu added the enhancement (New feature or request) and translation (Image translation (VS)) labels on Sep 21, 2024
ziw-liu added this to the v0.3.0 milestone on Sep 27, 2024