Initial steps toward a torch.utils.data.IterableDataset (PR 3 of N) #6

Merged
merged 25 commits into from
Oct 3, 2024

bkmartinjr
Member

This PR adds a simple torch.utils.data.IterableDataset and torchdata.datapipes.iter.IterDataPipe implementation. An initial set of tests is included.

Notes to reviewer:

  • multi-worker and multi-GPU support - partitions automatically across both. Detects both the Lightning and torch.distributed variants of world size and rank.
  • API refactored from CZI contribution for better UX:
    • accepts an experiment query directly, removing the need for redundant params
    • encoders removed
    • supports full context config inheritance from the ExperimentAxisQuery allowing access to non-Census datasets
  • There is no shuffling support - that will come in a later PR
  • There is no dataloader convenience API - that will come in a later PR

@bkmartinjr bkmartinjr marked this pull request as ready for review September 23, 2024 20:23
@ryan-williams ryan-williams force-pushed the bkmartinrj/delete-old-files branch 2 times, most recently from 1cb7db7 to 54ae565 Compare September 25, 2024 16:06
Base automatically changed from bkmartinrj/delete-old-files to main September 25, 2024 16:13
@ryan-williams ryan-williams force-pushed the bkmartinjr/initial-non-shuffling-code branch from 4a546ca to c0f32a1 Compare September 25, 2024 16:17
ryan-williams added a commit that referenced this pull request Sep 25, 2024
bkmartinjr and others added 4 commits September 25, 2024 12:29
* add GHA workflows; remove temp exclusion from pre-commit config

* GHA tweak, attempt to trigger PR runs

---------

Co-authored-by: Ryan Williams <[email protected]>
@ryan-williams ryan-williams force-pushed the bkmartinjr/initial-non-shuffling-code branch from c0f32a1 to d9c0ef4 Compare September 25, 2024 16:30
@ryan-williams ryan-williams mentioned this pull request Oct 1, 2024
Member

@ryan-williams ryan-williams left a comment

Most of the way through, couple q's here, and suggested changes in #15!

assert self._var_joinids is not None
world_size, _ = _get_distributed_world_rank()
n_workers, _ = _get_worker_world_rank()
partition_len = len(self._obs_joinids) // world_size // n_workers
Member

Will some partitions be one greater than others, if this division has remainders? Does it matter?

Member Author

Yes, they can vary across data loader workers. And it doesn't really matter. What matters the most is having the same number of samples distributed to each GPU. Beyond that, things are data and config dependent.
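To make the remainder question concrete, here is a minimal, standalone sketch of a two-level split. It is not the code in this PR: `split_sizes` and `partition_sizes` are hypothetical helpers, and dropping the leftover rows so that every GPU sees the same count is an assumption made for illustration only.

```python
def split_sizes(n_items: int, n_parts: int) -> list[int]:
    """Sizes of n_parts near-equal chunks.

    The first (n_items % n_parts) chunks receive one extra item.
    """
    base, extra = divmod(n_items, n_parts)
    return [base + 1 if i < extra else base for i in range(n_parts)]


def partition_sizes(n_obs: int, world_size: int, n_workers: int) -> list[list[int]]:
    """Hypothetical two-level split: one list of worker sizes per GPU rank."""
    # Assumption: floor division gives every GPU the same sample count;
    # any rows left over after this division are dropped in this sketch.
    per_gpu = n_obs // world_size
    # Within a GPU, data loader workers may differ by at most one sample.
    return [split_sizes(per_gpu, n_workers) for _ in range(world_size)]


sizes = partition_sizes(n_obs=103, world_size=2, n_workers=4)
for rank, worker_sizes in enumerate(sizes):
    print(f"rank {rank}: workers={worker_sizes} total={sum(worker_sizes)}")
```

With 103 rows, 2 GPUs, and 4 workers, the per-GPU totals match while the per-worker sizes differ by at most one. The quoted expression `len(self._obs_joinids) // world_size // n_workers` would give the floor of the per-worker size for the same inputs.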

src/tiledbsoma_ml/pytorch.py (outdated, resolved)
tests/test_pytorch.py (outdated, resolved)
ryan-williams added a commit that referenced this pull request Oct 1, 2024
@ryan-williams ryan-williams force-pushed the bkmartinjr/initial-non-shuffling-code branch from cd51799 to 579727a Compare October 2, 2024 20:34
Member

@ryan-williams ryan-williams left a comment

🎉

@ryan-williams ryan-williams merged commit d123979 into main Oct 3, 2024
24 checks passed
@ryan-williams ryan-williams deleted the bkmartinjr/initial-non-shuffling-code branch October 3, 2024 21:35
2 participants