Data loader optimizations #293

turian · 2021-09-05T21:26:39Z

A handful of optimizations to make data loading faster.

jorshi · 2021-09-05T23:56:05Z

heareval/predictions/task_predictions.py

        # For timestamp (event) embedding tasks,
        # the metadata for each instance is {filename: , timestamp: }.
-        if self.embedding_type == "event":
+        if self.embedding_type == "event" and metadata:


This can be for a later issue, but I find myself scratching my head trying to remember how the metadata works here for event embeddings, as well as for the labels. It would be good for the docstring to explain why event tasks need metadata and how it is structured and should be used.

turian · 2021-09-06

Yeah, it's a bit of a headscratcher.
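For anyone reading this thread later, here is a minimal sketch of the shape the code comment describes: one `{filename, timestamp}` dict per instance, built only when the task is an event task and metadata was requested. The class name `EventEmbeddingDataset`, the `filename_timestamps` argument, and the assumption that `metadata` is a boolean flag are all hypothetical; this is not the actual class in `task_predictions.py`. The likely reason for keeping the metadata (an assumption, not confirmed in this PR) is that per-timestamp predictions need to be mapped back to their source file and time.

```python
from typing import Any, Dict, List, Optional, Tuple


class EventEmbeddingDataset:
    """Hypothetical stand-in for the prediction dataset (not the heareval class)."""

    def __init__(
        self,
        embeddings: List[List[float]],
        labels: List[List[int]],
        embedding_type: str,
        filename_timestamps: Optional[List[Tuple[str, float]]] = None,
        metadata: bool = True,  # assumed to be a boolean flag, as the diff suggests
    ) -> None:
        self.embeddings = embeddings
        self.labels = labels
        self.embedding_type = embedding_type
        self.metadata: List[Dict[str, Any]] = []

        # Same guard as in the diff: only event tasks carry per-instance
        # metadata, and only when the caller asked for it.
        if self.embedding_type == "event" and metadata:
            if filename_timestamps is None or len(filename_timestamps) != len(embeddings):
                raise ValueError("event tasks need one (filename, timestamp) per embedding")
            self.metadata = [
                {"filename": filename, "timestamp": timestamp}
                for filename, timestamp in filename_timestamps
            ]

    def __len__(self) -> int:
        return len(self.embeddings)

    def __getitem__(self, idx: int) -> Tuple[List[float], List[int], Dict[str, Any]]:
        # Return an empty dict when metadata was skipped, so every item
        # has the same three-part shape.
        meta = self.metadata[idx] if self.metadata else {}
        return self.embeddings[idx], self.labels[idx], meta


kwargs = dict(
    embeddings=[[0.1, 0.2], [0.3, 0.4]],
    labels=[[1, 0], [0, 1]],
    embedding_type="event",
    filename_timestamps=[("a.wav", 0.0), ("a.wav", 0.05)],
)
with_meta = EventEmbeddingDataset(**kwargs)
without_meta = EventEmbeddingDataset(**kwargs, metadata=False)
```

Skipping the metadata when it is not needed avoids building one dict per timestamp, which is presumably where the loading-time saving comes from.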

jorshi · 2021-09-06T00:06:47Z

heareval/predictions/task_predictions.py

        shuffle=False,
+        pin_memory=True,


Is there anything to consider with the metadata here? With this, I think the embeddings and labels will be transferred to CUDA, but the metadata won't (I think because it isn't made of tensors). I think it will be fine; I'm just curious whether there are any gotchas.

turian · 2021-09-06

TBH I don't know.

#294

Lightning-AI/pytorch-lightning#9340
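A small sketch that may help with the metadata question above. Note that `pin_memory=True` does not itself move anything to CUDA: it places the batch's tensors in page-locked host memory so that a later `.to(device)` copy can be faster and asynchronous. The dataset below (`ToyPredictionDataset`) and its field values are hypothetical, and the example only pins when CUDA is available so that it also runs on CPU-only machines. It shows that, with the default collate function, string metadata comes through as a plain list while numeric metadata is collated into a tensor.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyPredictionDataset(Dataset):
    """Hypothetical dataset yielding (embedding, label, metadata) triples."""

    def __len__(self) -> int:
        return 4

    def __getitem__(self, idx: int):
        embedding = torch.full((8,), float(idx))
        label = torch.tensor([idx % 2], dtype=torch.float32)
        # Non-tensor metadata, shaped like the event metadata in this PR.
        metadata = {"filename": f"clip{idx}.wav", "timestamp": 0.05 * idx}
        return embedding, label, metadata


use_cuda = torch.cuda.is_available()
loader = DataLoader(
    ToyPredictionDataset(),
    batch_size=2,
    shuffle=False,
    pin_memory=use_cuda,  # pinning only pays off when copying to a GPU
)

device = torch.device("cuda" if use_cuda else "cpu")
embeddings, labels, metadata = next(iter(loader))

# The explicit copy is what moves tensors to the GPU; pinned memory
# allows it to be non-blocking.
embeddings = embeddings.to(device, non_blocking=True)
labels = labels.to(device, non_blocking=True)

# Strings are left as a Python list; Python floats are collated into a tensor.
filenames = metadata["filename"]
timestamps = metadata["timestamp"]
```

So the filenames stay on the host as ordinary Python strings, which should be harmless as long as nothing downstream expects them on the GPU.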

turian added 3 commits September 5, 2021 23:26

Allow in_memory

ecd84de

Some batch load optimizations

039a374

mypy

b67469f

turian changed the title ~~[WIP] Allow in_memory~~ Data loader optimizations Sep 5, 2021

Merge branch 'main' into splits

26befc6

jorshi approved these changes Sep 6, 2021


turian mentioned this pull request Sep 6, 2021

Explain metadata for events in prediction dataloader #300

Open

turian merged commit c2c0124 into main Sep 6, 2021

turian deleted the splits branch September 6, 2021 00:08
