I have a PR, #730, that uses a separate directory for the prefetch cache (by default, the .datachain/tmp directory that we already have). That directory is cleaned up once prefetching completes.
However, for PyTorch, automatic cleanup looks tricky. We cannot do it from the PytorchDataset instance itself, so we'd either need to provide a custom DataLoader or provide an API to clean up the dataset.
I went with the latter in #730, which adds a PytorchDataset.close() API to clean up the cache.
An explicit close() also avoids the synchronization issues that could arise from DataLoader creating multiple worker processes.
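To illustrate the explicit-cleanup pattern, here is a minimal sketch using only the standard library. It is not the code from #730: `CachedDataset`, its `cache_dir` attribute, and `run_example` are hypothetical stand-ins, and the "prefetch" is simulated by writing small files into a temporary directory. It only shows how a `close()` method pairs with `contextlib.closing` so the cache is removed once iteration is done.

```python
import shutil
import tempfile
from contextlib import closing
from pathlib import Path


class CachedDataset:
    """Hypothetical stand-in for a dataset that prefetches into a temp cache."""

    def __init__(self, items):
        self.items = list(items)
        # Dedicated cache directory owned by this dataset instance.
        self.cache_dir = Path(tempfile.mkdtemp(prefix="prefetch-"))

    def __iter__(self):
        for i, item in enumerate(self.items):
            path = self.cache_dir / f"{i}.txt"
            path.write_text(str(item))  # simulate a prefetched file
            yield path.read_text()

    def close(self):
        # Safe to call more than once; missing directories are ignored.
        shutil.rmtree(self.cache_dir, ignore_errors=True)


def run_example():
    ds = CachedDataset(["a", "b", "c"])
    with closing(ds):  # calls ds.close() on exit, even if iteration fails
        rows = list(ds)
        existed_during = ds.cache_dir.exists()
    return rows, existed_during, ds.cache_dir.exists()
```

In this sketch the caller, not the data loader, decides when the cache goes away, so cleanup happens once in the parent process rather than being coordinated across workers.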
Prefetching and caching should be independent settings.
See discussion in #635