In related PR #294 we are going to have listing logic inside of DataChain.from_storage() itself.
We should replace the old listing that is called from the CLI, and maybe from other places, with DataChain.from_storage().
There are a lot of tests around listing / indexing, and we should refactor them as well if needed.
As a follow-up, we should remove the old legacy listing codebase (maybe it's better to do this in a separate issue / PR). We should also remove buckets and partials.
Note that we also need to change Catalog.ls_storages to use the new listing datasets, as the bucket table will be removed, as well as partials.
Refactor datachain.Dataset.is_bucket_listing() to not use the old listing check, and maybe remove this method altogether, as the low-level Dataset class should not know about LISTING_PREFIX and similar higher-level abstractions.
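To illustrate the last point, here is a minimal sketch of how the listing check could move out of the low-level class. It is not the actual datachain code: the Dataset class is a simplified stand-in, is_listing_dataset is a hypothetical helper name, and the prefix value is a placeholder.

```python
from dataclasses import dataclass

# Assumption: placeholder value for illustration; the real constant
# is defined in the datachain codebase.
LISTING_PREFIX = "lst__"


@dataclass
class Dataset:
    """Simplified stand-in for the low-level class: it only knows its name."""

    name: str


def is_listing_dataset(dataset: Dataset) -> bool:
    """Hypothetical helper that would live next to the listing logic.

    The prefix check happens here, so Dataset itself no longer needs
    to know about LISTING_PREFIX.
    """
    return dataset.name.startswith(LISTING_PREFIX)
```

With something like this, callers of Dataset.is_bucket_listing() would switch to the helper, and the method could then be dropped from the class.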
I think this should be prioritized: it is sometimes hard to refactor or change the codebase because the old indexing part has to be adapted as well. It would have been much easier if it were simply refactored to the "new" indexing. On top of that, the old listing can only be used for CLI operations, e.g. when we call Catalog.index(...), the listing that has been created cannot be used in DataChain methods.