Right now, it is possible to download a de-duplicated set with respect to the metadata for a particular feature set (e.g. ViT-H-14). Unfortunately, if you have already downloaded LAION-2B with the original metadata, you will need to "map" the metadata from your existing set onto the de-duplicated version of that set.
Thus several things should be done:
1. Release a full set of de-duplicated metadata (or de-duplicated up to some redundancy factor).
2. Add functionality for easily finding the correspondence between large sets of URLs.
In fact, the functionality in the second point would already allow some de-duplication on its own, since we found duplicate URLs in the metadata (despite LAION's attempt to de-duplicate the URLs themselves).
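To make the second point concrete, here is a minimal sketch of what such a correspondence helper could look like. The function names (`normalize_url`, `index_by_url`, `map_to_deduplicated`, `duplicate_urls`) and the example URLs are hypothetical and not part of any existing tool, and matching on whitespace-stripped exact URLs is an assumption; a real implementation might compare hashes or use a different normalization.

```python
from collections import defaultdict


def normalize_url(url):
    # Assumption: exact match after stripping whitespace is "the same URL".
    return url.strip()


def index_by_url(urls):
    """Map each normalized URL to the list of row indices where it occurs."""
    index = defaultdict(list)
    for row, url in enumerate(urls):
        index[normalize_url(url)].append(row)
    return dict(index)


def duplicate_urls(urls):
    """Return only the URLs that occur more than once, with their rows."""
    return {url: rows for url, rows in index_by_url(urls).items() if len(rows) > 1}


def map_to_deduplicated(existing_urls, dedup_urls):
    """Map row indices of an existing set to row indices of a de-duplicated set.

    Rows whose URL is absent from the de-duplicated set are left out.
    """
    dedup_index = {}
    for row, url in enumerate(dedup_urls):
        # Keep the first occurrence in case the de-duplicated set still repeats a URL.
        dedup_index.setdefault(normalize_url(url), row)

    mapping = {}
    for row, url in enumerate(existing_urls):
        key = normalize_url(url)
        if key in dedup_index:
            mapping[row] = dedup_index[key]
    return mapping


# Hypothetical toy data standing in for URL columns of the metadata.
existing = ["http://a/1.jpg", "http://b/2.jpg", "http://a/1.jpg ", "http://c/3.jpg"]
deduplicated = ["http://c/3.jpg", "http://a/1.jpg"]

print(duplicate_urls(existing))
print(map_to_deduplicated(existing, deduplicated))
```

An in-memory dictionary like this is only meant to illustrate the idea. At the scale of LAION-2B one would probably need to hash the URLs to fixed-size keys and process the metadata in shards, or use an on-disk join.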