Add a component for image based deduplication #476
Comments
I'm watching your project with interest and recently had to deduplicate an image dataset on my own for ControlNet training. Out of interest, why did you choose imagededup specifically, and do you have a comparison of different approaches? (By the way: in my research I stumbled upon fastdup by visual-layer, which was easy to use and fast, and fiftyone by voxel51, which seems more sophisticated (and also includes an image suite).)
Hi @geroldmeisinger, thanks for your suggestions. Indeed, Fastdup and Fiftyone's image uniqueness component seem much better than the imagededup library. I'll run tests comparing Fastdup and Fiftyone to select the better one for the image deduplication component, and I'll share the comparison results.
This issue is to create a new component for image-based deduplication for the Fondant-cc-25m data preprocessing pipeline, using the imagededup library.
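For anyone new to the topic, here is a rough sketch of the general idea behind hash-based image deduplication. It is not imagededup's, fastdup's, or fiftyone's implementation and not the component proposed here: it is a plain-Python average hash, and it assumes the images are already decoded and downscaled to small grayscale grids of the same size. The names `average_hash`, `hamming_distance`, `find_duplicates`, and the `max_distance` threshold are all made up for illustration.

```python
from typing import Dict, List, Sequence

# Assumption: each image is already a small grayscale grid (rows of 0-255 values),
# all grids having the same dimensions. Real tools resize and decode for you.
Pixels = Sequence[Sequence[int]]


def average_hash(pixels: Pixels) -> int:
    """Pack one bit per pixel: 1 if the pixel is brighter than the image mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def find_duplicates(
    images: Dict[str, Pixels], max_distance: int = 1
) -> Dict[str, List[str]]:
    """Map each image name to the other images whose hash is within max_distance bits."""
    hashes = {name: average_hash(pixels) for name, pixels in images.items()}
    return {
        name: sorted(
            other
            for other, other_hash in hashes.items()
            if other != name and hamming_distance(h, other_hash) <= max_distance
        )
        for name, h in hashes.items()
    }
```

The pairwise comparison is quadratic in the number of images, so this only illustrates the principle; at the scale of a 25M-image dataset some form of indexing or approximate nearest-neighbour search would be needed.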