
Speed up model.generate() with coca? #475

Open
Pclanglais opened this issue Mar 26, 2023 · 4 comments

Comments

@Pclanglais
I am building an image classification workflow on top of coca captions and embeddings. The only downside is that this is slow (about 100 images per minute on a Google Colab).

So two related questions:

  • Is it possible to extract the embeddings already computed inside model.generate()? Currently I call encode_image on top of it, which is essentially a duplicate forward pass (roughly sketched below).
  • Are there settings that can speed up model.generate at the expense of accuracy? In my current workflow I only need the top characteristic words from the captions of images that belong to the same cluster. I'm not entirely clear on how beam search works.
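For context, roughly what the current workflow looks like (a sketch; the model name, checkpoint tag, and image path are placeholders, not details from this thread). The point is that the image encoder runs once inside generate() and again inside encode_image():

```python
import torch
from PIL import Image
import open_clip

# Placeholder model/checkpoint; any CoCa checkpoint would illustrate the same issue.
model, _, transform = open_clip.create_model_and_transforms(
    "coca_ViT-L-14", pretrained="mscoco_finetuned_laion2B-s13B-b90k"
)
model.eval()

im = transform(Image.open("example.jpg").convert("RGB")).unsqueeze(0)

with torch.no_grad():
    generated = model.generate(im)       # caption tokens (image encoded here)
    embedding = model.encode_image(im)   # embedding (image encoded a second time)

caption = open_clip.decode(generated[0]).split("<end_of_text>")[0].replace("<start_of_text>", "")
print(caption, embedding.shape)
```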
@gpucce
Contributor

gpucce commented Mar 27, 2023

@Pclanglais Hi, I will work on 1 as soon as I can, since it is not possible right now. For 2, did you try setting generation_type="top_p" inside .generate()? That should be faster and also give you more control over the generation if you set the "top_p" argument appropriately.
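A sketch of what that could look like, reusing model and im from the sketch in the first comment; the argument values below are illustrative, not recommendations:

```python
import torch

# Sampling-based generation instead of the default beam search; usually faster,
# trading quality via top_p / temperature. Values here are only illustrative.
with torch.no_grad():
    generated = model.generate(
        im,
        generation_type="top_p",  # switch from "beam_search" to nucleus sampling
        top_p=0.9,                # sample only from the top 90% of probability mass
        temperature=1.0,
        seq_len=20,               # shorter captions also reduce generation time
    )
```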

@Pclanglais
Author

Hello @gpucce, thanks a lot. For 1, I just wanted to be sure I hadn't missed an option; I can fork it on my side. 2 is a very good idea: I'm going to test it right away.

@rom1504
Collaborator

rom1504 commented Apr 10, 2023

Duplicate of #409, but let's keep both open.

This is an important issue to fix for usability

@sramshetty
Contributor

@Pclanglais Maybe a bit late, but if you aren't batching yet you can try #498. When I try replicating your findings, assuming a GPU, I get around 100 images processed in roughly 40 seconds even with batch size 1. You can already batch with model.generate() (see the sketch below); the PR just aims to make that easier going forward.
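For reference, a minimal batching sketch, again reusing model, transform, and open_clip from the first comment; pil_images is a hypothetical list of PIL images:

```python
import torch

# Stack several preprocessed images into one tensor and caption them in a single
# generate() call; this is usually much faster than looping one image at a time.
batch = torch.stack([transform(img) for img in pil_images])  # shape (B, 3, H, W)

with torch.no_grad():
    generated = model.generate(batch)  # one row of caption tokens per image

captions = [
    open_clip.decode(g).split("<end_of_text>")[0].replace("<start_of_text>", "")
    for g in generated
]
```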
