Any known issue with 2025-01-09 on CPU using transformers getting stuck during image encoding? #235
Comments
I can confirm that I ran into similar issues with rev. 2025-01-09 on different machines.
This is also discussed here:
@autmoate @geoffroy-noel-ddh Sorry to hear that you're running into issues. Windows requires some additional steps with the latest revision. Do you have FFmpeg and pyvips installed on your machine? Detailed steps for getting the latest revision running are available here.
Thanks a lot for your reply. Yes, I was aware of the pyvips and pyvips-binary install, but I'll follow the instructions you provided again and have a look at ffmpeg. Hopefully I'll give it another try next week.
Hi, you can find all the answers in my description at the top. I'm using Ubuntu with pyvips and pyvips-binary, and ffmpeg is also installed on the machine. However, I did not see any mention of ffmpeg in the documentation you linked or in the README, so it's not clear whether it is a moondream requirement.
@parsakhaz Does the example code at the bottom of the README work for you? Have you tried to reproduce it? If it works, can you tell me exactly which step is missing from my description above? I believe it follows the installation instructions.
@geoffroy-noel-ddh @parsakhaz Starting with Ubuntu 22.04:
Retrying brought up this:
So I switched revision to: Here is my code, which is mostly the example code from the shared moondream docs link:
Inference took nearly forever (~500s on a 6-core instance with 18GB of RAM), which is a bit of a bummer but no wonder: it's a cheap CPU-only cloud instance. I had hoped it would run fast enough on CPU that I wouldn't have to go for a GPU, but speeding up CPU inference goes beyond the scope of this issue, I guess. Thanks again for the help so far!
And now for testing Win11 again (setup as mentioned above): I installed ffmpeg and checked it:
That didn't do the trick. So I followed the instructions in the moondream docs, checked all the other dependencies for Windows, and extracted the vips DLLs from /bin to the project root. That didn't help either. In Task Manager I can see that the code uses ~1GB of RAM, but the CPU is idle. I guess the RAM corresponds to the image encoding and the moondream model isn't loaded. So the last thing I tested, based on my experience with the Ubuntu instance, was to again change
Edit: See details here:
Here is the code:
Unfortunately, inference also took ~220s. I will check with a preloaded model. Thanks again for all the help!
The example transformers code in the README.md stalls on two different machines. I'm using the example code as is; the only change is the path to the image (1092x1040, PNG).
I stopped the execution after 8 minutes. The trace shows that the execution was in encode_image() > _run_vision_encoder() > vision_encoder() > mlp() > linear().
Is anyone able to confirm whether that code works on CPU, or whether this is a known issue?
The previous version of the model (2024-08-26), with its associated transformers code, works well on the same input image, in the exact same Python environment, on the same machine. It completes in 23s.
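For anyone trying to reproduce this, a trace like the one above can be captured without interrupting the run by hand. The sketch below uses Python's standard `faulthandler` module to dump the stack of every thread if a call is still running after a timeout. The helper name `run_with_stall_dump` and the 120 s default are illustrative choices, not part of moondream or transformers.

```python
import faulthandler
import sys


def run_with_stall_dump(fn, timeout_s=120.0, file=None):
    """Run fn(); if it is still running after timeout_s seconds,
    write the Python stack of every thread to `file` (default: stderr)."""
    target = sys.stderr if file is None else file
    # A watchdog thread writes the tracebacks once the timeout expires.
    # exit=False keeps the process alive so the call can still finish.
    faulthandler.dump_traceback_later(timeout_s, exit=False, file=target)
    try:
        return fn()
    finally:
        # Cancel the watchdog if fn() returned (or raised) before the timeout.
        faulthandler.cancel_dump_traceback_later()


# Hypothetical usage, with `model` and `image` loaded as in the README example:
#   encoded = run_with_stall_dump(lambda: model.encode_image(image), timeout_s=120)
```

The dump only shows Python frames, so a stall inside native code appears as the last Python call that entered it, such as `linear()` in the trace above.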
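To make the comparison between revisions easier to report, here is a minimal timing sketch using only the standard library. The names `timed` and `slowdown` are hypothetical helpers, and the labels are just the revision dates mentioned in this thread. The model calls themselves are left out.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the wall-clock duration (seconds) of the enclosed block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


def slowdown(results, baseline, candidate):
    """Return how many times longer `candidate` took than `baseline`."""
    return results[candidate] / results[baseline]


# Hypothetical usage, one block per revision of the model:
#   results = {}
#   with timed("2024-08-26", results):
#       ...  # run inference with the older revision
#   with timed("2025-01-09", results):
#       ...  # run inference with the newer revision
#   print(slowdown(results, "2024-08-26", "2025-01-09"))
```

With the numbers reported here (23s for the older revision, and a run stopped after 8 minutes for the newer one), the slowdown is at least 480 / 23, or roughly 20x.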
System
pip install transformers torch einops pillow pyvips pyvips-binary torchvision
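Since the thread keeps coming back to whether all dependencies are present, a quick check like the one below may help when reporting a setup. It is a sketch that uses only the standard library. The module list mirrors the pip command above, with pillow under its import name `PIL`. Leaving out pyvips-binary is an assumption: it is expected to supply the native library behind pyvips rather than a module you import directly.

```python
import importlib.util
import shutil

# Import names for the packages in the pip command (pillow is imported as PIL).
REQUIRED_MODULES = ["transformers", "torch", "einops", "PIL", "pyvips", "torchvision"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the names that cannot be found in the current Python environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def ffmpeg_on_path():
    """Return True if an ffmpeg executable is found on PATH."""
    return shutil.which("ffmpeg") is not None


if __name__ == "__main__":
    print("missing modules:", missing_modules())
    print("ffmpeg on PATH:", ffmpeg_on_path())
```

This only confirms that the packages can be located. It does not import them, so a native library that fails to load (for example the vips DLLs on Windows) would not be caught here.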