Huggingface models do not work #1031

Open
MartinMayday opened this issue Dec 26, 2024 · 5 comments

Comments

@MartinMayday

I wish to use other, faster models than the default Whisper large v3 turbo.

I've tried downloading 50+ Hugging Face models, but the app crashes or fails to download.

Is there a way we can either download with our Hugging Face token or bypass the Hugging Face security restriction?

Maybe by downloading manually and loading manually into the app?

Alternatively, is there a way to add the Groq distil-whisper API to the HTTP section instead of the OpenAI API?

Many questions, would love any feedback (I am not a coder, so please be gentle).

P.S. Another question: is there a way to connect to the app via local HTTP, similar to whisper-asr-server?

P.P.S. I love the folder watch feature, and that's why I use the app, but my computer is too slow to get any optimal workflow using the default Whisper large v3 turbo models.

Best regards

@raivisdejus
Collaborator

@MartinMayday Let's see where the problem may be.

Can you give an example of a model you tried that failed? Or try this Latvian model, which definitely works: RaivisDejus/whisper-tiny-lv. Also, Hugging Face models will only work with the Huggingface whisper type, so select that in the whisper type selection box.

Most models are open and do not require any tokens. If you need to specify a token, see https://huggingface.co/docs/huggingface_hub/quick-start, as Buzz uses huggingface_hub under the hood to download models. Running huggingface-cli login may be what you need, or set the HF_TOKEN environment variable to pass the token to the download scripts.
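
For the token route, here is a minimal sketch of what a download could look like from Python, assuming huggingface_hub is installed. `resolve_token` and `download_model` are hypothetical helper names made up for illustration and are not part of Buzz; `snapshot_download` is the huggingface_hub call that fetches a whole model repository. huggingface_hub already reads HF_TOKEN on its own, so the helper only makes that step visible.

```python
import os


def resolve_token(env=None):
    """Return the Hugging Face token from HF_TOKEN, or None if it is unset or blank."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    return token or None


def download_model(repo_id, token=None):
    """Download a model repository and return the local folder path."""
    # Imported here so resolve_token() still works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    # With token=None, huggingface_hub falls back to a token saved by
    # `huggingface-cli login`, if there is one; open models need no token at all.
    return snapshot_download(repo_id=repo_id, token=token)
```

Calling `download_model("RaivisDejus/whisper-tiny-lv", token=resolve_token())` should fetch the model mentioned above. If it fails outside Buzz as well, the problem is likely with the network, the token, or the repository rather than with Buzz itself.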

Downloading models manually is a bit tricky due to the internal structure the download scripts expect, but if you get errors while downloading, something may be broken in the caches. See https://chidiwilliams.github.io/buzz/docs/faq#7-can-i-use-buzz-on-a-computer-without-internet to find the cache folder and maybe delete the old caches.
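
Before deleting anything, it can help to see what is actually in the cache folder. The sketch below only lists the top-level entries and their sizes; it deletes nothing. The folder path is whatever the FAQ gives for your system and is passed in by you, and `folder_size_bytes` and `list_cache_entries` are hypothetical names used only for this illustration.

```python
from pathlib import Path


def folder_size_bytes(folder):
    """Total size of all regular files under folder, skipping symlinks."""
    return sum(
        p.stat().st_size
        for p in Path(folder).rglob("*")
        if p.is_file() and not p.is_symlink()
    )


def list_cache_entries(cache_dir):
    """Return (name, size_in_bytes) for each top-level folder in cache_dir, sorted by name."""
    root = Path(cache_dir)
    if not root.is_dir():
        return []
    return sorted(
        (entry.name, folder_size_bytes(entry))
        for entry in root.iterdir()
        if entry.is_dir()
    )
```

An entry that is much smaller than the model it is supposed to hold is a reasonable candidate for a broken or partial download.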

Buzz supports all OpenAI-compatible APIs, including Groq. See discussion #827 for more information on how to configure Groq. This will also let you connect Buzz to any local server that supports the OpenAI API format.
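
To illustrate what "OpenAI compatible" means in practice, here is a rough sketch that talks to such an endpoint with the `openai` Python package. This is not how Buzz is implemented internally. The base URL is Groq's OpenAI-compatible endpoint; the `GROQ_API_KEY` environment variable name, the default model name `whisper-large-v3`, and the helper names `make_client` and `transcribe` are assumptions for this example, so check your provider's documentation for the exact values.

```python
import os

# Groq's OpenAI-compatible endpoint; a local server would use its own URL instead.
GROQ_BASE_URL = "https://api.groq.com/openai/v1"


def make_client(base_url=GROQ_BASE_URL, api_key=None):
    """Create an OpenAI-style client pointed at a compatible server."""
    # Imported here so transcribe() can be tried with any client-like object.
    from openai import OpenAI

    # GROQ_API_KEY is an assumed environment variable name for this sketch.
    return OpenAI(base_url=base_url, api_key=api_key or os.environ["GROQ_API_KEY"])


def transcribe(client, audio_path, model="whisper-large-v3"):
    """Send one audio file to the transcription endpoint and return the text."""
    with open(audio_path, "rb") as audio:
        result = client.audio.transcriptions.create(model=model, file=audio)
    return result.text
```

The same two functions should work against a local server by passing its address as `base_url`, which is the idea behind connecting Buzz to anything that speaks the OpenAI API format.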

If you have more questions or something is still confusing, let me know and we will figure this out. Even if everything works, I would love to hear about sections that are not easy to understand, so we can find areas of the documentation to improve.

@MartinMayday
Author

MartinMayday commented Dec 26, 2024 via email

@raivisdejus
Collaborator

CrisperWhisper looks really interesting. Unfortunately, I do not see an easy or quick way to integrate it into Buzz. It uses a forked version of the transformers library that may or may not be compatible with all the other models in Buzz. Also, the CrisperWhisper license restricts it to non-commercial use. The easiest way to integrate it would be to look for some API library that can use CrisperWhisper and then use that to integrate with Buzz.

Notes on other whisper types:

  • faster-whisper: this works if you select Faster whisper as the whisper type. Default model sizes are configured in Buzz. To use a custom CTranslate2-compatible model, you can download it in Help -> Preferences -> Models: select Faster whisper and then paste the Hugging Face id of the custom model.
  • whisper-jax: this is not supported by Buzz, and my tests show that it is not much faster than Faster whisper. I think the big speedup of the JAX approach comes on specialized TPU hardware that most regular people do not have.
  • insanely-fast-whisper: this is also not supported by Buzz, and from their GitHub repo it seems this whisper type also gets its main speedup on powerful data center GPU setups. If someone does some testing on more regular consumer hardware and it shows this whisper type gets better speed, I would be happy to add it to Buzz.
  • WhisperS2T and whisperX seem to use other whisper types under the hood, so they do similar things to Buzz but with a command line interface. Neither is currently supported. Speaker diarization is something I would like to see in some future Buzz version.
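
If you want to check outside Buzz whether a custom model id is usable with faster-whisper, a small sketch like the one below may help. It assumes the faster-whisper package is installed; `format_segments` and `transcribe_with_faster_whisper` are hypothetical helper names for this example, and the CPU and int8 settings are just one plausible choice for a slower computer.

```python
def format_segments(segments):
    """Turn faster-whisper style segments into '[start -> end] text' lines."""
    return [f"[{s.start:.2f} -> {s.end:.2f}] {s.text.strip()}" for s in segments]


def transcribe_with_faster_whisper(model_id, audio_path):
    """Load a CTranslate2 Whisper model by size name, path, or Hugging Face id and transcribe."""
    # Imported here so format_segments() can be used without faster-whisper installed.
    from faster_whisper import WhisperModel

    # int8 on CPU is a common choice for slower machines; adjust as needed.
    model = WhisperModel(model_id, device="cpu", compute_type="int8")
    segments, _info = model.transcribe(audio_path)
    return format_segments(segments)
```

If the model loads and transcribes a short file here, pasting the same id into Buzz under Faster whisper is more likely to work too.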

Support for the Turbo models is built into Buzz; you can select them in the dropdown for Whisper and Faster whisper.

@MartinMayday
Author

Thanks a bunch for the effort and detailed response 🙏

What model do you use yourself for speedy, average-quality transcription?
