
[experimental-webgpu] - Configuring Encoder/Decoder Precision with dtype for Local Models #50

Open
kostia-ilani opened this issue Nov 17, 2024 · 2 comments


@kostia-ilani

Hello,

I’m using whisper-web (experimental-webgpu branch) with local models (env.allowLocalModels = true and env.localModelPath = "./models"), and I’m running into problems setting distinct dtype values for encoder_model and decoder_model_merged with the small model.

The error I see:

Uncaught (in promise) Error: Can't create a session. ERROR_CODE: 7, ERROR_MESSAGE: Failed to load model because protobuf parsing failed.

Is there a specific convention for the key names or values when setting dtype for encoder/decoder precision levels (i.e., matching the model's ONNX file names)?

const transcriber = await pipeline(
  "automatic-speech-recognition",
  "my-whisper-model",
  {
    dtype: {
      encoder_model: "fp32",
      decoder_model_merged: "q4"
    },
    device: "webgpu"
  }
);
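As a rough sketch of what the question is getting at: transformers.js resolves each dtype entry to a filename suffix and looks for the matching file under the model's onnx/ folder. The suffix table below is an assumption based on the library's usual quantized-export naming (e.g. decoder_model_merged_q4.onnx); verify it against your transformers.js version.

```javascript
// Assumed dtype → filename-suffix mapping used when resolving local
// ONNX files (fp32 maps to no suffix). Verify against your version
// of transformers.js before relying on it.
const DTYPE_SUFFIX = {
  fp32: "",
  fp16: "_fp16",
  int8: "_int8",
  uint8: "_uint8",
  q8: "_quantized",
  q4: "_q4",
  q4f16: "_q4f16",
  bnb4: "_bnb4",
};

// Given a per-file dtype config like the one passed to pipeline(),
// list the ONNX filenames the loader would look for under
// <localModelPath>/<model>/onnx/.
function expectedOnnxFiles(dtypeConfig) {
  return Object.entries(dtypeConfig).map(
    ([fileName, dtype]) => `${fileName}${DTYPE_SUFFIX[dtype]}.onnx`
  );
}

console.log(
  expectedOnnxFiles({ encoder_model: "fp32", decoder_model_merged: "q4" })
);
// → [ 'encoder_model.onnx', 'decoder_model_merged_q4.onnx' ]
```

If either expected file is missing (or present but not a real ONNX binary), the session creation fails before any dtype logic runs.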
xenova (Owner) commented Nov 17, 2024

Might be related to huggingface/transformers.js#1025 (comment)
(Have you pulled the git lfs files into your local folder?)

kostia-ilani (Author) commented

Thanks for the fast reply, @xenova. I'm using local files stored under ./models, so it's not a Git issue.

The files were taken from
https://huggingface.co/Xenova/whisper-small/tree/main/onnx

Do you have any ideas about what might be causing the issue?
