Share information #10

Open
RoboMagus opened this issue Sep 22, 2023 · 6 comments

@RoboMagus

This is exactly what I have been looking for and working on for the last couple of months. It's great to see a Whisper port coming to the RK3588 NPU!

However, I was wondering where to find information on how the model is adapted. Could you share how to modify the Whisper models ourselves (e.g. how the npz weight files are produced) so we can experiment with the process?

I'm looking forward to setting up a multilingual model on my device and expanding the integration with other services.

Thanks for the good work so far!

@keveman added the "good first issue" (Good for newcomers) label on Sep 22, 2023
@keveman
Contributor

keveman commented Sep 22, 2023

Thanks for your interest. I merged ff1e2fc that has a script to convert the whisper model weights to an npz file. @RoboMagus please verify and close the issue.

@keveman self-assigned this on Sep 22, 2023
@RoboMagus
Author

Thanks for the script. It works without problems, though I've noticed a couple of issues.

  • Quite erratic behavior when running in non-English language mode. The same model works fine with regular Whisper, though that could easily be because of the additional parameters available to that script. I'm still running tests to pin down the cause of the difference.
  • Running any model larger than tiny throws errors:
    • Running transcribe_wav with the base model (English language):
      E RKNN: [19:58:52.477] failed to convert handle(1019) to fd, ret: -1, errno: 24, errstr: Too many open files
    • Running transcribe_wav with small.en:
      E RKNN: [20:01:32.183] rknn_matmul_create, matmul K must be less than or equal 2048! Segmentation fault

The repo README indicates that the base model is covered, so I did not expect RKNN to throw errors. Is something else required to get that model running?
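A possible reading of the second error, sketched under stated assumptions: Whisper's MLP layers expand to 4 × n_state, so the second linear layer in each block contracts over K = 4 × n_state. If each linear layer is mapped to a single RKNN matmul without splitting K (an assumption, not something confirmed in this thread), then small.en would exceed the K ≤ 2048 limit quoted in the log while tiny and base would not. The model widths below are the published Whisper dimensions; the helper names are hypothetical.

```python
# Published Whisper model widths (n_state); the ".en" variants share them.
WHISPER_N_STATE = {"tiny": 384, "base": 512, "small": 768, "medium": 1024, "large": 1280}

# Limit taken from the RKNN error message quoted above.
RKNN_MATMUL_MAX_K = 2048


def largest_matmul_k(model: str) -> int:
    """Largest contraction dimension, assuming one matmul per linear layer."""
    size = model.split(".")[0]  # "small.en" -> "small"
    # The MLP hidden width is 4 * n_state; the second MLP linear contracts over it.
    return 4 * WHISPER_N_STATE[size]


def fits_matmul_limit(model: str) -> bool:
    return largest_matmul_k(model) <= RKNN_MATMUL_MAX_K


if __name__ == "__main__":
    for name in ("tiny", "base", "small.en", "medium", "large"):
        print(name, largest_matmul_k(name), fits_matmul_limit(name))
```

This arithmetic only concerns the matmul limit; it says nothing about the "Too many open files" error seen with the base model.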

@yhcvb

yhcvb commented Sep 28, 2023

E RKNN: [19:58:52.477] failed to convert handle(1019) to fd, ret: -1, errno: 24, errstr: Too many open files
This error comes from the Linux limit on the number of file descriptors a process may have open. You can raise the limit by running this command:
ulimit -n 10000
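If changing the shell limit is inconvenient, the same soft limit can be raised from inside the Python process before the model is loaded. This is a minimal sketch using the standard-library `resource` module (Unix only); the function name and the default target of 10000 (mirroring the command above) are illustrative choices, not part of useful-transformers.

```python
import resource


def raise_open_file_limit(target: int = 10000) -> int:
    """Raise the soft RLIMIT_NOFILE towards `target`, capped by the hard limit.

    Returns the soft limit in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft  # already high enough, nothing to do
    # An unprivileged process may raise its soft limit only up to the hard limit.
    new_soft = target if hard == resource.RLIM_INFINITY else min(target, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    return new_soft


if __name__ == "__main__":
    print("soft fd limit is now", raise_open_file_limit())
```

If the hard limit is below the target, the function settles for the hard limit; raising it further requires administrator configuration.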

With the limit raised, I ran the base model again and hit the following error:
n_vocab=51865 python: /userdata/asr/useful-transformers/examples/whisper/whisper.cc:100: TextDecoder::TextDecoder(int, int, int, int, int, int): Assertion `n_vocab % 3 == 0' failed.
I am still trying to figure out what happened.

@keveman
Contributor

keveman commented Sep 28, 2023

The multilingual model needs some work, as n_vocab % 3 != 0 is not yet supported. I have a branch, https://github.com/usefulsensors/useful-transformers/tree/keveman/non_multiple_of_3, where the support is implemented. However, I am still testing it with the tiny and base models.
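The arithmetic behind the assertion can be checked quickly: the multilingual vocabulary (51865, as printed in the error) leaves a remainder of 1 when divided by 3, while the English-only vocabulary (51864) divides evenly. The sketch below also shows rounding up to the next multiple of 3, which is one generic way to handle such a constraint; whether the non_multiple_of_3 branch pads or does something else is not stated here, so treat the helper as a hypothetical illustration.

```python
N_VOCAB_MULTILINGUAL = 51865  # value printed in the assertion failure
N_VOCAB_ENGLISH_ONLY = 51864  # vocabulary size of the ".en" Whisper models


def round_up_to_multiple(n: int, multiple: int = 3) -> int:
    """Smallest multiple of `multiple` that is >= n (ceiling division)."""
    return -(-n // multiple) * multiple


if __name__ == "__main__":
    for n in (N_VOCAB_ENGLISH_ONLY, N_VOCAB_MULTILINGUAL):
        padded = round_up_to_multiple(n)
        print(f"n_vocab={n} remainder={n % 3} padded={padded} extra_rows={padded - n}")
```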

@yhcvb

yhcvb commented Sep 28, 2023

I switched to base.en and it runs very well now. Thank you for your help.

@RoboMagus
Author

I'm tinkering with the dev branch, which already seems to include the non_multiple_of_3 changes.

The ulimit trick got me past the "Too many open files" error, but the behavior is still a bit wonky. It doesn't crash or throw errors, but it either translates only the first word and returns just that, or it gets stuck like a broken record, repeating a single word or token (random nonsense) over and over again:
Vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem, vindem,
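Output like this can be flagged automatically. The reference openai/whisper transcription code treats a segment as a failed decode when its zlib compression ratio exceeds a threshold (default 2.4) and then retries at a higher temperature; that may be one of the extra parameters that makes regular Whisper behave better. Whether useful-transformers implements such a fallback is not stated in this thread, so the sketch below only shows the detection step, with hypothetical helper names.

```python
import zlib

# Default threshold used by the reference openai/whisper transcribe function.
COMPRESSION_RATIO_THRESHOLD = 2.4


def compression_ratio(text: str) -> float:
    """Raw size divided by zlib-compressed size; highly repetitive text scores high."""
    data = text.encode("utf-8")
    return len(data) / len(zlib.compress(data))


def looks_repetitive(text: str, threshold: float = COMPRESSION_RATIO_THRESHOLD) -> bool:
    return compression_ratio(text) > threshold


if __name__ == "__main__":
    stuck = "Vindem, " + "vindem, " * 55
    normal = "The quick brown fox jumps over the lazy dog."
    print(looks_repetitive(stuck), looks_repetitive(normal))
```

A caller could use `looks_repetitive` to discard or re-decode a segment instead of returning the looping text.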


3 participants