Transcribe method kills jupyter notebook kernel #289

Closed
cpalappillil opened this issue Jun 8, 2023 · 5 comments

@cpalappillil

Not really sure why this could be happening. I already have the vanilla openai-whisper library running smoothly in my notebook and thought I'd try this out to see if I could improve speed.

I used Task Manager to check CPU and memory usage, and neither is spiking. I also tried reducing beam_size from 5 to 1, but I still get the same issue.

I did get this message in my anaconda environment whenever my kernel failed:

OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.

My code is very simple and follows the structure given in the README:

%%time
from faster_whisper import WhisperModel
model_size = "small"
model = WhisperModel(model_size, device="cpu", compute_type="int8")
audio = "audio_file_name"
segments, info = model.transcribe(audio, beam_size=1, vad_filter=True)
print("done")

Any help would be greatly appreciated, thanks!

@michaelgfeldman

michaelgfeldman commented Jun 9, 2023

Same for me! The script from the instructions just kills the Jupyter kernel.

model_size = "large-v2"
model = WhisperModel(model_size, device="cuda",
                     compute_type="default", download_root='./models')

segments, _ = model.transcribe("./audios_wav_resempled/32174134.wav", language='uk')
segments = list(segments)

I have CUDA 11.6 installed.
ctranslate2 is already installed via pip.

The vanilla openai-whisper library also runs smoothly.

@guillaumekln
Contributor

I think these are different issues.

@cpalappillil Do you have something else in your notebook? For example, some code related to the vanilla openai-whisper library? If so, try removing that part. If it still does not work, you can try the workaround suggested in the error message. Put these lines as the first instructions in your notebook:

import os
# os.environ, not the os module itself, is the mapping that holds environment variables
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"
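For anyone applying this workaround, a minimal sketch of it follows, assuming the variable has to be set before any library that loads an OpenMP runtime is imported. The helper name `allow_duplicate_openmp` and the list of suspect packages are hypothetical and not part of faster-whisper; the OpenMP message itself describes this setting as unsafe and unsupported.

```python
import os
import sys

def allow_duplicate_openmp(environ=os.environ, loaded_modules=sys.modules):
    """Set KMP_DUPLICATE_LIB_OK and return suspect packages that are already imported."""
    environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"
    # Assumed, non-exhaustive list of packages that may load an OpenMP runtime.
    suspects = ("torch", "whisper", "ctranslate2", "faster_whisper")
    return [name for name in suspects if name in loaded_modules]

already_loaded = allow_duplicate_openmp()
if already_loaded:
    # The variable may have been set too late for these imports.
    print("Restart the kernel and run this cell first; already imported:", already_loaded)
```

If the list comes back non-empty, restarting the kernel and running this cell before any other import is probably the safer order.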

@michaelgfeldman Is there an error log? Does it work if you use device="cpu"?
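When the kernel dies without showing an error, one way to get a log is to run the failing code in a child process and capture its exit code and stderr. The sketch below uses only the standard library; `run_isolated` is a hypothetical helper, and the model size and audio path in `REPRO` are placeholders rather than values from this thread.

```python
import subprocess
import sys

def run_isolated(code, timeout=600):
    """Run Python source in a child process and return (exit_code, stderr)."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode, result.stderr

# Placeholder reproduction script; pass it to run_isolated to test the GPU path.
REPRO = """
from faster_whisper import WhisperModel
model = WhisperModel("small", device="cuda", compute_type="default")
segments, _ = model.transcribe("audio.wav")
print(len(list(segments)))
"""

# Harmless demonstration: the child process fails, the notebook kernel survives.
exit_code, stderr_text = run_isolated("import sys; sys.stderr.write('boom'); sys.exit(3)")
print("exit code:", exit_code)
print("stderr:", stderr_text)
```

Calling `run_isolated(REPRO)` instead should leave the notebook kernel alive even if the child process crashes, and any message the native libraries print to stderr ends up in the returned string.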

@michaelgfeldman

Yes, it works very well with device="cpu" (using int8 or int16).

Yes, I have an error log from the Jupyter logs:

15:53:04.150 [error] Disposing session as kernel process died ExitCode: undefined, Reason: This version of python seems to be incorrectly compiled
(internal generated filenames are not absolute).
This may make the debugger miss breakpoints.
Related bug: http://bugs.python.org/issue1666807
/data//ASR/.conda/lib/python3.11/site-packages/traitlets/traitlets.py:2548: FutureWarning: Supporting extra quotes around strings is deprecated in traitlets 5.0. You can use 'hmac-sha256' instead of '"hmac-sha256"' if you require traitlets >=5.
warn(
/data//ASR/.conda/lib/python3.11/site-packages/traitlets/traitlets.py:2499: FutureWarning: Supporting extra quotes around Bytes is deprecated in traitlets 5.0. Use '985a8a7e-8e19-4b80-96c6-2d510fbf4e72' instead of 'b"985a8a7e-8e19-4b80-96c6-2d510fbf4e72"'.
warn(

@michaelgfeldman

I have good news! I was able to resolve the issue by installing cuDNN 8.9.0 and CUDA 11.6 (+ updating Nvidia drivers).
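A quick way to check whether the dynamic loader can find the GPU libraries before loading the model is sketched below. The file names are assumptions for cuDNN 8 and cuBLAS 11 and may differ for other versions or installation methods; `can_load` is a hypothetical helper.

```python
import ctypes
import sys

def can_load(library_name):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(library_name)
        return True
    except OSError:
        return False

# Assumed file names for cuDNN 8 and cuBLAS 11; adjust them to the installed versions.
if sys.platform == "win32":
    candidates = ["cudnn64_8.dll", "cublas64_11.dll"]
else:
    candidates = ["libcudnn.so.8", "libcublas.so.11"]

for name in candidates:
    print(name, "loadable" if can_load(name) else "NOT found by the loader")
```

A library reported as not found would point to a missing installation or a search-path problem rather than to the transcription code.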
Thanks for the help!

@cpalappillil
Author

@guillaumekln This solution works, thank you! Faster whisper is 2.5x faster than the vanilla whisper model on my device!
