This example demonstrates how to run Whisper on edge devices with BentoML and whisper.cpp, using a custom C++ Runner.
Install required dependencies:
pip install -r requirements.txt
To load a pretrained model, use Whisper.from_pretrained():
from whispercpp import Whisper
model = Whisper.from_pretrained("tiny.en")
# Preprocess the audio file and transcribe it. Any preprocessing library works;
# this example uses librosa for convenience.
import librosa
import numpy as np
# Whisper models expect 16 kHz mono audio; librosa resamples on load.
audio, _ = librosa.load("/path/to/audio.wav", sr=16000)
model.transcribe(audio.astype(np.float32))
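As shown above, transcribe() takes float32 samples. If your audio arrives as raw 16-bit PCM instead of a file on disk, it must be scaled to float32 in [-1.0, 1.0] first. A minimal NumPy sketch (the helper name pcm16_to_float32 is ours, not part of whispercpp):

```python
import numpy as np

def pcm16_to_float32(pcm: np.ndarray) -> np.ndarray:
    # Hypothetical helper: scale int16 PCM samples to float32 in [-1.0, 1.0],
    # the format transcribe() expects.
    return pcm.astype(np.float32) / 32768.0

samples = np.array([0, 16384, -32768], dtype=np.int16)
floats = pcm16_to_float32(samples)  # array([ 0. ,  0.5, -1. ], dtype=float32)
```

The resulting array can be passed straight to model.transcribe().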
To package the bento, use build_bento.py:
python build_bento.py
To override an existing bento, pass --override:
python build_bento.py --override
To containerize the bento, run bentoml containerize:
bentoml containerize whispercpp_asr
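Once the image is built, it can be run locally with Docker; BentoML services listen on port 3000 by default. A sketch, assuming the default image tag (the exact tag printed by bentoml containerize may include a version suffix):

```shell
# Run the containerized service locally, exposing the default BentoML port.
docker run --rm -p 3000:3000 whispercpp_asr:latest
```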