Whisper speech to text #71

Open
nalbion opened this issue Sep 5, 2023 · 3 comments

Comments

@nalbion
Member

nalbion commented Sep 5, 2023

I've implemented support for various Whisper implementations: https://github.com/OpenASR/idiolect/tree/feature/navigation-with-whisper/src/main/java/org/openasr/idiolect/asr/whisper.

whisper-server

The JNA bindings for Whisper.cpp are nearly ready - ggerganov/whisper.cpp#1246.

Alternatively, there is a JNI wrapper: https://github.com/GiviMAD/whisper-jni

@nalbion
Member Author

nalbion commented Sep 5, 2023

@breandan are you able to get any of these working on Mac?

@breandan
Collaborator

breandan commented Sep 5, 2023

I was able to run the Whisper.cpp stream demo successfully, but I am unable to build the JAR on my machine. At first I encountered Execution failed for task ':javadoc' (here is the full stack trace I received, and here are the contents of the file javadoc.options after running ./gradlew build). I then ran ./gradlew build -x javadoc to skip the Javadoc task and got java.lang.UnsatisfiedLinkError: Unable to load library 'whisper', so I copied the file whisper.cpp/libwhisper.dylib to whisper.cpp/bindings/java/libwhisper.dylib. Then I got the error:

whisper_init_from_file_no_state: loading model from '../../models/ggml-tiny.en.bin'
whisper_init_from_file_no_state: failed to open '../../models/ggml-tiny.en.bin'
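
Both failures so far (the native library not being found, and the model file not being found) come down to where files sit relative to the directory the tests run from. As a rough sketch, not part of the whisper.cpp bindings, a small preflight check could print where each file is expected. The class name WhisperPreflight and its arguments are hypothetical; the default model path is the one from the log above, and the default library directory "." assumes the library sits in the working directory, as after the copy described earlier.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; it is not part of whisper.cpp or its Java bindings.
public class WhisperPreflight {

    /** Resolves a possibly relative path against the given working directory. */
    static Path resolve(Path workingDir, String path) {
        return workingDir.resolve(path).normalize();
    }

    /** Platform-specific file name of the native library, e.g. libwhisper.dylib on macOS. */
    static String nativeLibraryFileName() {
        return System.mapLibraryName("whisper");
    }

    public static void main(String[] args) {
        // The working directory of the current process (for Gradle tests, usually the project directory).
        Path cwd = Paths.get("").toAbsolutePath();

        // Assumed defaults: the model path from the log, and the current directory for the library.
        String modelArg = args.length > 0 ? args[0] : "../../models/ggml-tiny.en.bin";
        String libDirArg = args.length > 1 ? args[1] : ".";

        Path model = resolve(cwd, modelArg);
        Path lib = resolve(cwd, libDirArg).resolve(nativeLibraryFileName());

        System.out.println("model:   " + model + (Files.isRegularFile(model) ? " (found)" : " (MISSING)"));
        System.out.println("library: " + lib + (Files.isRegularFile(lib) ? " (found)" : " (MISSING)"));
    }
}
```

Run from whisper.cpp/bindings/java, this should show whether ../../models/ggml-tiny.en.bin resolves to an existing file. As an alternative to copying the dylib, JNA also searches the directories listed in the jna.library.path system property.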

So I then tried building tiny.en using the same instructions as for base.en, but got another error:

I whisper.cpp build info: 
I UNAME_S:  Darwin
I UNAME_P:  arm
I UNAME_M:  arm64
I CFLAGS:   -I.              -O3 -DNDEBUG -std=c11   -fPIC -D_DARWIN_C_SOURCE -pthread -DGGML_USE_ACCELERATE
I CXXFLAGS: -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -D_DARWIN_C_SOURCE -pthread
I LDFLAGS:   -framework Accelerate
I CC:       Apple clang version 14.0.3 (clang-1403.0.22.14.1)
I CXX:      Apple clang version 14.0.3 (clang-1403.0.22.14.1)

bash ./models/download-ggml-model.sh tiny.en
Downloading ggml model tiny.en from 'https://huggingface.co/ggerganov/whisper.cpp' ...
Model tiny.en already exists. Skipping download.

===============================================
Running tiny.en on all samples in ./samples ...
===============================================

----------------------------------------------
[+] Running tiny.en on samples/jfk.wav ... (run 'ffplay samples/jfk.wav' to listen)
----------------------------------------------

whisper_init_from_file_no_state: loading model from 'models/ggml-tiny.en.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51864
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 384
whisper_model_load: n_audio_head  = 6
whisper_model_load: n_audio_layer = 4
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 384
whisper_model_load: n_text_head   = 6
whisper_model_load: n_text_layer  = 4
whisper_model_load: n_mels        = 80
whisper_model_load: ftype         = 1
whisper_model_load: qntvr         = 0
whisper_model_load: type          = 1
whisper_model_load: mem required  =  201.00 MB (+    3.00 MB per decoder)
whisper_model_load: adding 1607 extra tokens
whisper_model_load: model ctx     =   73.62 MB
whisper_model_load: model size    =   73.54 MB
whisper_init_state: kv self size  =    2.62 MB
whisper_init_state: kv cross size =    8.79 MB
whisper_init_state: loading Core ML model from 'models/ggml-tiny.en-encoder.mlmodelc'
whisper_init_state: first run on a device may take a while ...
whisper_init_state: failed to load Core ML model from 'models/ggml-tiny.en-encoder.mlmodelc'
error: failed to initialize whisper context
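
Judging only from the log lines above, the Core ML encoder path appears to be derived from the model path by swapping the .bin suffix for -encoder.mlmodelc. Under that assumption, a minimal sketch to check whether the encoder exists for a given model could look like this. The class CoreMlEncoderCheck is hypothetical, and the naming rule is inferred from the logs rather than taken from the whisper.cpp source.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; the naming rule is inferred from the log output.
public class CoreMlEncoderCheck {

    /** Maps e.g. models/ggml-tiny.en.bin to models/ggml-tiny.en-encoder.mlmodelc. */
    static String expectedEncoderPath(String modelPath) {
        String suffix = ".bin";
        String base = modelPath.endsWith(suffix)
                ? modelPath.substring(0, modelPath.length() - suffix.length())
                : modelPath;
        return base + "-encoder.mlmodelc";
    }

    public static void main(String[] args) {
        // Assumed default: the model path that appears in the log.
        String model = args.length > 0 ? args[0] : "models/ggml-tiny.en.bin";
        Path encoder = Paths.get(expectedEncoderPath(model));

        // Files.exists is used so the check passes whether the encoder is a file or a directory.
        System.out.println(encoder + (Files.exists(encoder) ? " (found)" : " (MISSING)"));
    }
}
```

If the encoder for tiny.en is reported missing while the one for base.en is found, that would be consistent with the Core ML model having been generated only for base.en.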

So instead of using tiny.en, I changed this line to String modelName = "../../models/ggml-base.en.bin"; and tried again to build the JAR via ./gradlew build -x javadoc, but encountered the following error:

Starting a Gradle Daemon, 2 incompatible Daemons could not be reused, use --status for details

> Task :test
whisper_init_from_file_no_state: loading model from '../../models/ggml-base.en.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51864
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 512
whisper_model_load: n_audio_head  = 8
whisper_model_load: n_audio_layer = 6
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 512
whisper_model_load: n_text_head   = 8
whisper_model_load: n_text_layer  = 6
whisper_model_load: n_mels        = 80
whisper_model_load: ftype         = 1
whisper_model_load: qntvr         = 0
whisper_model_load: type          = 2
whisper_model_load: mem required  =  310.00 MB (+    6.00 MB per decoder)
whisper_model_load: adding 1607 extra tokens
whisper_model_load: model ctx     =  140.66 MB
whisper_model_load: model size    =  140.54 MB
whisper_init_state: kv self size  =    5.25 MB
whisper_init_state: kv cross size =   17.58 MB
whisper_init_state: loading Core ML model from '../../models/ggml-base.en-encoder.mlmodelc'
whisper_init_state: first run on a device may take a while ...
whisper_init_state: Core ML model loaded
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x000000018393be64, pid=36696, tid=8451
#
# JRE version: OpenJDK Runtime Environment JBR-17.0.1.12-164.8-jcef (17.0.1+12) (build 17.0.1+12-b164.8)
# Java VM: OpenJDK 64-Bit Server VM JBR-17.0.1.12-164.8-jcef (17.0.1+12-b164.8, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, bsd-aarch64)
# Problematic frame:
# C  [libsystem_platform.dylib+0xe64]  _platform_strlen+0x4
#
# Core dump will be written. Default location: /cores/core.36696
#
# An error report file with more information is saved as:
# /Users/breandan/IdeaProjects/whisper.cpp/bindings/java/hs_err_pid36696.log
#
# If you would like to submit a bug report, please visit:
#   https://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

> Task :test FAILED

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':test'.
> Process 'Gradle Test Executor 1' finished with non-zero exit value 134
  This problem might be caused by incorrect test process configuration.
  Please refer to the test execution section in the User Manual at https://docs.gradle.org/8.1/userguide/java_testing.html#sec:test_execution

* Try:
> Run with --stacktrace option to get the stack trace.
> Run with --info or --debug option to get more log output.
> Run with --scan to get full insights.

* Get more help at https://help.gradle.org

BUILD FAILED in 11s
6 actionable tasks: 5 executed, 1 up-to-date

Here are the contents of the file hs_err_pid36696.log. This is possibly related to ggerganov/whisper.cpp#963.

@nalbion
Member Author

nalbion commented Sep 15, 2023

@breandan I've finally got an official whisper.cpp deployed to Maven Central

