GPU Memory Usage - whisper_state - Max States in Multi-Threaded solution #2103

bradmit · 2024-04-29T01:47:20Z

I've been experimenting with a GPU in an application I built on whisper.cpp, but I'm running into a limit on how many whisper states I can create at any one time. There will always be some limit, obviously, but the problem is that I'm processing data in near real time (NRT), and because resources cap the number of states, the GPU sits idle much of the time.
If I'm processing data that isn't NRT, that's fine; GPU at 100% is what you'd expect.

The 8GB of memory holds the small, medium and large models, and the rest is taken up by 4 whisper states. Given that the state holds a lot of context information, is the correct approach to free it once you've finished processing (before more data arrives), or to hold onto the state (i.e. you need cards with more memory)?
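
To make the two options concrete, here is a minimal sketch of a bounded state pool, not a recommendation of either approach. `StatePool` and its `keep` flag are hypothetical names invented for illustration. The pool is generic so it compiles without whisper.cpp; the comments show where `whisper_init_state` and `whisper_free_state` from `whisper.h` would plug in.

```cpp
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical bounded pool: at most `capacity` states are alive at once.
// With whisper.cpp, `create` would wrap whisper_init_state(ctx) and
// `destroy` would wrap whisper_free_state(state).
template <typename State>
class StatePool {
public:
    StatePool(std::size_t capacity,
              std::function<State*()> create,
              std::function<void(State*)> destroy)
        : capacity_(capacity),
          create_(std::move(create)),
          destroy_(std::move(destroy)) {}

    // Free any states that were kept idle in the pool.
    ~StatePool() {
        for (State* s : idle_) destroy_(s);
    }

    // Block until a state is available without exceeding the capacity.
    State* acquire() {
        std::unique_lock<std::mutex> lock(mu_);
        cv_.wait(lock, [&] { return !idle_.empty() || live_ < capacity_; });
        if (!idle_.empty()) {
            State* s = idle_.back();  // reuse a state that was held
            idle_.pop_back();
            return s;
        }
        ++live_;                      // reserve a slot, then create unlocked
        lock.unlock();
        State* s = create_();
        if (!s) {                     // creation failed: give the slot back
            lock.lock();
            --live_;
            lock.unlock();
            cv_.notify_one();
        }
        return s;
    }

    // keep == true : hold the state (memory stays allocated, no re-init cost).
    // keep == false: free the state so its memory is available again.
    void release(State* s, bool keep) {
        if (!keep) destroy_(s);
        {
            std::lock_guard<std::mutex> lock(mu_);
            if (keep) idle_.push_back(s);
            else --live_;
        }
        cv_.notify_one();
    }

    // Number of states currently allocated (in use or idle).
    std::size_t live() const {
        std::lock_guard<std::mutex> lock(mu_);
        return live_;
    }

private:
    std::size_t capacity_;
    std::function<State*()> create_;
    std::function<void(State*)> destroy_;
    mutable std::mutex mu_;
    std::condition_variable cv_;
    std::vector<State*> idle_;
    std::size_t live_ = 0;
};
```

Under these assumptions, a worker thread would call `acquire()`, run `whisper_full_with_state` on its chunk, then call `release(state, keep)`. Passing `keep = false` corresponds to freeing the state between chunks, and `keep = true` to holding it; which one suits an NRT workload depends on how the cost of re-creating a state compares with the memory it ties up, which this sketch does not measure.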

Replies: 0 comments
