I've been experimenting with using a GPU in an application I built on whisper.cpp, but I'm running into a limit on how many whisper states I can create at any one time. There is always some limit, obviously, but the problem shows up when processing data in near real time (NRT): because the number of states is capped by GPU memory, the GPU is going to sit idle a large amount of the time.
If I'm processing data that's not NRT, that's fine; GPU at 100% is what you'd expect.
| N/A 60C P0 26W / 70W | 8041MiB / 15360MiB | 0% Default |
| N/A 60C P0 26W / 70W | 14397MiB / 15360MiB | 1% Default |
| N/A 64C P0 75W / 70W | 14727MiB / 15360MiB | 100% Default |
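For a rough sense of the budget, the arithmetic on those rows can be written out as below. This is only a sketch: it assumes the first row is the three models loaded with no states, the second row is the same plus the 4 idle states, and that the states are all roughly the same size (they may not be if they belong to different models). The constant and function names are mine, not anything from whisper.cpp.

```cpp
#include <cassert>
#include <cstdio>

// Figures read from the nvidia-smi rows above, in MiB.
constexpr int kTotalMiB      = 15360;  // card capacity
constexpr int kModelsOnlyMiB = 8041;   // assumed: small + medium + large, no states
constexpr int kWithStatesMiB = 14397;  // assumed: models plus 4 idle whisper states
constexpr int kNumStates     = 4;

// Average memory attributed to one state (assumes equal-sized states).
constexpr int per_state_mib() {
    static_assert(kNumStates > 0, "need at least one state to average over");
    return (kWithStatesMiB - kModelsOnlyMiB) / kNumStates;
}

// How many more states of that average size fit in the remaining memory.
constexpr int extra_states_that_fit() {
    return (kTotalMiB - kWithStatesMiB) / per_state_mib();
}

// Prints the budget; call from your own main().
inline void print_budget() {
    std::printf("per state: ~%d MiB, free: %d MiB, extra states that fit: %d\n",
                per_state_mib(), kTotalMiB - kWithStatesMiB,
                extra_states_that_fit());
}
```

Under those assumptions each state costs roughly 1.5 GiB, and the memory left over on the card is smaller than one more state.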
The 8 GB of memory is the small, medium and large models being loaded, and the rest is taken up by 4 whisper states. Given that there is a fair amount of context information in each state, is the correct approach to free a state as soon as you've finished processing (before you get more data), or to hold onto the states (i.e. I need cards with more memory)?
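To make the two options concrete, here is a minimal sketch of a bounded pool that supports both: a caller can hand a state back to be kept warm, or ask for it to be freed straight away so the memory is returned. `StatePool` and everything in it is hypothetical and deliberately generic, so it compiles without whisper.cpp. In a real application I would expect the factory to wrap `whisper_init_state(ctx)` and the deleter to wrap `whisper_free_state(state)`, with `whisper_full_with_state` run between `acquire` and `release`, but I have not tested it against the library.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical bounded pool: at most `capacity` states exist at once.
template <typename State>
class StatePool {
public:
    using Factory = std::function<State*()>;
    using Deleter = std::function<void(State*)>;

    StatePool(std::size_t capacity, Factory make, Deleter destroy)
        : capacity_(capacity), make_(std::move(make)), destroy_(std::move(destroy)) {
        assert(capacity_ > 0);
    }

    // Frees idle states only; callers must release what they acquired first.
    ~StatePool() {
        for (State* s : idle_) destroy_(s);
    }

    StatePool(const StatePool&) = delete;
    StatePool& operator=(const StatePool&) = delete;

    // Blocks until a warm state is idle or there is room to create a new one.
    State* acquire() {
        std::unique_lock<std::mutex> lock(mu_);
        cv_.wait(lock, [&] { return !idle_.empty() || created_ < capacity_; });
        if (!idle_.empty()) {
            State* s = idle_.back();
            idle_.pop_back();
            return s;
        }
        ++created_;     // reserve the slot before the slow allocation
        lock.unlock();
        State* s = make_();
        if (s == nullptr) {  // allocation failed: give the slot back
            lock.lock();
            --created_;
            lock.unlock();
            cv_.notify_one();
        }
        return s;
    }

    // keep_warm = true  -> hold onto the state for the next request.
    // keep_warm = false -> free it now so the memory is available again.
    void release(State* s, bool keep_warm = true) {
        if (!keep_warm) destroy_(s);
        {
            std::lock_guard<std::mutex> lock(mu_);
            if (keep_warm) idle_.push_back(s);
            else --created_;
        }
        cv_.notify_one();
    }

    // Number of states currently allocated (idle or in use).
    std::size_t live() const {
        std::lock_guard<std::mutex> lock(mu_);
        return created_;
    }

private:
    const std::size_t capacity_;
    Factory make_;
    Deleter destroy_;
    mutable std::mutex mu_;
    std::condition_variable cv_;
    std::vector<State*> idle_;
    std::size_t created_ = 0;
};
```

The trade-off is the one in the question: keeping states warm avoids paying the allocation cost on every request but pins the memory, while freeing eagerly lets more streams share the card at the price of re-creating a state each time.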