Hey there, I'm new to GitHub, so if I'm creating this in the wrong area, please don't blast me. I've noticed with Whisper models and implementations like this one that the displayed text segments often "self-correct" or even jitter. This jitter is even visible in the main demo, during the Chrome extension video. My thought was: is there any way to add a buffer to the words in the client? Maybe a 1-2 word buffer, so the model has time to settle on them? I know the alternative is waiting for the segment to complete, but that would defeat the purpose of it being "live", since it wouldn't be as real-time. I'm asking because if you're trying to read along and the text is constantly updating words, punctuation, symbols, etc., it is incredibly hard to follow. So I'm curious whether anyone has found a way to prevent this while keeping the live aspect. I know YouTube live stream captions do this well, but even they have a 30-second delay. Has anyone found a workaround that is still "live" but displayed accurately?
Thanks!!
My thought was to trim off the last 3 words in the process_segments function, but that didn't help as much as I thought it would. Not sure if anyone has found a workaround, or maybe this is something I missed in the server.py script. Anyway, thanks!
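One possible reason trimming a fixed number of words falls short is that revisions are not limited to the tail: an earlier word can still change after newer ones arrive. A minimal sketch of an alternative is below, assuming the client receives the full current hypothesis as a string on every update. It shows a word only once it has appeared unchanged at the same position in two consecutive hypotheses, and still holds back the last few words. `StableWordBuffer` and `holdback` are hypothetical names for illustration, not part of this project's client.

```python
from typing import List


class StableWordBuffer:
    """Hypothetical client-side helper; not part of any real library."""

    def __init__(self, holdback: int = 2) -> None:
        self.holdback = holdback          # trailing words that are never shown yet
        self.committed: List[str] = []    # words already shown to the reader
        self.previous: List[str] = []     # words of the previous hypothesis

    def update(self, hypothesis: str) -> str:
        """Feed the latest full hypothesis; return the text that is safe to show."""
        words = hypothesis.split()

        # Length of the prefix shared by this hypothesis and the previous one.
        agree = 0
        for old, new in zip(self.previous, words):
            if old != new:
                break
            agree += 1
        self.previous = words

        # A word is stable if it is in the agreed prefix and not in the held-back tail.
        stable = min(agree, max(len(words) - self.holdback, 0))

        # Shown text is only ever extended, never rewritten, so it cannot jitter.
        if stable > len(self.committed):
            self.committed.extend(words[len(self.committed):stable])
        return " ".join(self.committed)
```

The trade-off is that a word, once shown, is never corrected, so a late revision by the model is simply ignored. Raising `holdback` reduces the number of wrong words that get locked in, at the cost of more latency.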
What you're asking for is basically impossible. The reason the text changes is that the early hypotheses are less reliable and lower confidence. You either have to wait (probably 5s is enough, IMO) or accept that it is going to change. Even people behave this way; I'll often revise my 'internal recognition' some seconds afterwards.
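The "wait a few seconds" option can be sketched as a simple display filter. This assumes each word comes with an end timestamp in seconds of audio, which may not be available in every setup; `settled_text`, `audio_now`, and `delay` are hypothetical names, and the 5-second default is just the figure suggested above.

```python
from typing import List, Tuple


def settled_text(
    words: List[Tuple[str, float]],  # (word, end time in seconds of audio); assumed input shape
    audio_now: float,                # timestamp of the most recent audio received
    delay: float = 5.0,              # how long a word must "age" before it is shown
) -> str:
    """Return only the words that ended at least `delay` seconds ago."""
    cutoff = audio_now - delay
    return " ".join(word for word, end in words if end <= cutoff)


# Usage: with 8 seconds of audio received, only words that ended by 3.0s are shown.
words = [("hello", 1.0), ("world", 2.0), ("again", 7.5)]
print(settled_text(words, audio_now=8.0))
```

Words newer than the cutoff can keep changing in the background without the reader seeing it; the displayed text lags the audio by `delay` seconds.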