Inference only uses last token except in the first forward pass #565
tom-huntington started this conversation in General
Replies: 1 comment
-
Lines 75 to 79 in eff383b — so after `forward` is called on Line 254 in eff383b, the keys and values for the audio features are also cached, but a hook is not used for these. I guess masking is what makes the caching possible.
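The caching pattern described above can be sketched framework-free. This is a simplified illustration, not Whisper's actual implementation (which installs PyTorch forward hooks on the key/value projection modules); the class and function names here are invented for the example. The key idea: the first forward pass projects the whole prompt, while every later pass projects only the newest token and appends its key/value pair to a per-layer cache that attention then reads in full.

```python
# Minimal, framework-free sketch of per-layer key/value caching.
# All names here are illustrative; Whisper achieves the same effect by
# registering forward hooks on the key/value Linear layers so their
# outputs are stored and concatenated across decoding steps.

class KVCache:
    def __init__(self):
        # layer_name -> list of (key, value) pairs, one per cached position
        self.cache = {}

    def append(self, layer_name, token_kvs):
        """Append newly computed (key, value) pairs for one layer and
        return the full cached sequence that attention would read."""
        self.cache.setdefault(layer_name, []).extend(token_kvs)
        return self.cache[layer_name]


def decode_step(cache, tokens, first_pass):
    """One decoder step: project every token on the first pass,
    but only the most recent token on subsequent passes."""
    new_tokens = tokens if first_pass else tokens[-1:]
    # Stand-in for the real key/value projections of each new token.
    token_kvs = [(f"k({t})", f"v({t})") for t in new_tokens]
    return cache.append("decoder_layer_0", token_kvs)


cache = KVCache()
# First pass: the whole prompt is projected and cached.
seq = decode_step(cache, ["<sot>", "hello"], first_pass=True)
assert len(seq) == 2
# Later passes: only the last token is projected; the cache grows by one.
seq = decode_step(cache, ["<sot>", "hello", "world"], first_pass=False)
assert len(seq) == 3
```

This is why feeding only the last token is correct: with a causal mask, earlier positions can never attend to later ones, so their key/value vectors never change and can be reused verbatim.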
-
Edit: Whoops, ignore this issue — this is just how key/value caching is implemented.
whisper/whisper/decoding.py
Lines 141 to 143 in eff383b
This is probably much more efficient, although I'm surprised: I thought Whisper would be using the full power of autoregressive language models, but it doesn't.
So this must mean there is no control over where the timestamp tokens are emitted — they just get filtered out here:
whisper/whisper/transcribe.py
Line 195 in eff383b
Actually, you can just take the argmax of the timestamp logits to get the timestamps for each word: #3 (comment)
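The idea in that linked comment can be sketched as follows. This is a hypothetical illustration, not the repo's API: restrict the logits to the timestamp portion of the vocabulary, take the argmax, and convert the winning index to seconds. The constants match Whisper's multilingual tokenizer (timestamp tokens start at id 50364 and advance in 0.02 s steps), but double-check them against the tokenizer you actually load.

```python
# Hypothetical sketch: recover a timestamp from decoder logits by taking
# the argmax over the timestamp slice of the vocabulary.
# Constants are assumptions matching Whisper's multilingual tokenizer.

TIMESTAMP_BEGIN = 50364   # id of the first timestamp token, <|0.00|>
TIME_PRECISION = 0.02     # seconds represented by one timestamp step

def timestamp_from_logits(logits):
    """Return the most likely timestamp, in seconds.

    `logits` is a flat sequence covering the whole vocabulary; only the
    slice from TIMESTAMP_BEGIN onward (the timestamp tokens) is considered.
    """
    ts_logits = logits[TIMESTAMP_BEGIN:]
    best = max(range(len(ts_logits)), key=ts_logits.__getitem__)
    return best * TIME_PRECISION

# Toy example: a vocabulary of TIMESTAMP_BEGIN + 5 entries where the
# third timestamp token (index 2, i.e. <|0.04|>) scores highest.
logits = [0.0] * (TIMESTAMP_BEGIN + 5)
logits[TIMESTAMP_BEGIN + 2] = 9.0
print(timestamp_from_logits(logits))  # → 0.04
```

In practice you would do this per decoding step on the real logits tensor rather than on a Python list, but the slicing-and-argmax logic is the same.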