This repository has been archived by the owner on Apr 23, 2024. It is now read-only.
Could you please provide more information about the issue? I've tested yttm BPE-dropout in a Python REPL and obtained different subword tokenizations across runs.
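For reference, here is a minimal sketch of that kind of REPL check through the Python API (it assumes a BPE model has already been trained; `model/path` is a placeholder):

```python
import youtokentome as yttm

# Load a previously trained BPE model (placeholder path).
bpe = yttm.BPE(model="model/path")

# With dropout_prob > 0, repeated encodes of the same sentence
# are expected to sample different subword segmentations.
for _ in range(5):
    print(bpe.encode(["i do observe such behavior"],
                     output_type=yttm.OutputType.SUBWORD,
                     dropout_prob=0.3))
```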
Sure. I'm getting the same output every time when using `yttm encode`:
```
$ for i in 1 2 3 4 5; do
>   echo "i do observe such behavior" | yttm encode --model model/path --output_type subword --dropout_prob 0.3
> done
n_threads: 4
▁ i ▁do ▁ob s erve ▁s uc h ▁behavior
bytes processed: 26
n_threads: 4
▁ i ▁do ▁ob s erve ▁s uc h ▁behavior
bytes processed: 26
n_threads: 4
▁ i ▁do ▁ob s erve ▁s uc h ▁behavior
bytes processed: 26
n_threads: 4
▁ i ▁do ▁ob s erve ▁s uc h ▁behavior
bytes processed: 26
n_threads: 4
▁ i ▁do ▁ob s erve ▁s uc h ▁behavior
bytes processed: 26
```
My version is 1.0.6.
In YouTokenToMe, BPE-dropout always produces the same segmentation for the same input. That contradicts the idea described in the paper.
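For context, BPE-dropout (Provilkov et al.) makes segmentation stochastic: at each merge step, every candidate merge is dropped with probability p, so encoding the same input repeatedly should yield different subword sequences. A simplified toy sketch of that mechanism follows; the merge table and helper are illustrative only, not the yttm internals:

```python
import random

def bpe_dropout_encode(word, merges, p=0.3, rng=random):
    """Toy BPE-dropout: apply merges greedily by rank, dropping each
    candidate merge with probability p at every step."""
    ranks = {pair: i for i, pair in enumerate(merges)}
    tokens = list(word)
    while True:
        # Adjacent pairs that are in the merge table and survive dropout.
        candidates = [(ranks[(a, b)], i)
                      for i, (a, b) in enumerate(zip(tokens, tokens[1:]))
                      if (a, b) in ranks and rng.random() >= p]
        if not candidates:
            return tokens
        _, i = min(candidates)  # best-ranked surviving merge
        tokens[i:i + 2] = [tokens[i] + tokens[i + 1]]

# Repeated calls should print different segmentations of the same word.
merges = [("b", "e"), ("h", "a"), ("ha", "v"), ("i", "o"), ("be", "hav")]
for _ in range(5):
    print(bpe_dropout_encode("behavior", merges))
```

With p = 0 this reduces to deterministic BPE; the report above is that yttm's output never varies, as if the dropout probability were not applied.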