Below is a minimal example where the distribution over tokens defined by the `CharacterProposal` differs from that of the `TokenProposal` with `K=None` (i.e., the local product of experts). In this example, the character proposal places too much probability on `a` because paths containing `a` occur too frequently.
Our weights will correct for this issue, but this nonetheless means that we are obtaining sub-optimal token samples from the character proposal.
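As a generic illustration of the weight correction (this is not the library's API and not the example referred to above; the token set and probabilities are made-up numbers), the sketch below shows a proposal that over-weights `a`, the importance weights that restore the target distribution, and the resulting loss in sample efficiency. The helper names are hypothetical.

```python
# Hypothetical target over three tokens (a stand-in for the local product of experts).
target = {"a": 0.5, "b": 0.3, "c": 0.2}
# Hypothetical proposal that places too much probability on "a".
proposal = {"a": 0.8, "b": 0.1, "c": 0.1}


def importance_weights(target, proposal):
    """Weight w(t) = target(t) / proposal(t) for each token t."""
    return {t: target[t] / proposal[t] for t in target}


def corrected_distribution(target, proposal):
    """Reweight the proposal by the importance weights and normalize."""
    w = importance_weights(target, proposal)
    unnorm = {t: proposal[t] * w[t] for t in proposal}
    z = sum(unnorm.values())
    return {t: v / z for t, v in unnorm.items()}


def expected_ess_fraction(target, proposal):
    """Large-sample effective sample size per draw: 1 / E_q[w^2], since E_q[w] = 1."""
    w = importance_weights(target, proposal)
    second_moment = sum(proposal[t] * w[t] ** 2 for t in proposal)
    return 1.0 / second_moment


if __name__ == "__main__":
    print(corrected_distribution(target, proposal))
    print(expected_ess_fraction(target, proposal))
```

With these numbers the reweighted distribution matches the target, but the effective sample size per draw is only about 0.62, which is the sense in which the samples are sub-optimal even though the weights remove the bias.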
The `CharacterProposal` is designed to be fast while still hopefully being a good proxy for `TokenProposal(K=None)`. How good is it in practice?