We might need to be more specific about which probability is covered up to p.
The key method in `TokenProposal` is `traverse_trie(context, p_llm)`, which returns an iterator over the possible next tokens `token` and their raw scores `p_llm(token | context) * p_guide(token | context)`. The challenge in defining the set of top-p tokens is that we cannot compute the normalization constant `Z(context) = sum_{token} p_llm(token | context) * p_guide(token | context)` without materializing the complete distribution.
Thus, for efficiency, the top-p set would need to be defined on the raw scores `p_llm(token | context) * p_guide(token | context)`, which could be really small if the LLM and the guide disagree. However, if we rescaled them by `Z(context)` they would not be small, because we'd be rescaling by the total agreement.
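A toy numeric sketch of the issue (all numbers made up): when the LLM and the guide mostly disagree, every raw score is tiny, so a fixed top-p threshold on raw scores behaves very differently from the same threshold on the normalized distribution.

```python
# Hypothetical per-token probabilities; the guide mostly disagrees with the LLM.
p_llm = {"a": 0.6, "b": 0.3, "c": 0.1}
p_guide = {"a": 0.05, "b": 0.04, "c": 0.0}

# Raw scores p_llm(token) * p_guide(token): all well below 0.05.
raw = {t: p_llm[t] * p_guide[t] for t in p_llm}   # {"a": 0.03, "b": 0.012, "c": 0.0}

# Z(context) is the total agreement between LLM and guide.
Z = sum(raw.values())                             # 0.042

# Rescaled by Z, the scores are no longer small.
normalized = {t: s / Z for t, s in raw.items()}   # {"a": ~0.714, "b": ~0.286, "c": 0.0}

# A threshold of 0.5 on raw scores selects nothing; on the normalized
# distribution it selects {"a"}.
```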
Modify the top-K `TokenProposal` to lazily materialize the set of most probable tokens with cumulative probability less than probability `p`.
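One possible shape for the lazy materialization, sketched under two simplifying assumptions that sidestep the normalization problem above: the iterator yields tokens in non-increasing raw-score order (a best-first trie traversal could provide this), and an estimate of `Z(context)` is passed in, since the real difficulty is precisely that `Z` is unknown without a full traversal. The function name and `z` parameter are illustrative, not part of the existing API.

```python
def lazy_top_p(scored_tokens, p, z):
    """Yield (token, raw_score) pairs until the cumulative normalized
    probability reaches p.

    Assumptions (for illustration only):
      - `scored_tokens` yields pairs in non-increasing score order,
        as a best-first traverse_trie-style traversal might;
      - `z` is an estimate of Z(context), which in practice would have
        to be bounded or estimated rather than computed exactly.
    """
    cumulative = 0.0
    for token, raw in scored_tokens:
        if cumulative >= p:
            break  # the yielded set already covers probability mass >= p
        yield token, raw
        cumulative += raw / z
```

With the toy raw scores from before (Z = 0.042), `lazy_top_p(..., p=0.7, z=0.042)` stops after `"a"` alone, since its normalized mass (~0.714) already exceeds `p`, while `p=0.9` also pulls in `"b"`.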