diff --git a/blog/2024-02-05-compressed-fsm.md b/blog/2024-02-05-compressed-fsm.md
index c60570fa..d05833d9 100644
--- a/blog/2024-02-05-compressed-fsm.md
+++ b/blog/2024-02-05-compressed-fsm.md
@@ -109,8 +109,10 @@ to form a more frequent token
 
 Moreover, during jump-forward decoding, we've found that different tokenization strategies to the jump-forwarded part may lead to different logit distributions for the subsequent tokens. Simply appending the tokenized jump-forwarded section to the current token sequence might yield unexpected outcomes.
 
 To manage these issues, we propose the following solutions:
-- Prefer the use of a comprehensive regular expression to guide the entire decoding process, rather than employing multiple concatenated regular expressions. This approach ensures that both Finite State Machines (FSM) and Large Language Models (LLM) are cognizant of the entire decoding process, thereby minimizing boundary-related issues as much as possible.
 - We have implemented a re-tokenization mechanism during the jump-forward phase. This involves appending the string instead of the tokens, followed by a re-tokenization of the entire text. This method effectively resolves most tokenization issues and results in only a minor increase in computational overhead, approximately 4\%.
+- Prefer the use of a comprehensive regular expression to guide the entire decoding process, rather than employing multiple concatenated regular expressions. This approach ensures that both FSM and LLM are cognizant of the entire decoding process, thereby minimizing boundary-related issues as much as possible.
+
+You can also read some additional discussion in this [blog post](http://blog.dottxt.co/coalescence.html).
 
 ## Benchmark Results
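Below is a minimal sketch of the re-tokenization mechanism described in the hunk above, written against the Hugging Face `transformers` tokenizer API. The function name `jump_forward_retokenize`, the choice of the `gpt2` tokenizer, and the example strings are illustrative assumptions, not SGLang's actual implementation.

```python
# Sketch: after a jump-forward, append the jump string as *text* and
# re-tokenize the whole sequence, instead of appending its token IDs.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

def jump_forward_retokenize(prev_token_ids: list[int], jump_str: str) -> list[int]:
    """Append `jump_str` as text, then re-tokenize the entire sequence.

    Naively extending `prev_token_ids` with `tokenizer.encode(jump_str)`
    can leave a suboptimal split at the boundary (e.g. `"` + `name`
    where the vocabulary prefers a merged token), which shifts the
    logit distribution of the tokens that follow.
    """
    text = tokenizer.decode(prev_token_ids) + jump_str
    return tokenizer.encode(text)

# The merged result can differ from naive token concatenation:
prev = tokenizer.encode('{"')
naive = prev + tokenizer.encode("name")
merged = jump_forward_retokenize(prev, "name")
print(naive == merged)  # may be False when tokens merge across the boundary
```

Decoding and re-encoding the full text on every jump is what accounts for the roughly 4% overhead quoted in the diff; the trade-off is token sequences closer to what the model would have produced for the same text, which keeps subsequent logits consistent.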