v0.0.23
Speeding Things Up ⚡
This release moves the underlying constrained decoding engine from outlines to guidance.
The outlines approach compiles constraints and tokens into an FSM (finite-state machine), and that compilation turned out to be a bottleneck for many BlendSQL operations. The trie-based guidance approach runs faster in settings where the constraints aren't known ahead of time, as is the case for many BlendSQL ingredients.
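To make the trie idea concrete, here is a minimal sketch of constraining output to a set of options that only becomes known at query time. It is not how guidance is implemented internally: it works on characters rather than tokenizer tokens, and the names `build_trie`, `allowed_next`, and `END` are hypothetical, chosen only for illustration.

```python
from typing import Dict, Iterable, Set

# Sentinel key marking "a complete option ends here" (hypothetical name).
# It is longer than one character, so it can never collide with a character key.
END = "<end>"


def build_trie(options: Iterable[str]) -> Dict[str, dict]:
    """Build a character-level trie from the allowed option strings."""
    root: Dict[str, dict] = {}
    for option in options:
        node = root
        for ch in option:
            node = node.setdefault(ch, {})
        node[END] = {}
    return root


def allowed_next(trie: Dict[str, dict], prefix: str) -> Set[str]:
    """Return the characters (or END) that may follow `prefix`."""
    node = trie
    for ch in prefix:
        if ch not in node:
            return set()  # prefix is not part of any allowed option
        node = node[ch]
    return set(node.keys())


if __name__ == "__main__":
    # Options like these might come from a database column at query time.
    trie = build_trie(["yes", "no", "none"])
    print(sorted(allowed_next(trie, "")))
    print(sorted(allowed_next(trie, "no")))
```

Building the trie costs time proportional to the total length of the options, so it can be rebuilt cheaply for each new constraint set instead of paying an up-front compilation step.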
Below are the old/new runtimes for the benchmarks, using HuggingFaceTB/SmolLM-135M.
Before:
Task | Average Runtime | # Unique Queries |
---|---|---|
financials | 0.0427749 | 7 |
rugby | 3.54232 | 4 |
national_parks | 2.63405 | 5 |
1966_nba_draft | 3.65771 | 2 |
After:
Task | Average Runtime | # Unique Queries |
---|---|---|
financials | 0.0487881 | 7 |
rugby | 0.909974 | 4 |
national_parks | 2.13209 | 5 |
1966_nba_draft | 1.39948 | 2 |
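
For a quick comparison of the two tables, the per-task speedup can be computed as the old runtime divided by the new one. This is a small sketch that uses only the numbers reported above; the helper name `speedups` is hypothetical.

```python
# Average runtimes copied from the "Before" and "After" tables.
before = {
    "financials": 0.0427749,
    "rugby": 3.54232,
    "national_parks": 2.63405,
    "1966_nba_draft": 3.65771,
}
after = {
    "financials": 0.0487881,
    "rugby": 0.909974,
    "national_parks": 2.13209,
    "1966_nba_draft": 1.39948,
}


def speedups(old: dict, new: dict) -> dict:
    """Return old/new runtime ratios per task (values above 1 mean faster)."""
    return {task: old[task] / new[task] for task in old}


if __name__ == "__main__":
    for task, ratio in speedups(before, after).items():
        print(f"{task}: {ratio:.2f}x")
```

A ratio above 1 means the guidance-based run was faster. The financials task comes out slightly below 1, though both of its runtimes are already small compared with the other tasks.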
Anthropic models are also now supported.
What's Changed
Full Changelog: v0.0.21...v0.0.22