Commit

add safe_subset argument to json_schema.to_regex, implement safe get_int_pattern / get_str_pattern
lapp0 committed Sep 17, 2024
1 parent 0b9a3f1 commit 0f7ecd4
Showing 4 changed files with 497 additions and 64 deletions.
11 changes: 9 additions & 2 deletions docs/reference/generation/json.md
@@ -34,14 +34,21 @@ print(result)
# User(name="John", last_name="Doe", id=11)
```

!!! Note "JSON and whitespaces"
!!! Note "JSON and unlimited patterns"

By default Outlines prevents the model from generating json with syntactic newlines, tabs, or multiple spaces. The default `whitespace_pattern` is `r"[ ]?"`. Small models tend to enter an infinite repetition loop if the `whitespace_pattern` allows infinite spacing. If you would like to allow the model to generate multiple tabs, newlines, and spaces, you can set the whitespace pattern as follows:
By default, Outlines prevents the model from generating JSON with syntactic newlines, tabs, or multiple spaces. Additionally, by default, strings cannot be longer than 256 characters, and integers are bounded between -1e19 and 1e19. Small models tend to enter an infinite repetition loop if JSON schema generation isn't constrained. If you would like to allow the model to generate multiple tabs, newlines, and spaces, you can set the whitespace pattern as follows:

```python
generator = generate.json(model, User, whitespace_pattern=r"[\n\t ]*")
```

Or you can remove all implicit constraints on JSON generation (whitespace, integer, and string) with:

```python
generator = generate.json(model, User, safe_subset=False)
```
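
To make the effect of these defaults concrete, the sketch below uses only Python's standard `re` module to compare the two whitespace patterns quoted above. The names `BOUNDED_STR`, `BOUNDED_INT`, `object_regex`, and `fully_matches` are hypothetical and exist only for this illustration; the bounded patterns are one plausible way to express "at most 256 characters" and "between -1e19 and 1e19", not the regexes Outlines actually generates.

```python
import re

# Whitespace patterns quoted in the note above.
DEFAULT_WS = r"[ ]?"        # at most one space between tokens
RELAXED_WS = r"[\n\t ]*"    # any run of newlines, tabs, and spaces

# Hypothetical bounded patterns, for illustration only (assumption:
# these are NOT the patterns produced by Outlines).
BOUNDED_STR = r'"[^"\\]{0,256}"'          # at most 256 characters, no escapes
BOUNDED_INT = r"-?(0|[1-9][0-9]{0,18})"   # at most 19 digits, so |n| < 1e19


def fully_matches(pattern: str, text: str) -> bool:
    """Return True if the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None


def object_regex(ws: str) -> str:
    """Regex for a one-key object {"id": <int>} with `ws` between tokens."""
    return r"\{" + ws + r'"id"' + ws + ":" + ws + BOUNDED_INT + ws + r"\}"


compact = '{"id": 11}'
pretty = '{\n\t"id":  11\n}'

print(fully_matches(object_regex(DEFAULT_WS), compact))    # True
print(fully_matches(object_regex(DEFAULT_WS), pretty))     # False
print(fully_matches(object_regex(RELAXED_WS), pretty))     # True
print(fully_matches(BOUNDED_INT, "9" * 19))                # True
print(fully_matches(BOUNDED_INT, "9" * 20))                # False
print(fully_matches(BOUNDED_STR, '"' + "a" * 257 + '"'))   # False
```

Every bounded pattern can only match a finite amount of text, whereas an unbounded quantifier such as `*` lets a model keep emitting the same token indefinitely, which is the repetition loop the note warns about.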


!!! Note "Performance"

`generate.json` computes an index that helps Outlines guide generation. This can take some time, but only needs to be done once. If you want to generate several times with the same schema, make sure that you only call `generate.json` once.
