Commit

Merge pull request #8 from alea-institute/v0.1.5
merge V0.1.5 onto master
mjbommar authored Nov 9, 2024
2 parents 6bb7dd1 + 5256e24 commit 4ddb90e
Showing 10 changed files with 1,001 additions and 632 deletions.
4 changes: 4 additions & 0 deletions CHANGES.md
@@ -1,3 +1,7 @@
Version 0.1.5 (2024-11-08)
---------------------------
* Adding support for LLM-backed (decoder) search, e.g., via OpenAI, Anthropic, VLLM, Together

Version 0.1.4 (2024-09-04)
---------------------------
* Add prefix search for typeahead/search bars (with optional trie-based search)
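
The trie-based prefix search mentioned in the 0.1.4 entry can be pictured with a small sketch. This is a plain dictionary-based trie written for illustration only; it is not the library's implementation and does not use `marisa-trie`. The names `TrieNode` and `PrefixIndex` are hypothetical.

```python
class TrieNode:
    """One node per character; labels that end at this node are stored on it."""

    def __init__(self):
        self.children = {}
        self.labels = []


class PrefixIndex:
    """Hypothetical case-insensitive prefix index over a list of labels."""

    def __init__(self, labels):
        self.root = TrieNode()
        for label in labels:
            node = self.root
            for ch in label.lower():
                node = node.children.setdefault(ch, TrieNode())
            node.labels.append(label)

    def search(self, prefix):
        # Walk down the trie along the prefix; stop early if a character is missing.
        node = self.root
        for ch in prefix.lower():
            node = node.children.get(ch)
            if node is None:
                return []
        # Collect every label stored at or below the node the prefix ends on.
        found, stack = [], [node]
        while stack:
            current = stack.pop()
            found.extend(current.labels)
            stack.extend(current.children.values())
        return sorted(found)


index = PrefixIndex(["Lease", "Lessor", "Lessee", "License"])
print(index.search("les"))
```

A typeahead box would call `search` on each keystroke; the lookup cost depends on the prefix length and the number of matches rather than on the total number of labels.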
23 changes: 23 additions & 0 deletions README.md
@@ -18,6 +18,9 @@ SOLI is an open, CC-BY licensed standard designed to represent universal element
- Access detailed information about each class, including labels, definitions, and examples
- Convert classes to OWL XML or Markdown format

## Changelog
The changelog can be found at [CHANGES.md](CHANGES.md).

## Installation

You can install the SOLI Python library using pip:
@@ -58,6 +61,26 @@ for area in areas_of_law:
    print(area.label)
```

## Searching with an LLM

```python
# Search with an LLM
async def search_example():
    for result in await soli.parallel_search_by_llm(
        "redline lease agreement",
        search_sets=[
            soli.get_areas_of_law(max_depth=1),
            soli.get_player_actors(max_depth=2),
        ],
    ):
        print(result)

import asyncio
asyncio.run(search_example())
```

LLM search uses the `alea_llm_client` to provide abstraction across multiple APIs and providers.
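
To make the idea of a provider abstraction combined with parallel search concrete, here is a minimal, self-contained sketch. It does not use the real `alea_llm_client` or SOLI APIs: `ChatModel`, `EchoModel`, `search_one`, and `parallel_search` are hypothetical names, and the "model" is a stand-in that matches words instead of calling an LLM. It only illustrates the shape of the pattern, assuming one request is issued per search set and the results are merged.

```python
import asyncio
from typing import Protocol


class ChatModel(Protocol):
    """Hypothetical minimal interface; NOT the real alea_llm_client API."""

    async def complete(self, prompt: str) -> str: ...


class EchoModel:
    """Stand-in backend: returns candidates that share a word with the query."""

    async def complete(self, prompt: str) -> str:
        query, _, candidates = prompt.partition("\n")
        words = set(query.lower().split())
        hits = [c for c in candidates.split(";") if words & set(c.lower().split())]
        return ";".join(hits)


async def search_one(model: ChatModel, query: str, search_set: list[str]) -> list[str]:
    # One request per search set: query on the first line, candidates on the second.
    reply = await model.complete(query + "\n" + ";".join(search_set))
    return [label for label in reply.split(";") if label]


async def parallel_search(
    model: ChatModel, query: str, search_sets: list[list[str]]
) -> list[str]:
    # Fan out one coroutine per search set, then merge results in input order.
    batches = await asyncio.gather(*(search_one(model, query, s) for s in search_sets))
    return [label for batch in batches for label in batch]


results = asyncio.run(
    parallel_search(
        EchoModel(),
        "redline lease agreement",
        [["Lease Disputes", "Tax Law"], ["Lease Agreement", "Court Order"]],
    )
)
print(results)
```

Because `parallel_search` depends only on the `complete` method, any backend that provides it could be swapped in without touching the search code, which is the kind of decoupling a client abstraction layer is meant to give.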

## Documentation

For more detailed information about using the SOLI Python library, please refer to our [full documentation](https://soli-python.readthedocs.io/).
2 changes: 1 addition & 1 deletion docker/ubuntu2204-install/Dockerfile
@@ -1,7 +1,7 @@
FROM ubuntu:22.04

# define package version
-ARG SOLI_VERSION=0.1.4
+ARG SOLI_VERSION=0.1.5

# Avoid prompts from apt
ENV DEBIAN_FRONTEND=noninteractive
2 changes: 1 addition & 1 deletion docker/ubuntu2404-install/Dockerfile
@@ -1,7 +1,7 @@
FROM ubuntu:24.04

# define package version
-ARG SOLI_VERSION=0.1.4
+ARG SOLI_VERSION=0.1.5

# Avoid prompts from apt
ENV DEBIAN_FRONTEND=noninteractive
4 changes: 2 additions & 2 deletions docs/conf.py
@@ -25,8 +25,8 @@
project = "soli-python"
copyright = "2024, ALEA Institute"
author = "ALEA Institute (https://aleainstitute.ai)"
-release = "0.1.4"
-version = "0.1.4"
+release = "0.1.5"
+version = "0.1.5"
master_doc = "index"
language = "en"

55 changes: 55 additions & 0 deletions examples/llm_search.py
@@ -0,0 +1,55 @@
import asyncio
import time
from pathlib import Path
from soli import SOLI
from alea_llm_client import OpenAIModel, AnthropicModel


async def main():
    # text to label/classify
    example_text = "review and revise license agreement"

    # set at initialization
    g = SOLI(llm=OpenAIModel(model="gpt-4o"))

    print("gpt-4o results:")
    for x in await g.parallel_search_by_llm(
        example_text,
    ):
        print(x)

    # use a Llama model hosted on Together for area of law
    TOGETHER_API_KEY = (Path.home() / ".alea" / "keys" / "together").read_text().strip()
    g.llm = OpenAIModel(
        endpoint="https://api.together.xyz",
        model="meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo",
        api_key=TOGETHER_API_KEY,
    )

    # label matches the model configured above
    print("\n\nmeta-llama/Meta-Llama-3.1-70B-Instruct-Turbo results:")
    for x in await g.search_by_llm(
        example_text,
        search_set=g.get_areas_of_law(max_depth=1),
    ):
        print(x)

    # override via property
    g.llm = AnthropicModel(model="claude-3-5-haiku-20241022")

    # search specific branches
    print("\n\nclaude-3-5-haiku-20241022 results:")
    for x in await g.parallel_search_by_llm(
        example_text,
        search_sets=[
            g.get_areas_of_law(max_depth=2),
            g.get_document_artifacts(max_depth=2),
            g.get_player_actors(max_depth=3),
        ],
    ):
        print(x)


if __name__ == "__main__":
    # report total wall-clock time in seconds
    t0 = time.time()
    asyncio.run(main())
    print(time.time() - t0)
1,267 changes: 649 additions & 618 deletions poetry.lock

Large diffs are not rendered by default.

9 changes: 5 additions & 4 deletions pyproject.toml
@@ -1,6 +1,6 @@
[tool.poetry]
name = "soli-python"
-version = "0.1.4"
+version = "0.1.5"
description = "Python library for SOLI, the Standard for Open Legal Information"
authors = ["ALEA Institute <[email protected]>"]
license = "MIT"
@@ -39,9 +39,9 @@ python = ">=3.10,<4.0.0"
pydantic = "^2.8.2"
lxml = "^5.2.2"
httpx = "^0.27.2"
-rapidfuzz = {version = "^3.9.7", optional = true}
+rapidfuzz = {version = "^3.10.0", optional = true}
marisa-trie = {version = "^1.2.0", optional = true}

alea-llm-client = {version = "^0.1.1", optional = true}

[tool.poetry.group.dev.dependencies]
types-lxml = "^2024.8.7"
@@ -62,10 +62,11 @@ sphinx-plausible = "^0.1.2"
[tool.poetry.group.search.dependencies]
rapidfuzz = "^3.9.7"
marisa-trie = "^1.2.0"
alea-llm-client = "^0.1.1"

# extras
[tool.poetry.extras]
-search = ["rapidfuzz", "marisa-trie"]
+search = ["rapidfuzz", "marisa-trie", "alea-llm-client"]


[build-system]
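
Because `alea-llm-client` is declared as optional and only pulled in by the `search` extra, code that relies on it may want to fail with a clear message when the extra is missing. The sketch below shows one common way to do that with the standard library. It is an illustration, not code from this commit: `has_module` and `require_llm_client` are hypothetical helper names, and the install hint assumes the package is installed from PyPI under the `soli-python` name given in `pyproject.toml`.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(name) is not None


def require_llm_client(module_name: str = "alea_llm_client") -> None:
    """Hypothetical guard: raise a helpful error if the optional dependency is absent."""
    if not has_module(module_name):
        raise RuntimeError(
            f"{module_name} is not installed; install the 'search' extra, "
            "e.g. pip install 'soli-python[search]'"
        )


# Demonstrate the guard with a module name that should not exist anywhere.
try:
    require_llm_client("module_that_is_not_installed_xyz")
except RuntimeError as exc:
    print(exc)
```

Checking with `find_spec` keeps the base install lightweight: nothing is imported until an LLM-backed search is actually requested.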
2 changes: 1 addition & 1 deletion soli/__init__.py
@@ -5,7 +5,7 @@
# SPDX-License-Identifier: MIT
# (c) 2024 ALEA Institute.

-__version__ = "0.1.4"
+__version__ = "0.1.5"
__author__ = "ALEA Institute"
__license__ = "MIT"
__description__ = "Python library for SOLI, the Standard for Open Legal Information"