Refactor manual pydantics for scanpy pl agents #255

Open
wants to merge 173 commits into
base: main
173 commits
db60467
add the tools `tl` modules to API agent __init__.py
bastienchassagnol Dec 10, 2024
6abb3a9
add scanpy_tl module with general description
bastienchassagnol Dec 10, 2024
9044563
Change pymilvus dependency in the pyproject.toml from the fixed versi…
bastienchassagnol Dec 11, 2024
a46df7a
api agent for scnapy tl using the generate_pydantic_class_from_module
mengerj Dec 11, 2024
d4f3184
generic method to generate pydantic classes for functions in a module.
mengerj Dec 11, 2024
7b4df80
working progress on QueryBuilder and its unit tests
mengerj Dec 11, 2024
93be844
Merge pull request #1 from mengerj/just_the_generic_function
bastienchassagnol Dec 11, 2024
0b5ea9d
Merge branch 'main' into dev/tl-bastien
bastienchassagnol Dec 11, 2024
40a7751
Merge pull request #2 from bastienchassagnol/dev/tl-bastien
bastienchassagnol Dec 11, 2024
46e666b
Scanpy-pl (#226)
slobentanzer Dec 11, 2024
abbd42d
Anndata api integration (#229)
noahbruderer Dec 11, 2024
4f885ae
switch scanpy pl to langchain bind_tools
daniele-lucarelli Dec 11, 2024
835f096
Fixed the prompt issue in the `AnnDataIOQueryBuilder`, but now no sys…
noahbruderer Dec 11, 2024
ee0106b
Merge branch 'biohackathon3' of https://github.com/biocypher/biochatt…
noahbruderer Dec 11, 2024
f10839a
add in the benchmark a call to scanpy.pp to carry on a PCA with a giv…
bastienchassagnol Dec 11, 2024
78b9a66
fix schema issue with fixed length tuples
slobentanzer Dec 11, 2024
2010548
Merge branch 'biohackathon3' of https://github.com/biocypher/biochatt…
slobentanzer Dec 11, 2024
ee20735
remove nested list in benchmark
slobentanzer Dec 11, 2024
0e003f9
remove unnecessary variable
slobentanzer Dec 11, 2024
f032635
remove dual httpx definition
slobentanzer Dec 11, 2024
70d4a39
update ABC to return list from `parameterise_query`
slobentanzer Dec 11, 2024
e53bcbb
add umap pydantic class
daniele-lucarelli Dec 11, 2024
799ba8a
migrate legacy query builder and fetcher to work with list of pydanti…
slobentanzer Dec 11, 2024
8f705d7
Merge branch 'biohackathon3' of https://github.com/biocypher/biochatt…
slobentanzer Dec 11, 2024
993da5f
add draw_graph pydantic class
daniele-lucarelli Dec 11, 2024
0a3ab0c
assume list of classes as return
slobentanzer Dec 11, 2024
c3f8e12
Merge branch 'biohackathon3' of https://github.com/biocypher/biochatt…
slobentanzer Dec 11, 2024
8c1b646
add draw_graph to tool list
daniele-lucarelli Dec 11, 2024
02c8d19
return variable instead of call
slobentanzer Dec 11, 2024
cc4d718
Merge branch 'biohackathon3' of https://github.com/biocypher/biochatt…
slobentanzer Dec 11, 2024
e85b2e0
add spatial pydantic class
daniele-lucarelli Dec 11, 2024
90a9a75
Merge branch 'biohackathon3' into main
bastienchassagnol Dec 11, 2024
ad2b389
add anndata benchmark
slobentanzer Dec 11, 2024
a272d58
add anndata benchmark test case
slobentanzer Dec 11, 2024
e32802d
change from langchain pydantic to original pydantic
slobentanzer Dec 11, 2024
b129898
Merge branch 'biohackathon3' of https://github.com/biocypher/biochatt…
slobentanzer Dec 11, 2024
7e742df
Added mock test for ScanpyTLQueryBuilder (without module specification)
vd-dragan21 Dec 11, 2024
cb27ba2
Merge branch 'main' into mock_test
bastienchassagnol Dec 11, 2024
78b5d49
Merge pull request #3 from bastienchassagnol/mock_test
bastienchassagnol Dec 11, 2024
a1c20f9
remove irrelevant imports in scanpy.tl module
bastienchassagnol Dec 11, 2024
ecb3360
Merge branch 'main' of https://github.com/bastienchassagnol/biochatter
bastienchassagnol Dec 11, 2024
286bbd4
Merge branch 'biohackathon3' into main
bastienchassagnol Dec 11, 2024
c4fc4db
move method_name to title
slobentanzer Dec 11, 2024
6c30d68
add test script to gitignore
slobentanzer Dec 11, 2024
dfde80e
Merge branch 'biohackathon3' of https://github.com/biocypher/biochatter
bastienchassagnol Dec 11, 2024
52d5ea0
delete duplicated regex definition
slobentanzer Dec 12, 2024
e0945b2
Resolve merge conflicts (#239)
anisdismail Dec 12, 2024
32df318
first version of a function to build pydantic classes for all functio…
mengerj Dec 12, 2024
27cc430
add pl reduced args
daniele-lucarelli Dec 12, 2024
67dd260
Merge branch 'biohackathon3' of https://github.com/biocypher/biochatt…
daniele-lucarelli Dec 12, 2024
b3e36a7
Generic function pydantic classes (#241)
mengerj Dec 12, 2024
5e36c6d
revert title to method_name
slobentanzer Dec 12, 2024
9893341
include BaseAPIModel
mengerj Dec 12, 2024
a97e985
renamed anndata module to anndata_agent due to conflicts
mengerj Dec 12, 2024
6d22cd3
Add the scanpy tool `tl` modules to API agent, next and closing steps…
bastienchassagnol Dec 12, 2024
7998ea7
changing how pydantic classes are defined manually, alinging with
mengerj Dec 12, 2024
38595f6
resolve import issue
slobentanzer Dec 12, 2024
c6db9fa
Merge branch 'biohackathon3' into refactor_manual_pydantics
mengerj Dec 12, 2024
e509187
Merge pull request #246 from biocypher/refactor_manual_pydantics
mengerj Dec 12, 2024
6b14dad
Fix_benchmark_sc_plot (#244)
MDLDan Dec 12, 2024
fbf9fa2
add reduced builder class
slobentanzer Dec 12, 2024
edfde9c
add file ignores for test and benchmark
slobentanzer Dec 12, 2024
72fcc3a
uuid to question_uuid in BaseAPIModule
mengerj Dec 12, 2024
96fffb5
refactored scanpy_pl and scanpy_pl_reduced according to new structure
mengerj Dec 12, 2024
ecec7ec
fixed mock import error
mengerj Dec 12, 2024
1668818
Modify the benchmark to test querybuilders for scanpy tool operations…
bastienchassagnol Dec 12, 2024
f301e94
Create pydantic classes for the scanpy pp (#256)
kvitoslava-yarish Dec 12, 2024
3702fb5
formatting
slobentanzer Dec 12, 2024
7548c17
format
slobentanzer Dec 12, 2024
1f66e34
Merge branch 'biohackathon3' of https://github.com/biocypher/biochatt…
mengerj Dec 13, 2024
772b59f
Anndata concatentation + mapping integration (#257)
noahbruderer Dec 13, 2024
863ee0d
Include function description including to which module a tool belongs…
mengerj Dec 13, 2024
601c6bd
Merge branch 'biohackathon3' of https://github.com/biocypher/biochatt…
mengerj Dec 13, 2024
0266630
adapt python call formatter to work with 'title' containing method key
mengerj Dec 13, 2024
4f8926b
figure params
slobentanzer Dec 19, 2024
1884f3b
refactor hooks to be more modular
slobentanzer Dec 19, 2024
5295183
some ruff complaints
slobentanzer Jan 2, 2025
4957619
docstrings
slobentanzer Jan 2, 2025
4cf9494
Merge branch 'biohackathon3' into refactor_manual_pydantics
slobentanzer Jan 2, 2025
48fa5ac
make more readable, fix ruff warnings
slobentanzer Jan 2, 2025
dab6442
fix return value
slobentanzer Jan 2, 2025
cd840ed
comment cases for testing
slobentanzer Jan 2, 2025
4c7eb7f
move tool definitions to global module-level dictionary
slobentanzer Jan 2, 2025
a45b3f8
ignore long lines
slobentanzer Jan 2, 2025
06327bb
ruff demands
slobentanzer Jan 2, 2025
671e94f
fix test
slobentanzer Jan 2, 2025
ed027e5
fix double plural
slobentanzer Jan 2, 2025
16e5f10
ruff warning
slobentanzer Jan 2, 2025
d32f47f
replace BaseModel with BaseAPIModel
slobentanzer Jan 2, 2025
4a2480f
work in progress to fix formatting of python calls for new classes
Jan 16, 2025
98de300
move example into description
Jan 24, 2025
10e291f
add the tools `tl` modules to API agent __init__.py
bastienchassagnol Dec 10, 2024
6026895
add scanpy_tl module with general description
bastienchassagnol Dec 10, 2024
bf6d30b
Change pymilvus dependency in the pyproject.toml from the fixed versi…
bastienchassagnol Dec 11, 2024
b2b1ff8
api agent for scnapy tl using the generate_pydantic_class_from_module
mengerj Dec 11, 2024
2c8167e
generic method to generate pydantic classes for functions in a module.
mengerj Dec 11, 2024
7737c5c
working progress on QueryBuilder and its unit tests
mengerj Dec 11, 2024
ed91c2c
add in the benchmark a call to scanpy.pp to carry on a PCA with a giv…
bastienchassagnol Dec 11, 2024
e9bfbf5
Scanpy-pl (#226)
slobentanzer Dec 11, 2024
735d8ff
Anndata api integration (#229)
noahbruderer Dec 11, 2024
af62013
switch scanpy pl to langchain bind_tools
daniele-lucarelli Dec 11, 2024
ea49a69
fix schema issue with fixed length tuples
slobentanzer Dec 11, 2024
01bedf0
Fixed the prompt issue in the `AnnDataIOQueryBuilder`, but now no sys…
noahbruderer Dec 11, 2024
b9a6212
remove nested list in benchmark
slobentanzer Dec 11, 2024
f3f0bb3
remove unnecessary variable
slobentanzer Dec 11, 2024
741f375
remove dual httpx definition
slobentanzer Dec 11, 2024
e5a41ce
update ABC to return list from `parameterise_query`
slobentanzer Dec 11, 2024
9ef535e
migrate legacy query builder and fetcher to work with list of pydanti…
slobentanzer Dec 11, 2024
db14d0a
add umap pydantic class
daniele-lucarelli Dec 11, 2024
41f6302
assume list of classes as return
slobentanzer Dec 11, 2024
184358c
add draw_graph pydantic class
daniele-lucarelli Dec 11, 2024
f6f9b53
return variable instead of call
slobentanzer Dec 11, 2024
618251d
add draw_graph to tool list
daniele-lucarelli Dec 11, 2024
3caad1e
add spatial pydantic class
daniele-lucarelli Dec 11, 2024
5188698
remove irrelevant imports in scanpy.tl module
bastienchassagnol Dec 11, 2024
61e5d44
Added mock test for ScanpyTLQueryBuilder (without module specification)
vd-dragan21 Dec 11, 2024
9fe1cb0
add anndata benchmark
slobentanzer Dec 11, 2024
72772bc
add anndata benchmark test case
slobentanzer Dec 11, 2024
82203c2
change from langchain pydantic to original pydantic
slobentanzer Dec 11, 2024
828c0e7
move method_name to title
slobentanzer Dec 11, 2024
abb40e2
add test script to gitignore
slobentanzer Dec 11, 2024
3b4e5bd
Resolve merge conflicts (#239)
anisdismail Dec 12, 2024
3ef1e97
add pl reduced args
daniele-lucarelli Dec 12, 2024
0e4485a
first version of a function to build pydantic classes for all functio…
mengerj Dec 12, 2024
6bbd762
Generic function pydantic classes (#241)
mengerj Dec 12, 2024
a1c60c9
revert title to method_name
slobentanzer Dec 12, 2024
4a3ea22
include BaseAPIModel
mengerj Dec 12, 2024
0a678b0
renamed anndata module to anndata_agent due to conflicts
mengerj Dec 12, 2024
f49ed2c
changing how pydantic classes are defined manually, alinging with
mengerj Dec 12, 2024
777b18e
resolve import issue
slobentanzer Dec 12, 2024
f48ba51
Add the scanpy tool `tl` modules to API agent, next and closing steps…
bastienchassagnol Dec 12, 2024
aaf8286
uuid to question_uuid in BaseAPIModule
mengerj Dec 12, 2024
d1ea3c3
refactored scanpy_pl and scanpy_pl_reduced according to new structure
mengerj Dec 12, 2024
af267c7
fixed mock import error
mengerj Dec 12, 2024
d725aff
formatting
slobentanzer Dec 12, 2024
836b568
format
slobentanzer Dec 12, 2024
51b67c7
Fix_benchmark_sc_plot (#244)
MDLDan Dec 12, 2024
5e0f3d6
add reduced builder class
slobentanzer Dec 12, 2024
304d8e0
add file ignores for test and benchmark
slobentanzer Dec 12, 2024
e5b616a
Modify the benchmark to test querybuilders for scanpy tool operations…
bastienchassagnol Dec 12, 2024
d446c26
Create pydantic classes for the scanpy pp (#256)
kvitoslava-yarish Dec 12, 2024
f91ecd7
Include function description including to which module a tool belongs…
mengerj Dec 13, 2024
6d1a574
Anndata concatentation + mapping integration (#257)
noahbruderer Dec 13, 2024
e0a19aa
adapt python call formatter to work with 'title' containing method key
mengerj Dec 13, 2024
637411d
figure params
slobentanzer Dec 19, 2024
e994d63
refactor hooks to be more modular
slobentanzer Dec 19, 2024
32dd423
some ruff complaints
slobentanzer Jan 2, 2025
b247e5c
docstrings
slobentanzer Jan 2, 2025
47eb328
make more readable, fix ruff warnings
slobentanzer Jan 2, 2025
2186521
fix return value
slobentanzer Jan 2, 2025
c5cdf31
comment cases for testing
slobentanzer Jan 2, 2025
d1f4ced
move tool definitions to global module-level dictionary
slobentanzer Jan 2, 2025
56cc87d
ignore long lines
slobentanzer Jan 2, 2025
d88ac4f
ruff demands
slobentanzer Jan 2, 2025
392365d
fix test
slobentanzer Jan 2, 2025
5a026e4
fix double plural
slobentanzer Jan 2, 2025
1401166
ruff warning
slobentanzer Jan 2, 2025
c04ffb7
replace BaseModel with BaseAPIModel
slobentanzer Jan 2, 2025
457b837
work in progress to fix formatting of python calls for new classes
Jan 16, 2025
682f930
move example into description
Jan 24, 2025
ed96ed0
Merge branch 'refactor_manual_pydantics' of https://github.com/biocyp…
Jan 24, 2025
628c9db
changed autogeneration function to only collecting tool info, similar…
Jan 27, 2025
e03a9bd
removed old approach of using autogeneration function. Use AutoModule…
Jan 27, 2025
19fbccb
safe pydantic tools as at
Jan 27, 2025
7e8a2c2
save pydantic tools as attribute
Jan 27, 2025
1d0dc17
save pydantic classes as attirbute
Jan 27, 2025
f9260da
remove scanpy tl query builderemove scanpy tl query builder, because …
Jan 27, 2025
d050405
revert formatter changes to be confirm with previous code. I don't th…
Jan 27, 2025
12356b7
go back to using method_name for testing purposes
Jan 27, 2025
2b35768
latest version of anndata agent
Jan 27, 2025
8c70b97
a file that is not used, due to other, newer version of anndata_agent
Jan 27, 2025
a1986ce
remove the Tl query builder
Jan 27, 2025
bbaa480
remove ScanpyTlQueryBuilder
Jan 27, 2025
2 changes: 2 additions & 0 deletions .gitignore
@@ -19,3 +19,5 @@ serve.sh
 .blast/*
 .api_results/*
 *.coverage
+scaling_test.py
+myvenv/
72 changes: 37 additions & 35 deletions benchmark/conftest.py
@@ -8,6 +8,7 @@

 from biochatter.llm_connect import (
     AnthropicConversation,
+    Conversation,
     GptConversation,
     XinferenceConversation,
 )
@@ -17,20 +18,24 @@
 from .load_dataset import get_benchmark_dataset

 # how often should each benchmark be run?
-N_ITERATIONS = 1
+N_ITERATIONS = 3

 # which dataset should be used for benchmarking?
 BENCHMARK_DATASET = get_benchmark_dataset()

 # which models should be benchmarked?
 OPENAI_MODEL_NAMES = [
-    # "gpt-3.5-turbo-0125",
+    "gpt-3.5-turbo-0125",
     # "gpt-4-0613",
     # "gpt-4-1106-preview",
     # "gpt-4-0125-preview",
     # "gpt-4-turbo-2024-04-09",
     # "gpt-4o-2024-05-13",
-    "gpt-4o-2024-08-06",
+    # "gpt-4o-2024-08-06",
+    # "gpt-4o-2024-11-20",
+    # "gpt-4o-mini-2024-07-18",
+    # "o1-preview-2024-09-12",
+    # "o1-mini-2024-09-12",
 ]

 ANTHROPIC_MODEL_NAMES = [
@@ -288,8 +293,9 @@ def client():

 @pytest.fixture(scope="session", autouse=True)
 def register_model(client):
-    """Register custom (non-builtin) models with the Xinference server. Should only
-    happen once per session.
+    """Register custom (non-builtin) models with the Xinference server.
+
+    Should only happen once per session.
     """
     if client is None:
         return  # ignore if server is not running
@@ -302,16 +308,11 @@ def register_model(client):
             model = fd.read()
         client.register_model(model_type="LLM", model=model, persist=False)

-    # if "custom-llama-3-instruct-70b" not in registered_models:
-    #     with open("benchmark/models/custom-llama-3-instruct-70b.json") as fd:
-    #         model = fd.read()
-    #     client.register_model(model_type="LLM", model=model, persist=False)
-

 def pytest_collection_modifyitems(items):
-    """Pytest hook function to modify the collected test items.
-    Called once after collection has been performed.
+    """Modify the collected test items (Pytest hook).
+
+    Called once after collection has been performed.
     Used here to order items by their `callspec.id` (which starts with the
     model name and configuration) to ensure running all tests for one model
     before moving to the next model.
@@ -329,9 +330,9 @@ def model_name(request):
     return request.param


-@pytest.fixture()
+@pytest.fixture
 def multiple_testing(request):
-    def run_multiple_times(test_func, *args, **kwargs):
+    def run_multiple_times(test_func, *args, **kwargs) -> tuple[str, int, int]:
         scores = []
         for _ in range(N_ITERATIONS):
             score, max = test_func(*args, **kwargs)
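
Review note: a possible shape for the rest of `run_multiple_times`, given the new `tuple[str, int, int]` annotation. The aggregation step is not visible in this hunk, so everything after the loop is an assumption: the sketch joins the per-run scores into a `;`-separated string and returns it with the maximum score and the iteration count. It also uses `max_score` rather than `max` so the built-in is not shadowed, following the rename this PR applies in `calculate_bool_vector_score`. `fake_benchmark` is a hypothetical stand-in for a real benchmark test.

```python
N_ITERATIONS = 3  # value set in this PR


def calculate_bool_vector_score(vector: list[bool]) -> tuple[int, int]:
    score = sum(vector)
    max_score = len(vector)
    return (score, max_score)


def run_multiple_times(test_func, *args, **kwargs) -> tuple[str, int, int]:
    scores = []
    max_score = 0
    for _ in range(N_ITERATIONS):
        score, max_score = test_func(*args, **kwargs)
        scores.append(score)
    # Assumed aggregation: joined per-run scores, max score, run count.
    score_string = ";".join(str(s) for s in scores)
    return (score_string, max_score, N_ITERATIONS)


def fake_benchmark() -> tuple[int, int]:
    """Hypothetical benchmark: two of three checks pass."""
    return calculate_bool_vector_score([True, False, True])


result = run_multiple_times(fake_benchmark)
```

Keeping every per-run score, rather than only a mean, would let later analysis compute variance across iterations.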
@@ -344,23 +345,22 @@ def run_multiple_times(test_func, *args, **kwargs):

 def calculate_bool_vector_score(vector: list[bool]) -> tuple[int, int]:
     score = sum(vector)
-    max = len(vector)
-    return (score, max)
+    max_score = len(vector)
+    return (score, max_score)


-@pytest.fixture()
-def conversation(request, model_name, client):
-    """Decides whether to run the test or skip due to the test having been run
-    before. If not skipped, will create a conversation object for interfacing
-    with the model.
+@pytest.fixture
+def conversation(request, model_name, client) -> Conversation:
+    """Return conversation object.
+
+    Could skip due to the test having been run before (but not sure how to
+    implement yet). If not skipped, will create a conversation object for
+    interfacing with the model.
     """
     test_name = request.node.originalname.replace("test_", "")
-    subtask = "?"  # TODO can we get the subtask here?
+    subtask = "?"  # TODO: can we get the subtask here?
     if benchmark_already_executed(model_name, test_name, subtask):
         pass
-        # pytest.skip(
-        #     f"benchmark {test_name}: {subtask} with {model_name} already executed"
-        # )

     if model_name in OPENAI_MODEL_NAMES:
         conversation = GptConversation(
@@ -428,7 +428,6 @@ def conversation(request, model_name, client):
                 quantization=_model_quantization,
             )

-        # return conversation
         conversation = XinferenceConversation(
             base_url=BENCHMARK_URL,
             model_name=_model_name,
@@ -439,14 +438,14 @@ def conversation(request, model_name, client):
     return conversation


-@pytest.fixture()
+@pytest.fixture
 def prompt_engine(request, model_name, conversation):
-    """Generates a constructor for the prompt engine for the current model name."""
+    """Generate a constructor for the prompt engine for current model name."""

-    def conversation_factory():
+    def conversation_factory() -> Conversation:
         return conversation

-    def setup_prompt_engine(kg_schema_dict):
+    def setup_prompt_engine(kg_schema_dict) -> BioCypherPromptEngine:
         return BioCypherPromptEngine(
             schema_config_or_info_dict=kg_schema_dict,
             model_name=model_name,
@@ -456,8 +455,8 @@ def setup_prompt_engine(kg_schema_dict):
     return setup_prompt_engine


-@pytest.fixture()
-def evaluation_conversation():
+@pytest.fixture
+def evaluation_conversation() -> Conversation:
     conversation = GptConversation(
         model_name="gpt-3.5-turbo",
         prompts={},
@@ -478,7 +477,9 @@ def pytest_addoption(parser):

 @pytest.fixture(autouse=True, scope="session")
 def delete_results_csv_file_content(request):
-    """If --run-all is set, the former benchmark data are deleted and all
+    """Delete the content of the result files.
+
+    If --run-all is set, the former benchmark data are deleted and all
     benchmarks are executed again.
     """
     if request.config.getoption("--run-all"):
@@ -524,7 +525,8 @@ def result_files():


 def pytest_generate_tests(metafunc):
-    """Pytest hook function to generate test cases.
+    """Generate test cases (Pytest hook).
+
     Called once for each test case in the benchmark test collection.
     If fixture is part of test declaration, the test is parametrized.
     """
@@ -559,7 +561,7 @@ def pytest_generate_tests(metafunc):
         )


-@pytest.fixture()
+@pytest.fixture
 def kg_schemas():
     data = BENCHMARK_DATASET
     return data["kg_schemas"]