expanded README, added toml, csv, yaml dumpers, anthropic provider #215
README.md
Outdated
@@ -68,10 +69,25 @@ where `EXTRA_NAME` is one of the following:
- `timeseries`: Time series similarity measures like `dtw` and `smith_waterman`
- `transformers`: Advanced NLP tools based on `pytorch` and `transformers`

Alternatively, you can also clone this git repository and install CBRKit and its dependencies via uv: `uv pip install ".[all]"`
- Alternatively, you can also clone this git repository and install CBRKit and its dependencies via uv: `uv pip install ".[all]"`
+ Alternatively, you can also clone this git repository and install CBRKit and its dependencies via uv: `uv sync --all-extras`
README.md
Outdated
@@ -210,6 +220,7 @@ You first build a retrieval pipeline by specifying a global similarity function
```python
retriever = cbrkit.retrieval.build(
    cbrkit.sim.attribute_value(...),
    limit=10,
The `limit` parameter has been moved to `retrieval.dropout`, so that won't work.
README.md
Outdated
@@ -300,7 +317,7 @@ You may even nest adaptation functions to handle object-oriented cases.

An overview of all available adaptation functions can be found in the [module documentation](https://wi2trier.github.io/cbrkit/cbrkit/adapt.html).

## Reuse
## Reuse <a name="reuse"></a>
- ## Reuse <a name="reuse"></a>
+ ## Reuse
GitHub automatically adds links to headings
examples/cars_rag_large.py
Outdated
import ollama
import cbrkit

df = pl.read_csv("data/cars-1k.csv")[:30]
- df = pl.read_csv("data/cars-1k.csv")[:30]
+ df = pl.read_csv("data/cars-1k.csv").head(30)
examples/cars_rag_large.py
Outdated
retriever = cbrkit.retrieval.dropout(
    cbrkit.retrieval.build(sim_func),
    # limit=5,
)
- retriever = cbrkit.retrieval.dropout(
-     cbrkit.retrieval.build(sim_func),
-     # limit=5,
- )
+ retriever = cbrkit.retrieval.build(sim_func)
examples/cars_rag_large.py
Outdated
# metadata=cbrkit.helpers.get_metadata(sim_func),
)
# provider = cbrkit.synthesis.providers.anthropic(model="claude-3-haiku-20240307", response_type=str, max_tokens=400)
client = ollama.AsyncClient(host="http://136.199.130.136:6789")
- client = ollama.AsyncClient(host="http://136.199.130.136:6789")
+ client = ollama.AsyncClient(host="http://IP:PORT")
src/cbrkit/dumpers.py
Outdated
"""

@staticmethod
def __flatten_dict(nested_dict) -> list[dict[str, Any]]:
- Type annotations missing.
- An inner function is not elegant, could this be a top-level method instead?
I pushed a potential fix to the dev branch. Could you check whether it solves the issue with closed event loops for you?
uv.lock
Outdated
There is a merge conflict: I added the rtoml library and updated the lockfile in dev, so you can remove the changes here.
pyproject.toml
Outdated
There is a merge conflict: I added the rtoml library and updated the lockfile in dev, so you can remove the changes here.
…into dumpers_docs
pyproject.toml
Outdated
Could you remove the changes from this file?
src/cbrkit/dumpers.py
Outdated

from .typing import ConversionFunc, FilePath
from typing import Union
- from typing import Union
src/cbrkit/dumpers.py
Outdated
"""

@staticmethod
def _flatten_recursive(obj: Union[dict, Any], prefix: str = '') -> dict:
- def _flatten_recursive(obj: Union[dict, Any], prefix: str = '') -> dict:
+ def _flatten_recursive(obj: Any, prefix: str = '') -> dict[str, Any]:
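For reference, a standalone recursive flattener with the suggested annotations could look like the sketch below. The dotted-key separator and the handling of non-dict leaves are assumptions for illustration, not necessarily what the PR implements:

```python
from typing import Any


def flatten_recursive(obj: Any, prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts into one level, joining keys with dots."""
    if not isinstance(obj, dict):
        # Leaf value: store it under the key accumulated so far.
        return {prefix: obj}

    flat: dict[str, Any] = {}
    for key, value in obj.items():
        # Assumed convention: "parent.child" for nested keys.
        child_key = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten_recursive(value, child_key))
    return flat
```

Note that in this sketch an empty nested dict contributes no keys, and lists are treated as leaf values rather than being expanded.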
uv.lock
Outdated
Could you remove the changes from this file?
@kilianbartz Could you remove the changes from pyproject.toml and uv.lock?
The changes in `synthesis/providers/model.py` are meant to prevent asyncio "Event loop closed" errors. Even with this workaround, however, working with ollama and chunking still runs into asyncio problems in certain scenarios: for example, the prompts on the chunks run correctly, but the pooling prompt then runs into problems.
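One common source of "Event loop is closed" errors is an async client that is created under one `asyncio.run` call and reused in a later one, after the first loop has been closed. Whether that is what happens here is not confirmed. The sketch below only illustrates the general pattern of running the chunk prompts and the pooling prompt inside a single event loop; `run_prompt` is a hypothetical stand-in for the provider call and does no network access:

```python
import asyncio


async def run_prompt(text: str) -> str:
    # Hypothetical stand-in for an async LLM request.
    await asyncio.sleep(0)
    return text.upper()


async def chunk_and_pool(chunks: list[str]) -> str:
    # Chunk prompts run concurrently; gather preserves input order.
    partials = await asyncio.gather(*(run_prompt(c) for c in chunks))
    # The pooling prompt runs on the same loop as the chunk prompts.
    return await run_prompt(" ".join(partials))


def synthesize(chunks: list[str]) -> str:
    # A single asyncio.run call: the loop is created and closed once.
    return asyncio.run(chunk_and_pool(chunks))
```

With this structure, any client created inside `chunk_and_pool` would live and die with the same loop that executes both stages.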