Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

expanded README, added toml, csv, yaml dumpers, anthropic provider #215

Merged
merged 31 commits into from
Jan 13, 2025

Conversation

kilianbartz
Copy link
Contributor

The changes in synthesis/providers/model.py are meant to prevent asyncio "Event loop closed" errors, however even with this workaround, working with ollama and chunking still runs into asyncio problems in certain scenarios (e.g. the prompts on the chunks would run correctly and the pooling prompt would run into problems).

README.md Outdated
@@ -68,10 +69,25 @@ where `EXTRA_NAME` is one of the following:
- `timeseries`: Time series similarity measures like `dtw` and `smith_waterman`
- `transformers`: Advanced NLP tools based on `pytorch` and `transformers`

Alternatively, you can also clone this git repository and install CBRKit and its dependencies via uv: `uv pip install ".[all]"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Alternatively, you can also clone this git repository and install CBRKit and its dependencies via uv: `uv pip install ".[all]"`
Alternatively, you can also clone this git repository and install CBRKit and its dependencies via uv: `uv sync --all-extras`

README.md Outdated
@@ -210,6 +220,7 @@ You first build a retrieval pipeline by specifying a global similarity function
```python
retriever = cbrkit.retrieval.build(
cbrkit.sim.attribute_value(...),
limit=10,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has been moved to retireval.dropout, so that won't work

README.md Outdated
@@ -300,7 +317,7 @@ You may even nest adaptation functions to handle object-oriented cases.

An overview of all available adaptation functions can be found in the [module documentation](https://wi2trier.github.io/cbrkit/cbrkit/adapt.html).

## Reuse
## Reuse <a name="reuse"></a>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
## Reuse <a name="reuse"></a>
## Reuse

GitHub automatically adds links to headings

import ollama
import cbrkit

df = pl.read_csv("data/cars-1k.csv")[:30]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
df = pl.read_csv("data/cars-1k.csv")[:30]
df = pl.read_csv("data/cars-1k.csv").head(30)

Comment on lines 18 to 21
retriever = cbrkit.retrieval.dropout(
cbrkit.retrieval.build(sim_func),
# limit=5,
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
retriever = cbrkit.retrieval.dropout(
cbrkit.retrieval.build(sim_func),
# limit=5,
)
retriever = cbrkit.retrieval.build(sim_func)

# metadata=cbrkit.helpers.get_metadata(sim_func),
)
# provider = cbrkit.synthesis.providers.anthropic(model="claude-3-haiku-20240307", response_type=str, max_tokens=400)
client = ollama.AsyncClient(host="http://136.199.130.136:6789")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
client = ollama.AsyncClient(host="http://136.199.130.136:6789")
client = ollama.AsyncClient(host="http://IP:PORT")

"""

@staticmethod
def __flatten_dict(nested_dict) -> list[dict[str, Any]]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Type annotations missing.
  • An inner function is not elegant, could this be a top-level method instead?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed a potential fix to the dev branch, could you try if it solves the issue with closed event loops for you?

uv.lock Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a merge conflict, I added the rtoml library and updated the lockfile in dev. You can remove the changes here.

pyproject.toml Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a merge conflict, I added the rtoml library and updated the lockfile in dev. You can remove the changes here.

pyproject.toml Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you remove the changes from this file?


from .typing import ConversionFunc, FilePath
from typing import Union
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
from typing import Union

"""

@staticmethod
def _flatten_recursive(obj: Union[dict, Any], prefix: str = '') -> dict:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def _flatten_recursive(obj: Union[dict, Any], prefix: str = '') -> dict:
def _flatten_recursive(obj: Any, prefix: str = '') -> dict[str, Any]:

uv.lock Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you remove the changes from this file?

@mirkolenz
Copy link
Member

@kilianbartz Could you remove the changes from pyproject.toml and uv.lock?

@mirkolenz mirkolenz merged commit 9ebee42 into wi2trier:dev Jan 13, 2025
0 of 2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants