-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Tool
dataclass - unified abstraction to represent tools
#8652
Conversation
Pull Request Test Coverage Report for Build 12391961949Details
💛 - Coveralls |
Tool
dataclassTool
dataclass - unified abstraction to represent tools
@dfokina take a look at this when possible |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only one comment - should we undermine UX/DX for agents and tools over 80kb binary? :-)
function: Callable | ||
|
||
def __post_init__(self): | ||
jsonschema_import.check() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Knowing that Tool is central piece of the Agents push, should be we make this dependency default? It's 80Kb binary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not totally sure. Let's involve also @julian-risch in the decision.
- I remember that when we introduce new dependencies, we also have to pay attention to Anaconda.
- We had some issues in the past with
jsonschema
: ci: Skip collection oftest_json_schema.py
to fix CI failures #7353
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to be not occurring on 3.9 and after, but ok, let's get this integrated and then we can experiment with including it as a default dependency. Or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I capitalized all "Tools" in the docstrings so that it looks unified :)
* draft * del HF token in tests * adaptations * progress * fix type * import sorting * more control on deserialization * release note * improvements * support name field * fix chatpromptbuilder test * port Tool from experimental * release note * docs upd * Update tool.py --------- Co-authored-by: Daria Fokina <[email protected]>
…8605) * initial import * adding initial version + tests * adding more tests * more tests * incorporating SentenceSplitter based on NLTK * adding more tests * adding release notes * adding LICENSE header * removing unused imports * fixing example docstring * addding docstrings * fixing tests and returning a dictionary * updating release notes * attending PR comments * Update haystack/components/preprocessors/recursive_splitter.py Co-authored-by: Sebastian Husch Lee <[email protected]> * wip: updating tests for split_idx_start and _split_overlap * adding tests for split_idx and split_start and overlaps * adjusting file for LICENSE checking * adding more tests * adding tests for page numbering * adding tests for min split lenghts and falling back to character-level chunking based on size * fixing linting issue * Update haystack/components/preprocessors/recursive_splitter.py Co-authored-by: Sebastian Husch Lee <[email protected]> * Update haystack/components/preprocessors/recursive_splitter.py Co-authored-by: Sebastian Husch Lee <[email protected]> * Update haystack/components/preprocessors/recursive_splitter.py Co-authored-by: Sebastian Husch Lee <[email protected]> * Update haystack/components/preprocessors/recursive_splitter.py Co-authored-by: Sebastian Husch Lee <[email protected]> * Update haystack/components/preprocessors/recursive_splitter.py Co-authored-by: Sebastian Husch Lee <[email protected]> * Update haystack/components/preprocessors/recursive_splitter.py Co-authored-by: Sebastian Husch Lee <[email protected]> * Update haystack/components/preprocessors/recursive_splitter.py Co-authored-by: Sebastian Husch Lee <[email protected]> * Update haystack/components/preprocessors/recursive_splitter.py Co-authored-by: Sebastian Husch Lee <[email protected]> * wip * wip * updating tests * wip: fixing all tests after changes * more tests * wip: debugging sentence overlap * wip: debugging page number * wip * wip; fixed bug with sentence tokenizer, needs to keep white spaces * adding tests for counting pages on different split approaches * NLTK checks done on SentenceSplitter * fixing types * adding detecting for full overlap with previous chunks * fixing types * improving docstring * improving docstring * adding custom lenght, 'character' use case * customising overlap function for word and adding a few tests * updating docstring * Update haystack/components/preprocessors/recursive_splitter.py Co-authored-by: Sebastian Husch Lee <[email protected]> * Update haystack/components/preprocessors/recursive_splitter.py Co-authored-by: Sebastian Husch Lee <[email protected]> * Update haystack/components/preprocessors/recursive_splitter.py Co-authored-by: Sebastian Husch Lee <[email protected]> * wip: adding more tests for word unit length * fix * feat: `Tool` dataclass - unified abstraction to represent tools (#8652) * draft * del HF token in tests * adaptations * progress * fix type * import sorting * more control on deserialization * release note * improvements * support name field * fix chatpromptbuilder test * port Tool from experimental * release note * docs upd * Update tool.py --------- Co-authored-by: Daria Fokina <[email protected]> * fix: fix deserialization issues in multi-threading environments (#8651) * adding 'word' as default length * fixing types * handing both default strategies * wip * \f was not being counted properly * updating tests * fixing the overlap bug * adding more tests * refactoring _apply_overlap * further refactoring * Update haystack/components/preprocessors/recursive_splitter.py Co-authored-by: Sebastian Husch Lee <[email protected]> * Update haystack/components/preprocessors/recursive_splitter.py Co-authored-by: Sebastian Husch Lee <[email protected]> * Update haystack/components/preprocessors/recursive_splitter.py Co-authored-by: Sebastian Husch Lee <[email protected]> * Update haystack/components/preprocessors/recursive_splitter.py Co-authored-by: Sebastian Husch Lee <[email protected]> * adding ticks to close code block * fixing comments * applying changes: split with space and force keep_white_spaces=True * fixing some tests and replacing count words approach in more places * keep_white_spaces = True only if not defined * cleaning docs * handling some more edge cases, when split is still too big and all separators ran * fixing fallback whitespaces count to fixed word/char split based on split size * cleaning --------- Co-authored-by: Sebastian Husch Lee <[email protected]> Co-authored-by: Stefano Fiorucci <[email protected]> Co-authored-by: Daria Fokina <[email protected]> Co-authored-by: Tobias Wochinger <[email protected]>
Related Issues
Proposed Changes:
Tool
dataclass from experimental (introduced in feat: Tool class haystack-experimental#53 + feat: convert function into Tool haystack-experimental#114)How did you test it?
CI, several new tests
Notes for the reviewer
DO NOT MERGE
This PR is currently based on the
new-chatmessage
branch, to allow me to create other PRs for Chat Generators, which require both newChatMessage
andTool
.Checklist
fix:
,feat:
,build:
,chore:
,ci:
,docs:
,style:
,refactor:
,perf:
,test:
and added!
in case the PR includes breaking changes.