-
Notifications
You must be signed in to change notification settings - Fork 894
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(memory-store, agents-api): Fix distance and confidence values for vector search #1035
Conversation
PR Reviewer Guide 🔍Here are some key observations to aid the review process:
|
PR Code Suggestions ✨Explore these optional code suggestions:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ Changes requested. Reviewed everything up to db3e2f2 in 1 minute and 44 seconds
More details
- Looked at
139
lines of code in4
files - Skipped
0
files when reviewing. - Skipped posting
2
drafted comments based on config settings.
1. memory-store/migrations/000022_vector_search.up.sql:57
- Draft comment:
Consider calculating the normalized distance once and reusing it to avoid repetition and potential maintenance issues. - Reason this comment was not posted:
Confidence changes required:50%
The SQL functionsearch_by_vector
uses a formula to calculatedistance
which seems to be a normalization of the distance metric. This formula is repeated multiple times in the code, which can lead to maintenance issues if it needs to be changed. It would be better to calculate it once and reuse the result.
2. memory-store/migrations/000022_vector_search.up.sql:3
-
Draft comment:
This appears to be a duplicate of the existingsearch_by_vector
function. The only differences are in score normalization, which doesn't affect the core functionality. Consider modifying the existing function if changes are needed. -
function
search_by_vector
(000018_doc_search.up.sql) -
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
This appears to be a migration file that's deliberately replacing an existing function with an updated version that changes how vector similarity scores are calculated and normalized. Migration files are meant to evolve the database schema and functions over time. The fact that it's using CREATE OR REPLACE suggests this is intentional, not accidental duplication.
The comment raises a valid point about similarity between functions, but may be missing the purpose of database migrations.
Since this is in a migration file using CREATE OR REPLACE, the similarity is intentional - it's meant to update the existing function with new score normalization logic.
This comment should be deleted as it misunderstands the purpose of migration files, which are meant to evolve database functions over time.
Workflow ID: wflow_CzbnGakj4aDrz7nk
Want Ellipsis to fix these issues? Tag @ellipsis-dev
in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on fc527fa in 34 seconds
More details
- Looked at
14
lines of code in1
files - Skipped
0
files when reviewing. - Skipped posting
1
drafted comments based on config settings.
1. memory-store/migrations/000022_vector_search.up.sql:27
- Draft comment:
Consider adding a check to ensureowner_types
andowner_ids
are not empty arrays. This will prevent potential issues when these arrays are empty but not null. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The original code had a check for array length <= 0, which was removed. However, this check was logically incorrect since array_length returns NULL for empty arrays in PostgreSQL, not 0 or negative numbers. The remaining check for length mismatch between owner_types and owner_ids is sufficient since it will handle NULL cases. Empty arrays would be handled gracefully by the ANY operator in the query.
Maybe empty arrays could cause unexpected behavior in the owner filtering logic? The ANY operator's behavior with empty arrays should be verified.
In PostgreSQL, the ANY operator with an empty array returns false for all comparisons, which is actually the desired behavior here - it would effectively filter out all results, which makes sense if no owners are specified.
The comment should be deleted. The removed check was incorrect, and empty arrays are handled appropriately by PostgreSQL's ANY operator in the query.
Workflow ID: wflow_efGy5W5A6PTQa222
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
PR Type
Bug fix, Enhancement
Description
Fixed the calculation of confidence values in vector search queries.
Added a new SQL function
search_by_vector
for vector-based document search.Introduced input validation and improved query logic in the
search_by_vector
function.Updated migration files to include the new SQL function.
Changes walkthrough 📝
search_docs_by_embedding.py
Fix confidence value in embedding search
agents-api/agents_api/queries/docs/search_docs_by_embedding.py
search_docs_hybrid.py
Fix confidence value in hybrid search
agents-api/agents_api/queries/docs/search_docs_hybrid.py
000022_vector_search.down.sql
Add migration to drop vector search function
memory-store/migrations/000022_vector_search.down.sql
search_by_vector
function.000022_vector_search.up.sql
Add SQL function for vector-based document search
memory-store/migrations/000022_vector_search.up.sql
search_by_vector
.k
andconfidence
.Important
Fixes confidence value in vector search and adds new SQL function for vector-based document search with input validation.
search_docs_by_embedding.py
andsearch_docs_hybrid.py
.search_by_vector
SQL function in000022_vector_search.up.sql
with input validation and query logic for vector-based document search.search_by_vector
function in000022_vector_search.down.sql
.This description was created by for fc527fa. It will automatically update as commits are pushed.