fix: ID Mismatch Error in VectorDB During Evaluation #1033 #1056
base: main
…217/AutoRAG into fix/id-mismatch-with-vectordb
This PR may not fully align with your intentions for AutoRAG. I tried to consider as many cases as possible, but there may be concerns of yours that I am unaware of. I understand that it might not be approved, but I would appreciate any feedback you can provide. Thank you.
@e7217 Thank you for the PR! Apologies for the late review.
@e7217 We actually discussed a structure that does not use corpus_df at all in AutoRAG.
Pros
Cons
description
Hello
I am suggesting some code changes to address issue #1033. The error occurs when an item is found in the vectordb, but its ID does not match an ID in the raw_doc corpus. I think the retriever aims to retrieve the item with the highest score. To address this, I have added a content key. While this change may require additional storage capacity for the vectordb, it is similar to how Langchain uses a page_content key.
I have modified some code, but I have only referred to the documentation and have not run the code in practice, so there may be errors.
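To illustrate the idea, here is a minimal sketch and not the code in this PR. It assumes a toy in-memory store (`ToyVectorStore`, `StoredItem`, and `resolve_contents` are hypothetical names, not AutoRAG classes) that keeps the passage text next to each vector. When a retrieved ID is missing from the corpus, the stored content is used instead of failing on the lookup. Falling back this way is one possible use of the content key; the PR may apply it differently.

```python
import math
from dataclasses import dataclass


@dataclass
class StoredItem:
    doc_id: str
    embedding: list
    content: str  # passage text stored next to the vector (the extra key)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """Hypothetical in-memory store; this is NOT AutoRAG's vectordb class."""

    def __init__(self):
        self._items = []

    def add(self, doc_id, embedding, content):
        self._items.append(StoredItem(doc_id, embedding, content))

    def query(self, embedding, top_k=1):
        # Rank every stored item by similarity and keep the best top_k.
        ranked = sorted(
            self._items,
            key=lambda item: cosine(embedding, item.embedding),
            reverse=True,
        )
        return ranked[:top_k]


def resolve_contents(hits, corpus):
    """Use the corpus text when the ID exists, else the stored content."""
    return [corpus.get(hit.doc_id, hit.content) for hit in hits]


store = ToyVectorStore()
store.add("doc-1", [1.0, 0.0], "AutoRAG evaluates RAG pipelines.")
store.add("stale-id", [0.0, 1.0], "This passage is no longer in the corpus.")

# The corpus only knows "doc-1"; "stale-id" exists in the store alone.
corpus = {"doc-1": "AutoRAG evaluates RAG pipelines."}

hits = store.query([0.1, 0.9], top_k=2)
contents = resolve_contents(hits, corpus)
```

With a plain `corpus[hit.doc_id]` lookup, the top hit here would raise a `KeyError`, which is the kind of mismatch described above. The trade-off is the extra storage for the duplicated text.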
I appreciate your review. Thank you.
references