[Question]: filtering by metadata, what is first? #1689
Comments
@kazunori279 Can you take a look at this?
Hi @valenradovich, the answer is (2): pre-filtering with a very fast algorithm.
Hi @kazunori279, again, thank you very much, Kaz!
Do you mean how to use filtering with Vector Search? You can refer to: The first page explains how to build an index for use with filtering, and the second page explains how to apply the filter at query time. Please let me know if you have any questions!
Great, I've seen that documentation and we've done some testing with it. What the team wants to know is whether there is documentation that explicitly says how the algorithm works, and that it actually does the filtering first and then the similarity search. Do you know if there is something like that? Thank you! @kazunori279
@valenradovich Unfortunately there's no documentation that explains that in detail, and I'm not sure if I can disclose it in detail. But the filtering uses a pretty fast algorithm (like O(1)), so please let us know if you're seeing significant latency when you apply the filtering.
@kazunori279 I would like to find a way to show my team that the vector retrieval is indeed applying the metadata filtering first and then the semantic search. Is there a way to do something like that? Thank you!
File Name
vertexAI and Vector Search
What happened?
This is just a question, but it is crucial for inference time and efficiency at my company.
We're using Vector Search with filtering to retrieve the right samples that we need. For efficiency and inference speed, I would like to know how the retriever works with filtering under the hood.
(1) Does it first do the similarity search and then filter the outputs by the metadata that we chose? Or (2) does it first do the filtering and then search by similarity only within that metadata?
What I mean by this is: in case (1), it won't search directly within all the chunks related to the metadata that I want (let's call it a user).
Sorry if this question should not be here! I've been looking for answers all over the internet and I cannot find any.
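To make the difference between options (1) and (2) concrete, here is a minimal brute-force sketch in plain Python. It is only an illustration of the two orderings, not how Vector Search is implemented: the function names `post_filter_search` and `pre_filter_search`, the `(id, vector, owner)` item layout, and the toy data are all assumptions made up for this example, and a real index would use approximate search rather than a full sort.

```python
from typing import List, Tuple

# Hypothetical item layout for this sketch: (id, vector, owner label)
Item = Tuple[str, List[float], str]


def sq_dist(a: List[float], b: List[float]) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def post_filter_search(items: List[Item], query: List[float], k: int, owner: str) -> List[str]:
    """Option (1): take the k nearest overall, then drop items with the wrong owner."""
    nearest = sorted(items, key=lambda it: sq_dist(it[1], query))[:k]
    return [it[0] for it in nearest if it[2] == owner]


def pre_filter_search(items: List[Item], query: List[float], k: int, owner: str) -> List[str]:
    """Option (2): keep only the owner's items, then take the k nearest among them."""
    allowed = [it for it in items if it[2] == owner]
    nearest = sorted(allowed, key=lambda it: sq_dist(it[1], query))[:k]
    return [it[0] for it in nearest]


# Toy data: user_a's chunks sit close to the query, user_b's sit far away.
items: List[Item] = [
    ("a1", [0.0, 0.1], "user_a"),
    ("a2", [0.1, 0.0], "user_a"),
    ("a3", [0.1, 0.1], "user_a"),
    ("b1", [5.0, 5.0], "user_b"),
    ("b2", [6.0, 5.0], "user_b"),
]
query = [0.0, 0.0]

post_hits = post_filter_search(items, query, k=2, owner="user_b")
pre_hits = pre_filter_search(items, query, k=2, owner="user_b")

print(post_hits)  # [] : both of the 2 nearest items belong to user_a
print(pre_hits)   # ['b1', 'b2'] : search ran only inside user_b's items
```

The same setup suggests a black-box check against a real index: if the filtered label only appears on items that are far from the query, yet a filtered query for k neighbors still returns k matching items, the behavior is consistent with filtering first. It is evidence rather than proof, since a post-filter that over-fetches candidates could produce the same result.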