[FEATURE] Add Vector Search Query level metrics to understand latency for different steps in Vector Search Query #1985

Open
navneet1v opened this issue Aug 19, 2024 · 0 comments


navneet1v commented Aug 19, 2024

Description

As of version 2.16 of the k-NN plugin, a vector search query does more than run a vector search on a native engine index.

Example 1

With efficient filtering, we first run the filters, convert the filter iterators to bitsets, and then run either exact search or ANN search depending on certain conditions.
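To make the steps concrete, here is a rough Python sketch of that flow with a timer around each step. It is only an illustration, not the plugin's code: `filtered_search`, `exact_threshold`, and the stage names are hypothetical, and the "ANN" branch is a plain masked scan standing in for a real graph traversal.

```python
import time


def filtered_search(vectors, query, k, filter_fn, exact_threshold=100):
    """Toy filtered k-NN search that records time spent in each step.

    exact_threshold is a hypothetical cutoff; the plugin's real
    exact-vs-ANN condition is not reproduced here.
    """
    timings = {}

    # Step 1: run the filter to find matching document ids.
    start = time.perf_counter()
    matching_ids = [i for i, v in enumerate(vectors) if filter_fn(i, v)]
    timings["filter"] = time.perf_counter() - start

    # Step 2: convert the matches to a bitset (a list of booleans here).
    start = time.perf_counter()
    bitset = [False] * len(vectors)
    for i in matching_ids:
        bitset[i] = True
    timings["bitset"] = time.perf_counter() - start

    def squared_l2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Step 3: pick a strategy and search.
    start = time.perf_counter()
    if len(matching_ids) <= exact_threshold:
        strategy = "exact"
        candidates = matching_ids
    else:
        strategy = "ann"
        # Stand-in only: a real ANN engine would traverse its index and
        # consult the bitset instead of scanning every document.
        candidates = [i for i in range(len(vectors)) if bitset[i]]
    top_k = sorted(candidates, key=lambda i: squared_l2(vectors[i], query))[:k]
    timings[strategy + "_search"] = time.perf_counter() - start

    return top_k, strategy, timings
```

Each of the three timers corresponds to one internal operation whose latency is not visible today.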

Example 2

With the new disk-based vector search feature, we run a two-phase search: first a search with an oversampled k, then rescoring of those top results with full-precision vectors.
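A minimal Python sketch of the two phases, with each phase timed separately. Everything here is an assumption for illustration: `two_phase_search` and `oversample_factor` are hypothetical names, rounding to integers stands in for real quantization, and a brute-force scan stands in for the ANN search.

```python
import time


def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def two_phase_search(full_vectors, query, k, oversample_factor=3):
    """Toy two-phase search: coarse pass on low-precision vectors, then rescore."""
    timings = {}

    # Stand-in for quantization; a real index would build this at index time.
    quantized = [[round(x) for x in v] for v in full_vectors]

    # Phase 1: search the low-precision vectors with an oversampled k.
    start = time.perf_counter()
    oversampled_k = min(len(full_vectors), k * oversample_factor)
    first_pass = sorted(
        range(len(quantized)),
        key=lambda i: squared_l2(quantized[i], query),
    )[:oversampled_k]
    timings["oversampled_search"] = time.perf_counter() - start

    # Phase 2: rescore only those candidates with full-precision vectors.
    start = time.perf_counter()
    rescored = sorted(
        first_pass,
        key=lambda i: squared_l2(full_vectors[i], query),
    )[:k]
    timings["rescore"] = time.perf_counter() - start

    return rescored, timings
```

The two entries in `timings` are the kind of per-phase numbers this issue asks for.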

In both examples we are doing much more than a vector search, and today we have no way to know the latency of these internal operations. There is a profile API that gives a breakdown, but it doesn't track the granular operations mentioned above. The profile API also reports point-in-time latency, whereas most of the time users are interested in latency stats over time for the query and its sub-operations.
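As a sketch of what "stats over time" could mean, the Python snippet below accumulates per-stage latencies across many queries and summarizes them. The class name `StageLatencyStats`, its methods, and the stage names are all hypothetical and not part of the k-NN plugin or OpenSearch.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageLatencyStats:
    """Accumulates latency samples per stage across many queries."""

    def __init__(self):
        self._samples = defaultdict(list)

    @contextmanager
    def time_stage(self, stage):
        # Times the wrapped block and records it even if the block raises.
        start = time.perf_counter()
        try:
            yield
        finally:
            self._samples[stage].append(time.perf_counter() - start)

    def record(self, stage, seconds):
        # Records a latency that was measured elsewhere.
        self._samples[stage].append(seconds)

    def summary(self):
        # Returns count, average, max, and a nearest-rank style p99 per stage.
        out = {}
        for stage, samples in self._samples.items():
            ordered = sorted(samples)
            p99_index = min(len(ordered) - 1, int(0.99 * len(ordered)))
            out[stage] = {
                "count": len(ordered),
                "avg": sum(ordered) / len(ordered),
                "max": ordered[-1],
                "p99": ordered[p99_index],
            }
        return out
```

A query path would wrap each sub-operation in `with stats.time_stage("rescore"):` and a stats endpoint would expose `summary()`. A production version would likely use bounded histograms rather than keeping every sample.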

Solution

I can think of the following solutions:

  1. Improve the profile API results for vector search queries to include these sub-operations. We can take inspiration from how the Bool and DisMax queries do it.
  2. Look into the QueryInsights plugin and see how we can add these sub-operation stats through that plugin rather than emitting them via cluster stats.
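For the first option, the output could follow the nested shape that compound queries already get in profile results, where each node carries a `breakdown` and a list of `children`. The Python sketch below only illustrates that shape: the helper `profile_node`, the node types, and the breakdown keys are hypothetical, and in this sketch a parent's `time_in_nanos` is assumed to include its children.

```python
def profile_node(node_type, breakdown, children=()):
    """Builds one profile-style node with a nested breakdown.

    breakdown maps a hypothetical sub-operation name to nanoseconds.
    """
    children = list(children)
    own_time = sum(breakdown.values())
    total = own_time + sum(child["time_in_nanos"] for child in children)
    return {
        "type": node_type,
        "time_in_nanos": total,
        "breakdown": dict(breakdown),
        "children": children,
    }


# Hypothetical two-phase query: filtering and the oversampled search are
# the query's own work, and rescoring is reported as a child node.
rescore = profile_node("rescore", {"full_precision_score": 300})
knn = profile_node(
    "knn_query",
    {"filter": 100, "bitset_conversion": 50, "oversampled_search": 400},
    [rescore],
)
```

Here `knn["time_in_nanos"]` sums the three own stages and the child, so a reader can see both the total and where it went.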

I don't think these two solutions will be enough, but I see them as a start; we may need to add integrations in a few more places to have a really good mechanism for query stats.

Projects
Status: Backlog (Hot)