This repository has been archived by the owner on Nov 21, 2024. It is now read-only.
It would be VERY helpful if there were some rules of thumb for how to right-size these parameters. For example, pgvector gives the following guidance: "Choose an appropriate number of lists - a good place to start is rows / 1000 for up to 1M rows and sqrt(rows) for over 1M rows." Could similar guidance be created for pg_embedding?
There is some guidance about "large values" and what they do, but I don't know what "higher values" means. Should it be 10x higher or just 1 higher?
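For reference, the pgvector rule of thumb quoted above can be written out as a small helper. This is only a sketch: the function name `suggested_lists` is made up for illustration, and rounding to a whole number with a floor of 1 is an assumption, since the quoted guidance does not say how to round.

```python
import math


def suggested_lists(rows: int) -> int:
    """Starting value for pgvector's `lists`, per the quoted rule of thumb.

    Hypothetical helper: the name and the rounding are assumptions.
    """
    if rows < 0:
        raise ValueError("rows must be non-negative")
    if rows <= 1_000_000:
        # rows / 1000 for up to 1M rows; keep at least one list (assumption).
        return max(1, rows // 1000)
    # sqrt(rows) for over 1M rows, rounded to the nearest integer (assumption).
    return round(math.sqrt(rows))
```

Both branches give 1000 at exactly 1M rows, so the rule has no jump at the boundary.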
Thanks, @dannyseismic, for the suggestion. We are currently running more tests to determine the ideal parameters.
Here is a good place to start:
For millions of vectors, m should be between 48 and 64.
Start with efConstruction = 2 x m.
Set efSearch >= k, k being the number of nearest neighbors.
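The starting points above could be captured in a few lines of Python. This is a minimal sketch, not part of pg_embedding: the helper name `starting_hnsw_params` is hypothetical, the dictionary keys simply mirror the parameter names used in this thread, and defaulting efSearch to exactly k (the smallest value that satisfies efSearch >= k) is an assumption.

```python
from typing import Optional


def starting_hnsw_params(k: int, m: int = 48, ef_search: Optional[int] = None) -> dict:
    """Return starting HNSW parameters for an index with millions of vectors.

    Hypothetical helper based only on the rules of thumb in this thread.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if not 48 <= m <= 64:
        raise ValueError("for millions of vectors, m should be between 48 and 64")
    if ef_search is None:
        # Assumption: start at the lower bound, efSearch = k.
        ef_search = k
    elif ef_search < k:
        raise ValueError("efSearch must be >= k")
    # efConstruction = 2 x m, per the suggestion above.
    return {"m": m, "efConstruction": 2 * m, "efSearch": ef_search}
```

For a top-10 query with the default m, this gives m = 48, efConstruction = 96 and efSearch = 10. Treat these as a baseline to tune from, since the ideal values were still being tested at the time of this reply.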