Will there be a difference in search speed between the daily and monthly partitions? #4840

1.5M vectors per day, dimension = 1536 (4 bytes per value)
For each daily partition, data size = 1536 * 4 * 1.5M ≈ 9.2G.
For each monthly partition, data size = 30 * 9.2G ≈ 276G.
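
The size estimate above can be reproduced with a short sketch. It assumes float32 vectors (4 bytes per value), decimal units (1 GB = 1e9 bytes), 30 days per month, and the stated dimension of 1536, so the results may differ by rounding from the figures quoted above. The function and constant names are illustrative only and are not part of any Milvus API.

```python
# Rough raw-data size of a partition, ignoring index overhead.
# Assumptions: float32 values (4 bytes each), 1 GB = 1e9 bytes.
DIM = 1536
BYTES_PER_VALUE = 4
VECTORS_PER_DAY = 1_500_000
DAYS_PER_MONTH = 30


def partition_size_gb(num_vectors, dim=DIM, bytes_per_value=BYTES_PER_VALUE):
    """Return the raw vector data size in GB for num_vectors vectors."""
    return num_vectors * dim * bytes_per_value / 1e9


daily_gb = partition_size_gb(VECTORS_PER_DAY)
monthly_gb = partition_size_gb(VECTORS_PER_DAY * DAYS_PER_MONTH)
print(f"daily: {daily_gb:.1f} GB, monthly: {monthly_gb:.1f} GB")
```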

Assume all indexes are created successfully.
Assume the cache size is larger than the monthly partition's index file size.
Assume you have enough CPU cores for parallel computing.
The query performance will mainly depend on these factors:

  • the number of segments (depends on index_file_size)
  • index parameters (for IVF, nlist)
  • search parameters (for IVF, nprobe)

For example:
index_file_size = 4GB: each daily partition has 3 segments and each monthly partition has about 70 segments. Each segment contains 4GB / (1536 * 4) ≈ 650000 vectors.
nlist=10000, each segment has 10000 central vector…
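
As a rough check of the segment arithmetic in this example, the sketch below derives the segment counts and per-segment vector count. It assumes float32 vectors, decimal units (4GB = 4e9 bytes), and that every segment is filled up to index_file_size; real segment boundaries depend on how data is inserted and merged. The average cluster size is a simple division under the assumption that vectors spread evenly over the nlist centroids. All names here are hypothetical helpers, not Milvus calls.

```python
import math

# Assumptions: float32 vectors, 1 GB = 1e9 bytes, segments filled to index_file_size.
DIM = 1536
BYTES_PER_VECTOR = DIM * 4
INDEX_FILE_SIZE_BYTES = 4e9   # index_file_size = 4GB
NLIST = 10000                 # IVF centroids per segment


def segment_count(partition_bytes, segment_bytes=INDEX_FILE_SIZE_BYTES):
    """Number of segments needed to hold a partition of the given size."""
    return math.ceil(partition_bytes / segment_bytes)


def vectors_per_segment(segment_bytes=INDEX_FILE_SIZE_BYTES,
                        bytes_per_vector=BYTES_PER_VECTOR):
    """How many whole vectors fit in one full segment."""
    return int(segment_bytes // bytes_per_vector)


daily_bytes = 1_500_000 * BYTES_PER_VECTOR
monthly_bytes = 30 * daily_bytes

daily_segments = segment_count(daily_bytes)
monthly_segments = segment_count(monthly_bytes)
per_segment = vectors_per_segment()
avg_cluster_size = per_segment / NLIST  # assumes an even spread over centroids

print(daily_segments, monthly_segments, per_segment, round(avg_cluster_size))
```

Under these assumptions, a search over a monthly partition touches roughly 23 times as many segments as a search over a single daily partition.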

yhmo Mar 23, 2021
Collaborator

Answer selected by rky0930