Incremental indexing? #4

vishaal27 · 2024-05-26T19:25:27Z

Hey, thanks for the great implementation. Are there any plans to add incremental index construction for large documents? This would be useful when we cannot store the entire corpus in memory and want to continually add new corpora to the already-constructed index.

jxmorris12 · 2024-05-27T18:34:02Z

Hi @vishaal27 -- can you describe a little bit about how this would work? It's a situation where you can't store the corpus in memory but can store the BM25 vectors in memory, is that right? Because otherwise I think the library as implemented would fail, since all of the BM25 weights are stored as a single tensor.

vishaal27 · 2024-05-28T04:52:55Z

Yes exactly, this is the use-case I was imagining. Thanks!
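To make the use case concrete, here is a minimal sketch of what incremental construction could look like, assuming plain whitespace tokenization and the common Lucene-style BM25 formula. The class `IncrementalBM25` and its methods are hypothetical names for illustration, not this library's API, and the sketch uses Python dicts rather than the single weight tensor the library stores. Only raw term counts and document lengths are kept, so each text chunk can be discarded after it is added; IDF and average document length depend on the whole corpus, so they are computed at query time.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, Iterable, List


class IncrementalBM25:
    """Hypothetical streaming BM25 index (illustration only, not a library API)."""

    def __init__(self, k1: float = 1.5, b: float = 0.75) -> None:
        self.k1 = k1
        self.b = b
        self.doc_lens: List[int] = []  # token count per document
        # term -> {doc_id: term frequency in that document}
        self.postings: Dict[str, Dict[int, int]] = defaultdict(dict)

    def add_documents(self, docs: Iterable[str]) -> None:
        """Add one chunk of the corpus; the raw text is not retained."""
        for text in docs:
            tokens = text.lower().split()  # assumed tokenizer
            doc_id = len(self.doc_lens)
            self.doc_lens.append(len(tokens))
            for term, tf in Counter(tokens).items():
                self.postings[term][doc_id] = tf

    def score(self, query: str) -> List[float]:
        """Return one BM25 score per indexed document."""
        n = len(self.doc_lens)
        scores = [0.0] * n
        if n == 0:
            return scores
        # Corpus-level statistics are derived here, so they always
        # reflect every chunk added so far.
        avgdl = (sum(self.doc_lens) / n) or 1.0
        for term in set(query.lower().split()):
            posting = self.postings.get(term)
            if not posting:
                continue
            df = len(posting)
            idf = math.log(1.0 + (n - df + 0.5) / (df + 0.5))
            for doc_id, tf in posting.items():
                norm = 1.0 - self.b + self.b * self.doc_lens[doc_id] / avgdl
                scores[doc_id] += idf * tf * (self.k1 + 1.0) / (tf + self.k1 * norm)
        return scores


# Chunks arrive one at a time; only the counts stay in memory.
index = IncrementalBM25()
index.add_documents(["the cat sat on the mat", "dogs chase cats"])
index.add_documents(["the quick brown fox"])
scores = index.score("cat")
```

Because the corpus-level statistics are deferred to query time, adding documents in chunks gives the same scores as adding them all at once. The trade-off is that weights cannot be precomputed into a fixed tensor, since every new chunk changes the IDF and length normalization of existing documents.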

jxmorris12 added the enhancement (New feature or request) label · May 27, 2024
