Implement our own index builder functionality for faiss instead of index_description #1894
Labels
Refactoring
Improve the design, structure, and implementation while preserving its functionality
Description
In faiss right now, we use something called "index description" that is passed to the faiss index factory to build our index. See https://github.com/facebookresearch/faiss/blob/main/faiss/index_factory.cpp and https://github.com/facebookresearch/faiss/wiki/The-index-factory. While this is nice if you want to try out a lot of different index combinations, it has a couple downsides:
For those reasons, I think moving away from building this index description will make the code more maintainable and allow us to more easily extend the code in the future. Instead of the index factory, I think we should just pass a map of the parameters as is to the JNI library and let the JNI figure out how to configure the index.
From a migration perspective, we add the index description to the field type attributes (https://github.com/opensearch-project/k-NN/blob/main/src/main/java/org/opensearch/knn/index/mapper/MethodFieldMapper.java#L62-L63), but this will not impact new segments being created because these attributes are created for each segment. Because the mapping passed by the user has all of the parameters, it should be seemless from this perspective. Similarly, for training, it will not have any compatibility issues either because the description is just created at the time of model creation and not serialized. On the JNI side, we will need to basically duplicate how we are creating the indices in the faiss index_factory so that there are no regressions. This will need some investigation.
The text was updated successfully, but these errors were encountered: