Implement index directory traits based on S3 API #333
Comments
A basic POC is working in #345, but the following TODOs should be done before we try this:
Making it runtime configurable is important so that we can run it in staging and flip back to the existing index format if needed.
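As a rough illustration of what "runtime configurable" could look like, here is a minimal sketch of a switch between the existing file-based index and the S3-backed one. The names IndexMode and index_mode_from, and the accepted values "file" and "s3", are hypothetical and not taken from #345; the point is only that the existing format stays the default, so flipping back means removing or changing one setting.

```rust
use std::str::FromStr;

/// Hypothetical runtime switch between the two index formats.
/// The existing local-file format stays the default.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum IndexMode {
    #[default]
    File,
    S3,
}

impl FromStr for IndexMode {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Accept the value case-insensitively and ignore surrounding whitespace.
        match s.trim().to_ascii_lowercase().as_str() {
            "file" => Ok(IndexMode::File),
            "s3" => Ok(IndexMode::S3),
            other => Err(format!("unknown index mode: {other}")),
        }
    }
}

/// Resolve the mode from an optional setting (for example a CLI flag or an
/// environment variable, both assumptions here). An unset value falls back
/// to the existing format.
fn index_mode_from(value: Option<&str>) -> Result<IndexMode, String> {
    match value {
        None => Ok(IndexMode::default()),
        Some(v) => v.parse(),
    }
}
```

With something like this, staging could run with the setting at "s3" while production leaves it unset, and an unknown value fails loudly at startup instead of silently picking a format.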
lulf pushed a commit that referenced this issue Aug 14, 2023
* Add a runtime config option to enable s3 directory backed index.
* Implement tantivy directory trait based on s3 storage to bypass the local filesystem and syncing.
Issue #333
This is on the border of an RFC, but it is not that much design work.
Today, the bombastic and vexination indexes rely on the local file system, with a periodic 'sync' of the index to S3 for the indexer, and from S3 for the API processes. This comes with a few issues, as described in trustification/trustification.dev#26:
The tantivy library has a trait for the index Directory, which comes with 2 out-of-the-box implementations:

Instead of going via the local file system, the proposal is to implement an S3Directory which implements the same trait. This is probably similar to what quickwit (a search server based on tantivy, AGPL) is already doing, but I have not checked. In any case, the S3 API supports all the operations needed by the Directory trait.

The advantages of this approach would be:
Tantivy is pretty good at caching which means that search queries should still be fast enough.
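To make the proposal a bit more concrete, here is a minimal sketch of how directory-style operations could map onto object-store calls. Everything in it is an assumption for illustration: ObjectStore is a hypothetical trait standing in for an S3 client (its methods correspond roughly to PutObject, GetObject, ranged GetObject, HeadObject and DeleteObject), InMemoryStore replaces the network, and the S3Directory methods are only loosely modeled on what an index directory needs. This is not tantivy's actual Directory trait and not the code from #345.

```rust
use std::collections::HashMap;
use std::io;
use std::ops::Range;
use std::path::Path;

/// Hypothetical stand-in for an S3 client: a flat key -> bytes store.
trait ObjectStore {
    fn put(&mut self, key: &str, data: Vec<u8>); // ~ PutObject
    fn get(&self, key: &str) -> Option<Vec<u8>>; // ~ GetObject
    fn get_range(&self, key: &str, range: Range<usize>) -> Option<Vec<u8>>; // ~ ranged GetObject
    fn head(&self, key: &str) -> Option<usize>; // ~ HeadObject, returns the object size
    fn delete(&mut self, key: &str) -> bool; // ~ DeleteObject
}

/// In-memory implementation so the sketch runs without any network access.
#[derive(Default)]
struct InMemoryStore {
    objects: HashMap<String, Vec<u8>>,
}

impl ObjectStore for InMemoryStore {
    fn put(&mut self, key: &str, data: Vec<u8>) {
        self.objects.insert(key.to_string(), data);
    }
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        self.objects.get(key).cloned()
    }
    fn get_range(&self, key: &str, range: Range<usize>) -> Option<Vec<u8>> {
        self.objects.get(key)?.get(range).map(|s| s.to_vec())
    }
    fn head(&self, key: &str) -> Option<usize> {
        self.objects.get(key).map(|d| d.len())
    }
    fn delete(&mut self, key: &str) -> bool {
        self.objects.remove(key).is_some()
    }
}

/// Directory-like view over an object store: index file paths become
/// object keys under a common prefix.
struct S3Directory<S: ObjectStore> {
    store: S,
    prefix: String,
}

impl<S: ObjectStore> S3Directory<S> {
    fn new(store: S, prefix: &str) -> Self {
        Self { store, prefix: prefix.trim_end_matches('/').to_string() }
    }

    fn key(&self, path: &Path) -> String {
        format!("{}/{}", self.prefix, path.display())
    }

    fn not_found(path: &Path) -> io::Error {
        io::Error::new(io::ErrorKind::NotFound, format!("{} not found", path.display()))
    }

    fn exists(&self, path: &Path) -> bool {
        self.store.head(&self.key(path)).is_some()
    }

    /// A single put replaces the whole object, so readers never see a partial file.
    fn atomic_write(&mut self, path: &Path, data: &[u8]) {
        let key = self.key(path);
        self.store.put(&key, data.to_vec());
    }

    fn atomic_read(&self, path: &Path) -> io::Result<Vec<u8>> {
        self.store.get(&self.key(path)).ok_or_else(|| Self::not_found(path))
    }

    /// Ranged read; in this sketch an out-of-bounds range is also reported as NotFound.
    fn read_range(&self, path: &Path, range: Range<usize>) -> io::Result<Vec<u8>> {
        self.store
            .get_range(&self.key(path), range)
            .ok_or_else(|| Self::not_found(path))
    }

    fn delete(&mut self, path: &Path) -> io::Result<()> {
        let key = self.key(path);
        if self.store.delete(&key) { Ok(()) } else { Err(Self::not_found(path)) }
    }
}
```

The ranged read is the part that matters most for search latency: segment files can be fetched in slices rather than downloaded whole, and a cache in front of those reads would absorb most repeated access. A real implementation would also have to deal with streaming writes, locking and change notification, which this sketch leaves out.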