Feature/index #128

Open
wants to merge 29 commits into
base: dev

Conversation

net-cscience-raphael
Contributor

Adds the index for NNS and Fulltext.

The setup is documented on the wiki page https://github.com/vitrivr/vitrivr-engine/wiki/Documentation#creating-indexes; the same documentation is also appended below.

Indexes have to be configured in the schema.json file.
They are created during schema initialization.

Nearest Neighbor Search (NNS)

Index hnsw

The hierarchical navigable small world index (HNSW) can be set up for a VECTOR field by adding the following configuration to the field in the schema config.

Parameters:

  • attributes: at most one attribute is allowed. The value "vector" is the name of the attribute in the database.
  • type: "NNS" describes the query type.
  • parameters.type: "hnsw" describes the index type.
  • distance: describes the distance metric for this index. The hnsw index provides:
    • "manhatten"
    • "euclidean"
    • "cosine"
    • "hamming"
    • "jaccard"
  • m: the max number of connections per layer (16 by default)
  • efConstruction: the size of the dynamic candidate list for constructing the graph (64 by default)
  • efSearch: the size of the dynamic candidate list for search (100 by default)
  "indexes": [
    {
      "attributes": [
        "vector"
      ],
      "type": "NNS",
      "parameters": {
        "type": "hnsw",
        "distance": "cosine",
        "m": "4",
        "efConstruction": "10",
        "efSearch": "1000"
      }
    }
  ]
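
As a rough illustration of the rules listed above, the following Python sketch checks one entry of the "indexes" array and fills in the documented defaults. It is not code from this PR: the function name normalize_hnsw_index and the returned dictionary layout are hypothetical, and treating "distance" as required is an assumption, since no default is documented.

```python
# Hypothetical helper, not part of vitrivr-engine: it only mirrors the
# parameter rules documented above for an hnsw index entry.
HNSW_DISTANCES = {"manhatten", "euclidean", "cosine", "hamming", "jaccard"}
HNSW_DEFAULTS = {"m": 16, "efConstruction": 64, "efSearch": 100}


def normalize_hnsw_index(index: dict) -> dict:
    """Return the effective hnsw settings, or raise ValueError if a rule is violated."""
    attributes = index.get("attributes", [])
    if len(attributes) > 1:
        raise ValueError("an NNS index allows at most one attribute")
    if index.get("type") != "NNS":
        raise ValueError('type must be "NNS"')

    params = index.get("parameters", {})
    if params.get("type") != "hnsw":
        raise ValueError('parameters.type must be "hnsw"')

    # Assumption: "distance" is required, because no default is documented.
    distance = params.get("distance")
    if distance not in HNSW_DISTANCES:
        raise ValueError(f"unsupported distance: {distance!r}")

    # The example config stores numbers as strings, so convert them to int
    # and fall back to the documented defaults for missing keys.
    numeric = {key: int(params.get(key, default)) for key, default in HNSW_DEFAULTS.items()}
    return {
        "attribute": attributes[0] if attributes else None,
        "distance": distance,
        **numeric,
    }
```

Applied to the example configuration above, this sketch would yield m = 4, efConstruction = 10 and efSearch = 1000; an entry that omits these keys would fall back to 16, 64 and 100.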

FullText Search

Index gin

The generalized inverted index (GIN) can be set up for a FULLTEXT field by adding the following configuration to the field in the schema config.

Parameters:

  • attributes: the value "value" is the name of the attribute in the database. All listed attributes are concatenated with the delimiter " || ' ' || " to form a single document.
  • type: "FULLTEXT" describes the query type.
  • parameters.type: "gin" describes the index type.
  • language: the language for the fulltext index ("english" by default)
  "indexes": [
    {
      "attributes": [
        "value"
      ],
      "type": "FULLTEXT",
      "parameters": {
        "type": "gin",
        "language": "english"
      }
    }
  ]
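
To make the concatenation rule concrete, here is a small Python sketch of how several attributes could be joined into one document expression. It is an illustration only, not the implementation in this PR: the function name fulltext_document_expression and the second attribute "title" used in the usage note are hypothetical, and requiring at least one attribute is an assumption.

```python
# Hypothetical helper, not part of vitrivr-engine: it only illustrates the
# documented delimiter and the documented default language.
DELIMITER = " || ' ' || "


def fulltext_document_expression(index: dict) -> tuple[str, str]:
    """Return (document expression, language) for a gin fulltext index entry."""
    if index.get("type") != "FULLTEXT":
        raise ValueError('type must be "FULLTEXT"')

    params = index.get("parameters", {})
    if params.get("type") != "gin":
        raise ValueError('parameters.type must be "gin"')

    # Assumption: an index without any attribute is treated as an error.
    attributes = index.get("attributes", [])
    if not attributes:
        raise ValueError("a fulltext index needs at least one attribute")

    # "english" is the documented default language.
    language = params.get("language", "english")
    return DELIMITER.join(attributes), language
```

For the attributes ["title", "value"], the sketch returns the expression title || ' ' || value; with the single attribute from the example above, the expression is just value.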

@net-cscience-raphael net-cscience-raphael self-assigned this Feb 6, 2025
@net-cscience-raphael net-cscience-raphael added the enhancement New feature or request label Feb 6, 2025
@net-cscience-raphael net-cscience-raphael marked this pull request as ready for review February 6, 2025 10:42
@net-cscience-raphael
Contributor Author

TODO: Add warning if query exceeds hnsw limit.

@ppanopticon ppanopticon requested a review from lucaro February 12, 2025 10:27
3 participants