Cognitive Search doesn't host vectorization models, so one of your challenges is creating embeddings for query inputs and outputs. You can use any embedding model, but this article assumes Azure OpenAI embedding models. Demos in the private preview use the similarity embedding models of Azure OpenAI.
The dimensions attribute has a minimum of 2 and a maximum of 2048 dimensions per vector field.
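As a sketch, here's what the dimensions attribute might look like on a vector field in an index definition. The field name and vector search configuration name are hypothetical; 1536 is the output size of text-embedding-ada-002, which falls inside the allowed 2–2048 range.

```python
# Hypothetical vector field fragment from an index definition.
# The property names reflect the private preview and may change.
vector_field = {
    "name": "contentVector",                          # assumed field name
    "type": "Collection(Edm.Single)",
    "dimensions": 1536,                               # must match the embedding model's output size
    "vectorSearchConfiguration": "my-vector-config",  # assumed configuration name
}

# The dimensions value must stay within the supported range.
assert 2 <= vector_field["dimensions"] <= 2048
```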
- Query inputs require that you submit user-provided input to an embedding model that quickly converts human-readable text into a vector. Optimizing for speed is the objective.
- We used text-embedding-ada-002 to generate text embeddings and the Florence Vision API for image embeddings.
- To increase the success rate of generation, we slowed the rate at which calls to the model are made. For the Python demo, we used tenacity.
- Query outputs are any matching documents found in a search index. Your search index must have been previously loaded with documents that have one or more vector fields with embeddings. Whatever model you used for indexing, use the same model for queries.
If you want resources in the same region, start with:
- A region for the similarity embedding model, currently available in Europe and the United States.
- To support hybrid queries that include semantic ranking, or if you want to try machine learning model integration using a custom skill in an AI enrichment pipeline, note the regions that provide those features.
The Postman collection assumes that you already have a vector query. Here's some Python code for generating an embedding that you can paste into the "values" property of a vector query.
```python
# Install the OpenAI Python library first: pip install openai
import openai

# Configure the client for an Azure OpenAI resource.
openai.api_type = "azure"
openai.api_key = "YOUR-API-KEY"
openai.api_base = "https://YOUR-OPENAI-RESOURCE.openai.azure.com"
openai.api_version = "2022-12-01"

# Generate an embedding for the query string.
response = openai.Embedding.create(
    input="How do I use Python in VSCode?",
    engine="text-embedding-ada-002"
)

# Extract the embedding vector from the response.
embeddings = response['data'][0]['embedding']
print(embeddings)
```
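To show where the result goes, here's a sketch of a vector query body with the embedding pasted into the "values" property. The `contentVector` field name and `k` value are assumptions, and the exact payload shape may change after the private preview.

```python
import json

# Stand-in for the embedding printed above (truncated; the real vector has 1536 values).
embedding = [0.0018, -0.0244, 0.0076]

# Hypothetical vector query body as used in the private preview.
vector_query = {
    "vector": {
        "values": embedding,        # the generated embedding goes here
        "fields": "contentVector",  # assumed vector field name in the index
        "k": 5                      # number of nearest neighbors to return
    }
}
print(json.dumps(vector_query, indent=2))
```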
- Python and JavaScript demos offer more scalability than the REST APIs for generating embeddings. As of this writing, the REST API doesn't support batching.
- We've done proof-of-concept testing with indexers and skillsets, where a custom skill calls a machine learning model to generate embeddings. There is currently no tutorial or walkthrough, but we intend to provide this content as part of the public preview launch, if not sooner.
- We've done proof-of-concept testing of embeddings for a thousand images using image retrieval vectorization in Cognitive Services. We hope to provide a demo of this soon.
- Similarity search expands your options for searchable content, for example by matching image content with text content, or matching across multiple languages. But not every query is improved with vector search. Keyword matching with BM25 is cheaper, faster, and easier, so integrate vector search only where it adds value.
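Since the REST API lacks batching, one workaround noted above is client-side batching through the SDK. A minimal sketch: chunk the texts and send each chunk as one request. Here `embed_fn` stands in for a wrapper around `openai.Embedding.create` that accepts a list of strings.

```python
def chunks(items, size):
    """Yield successive fixed-size chunks from a list."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def embed_all(texts, embed_fn, batch_size=16):
    """Embed texts in batches. embed_fn takes a list of strings and returns
    a list of vectors, one per input string (e.g. a wrapper around
    openai.Embedding.create called with input=batch)."""
    vectors = []
    for batch in chunks(texts, batch_size):
        vectors.extend(embed_fn(batch))
    return vectors

# Stub embedder for illustration: maps each text to a tiny fake vector.
fake_embed = lambda batch: [[float(len(t))] for t in batch]
print(embed_all(["a", "bb", "ccc"], fake_embed, batch_size=2))  # [[1.0], [2.0], [3.0]]
```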
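For the custom skill approach mentioned above, Cognitive Search calls a Web API using a request/response contract keyed on record IDs. The sketch below builds a response that returns one embedding per input record; the `embedding` output name is an assumption, and the lambda stands in for the machine learning model call.

```python
import json

def make_skill_response(records, embed_fn):
    """Build a custom Web API skill response: one output per input recordId.
    embed_fn stands in for the call to the machine learning model."""
    return {
        "values": [
            {
                "recordId": r["recordId"],
                "data": {"embedding": embed_fn(r["data"]["text"])},
                "errors": None,
                "warnings": None,
            }
            for r in records
        ]
    }

# Example request body as the enrichment pipeline would send it.
request_body = {"values": [{"recordId": "1", "data": {"text": "hello"}}]}
response = make_skill_response(request_body["values"], lambda t: [0.0] * 3)
print(json.dumps(response))
```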