Cognitive Search doesn't host vectorization models, so one of your challenges is creating embeddings for query inputs and outputs. You can use any embedding model, but this article assumes Azure OpenAI embedding models. Demos in the private preview use the similarity embedding models of Azure OpenAI.
The dimensions attribute has a minimum of 2 and a maximum of 2048 dimensions per vector field.
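As a sketch, here's what the dimensions attribute might look like on a vector field in an index definition. The field name and vector search configuration name are hypothetical; 1536 is the output size of text-embedding-ada-002, which falls inside the allowed 2–2048 range.

```python
# Hypothetical vector field fragment from an index definition.
# The property names reflect the private preview and may change.
vector_field = {
    "name": "contentVector",                          # assumed field name
    "type": "Collection(Edm.Single)",
    "dimensions": 1536,                               # must match the embedding model's output size
    "vectorSearchConfiguration": "my-vector-config",  # assumed configuration name
}

# The dimensions value must stay within the supported range.
assert 2 <= vector_field["dimensions"] <= 2048
```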
- Query inputs require that you submit user-provided input to an embedding model that quickly converts human-readable text into a vector. Optimizing for speed is the objective.
- We used text-embedding-ada-002 to generate text embeddings and the Florence Vision API for image embeddings.
- To increase the success rate of generation, we slowed the rate at which calls to the model are made. For the Python demo, we used tenacity.
- Query outputs are any matching documents found in a search index. Your search index must have been previously loaded with documents that have one or more vector fields with embeddings. Whatever model you used for indexing, use the same model for queries.
If you want resources in the same region, start with:
- A region for the similarity embedding model, currently available in Europe and the United States.
- To support hybrid queries that include semantic ranking, or if you want to try machine learning model integration using a custom skill in an AI enrichment pipeline, note the regions that provide those features.
The Postman collection assumes that you already have a vector query. Here's some Python code for generating an embedding that you can paste into the "values" property of a vector query.
```python
# Install the OpenAI Python library first: pip install openai
import openai

# Configure the client for an Azure OpenAI resource.
openai.api_type = "azure"
openai.api_key = "YOUR-API-KEY"
openai.api_base = "https://YOUR-OPENAI-RESOURCE.openai.azure.com"
openai.api_version = "2022-12-01"

# Generate an embedding for the query string.
response = openai.Embedding.create(
    input="How do I use Python in VSCode?",
    engine="text-embedding-ada-002"
)

# Extract the embedding vector from the response.
embeddings = response['data'][0]['embedding']
print(embeddings)
```
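To show where the result goes, here's a sketch of a vector query body with the embedding pasted into the "values" property. The `contentVector` field name and `k` value are assumptions, and the exact payload shape may change after the private preview.

```python
import json

# Stand-in for the embedding printed above (truncated; the real vector has 1536 values).
embedding = [0.0018, -0.0244, 0.0076]

# Hypothetical vector query body as used in the private preview.
vector_query = {
    "vector": {
        "values": embedding,        # the generated embedding goes here
        "fields": "contentVector",  # assumed vector field name in the index
        "k": 5                      # number of nearest neighbors to return
    }
}
print(json.dumps(vector_query, indent=2))
```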
- Python and JavaScript demos offer more scalability than the REST APIs for generating embeddings. As of this writing, the REST API doesn't support batching.
- We've done proof-of-concept testing with indexers and skillsets, where a custom skill calls a machine learning model to generate embeddings. There is currently no tutorial or walkthrough, but we intend to provide this content as part of the public preview launch, if not sooner.
- We've done proof-of-concept testing of embeddings for a thousand images using image retrieval vectorization in Cognitive Services. We hope to provide a demo of this soon.
- Similarity search expands your options for searchable content, for example by matching image content with text content, or matching across multiple languages. But not every query is improved with vector search. Keyword matching with BM25 is cheaper, faster, and easier, so integrate vector search only where it adds value.
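Since the REST API lacks batching, one workaround noted above is client-side batching through the SDK. A minimal sketch: chunk the texts and send each chunk as one request. Here `embed_fn` stands in for a wrapper around `openai.Embedding.create` that accepts a list of strings.

```python
def chunks(items, size):
    """Yield successive fixed-size chunks from a list."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def embed_all(texts, embed_fn, batch_size=16):
    """Embed texts in batches. embed_fn takes a list of strings and returns
    a list of vectors, one per input string (e.g. a wrapper around
    openai.Embedding.create called with input=batch)."""
    vectors = []
    for batch in chunks(texts, batch_size):
        vectors.extend(embed_fn(batch))
    return vectors

# Stub embedder for illustration: maps each text to a tiny fake vector.
fake_embed = lambda batch: [[float(len(t))] for t in batch]
print(embed_all(["a", "bb", "ccc"], fake_embed, batch_size=2))  # [[1.0], [2.0], [3.0]]
```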
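For the custom skill approach mentioned above, Cognitive Search calls a Web API using a request/response contract keyed on record IDs. The sketch below builds a response that returns one embedding per input record; the `embedding` output name is an assumption, and the lambda stands in for the machine learning model call.

```python
import json

def make_skill_response(records, embed_fn):
    """Build a custom Web API skill response: one output per input recordId.
    embed_fn stands in for the call to the machine learning model."""
    return {
        "values": [
            {
                "recordId": r["recordId"],
                "data": {"embedding": embed_fn(r["data"]["text"])},
                "errors": None,
                "warnings": None,
            }
            for r in records
        ]
    }

# Example request body as the enrichment pipeline would send it.
request_body = {"values": [{"recordId": "1", "data": {"text": "hello"}}]}
response = make_skill_response(request_body["values"], lambda t: [0.0] * 3)
print(json.dumps(response))
```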