page_type | languages | products | name | description | |||||
---|---|---|---|---|---|---|---|---|---|
sample |
|
|
Document Processing with Azure AI Samples |
This collection of samples demonstrates how to use various Azure AI capabilities to build solutions to extract structured data, classify, and analyze documents. |
This repository contains a collection of code samples that demonstrate how to use various Azure AI capabilities to process documents.
The samples are intended to help engineering teams establish techniques with Azure AI Studio, Azure OpenAI, and Azure Document Intelligence to build solutions to extract structured data, classify, and analyze documents.
The techniques demonstrated take advantage of various capabilities from each service to:
- Reduce complexity of custom model training by taking advantage of the capabilities of Generative AI models to analyze and classify documents.
- Improve reliability in document processing by combining AI service capabilities to extract structured data from any document type, with high accuracy and confidence.
- Simplify document processing workflows by providing reusable code and patterns that can be easily modified and evaluated for most use cases.
Note
All data extraction samples provide both an accuracy score and a confidence score for the extracted data. The accuracy score is calculated based on the similarity between the extracted data and the ground truth data. The confidence score is calculated based on OCR analysis confidence and the logprobs returned for Azure OpenAI requests.
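The note above does not give the exact formulas, so the following is only a minimal sketch of how such scores could be computed. It assumes per-field string similarity for accuracy, the mean token probability (the exponential of each logprob) for model confidence, and the minimum of the OCR and model confidence as the combined score. All function names here are hypothetical and are not taken from the samples.

```python
import math
from difflib import SequenceMatcher
from typing import Dict, Sequence


def field_accuracy(expected: Dict[str, object], extracted: Dict[str, object]) -> Dict[str, float]:
    """Per-field similarity (0.0 to 1.0) between ground truth and extracted values."""
    scores = {}
    for key, truth in expected.items():
        value = extracted.get(key)
        if value is None:
            # A missing field counts as a complete miss.
            scores[key] = 0.0
        else:
            scores[key] = SequenceMatcher(None, str(truth), str(value)).ratio()
    return scores


def overall_accuracy(expected: Dict[str, object], extracted: Dict[str, object]) -> float:
    """Mean of the per-field similarity scores."""
    scores = field_accuracy(expected, extracted)
    return sum(scores.values()) / len(scores) if scores else 0.0


def logprob_confidence(token_logprobs: Sequence[float]) -> float:
    """Mean token probability, where each probability is exp(logprob)."""
    if not token_logprobs:
        return 0.0
    return sum(math.exp(lp) for lp in token_logprobs) / len(token_logprobs)


def combined_confidence(ocr_confidence: float, token_logprobs: Sequence[float]) -> float:
    """Assumption: take the weaker of the OCR and model signals."""
    return min(ocr_confidence, logprob_confidence(token_logprobs))


if __name__ == "__main__":
    truth = {"invoice_number": "INV-1001", "total": "250.00"}
    result = {"invoice_number": "INV-1001", "total": "250.00"}
    print("accuracy:", overall_accuracy(truth, result))
    print("confidence:", combined_confidence(0.97, [-0.01, -0.05, -0.02]))
```

Taking the minimum is a conservative choice: a field is only reported as high confidence when both the OCR result and the model output agree that it is reliable. A weighted average would be an equally reasonable alternative.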
Sample | Description | Example Use Cases |
---|---|---|
Data Extraction - Azure AI Document Intelligence + Azure OpenAI GPT-4o | Demonstrates how to use Azure AI Document Intelligence pre-built layout and Azure OpenAI GPT models to extract structured data from documents. | Predominantly text-based documents such as invoices, receipts, and forms. |
Data Extraction - Azure AI Document Intelligence + Phi-3.5 MoE | Demonstrates how to use Azure AI Document Intelligence pre-built layout and Microsoft's Phi-3.5 MoE model to extract structured data from documents. | Predominantly text-based documents such as invoices, receipts, and forms. |
Data Extraction - Marker/Surya + Azure OpenAI GPT-4o | Demonstrates how to use Marker/Surya and Azure OpenAI GPT models to extract structured data from documents. | Predominantly text-based documents such as invoices, receipts, and forms. |
Data Extraction - Azure OpenAI GPT-4o with Vision | Demonstrates how to use Azure OpenAI GPT-4o and GPT-4o-mini models to extract structured data from documents using their built-in vision capabilities. | Complex documents, such as reports and contracts, that mix text with images, diagrams, signatures, and selection marks. |
Data Extraction - Comprehensive Azure AI Document Intelligence + Azure OpenAI GPT-4o with Vision | Demonstrates how to improve the accuracy and confidence in extracting structured data from documents by combining Azure AI Document Intelligence and Azure OpenAI GPT-4o models with vision capabilities. | Any structured or unstructured document type. |
Classification - Azure OpenAI GPT-4o with Vision | Demonstrates how to use Azure OpenAI GPT-4o and GPT-4o-mini models to classify documents using their built-in vision capabilities. | Processing multiple document types or documents with varying purposes, such as contracts, legal documents, and emails. |
Classification - Azure AI Document Intelligence + Embeddings | Demonstrates how to use Azure AI Document Intelligence pre-built layout and embeddings models to classify documents based on their content. | Processing multiple document types or documents with varying purposes, such as contracts, legal documents, and emails. |
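To illustrate the embeddings-based classification approach listed in the table, here is a minimal sketch that assigns a document to the class whose reference embedding is most similar. It assumes cosine similarity as the comparison metric, which this README does not confirm. The three-dimensional vectors are toy placeholders; real vectors would come from an embeddings model such as text-embedding-3-large. The `classify` helper and its `threshold` parameter are hypothetical.

```python
import math
from typing import Dict, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(
    document_embedding: Sequence[float],
    class_embeddings: Dict[str, Sequence[float]],
    threshold: float = 0.0,
) -> Tuple[Optional[str], float]:
    """Return the best-matching label and its score, or (None, score) below the threshold."""
    best_label, best_score = None, -1.0
    for label, embedding in class_embeddings.items():
        score = cosine_similarity(document_embedding, embedding)
        if score > best_score:
            best_label, best_score = label, score
    if best_score < threshold:
        return None, best_score
    return best_label, best_score


if __name__ == "__main__":
    # Toy reference embeddings, one per document class (placeholders, not real model output).
    classes = {
        "invoice": [1.0, 0.0, 0.0],
        "contract": [0.0, 1.0, 0.0],
    }
    label, score = classify([0.9, 0.1, 0.0], classes, threshold=0.5)
    print(label, round(score, 3))
```

The threshold lets a caller treat low-similarity documents as unclassified instead of forcing them into the nearest class, which can help when a document type is not represented in the reference set.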
The sample repository comes with a Dev Container that contains all the necessary tools and dependencies to run the samples. To use the Dev Container, you need to have the following tools installed on your local machine:
- Install Visual Studio Code
- Install Docker Desktop
- Install Remote - Containers extension for Visual Studio Code
Additionally, you will require:
- An Azure subscription. If you don't have an Azure subscription, create an account.
To set up a local development environment, follow these steps:
Note
Ensure that Docker Desktop is running on your local machine.
- Clone the repository to your local machine.
- Open the repository in Visual Studio Code.
- Press F1 to open the command palette and type Dev Containers: Reopen in Container.
Once the Dev Container is up and running, you can set up the necessary Azure services and run the samples in the repository by running the following commands in VS Code's pwsh terminal:
Note
For the best sample experience, run the samples in East US, which supports all the services used in the samples. Find out more about region availability for Azure AI Document Intelligence, and for the GPT-4o, Phi-3.5 MoE, and text-embedding-3-large models.
az login
./Setup-Environment.ps1 -DeploymentName <UniqueDeploymentName> -Location <AzureRegion> -SkipInfrastructure $false
The script will deploy the following resources to your Azure subscription:
- Azure AI Studio Hub & Project, a development platform for building AI solutions that integrates with Azure AI Services in a secure manner using Microsoft Entra ID for authentication.
- Note: Phi-3.5 MoE will be deployed as a PAYG serverless endpoint in the Azure AI Studio Project with its primary key stored in the associated Azure Key Vault.
- Azure AI Services, a managed service for all Azure AI Services, including Azure OpenAI and Azure AI Document Intelligence.
- Note: GPT-4o and GPT-4o-mini will be deployed as Global Standard models with 10K TPM quota allocation. text-embedding-3-large will be deployed as a Standard model with 115K TPM quota allocation. These allocations can be adjusted in the main.bicep file based on your quota availability.
- Azure Storage Account, required by Azure AI Studio.
- Azure Monitor, used to store logs and traces for monitoring and troubleshooting purposes.
- Azure Container Registry, used to store container images for the Azure AI Studio environment.
Note
All resources are secured by default with Microsoft Entra ID using Azure RBAC. Your user identity will be assigned the least-privilege roles needed to access the resources created. A user-assigned managed identity will also be deployed for the Azure AI Studio environment.
After the script completes, you can run any of the samples in the repository by following their instructions.
You can contribute to the repository by opening an issue or submitting a pull request. For more information, see the Contributing guide.
This project is licensed under the MIT License.