This sample application demonstrates how Durable Functions can be used to build intelligent applications, particularly in a document processing scenario. Order and durability are key here because the results from one activity are passed to the next. Also, calls to services like Azure Cognitive Services or Azure OpenAI can be costly and should not be repeated in the event of failures.
This sample integrates various Azure services, including Azure Durable Functions, Azure Storage, Azure Cognitive Services, and Azure OpenAI.
The application showcases how PDFs can be ingested and intelligently scanned to determine their content.
The application's workflow is as follows:
- PDFs are uploaded to a blob storage input container.
- A durable function is triggered upon blob upload. It then:
  - Downloads the blob (PDF).
  - Uses the Azure Cognitive Services Form Recognizer endpoint to extract the text from the PDF.
  - Sends the extracted text to Azure OpenAI to analyze and determine the content of the PDF.
  - Saves the summary results from Azure OpenAI to a new file and uploads it to the output blob container.
Below, you will find the instructions to set up and run this app locally.
- Create an active Azure subscription.
- Install the latest Azure Functions Core Tools to use the CLI.
- Install Python 3.9 or greater.
- Access permissions to create Azure OpenAI resources and to deploy models.
- Start and configure an Azurite storage emulator for local storage.
You will need to configure a local.settings.json file at the root of the repo that looks similar to the example below. Make sure to replace the placeholders with your specific values.
{
  "Values": {
    "AzureWebJobsStorage": "UseDevelopmentStorage=true",
    "AzureWebJobsFeatureFlags": "EnableWorkerIndexing",
    "FUNCTIONS_WORKER_RUNTIME": "python",
    "BLOB_STORAGE_ENDPOINT": "<BLOB-STORAGE-ENDPOINT>",
    "COGNITIVE_SERVICES_ENDPOINT": "<COGNITIVE-SERVICE-ENDPOINT>",
    "AZURE_OPENAI_ENDPOINT": "<AZURE-OPEN-AI-ENDPOINT>",
    "AZURE_OPENAI_KEY": "<AZURE-OPEN-AI-KEY>",
    "CHAT_MODEL_DEPLOYMENT_NAME": "<AZURE-OPEN-AI-MODEL>"
  }
}
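A forgotten placeholder in this file usually only surfaces as a runtime error once an activity calls the affected service. As a quick sanity check before starting the app, something like the following minimal sketch could flag settings that are missing, empty, or still set to a `<...>` placeholder. The helper names (`find_unset_settings`, `check_settings_file`) are hypothetical and not part of the sample; the key list simply mirrors the example above.

```python
import json
from pathlib import Path

# Keys taken from the example local.settings.json above.
REQUIRED_KEYS = [
    "BLOB_STORAGE_ENDPOINT",
    "COGNITIVE_SERVICES_ENDPOINT",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_KEY",
    "CHAT_MODEL_DEPLOYMENT_NAME",
]


def find_unset_settings(values: dict) -> list:
    """Return required keys that are missing, empty, or still a <PLACEHOLDER>."""
    unset = []
    for key in REQUIRED_KEYS:
        value = str(values.get(key, "")).strip()
        if not value or (value.startswith("<") and value.endswith(">")):
            unset.append(key)
    return unset


def check_settings_file(path: str = "local.settings.json") -> list:
    """Load the "Values" section of a settings file and report unset keys."""
    settings = json.loads(Path(path).read_text(encoding="utf-8"))
    return find_unset_settings(settings.get("Values", {}))
```

Calling `check_settings_file()` from the repo root returns an empty list when every required value has been filled in, and the names of the offending keys otherwise.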
- Start Azurite: Begin by starting Azurite, the local Azure Storage emulator.
- Install the Requirements: Open your terminal and run the following command to install the necessary packages:
  python3 -m pip install -r requirements.txt
- Create two containers in your storage account, one called input and the other called output.
- Start the Function App: Start the function app to run the application locally.
  func start --verbose
- Upload PDFs to the input container. This fires the blob storage trigger in your Durable Function.
- After several seconds, your application should have finished the orchestrations. Switch to the output container and notice that the PDFs have been summarized as new files.
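The container creation and upload steps can also be scripted rather than done by hand. The sketch below is one possible approach, not part of the sample: it assumes the `azure-storage-blob` package is installed and that Azurite is running on its default blob port with its well-known development account. The function name `prepare_and_upload` is hypothetical.

```python
from pathlib import Path

# Azurite's published development credentials (default blob port 10000).
AZURITE_CONNECTION_STRING = (
    "DefaultEndpointsProtocol=http;"
    "AccountName=devstoreaccount1;"
    "AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/"
    "K1SZFPTOtr/KBHBeksoGMGw==;"
    "BlobEndpoint=http://127.0.0.1:10000/devstoreaccount1;"
)
CONTAINERS = ("input", "output")


def prepare_and_upload(pdf_path: str,
                       connection_string: str = AZURITE_CONNECTION_STRING) -> str:
    """Create the two containers if needed, then upload one PDF to 'input'."""
    # Imported here so this module still loads where the SDK is not installed.
    from azure.core.exceptions import ResourceExistsError
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string(connection_string)
    for name in CONTAINERS:
        try:
            service.create_container(name)
        except ResourceExistsError:
            pass  # The container is already there from an earlier run.

    blob_name = Path(pdf_path).name
    blob = service.get_blob_client(container="input", blob=blob_name)
    with open(pdf_path, "rb") as data:
        blob.upload_blob(data, overwrite=True)
    return blob_name
```

With the function app running, calling `prepare_and_upload("report.pdf")` (any local PDF path) should cause the blob trigger to fire for the uploaded file.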
Note: The summaries may be truncated because of the token limit set on the Azure OpenAI calls. This is intentional as a way to reduce costs.
This app leverages Durable Functions to orchestrate the application workflow. By using Durable Functions, there's no need for additional infrastructure like queues and state stores to manage task coordination and durability, which significantly reduces the complexity for developers.
The process_document orchestrator defines the entire workflow, which consists of a series of steps (activities) that need to be scheduled in sequence. Coordination is key, as the output of one activity is passed as an input to the next. Additionally, Durable Functions handles durability and retries, which ensures that if a failure occurs, such as a transient error or an issue with a dependent service, the workflow can recover gracefully.
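To make the chaining pattern concrete, here is a minimal sketch of what a `process_document` orchestrator body could look like. It is an illustration under stated assumptions rather than the sample's actual code: the activity names (`analyze_pdf`, `summarize_text`, `write_doc`) and the payload shape are hypothetical, and the function is written as a plain generator so it can be exercised without the Functions runtime. `FakeContext` and `run_locally` are stand-ins for local dry runs only; they do not reproduce the checkpointing and replay that the real runtime provides.

```python
from typing import Any, Callable, Dict, Generator


def process_document(context: Any, retry_options: Any = None) -> Generator[Any, Any, Any]:
    """Chain three activities; each yield hands control back to the runtime.

    In a real app this would be registered with the Durable Functions
    orchestration trigger decorator, and retry_options would be a
    df.RetryOptions instance (first retry interval, max attempts).
    """
    blob_name = context.get_input()
    # Step 1: extract text from the PDF (hypothetical activity name).
    text = yield context.call_activity_with_retry("analyze_pdf", retry_options, blob_name)
    # Step 2: summarize the extracted text (hypothetical activity name).
    summary = yield context.call_activity_with_retry("summarize_text", retry_options, text)
    # Step 3: write the summary to the output container (hypothetical name).
    output_name = yield context.call_activity_with_retry(
        "write_doc", retry_options, {"blobName": blob_name, "summary": summary}
    )
    return output_name


class FakeContext:
    """Stand-in for the orchestration context, for a local dry run only."""

    def __init__(self, blob_name: str) -> None:
        self._input = blob_name
        self.calls = []

    def get_input(self) -> str:
        return self._input

    def call_activity_with_retry(self, name: str, retry_options: Any, payload: Any):
        self.calls.append(name)
        return (name, payload)  # A simple token describing the scheduled task.


def run_locally(context: FakeContext, activities: Dict[str, Callable[[Any], Any]]) -> Any:
    """Drive the generator, feeding each activity's result into the next step."""
    orchestration = process_document(context)
    try:
        task = next(orchestration)
        while True:
            name, payload = task
            task = orchestration.send(activities[name](payload))
    except StopIteration as done:
        return done.value
```

Because each step's result comes back through `yield`, the runtime can persist it before scheduling the next activity, which is what lets a recovered orchestration skip the costly calls that already completed.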
Use the Azure Developer CLI (azd) to easily deploy the app.
- In the root of the project, run the following command to provision and deploy the app:
  azd up
- When prompted, provide:
  - A name for your Azure Developer CLI environment.
  - The Azure subscription you'd like to use.
  - The Azure location to use.
Once the azd up command finishes, the app will have been provisioned and deployed.
To use the app, upload a PDF to the Blob Storage input container. Once the PDF is transferred, it will be processed using Azure AI Document Intelligence (formerly Form Recognizer) and Azure OpenAI. The resulting summary will be saved to a new file and uploaded to the output container.