ai_search_with_adi/README.md
Lines changed: 5 additions & 1 deletion
@@ -38,10 +38,14 @@ The properties returned from the ADI Custom Skill are then used to perform the f
 
 ## Provided Notebooks & Utilities
 
-- `./ai_search.py`, `./deployment.py` provide an easy Python based utility for deploying an index, indexer and corresponding skillset for AI Search.
+- `./ai_search.py`, `./deploy.py` provide an easy Python based utility for deploying an index, indexer and corresponding skillset for AI Search.
 - `./function_apps/indexer` provides a pre-built Python function app that communicates with Azure Document Intelligence, Azure OpenAI etc to perform the Markdown conversion, extraction of figures, figure understanding and corresponding cleaning of Markdown.
 - `./rag_with_ai_search.ipynb` provides example of how to utilise the AI Search plugin to query the index.
 
+## Deploying AI Search Setup
+
+To deploy the pre-built index and associated indexer / skillset setup, see instructions in `./ai_search/README.md`.
+
 ## ADI Custom Skill
 
 Deploy the associated function app and required resources. You can then experiment with the custom skill by sending an HTTP request in the AI Search JSON format to the `/adi_2_ai_search` HTTP endpoint.
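For anyone trying the `/adi_2_ai_search` endpoint mentioned in the last context line, the following is a minimal sketch of the request envelope that AI Search custom Web API skills use (a `values` array whose items carry a `recordId` and a `data` object). The `source` input name and the example URL are assumptions made for illustration; the real input names are whatever the skillset in this repository defines.

```python
import json


def build_custom_skill_request(records):
    """Wrap per-document inputs in the AI Search custom skill envelope.

    Each record becomes one entry in `values`, with a string `recordId`
    so that responses can be matched back to their inputs.
    """
    return {
        "values": [
            {"recordId": str(index), "data": data}
            for index, data in enumerate(records)
        ]
    }


# `source` is a hypothetical input name, not taken from the repository.
payload = build_custom_skill_request(
    [{"source": "https://example.com/sample.pdf"}]
)
print(json.dumps(payload, indent=2))
```

The printed JSON can be sent as the body of a POST request to the deployed function app with any HTTP client; the response is expected to use the same `values` / `recordId` shape.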
+# AI Search Indexing with Azure Document Intelligence - Pre-built Index Setup
+
+The associated scripts in this portion of the repository contains pre-built scripts to deploy the skillset with Azure Document Intelligence.
+
+## Steps
+
+1. Update `.env` file with the associated values. Not all values are required dependent on whether you are using System / User Assigned Identities or a Key based authentication.
+2. Adjust `rag_documents.py` with any changes to the index / indexer. The `get_skills()` method implements the skills pipeline. Make any adjustments here in the skills needed to enrich the data source.
+3. Run `deploy.py` with the following args:
+
+- `indexer_type rag`. This selects the `rag_documents` sub class.
+- `enable_page_chunking True`. This determines whether page wise chunking is applied in ADI, or whether the inbuilt skill is used for TextSplit. **Page wise analysis in ADI is recommended to avoid splitting tables / figures across multiple chunks, when the chunking is performed.**
+- `rebuild`. Whether to delete and rebuild the index.
+- `suffix`. Optional parameter that will apply a suffix onto the deployed index and indexer. This is useful if you want deploy a test version, before overwriting the main version.
+
+## ai_search.py & environment.py
+
+This includes a variety of helper files and scripts to deploy the index setup. This is useful for CI/CD to avoid having to write JSON files manually or use the UI to deploy the pipeline.
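The four `deploy.py` arguments listed under step 3 could be parsed roughly as below. This is a sketch only: the `--flag value` spelling, the defaults, and the `str_to_bool` helper are assumptions, since the diff does not show how `deploy.py` actually defines its arguments. The helper is there because `argparse` with `type=bool` treats any non-empty string, including `"False"`, as true.

```python
import argparse


def str_to_bool(value: str) -> bool:
    """Parse 'True'/'False' style strings into real booleans."""
    normalised = value.strip().lower()
    if normalised in {"true", "1", "yes"}:
        return True
    if normalised in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the arguments named in the README."""
    parser = argparse.ArgumentParser(description="Deploy the AI Search setup")
    parser.add_argument("--indexer_type", required=True, choices=["rag"])
    parser.add_argument("--enable_page_chunking", type=str_to_bool, default=False)
    parser.add_argument("--rebuild", type=str_to_bool, default=False)
    parser.add_argument("--suffix", default=None)
    return parser


# Example: a test deployment with page wise chunking and a suffix.
args = build_parser().parse_args(
    ["--indexer_type", "rag", "--enable_page_chunking", "True", "--suffix", "test"]
)
print(args)
```

Leaving `--suffix` unset in this sketch corresponds to deploying over the main index and indexer names.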
+            # We send both image caption and the image body to GPTv for better understanding
+            if caption != "":
+                response = await client.chat.completions.create(
+                    model=deployment_name,
+                    messages=[
+                        {"role": "system", "content": "You are a helpful assistant."},
                         {
-                            "type": "image_base64",
-                            "image_base64": {"image": image_base64},
+                            "role": "user",
+                            "content": [
+                                {
+                                    "type": "text",
+                                    "text": f"Describe this image with technical analysis. Provide a well-structured, description. IMPORTANT: If the provided image is a logo or photograph, simply return 'Irrelevant Image'. (note: it has image caption: {caption}):",
+                                },
+                                {
+                                    "type": "image_base64",
+                                    "image_base64": {"image": image_base64},
+                                },
+                            ],
                         },
                     ],
-                },
-            ],
-            max_tokens=MAX_TOKENS,
-        )
+                    max_tokens=MAX_TOKENS,
+                )
 
-    else:
-        response = client.chat.completions.create(
-            model=deployment_name,
-            messages=[
-                {"role": "system", "content": "You are a helpful assistant."},
-                {
-                    "role": "user",
-                    "content": [
-                        {"type": "text", "text": "Describe this image:"},
+            else:
+                response = await client.chat.completions.create(
+                    model=deployment_name,
+                    messages=[
+                        {"role": "system", "content": "You are a helpful assistant."},
                         {
-                            "type": "image_base64",
-                            "image_base64": {"image": image_base64},
+                            "role": "user",
+                            "content": [
+                                {
+                                    "type": "text",
+                                    "text": "Describe this image with technical analysis. Provide a well-structured, description. IMPORTANT: If the provided image is a logo or photograph, simply return 'Irrelevant Image'.",
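The two branches in the Python hunk differ only in the prompt text, so one possible follow-up is to build the message list in a single helper. The sketch below assumes the truncated `else` branch ends with the same image part as the first branch; `build_messages` and `BASE_PROMPT` are hypothetical names, and the `image_base64` content part is copied from the diff as-is without checking it against the chat completions API.

```python
BASE_PROMPT = (
    "Describe this image with technical analysis. "
    "Provide a well-structured, description. "
    "IMPORTANT: If the provided image is a logo or photograph, "
    "simply return 'Irrelevant Image'."
)


def build_messages(image_base64: str, caption: str = "") -> list:
    """Build the chat messages, appending the caption note when one exists."""
    prompt = BASE_PROMPT
    if caption != "":
        prompt = f"{BASE_PROMPT} (note: it has image caption: {caption}):"
    return [
        {"role": "system", "content": "You are a helpful assistant."},
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                # Payload shape mirrors the diff; not verified against the API.
                {"type": "image_base64", "image_base64": {"image": image_base64}},
            ],
        },
    ]


with_caption = build_messages("aGVsbG8=", caption="Figure 1: pipeline overview")
without_caption = build_messages("aGVsbG8=")
```

With a helper like this, both branches would reduce to one `await client.chat.completions.create(...)` call that passes `messages=build_messages(image_base64, caption)`.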