From 7174d352c679471aa88813d43d00599e5ad2789a Mon Sep 17 00:00:00 2001
From: Amna Mubashar
Date: Tue, 10 Dec 2024 15:26:43 +0100
Subject: [PATCH] feat: add an integration page for Azure AI Search (#289)

* Add an integration file for azure ai search
---
 integrations/azure-ai-search.md |  85 ++++++++++++++++++++++++++++++++
 logos/azure-ai.png              | Bin 0 -> 8476 bytes
 2 files changed, 85 insertions(+)
 create mode 100644 integrations/azure-ai-search.md
 create mode 100644 logos/azure-ai.png

diff --git a/integrations/azure-ai-search.md b/integrations/azure-ai-search.md
new file mode 100644
index 00000000..ab51c87f
--- /dev/null
+++ b/integrations/azure-ai-search.md
@@ -0,0 +1,85 @@
+---
+layout: integration
+name: Azure AI Search
+description: Use Azure AI Search with Haystack
+authors:
+  - name: deepset
+    socials:
+      github: deepset-ai
+      twitter: deepset_ai
+      linkedin: https://www.linkedin.com/company/deepset-ai/
+pypi: https://pypi.org/project/azure-ai-search-haystack
+repo: https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/azure-ai-search
+type: Document Store
+report_issue: https://github.com/deepset-ai/haystack-core-integrations/issues
+logo: /logos/azure-ai.png
+version: Haystack 2.0
+toc: true
+---
+
+### **Table of Contents**
+- [Overview](#overview)
+- [Installation](#installation)
+- [Usage](#usage)
+
+## Overview
+
+`AzureAISearchDocumentStore` integrates [Azure AI Search](https://learn.microsoft.com/en-us/azure/search/search-what-is-azure-search), an enterprise-ready search and retrieval system, with [Haystack](https://haystack.deepset.ai/) by [deepset](https://www.deepset.ai).
+
+This integration lets you use search indexes in Azure AI Search as a document store to build RAG-based applications on Azure, with native LLM integrations. To retrieve data from the document store, the integration supports three retrieval techniques:
+
+1. **Embedding Retrieval**: Vector-based search.
+2. **BM25 Retrieval**: Keyword retrieval using the BM25 algorithm.
+3. **Hybrid Retrieval**: A combination of vector and BM25 retrieval.
+
+## Installation
+
+Install the Azure AI Search integration:
+
+```bash
+pip install "azure-ai-search-haystack"
+```
+
+## Usage
+
+To use the `AzureAISearchDocumentStore`, you need an active [Azure subscription](https://azure.microsoft.com/en-us/products/ai-services/ai-search) with a deployed Azure AI Search service. Provide the search service endpoint in the `AZURE_AI_SEARCH_ENDPOINT` environment variable and, for authentication, an API key in `AZURE_AI_SEARCH_API_KEY`. If no API key is provided, `DefaultAzureCredential` will attempt to authenticate you through the browser.
+
+```python
+from haystack_integrations.document_stores.azure_ai_search import AzureAISearchDocumentStore
+from haystack import Document
+
+document_store = AzureAISearchDocumentStore(
+    metadata_fields={"version": float, "label": str},
+    index_name="document-store-example",
+)
+
+documents = [
+    Document(
+        content="This is an introduction to using Python for data analysis.",
+        meta={"version": 1.0, "label": "chapter_one"},
+    ),
+    Document(
+        content="Learn how to use Python libraries for machine learning.",
+        meta={"version": 1.5, "label": "chapter_two"},
+    ),
+    Document(
+        content="Advanced Python techniques for data visualization.",
+        meta={"version": 2.0, "label": "chapter_three"},
+    ),
+]
+document_store.write_documents(documents)
+
+filters = {
+    "operator": "AND",
+    "conditions": [
+        {"field": "meta.version", "operator": ">", "value": 1.2},
+        {"field": "meta.label", "operator": "in", "value": ["chapter_one", "chapter_three"]},
+    ],
+}
+
+results = document_store.filter_documents(filters)
+print(results)
+```
+
+To customize index creation, pass any parameters supported by `SearchIndex` as `index_creation_kwargs` when initializing the `AzureAISearchDocumentStore`. Additionally, the `AzureAISearchDocumentStore` supports semantic ranking, which you can enable by including the `SemanticSearch` configuration in `index_creation_kwargs` during initialization and using it through one of the retrievers. For further details, refer to the [Azure AI tutorial](https://learn.microsoft.com/en-us/azure/search/search-get-started-semantic?tabs=dotnet) on this feature.
+
diff --git a/logos/azure-ai.png b/logos/azure-ai.png
new file mode 100644
index 0000000000000000000000000000000000000000..d9272608660326de3313a0376c7af2abe2634d36
Binary files /dev/null and b/logos/azure-ai.png differ