This project proposes an architecture and demo for building a multimodal RAG on AWS as a serverless application, deployed with the AWS CDK.
The code in the two projects below shows how to build a multimodal RAG:
- aws-bedrock-examples GitHub: multimodal-rag-pdf.ipynb
- aws-ai-ml-workshop-kr GitHub: 05_0_load_complex_pdf_kr_opensearch.ipynb
To make the multimodal RAG implemented in the notebooks above available to applications through event-driven invocation, this project migrates the code to a serverless environment. The code is split into one Lambda function per module, and the functions are orchestrated with AWS Step Functions; a minimal CDK sketch of this structure follows.
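As a rough illustration (not the repository's actual code), the CDK sketch below wires three Lambda functions into a Step Functions state machine, with a Map state fanning out the summarization work. All construct names, function names, and asset paths (`LoadDocumentFn`, `SummarizeFn`, `IndexFn`, `lambda/...`) are hypothetical.

```python
# Hypothetical CDK sketch: one Lambda per pipeline module, orchestrated by
# Step Functions. Names and asset paths are illustrative assumptions.
from aws_cdk import Duration, Stack
from aws_cdk import aws_lambda as _lambda
from aws_cdk import aws_stepfunctions as sfn
from aws_cdk import aws_stepfunctions_tasks as tasks
from constructs import Construct


class MultimodalRagStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        def make_fn(name: str) -> _lambda.Function:
            # Each pipeline module becomes its own Lambda function.
            return _lambda.Function(
                self, name,
                runtime=_lambda.Runtime.PYTHON_3_12,
                handler="index.handler",
                code=_lambda.Code.from_asset(f"lambda/{name.lower()}"),
                timeout=Duration.minutes(15),
            )

        load_task = tasks.LambdaInvoke(
            self, "LoadDocuments", lambda_function=make_fn("LoadDocumentFn"))
        summarize_task = tasks.LambdaInvoke(
            self, "SummarizeChunk", lambda_function=make_fn("SummarizeFn"))
        index_task = tasks.LambdaInvoke(
            self, "EmbedAndIndex", lambda_function=make_fn("IndexFn"))

        # The Map state processes each image/table chunk in its own Lambda
        # invocation, so no single invocation risks hitting the timeout.
        summarize_map = sfn.Map(
            self, "SummarizeMap",
            items_path="$.Payload.chunks",
            max_concurrency=10,
        )
        summarize_map.item_processor(summarize_task)

        definition = load_task.next(summarize_map).next(index_task)
        sfn.StateMachine(
            self, "MultimodalRagStateMachine",
            definition_body=sfn.DefinitionBody.from_chainable(definition),
        )
```

Keeping each module in its own Lambda lets each step scale and fail independently, while the Map state bounds how much work any single invocation performs.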
The Step Functions workflow builds a multimodal RAG in three steps:
- Load unstructured files using UnstructuredFileLoader (or S3FileLoader); see the first sketch after this list
- Summarize images and tables with Anthropic Claude 3 Sonnet, fanning each item out through a Step Functions Map state to avoid Lambda timeouts (second sketch below)
- Generate vector embeddings with the Amazon Titan Text Embeddings V2 model and index them in Amazon OpenSearch Serverless (third sketch below)
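The sketches below suggest what each step's Lambda handler might look like; they are illustrations under assumptions, not the repository's actual handlers. First, loading: a minimal handler that downloads a file from S3 and parses it with LangChain's `UnstructuredFileLoader` (`S3FileLoader` would combine the download and parse). The event shape (`bucket`, `key`) and output chunk layout are assumptions.

```python
# Hypothetical handler for the document-loading step. The event fields
# and the returned chunk layout are illustrative assumptions.
import boto3
from langchain_community.document_loaders import UnstructuredFileLoader

s3 = boto3.client("s3")


def handler(event, context):
    bucket, key = event["bucket"], event["key"]
    local_path = f"/tmp/{key.rsplit('/', 1)[-1]}"
    s3.download_file(bucket, key, local_path)

    # mode="elements" keeps tables and images as separate elements, so the
    # summarization step can treat them differently from plain text.
    loader = UnstructuredFileLoader(local_path, mode="elements")
    docs = loader.load()

    return {
        "chunks": [
            {"text": d.page_content, "category": d.metadata.get("category")}
            for d in docs
        ]
    }
```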
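Second, summarization: one Map-state iteration that summarizes a single image with Claude 3 Sonnet through the Bedrock Messages API. The input fields (`image_base64`, `media_type`) and the prompt are assumptions.

```python
# Hypothetical handler for one Map-state iteration: summarize a single
# image with Claude 3 Sonnet on Amazon Bedrock. The event shape is assumed.
import json

import boto3

bedrock = boto3.client("bedrock-runtime")
MODEL_ID = "anthropic.claude-3-sonnet-20240229-v1:0"


def handler(event, context):
    body = {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 1024,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "image", "source": {
                    "type": "base64",
                    # media_type/image_base64 are assumed event fields.
                    "media_type": event.get("media_type", "image/png"),
                    "data": event["image_base64"],
                }},
                {"type": "text",
                 "text": "Summarize this image for retrieval. Describe key "
                         "facts, figures, and any table contents."},
            ],
        }],
    }
    response = bedrock.invoke_model(modelId=MODEL_ID, body=json.dumps(body))
    result = json.loads(response["body"].read())
    return {"summary": result["content"][0]["text"]}
```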
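Third, embedding and indexing: each text chunk or summary is embedded with Titan Text Embeddings V2 and written to an OpenSearch Serverless collection. The environment variables, index name, and document fields are assumptions.

```python
# Hypothetical handler for the final step: embed chunks with Titan Text
# Embeddings V2 and index them into OpenSearch Serverless. Endpoint,
# index name, and event shape are illustrative assumptions.
import json
import os

import boto3
from opensearchpy import AWSV4SignerAuth, OpenSearch, RequestsHttpConnection

region = os.environ.get("AWS_REGION", "us-east-1")
bedrock = boto3.client("bedrock-runtime", region_name=region)

# OpenSearch Serverless uses SigV4 auth with the "aoss" service name.
auth = AWSV4SignerAuth(boto3.Session().get_credentials(), region, "aoss")
client = OpenSearch(
    hosts=[{"host": os.environ["COLLECTION_ENDPOINT"], "port": 443}],
    http_auth=auth,
    use_ssl=True,
    connection_class=RequestsHttpConnection,
)


def embed(text: str) -> list:
    response = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v2:0",
        body=json.dumps({"inputText": text, "dimensions": 1024}),
    )
    return json.loads(response["body"].read())["embedding"]


def handler(event, context):
    for chunk in event["chunks"]:
        client.index(
            index=os.environ.get("INDEX_NAME", "multimodal-rag"),
            body={"text": chunk["text"], "vector": embed(chunk["text"])},
        )
    return {"indexed": len(event["chunks"])}
```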
TBD
- Yoonseo Kim, AWS Associate Solutions Architect
- TBD