This repository provides sample code for the RAG (Retrieval Augmented Generation) pattern, relying on the Amazon Bedrock Titan Embeddings Generation 1 (G1) model to create text embeddings. The embeddings are stored in Amazon OpenSearch with vector engine support, where they assist with prompt engineering so that the LLM (Large Language Model) produces more accurate responses.
After we have successfully loaded the embeddings into OpenSearch, we will start querying our LLM using LangChain: for each question, we retrieve similar embeddings to build a more accurate prompt.
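The core of the retrieval step is nearest-neighbour search over embeddings. The following is a minimal, self-contained sketch of that idea using toy hand-written vectors and cosine similarity; in the actual sample, the vectors come from Titan Embeddings and the search is performed by OpenSearch's vector engine, not by this code.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "index": document text -> embedding. Real Titan G1 embeddings are
# high-dimensional; these 3-dimensional vectors are purely illustrative.
index = {
    "Paris is the capital of France.": [0.9, 0.1, 0.0],
    "The Eiffel Tower is in Paris.":   [0.8, 0.2, 0.1],
    "Bananas are yellow.":             [0.0, 0.1, 0.9],
}

def retrieve(query_embedding, k=2):
    """Return the k documents whose embeddings are most similar to the query."""
    ranked = sorted(
        index,
        key=lambda doc: cosine_similarity(index[doc], query_embedding),
        reverse=True,
    )
    return ranked[:k]

# A query embedding close to the "Paris" documents retrieves those, not the banana one.
context_docs = retrieve([0.85, 0.15, 0.05])
```

The retrieved documents are then injected into the prompt as context, which is what makes the LLM's answer more grounded.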
You can use the `--bedrock-model-id` parameter to seamlessly choose one of the available foundation models in Amazon Bedrock. It defaults to Anthropic Claude v2 and can be replaced with any other model from any other model provider, so you can pick your best-performing foundation model.
Anthropic:

- Claude v2

```shell
python ./ask-bedrock-with-rag.py --ask "How will AI change our every day life?"
```

- Claude v1.3

```shell
python ./ask-bedrock-with-rag.py --bedrock-model-id anthropic.claude-v1 --ask "How will AI change our every day life?"
```

- Claude Instant v1.2

```shell
python ./ask-bedrock-with-rag.py --bedrock-model-id anthropic.claude-instant-v1 --ask "How will AI change our every day life?"
```
AI21 Labs:

- Jurassic-2 Ultra

```shell
python ./ask-bedrock-with-rag.py --bedrock-model-id ai21.j2-ultra-v1 --ask "How will AI change our every day life?"
```

- Jurassic-2 Mid

```shell
python ./ask-bedrock-with-rag.py --bedrock-model-id ai21.j2-mid-v1 --ask "How will AI change our every day life?"
```
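As a sketch of how such a CLI defaulting works, here is a hypothetical `argparse` setup with `--ask` and `--bedrock-model-id` flags, where the model id defaults to `anthropic.claude-v2` as described above. The real `ask-bedrock-with-rag.py` may wire its arguments differently; this only illustrates the flag behaviour.

```python
import argparse

def build_parser():
    """Hypothetical argument parser mirroring the sample's documented flags."""
    parser = argparse.ArgumentParser(description="Ask Bedrock with RAG")
    parser.add_argument("--ask", required=True,
                        help="question to send to the model")
    parser.add_argument("--bedrock-model-id", default="anthropic.claude-v2",
                        help="any Bedrock foundation model id, e.g. ai21.j2-ultra-v1")
    return parser

# When --bedrock-model-id is omitted, Claude v2 is used.
args = build_parser().parse_args(["--ask", "How will AI change our every day life?"])
```

Note that `argparse` exposes `--bedrock-model-id` as the attribute `args.bedrock_model_id` (dashes become underscores).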
- This was tested on Python 3.11.4
- It is advised to work in a clean environment; use `virtualenv` or any other virtual environment manager.

```shell
pip install virtualenv
python -m virtualenv venv
source ./venv/bin/activate
```
- Install the requirements:

```shell
pip install -r requirements.txt
```
- Install Terraform to create the OpenSearch cluster:

```shell
brew tap hashicorp/tap
brew install hashicorp/tap/terraform
```
- Go to the Model Access page and enable the foundation models you want to use.
- In the first step we will launch an OpenSearch cluster using Terraform:

```shell
cd ./terraform
terraform init
terraform apply -auto-approve
```

This cluster configuration is for testing purposes only, as its endpoint is public to simplify the use of this sample code.
- Now that we have a running OpenSearch cluster with vector engine support, we will start uploading the data that will help us with prompt engineering. For this sample we use the gooaq_pairs data source from the Hugging Face embedding-training-data collection: we download it, invoke Titan Embeddings to get a text embedding for each record, and store the embeddings in OpenSearch for the next steps.

```shell
python load-data-to-opensearch.py --recreate 1 --early-stop 1
```
Optional arguments:

- `--recreate` to recreate the index in OpenSearch
- `--early-stop` to load only 100 embedded documents into OpenSearch
- `--index` to use a different index than the default `rag`
- `--region` in case you are not using the default `us-east-1`
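The `--early-stop` behaviour described above can be sketched as a simple truncation of the dataset before embedding. This is a hypothetical illustration (the limit of 100 matches the README's description; the real `load-data-to-opensearch.py` may implement it differently):

```python
# Assumed limit, matching the "100 embedded documents" described above.
EARLY_STOP_LIMIT = 100

def select_records(records, early_stop):
    """Truncate the dataset when the early-stop flag is set, else load everything."""
    return records[:EARLY_STOP_LIMIT] if early_stop else records

# Toy stand-in for the downloaded gooaq_pairs question/answer records.
sample = [("question %d" % i, "answer %d" % i) for i in range(250)]
to_embed = select_records(sample, early_stop=True)
```

Early stopping keeps the first run cheap: only the truncated slice is sent to Titan Embeddings and indexed into OpenSearch.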
- Now that the text embeddings are in our OpenSearch cluster, we can start querying our LLM in Amazon Bedrock with RAG:

```shell
python ask-bedrock-with-rag.py --ask "your question here"
```
Optional arguments:

- `--index` to use a different index than the default `rag`
- `--region` in case you are not using the default `us-east-1`
- `--bedrock-model-id` to choose a different model than Anthropic's Claude v2
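Under the hood, the RAG step amounts to stuffing the documents retrieved from OpenSearch into the prompt as context before the question. Here is a minimal sketch of that prompt assembly; the template wording and function name are illustrative, not the sample's actual prompt:

```python
def build_rag_prompt(question, context_docs):
    """Assemble an augmented prompt: retrieved context first, then the question."""
    context = "\n".join("- " + doc for doc in context_docs)
    return (
        "Use the following context to answer the question.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
        "Answer:"
    )

prompt = build_rag_prompt(
    "How will AI change our every day life?",
    ["AI assistants already draft emails.",
     "Voice interfaces are built into many home devices."],
)
```

Because the model sees the retrieved passages alongside the question, its answer can be grounded in your indexed data rather than only in its training data.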
```shell
cd ./terraform
terraform destroy # When prompted for confirmation, type yes and press enter.
```
See CONTRIBUTING for more information.
This library is licensed under the MIT-0 License. See the LICENSE file.