From 2d18f13a02a2a75b1122180d19e8b39310d3a617 Mon Sep 17 00:00:00 2001
From: Cindy Qi Li
Date: Tue, 7 May 2024 17:22:32 -0400
Subject: [PATCH 01/10] feat: add the script and the documentation for using RAG

---
 docs/RAG.md                | 59 +++++++++++++++++++++++++++
 jobs/RAG/data/user_doc.txt | 43 ++++++++++++++++++++
 jobs/RAG/rag.py            | 82 ++++++++++++++++++++++++++++++++++++++
 jobs/RAG/requirements.txt  |  5 +++
 4 files changed, 189 insertions(+)
 create mode 100644 docs/RAG.md
 create mode 100644 jobs/RAG/data/user_doc.txt
 create mode 100644 jobs/RAG/rag.py
 create mode 100644 jobs/RAG/requirements.txt

diff --git a/docs/RAG.md b/docs/RAG.md
new file mode 100644
index 0000000..9c9588a
--- /dev/null
+++ b/docs/RAG.md
@@ -0,0 +1,59 @@
+# Experiment with Retrieval-Augmented Generation (RAG)
+
+Retrieval-augmented generation (RAG) is a technique for enhancing the accuracy and reliability of
+generative AI models with facts fetched from external sources. This approach aims to address the
+limitations of traditional language models, which may generate responses based solely on their
+training data, potentially leading to factual errors or inconsistencies. Read
+[What Is Retrieval-Augmented Generation, aka RAG?](https://blogs.nvidia.com/blog/what-is-retrieval-augmented-generation/)
+for more information.
+
+In a co-design session with an AAC (Augmentative and Alternative Communication)) user, RAG can
+be particularly useful. When the user expressed a desire to invite "Roy nephew" to her birthday
+party, the ambiguity occurred as to whether "Roy" and "nephew" referred to the same person or
+different individuals. Traditional language models might interpret this statement inconsistently,
+sometimes treating "Roy" and "nephew" as the same person, and other times as separate persons.
+
+RAG addresses this issue by leveraging external knowledge sources, such as documents or databases
+containing relevant information about the user's family members and their relationships. By
+retrieving and incorporating this contextual information into the language model's input, RAG
+can disambiguate the user's intent and generate a more accurate response.
+
+The RAG experiments are located in the `jobs/RAG` directory. It contains these scripts:
+
+* `rag.py`: use RAG to address the "Roy nephew" issue described above.
+
+## Run Scripts Locally
+
+### Prerequisites
+
+* [Ollama](https://github.com/ollama/ollama) to run language models locally
+  * Follow [README](https://github.com/ollama/ollama?tab=readme-ov-file#customize-a-model) to
+    install and run Ollama on a local computer.
+* Download a sentence transformer model
+  * Select [a sentence transformer model](https://huggingface.co/sentence-transformers)
+    from Hugging Face. Download it to a local directory. Scripts in this experiment use the
+    [`all-MiniLM-L6-v2` model](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2).
+    Adjust the variable value of `sentence_transformer_dir` in scripts to point to the sentence
+    transformer model directory.
+* If you are currently in an activated virtual environment, deactivate it.
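+* Pull the language model used by the scripts
+  * `rag.py` instantiates `ChatOllama(model="llama3")`, so make the `llama3`
+    model available to Ollama before running the script, e.g. `ollama pull llama3`.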
+ +### Create/Activitate Virtual Environment +* Go to the RAG scripts directory + - `cd jobs/RAG` + +* [Create the virtual environment](https://docs.python.org/3/library/venv.html) + (one time setup): + - `python -m venv .venv` + +* Activate (every command-line session): + - Windows: `.\.venv\Scripts\activate` + - Mac/Linux: `source .venv/bin/activate` + +* Install Python Dependencies (Only run once for the installation) + - `pip install -r requirements.txt` + +### Run Scripts +* Run `rag.py` + - `python rag.py` + - The last two responses in the exectution result shows the language model's output + with and without the use of RAG. diff --git a/jobs/RAG/data/user_doc.txt b/jobs/RAG/data/user_doc.txt new file mode 100644 index 0000000..0ecae99 --- /dev/null +++ b/jobs/RAG/data/user_doc.txt @@ -0,0 +1,43 @@ +My life begins with Bliss. Jane Green introduced Bliss to me. + Mrs. Green was the principle at Virginia Waters School + in St. John’s Newfoundland. She came from England. +Mrs. Green was in the classroom when I entered the room. She taught spelling and math. I was 12 years old when I started at Virginia Waters School. Mrs. Green had two students who couldn’t communicate one of them was me. She had the opportunity to go Toronto and learned about Bliss she came back and showed us about Bliss. She showed us how Bliss worked, I kind of liked and we watch the film about Mr. Bliss. God I must have saw it 100 times. + I grew up with Bliss and I still use it. +Bliss opened many doors. I could communicate my needs through Bliss. + +Jane was someone special to me, she came in my life through out the years. I loved car trips with Jane many of them to McDonalds. + Now those times V.O.C.A. (Voice Output Communication Aids) wasn’t in my world. Jane always took the time to stop and talked. I was out of Mount Pearl by then Jane knew I didn’t like it. +I learn Mrs. Jane Green die + +Explain exon house When all this happened I was living in a institution called Exon House. There nurses and counselors who help with school stuff. I had a close friend named Terry who took me to xhurch and her house. This also when I met my close friend Alen. I remember he had long hair and a piece of rope tied around his wrist.I started using Bliss with Alen. We could talk more than we did before. I remember I had the word ‘pollution’ in my Bliss Book. I didn’t know what that word meant so Alen taught it to me. Alen worked at the Exon House. But before he even came to work there and before I even had heard about Bliss, Alen used to come plan summer day trips for us at Exon House. We would play games with bean bags and go swimming. There was also a memory game, of course I always won. Nobody could beat me. I think Alen is a social worker now. Sometimes I wonder where Aken and the gang are now. + +---------------------------------------------------------------- + +In 1977 Kathy and Dave took me out of Exon House, which was an institution. They were a couple from Nova Scotia. They were wait on someone when I got home.affter me They thought about getting someone else. They hired a girl named Sue to relieve them one weekend a month and one night a week. Sue was great. She was also from Nova Scotia. I always joked about them being from the Mainland, because I was from Newfoundland Those were good times We had a little puppy named Buddy. Kathy always hit the dog with the newspaper. More girls came to work with us. When sue had leave and then Kathy got pregnant. 
+After awhile they were thinking about get another girl + +Kathy and I went on a trip to Winnipeg. After I graduated from school, Kathy and I went to Winnipeg. There was a conference. There were many people who used augmentative communication. I met some great people there. There was a sweet little girl, Her name was Louise. I went over to introduce myself to her. She used a wheelchair, She was a very intelligent girl. So We had a big dinner. They were filming us for a documentary. It was on Sue’s wedding. I don’t really know why they were filming her wedding, my guess is that it was because she was a Bliss user and Bliss communication was a huge deal at the time. Sue used her foot to communicate. Sue also worked hard. +Louise’s friend named her “blabber mouth” because she was always talking. Her friend was either her mother or her teacher. + +The building that we stayed in had all kinds of flags. It was white. It had stairs. It was beautiful. There were partitions in the lobby with information about augmentative communication on them for people to look at and find out about on their own. We had barrels of fun. Those lovely times We had dinner I had tto explane what Bliss was to everyone that I met We also took a part in a movie They were a couple with cp. Her name wa sue O’dell. + +Taught bliss to dave and Kathy, even made a bliss cook book with Kathy. I stayed with Kathy and Dave up until when I was 19 and had finished high school school. + +-----------------after high school went to OCCC for two weeks + +Bloorview. + +The first time I saw Shirley was through Plexiglass. She was pretty. I said, “That couldn’t be Shirley.” Sure enough it was thee one and only Shirley. + +When I was nineteen years old I went to the Ontario Crippled Children’s Centre for one week to be assesed. I thought Shirley was working with me but I got Lynette. +For a few days I saw a social worker and this lady. So Lynette introduced me to herself and I liked her. + +When my week was up I went home. In September I went back for school. Lynnette was my teacher and +I had a teacher named Marnie who taught me math and reading. We had some fun. I really liked her. Lynnette taught me math and how to add and subtract using money. I was staying at the OCCC. I really enjoyed it there. All the girls shared one big room. Every night we ate toast and tea. We had a Halloween party and I won a big teddy bear. +The social worker was Bob Masan, We got to know each other. Bob made me feel comfortable. we laughed. Bob was a sweet person. +A nurse named Vicky. We got along well, When I had a question I always went to her. + + +I learned life skills for three months, cooking, washing clothes. Then I went back home ST.John’s. + +I have a nephew whose name is Roy. I also have a niece. diff --git a/jobs/RAG/rag.py b/jobs/RAG/rag.py new file mode 100644 index 0000000..ddf7a55 --- /dev/null +++ b/jobs/RAG/rag.py @@ -0,0 +1,82 @@ +# Copyright (c) 2023-2024, Inclusive Design Institute +# +# Licensed under the BSD 3-Clause License. You may not use this file except +# in compliance with this License. 
+# +# You may obtain a copy of the BSD 3-Clause License at +# https://github.com/inclusive-design/baby-bliss-bot/blob/main/LICENSE + +import os +from langchain_community.document_loaders import TextLoader +from langchain_text_splitters import CharacterTextSplitter +from langchain_community.vectorstores import FAISS +from langchain_community.embeddings import HuggingFaceEmbeddings +from langchain_community.chat_models import ChatOllama +from langchain_core.output_parsers import StrOutputParser +from langchain_core.prompts import ChatPromptTemplate +from operator import itemgetter + +# The location of the user document +user_doc = "./data/user_doc.txt" + +# The location of the sentence transformer model in the local directory +sentence_transformer_dir = os.path.expanduser("~") + "/Development/LLMs/all-MiniLM-L6-v2" + +loader = TextLoader(user_doc) +documents = loader.load() +# print(f"Loaded documents (first 2 rows):\n{documents[:2]}\n\n") + +text_splitter = CharacterTextSplitter(chunk_size=200, chunk_overlap=0) +splitted_docs = text_splitter.split_documents(documents) +# print(f"Splitted documents (first 2 rows):\n{splitted_docs[:2]}\n\n") + +# Instantiate the embedding class +embedding_func = HuggingFaceEmbeddings(model_name=sentence_transformer_dir) + +# Load into the vector database +vectordb = FAISS.from_documents(splitted_docs, embedding_func) + +# Create a vector store retriever +retriever = vectordb.as_retriever() + +# query the vector db to test +queries = [ + "Roy nephew", + "my schools"] + +for query in queries: + results = retriever.invoke(query) + print(f"====== Test: Similarity search for \"{query}\" ======\n{results[0].page_content}\n\n") + +# Create prompt template +prompt_template_with_context = """ +### [INST] Help to convert Elaine's telegraphic input in the conversation to full sentences in first-person. Only respond with the converted full sentences. Here is context to help: + +{context} + +### Conversation: +{chat} [/INST] + """ + +llm = ChatOllama(model="llama3", system="Elaine is an AAC user who expresses herself telegraphically. She is now in a meeting with Jutta. Below is the conversation in the meeting. Please help to convert what Elaine said to first-person sentences. Only respond with converted sentences.") +prompt = ChatPromptTemplate.from_template(prompt_template_with_context) + +elain_reply = "Roy nephew" +full_chat = f"Jutta: Elain, who would you like to invite to your birthday party?\n Elaine: {elain_reply}." 
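+
+# The chain below is invoked twice so the two outputs can be compared: once
+# with an empty "context" (no retrieval), and once with the documents that the
+# retriever returns for Elaine's telegraphic reply.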
+ +# using LangChain Expressive Language (LCEL) chain syntax +chain = prompt | llm | StrOutputParser() + +print(f"====== Response without RAG ======") + +print(chain.invoke({ + "context": "", + "chat": full_chat +}) + "\n") + +print(f"====== Response with RAG ======") + +print(chain.invoke({ + "context": retriever.invoke(elain_reply), + "chat": full_chat +}) + "\n") diff --git a/jobs/RAG/requirements.txt b/jobs/RAG/requirements.txt new file mode 100644 index 0000000..4330d46 --- /dev/null +++ b/jobs/RAG/requirements.txt @@ -0,0 +1,5 @@ +langchain_community +langchain_core +langchain_text_splitters +sentence_transformers +faiss-cpu From e0bcea52f2b9101b16911cb120cb68f1b22116e6 Mon Sep 17 00:00:00 2001 From: Cindy Qi Li Date: Tue, 7 May 2024 17:25:20 -0400 Subject: [PATCH 02/10] fix: linted --- jobs/RAG/rag.py | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/jobs/RAG/rag.py b/jobs/RAG/rag.py index ddf7a55..bc33845 100644 --- a/jobs/RAG/rag.py +++ b/jobs/RAG/rag.py @@ -14,7 +14,6 @@ from langchain_community.chat_models import ChatOllama from langchain_core.output_parsers import StrOutputParser from langchain_core.prompts import ChatPromptTemplate -from operator import itemgetter # The location of the user document user_doc = "./data/user_doc.txt" @@ -67,14 +66,14 @@ # using LangChain Expressive Language (LCEL) chain syntax chain = prompt | llm | StrOutputParser() -print(f"====== Response without RAG ======") +print("====== Response without RAG ======") print(chain.invoke({ "context": "", "chat": full_chat }) + "\n") -print(f"====== Response with RAG ======") +print("====== Response with RAG ======") print(chain.invoke({ "context": retriever.invoke(elain_reply), From ead77be868e4258be431f7318b78728dd2291a02 Mon Sep 17 00:00:00 2001 From: Cindy Qi Li Date: Wed, 8 May 2024 14:55:25 -0400 Subject: [PATCH 03/10] fix: clean user_doc.txt and the term for similarity search in the vector db --- jobs/RAG/data/user_doc.txt | 8 ++------ jobs/RAG/rag.py | 2 +- 2 files changed, 3 insertions(+), 7 deletions(-) diff --git a/jobs/RAG/data/user_doc.txt b/jobs/RAG/data/user_doc.txt index 0ecae99..fdc8338 100644 --- a/jobs/RAG/data/user_doc.txt +++ b/jobs/RAG/data/user_doc.txt @@ -9,9 +9,7 @@ Jane was someone special to me, she came in my life through out the years. I lov Now those times V.O.C.A. (Voice Output Communication Aids) wasn’t in my world. Jane always took the time to stop and talked. I was out of Mount Pearl by then Jane knew I didn’t like it. I learn Mrs. Jane Green die -Explain exon house When all this happened I was living in a institution called Exon House. There nurses and counselors who help with school stuff. I had a close friend named Terry who took me to xhurch and her house. This also when I met my close friend Alen. I remember he had long hair and a piece of rope tied around his wrist.I started using Bliss with Alen. We could talk more than we did before. I remember I had the word ‘pollution’ in my Bliss Book. I didn’t know what that word meant so Alen taught it to me. Alen worked at the Exon House. But before he even came to work there and before I even had heard about Bliss, Alen used to come plan summer day trips for us at Exon House. We would play games with bean bags and go swimming. There was also a memory game, of course I always won. Nobody could beat me. I think Alen is a social worker now. Sometimes I wonder where Aken and the gang are now. 
- ----------------------------------------------------------------- +Explain exon house When all this happened I was living in a institution called Exon House. There nurses and counselors who help with school stuff. I had a close friend named Terry who took me to xhurch and her house. This also when I met my close friend Alen. I remember he had long hair and a piece of rope tied around his wrist.I started using Bliss with Alen. We could talk more than we did before. I remember I had the word ‘pollution’ in my Bliss Book. I didn’t know what that word meant so Alen taught it to me. Alen worked at the Exon House. But before he even came to work there and before I even had heard about Bliss, Alen used to come plan summer day trips for us at Exon House. We would play games with bean bags and go swimming. There was also a memory game, of course I always won. Nobody could beat me. I think Alen is a social worker now. Sometimes I wonder where Aken and the gang are now. In 1977 Kathy and Dave took me out of Exon House, which was an institution. They were a couple from Nova Scotia. They were wait on someone when I got home.affter me They thought about getting someone else. They hired a girl named Sue to relieve them one weekend a month and one night a week. Sue was great. She was also from Nova Scotia. I always joked about them being from the Mainland, because I was from Newfoundland Those were good times We had a little puppy named Buddy. Kathy always hit the dog with the newspaper. More girls came to work with us. When sue had leave and then Kathy got pregnant. After awhile they were thinking about get another girl @@ -23,9 +21,7 @@ The building that we stayed in had all kinds of flags. It was white. It had Taught bliss to dave and Kathy, even made a bliss cook book with Kathy. I stayed with Kathy and Dave up until when I was 19 and had finished high school school. ------------------after high school went to OCCC for two weeks - -Bloorview. +after high school went to OCCC for two weeks in Bloorview. The first time I saw Shirley was through Plexiglass. She was pretty. I said, “That couldn’t be Shirley.” Sure enough it was thee one and only Shirley. diff --git a/jobs/RAG/rag.py b/jobs/RAG/rag.py index bc33845..5e3c2c3 100644 --- a/jobs/RAG/rag.py +++ b/jobs/RAG/rag.py @@ -41,7 +41,7 @@ # query the vector db to test queries = [ "Roy nephew", - "my schools"] + "high school"] for query in queries: results = retriever.invoke(query) From 50758d24c256259762a467092d91d5c70b672f52 Mon Sep 17 00:00:00 2001 From: Cindy Qi Li Date: Fri, 21 Jun 2024 14:36:04 -0400 Subject: [PATCH 04/10] feat: explore methods for retaining the chat history --- README.md | 35 +++++++-- docs/RAG.md | 6 +- docs/ReflectChatHistory.md | 77 +++++++++++++++++++ jobs/RAG/chat_history_with_prompt.py | 72 ++++++++++++++++++ jobs/RAG/chat_history_with_summary.py | 103 ++++++++++++++++++++++++++ 5 files changed, 286 insertions(+), 7 deletions(-) create mode 100644 docs/ReflectChatHistory.md create mode 100644 jobs/RAG/chat_history_with_prompt.py create mode 100644 jobs/RAG/chat_history_with_summary.py diff --git a/README.md b/README.md index 259577f..7703d44 100644 --- a/README.md +++ b/README.md @@ -58,31 +58,55 @@ with generating new Bliss symbols etc. ### Llama2 -Conclusion: useful +**Conclusion**: useful See the [Llama2FineTuning.md](./docs/Llama2FineTuning.md) in the [documentation](./docs) folder for details on how to fine tune, evaluation results and the conclusion about how useful it is. 
 ### StyleGAN3
 
-Conclusion: not useful
+**Conclusion**: not useful
 
 See the [TrainStyleGAN3Model.md](./docs/TrainStyleGAN3Model.md) in the [documentation](./docs) folder
 for details on how to train this model, training results and the conclusion about how useful it is.
 
 ### StyleGAN2-ADA
 
-Conclusion: shows promise
+**Conclusion**: shows promise
 
 See the [StyleGAN2-ADATraining.md](./docs/StyleGAN2-ADATraining.md) in the [documentation](./docs) folder
 for details on how to train this model and training results.
 
 ### Texture Inversion
 
-Conclusion: not useful
+**Conclusion**: not useful
 
 See the [Texture Inversion documentation](./notebooks/README.md) for details.
 
+## Preserving Information
+
+### RAG (Retrieval-augmented generation)
+
+**Conclusion**: useful
+
+The RAG (Retrieval-augmented generation) technique is explored to resolve ambiguities by retrieving relevant contextual
+information from external sources, enabling the language model to generate more accurate and reliable responses.
+
+See [RAG.md](./docs/RAG.md) for more details.
+
+### Reflection over Chat History
+
+**Conclusion**: useful
+
+When users have a back-and-forth conversation, the application requires a form of "memory" to retain and incorporate past interactions into its current processing. Two methods are explored to achieve this:
+
+1. Summarizing the chat history and providing it as contextual input.
+2. Using prompt engineering to instruct the language model to consider the past conversation.
+
+The second method, prompt engineering, yields more desirable responses than summarizing the chat history.
+
+See [ReflectChatHistory.md](./docs/ReflectChatHistory.md) for more details.
+
 ## Notebooks
 
 [`/notebooks`](./notebooks/) directory contains all notebooks used for training or fine-tuning various models.
@@ -90,7 +114,8 @@ Each notebook usually comes with a accompanying `dockerfile.yml` to elaborate th
 running in.
 
 ## Jobs
-[`/jobs`](./jobs/) directory contains all jobs used for training or fine-tuning various models.
+[`/jobs`](./jobs/) directory contains all jobs and scripts used for training or fine-tuning various models, as well
+as other explorations with RAG (Retrieval-augmented generation) and preserving chat history.
 
 ## Utility Scripts
 
diff --git a/docs/RAG.md b/docs/RAG.md
index 9c9588a..54ff500 100644
--- a/docs/RAG.md
+++ b/docs/RAG.md
@@ -7,7 +7,7 @@ training data, potentially leading to factual errors or inconsistencies. Read
 [What Is Retrieval-Augmented Generation, aka RAG?](https://blogs.nvidia.com/blog/what-is-retrieval-augmented-generation/)
 for more information.
 
-In a co-design session with an AAC (Augmentative and Alternative Communication)) user, RAG can
+In a co-design session with an AAC (Augmentative and Alternative Communication) user, RAG can
 be particularly useful. When the user expressed a desire to invite "Roy nephew" to her birthday
 party, the ambiguity occurred as to whether "Roy" and "nephew" referred to the same person or
 different individuals. Traditional language models might interpret this statement inconsistently,
@@ -18,8 +18,10 @@ containing relevant information about the user's family members and their relati
 retrieving and incorporating this contextual information into the language model's input, RAG
 can disambiguate the user's intent and generate a more accurate response.
 
-The RAG experiments are located in the `jobs/RAG` directory. It contains these scripts:
+The RAG experiment is located in the `jobs/RAG` directory. It contains these scripts:
 
+* `requirements.txt`: contains python dependencies for setting up the environment to run
+the python script.
 * `rag.py`: use RAG to address the "Roy nephew" issue described above.
 
 ## Run Scripts Locally
diff --git a/docs/ReflectChatHistory.md b/docs/ReflectChatHistory.md
new file mode 100644
index 0000000..b960944
--- /dev/null
+++ b/docs/ReflectChatHistory.md
@@ -0,0 +1,77 @@
+# Reflection over Chat History
+
+When users have a back-and-forth conversation, the application requires a form of "memory" to retain and incorporate
+past interactions into its current processing. Two methods are explored to achieve this:
+
+1. Summarizing the chat history and providing it as contextual input.
+2. Using prompt engineering to instruct the language model to consider the past conversation.
+
+The second method, prompt engineering, yields more desirable responses than summarizing the chat history.
+
+The scripts for this experiment are located in the `jobs/RAG` directory.
+
+## Method 1: Summarizing the Chat History
+
+### Steps
+
+1. Summarize the past conversation and include it in the prompt as contextual information.
+2. Include a specified number of the most recent conversation exchanges in the prompt for additional context.
+3. Instruct the language model to convert the telegraphic replies from the AAC user into full sentences to continue
+the conversation.
+
+### Result
+
+The conversion process struggles to effectively utilize the provided summary, often resulting in inaccurate full
+sentences.
+
+### Scripts
+
+* `requirements.txt`: Lists the Python dependencies needed to set up the environment.
+* `chat_history_with_summary.py`: Implements the steps described above and displays the output.
+
+## Method 2: Using Prompt Engineering
+
+### Steps
+
+1. Include the past conversation in the prompt as contextual information.
+2. Instruct the language model to reference this context when converting the telegraphic replies from the AAC user
+into full sentences to continue the conversation.
+
+### Result
+
+The converted sentences are more accurate and appropriate compared to those generated using Method 1.
+
+### Scripts
+
+* `requirements.txt`: Lists the Python dependencies needed to set up the environment.
+* `chat_history_with_prompt.py`: Implements the steps described above and displays the output.
+
+## Run Scripts Locally
+
+### Prerequisites
+
+* [Ollama](https://github.com/ollama/ollama) to run language models locally
+  * Follow [README](https://github.com/ollama/ollama?tab=readme-ov-file#customize-a-model) to
+    install and run Ollama on a local computer.
+* If you are currently in an activated virtual environment, deactivate it.
+
+### Create/Activitate Virtual Environment
+* Go to the RAG scripts directory
+  - `cd jobs/RAG`
+
+* [Create the virtual environment](https://docs.python.org/3/library/venv.html)
+  (one time setup):
+  - `python -m venv .venv`
+
+* Activate (every command-line session):
+  - Windows: `.\.venv\Scripts\activate`
+  - Mac/Linux: `source .venv/bin/activate`
+
+* Install Python Dependencies (Only run once for the installation)
+  - `pip install -r requirements.txt`
+
+### Run Scripts
+* Run `chat_history_with_summary.py` or `chat_history_with_prompt.py`
+  - `python chat_history_with_summary.py` or `python chat_history_with_prompt.py`
+  - The last two responses in the exectution result shows the language model's output
+    with and without the contextual information.
diff --git a/jobs/RAG/chat_history_with_prompt.py b/jobs/RAG/chat_history_with_prompt.py
new file mode 100644
index 0000000..cd01e26
--- /dev/null
+++ b/jobs/RAG/chat_history_with_prompt.py
@@ -0,0 +1,72 @@
+# Copyright (c) 2024, Inclusive Design Institute
+#
+# Licensed under the BSD 3-Clause License. You may not use this file except
+# in compliance with this License.
+#
+# You may obtain a copy of the BSD 3-Clause License at
+# https://github.com/inclusive-design/baby-bliss-bot/blob/main/LICENSE
+
+from langchain_community.chat_models import ChatOllama
+from langchain_core.output_parsers import StrOutputParser
+from langchain_core.prompts import ChatPromptTemplate
+
+# from langchain_core.globals import set_debug
+# set_debug(True)
+
+# Define the Ollama model to use
+model = "llama3"
+
+# Telegraphic reply to be translated
+message_to_convert = "she love cooking like share recipes"
+
+# Conversation history
+chat_history = [
+    "John: Have you heard about the new Italian restaurant downtown?",
+    "Elain: Yes, I did! Sarah mentioned it to me yesterday. She said the pasta there is amazing.",
+    "John: I was thinking of going there this weekend. Want to join?",
+    "Elain: That sounds great! Maybe we can invite Sarah too.",
+    "John: Good idea. By the way, did you catch the latest episode of that mystery series we were discussing last week?",
+    "Elain: Oh, the one with the detective in New York? Yes, I watched it last night. It was so intense!",
+    "John: I know, right? I didn't expect that plot twist at the end. Do you think Sarah has seen it yet?",
+    "Elain: I'm not sure. She was pretty busy with work the last time we talked. We should ask her when we see her at the restaurant.",
+    "John: Definitely. Speaking of Sarah, did she tell you about her trip to Italy next month?",
+    "Elain: Yes, she did. She's so excited about it! She's planning to visit a lot of historical sites.",
+    "John: I bet she'll have a great time. Maybe she can bring back some authentic Italian recipes for us to try.",
+]
+
+# Instantiate the chat model
+llm = ChatOllama(model=model)
+
+# Create prompt template
+prompt_template_with_context = """
+Elaine prefers to talk using telegraphic messages.
+Given a chat history and Elain's latest response which
+might reference context in the chat history, convert
+Elain's response to full sentences. Only respond with
+converted full sentences.
+
+Chat history:
+{chat_history}
+
+Elaine's response:
+{message_to_convert}
+"""
+
+prompt = ChatPromptTemplate.from_template(prompt_template_with_context)
+
+# using LangChain Expressive Language (LCEL) chain syntax
+chain = prompt | llm | StrOutputParser()
+
+print("====== Response without chat history ======")
+
+print(chain.invoke({
+    "chat_history": "",
+    "message_to_convert": message_to_convert
+}) + "\n")
+
+print("====== Response with chat history ======")
+
+print(chain.invoke({
+    "chat_history": "\n".join(chat_history),
+    "message_to_convert": message_to_convert
+}) + "\n")
diff --git a/jobs/RAG/chat_history_with_summary.py b/jobs/RAG/chat_history_with_summary.py
new file mode 100644
index 0000000..0c1cfdd
--- /dev/null
+++ b/jobs/RAG/chat_history_with_summary.py
@@ -0,0 +1,103 @@
+# Copyright (c) 2024, Inclusive Design Institute
+#
+# Licensed under the BSD 3-Clause License. You may not use this file except
+# in compliance with this License.
+#
+# You may obtain a copy of the BSD 3-Clause License at
+# https://github.com/inclusive-design/baby-bliss-bot/blob/main/LICENSE
+
+from langchain_community.chat_models import ChatOllama
+from langchain_core.output_parsers import StrOutputParser
+from langchain_core.prompts import ChatPromptTemplate
+
+# from langchain_core.globals import set_debug
+# set_debug(True)
+
+# Define the Ollama model to use
+model = "llama3"
+
+# Define the number of the most recent chats to be included verbatim in the prompt.
+# The chats before the most recent ones are summarized and passed in as another context element.
+num_of_recent_chat = 1
+
+# Telegraphic reply to be translated
+message_to_convert = "she love cooking like share recipes"
+
+# Chat history
+chat_history = [
+    "John: Have you heard about the new Italian restaurant downtown?",
+    "Elain: Yes, I did! Sarah mentioned it to me yesterday. She said the pasta there is amazing.",
+    "John: I was thinking of going there this weekend. Want to join?",
+    "Elain: That sounds great! Maybe we can invite Sarah too.",
+    "John: Good idea. By the way, did you catch the latest episode of that mystery series we were discussing last week?",
+    "Elain: Oh, the one with the detective in New York? Yes, I watched it last night. It was so intense!",
+    "John: I know, right? I didn't expect that plot twist at the end. Do you think Sarah has seen it yet?",
+    "Elain: I'm not sure. She was pretty busy with work the last time we talked. We should ask her when we see her at the restaurant.",
+    "John: Definitely. Speaking of Sarah, did she tell you about her trip to Italy next month?",
+    "Elain: Yes, she did. She's so excited about it! She's planning to visit a lot of historical sites.",
+    "John: I bet she'll have a great time. Maybe she can bring back some authentic Italian recipes for us to try.",
+]
+recent_chat_array = []
+earlier_chat_array = []
+
+# 1. Instantiate the chat model and split the chat history
+llm = ChatOllama(model=model)
+
+if (len(chat_history) > num_of_recent_chat):
+    recent_chat_array = chat_history[-num_of_recent_chat:]
+    earlier_chat_array = chat_history[:-num_of_recent_chat]
+else:
+    recent_chat_array = chat_history
+    earlier_chat_array = []
+
+# 2. Summarize earlier chat
+# Initialize to an empty string so later references work when there is no earlier chat to summarize
+summary = ""
+if (len(earlier_chat_array) > 0):
+    summarizer_prompt = ChatPromptTemplate.from_template("Summarize the following chat history. Provide only the summary, without any additional comments or context. \nChat history: {chat_history}")
+    chain = summarizer_prompt | llm | StrOutputParser()
+    summary = chain.invoke({
+        "chat_history": "\n".join(earlier_chat_array)
+    })
+print("====== Summary ======")
+print(f"{summary}\n")
+
+# 3. Concatenate recent chat into a string
+recent_chat_string = "\n".join(recent_chat_array)
+print("====== Recent Chat ======")
+print(f"{recent_chat_string}\n")
+
+# Create prompt template
+prompt_template_with_context = """
+### Elaine prefers to talk using telegraphic messages. Help to convert Elaine's reply to a chat into full sentences in first-person. Only respond with the converted full sentences.
+
+### This is the chat summary:
+
+{summary}
+
+### This is the most recent chat between Elaine and others:
+
+{recent_chat}
+
+### This is Elaine's most recent response to continue the chat.
Please convert: +{message_to_convert} +""" + +prompt = ChatPromptTemplate.from_template(prompt_template_with_context) + +# using LangChain Expressive Language (LCEL) chain syntax +chain = prompt | llm | StrOutputParser() + +print("====== Response without chat history ======") + +print(chain.invoke({ + "summary": "", + "recent_chat": recent_chat_string, + "message_to_convert": message_to_convert +}) + "\n") + +print("====== Response with chat history ======") + +print(chain.invoke({ + "summary": summary, + "recent_chat": recent_chat_string, + "message_to_convert": message_to_convert +}) + "\n") From 15239b402c8fb4d80cb8ab7fc9ef3e3b8f9b1cd2 Mon Sep 17 00:00:00 2001 From: Cindy Qi Li Date: Tue, 25 Jun 2024 16:01:12 -0400 Subject: [PATCH 05/10] fix: fix the mis-spelling on Elaine's name --- jobs/RAG/chat_history_with_prompt.py | 14 +++++++------- jobs/RAG/chat_history_with_summary.py | 10 +++++----- jobs/RAG/rag.py | 6 +++--- 3 files changed, 15 insertions(+), 15 deletions(-) diff --git a/jobs/RAG/chat_history_with_prompt.py b/jobs/RAG/chat_history_with_prompt.py index cd01e26..58bab9d 100644 --- a/jobs/RAG/chat_history_with_prompt.py +++ b/jobs/RAG/chat_history_with_prompt.py @@ -22,15 +22,15 @@ # Conversation history chat_history = [ "John: Have you heard about the new Italian restaurant downtown?", - "Elain: Yes, I did! Sarah mentioned it to me yesterday. She said the pasta there is amazing.", + "Elaine: Yes, I did! Sarah mentioned it to me yesterday. She said the pasta there is amazing.", "John: I was thinking of going there this weekend. Want to join?", - "Elain: That sounds great! Maybe we can invite Sarah too.", + "Elaine: That sounds great! Maybe we can invite Sarah too.", "John: Good idea. By the way, did you catch the latest episode of that mystery series we were discussing last week?", - "Elain: Oh, the one with the detective in New York? Yes, I watched it last night. It was so intense!", + "Elaine: Oh, the one with the detective in New York? Yes, I watched it last night. It was so intense!", "John: I know, right? I didn't expect that plot twist at the end. Do you think Sarah has seen it yet?", - "Elain: I'm not sure. She was pretty busy with work the last time we talked. We should ask her when we see her at the restaurant.", + "Elaine: I'm not sure. She was pretty busy with work the last time we talked. We should ask her when we see her at the restaurant.", "John: Definitely. Speaking of Sarah, did she tell you about her trip to Italy next month?", - "Elain: Yes, she did. She's so excited about it! She's planning to visit a lot of historical sites.", + "Elaine: Yes, she did. She's so excited about it! She's planning to visit a lot of historical sites.", "John: I bet she'll have a great time. Maybe she can bring back some authentic Italian recipes for us to try.", ] @@ -40,9 +40,9 @@ # Create prompt template prompt_template_with_context = """ Elaine prefers to talk using telegraphic messages. -Given a chat history and Elain's latest response which +Given a chat history and Elaine's latest response which might reference context in the chat history, convert -Elain's response to full sentences. Only respond with +Elaine's response to full sentences. Only respond with converted full sentences. 
Chat history: diff --git a/jobs/RAG/chat_history_with_summary.py b/jobs/RAG/chat_history_with_summary.py index 0c1cfdd..925e163 100644 --- a/jobs/RAG/chat_history_with_summary.py +++ b/jobs/RAG/chat_history_with_summary.py @@ -26,15 +26,15 @@ # Chat history chat_history = [ "John: Have you heard about the new Italian restaurant downtown?", - "Elain: Yes, I did! Sarah mentioned it to me yesterday. She said the pasta there is amazing.", + "Elaine: Yes, I did! Sarah mentioned it to me yesterday. She said the pasta there is amazing.", "John: I was thinking of going there this weekend. Want to join?", - "Elain: That sounds great! Maybe we can invite Sarah too.", + "Elaine: That sounds great! Maybe we can invite Sarah too.", "John: Good idea. By the way, did you catch the latest episode of that mystery series we were discussing last week?", - "Elain: Oh, the one with the detective in New York? Yes, I watched it last night. It was so intense!", + "Elaine: Oh, the one with the detective in New York? Yes, I watched it last night. It was so intense!", "John: I know, right? I didn't expect that plot twist at the end. Do you think Sarah has seen it yet?", - "Elain: I'm not sure. She was pretty busy with work the last time we talked. We should ask her when we see her at the restaurant.", + "Elaine: I'm not sure. She was pretty busy with work the last time we talked. We should ask her when we see her at the restaurant.", "John: Definitely. Speaking of Sarah, did she tell you about her trip to Italy next month?", - "Elain: Yes, she did. She's so excited about it! She's planning to visit a lot of historical sites.", + "Elaine: Yes, she did. She's so excited about it! She's planning to visit a lot of historical sites.", "John: I bet she'll have a great time. Maybe she can bring back some authentic Italian recipes for us to try.", ] recent_chat_array = [] diff --git a/jobs/RAG/rag.py b/jobs/RAG/rag.py index 5e3c2c3..5e1edcf 100644 --- a/jobs/RAG/rag.py +++ b/jobs/RAG/rag.py @@ -60,8 +60,8 @@ llm = ChatOllama(model="llama3", system="Elaine is an AAC user who expresses herself telegraphically. She is now in a meeting with Jutta. Below is the conversation in the meeting. Please help to convert what Elaine said to first-person sentences. Only respond with converted sentences.") prompt = ChatPromptTemplate.from_template(prompt_template_with_context) -elain_reply = "Roy nephew" -full_chat = f"Jutta: Elain, who would you like to invite to your birthday party?\n Elaine: {elain_reply}." +elaine_reply = "Roy nephew" +full_chat = f"Jutta: Elaine, who would you like to invite to your birthday party?\n Elaine: {elaine_reply}." # using LangChain Expressive Language (LCEL) chain syntax chain = prompt | llm | StrOutputParser() @@ -76,6 +76,6 @@ print("====== Response with RAG ======") print(chain.invoke({ - "context": retriever.invoke(elain_reply), + "context": retriever.invoke(elaine_reply), "chat": full_chat }) + "\n") From beeb5d28a45fbbc7914b3e56747a5f112d07abd6 Mon Sep 17 00:00:00 2001 From: Cindy Qi Li Date: Fri, 28 Jun 2024 12:35:45 -0400 Subject: [PATCH 06/10] fix: improve the rag.py and its doc --- docs/RAG.md | 30 ++++++++++++++++++++---------- jobs/RAG/rag.py | 19 ++++++++++++++++--- 2 files changed, 36 insertions(+), 13 deletions(-) diff --git a/docs/RAG.md b/docs/RAG.md index 54ff500..45b2ad1 100644 --- a/docs/RAG.md +++ b/docs/RAG.md @@ -28,16 +28,26 @@ the python script. 
 ### Prerequisites
 
-* [Ollama](https://github.com/ollama/ollama) to run language models locally
+* If you are currently in an activated virtual environment, deactivate it.
+
+* Install [Ollama](https://github.com/ollama/ollama) to run language models locally
   * Follow [README](https://github.com/ollama/ollama?tab=readme-ov-file#customize-a-model) to
     install and run Ollama on a local computer.
-* Download a sentence transformer model
-  * Select [a sentence transformer model](https://huggingface.co/sentence-transformers)
-    from Hugging Face. Download it to a local directory. Scripts in this experiment use the
-    [`all-MiniLM-L6-v2` model](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2).
-    Adjust the variable value of `sentence_transformer_dir` in scripts to point to the sentence
-    transformer model directory.
-* If you are currently in an activated virtual environment, deactivate it.
+
+* Download a Sentence Transformer Model
+  1. Select a Model
+     - Choose a [sentence transformer model](https://huggingface.co/sentence-transformers) from Hugging Face.
+  2. Download the Model
+     - Download the selected model to a local directory. For example, to download the
+       [`all-MiniLM-L6-v2` model](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2), use the following
+       command:
+       ```sh
+       git clone https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2
+       ```
+  3. Provide the Model Path
+     - When running the `rag.py` script, provide the path to the directory of the downloaded model as a parameter.
+  **Note:** Loading a local sentence transformer model is much faster than downloading it at run time via the
+     `sentence-transformers` Python package.
 
 ### Create/Activitate Virtual Environment
 * Go to the RAG scripts directory
   - `cd jobs/RAG`
@@ -55,7 +65,7 @@
   - `pip install -r requirements.txt`
 
 ### Run Scripts
-* Run `rag.py`
-  - `python rag.py`
+* Run `rag.py` with a parameter providing the path to the directory of a sentence transformer model
+  - `python rag.py ./all-MiniLM-L6-v2/`
   - The last two responses in the exectution result shows the language model's output
     with and without the use of RAG.
diff --git a/jobs/RAG/rag.py b/jobs/RAG/rag.py
index 5e1edcf..f6c5ea6 100644
--- a/jobs/RAG/rag.py
+++ b/jobs/RAG/rag.py
@@ -6,6 +6,7 @@
 # You may obtain a copy of the BSD 3-Clause License at
 # https://github.com/inclusive-design/baby-bliss-bot/blob/main/LICENSE
 
+import sys
 import os
 from langchain_community.document_loaders import TextLoader
 from langchain_text_splitters import CharacterTextSplitter
@@ -15,12 +16,24 @@
 from langchain_core.output_parsers import StrOutputParser
 from langchain_core.prompts import ChatPromptTemplate
 
+
+# A utility function that prints the script usage then exits
+def printUsageThenExit():
+    print("Usage: python rag.py <path_to_sentence_transformer_model>")
+    sys.exit()
+
+
+# Read the path to the sentence transformer model
+if len(sys.argv) != 2:
+    printUsageThenExit()
+else:
+    sentence_transformer_dir = sys.argv[1]
+    if not os.path.isdir(sentence_transformer_dir):
+        printUsageThenExit()
+
 # The location of the user document
 user_doc = "./data/user_doc.txt"
 
-# The location of the sentence transformer model in the local directory
-sentence_transformer_dir = os.path.expanduser("~") + "/Development/LLMs/all-MiniLM-L6-v2"
-
 loader = TextLoader(user_doc)
 documents = loader.load()

From 7eb58e59146d3648ee1b54c1ad929f65e5d10c34 Mon Sep 17 00:00:00 2001
From: Cindy Qi Li
Date: Fri, 28 Jun 2024 15:37:27 -0400
Subject: [PATCH 07/10] doc: improve the RAG doc

---
 docs/RAG.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/RAG.md b/docs/RAG.md
index 45b2ad1..7b042d8 100644
--- a/docs/RAG.md
+++ b/docs/RAG.md
@@ -30,7 +30,7 @@ the python script.
 
 * If you are currently in an activated virtual environment, deactivate it.
 
-* Install [Ollama](https://github.com/ollama/ollama) to run language models locally
+* Install and start [Ollama](https://github.com/ollama/ollama) to run language models locally
   * Follow [README](https://github.com/ollama/ollama?tab=readme-ov-file#customize-a-model) to
     install and run Ollama on a local computer.

From daf18ddebf04e5dc13d65dbb3e937f2cc5c91a47 Mon Sep 17 00:00:00 2001
From: Cindy Qi Li
Date: Wed, 10 Jul 2024 11:28:28 -0400
Subject: [PATCH 08/10] fix: fix the deprecation warning and typos

---
 docs/RAG.md                           | 2 +-
 docs/ReflectChatHistory.md            | 2 +-
 jobs/RAG/chat_history_with_prompt.py  | 3 ---
 jobs/RAG/chat_history_with_summary.py | 3 ---
 jobs/RAG/rag.py                       | 4 +---
 jobs/RAG/requirements.txt             | 1 +
 6 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/docs/RAG.md b/docs/RAG.md
index 7b042d8..2cd52f9 100644
--- a/docs/RAG.md
+++ b/docs/RAG.md
@@ -67,5 +67,5 @@ the python script.
 ### Run Scripts
 * Run `rag.py` with a parameter providing the path to the directory of a sentence transformer model
   - `python rag.py ./all-MiniLM-L6-v2/`
-  - The last two responses in the exectution result shows the language model's output
+  - The last two responses in the execution result show the language model's output
     with and without the use of RAG.
diff --git a/docs/ReflectChatHistory.md b/docs/ReflectChatHistory.md
index b960944..10a49a2 100644
--- a/docs/ReflectChatHistory.md
+++ b/docs/ReflectChatHistory.md
@@ -73,5 +73,5 @@ The converted sentences are more accurate and appropriate compared to those gene
 ### Run Scripts
 * Run `chat_history_with_summary.py` or `chat_history_with_prompt.py`
   - `python chat_history_with_summary.py` or `python chat_history_with_prompt.py`
-  - The last two responses in the exectution result shows the language model's output
+  - The last two responses in the execution result show the language model's output
     with and without the contextual information.
diff --git a/jobs/RAG/chat_history_with_prompt.py b/jobs/RAG/chat_history_with_prompt.py
index 58bab9d..41004a8 100644
--- a/jobs/RAG/chat_history_with_prompt.py
+++ b/jobs/RAG/chat_history_with_prompt.py
@@ -10,9 +10,6 @@
 from langchain_core.output_parsers import StrOutputParser
 from langchain_core.prompts import ChatPromptTemplate
 
-# from langchain_core.globals import set_debug
-# set_debug(True)
-
 # Define the Ollama model to use
 model = "llama3"
 
diff --git a/jobs/RAG/chat_history_with_summary.py b/jobs/RAG/chat_history_with_summary.py
index 925e163..3d1863f 100644
--- a/jobs/RAG/chat_history_with_summary.py
+++ b/jobs/RAG/chat_history_with_summary.py
@@ -10,9 +10,6 @@
 from langchain_core.output_parsers import StrOutputParser
 from langchain_core.prompts import ChatPromptTemplate
 
-# from langchain_core.globals import set_debug
-# set_debug(True)
-
 # Define the Ollama model to use
 model = "llama3"
 
diff --git a/jobs/RAG/rag.py b/jobs/RAG/rag.py
index f6c5ea6..ba511a0 100644
--- a/jobs/RAG/rag.py
+++ b/jobs/RAG/rag.py
@@ -11,7 +11,7 @@
 from langchain_community.document_loaders import TextLoader
 from langchain_text_splitters import CharacterTextSplitter
 from langchain_community.vectorstores import FAISS
-from langchain_community.embeddings import HuggingFaceEmbeddings
+from langchain_huggingface import HuggingFaceEmbeddings
 from langchain_community.chat_models import ChatOllama
 from langchain_core.output_parsers import StrOutputParser
 from langchain_core.prompts import ChatPromptTemplate
@@ -36,11 +36,9 @@ def printUsageThenExit():
 
 loader = TextLoader(user_doc)
 documents = loader.load()
-# print(f"Loaded documents (first 2 rows):\n{documents[:2]}\n\n")
 
 text_splitter = CharacterTextSplitter(chunk_size=200, chunk_overlap=0)
 splitted_docs = text_splitter.split_documents(documents)
-# print(f"Splitted documents (first 2 rows):\n{splitted_docs[:2]}\n\n")
 
 # Instantiate the embedding class
 embedding_func = HuggingFaceEmbeddings(model_name=sentence_transformer_dir)
diff --git a/jobs/RAG/requirements.txt b/jobs/RAG/requirements.txt
index 4330d46..548fa7b 100644
--- a/jobs/RAG/requirements.txt
+++ b/jobs/RAG/requirements.txt
@@ -1,5 +1,6 @@
 langchain_community
 langchain_core
 langchain_text_splitters
+langchain-huggingface
 sentence_transformers
 faiss-cpu

From e05168ce665b1ebed81d277d03f64c46eb5f86d8 Mon Sep 17 00:00:00 2001
From: Cindy Qi Li
Date: Wed, 10 Jul 2024 15:07:52 -0400
Subject: [PATCH 09/10] doc: improve documentation

---
 README.md                  | 2 +-
 docs/RAG.md                | 4 +++-
 docs/ReflectChatHistory.md | 2 +-
 3 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 7703d44..504ee78 100644
--- a/README.md
+++ b/README.md
@@ -29,7 +29,7 @@ git clone https://github.com/your-username/baby-bliss-bot
 cd baby-bliss-bot
 ```
 
-### Create/Activitate Virtual Environment
+### Create/Activiate Virtual Environment
 Always activate and use the python virtual
 environment to maintain an isolated environment for project's dependencies.
 
 * [Create the virtual environment](https://docs.python.org/3/library/venv.html)
diff --git a/docs/RAG.md b/docs/RAG.md
index 2cd52f9..fb7f899 100644
--- a/docs/RAG.md
+++ b/docs/RAG.md
@@ -38,6 +38,8 @@ the python script.
   1. Select a Model
      - Choose a [sentence transformer model](https://huggingface.co/sentence-transformers) from Hugging Face.
   2. Download the Model
+     - Make sure that your system has the git-lfs command installed. See
+       [Git Large File Storage](https://git-lfs.com/) for instructions.
      - Download the selected model to a local directory. For example, to download the
        [`all-MiniLM-L6-v2` model](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2), use the following
        command:
@@ -49,7 +51,7 @@ the python script.
   **Note:** Loading a local sentence transformer model is much faster than downloading it at run time via the
      `sentence-transformers` Python package.
 
-### Create/Activitate Virtual Environment
+### Create/Activiate Virtual Environment
 * Go to the RAG scripts directory
   - `cd jobs/RAG`
diff --git a/docs/ReflectChatHistory.md b/docs/ReflectChatHistory.md
index 10a49a2..e815592 100644
--- a/docs/ReflectChatHistory.md
+++ b/docs/ReflectChatHistory.md
@@ -55,7 +55,7 @@ The converted sentences are more accurate and appropriate compared to those gene
    install and run Ollama on a local computer.
 * If you are currently in an activated virtual environment, deactivate it.
 
-### Create/Activitate Virtual Environment
+### Create/Activiate Virtual Environment
 * Go to the RAG scripts directory
   - `cd jobs/RAG`

From 33737f6015e6fd2a6cc17ef71814ddb873d13d46 Mon Sep 17 00:00:00 2001
From: Cindy Qi Li
Date: Thu, 11 Jul 2024 09:05:52 -0400
Subject: [PATCH 10/10] fix: fix typos

---
 README.md                  | 2 +-
 docs/RAG.md                | 2 +-
 docs/ReflectChatHistory.md | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 504ee78..16a32c3 100644
--- a/README.md
+++ b/README.md
@@ -29,7 +29,7 @@ git clone https://github.com/your-username/baby-bliss-bot
 cd baby-bliss-bot
 ```
 
-### Create/Activiate Virtual Environment
+### Create/Activate Virtual Environment
 Always activate and use the python virtual
 environment to maintain an isolated environment for project's dependencies.
 
 * [Create the virtual environment](https://docs.python.org/3/library/venv.html)
diff --git a/docs/RAG.md b/docs/RAG.md
index fb7f899..8d42b99 100644
--- a/docs/RAG.md
+++ b/docs/RAG.md
@@ -51,7 +51,7 @@ the python script.
   **Note:** Loading a local sentence transformer model is much faster than downloading it at run time via the
      `sentence-transformers` Python package.
 
-### Create/Activiate Virtual Environment
+### Create/Activate Virtual Environment
 * Go to the RAG scripts directory
   - `cd jobs/RAG`
diff --git a/docs/ReflectChatHistory.md b/docs/ReflectChatHistory.md
index e815592..4046401 100644
--- a/docs/ReflectChatHistory.md
+++ b/docs/ReflectChatHistory.md
@@ -55,7 +55,7 @@ The converted sentences are more accurate and appropriate compared to those gene
    install and run Ollama on a local computer.
 * If you are currently in an activated virtual environment, deactivate it.
 
-### Create/Activiate Virtual Environment
+### Create/Activate Virtual Environment