
feat: explore the RAG technique, and methods to retain chat history #21

Merged
merged 10 commits into from
Jul 12, 2024
37 changes: 31 additions & 6 deletions README.md
@@ -29,7 +29,7 @@ git clone https://github.com/your-username/baby-bliss-bot
cd baby-bliss-bot
```

### Create/Activitate Virtual Environment
### Create/Activate Virtual Environment
Always activate and use the Python virtual environment to maintain an isolated environment for the project's dependencies.

* [Create the virtual environment](https://docs.python.org/3/library/venv.html)
@@ -58,39 +58,64 @@ with generating new Bliss symbols etc.

### Llama2

Conclusion: useful
**Conclusion**: useful

See the [Llama2FineTuning.md](./docs/Llama2FineTuning.md) in the [documentation](./docs) folder for details
on how to fine-tune, evaluation results, and the conclusion about how useful it is.

### StyleGAN3

Conclusion: not useful
**Conclusion**: not useful

See the [TrainStyleGAN3Model.md](./docs/TrainStyleGAN3Model.md) in the [documentation](./docs) folder for details
on how to train this model, training results, and the conclusion about how useful it is.

### StyleGAN2-ADA

Conclusion: shows promise
**Conclusion**: shows promise

See the [StyleGAN2-ADATraining.md](./docs/StyleGAN2-ADATraining.md) in the [documentation](./docs) folder for details
on how to train this model and training results.

### Texture Inversion

Conclusion: not useful
**Conclusion**: not useful

See the [Texture Inversion documentation](./notebooks/README.md) for details.

## Preserving Information

### RAG (Retrieval-augmented generation)

**Conclusion**: useful

The RAG (retrieval-augmented generation) technique is explored to resolve ambiguities by retrieving relevant contextual
information from external sources, enabling the language model to generate more accurate and reliable responses.

See [RAG.md](./docs/RAG.md) for more details.

### Reflection over Chat History

**Conclusion**: useful

When users have a back-and-forth conversation, the application requires a form of "memory" to retain and incorporate past interactions into its current processing. Two methods are explored to achieve this:

1. Summarizing the chat history and providing it as contextual input.
2. Using prompt engineering to instruct the language model to consider the past conversation.

The second method, prompt engineering, yields more desirable responses than summarizing the chat history.

See [ReflectChatHistory.md](./docs/ReflectChatHistory.md) for more details.

## Notebooks

The [`/notebooks`](./notebooks/) directory contains all notebooks used for training or fine-tuning various models.
Each notebook usually comes with an accompanying `dockerfile.yml` describing the environment the notebook was
run in.

## Jobs
[`/jobs`](./jobs/) directory contains all jobs used for training or fine-tuning various models.
[`/jobs`](./jobs/) directory contains all jobs and scripts used for training or fine-tuning various models, as well
as other explorations with RAG (Retrieval-augmented generation) and preserving chat history.

## Utility Scripts

73 changes: 73 additions & 0 deletions docs/RAG.md
@@ -0,0 +1,73 @@
# Experiment with Retrieval-Augmented Generation (RAG)
Contributor:

Since this is a Python project, is there a minimum version of Python and Pip that are required? It's likely at least Python 3.8.17 and Pip 23.1.2 -- that's what I have at the moment, and it works. Are there version restrictions based on the packages in requirements.txt?

Contributor Author:

Python version >= 3.8.1 should work fine with the latest langchain-community module. Even if the Python version is lower than that, an older version of langchain should be installed. My laptop uses Python 3.11.2 and pip 23.2.1.


Retrieval-augmented generation (RAG) is a technique for enhancing the accuracy and reliability of
generative AI models with facts fetched from external sources. This approach aims to address the
limitations of traditional language models, which may generate responses based solely on their
training data, potentially leading to factual errors or inconsistencies. Read
[What Is Retrieval-Augmented Generation, aka RAG?](https://blogs.nvidia.com/blog/what-is-retrieval-augmented-generation/)
for more information.

In a co-design session with an AAC (Augmentative and Alternative Communication) user, RAG can
be particularly useful. When the user expressed a desire to invite "Roy nephew" to her birthday
party, ambiguity arose as to whether "Roy" and "nephew" referred to the same person or
different individuals. Traditional language models might interpret this statement inconsistently,
sometimes treating "Roy" and "nephew" as the same person, and other times as separate individuals.

RAG addresses this issue by leveraging external knowledge sources, such as documents or databases
containing relevant information about the user's family members and their relationships. By
retrieving and incorporating this contextual information into the language model's input, RAG
can disambiguate the user's intent and generate a more accurate response.

The RAG experiment is located in the `jobs/RAG` directory. It contains these scripts:

* `requirements.txt`: contains the Python dependencies for setting up the environment to run
the Python script.
* `rag.py`: uses RAG to address the "Roy nephew" issue described above (a sketch of the approach follows).
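
The `rag.py` script itself is not reproduced in this diff. Below is a minimal sketch of the retrieve-then-generate flow described above, assuming `langchain-community`, `sentence-transformers`, and `faiss-cpu` are installed, Ollama is serving the `llama3` model, and the `all-MiniLM-L6-v2` model has been downloaded locally; the family facts and prompt wording are illustrative, not taken from `rag.py`:

```python
from langchain_community.chat_models import ChatOllama
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

# A tiny knowledge base of family facts (illustrative only)
facts = [
    "Roy is Elaine's nephew.",
    "Sarah is a friend of Elaine and John.",
]

# Embed the facts with the locally downloaded sentence transformer model
embeddings = HuggingFaceEmbeddings(model_name="./all-MiniLM-L6-v2")
vector_store = FAISS.from_texts(facts, embeddings)
retriever = vector_store.as_retriever(search_kwargs={"k": 1})

prompt = ChatPromptTemplate.from_template(
    "Context about the speaker's family:\n{context}\n\n"
    "Using the context, convert this telegraphic message into a full,"
    " unambiguous sentence:\n{message}"
)
chain = prompt | ChatOllama(model="llama3") | StrOutputParser()

message = "invite Roy nephew to birthday party"
docs = retriever.invoke(message)  # fetch the most relevant fact(s)
context = "\n".join(doc.page_content for doc in docs)
print(chain.invoke({"context": context, "message": message}))
```

With the retrieved fact in the prompt, the model can state that Roy, Elaine's nephew, is invited, instead of guessing whether "Roy" and "nephew" are two people.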

## Run Scripts Locally

### Prerequisites

* If you are currently in an activated virtual environment, deactivate it.

* Install and start [Ollama](https://github.com/ollama/ollama) to run language models locally
* Follow [README](https://github.com/ollama/ollama?tab=readme-ov-file#customize-a-model) to
install and run Ollama on a local computer.

* Download a Sentence Transformer Model
1. Select a Model
- Choose a [sentence transformer model](https://huggingface.co/sentence-transformers) from Hugging Face.
2. Download the Model
- Make sure that your system has the git-lfs command installed. See
[Git Large File Storage](https://git-lfs.com/) for instructions.
- Download the selected model to a local directory. For example, to download the
[`all-MiniLM-L6-v2` model](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2), use the following
command:
```sh
git clone https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2
```
3. Provide the Model Path
- When running the `rag.py` script, provide the path to the directory of the downloaded model as a parameter.
**Note:** Accessing a local sentence transformer model is much faster than accessing it via the
`sentence-transformers` Python package (see the sketch below).
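
As a minimal illustration of the difference (the example sentence is an assumption, not part of `rag.py`):

```python
from sentence_transformers import SentenceTransformer

# Loading by local path avoids a network download; loading by name
# ("all-MiniLM-L6-v2") would fetch the model from Hugging Face first.
model = SentenceTransformer("./all-MiniLM-L6-v2")
vectors = model.encode(["Roy is Elaine's nephew."])
print(vectors.shape)  # (1, 384): all-MiniLM-L6-v2 produces 384-dimensional embeddings
```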

### Create/Activate Virtual Environment
* Go to the RAG scripts directory
- `cd jobs/RAG`

* [Create the virtual environment](https://docs.python.org/3/library/venv.html)
(one time setup):
- `python -m venv .venv`

* Activate (every command-line session):
- Windows: `.\.venv\Scripts\activate`
- Mac/Linux: `source .venv/bin/activate`

* Install Python dependencies (only run once):
- `pip install -r requirements.txt`

### Run Scripts
* Run `rag.py` with a parameter providing the path to the directory of a sentence transformer model
- `python rag.py ./all-MiniLM-L6-v2/`
Contributor (@klown, Jul 8, 2024):

This error occurs:

OSError: Error no file named pytorch_model.bin, model.safetensors, tf_model.h5,
model.ckpt.index or flax_model.msgpack found in directory ../../all-MiniLM-L6-v2/.

In case the location of the all-MiniLM-L6-v2 folder mattered, I moved it into the same directory as rag.py, but the same error occurred.

I also tried following the all-MiniLM-L6-v2 README, which advises executing `pip install -U sentence-transformers`, but that didn't help either.

Contributor Author:

The location of the all-MiniLM-L6-v2 directory does not matter. The error reports that the expected model files are missing from that directory. If you run `ls` in that directory, the files in the screenshot below should be listed. If they are not, something went wrong with the download.

[Screenshot: listing of the expected files in the all-MiniLM-L6-v2 directory]

Contributor:

You are correct that there are a couple of error messages from the `git clone ...` command at step 2 of RAG.md. It took a while to figure out, but the root cause was that git-lfs was missing from my system. Here is the output of that command:

Cloning into 'all-MiniLM-L6-v2'...
remote: Enumerating objects: 61, done.
remote: Counting objects: 100% (61/61), done.
remote: Compressing objects: 100% (39/39), done.
remote: Total 61 (delta 22), reused 54 (delta 19), pack-reused 0 (from 0)
Unpacking objects: 100% (61/61), 316.23 KiB | 2.82 MiB/s, done.
git-lfs filter-process: git-lfs: command not found
fatal: the remote end hung up unexpectedly
warning: Clone succeeded, but checkout failed.
You can inspect what was checked out with 'git status'
and retry with 'git restore --source=HEAD :/'

I first tried to figure out why the clone succeeded but the checkout failed, and to either use `git restore` as the error message suggested or `git reset` the pending deletions. Regarding the latter, the suggested `git status` showed that a number of files had been deleted:

On branch main
Your branch is up to date with 'origin/main'.

Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
	deleted:    .gitattributes
	deleted:    1_Pooling/config.json
	deleted:    README.md
	deleted:    config.json
	deleted:    config_sentence_transformers.json
	deleted:    data_config.json
	deleted:    model.safetensors
	deleted:    modules.json
	deleted:    onnx/model.onnx
	deleted:    pytorch_model.bin
	deleted:    rust_model.ot
	deleted:    sentence_bert_config.json
	deleted:    special_tokens_map.json
	deleted:    tf_model.h5
	deleted:    tokenizer.json
	deleted:    tokenizer_config.json
	deleted:    train_script.py
	deleted:    vocab.txt

However, I noticed an earlier error: `git-lfs: command not found`. That was the real reason the clone did not succeed and left the repository in an odd state. Once I installed git-lfs, the script worked.

I think there needs to be a note or warning that git-lfs is required before running `git clone ...`, something like:

  1. Download the model
  • Make sure that your system has the git-lfs command installed. See Git Large File Storage
    for instructions.
  • Download the selected model to a local directory. For example, to download the all-MiniLM-L6-v2 model, use the following command: ...

- The last two responses in the execution result show the language model's output
with and without the use of RAG.
77 changes: 77 additions & 0 deletions docs/ReflectChatHistory.md
@@ -0,0 +1,77 @@
# Reflection over Chat History

When users have a back-and-forth conversation, the application requires a form of "memory" to retain and incorporate
past interactions into its current processing. Two methods are explored to achieve this:

1. Summarizing the chat history and providing it as contextual input.
2. Using prompt engineering to instruct the language model to consider the past conversation.

The second method, prompt engineering, yields more desirable responses than summarizing the chat history.

The scripts for this experiment are located in the `jobs/RAG` directory.

## Method 1: Summarizing the Chat History

### Steps

1. Summarize the past conversation and include it in the prompt as contextual information.
2. Include a specified number of the most recent conversation exchanges in the prompt for additional context.
3. Instruct the language model to convert the telegraphic replies from the AAC user into full sentences to continue
the conversation (a sketch of these steps follows).
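
A minimal sketch of these three steps, assuming Ollama is serving the `llama3` model locally; the conversation, the split point, and the prompt wording are illustrative, and the actual `chat_history_with_summary.py` may differ:

```python
from langchain_community.chat_models import ChatOllama
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

llm = ChatOllama(model="llama3")
parser = StrOutputParser()

chat_history = [
    "John: Have you heard about the new Italian restaurant downtown?",
    "Elaine: Yes! Sarah said the pasta there is amazing.",
    "John: I was thinking of going there this weekend. Want to join?",
    "Elaine: That sounds great! Maybe we can invite Sarah too.",
]

# Step 1: summarize the older part of the conversation
summarize = ChatPromptTemplate.from_template(
    "Summarize this conversation in two sentences:\n{history}"
) | llm | parser
summary = summarize.invoke({"history": "\n".join(chat_history[:-2])})

# Steps 2 and 3: prompt with the summary, the most recent exchanges,
# and the instruction to expand the telegraphic reply
convert = ChatPromptTemplate.from_template(
    "Conversation summary:\n{summary}\n\n"
    "Most recent exchanges:\n{recent}\n\n"
    "Convert Elaine's telegraphic reply into full sentences to continue"
    " the conversation:\n{reply}"
) | llm | parser

print(convert.invoke({
    "summary": summary,
    "recent": "\n".join(chat_history[-2:]),
    "reply": "she love cooking like share recipes",
}))
```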

### Result

The conversion process struggles to effectively utilize the provided summary, often resulting in inaccurate full
sentences.

### Scripts

* `requirements.txt`: Lists the Python dependencies needed to set up the environment.
* `chat_history_with_summary.py`: Implements the steps described above and displays the output.

## Method 2: Using Prompt Engineering

### Steps

1. Include the past conversation in the prompt as contextual information.
2. Instruct the language model to reference this context when converting the telegraphic replies from the AAC user
into full sentences to continue the conversation.

### Result

The converted sentences are more accurate and appropriate than those generated using Method 1.

### Scripts

* `requirements.txt`: Lists the Python dependencies needed to set up the environment.
* `chat_history_with_prompt.py`: Implements the steps described above and displays the output.

## Run Scripts Locally

### Prerequisites

* [Ollama](https://github.com/ollama/ollama) to run language models locally
* Follow [README](https://github.com/ollama/ollama?tab=readme-ov-file#customize-a-model) to
install and run Ollama on a local computer.
* If you are currently in an activated virtual environment, deactivate it.

### Create/Activate Virtual Environment
* Go to the RAG scripts directory
- `cd jobs/RAG`

* [Create the virtual environment](https://docs.python.org/3/library/venv.html)
(one time setup):
- `python -m venv .venv`

* Activate (every command-line session):
- Windows: `.\.venv\Scripts\activate`
- Mac/Linux: `source .venv/bin/activate`

* Install Python dependencies (only run once):
- `pip install -r requirements.txt`

### Run Scripts
* Run `chat_history_with_summary.py` or `chat_history_with_prompt.py`
- `python chat_history_with_summary.py` or `python chat_history_with_prompt.py`
- The last two responses in the execution result show the language model's output
with and without the contextual information.
69 changes: 69 additions & 0 deletions jobs/RAG/chat_history_with_prompt.py
@@ -0,0 +1,69 @@
# Copyright (c) 2024, Inclusive Design Institute
#
# Licensed under the BSD 3-Clause License. You may not use this file except
# in compliance with this License.
#
# You may obtain a copy of the BSD 3-Clause License at
# https://github.com/inclusive-design/baby-bliss-bot/blob/main/LICENSE

from langchain_community.chat_models import ChatOllama
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

# Define the Ollama model to use
model = "llama3"

# Telegraphic reply to be translated
message_to_convert = "she love cooking like share recipes"

# Conversation history
chat_history = [
"John: Have you heard about the new Italian restaurant downtown?",
"Elaine: Yes, I did! Sarah mentioned it to me yesterday. She said the pasta there is amazing.",
"John: I was thinking of going there this weekend. Want to join?",
"Elaine: That sounds great! Maybe we can invite Sarah too.",
"John: Good idea. By the way, did you catch the latest episode of that mystery series we were discussing last week?",
"Elaine: Oh, the one with the detective in New York? Yes, I watched it last night. It was so intense!",
"John: I know, right? I didn't expect that plot twist at the end. Do you think Sarah has seen it yet?",
"Elaine: I'm not sure. She was pretty busy with work the last time we talked. We should ask her when we see her at the restaurant.",
"John: Definitely. Speaking of Sarah, did she tell you about her trip to Italy next month?",
"Elaine: Yes, she did. She's so excited about it! She's planning to visit a lot of historical sites.",
"John: I bet she'll have a great time. Maybe she can bring back some authentic Italian recipes for us to try.",
]

# Instantiate the chat model
llm = ChatOllama(model=model)

# Create prompt template
prompt_template_with_context = """
Elaine prefers to talk using telegraphic messages.
Given a chat history and Elaine's latest response which
might reference context in the chat history, convert
Elaine's response to full sentences. Only respond with
converted full sentences.

Chat history:
{chat_history}

Elaine's response:
{message_to_convert}
"""

prompt = ChatPromptTemplate.from_template(prompt_template_with_context)

# Compose the chain using LangChain Expression Language (LCEL) syntax
chain = prompt | llm | StrOutputParser()

print("====== Response without chat history ======")

print(chain.invoke({
"chat_history": "",
"message_to_convert": message_to_convert
}) + "\n")

print("====== Response with chat history ======")

print(chain.invoke({
"chat_history": "\n".join(chat_history),
"message_to_convert": message_to_convert
}) + "\n")