Skip to content

Commit

Permalink
Merge pull request #19 from cindyli/feat/Llama2
Browse files Browse the repository at this point in the history
feat: experiment with Llama2 (resolves #18)
  • Loading branch information
cindyli authored Apr 26, 2024
2 parents 21d173c + a7f7179 commit 3f6e183
Show file tree
Hide file tree
Showing 27 changed files with 6,511 additions and 21 deletions.
2 changes: 1 addition & 1 deletion LICENSE
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
BSD 3-Clause License

Copyright (c) 2023, Inclusive Design Institute
Copyright (c) 2023-2024, Inclusive Design Institute

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
Expand Down
7 changes: 7 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -56,6 +56,13 @@ Run the following command to lint all python scripts:
We performed experiments with a number of existing models listed below to understand how useful they are in helping
with generating new Bliss symbols etc.

### Llama2

Conclusion: useful

See the [Llama2FineTuning.md](./docs/Llama2FineTuning.md) in the [documentation](./docs) folder for details
on how to fine tune, evaluation results and the conclusion about how useful it is.

### StyleGAN3

Conclusion: not useful
Expand Down
178 changes: 178 additions & 0 deletions docs/Llama2FineTuning.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,178 @@
# Experiment with Llama2 model

The goal of this experiment is to evaluate how well a large language model (LLM) can learn the conversion
between English and Blissymbolics sentence structures. To leverage the LLM's knowledge of English, Blissymbolics
sentences are composed using English words while adhering to the grammatical and syntactical rules of Blissymbolics.
For instance, the English sentence "I slowly move towards the blue lake" would be expressed in Blissymbolics as
"present: I move slowly towards lake blue". Without delving into the linguistic intricacies of Blissymbolics, it is
essential to note that the language follows a specific ordering and structure to indicate verb tenses, as well as the
relationships between verbs and adverbs, and nouns and adjectives.

The experiment uses the [7B parameter Llama2 model pretrained by Meta](https://huggingface.co/meta-llama/Llama-2-7b-hf),
converted for the seamless use of the Hugging Face Transformers format. This model is choosen as a starting
point because it requires less training time and GPU resources compared to its larger counterparts, while it
potentially sacrifies some capability. Additionally, the Hugging Face Transformers format is selected because
of its extensive community support and standardized APIs.

This experiment is performed using Cedar clusters provided by [Digital Research Alliance of Canada](https://alliancecan.ca/en).
See [its technical documentation](https://docs.alliancecan.ca/wiki/Technical_documentation) regarding the content of
job scripts and job submission steps described below.

## Download Llama-2-7b-hf to Cedar

1. Request access to Llama2 models on [the Meta website](https://llama.meta.com/llama-downloads/);
2. Followed the instructions on [the Hugging Face website](https://huggingface.co/meta-llama/Llama-2-7b-hf)
to request the access to its Llama2 model;
3. Request a hugging face access token on [this page](https://huggingface.co/settings/tokens);
4. Login to the Cedar cluster;
5. Create a "llama" directory and run these commands to download the model:

```
mkdir llama2
cd llama2
# Load git-lfs first for downloading via Git large file storage
module load StdEnv/2020
module load git-lfs/3.3.0
git lfs install
git clone https://{hugging_face_id}:{hugging_face_access_token}@huggingface.co/meta-llama/Llama-2-7b-hf
// Fetch git large files in the repo directory
cd Llama-2-7b-hf
git lfs fetch
```

6. Copy the content of [`requirements.txt`](https://github.com/facebookresearch/llama/blob/main/requirements.txt)
for setting up the Llama2 models into a new file named `requirements-llama2.txt` in the "llama" directory.

## Use the Original Llama2 Model

In the [`jobs/Llama2/original_use`](../jobs/Llama2/original_use) directory, there are two scripts:

* original_use_7b_hf.py: The script that loads the downloaded model and tokenizer to perform text generation,
word predictions and making inferences
* job_original_use_7b_hf.sh: The job script submitted to Cedar to run `original_use_7b_hf.py`

Note that the job script must be copied to the user's `scratch` directory and is submitted from there using
the `sbatch` command.

Use FTP to transfer the above scripts to the cedar cluster in the users `llama2/original_use` directory. Run
the following command to submit the job.

```
cp llama2/original_use/job_original_use_7b_hf.sh scratch/.
cd scratch
sbatch job_original_use_7b_hf.sh
```

The result is written to the `llama2/original_use/result.txt`.

## Fine-tune the Llama2 Model

In the [`jobs/Llama2/finetune`](../jobs/Llama2/finetune) directory, there are these scripts:

* bliss.json: The dataset that converts English text to the structure in the Conceptual Bliss
* finetune_7b_hf.py: The script that fine-tunes the downloaded model
* job_finetune_7b_hf.sh: The job script submitted to Cedar to run `finetune_7b_hf.py`

Use FTP to transfer the above scripts to the cedar cluster in the users `llama2/finetune` directory. Run
the following command to submit the job.

```
cp llama2/finetune/job_finetune_7b_hf.sh scratch/.
cd scratch
sbatch job_finetune_7b_hf.sh
```

The fine-tuning script:

1. Creates an instruction dataset using `bliss.json`. This dataset contains bi-directional conversion between
English and Conceptual Bliss.
2. Uses the dataset to fine-tune the Llama2 model. See `finetune_7b_hf.py` about the fine-tuning parameters.
3. Evaluates the fine-tuned model by testing a few sentence conversions between the English and the Bliss languages.

Please note that due to the relatively small size of the dataset derived from bliss.json, the fine-tuning script
was run four times, adjusting the epoch number in the script from 1 to 4. As a result, 4 models were generated
corresponding to the different epoch counts.

## Evaluate the Fine-tuned Model

This section describes how to evaluate a fine-tuned model with instructions and input sentences.

In the [`jobs/Llama2/finetune`](../jobs/Llama2/finetune) directory, there are these scripts:

* eval_7b_hf.py: The script that fine-tunes the downloaded model. Common variables to adjust:
* `model_dir`: The location of the model directory
* `instruction`: At the bottom of the script, define the instruction part in a prompt
* `input`: At the bottom of the script, define the sentence to be converted
* job_eval_7b_hf.sh: The job script submitted to Cedar to run `eval_7b_hf.py`

Use FTP to transfer the above scripts to the cedar cluster in the users `llama2/finetune` directory. Run
the following command to submit the job.

```
cp llama2/finetune/job_eval_7b_hf.sh scratch/.
cd scratch
sbatch job_eval_7b_hf.sh
```

## Evaluate the Generated Sentences from the Fine-tuned Model

This section describes how to evaluate the generated sentences and compare them with original or expected sentences.
It evaluates the generated sentence in these aspects:

* Semantic Coherence
* Novelty and Creativity
* Fluency and Readability

In the [`jobs/Llama2/finetune`](../jobs/Llama2/finetune) directory, there are these scripts:

* eval_generated_sentence.py: The script that fine-tunes the downloaded model. Common variables to adjust:
* `sentence_orig`: The original sentence
* `sentence_expected`: The expected sentence
* `sentence_generated`: The sentence generated by the fine-tuned model
* job_eval_generated_sentence.sh: The job script submitted to Cedar to run `eval_generated_sentence.py`

Use FTP to transfer the above scripts to the cedar cluster in the users `llama2/finetune` directory. Run
the following command to submit the job.

```
cp llama2/finetune/job_eval_generated_sentence.sh scratch/.
cd scratch
sbatch job_eval_generated_sentence.sh
```

## Future Improvements

1. **Diversified Dataset Expansion**: Currently, the `bliss.json` dataset consists of 967 pairs of conversions
between English and Bliss, focusing on specific ordering and sentence structures. To enhance the model's versatility,
a key improvement is to enrich the dataset with a wider variety of sentence types and structures.

2. **Comprehensive Model Evaluation**: The evaluation of the fine-tuned model is not comprehensive. While individual
converted sentences are assessed, there's a need for a more thorough evaluation method. This includes comparing the
expected and actual converted results using a percentage of the dataset, and assessing for underfitting or overfitting.
Considering the fine-tuning runs from 1 to 4 epochs on a small dataset, overfitting risks may increase with more
epochs, which reqires a robust evaluation process.

3. **Understanding Bliss Language**: The current fine-tuned model effectively responds to two fixed instructions,
converting between English and Bliss. However, it lacks a deep understanding of the Bliss language itself. The next
step involves fine-tuning a model that comprehends broader queries in Bliss, going beyond instructional conversion
tasks. Tests show that while the model performs well in converting Bliss to English, likely because of its extensive
knowledge of English. However, its performance in the reverse direction is not ideal. This difference suggests a need
for additional fine-tuning, potentially by enhancing the model's understanding of the unique linguistic features of
Bliss.

## Conclusion

Although the fine-tuning uses a fairly small dataset, the fine-tuned model performs pretty well in converting English
and Conceptual Bliss sentence structure, especially with the two-epochs and three-epochs models.

## References

* [Llama2 in the Facebook Research Github repository](https://github.com/facebookresearch/llama)
* [Llama2 fine-tune, inference examples](https://github.com/facebookresearch/llama-recipes)
* [Llama2 on Hugging Face](https://huggingface.co/docs/transformers/model_doc/llama2)
* [Use Hugging Face Models on Cedar Clusters](https://docs.alliancecan.ca/wiki/Huggingface)
* [Running Jobs](https://docs.alliancecan.ca/wiki/Using_GPUs_with_Slurm)
* [Request GPUs with Slurm](https://docs.alliancecan.ca/wiki/Using_GPUs_with_Slurm)
Loading

0 comments on commit 3f6e183

Please sign in to comment.