From 58c6ae8f990b1d68be3e85a0106ad0c2b2f07f81 Mon Sep 17 00:00:00 2001
From: sekyonda <127536312+sekyondaMeta@users.noreply.github.com>
Date: Tue, 22 Aug 2023 11:06:38 -0400
Subject: [PATCH 1/2] Update inference.md

Adding Llama 2 prompt information
---
 docs/inference.md | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/docs/inference.md b/docs/inference.md
index 67ee3dca6..6930efee9 100644
--- a/docs/inference.md
+++ b/docs/inference.md
@@ -33,11 +33,11 @@ Currently pad token by default in [HuggingFace Tokenizer is `None`](https://gith
 
 ```python
 tokenizer.add_special_tokens(
         {
-
+            "pad_token": "<PAD>",
         }
     )
-model.resize_token_embeddings(model.config.vocab_size + 1) 
+model.resize_token_embeddings(model.config.vocab_size + 1)
 ```
 Padding would be required for batch inference. In this [example](../inference/inference.py), batch size = 1, so padding is essentially not required. However, we added the code pointer as an example in case of batch inference.
 
@@ -69,7 +69,7 @@ In case you have fine-tuned your model with pure FSDP and saved the checkpoints
 This is helpful if you have fine-tuned your model using FSDP only as follows:
 
 ```bash
-torchrun --nnodes 1 --nproc_per_node 8 llama_finetuning.py --enable_fsdp --model_name /path_of_model_folder/7B --dist_checkpoint_root_folder model_checkpoints --dist_checkpoint_folder fine-tuned --pure_bf16 
+torchrun --nnodes 1 --nproc_per_node 8 llama_finetuning.py --enable_fsdp --model_name /path_of_model_folder/7B --dist_checkpoint_root_folder model_checkpoints --dist_checkpoint_folder fine-tuned --pure_bf16
 ```
 Then convert your FSDP checkpoint to HuggingFace checkpoints using:
 ```bash
@@ -82,10 +82,22 @@ By default, training parameter are saved in `train_params.yaml` in the path wher
 Then run inference using:
 
 ```bash
-python inference/inference.py --model_name <training_config.output_dir> --prompt_file <test_prompt_file> 
+python inference/inference.py --model_name <training_config.output_dir> --prompt_file <test_prompt_file>
 
 ```
 
+## Prompt Llama 2
+
+As outlined by [this blog by Hugging Face](https://huggingface.co/blog/llama2#how-to-prompt-llama-2), you can use the template below to prompt Llama 2. Review the [blog article](https://huggingface.co/blog/llama2#how-to-prompt-llama-2) for more information on how to prompt Llama 2.
+
+```
+[INST] <<SYS>>
+{{ system_prompt }}
+<</SYS>>
+
+{{ user_message }} [/INST]
+
+```
 ## Other Inference Options

From 1e0e4fb8a94c4642d216f884c7b7fe6573880780 Mon Sep 17 00:00:00 2001
From: sekyonda <127536312+sekyondaMeta@users.noreply.github.com>
Date: Mon, 28 Aug 2023 11:37:01 -0400
Subject: [PATCH 2/2] Update inference.md

---
 docs/inference.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/inference.md b/docs/inference.md
index 6930efee9..3563ceb4e 100644
--- a/docs/inference.md
+++ b/docs/inference.md
@@ -88,7 +88,7 @@ python inference/inference.py --model_name <training_config.output_dir> --prompt
 
 ## Prompt Llama 2
 
-As outlined by [this blog by Hugging Face](https://huggingface.co/blog/llama2#how-to-prompt-llama-2), you can use the template below to prompt Llama 2. Review the [blog article](https://huggingface.co/blog/llama2#how-to-prompt-llama-2) for more information on how to prompt Llama 2.
+As outlined by [this blog by Hugging Face](https://huggingface.co/blog/llama2#how-to-prompt-llama-2), you can use the template below to prompt Llama 2 chat models. Review the [blog article](https://huggingface.co/blog/llama2#how-to-prompt-llama-2) for more information.
 
 ```
 [INST] <<SYS>>
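For reference, the pad-token snippet touched by the first patch is shown there in isolation. Below is a minimal self-contained sketch of the same idea, assuming a Hugging Face Transformers model and tokenizer loaded from a converted checkpoint; the `AutoTokenizer`/`AutoModelForCausalLM` classes and the checkpoint path are assumptions for illustration, not part of the patch.

```python
# Minimal sketch: give a Llama 2 tokenizer a pad token before batch inference.
# The checkpoint path below is a hypothetical placeholder.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/hf_converted_llama_2"  # placeholder path

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

# The Llama tokenizer ships with no pad token, so add one, then grow the
# embedding matrix by one row so the new token id has an embedding.
tokenizer.add_special_tokens({"pad_token": "<PAD>"})
model.resize_token_embeddings(model.config.vocab_size + 1)

# Padding now works for batched prompts of unequal length, e.g.:
# batch = tokenizer(["Hi", "A longer prompt"], padding=True, return_tensors="pt")
```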
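The chat template these patches add can be filled in with plain string formatting. Below is a minimal sketch; the helper name and the example system and user strings are illustrative assumptions, while the `[INST]` and `<<SYS>>` markers come from the template itself (see the linked Hugging Face blog).

```python
# Minimal sketch: substitute the {{ system_prompt }} and {{ user_message }}
# slots of the Llama 2 chat template. The helper name is hypothetical.

def format_llama2_prompt(system_prompt: str, user_message: str) -> str:
    return (
        "[INST] <<SYS>>\n"
        f"{system_prompt}\n"
        "<</SYS>>\n\n"
        f"{user_message} [/INST]"
    )

print(format_llama2_prompt(
    "You are a helpful assistant.",
    "What padding is needed for batch inference?",
))
```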