From 58c6ae8f990b1d68be3e85a0106ad0c2b2f07f81 Mon Sep 17 00:00:00 2001
From: sekyonda <127536312+sekyondaMeta@users.noreply.github.com>
Date: Tue, 22 Aug 2023 11:06:38 -0400
Subject: [PATCH 1/2] Update inference.md

Adding Llama 2 prompt information
---
 docs/inference.md | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/docs/inference.md b/docs/inference.md
index 67ee3dca6..6930efee9 100644
--- a/docs/inference.md
+++ b/docs/inference.md
@@ -33,11 +33,11 @@ Currently pad token by default in [HuggingFace Tokenizer is `None`](https://gith
 
 ```python
 tokenizer.add_special_tokens(
         {
-
+            "pad_token": "<PAD>",
         }
     )
-model.resize_token_embeddings(model.config.vocab_size + 1) 
+model.resize_token_embeddings(model.config.vocab_size + 1)
 ```
 Padding would be required for batch inference. In this [example](../inference/inference.py), batch size = 1, so padding is essentially not required. However, we added the code pointer as an example in case of batch inference.
 
@@ -69,7 +69,7 @@ In case you have fine-tuned your model with pure FSDP and saved the checkpoints
 This is helpful if you have fine-tuned your model using FSDP only as follows:
 
 ```bash
-torchrun --nnodes 1 --nproc_per_node 8 llama_finetuning.py --enable_fsdp --model_name /path_of_model_folder/7B --dist_checkpoint_root_folder model_checkpoints --dist_checkpoint_folder fine-tuned --pure_bf16 
+torchrun --nnodes 1 --nproc_per_node 8 llama_finetuning.py --enable_fsdp --model_name /path_of_model_folder/7B --dist_checkpoint_root_folder model_checkpoints --dist_checkpoint_folder fine-tuned --pure_bf16
 ```
 Then convert your FSDP checkpoint to HuggingFace checkpoints using:
 ```bash
@@ -82,10 +82,22 @@ By default, training parameter are saved in `train_params.yaml` in the path wher
 Then run inference using:
 
 ```bash
-python inference/inference.py --model_name <training_config.output_dir> --prompt_file <test_prompt_file> 
+python inference/inference.py --model_name <training_config.output_dir> --prompt_file <test_prompt_file>
 
 ```
 
+## Prompt Llama 2
+
+As outlined by [this blog by Hugging Face](https://huggingface.co/blog/llama2#how-to-prompt-llama-2), you can use the template below to prompt Llama 2. Review the [blog article](https://huggingface.co/blog/llama2#how-to-prompt-llama-2) for more information on how to prompt Llama 2.
+
+```
+[INST] <<SYS>>
+{{ system_prompt }}
+<</SYS>>
+
+{{ user_message }} [/INST]
+
+```
 ## Other Inference Options

From 1e0e4fb8a94c4642d216f884c7b7fe6573880780 Mon Sep 17 00:00:00 2001
From: sekyonda <127536312+sekyondaMeta@users.noreply.github.com>
Date: Mon, 28 Aug 2023 11:37:01 -0400
Subject: [PATCH 2/2] Update inference.md

---
 docs/inference.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/inference.md b/docs/inference.md
index 6930efee9..3563ceb4e 100644
--- a/docs/inference.md
+++ b/docs/inference.md
@@ -88,7 +88,7 @@ python inference/inference.py --model_name <training_config.output_dir> --prompt
 
 ## Prompt Llama 2
 
-As outlined by [this blog by Hugging Face](https://huggingface.co/blog/llama2#how-to-prompt-llama-2), you can use the template below to prompt Llama 2. Review the [blog article](https://huggingface.co/blog/llama2#how-to-prompt-llama-2) for more information on how to prompt Llama 2.
+As outlined by [this blog by Hugging Face](https://huggingface.co/blog/llama2#how-to-prompt-llama-2), you can use the template below to prompt Llama 2 chat models. Review the [blog article](https://huggingface.co/blog/llama2#how-to-prompt-llama-2) for more information.
 
 ```
 [INST] <<SYS>>
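For reference, the pad-token snippet touched by the first patch is shown there in isolation. Below is a minimal self-contained sketch of the same idea, assuming a Hugging Face Transformers model and tokenizer loaded from a converted checkpoint; the `AutoTokenizer`/`AutoModelForCausalLM` classes and the checkpoint path are assumptions for illustration, not part of the patch.

```python
# Minimal sketch: give a Llama 2 tokenizer a pad token before batch inference.
# The checkpoint path below is a hypothetical placeholder.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/hf_converted_llama_2"  # placeholder path

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

# The Llama tokenizer ships with no pad token, so add one, then grow the
# embedding matrix by one row so the new token id has an embedding.
tokenizer.add_special_tokens({"pad_token": "<PAD>"})
model.resize_token_embeddings(model.config.vocab_size + 1)

# Padding now works for batched prompts of unequal length, e.g.:
# batch = tokenizer(["Hi", "A longer prompt"], padding=True, return_tensors="pt")
```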
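The chat template these patches add can be filled in with plain string formatting. Below is a minimal sketch; the helper name and the example system and user strings are illustrative assumptions, while the `[INST]` and `<<SYS>>` markers come from the template itself (see the linked Hugging Face blog).

```python
# Minimal sketch: substitute the {{ system_prompt }} and {{ user_message }}
# slots of the Llama 2 chat template. The helper name is hypothetical.

def format_llama2_prompt(system_prompt: str, user_message: str) -> str:
    return (
        "[INST] <<SYS>>\n"
        f"{system_prompt}\n"
        "<</SYS>>\n\n"
        f"{user_message} [/INST]"
    )

print(format_llama2_prompt(
    "You are a helpful assistant.",
    "What padding is needed for batch inference?",
))
```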