From dfd9d71b18918bb6595ea3712ce0ae02b3f61303 Mon Sep 17 00:00:00 2001
From: mikkaatje
Date: Wed, 8 Nov 2023 22:22:24 +0100
Subject: [PATCH 1/2] Improved English README

Minor spelling and grammar fixes.
---
 finetune/README.md | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/finetune/README.md b/finetune/README.md
index 2f5b8866..acceb7c3 100644
--- a/finetune/README.md
+++ b/finetune/README.md
@@ -45,15 +45,15 @@ pip install torch==2.0.1 deepspeed==0.10 tensorboard transformers datasets sente
 
 ## Hardware Setup
 
-For Yi-6B model, a node with 4 GPUs, each has GPU mem larger than 60GB is recommended.
+For the Yi-6B model, a node with 4 GPUs, each with more than 60GB of GPU memory, is recommended.
 
-For Yi-34B model, because the usage of zero-offload technique takes a lot CPU mem, please be careful to limit the GPU numbers in 34B finetune training. Please use CUDA_VISIBLE_DEVICES to limit the GPU number (as shown in scripts/run_sft_Yi_34b.sh).
+For the Yi-34B model, because the zero-offload technique consumes a lot of CPU memory, please be careful to limit the number of GPUs used in 34B finetune training. Please use CUDA_VISIBLE_DEVICES to limit the number of GPUs (as shown in scripts/run_sft_Yi_34b.sh).
 
 A typical hardware setup for finetuning 34B model is a node with 8GPUS (limit to 4 in running by CUDA_VISIBLE_DEVICES=0,1,2,3), each has GPU mem larger than 80GB, with total CPU mem larger than 900GB.
 
 ## Quick Start
 
-Download a LLM-base model to MODEL_PATH (6B and 34B). A typical folder of model is like:
+Download an LLM base model to MODEL_PATH (6B and 34B). A typical model folder looks like:
 
 ```bash
 |-- $MODEL_PATH
@@ -80,7 +80,7 @@ Download a dataset from huggingface to local storage DATA_PATH, e.g. Dahoas/rm-s
 |   |-- README.md
 ```
 
-`finetune/yi_example_dataset` has example datasets, which is modified from [BAAI/COIG](https://huggingface.co/datasets/BAAI/COIG)
+`finetune/yi_example_dataset` has example datasets, which are modified from [BAAI/COIG](https://huggingface.co/datasets/BAAI/COIG)
 
 ```bash
 |-- $DATA_PATH
@@ -89,7 +89,7 @@ Download a dataset from huggingface to local storage DATA_PATH, e.g. Dahoas/rm-s
 |-- eval.jsonl
 ```
 
-`cd` into scripts folder, copy and paste the script and run. For example:
+`cd` into the scripts folder, copy and paste the script, and run. For example:
 
 ```bash
 cd finetune/scripts
@@ -97,9 +97,9 @@ cd finetune/scripts
 bash run_sft_Yi_6b.sh
 ```
 
-For Yi-6B base model, setting training_debug_steps=20 and num_train_epochs=4 can output a chat model, which takes about 20 minutes.
+For the Yi-6B base model, setting training_debug_steps=20 and num_train_epochs=4 can output a chat model, which takes about 20 minutes.
 
-For Yi-34B base model, it takes a relatively long time for initialization. Please be patient.
+For the Yi-34B base model, initialization takes a relatively long time. Please be patient.
 
 ## Evaluation
 
@@ -109,4 +109,4 @@ cd finetune/scripts
 bash run_eval.sh
 ```
 
-Then you'll see the answer from both base model and finetuned model
+Then you'll see the answers from both the base model and the finetuned model.

From 2e0b22ddc06975a381388935e64908ec303df000 Mon Sep 17 00:00:00 2001
From: mikkaatje
Date: Wed, 8 Nov 2023 22:26:11 +0100
Subject: [PATCH 2/2] Update README.md

---
 demo/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/demo/README.md b/demo/README.md
index 12ac31a0..8be5403e 100644
--- a/demo/README.md
+++ b/demo/README.md
@@ -13,7 +13,7 @@ python text_generation.py \
 
 You can also provide an extra `--prompt` argument to try some other prompts.
 
-When dealing with extreme long input sequence, you may need multiple GPU devices and to enable tensor parallelism acceleration during inference to avoid insufficient memory error.
+When dealing with extremely long input sequences, you may need multiple GPU devices and need to enable tensor parallelism acceleration during inference to avoid insufficient-memory errors.
 
 To run text generation task using tensor parallelism acceleration with 2 GPU devices:
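Both patches touch on the same operational advice: cap the GPUs visible to the 34B finetune run with CUDA_VISIBLE_DEVICES before launching the training script. A minimal shell sketch of that advice, assuming the script name from the README (this sketch is not part of either patch):

```shell
# Restrict the 34B finetune run to 4 of the node's GPUs, as the README
# recommends, to keep zero-offload's CPU-memory usage manageable.
export CUDA_VISIBLE_DEVICES=0,1,2,3

# Derive the visible-GPU count from the variable as a sanity check.
NUM_GPUS=$(echo "$CUDA_VISIBLE_DEVICES" | tr ',' '\n' | wc -l)
echo "visible GPUs: $NUM_GPUS"   # prints: visible GPUs: 4

# From finetune/scripts, one would then launch (commented out here):
# bash run_sft_Yi_34b.sh
```

Processes launched from this shell will only see devices 0-3, regardless of how many GPUs the node physically has.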