
Commit 9da9269

Update README.md
1 parent 09349fb

File tree: 1 file changed (+10, -5 lines)


README.md (+10, -5)
````diff
@@ -42,15 +42,20 @@ We have pushed the processed train set to huggingface:
 ### 3. Training
 
 1)
+
 ```bash
 BiLLM_START_INDEX=31 WANDB_MODE=disabled CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --nproc_per_node=4 --master_port=1234 train.py \
 --train_name_or_path SeanLee97/all_nli_angle_format_b \
 --save_dir ckpts/bellm-llama-7b-nli \
---model_name NousResearch/Llama-2-7b-hf \
---ibn_w 1.0 --cosine_w 0.0 --angle_w 0.0 --learning_rate 5e-4 --maxlen 60 \
---is_llm 1 --apply_lora 1 --lora_r 32 --lora_alpha 32 --lora_dropout 0.1 \
+--model_name NousResearch/Llama-2-7b-chat-hf \
+--prompt_template 'The representative word for sentence {text} is:"' \
+--pooling_strategy avg \
+--ibn_w 20.0 --cosine_w 0.0 --angle_w 1.0 --learning_rate 2e-4 --maxlen 60 \
+--apply_lora 1 --lora_r 64 --lora_alpha 128 --lora_dropout 0.1 \
+--is_llm 1 --apply_billm 1 --billm_model_class LlamaForCausalLM \
 --push_to_hub 0 \
---save_steps 200 --batch_size 256 --seed 42 --load_kbit 4 --gradient_accumulation_steps 4 --epochs 1 --fp16 1
+--logging_steps 5 --save_steps 50 --warmup_steps 80 --batch_size 256 --seed 42 --load_kbit 4 \
+--gradient_accumulation_steps 32 --epochs 3 --fp16 1
 ```
 
 If you want to push the model to HuggingFace automatically, you can add following extra arguments:
````
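As a quick way to validate the updated recipe before committing four GPUs to it, here is a minimal single-GPU sketch. Every flag is copied from the new command in the hunk above; only `CUDA_VISIBLE_DEVICES`, `--nproc_per_node`, `--save_dir` (hypothetical name), `--batch_size`, `--gradient_accumulation_steps`, and `--epochs` are scaled down, assuming `train.py` behaves the same when launched on a single process:

```bash
# Hypothetical smoke test: same flags as the updated README command,
# scaled down to one GPU and a short run. The save_dir name is made up.
BiLLM_START_INDEX=31 WANDB_MODE=disabled CUDA_VISIBLE_DEVICES=0 \
torchrun --nproc_per_node=1 --master_port=1234 train.py \
--train_name_or_path SeanLee97/all_nli_angle_format_b \
--save_dir ckpts/bellm-llama-7b-nli-smoke \
--model_name NousResearch/Llama-2-7b-chat-hf \
--prompt_template 'The representative word for sentence {text} is:"' \
--pooling_strategy avg \
--ibn_w 20.0 --cosine_w 0.0 --angle_w 1.0 --learning_rate 2e-4 --maxlen 60 \
--apply_lora 1 --lora_r 64 --lora_alpha 128 --lora_dropout 0.1 \
--is_llm 1 --apply_billm 1 --billm_model_class LlamaForCausalLM \
--push_to_hub 0 \
--logging_steps 5 --save_steps 50 --warmup_steps 80 --batch_size 32 --seed 42 --load_kbit 4 \
--gradient_accumulation_steps 8 --epochs 1 --fp16 1
```

If `--batch_size` is per device, the updated 4-GPU command sees 256 x 32 x 4 = 32,768 pairs per optimizer step, while the sketch above sees only 32 x 8 = 256, so a dry run finishes quickly. The second hunk, below, applies the same accumulation and epoch change to the other training command in the README.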
````diff
@@ -72,7 +77,7 @@ BiLLM_START_INDEX=31 WANDB_MODE=disabled CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun -
 --ibn_w 1.0 --cosine_w 0.0 --angle_w 0.0 --learning_rate 2e-4 --maxlen 60 \
 --is_llm 1 --apply_lora 1 --lora_r 32 --lora_alpha 32 --lora_dropout 0.1 \
 --push_to_hub 0 \
---save_steps 200 --batch_size 256 --seed 42 --load_kbit 4 --gradient_accumulation_steps 64 --epochs 1 --fp16 1
+--save_steps 200 --batch_size 256 --seed 42 --load_kbit 4 --gradient_accumulation_steps 32 --epochs 3 --fp16 1
 ```
 
````
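A note on the environment variable both recipes set: in the BiLLM package, `BiLLM_START_INDEX` gives the index of the first decoder layer whose attention is converted from causal to bidirectional. Llama-2-7B has 32 decoder layers (indices 0-31), so `BiLLM_START_INDEX=31` converts only the final layer. A smaller index, sketched below as an assumption rather than a setting tested in this commit, would convert more of the stack:

```bash
# Hypothetical variation: make the last five layers (27-31) bidirectional
# instead of only layer 31. Assumes BiLLM accepts any valid layer index.
export BiLLM_START_INDEX=27
```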