# Summary

## Training commands

To train on your own input file, first run the command below to tokenize the text in `input.txt` and generate the corresponding token files (`train.bin` and `val.bin`). By default these are written under the `custom` folder; move or copy them into the `custom_char` folder before training.

```
python data/custom/prepare.py input.txt
```
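Under the hood, a character-level prepare script does little more than map each unique character to an integer id, encode the text, and split the encoded stream into training and validation portions. The sketch below illustrates the idea with plain Python lists; the actual `prepare.py` additionally writes the ids to `train.bin` and `val.bin` as binary arrays, and the `prepare` function name here is purely illustrative:

```python
def prepare(text, val_fraction=0.1):
    # Build the character vocabulary: one integer id per unique character.
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}
    # Encode the whole text as a list of ids.
    ids = [stoi[ch] for ch in text]
    # Hold out the tail of the stream for validation.
    n = int(len(ids) * (1 - val_fraction))
    return ids[:n], ids[n:], stoi

train_ids, val_ids, vocab = prepare("hello world, hello again")
```

The real script stores the ids as `uint16` so the `.bin` files stay compact; the split point (here 10% held out) is a configurable detail.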

Then run the following command to start training. Adjust the parameters as needed.

```
python train.py config/train_custom.py --device=cuda --compile=False --eval_iters=20 --log_interval=1 --block_size=64 --batch_size=12 --n_layer=4 --n_head=4 --n_embd=128 --max_iters=2000 --lr_decay_iters=2000 --dropout=0.0
```
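The `--block_size` and `--batch_size` flags determine the shape of each training batch: `block_size` tokens of context per example, `batch_size` examples per optimization step, with targets shifted one position for next-token prediction. A simplified sketch of how such a batch is sampled from the token stream (the training script's actual sampler returns tensors on the configured device; `get_batch` here is illustrative):

```python
import random

def get_batch(data, block_size=64, batch_size=12):
    # Pick batch_size random starting offsets into the token stream.
    ix = [random.randrange(len(data) - block_size) for _ in range(batch_size)]
    # x: contiguous windows of block_size ids; y: the same windows
    # shifted one position forward (the next-token targets).
    x = [data[i:i + block_size] for i in ix]
    y = [data[i + 1:i + 1 + block_size] for i in ix]
    return x, y
```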

## Generation commands

After the training process finishes, run this command to obtain samples of generated text:

```
python sample.py --out_dir=out-custom-char
```
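Each generation step converts the model's output logits into a probability distribution and draws the next token id from it, appending the result and repeating. A minimal sketch of that per-step sampling, assuming a softmax with a temperature knob (`sample_next` is an illustrative helper, not the script's API):

```python
import math
import random

def sample_next(logits, temperature=0.8):
    # Lower temperature sharpens the distribution; higher flattens it.
    scaled = [l / temperature for l in logits]
    # Numerically stable softmax: subtract the max before exponentiating.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Draw one token id according to the probabilities.
    return random.choices(range(len(probs)), weights=probs)[0]
```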

## Calculating metrics

To calculate BLEU and ROUGE scores, run the following command (this may take a while):

```
python metrics.py abstractsCLEAN.txt out/filename.txt
```
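ROUGE-L is based on the longest common subsequence (LCS) between the reference and the generated text. A self-contained sketch of the F1 variant, assuming whitespace tokenization (the repo's `metrics.py` may compute it differently; `rouge_l_f1` is an illustrative name):

```python
def rouge_l_f1(reference, candidate):
    ref = reference.split()
    cand = candidate.split()
    # Dynamic-programming table for longest-common-subsequence length.
    dp = [[0] * (len(cand) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref)):
        for j in range(len(cand)):
            if ref[i] == cand[j]:
                dp[i + 1][j + 1] = dp[i][j] + 1
            else:
                dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i + 1][j])
    lcs = dp[len(ref)][len(cand)]
    if lcs == 0:
        return 0.0
    # F1 combines LCS-based precision and recall.
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)
```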

## Results

The tables below record run times for different parameter combinations, along with samples of the generated text. The BLEU-4 score and the ROUGE scores (2-gram and longest common subsequence) reflect how similar the generated text is to the reference input, on a scale of 0 to 1.

Varying the number of iterations (block size 64):

| Iterations | Block Size | Time | Result | BLEU-4 | ROUGE-2 | ROUGE-L |
|---|---|---|---|---|---|---|
| 2000 | 64 | 1:45 | output 1 | 0.145 | 0.135 | 0.301 |
| 4000 | 64 | 3:24 | output 2 | 0.174 | 0.135 | 0.325 |
| 10000 | 64 | 8:16 | output 3 | 0.244 | 0.172 | 0.331 |
| 40000 | 64 | 32:25 | output 4 | 0.164 | 0.180 | 0.328 |
Varying the block size (10000 iterations):

| Iterations | Block Size | Time | Result | BLEU-4 | ROUGE-2 | ROUGE-L |
|---|---|---|---|---|---|---|
| 10000 | 64 | 8:16 | output 5 | 0.244 | 0.172 | 0.331 |
| 10000 | 128 | 14:20 | output 6 | 0.202 | 0.151 | 0.310 |
| 10000 | 256 | 59:00 | output 7 | 0.169 | 0.144 | 0.307 |