To train on your own input file, first run the command below to tokenize input.txt and generate the corresponding token files (train.bin and val.bin). By default these are written under the custom folder; move or copy them into the custom_char folder.
python data/custom/prepare.py input.txt
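For reference, the prepare step amounts to building a character-level vocabulary and writing the encoded tokens as 16-bit integers. This is a minimal stdlib-only sketch of that idea, not the actual data/custom/prepare.py (which, following nanoGPT's conventions, also saves the vocabulary mapping in a meta.pkl); the 90/10 split and uint16 encoding are assumptions based on those conventions:

```python
import sys
from array import array

def prepare(path):
    # Read the raw text and build a character-level vocabulary.
    text = open(path, encoding="utf-8").read()
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}

    # Encode every character as an integer token id.
    ids = [stoi[ch] for ch in text]

    # 90/10 train/validation split, written as unsigned 16-bit ints
    # (matching the .bin layout train.py expects).
    n = int(0.9 * len(ids))
    for name, split in (("train.bin", ids[:n]), ("val.bin", ids[n:])):
        with open(name, "wb") as f:
            array("H", split).tofile(f)
    return len(chars), len(ids)

if __name__ == "__main__" and len(sys.argv) > 1:
    vocab_size, n_tokens = prepare(sys.argv[1])
    print(f"vocab size: {vocab_size}, tokens: {n_tokens}")
```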
Then, run this command to start training. Adjust the parameters as needed.
python train.py config/train_custom.py --device=cuda --compile=False --eval_iters=20 --log_interval=1 --block_size=64 --batch_size=12 --n_layer=4 --n_head=4 --n_embd=128 --max_iters=2000 --lr_decay_iters=2000 --dropout=0.0
After training finishes, run this command to sample generated text:
python sample.py --out_dir=out-custom-char
To calculate BLEU and ROUGE scores, run the following command (this may take a while):
python metrics.py abstractsCLEAN.txt out/filename.txt
The tables below record the run time for different parameter combinations, along with samples of the generated text. The BLEU-4 score and the ROUGE scores (2-gram and longest common subsequence) reflect how similar the generated text is to the reference input, on a scale from 0 to 1.
Varying the iteration count (block size fixed at 64):

Iterations | Block Size | Time | Result | BLEU-4 | ROUGE-2 | ROUGE-L |
---|---|---|---|---|---|---|
2000 | 64 | 1:45 | output 1 | 0.145 | 0.135 | 0.301 |
4000 | 64 | 3:24 | output 2 | 0.174 | 0.135 | 0.325 |
10000 | 64 | 8:16 | output 3 | 0.244 | 0.172 | 0.331 |
40000 | 64 | 32:25 | output 4 | 0.164 | 0.180 | 0.328 |
Varying the block size (iterations fixed at 10000):

Iterations | Block Size | Time | Result | BLEU-4 | ROUGE-2 | ROUGE-L |
---|---|---|---|---|---|---|
10000 | 64 | 8:16 | output 5 | 0.244 | 0.172 | 0.331 |
10000 | 128 | 14:20 | output 6 | 0.202 | 0.151 | 0.310 |
10000 | 256 | 59:00 | output 7 | 0.169 | 0.144 | 0.307 |
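The ROUGE-L column above is driven by the longest common subsequence (LCS) between the generated text and the reference. The sketch below shows the standard F-measure form of that metric; the word-level tokenization and function name are assumptions, and the actual metrics.py may tokenize and aggregate differently:

```python
def rouge_l(candidate, reference):
    """ROUGE-L F1 between two whitespace-tokenized strings."""
    c, r = candidate.split(), reference.split()
    # Dynamic-programming table: dp[i][j] = LCS length of c[:i] and r[:j].
    dp = [[0] * (len(r) + 1) for _ in range(len(c) + 1)]
    for i, cw in enumerate(c):
        for j, rw in enumerate(r):
            if cw == rw:
                dp[i + 1][j + 1] = dp[i][j] + 1
            else:
                dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i + 1][j])
    lcs = dp[len(c)][len(r)]
    if lcs == 0:
        return 0.0
    precision = lcs / len(c)  # fraction of candidate words in the LCS
    recall = lcs / len(r)     # fraction of reference words in the LCS
    return 2 * precision * recall / (precision + recall)
```

A score of 1.0 means the candidate reproduces the reference word-for-word; unrelated texts score near 0.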