Commit

update readme about num_epoch and mems=None move to first seq step
graykode committed Jul 3, 2019
1 parent ce5d7fa commit cb793a1
Showing 2 changed files with 3 additions and 4 deletions.
5 changes: 2 additions & 3 deletions README.md
@@ -15,15 +15,14 @@ $ pip install pytorch_pretrained_bert
$ python main.py --data ./data.txt --tokenizer bert-base-uncased \
--seq_len 512 --reuse_len 256 --perm_size 256 \
--bi_data True --mask_alpha 6 --mask_beta 1 \
- --num_predict 85 --mem_len 384 --num_step 100
+ --num_predict 85 --mem_len 384 --num_epoch 100
```

Also, You can run code in [Google Colab](https://colab.research.google.com/github/graykode/xlnet-Pytorch/blob/master/XLNet.ipynb) easily.

- Hyperparameters for Pretraining in Paper.

<p align="center"><img width="300" src="images/hyperparameters.png" /> </p>

#### Option

- `—data`(String) : `.txt` file to train. It doesn't matter multiline text. Also, one file will be one batch tensor. Default : `data.txt`
@@ -37,7 +36,7 @@ Also, You can run code in [Google Colab](https://colab.research.google.com/githu
- `—mask_beta`(Integer) : How many tokens to mask within each group. Default : `1`
- `—num_predict`(Interger) : Num of tokens to predict. In Paper, it mean Partial Prediction. Default : `85`
- `—mem_len`(Interger) : Number of steps to cache in Transformer-XL Architecture. Default : `384`
- - `number_step`(Interger) : Number of Step(Epoch). Default : `100`
+ - `num_epoch`(Interger) : Number of Epoch. Default : `100`
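
The renamed option could be declared roughly as follows. This is a minimal sketch that assumes the script reads its flags with `argparse`; the parser itself is not part of this diff, so `build_parser` and the help text are hypothetical, and only the flag name and the default of `100` come from the README lines above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical helper: the real parser in main.py is not shown in this diff.
    parser = argparse.ArgumentParser(description="XLNet pretraining (sketch)")
    # Flag name and default are taken from the README; the int type is assumed.
    parser.add_argument("--num_epoch", type=int, default=100,
                        help="Number of epochs to train for.")
    return parser


# Parse an explicit argument list so the sketch runs without a command line.
args = build_parser().parse_args(["--num_epoch", "3"])
print(args.num_epoch)
```

With a parser like this, scripts that still pass the old `--num_step` flag would fail at parse time, so callers would need to switch to `--num_epoch`.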



2 changes: 1 addition & 1 deletion main.py
@@ -67,9 +67,9 @@

  criterion = nn.CrossEntropyLoss()
  optimizer = optim.Adam(model.parameters(), lr=0.001, weight_decay=0.01)
- mems = None

  for num_epoch in range(args.num_epoch):
+     mems = None

      features = data_utils._create_data(sp=sp,
                                         input_paths=args.data,
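
The effect of placing `mems = None` inside the epoch loop rather than before it can be illustrated with a toy stand-in for the model. This sketch is not the repository's code: `toy_forward`, `run`, and the list-based memory are hypothetical, and the inner loop over segments is an assumption about how the cache is consumed. It only shows that a per-epoch reset keeps cached state from one epoch out of the next.

```python
from typing import List, Optional, Tuple


def toy_forward(segment: List[int], mems: Optional[List[int]],
                mem_len: int = 4) -> Tuple[int, List[int]]:
    """Hypothetical stand-in for a Transformer-XL style forward pass.

    Returns how many cached tokens were visible, plus the updated memory.
    """
    cached = mems if mems is not None else []
    visible = len(cached)
    # Keep only the most recent `mem_len` tokens, as a fixed-size cache would.
    new_mems = (cached + segment)[-mem_len:]
    return visible, new_mems


def run(num_epoch: int, segments: List[List[int]],
        reset_each_epoch: bool) -> List[int]:
    """Record the memory size seen by the first segment of every epoch."""
    seen_at_epoch_start = []
    mems = None  # initialised once, before the loop
    for _ in range(num_epoch):
        if reset_each_epoch:
            mems = None  # cleared at the start of each epoch
        for i, segment in enumerate(segments):
            visible, mems = toy_forward(segment, mems)
            if i == 0:
                seen_at_epoch_start.append(visible)
    return seen_at_epoch_start


segments = [[1, 2, 3], [4, 5, 6]]
print(run(2, segments, reset_each_epoch=False))  # [0, 4]
print(run(2, segments, reset_each_epoch=True))   # [0, 0]
```

Without the reset, the first segment of the second epoch sees four cached tokens left over from the end of the first epoch; with the reset, every epoch starts from an empty memory.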
