
Some questions associated with pre-training #160


The following answer is from @rmrao, via email:

I don’t think the esm repo has this code, but it was trained using the fairseq repo at github.com/pytorch/fairseq. I also have a version in a personal repo: https://github.com/rmrao/evo/blob/ac86e2b8a6f78d5e3a0b8bf47a978a12735ca8c4/evo/dataset.py#L653.

Yes, max_positions is the maximum length of a sequence.
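
For illustration, here is a minimal sketch of how a length cap like `max_positions` is typically applied when preparing data; the helper below is hypothetical, not the actual fairseq or esm code:

```python
MAX_POSITIONS = 1024  # assumed value, matching the 1024-token cap mentioned below

def fits_model(token_ids, max_positions=MAX_POSITIONS):
    """Return True if a tokenized sequence fits within the model's maximum length."""
    return len(token_ids) <= max_positions

# Example: keep only sequences the model can consume in full.
dataset = [[1, 2, 3], list(range(2000))]
usable = [seq for seq in dataset if fits_model(seq)]  # drops the 2000-token sequence
```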

It was trained for a total of ~500,000 updates, with an approximate batch size of 512 sequences (the batch size was actually based on the number of tokens, with a maximum of 1024 tokens per sequence, so a total of 512 * 1024 tokens per update).
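
As a back-of-the-envelope check, and to show what batching by token count rather than by sequence count means in practice, here is a hedged Python sketch; the batching function is a simplified stand-in, not fairseq's actual implementation:

```python
# Training-scale arithmetic from the numbers quoted above.
updates = 500_000
tokens_per_update = 512 * 1024            # ~512 sequences x 1024 tokens each
print(f"{tokens_per_update:,} tokens per update")      # 524,288
print(f"{updates * tokens_per_update:,} tokens total") # 262,144,000,000 (~2.6e11)

def token_batches(sequences, max_tokens=512 * 1024):
    """Greedily group sequences into batches whose combined token count
    stays under max_tokens (batching by tokens, not by sequence count)."""
    batch, batch_tokens = [], 0
    for seq in sequences:
        if batch and batch_tokens + len(seq) > max_tokens:
            yield batch
            batch, batch_tokens = [], 0
        batch.append(seq)
        batch_tokens += len(seq)
    if batch:
        yield batch
```

In fairseq this kind of token budget corresponds to the `--max-tokens` option; real implementations also account for padding, so a batch's effective cost is roughly its longest sequence times the number of sequences in it.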

Hope that helps!
