
<eos> token/depth of MSAs (MSA Transformer) #89

Answered by tomsercu
AlexanderKroll asked this question in Q&A

Hi Alexander!

  1. Correct, each individual sample contains 2^14 tokens and fills up the GPU. Batch size 512 is achieved with distributed data parallel and by accumulating the gradients of several forward/backward passes into a single update (fairseq flag `--update-freq`); see the sketch after this list.
  2. No particular reason; it should be completely irrelevant for the results.
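
A minimal sketch of the gradient-accumulation idea from point 1, written in plain PyTorch rather than fairseq. The model, dimensions, and `update_freq` value are illustrative assumptions, not the actual ESM/MSA Transformer training code; the point is only that one optimizer step aggregates several forward/backward passes, which is what fairseq's `--update-freq` does per GPU.

```python
import torch
from torch import nn

# Hypothetical stand-ins for the real model and data (assumptions, not ESM code).
model = nn.Linear(768, 33)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

update_freq = 8      # analogous to fairseq's --update-freq
per_gpu_batch = 1    # a single large sample already fills GPU memory
# effective batch size = per_gpu_batch * update_freq * number_of_GPUs

def fake_batch():
    x = torch.randn(per_gpu_batch, 768)
    y = torch.randint(0, 33, (per_gpu_batch,))
    return x, y

optimizer.zero_grad()
for step in range(32):
    x, y = fake_batch()
    # Scale the loss so the accumulated gradient is an average over the big batch.
    loss = loss_fn(model(x), y) / update_freq
    loss.backward()                      # gradients accumulate in .grad buffers
    if (step + 1) % update_freq == 0:
        optimizer.step()                 # one parameter update per accumulated batch
        optimizer.zero_grad()
```

With DistributedDataParallel across multiple GPUs, each rank runs this same loop and gradients are averaged across ranks, which is how a nominal batch size of 512 is reached even though each GPU holds only one sample at a time.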

Answer selected by AlexanderKroll