Hi, I've tried to reproduce supervised contact prediction with the MSA Transformer, but with the limited details given in the "MSA Transformer" paper (Section 4.2), I only reached Top L -> 0.52 and Top L/5 -> 0.76 on CASP13-FM, versus 0.57 and 0.86, respectively, in your paper. So I have some questions, and I'd appreciate your answers.
(1) Which layer's output of the MSA Transformer did you use for training the ResNet?
(2) What is the number of input channels for the ResNet?
(3) Did you use the same MSA data that trRosetta used for supervised contact prediction (both train and test (CASP13, CAMEO)), or did you generate your own MSA data from the protein sequences?
(4) What is your MSA subsampling strategy for training the ResNet?
(5) What is your MSA subsampling strategy for testing on the CASP13-FM dataset?
(6) Did you mask the input tokens corresponding to missing coordinates in the protein structure when training supervised contact prediction, or did you only mask the final distogram when calculating the loss?
I would be very grateful if you could share any other details or tricks not mentioned in the paper that can help improve contact precision.
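For reference, the Top L and Top L/5 figures quoted above are usually computed along the lines of the sketch below. It assumes long-range pairs only (sequence separation of at least 24) and a precomputed boolean contact map. Neither choice is stated in this thread, and the function name `top_k_precision` is hypothetical.

```python
import numpy as np

def top_k_precision(probs, contacts, k, min_sep=24):
    """Precision of the k highest-scoring residue pairs with j - i >= min_sep.

    probs    : (L, L) array of predicted contact probabilities
    contacts : (L, L) boolean ground-truth contact map
    min_sep  : minimum sequence separation (24 is an assumed long-range cutoff)
    """
    L = probs.shape[0]
    # Upper-triangle pairs (i, j) with j - i >= min_sep.
    i, j = np.triu_indices(L, k=min_sep)
    # Indices of the k most confident pairs (stable sort keeps ties in order).
    order = np.argsort(-probs[i, j], kind="stable")[:k]
    # Fraction of those pairs that are true contacts.
    return float(contacts[i[order], j[order]].mean())
```

For a protein of length `L`, Top L corresponds to `k=L` and Top L/5 to `k=L // 5`.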
Replies: 1 comment
Hi! Thank you for your interest in our work!
(1 + 2) We used the output of the last repr_layer (embedding size 768). However, as noted in the supplement (A.13) of https://www.biorxiv.org/content/10.1101/622803v4.full.pdf , we project this into 128 dimensions. Thus, the number of input channels to the ResNet is 128 × 2 + 12 × 12 = 400.
(3) We used the same MSA data as trRosetta.
(4) Yes, we subsample as you have described. There may be a slight difference between a batch size of 1 and a large batch size; we haven't experimented extensively here. Note that trRosetta's setup uses a batch size of 1!
(5) Yes, it is as you have described.
(6) We only mask the final distogram when calculating the loss. No input tokens are masked.
I would also recommend collecting precision scores on the CAMEO-hard test set.
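The channel count in (1 + 2) can be checked with a small NumPy sketch. It reads the arithmetic as the 128-dimensional projections of residues i and j concatenated (128 × 2), plus one attention map per layer and head (12 × 12). That reading is an assumption rather than something confirmed in this thread. The random arrays stand in for real model outputs, and `pairwise_input` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(0)
L, d_model, d_proj = 50, 768, 128      # sequence length, embedding size, projected size
n_layers, n_heads = 12, 12             # assumed source of the 12 x 12 term

def pairwise_input(query_repr, attn_maps, proj):
    """Build an (L, L, C) ResNet input from per-residue embeddings and attention maps."""
    h = query_repr @ proj                                        # (L, 128)
    n = h.shape[0]
    row = np.broadcast_to(h[:, None, :], (n, n, h.shape[1]))     # features of residue i
    col = np.broadcast_to(h[None, :, :], (n, n, h.shape[1]))     # features of residue j
    attn = attn_maps.reshape(-1, n, n).transpose(1, 2, 0)        # (L, L, layers * heads)
    return np.concatenate([row, col, attn], axis=-1)

query_repr = rng.standard_normal((L, d_model))          # stand-in for the last-layer output
attn_maps = rng.random((n_layers, n_heads, L, L))       # stand-in for attention maps
proj = rng.standard_normal((d_model, d_proj)) * 0.02    # stand-in for a learned linear layer

features = pairwise_input(query_repr, attn_maps, proj)
print(features.shape)  # (50, 50, 400)
```

The last axis has 128 + 128 + 144 = 400 channels, matching the figure given in the reply.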
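The approach in (6), masking only the loss and leaving the input tokens untouched, could look roughly like the sketch below. It is a minimal NumPy illustration rather than the authors' training code. The function name `masked_distogram_loss`, the per-residue `valid` flag, and the bin layout are all assumptions.

```python
import numpy as np

def masked_distogram_loss(logits, target_bins, valid):
    """Mean cross-entropy over residue pairs whose coordinates are both resolved.

    logits      : (L, L, n_bins) raw distogram scores
    target_bins : (L, L) integer distance-bin index (any valid index at masked pairs)
    valid       : (L,) boolean, True where the residue has coordinates
    """
    # A pair contributes to the loss only if both residues are resolved.
    pair_mask = valid[:, None] & valid[None, :]
    # Numerically stable log-softmax over the bin axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    # Negative log-likelihood of the target bin for every pair.
    nll = -np.take_along_axis(log_probs, target_bins[..., None], axis=-1)[..., 0]
    # Average over unmasked pairs only.
    return float((nll * pair_mask).sum() / max(pair_mask.sum(), 1))
```

Because the mask is applied only when averaging the loss, the network still sees the full sequence as input, and pairs involving unresolved residues simply contribute no gradient.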