extract and save embeddings for multiple sequence alignment based on msa transformer #92
Can you please suggest how to extract and save embeddings for an MSA with MSA Transformer, in the same manner as extract.py? That is, to batch-extract a single embedding for each FASTA file containing an MSA, rather than one embedding per sequence within a single MSA file.
Replies: 1 comment 1 reply
The internal state of MSA Transformer is M x L x d (MSA size x sequence length x embedding dim). Typically you want the MSA to produce sequence-level features that summarize all MSA information, and taking the final layer's embedding of the first (typically query) sequence gives good results. You could also try (weighted) averaging over the whole MSA, but we didn't see much difference.
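To make the two options concrete, here is a minimal sketch rather than an official recipe. The helper names (`query_embedding`, `averaged_embedding`, `embed_msa`) and the output file name are hypothetical. `embed_msa` assumes the `model` and `batch_converter` objects are obtained as in the ESM README (for example from `esm.pretrained.esm_msa1b_t12_100M_UR50S()` and `alphabet.get_batch_converter()`), that `msa` is a list of `(label, aligned_sequence)` pairs of equal length, and that the first token column is a beginning-of-sequence token to drop. The pooling helpers only need NumPy and work on any M x L x d array.

```python
import numpy as np


def query_embedding(msa_repr):
    """Return the L x d embedding of the first (query) row of an M x L x d array."""
    msa_repr = np.asarray(msa_repr)
    if msa_repr.ndim != 3:
        raise ValueError("expected an array of shape (M, L, d)")
    return msa_repr[0]


def averaged_embedding(msa_repr, weights=None):
    """Average over the M sequences; `weights` is an optional length-M vector."""
    msa_repr = np.asarray(msa_repr)
    if msa_repr.ndim != 3:
        raise ValueError("expected an array of shape (M, L, d)")
    # With weights=None this is a plain mean over the MSA axis.
    return np.average(msa_repr, axis=0, weights=weights)


def embed_msa(msa, model, batch_converter, layer=12):
    """Hypothetical helper: run one MSA through the model, return an M x L x d array.

    Assumes `model` and `batch_converter` come from the ESM package as shown in
    its README, and that `layer` is the final layer of the loaded model.
    """
    import torch  # imported here so the pooling helpers work without torch

    _, _, tokens = batch_converter([msa])  # assumed shape: (1, M, L + 1)
    with torch.no_grad():
        out = model(tokens, repr_layers=[layer])
    # Drop the batch axis and the assumed leading start-token column.
    return out["representations"][layer][0, :, 1:].cpu().numpy()


if __name__ == "__main__":
    # Stand-in for a real model output: M=4 sequences, L=10 positions, d=8 dims.
    toy = np.random.default_rng(0).normal(size=(4, 10, 8))
    emb = query_embedding(toy)          # one L x d embedding for the whole MSA
    np.save("example_msa.npy", emb)     # hypothetical output name, one file per MSA
```

For batch extraction, one would loop over the FASTA files, call `embed_msa` on each parsed MSA, pool with one of the two helpers, and save one `.npy` file per input. If a single fixed-length vector per MSA is needed, taking `emb.mean(axis=0)` over the L positions is one possible choice, though that is my assumption and not something the reply above recommends.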