This repository has been archived by the owner on Aug 1, 2024. It is now read-only.
Reduce protein embedding dimensions #612
Unanswered
lincoln-harris
asked this question in
Q&A
Replies: 1 comment 2 replies
-
@lincoln-harris did you try applying a dimensionality reduction method (PCA, UMAP, or t-SNE) to the original embeddings before feeding them to your network? That should meet your requirements.
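For illustration, here is a minimal sketch of the PCA option, reducing 1280-dimensional per-sequence embeddings to 32 dimensions with plain NumPy. The helper names `fit_pca` and `apply_pca` are hypothetical, and the random arrays are only placeholders for real ESM-2 embeddings.

```python
import numpy as np


def fit_pca(embeddings, n_components):
    """Return the feature mean and the top principal directions (hypothetical helper)."""
    mean = embeddings.mean(axis=0)
    centered = embeddings - mean
    # With full_matrices=False, the rows of vt are orthonormal principal
    # directions, ordered by decreasing explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]


def apply_pca(embeddings, mean, components):
    """Project embeddings onto the fitted principal directions (hypothetical helper)."""
    return (embeddings - mean) @ components.T


# Placeholder data standing in for per-sequence ESM-2 embeddings (assumption).
rng = np.random.default_rng(0)
train_embeddings = rng.normal(size=(200, 1280))
new_embeddings = rng.normal(size=(5, 1280))

mean, components = fit_pca(train_embeddings, n_components=32)
train_reduced = apply_pca(train_embeddings, mean, components)  # shape (200, 32)
new_reduced = apply_pca(new_embeddings, mean, components)      # shape (5, 32)
```

The projection is fitted once on the training embeddings and then re-applied unchanged to new proteins, so the downstream network always sees the same 32-dimensional space. Note that `n_components` cannot exceed the number of training sequences or the embedding dimension.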
-
I'm using the ESM-2 model to generate embeddings for proteins. I'm following the instructions in the README: batching the data, generating per-residue representations, and averaging those to get per-sequence representations. Is there a way to generate an embedding vector for a protein with fewer than 1280 dimensions? I have a small deep neural network that may struggle to learn linear-layer parameters for such high-dimensional vectors. Getting, say, 32-dimensional protein embeddings from ESM-2 would be super useful to me.
Thanks!
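For context, the per-residue to per-sequence step described above can be sketched as a mean over residue positions. This is a sketch under assumptions, not the README's exact code: it assumes each row holds a start token at position 0, then the residues, then an end token and padding. The function name `mean_pool` and the toy array are hypothetical.

```python
import numpy as np


def mean_pool(token_reprs, seq_lens):
    """Average per-residue vectors into one vector per sequence.

    token_reprs: array of shape (batch, max_tokens, dim)
    seq_lens: number of residues in each sequence (special tokens excluded)
    """
    pooled = [
        # Skip the assumed start token at index 0 and stop before the
        # end token and padding.
        token_reprs[i, 1 : n + 1].mean(axis=0)
        for i, n in enumerate(seq_lens)
    ]
    return np.stack(pooled)


# Toy stand-in for model output: 2 sequences, 6 token slots, 4 dimensions.
token_reprs = np.full((2, 6, 4), 100.0)  # 100.0 marks special/padding slots
token_reprs[0, 1:4] = 1.0                # sequence 0 has 3 residues
token_reprs[1, 1:3] = 2.0                # sequence 1 has 2 residues

per_sequence = mean_pool(token_reprs, seq_lens=[3, 2])  # shape (2, 4)
```

The resulting per-sequence matrix is the input that any dimensionality reduction step would operate on.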