Replies: 2 comments
-
Did you find a way to enable dropout?
-
The paper mentions:
An alternative way to achieve what you are after is to randomly mask the sequence multiple times and make predictions on the different randomized inputs.
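The repeated-masking idea can be sketched roughly as follows. This is a minimal illustration in plain Python, not code from the ESM repository: `random_mask_pattern`, `sample_mask_patterns`, the 15% mask fraction, and the example sequence are all assumptions chosen for the example. It only generates binary mask patterns (1 = masked); how a pattern is passed to the model depends on the model's API and is left out here.

```python
import random


def random_mask_pattern(length, mask_fraction=0.15, rng=None):
    """Return a list of 0/1 flags with roughly mask_fraction of positions set to 1."""
    rng = rng or random.Random()
    # Mask at least one position, but never more than the sequence length.
    n_mask = min(length, max(1, round(mask_fraction * length)))
    masked = set(rng.sample(range(length), n_mask))
    return [1 if i in masked else 0 for i in range(length)]


def sample_mask_patterns(sequence, n_samples=8, mask_fraction=0.15, seed=0):
    """Draw n_samples independent mask patterns for one sequence (seeded for reproducibility)."""
    rng = random.Random(seed)
    return [
        random_mask_pattern(len(sequence), mask_fraction, rng)
        for _ in range(n_samples)
    ]


# Placeholder sequence, used only to show the masks.
sequence = "ACDEFGHIKLMNPQRSTVWYACDEFGHIKLMNPQRSTVWY"
for pattern in sample_mask_patterns(sequence, n_samples=4):
    # Show masked positions as "_" for a quick visual check.
    print("".join("_" if m else aa for aa, m in zip(sequence, pattern)))
```

Each pattern would then drive one forward pass, and the resulting predictions would be collected into an ensemble.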
-
Hi ESM Team,
I want to produce multiple structures with the internal dropout layers active, in order to build a distribution of structures (drawing on Gal and Ghahramani, 2015).
Currently I have just modified the example in the README, switching the model from `eval` to `train`. However, from inspecting the code, I am concerned that this only activates the dropout in the ESMFold portion, not in ESM. I believe the remaining dropout in ESM2 is in ESM2 -> TransformerLayer -> MultiheadAttention, but it is set to zero.
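One way to approach this without forking might be to leave the model in eval mode and switch on only the dropout-related modules after loading. The sketch below uses toy stand-ins (`ToyAttention`, `ToyModel`, and `enable_mc_dropout` are hypothetical names, not ESM classes) and assumes that the attention block stores its dropout rate as a plain float attribute and applies it via `torch.nn.functional.dropout`. Whether ESM2's attention follows that pattern should be checked against the source.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyAttention(nn.Module):
    """Stand-in for an attention block that keeps its dropout rate as a float."""

    def __init__(self, dim, dropout=0.0):
        super().__init__()
        self.proj = nn.Linear(dim, dim)
        self.dropout = dropout  # plain float, not an nn.Dropout module

    def forward(self, x):
        # Functional dropout is only active while this module's training flag is True.
        return F.dropout(self.proj(x), p=self.dropout, training=self.training)


class ToyModel(nn.Module):
    def __init__(self, dim=8):
        super().__init__()
        self.attn = ToyAttention(dim)
        self.head = nn.Sequential(nn.Linear(dim, dim), nn.Dropout(0.1))

    def forward(self, x):
        return self.head(self.attn(x))


def enable_mc_dropout(model, attention_p=0.1):
    """Put the model in eval mode, then re-enable only the dropout behaviour."""
    model.eval()
    for module in model.modules():
        if isinstance(module, nn.Dropout):
            module.train()
        elif isinstance(module, ToyAttention):
            module.dropout = attention_p
            # Set the flag directly so child modules stay in eval mode.
            module.training = True
    return model


model = enable_mc_dropout(ToyModel())
x = torch.ones(1, 8)
with torch.no_grad():
    # Repeated stochastic forward passes give a distribution of outputs.
    samples = torch.stack([model(x) for _ in range(16)])
print(samples.std(dim=0))
```

Keeping the rest of the model in eval mode means that any layers whose behaviour differs between training and inference are left untouched; only dropout becomes stochastic.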
I understand how I could change this by forking the repo, but how would I load the saved weights into my slightly adjusted model?
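If the only adjustment is to dropout rates, loading the saved weights should not need special handling, since dropout layers hold no parameters or buffers and so the `state_dict` keys stay the same. A minimal sketch with a toy `nn.Sequential` (the `build_model` helper is hypothetical and stands in for the original and the forked model constructors):

```python
import io

import torch
import torch.nn as nn


def build_model(dropout_p):
    """Toy network; only the dropout rate differs between variants."""
    return nn.Sequential(
        nn.Linear(8, 16),
        nn.ReLU(),
        nn.Dropout(dropout_p),
        nn.Linear(16, 4),
    )


# "Released" model with dropout disabled; save its weights to an in-memory file.
original = build_model(0.0)
buffer = io.BytesIO()
torch.save(original.state_dict(), buffer)
buffer.seek(0)

# "Forked" model with dropout enabled; the same checkpoint loads with strict=True.
adjusted = build_model(0.1)
result = adjusted.load_state_dict(torch.load(buffer), strict=True)
print(result)
```

If the fork also added or renamed parameters, `strict=True` would raise an error, and the reported missing and unexpected keys would show what needs mapping.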
Thanks for your help,
Finn