How can I use ESMFold with ESM2 dropout layers active at inference? #495

FinnOD · 2023-03-06T14:50:06Z


Hi ESM Team,

I want to produce multiple structures with the internal dropout layers active in order to build a distribution of structures (Drawing from Gal and Ghahramani, 2015).

Currently I have just modified the example in the readme from eval to train like this:

import torch
import esm

model = esm.pretrained.esmfold_v1()
model = model.train().cuda()

However, from inspecting the code, I am concerned that this only activates dropout in the ESMFold portion of the model, not in ESM2. I believe the only remaining dropout in ESM2 is in ESM2 -> TransformerLayer -> MultiheadAttention, and it is set to zero.
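One way to check this concern without reading the source is to walk the module tree and list every dropout rate it exposes. The sketch below runs on a toy model, not on ESMFold. `report_dropout` and `ToyAttention` are hypothetical names, and the idea that an attention layer keeps its rate as a plain float attribute called `dropout` is an assumption about the implementation.

```python
import torch
from torch import nn


def report_dropout(model: nn.Module) -> list[tuple[str, float]]:
    """Return (module name, dropout probability) pairs found in the model."""
    found = []
    for name, module in model.named_modules():
        if isinstance(module, nn.Dropout):
            found.append((name, module.p))
        else:
            # Assumption: some layers store the rate as a float attribute
            # named `dropout` instead of owning an nn.Dropout submodule.
            p = getattr(module, "dropout", None)
            if isinstance(p, float):
                found.append((name, p))
    return found


class ToyAttention(nn.Module):
    """Hypothetical stand-in for an attention layer with a float dropout rate."""

    def __init__(self, dropout: float = 0.0):
        super().__init__()
        self.dropout = dropout
        self.proj = nn.Linear(4, 4)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return nn.functional.dropout(self.proj(x), p=self.dropout, training=self.training)


toy = nn.Sequential(ToyAttention(dropout=0.0), nn.Dropout(p=0.1))
rates = report_dropout(toy)
```

Running the same helper on the loaded model would show which submodules report a nonzero rate, provided they follow one of the two conventions checked here.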

I understand how I could change this by forking the repo, but how would I load the saved weights into my slightly adjusted model?
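On the weight-loading question, a general PyTorch fact may help: dropout layers have no parameters, so a model that differs only in its dropout rate has the same `state_dict` keys and shapes as the original. The sketch below shows this on a toy network. `make_block` is a hypothetical helper, and how the released ESM checkpoints are stored on disk is not covered here.

```python
import torch
from torch import nn


def make_block(p: float) -> nn.Sequential:
    # Hypothetical stand-in for a network whose only change is the dropout rate
    return nn.Sequential(nn.Linear(8, 8), nn.Dropout(p), nn.Linear(8, 2))


original = make_block(p=0.0)  # plays the role of the released model
modified = make_block(p=0.1)  # same layers, dropout switched on

# Dropout contributes no entries to the state dict, so the keys line up exactly
missing, unexpected = modified.load_state_dict(original.state_dict())

weights_match = all(
    torch.equal(a, b)
    for a, b in zip(original.state_dict().values(), modified.state_dict().values())
)
```

If the pattern carries over, a forked model with nonzero dropout could take the weights of the unmodified model through the same `load_state_dict` call.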

Thanks for your help,
Finn

flix42 · 2023-03-28T12:46:42Z


Did you find a way to enable dropout?
I'm trying to get some variability from ESM2, but neither skipping eval() nor explicitly calling train() seems to work.
The last line in the code snippet always returns True.

model, alphabet = esm.pretrained.esm2_t33_650M_UR50D()
[...]
# no model.eval()
model.train()
[...]
with torch.no_grad():
    results_dropout = [model(batch_tokens, repr_layers=[33], return_contacts=True) for i in range(3)]

(results_dropout[0]["representations"][33] == results_dropout[2]["representations"][33]).all()
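The behaviour is what PyTorch does whenever the dropout rate is zero: training mode alone adds no randomness. Below is a minimal sketch on a toy layer, not on ESM2. The layer sizes and the rate of 0.5 are arbitrary choices for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)
layer = nn.Sequential(nn.Linear(16, 16), nn.Dropout(p=0.0))
layer.train()  # training mode, but the dropout rate is zero
x = torch.randn(4, 16)

with torch.no_grad():
    # Two passes agree exactly because nothing is ever dropped
    same = torch.equal(layer(x), layer(x))

# Raising the rate on the existing module makes repeated passes differ
layer[1].p = 0.5
with torch.no_grad():
    differs = not torch.equal(layer(x), layer(x))
```

So identical outputs under `model.train()` are consistent with every dropout rate in the model being zero.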


tomsercu · 2023-04-03T03:09:00Z


The paper mentions:

In ESM-2, we have made multiple small modifications to ESM-1b with the goal of increasing the effective capacity. ESM-1b had dropout both in hidden layers and attention which we removed completely to free up more capacity. In our experiments, we did not observe any significant performance regressions with this change.

An alternative way to get a distribution of predictions is to randomly mask the sequence multiple times and make predictions on the different randomized inputs.
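A minimal sketch of the input side of this idea, using only the standard library. `random_mask` is a hypothetical helper, the 15% mask fraction and the example sequence are arbitrary, and `<mask>` is assumed to be the mask token the model's alphabet expects. How the masked inputs are then passed to ESM2 or ESMFold is not shown.

```python
import random


def random_mask(sequence: str, mask_fraction: float, rng: random.Random,
                mask_token: str = "<mask>") -> str:
    """Replace a random subset of residues with the mask token."""
    n_mask = max(1, round(mask_fraction * len(sequence)))
    positions = set(rng.sample(range(len(sequence)), n_mask))
    return "".join(mask_token if i in positions else aa
                   for i, aa in enumerate(sequence))


rng = random.Random(0)  # seeded so the set of variants is reproducible
sequence = "MKTVRQERLKSIVRILERSKEPVSGAQLAEELSVSRQVIVQDIAYLRSLGYNIVATPRGYVLAGG"

# Each variant hides a different random subset of positions
variants = [random_mask(sequence, 0.15, rng) for _ in range(5)]
```

Each variant would be predicted separately, and the spread across the predictions stands in for the distribution that dropout was meant to provide.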
