Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Protein unfolds during hybrid ML/MM simulations #34

Open
victorprincipe opened this issue May 11, 2021 · 8 comments
Open

Protein unfolds during hybrid ML/MM simulations #34

victorprincipe opened this issue May 11, 2021 · 8 comments
Labels
wontfix This will not be worked on

Comments

@victorprincipe
Copy link

Hi,

I'm trying to run hybrid ML/MM simulations of a protein-ligand system in which the ML model is used for the ligand and a handful of binding site residues. However, certain parts of the protein binding site unfold very quickly, whereas this did not happen when the ML model was used on only the ligand. I was just generally wondering if these ML models can be used for only certain atoms within the same macromolecule, or if they can (currently) only be used for the entirety of a molecule?
Here is a part of the code for the production simulation:

from __future__ import print_function
from simtk.openmm import *
from simtk.openmm.app import *
from parmed.openmm import *
from simtk import unit
from sys import stdout
import sys
from openmmml import MLPotential

def production(prmtop_file, crd_file, positions, velocities):
    prmtop = AmberPrmtopFile(prmtop_file)
    inpcrd = AmberInpcrdFile(crd_file)
    #use ANI-2x potential for ML system
    potential = MLPotential('ani2x')
    #index ligand atoms and binding site residue atoms in the solvated Tyk2+ligand system
    ml_atoms=[atom.index for atom in prmtop.topology.atoms() if atom.residue.id== "289"
             or atom.residue.id== "14"
             or atom.residue.id== "95"
             or atom.residue.id== "138"
             or atom.residue.id== "151"
             or atom.residue.id== "92"
             or atom.residue.id== "94"
             or atom.residue.id== "152"
             or atom.residue.id== "139"
             or atom.residue.id== "22"
             or atom.residue.id== "89"
             or atom.residue.id== "12"
             or atom.residue.id== "24"
             or atom.residue.id== "91"
             or atom.residue.id== "93"
             or atom.residue.id== "15"
             or atom.residue.id== "141"
             or atom.residue.id== "90"
             or atom.residue.id== "71"
             or atom.residue.id== "16"
             or atom.residue.id== "17"
             or atom.residue.id== "39"
             or atom.residue.id== "96"]
    #create traditional MM system from topology file
    mm_system = prmtop.createSystem(nonbondedMethod=PME,
    nonbondedCutoff=1.0*unit.nanometers, constraints=HBonds, rigidWater=True,
    ewaldErrorTolerance=0.0005)
    #apply ANI-2x ML potential to the ligand atoms
    ml_system = potential.createMixedSystem(prmtop.topology, mm_system, ml_atoms)
    # NPT for production:
    ml_system.addForce(MonteCarloBarostat(1*unit.atmospheres, 300*unit.kelvin, 25))
@peastman
Copy link
Member

In principle it should work for a subset of a molecule, though I haven't tried doing that. One thing to be aware of is that ANI only supports neutral molecules. If any of the residues you're using it for has a charged side chain, it won't work correctly.

@victorprincipe
Copy link
Author

Ah that'll be it! I used ANI on a charged Arginine residue so that's probably where it went wrong. I'll try again without using ANI on that residue and see what happens. Thanks!

@raimis
Copy link
Contributor

raimis commented May 12, 2021

Currently createMixedSystem uses the mechanical embedding, where ML and MM regions interact only via non-bonded terms. This means that the peptide bonds of all the residues included into the ML region are cut (https://github.com/openmm/openmm-ml/blob/4f0546678eeb9b2703d28d3fce7f678802640701/openmmml/mlpotential.py#L236).

In principle, it would possible to adapt one of QM/MM schemes to handle the covalent bonds between the different regions, but it isn't trivial. Check these reviews for inspiration (https://link.springer.com/article/10.1007/s00214-006-0143-z, http://iopenshell.usc.edu/chem545/lectures2011/QMMM_Thiel_Review_2009.pdf).

@raimis
Copy link
Contributor

raimis commented May 12, 2021

@peastman it would make sense createMixedSystem to throw an exception if there is a bond between the ML and MM regions.

@peastman
Copy link
Member

I don't think that's quite correct. A bond, angle, or torsion only gets removed if all atoms it involves are in the ML region. Bonds that connect an MM atom to an ML atom are preserved. That said, I don't think this will really give correct results unless your ML potential was specifically parametrized for it. Especially with the torsions, you'll get both ML and MM terms contributing to the same torsion which isn't correct.

For the moment, I think it's safest to stick to only using ML for entire molecules.

@victorprincipe
Copy link
Author

Thanks for the suggestions. I've just finished running simulations using ANI on only uncharged binding site residues but the protein still unfolded throughout the simulation. As @peastman said, I guess ML can only be used for entire molecules at the moment.

@peastman
Copy link
Member

Hopefully we'll eventually be able to go beyond that. But I think it will require potentials that are specifically designed for it.

@raimis
Copy link
Contributor

raimis commented Dec 14, 2021

This is more issues for https://github.com/openmm/openmm-ml. @peastman could you move it there?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
wontfix This will not be worked on
Projects
None yet
Development

No branches or pull requests

3 participants