Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TorchForce inside CustomCVForce fails with nnpops implementation #32

Closed
dominicrufa opened this issue Jun 10, 2022 · 16 comments · Fixed by openmm/openmm-torch#78
Closed

Comments

@dominicrufa
Copy link
Contributor

dominicrufa commented Jun 10, 2022

I have been trying to create a minimally-modified implementation to get nnpops working on a conda-installable openmm-ml environment, and i am encountering an error in TestMLPotential.py with the implementation.

If you drop in/replace the anipotential.py with this and run the test, (making sure to set the platform to CUDA), you should see that the test fails on line 26,

======================================================================
ERROR: testCreateMixedSystem (__main__.TestMLPotential)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/lila/home/rufad/github/openmm-ml/test/TestMLPotential.py", line 26, in testCreateMixedSystem
    interpEnergy1 = interpContext.getState(getEnergy=True).getPotentialEnergy().value_in_unit(unit.kilojoules_per_mole)
  File "/home/rufad/miniconda3/envs/omm_dev2/lib/python3.9/site-packages/openmm/openmm.py", line 12318, in getState
    state = _openmm.Context_getState(self, types, enforcePeriodicBox, groups_mask)
openmm.OpenMMException: Error invoking kernel: CUDA_ERROR_INVALID_HANDLE (400)

meaning that the createMixedSystem(..., interpolate=False) is compatible with nnpops, but interpolate=True is not.
I have found a way around this (by taking the TorchForce out of the customCVForce and adding a scale parameter to the ANIForce manually), but I don't want to open a PR attempting to merge my modifications unless the compatibility issue with nnpops can't be fixed an easier way.

@dominicrufa
Copy link
Contributor Author

@peastman , @raimis: turns out ^this is actually the crux of this issue

@peastman
Copy link
Member

This may also be related to openmm/openmm#3588.

@dominicrufa
Copy link
Contributor Author

in the interest of getting openmm-ml working with nnpops, i think the easiest thing would be to just extract the TorchForce from the customCV (as in my fork) and run with it until a more stable solution is found. is that agreeable?

@dominicrufa
Copy link
Contributor Author

@peastman, have you found a stable workaround for this issue, or should we proceed with removing the TorchForce from the CVForce?

@dominicrufa
Copy link
Contributor Author

tagging @jchodera

@peastman
Copy link
Member

Let me make one more effort at trying to figure it out. I suspect this is related to openmm/openmm#3588, and most especially to openmm/openmm#3588 (comment). If I can't figure out a proper solution, we can use your suggested workaround.

@jchodera
Copy link
Member

@dominicrufa : Can you test the PR here?
openmm/openmm-torch#78

@dominicrufa
Copy link
Contributor Author

@peastman , I am unable to source install the PR and make it compatible with the rest of the conda-installed bits needed for the nnpops test. if you can source install it locally, can you run the test i mentioned at the top of this issue?

@peastman
Copy link
Member

It runs without problem now. It looks like that fixes it!

@dominicrufa
Copy link
Contributor Author

@peastman , awesome! can you merge the PR and make it conda-installable? i can add a PR with the nnpops implementation for anipotential.py here and modify the test to test for the torchani and nnpops implementation if you'd like.

@raimis
Copy link

raimis commented Jun 16, 2022

@peastman could reopen this. I don't permissions.

@peastman peastman reopened this Jun 16, 2022
@peastman
Copy link
Member

I just added you as a maintainer. You should have full permissions now.

@raimis
Copy link

raimis commented Jun 16, 2022

@dominicrufa I can reproduce the issue and created a minimal test openmm/openmm-torch@63a49c1.

@dominicrufa
Copy link
Contributor Author

@raimis , great! let me know if/when i can conda-install the fix to write the openmm-ml tests.

@jchodera
Copy link
Member

@dominicrufa : Can you confirm the latest release fixes this? If so, can you close this?

@dominicrufa
Copy link
Contributor Author

yes, this is fixed. closing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants