Failed to read molecules from AMBER files #137

SofiaBariami · 2019-12-02T19:23:00Z

Hello,
I am trying to use combine.py to a system of a protein and a ligand, but I am having this error:

OSError: Failed to read molecules from: ['QUBE_pro.prm7', 'QUBE_pro.rst7']

I got the prm, rst files by using the followin script, that reads pdb/xml files and prints prm/rst ones:

from simtk.openmm.app import * 
from simtk.openmm import * 
from simtk.unit import * 
from sys import stdout 
import parmed as pmd
from parmed.openmm import XmlFile

pdbfile = PDBFile('QUBE_pro.pdb')
ff = ForceField("QUBE_pro.xml")
system = ff.createSystem( pdbfile.topology)
#Swap in positions
integrator = openmm.VerletIntegrator(1.0 )
context = openmm.Context(system, integrator)
context.setPositions(pdbfile.positions)
# Create parmed Structure
structure = parmed.openmm.topsystem.load_topology( pdbfile.topology, system, pdbfile.positions)
# Write AMBER parameter/crd
structure.save('system.prmtop', overwrite=True)
structure.save('system.crd', format='rst7', overwrite=True)

I couldn't upload the files on GitHub, so you can find them through onedrive with the following link:
https://uoe-my.sharepoint.com/:f:/g/personal/s1786388_ed_ac_uk/EubzxglqUZVFvlioJKhJD3UBYhy0428TuMRRBiqcVsAcjQ?e=E8ysnb

Have you seen this problem before, or do you have any ideas on how to solve it?
Thanks a lot!

The text was updated successfully, but these errors were encountered:

lohedges · 2019-12-02T20:01:18Z

Hi @SofiaBariami, I was online and happened to have a spare few minutes so took a quick look at this. For reference, this is how you can see the full Sire error trace in BioSimSpace:

import BioSimSpace as BSS

# Turn on verbose error messages.
BSS.setVerbose(True)

# Try to read the molecules.
s = BSS.IO.readMolecules(["Qube_pro.rst7", "Qube_pro.prm7"])

This gives the following error:

UserWarning: Exception 'SireIO::parse_error' thrown by the thread 'master'.
Atoms in residue 1 are not listed as part of the expected molecule 1. Should be atoms 1 to 6107, but molecule has [ 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207,208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,223,224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239,240,241,242,243,244,245,246,247,248,249,250,251,252,253,254,255,256,257,258,259,260,261,262,263,264,265,266,267,268,269,270,271,272,273,274,275,276,277,278,279,280,281,282,283,284,285,286,287,288,289,290,291,292,293,294,295,296,297,298,299,300,301,302,303,304,305,306,307,308,309,310,311,312,313,314,315,316,317,318,319,320,321,322,323,324,325,326,327,328,329,330,331,332,333,334,335,336,337,338,339,340,341,342,343,344,345,346,347,348,349,350,351,352,353,354,355,356,357,358,359,360,361,362,363,364,365,366,367,368,369,370,371,372,373,374,375,376,377,378,379,380,381,382,383,384,385,386,387,388,389,390,391,392,393,394,395,396,397,398,399,400,401,402,403,404,405,406,407,408,409,410,411,412,413,414,415,416,417,418,419,420,421,422,423,424,425,426,427,428,429,430,431,432,433,434,435,436,437,438,439,440,441,442,443,444,445,446,447,448,449,450,451,452,453,454,455,456,457,458,459,460,461,462,463,464,465,466,467,468,469,470,471,472,473,474,475,476,477,478,479,480,481,482,483,484,485,486,487,488,489,490,491,492,493,494,495,496,497,498,499,500,501,502,503,504,505,506,507,508,509,510,511,512,513,514,515 ].
Thrown from FILE: /Users/runner/runners/2.160.0/work/1/s/corelib/src/libs/SireIO/amberprm.cpp, LINE: 3783, FUNCTION: SireMol::MolStructureEditor SireIO::AmberPrm::getMolStructure(int, const SireBase::PropertyName &) const
__Backtrace__
(  0) libSireError.2019.2.1.dylib (0x000000010a0f515b +59)
  -- SireError::getBackTrace()

It looks like there is an inconsistency with the information in the files. Can Parmed read the files back in successfully?

SofiaBariami · 2019-12-03T10:21:42Z

Hi Lester, thanks for looking at it. By taking a closer look to the protein files, I saw that all the residues of the protein are called "QUP", so probably this is what is wrong. Both the pdb and the prm files are like that. Is there a way to overcome this problem using the property map or I should find the file with the correct residues?

SofiaBariami · 2019-12-03T11:16:54Z

For the record, parmed is able to load back the prm7 files as a system made of a single residue

In [1]: import parmed as pmd                                                                                                                                            
In [2]: pmd.load_file("QUBE_pro.prm7", "QUBE_pro.rst7")                                                                                                                 
Out[2]: <AmberParm 6107 atoms; 1 residues; 6192 bonds; parametrized>
In [3]:

jmichel80 · 2019-12-03T13:07:53Z

A quick update. The issue was that the input pdb actually contains several molecules, so the Sire parser presumably bails out because the connectivity appears incomplete. parmed doesn't care about these sanity checks and proceeds. @SofiaBariami is testing a workaround.

lohedges · 2019-12-03T14:14:28Z

I didn't try reading the PDB file that Sofia uploaded, the issue was with the prm7/rst7 files generated by Parmed. As mentioned previously, we don't rely on CONECT records since they are rarely present and often incomplete. If you want to represent a multi molecule system with a PDB file then place single TER records between the molecules. If these are present Sire will split the system into separate molecules. If not, you'll just end up with a single molecule. Cheers

…

On Tue, 3 Dec 2019, 13:07 Julien Michel, ***@***.***> wrote: A quick update. The issue was that the input pdb actually contains several molecules, so the Sire parser presumably bails out because the connectivity appears incomplete. parmed doesn't care about these sanity checks and proceeds. @SofiaBariami <https://github.com/SofiaBariami> is testing a workaround. — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#137?email_source=notifications&email_token=AAE6K3L6HKS2KGVR3L5XRR3QWZK2TA5CNFSM4JUCHQA2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEFZJWLY#issuecomment-561158959>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAE6K3IAEAECNYRWLQDZF3DQWZK2TANCNFSM4JUCHQAQ> .

jmichel80 · 2019-12-04T10:53:04Z

Good point, @SofiaBariami another workaround may be to split the pdb file with TER statements, that may give you a prm7 file that combine.py can read.

SofiaBariami · 2019-12-04T11:06:10Z

I did that but I get this Value error:
ValueError: No template found for residue 1 (QUP). This might mean your input topology is missing some atoms or bonds, or possibly that you are using the wrong force field
This is probably because it is assumed that QUP residue has 6107 atoms, but there are only 515 before the first TER

SofiaBariami · 2019-12-09T18:11:50Z

Hello, an update on this issue:

By including TER records between the protein fragments, the residues (all named QUP) were not recognised: ValueError: No template found for residue 1 (QUP). This might mean your input topology is missing some atoms or bonds, or possibly that you are using the wrong force field
By including TER records between the protein fragments, changing the names of the residues , and also splitting the xml file into residues with different name that correspond to the pdb ones, I am getting a list index out of range error.

~/sire_test.app/lib/python3.7/site-packages/simtk/openmm/app/forcefield.py in addBond(self, atom1, atom2)
    608             """Add a bond between two atoms in a template given their indices in the template."""
    609             self.bonds.append((atom1, atom2))
--> 610             self.atoms[atom1].bondedTo.append(atom2)
    611             self.atoms[atom2].bondedTo.append(atom1)

I tried to debug the simtk code that is found at forcefield.py, and I found that the program doesn't read all the bond entries of the second fragment, even if the atom names and types are all read successfully. More specifically, the output that I am getting is the following:

...
atomname =  H5024
atomtype =  QUBE_5023
atomname =  H5025
atomtype =  QUBE_5024
atomname =  H5026
atomtype =  QUBE_5025
from, to = 519 516
from, to = 519 520
from, to = 516 515
from, to = 516 517
from, to = 516 518
.
.
.
from, to = 4483 4485
from, to = 4477 4476
from, to = 4489 4488
from, to = 4490 4488
from, to = 4491 4493

It stops at the bond between atoms 4491 and 4493. The next bond-pair is <Bond from="4513" to="4495"/> . I checked the pdb file (with VMD as well), and the specific connection exists. Also, at the XML file, we can find parameters about this bond: <Bond class1="C4513" class2="C4495" k="166657.088" length=".1523"/>, so there isn't a typo. Any ideas on that?
Here you can find the pdb and xml files. Thanks a lot!

lohedges mentioned this issue Jan 8, 2020

Slow performance of combine.py #139

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Failed to read molecules from AMBER files #137

Failed to read molecules from AMBER files #137

SofiaBariami commented Dec 2, 2019

lohedges commented Dec 2, 2019

SofiaBariami commented Dec 3, 2019

SofiaBariami commented Dec 3, 2019

jmichel80 commented Dec 3, 2019

lohedges commented Dec 3, 2019 via email

jmichel80 commented Dec 4, 2019

SofiaBariami commented Dec 4, 2019 •

edited

Loading

SofiaBariami commented Dec 9, 2019

Failed to read molecules from AMBER files #137

Failed to read molecules from AMBER files #137

Comments

SofiaBariami commented Dec 2, 2019

lohedges commented Dec 2, 2019

SofiaBariami commented Dec 3, 2019

SofiaBariami commented Dec 3, 2019

jmichel80 commented Dec 3, 2019

lohedges commented Dec 3, 2019 via email

jmichel80 commented Dec 4, 2019

SofiaBariami commented Dec 4, 2019 • edited Loading

SofiaBariami commented Dec 9, 2019

SofiaBariami commented Dec 4, 2019 •

edited

Loading