Chemical component identifier lost for unobserved non-standard residues #43

josemduarte · 2019-09-03T00:08:38Z

Since MMTF stores the SEQRES groups as strings of 1-letter codes, the chemical component ID of any non-standard residue that happens to be unobserved is lost. E.g. 2X3T chain E (a glycopeptide) contains several unobserved non-standard amino acids, so its sequence is represented as "KXXXXXXEX". For groups that are observed, the chemical component identifier is recoverable from the ATOM information, but not for those that are unobserved.
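To illustrate the loss, here is a minimal Python sketch of the 1-letter encoding. The `to_one_letter` helper and the two example peptides are hypothetical (they are not the actual 2X3T chain E residues), and the sketch assumes every non-standard component maps to "X".

```python
# Standard amino acids only; in this sketch anything else becomes "X".
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def to_one_letter(comp_ids):
    """Encode a list of chemical component IDs as a 1-letter string."""
    return "".join(THREE_TO_ONE.get(comp_id, "X") for comp_id in comp_ids)


# Illustrative sequences, NOT taken from 2X3T.
peptide_a = ["LYS", "DAL", "MLE", "GLU"]
peptide_b = ["LYS", "ORN", "SAR", "GLU"]

# Two different peptides collapse to the same string, so the original
# component IDs cannot be recovered from the sequence alone.
assert to_one_letter(peptide_a) == to_one_letter(peptide_b) == "KXXE"
```

If the "X" residues are observed, their IDs can still be read from the corresponding group data; if they are unobserved, nothing else in the file records them.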

josemduarte · 2019-09-03T17:13:50Z

A possible solution proposed by @pwrose is to store the full chemical component ID with the group data for unobserved residues here:
https://github.com/rcsb/mmtf/blob/master/spec.md#group-data

However, that requires either a new "observed" (y/n) flag or making formalChargeList, elementList and atomNameList optional fields (they are currently required).

gtauriello · 2019-09-16T13:51:33Z

Given that the group data in MMTF only lists the observed atoms, I would say that an unobserved residue could be represented with a group which has 0-length arrays formalChargeList, atomNameList and elementList. I don't see a problem with those arrays being empty. At least the C++ decoder/encoder shouldn't have any issues with it.

Given that the fields are required, the arrays should always be written in the MMTF file, but there is no problem with writing 0-length arrays in msgpack.
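As a rough sketch of the 0-length-array idea, an unobserved residue could be written as a group entry like the one below. The field names follow the group data section of the spec as I understand it; the helpers `make_unobserved_group` and `is_unobserved` and the default values are hypothetical, and how such a group would be referenced from the chain is left open here.

```python
def make_unobserved_group(comp_id, one_letter="X",
                          chem_comp_type="PEPTIDE LINKING"):
    """Build a group entry that keeps the component ID but has no atoms.

    All required list fields are present but empty (assumption: this is
    how an unobserved residue would be marked).
    """
    return {
        "groupName": comp_id,          # full chemical component ID is kept
        "singleLetterCode": one_letter,
        "chemCompType": chem_comp_type,
        "atomNameList": [],            # no observed atoms
        "elementList": [],
        "formalChargeList": [],
        "bondAtomList": [],
        "bondOrderList": [],
    }


def is_unobserved(group):
    """Treat a group with no atoms as unobserved (no extra flag needed)."""
    return len(group["atomNameList"]) == 0


group = make_unobserved_group("DAL")
assert is_unobserved(group)
assert group["groupName"] == "DAL"
```

With this convention no new flag is needed and the fields stay required: a reader checks the length of atomNameList to tell observed from unobserved groups.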

josemduarte mentioned this issue Sep 3, 2019

Bioassembly fails for a structure with null seqres groups (MMTF only) biojava/biojava#792

Closed
