Chemical component identifier lost for unobserved non-standard residues #43

Open
josemduarte opened this issue Sep 3, 2019 · 2 comments

@josemduarte
Since MMTF stores the SEQRES groups as strings of one-letter codes, the chemical component ID of any non-standard residue that happens to be unobserved is lost. For example, chain E of 2X3T (a glycopeptide) contains several unobserved non-standard amino acids, and its sequence is represented like "KXXXXXXEX". For observed groups the chemical component identifier is recoverable from the ATOM information, but for unobserved groups it is not.
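
To make the loss concrete, here is a minimal Python sketch. It assumes a simplified encoding in which any component outside the 20 standard amino acids becomes "X"; the helper `to_one_letter` and both example sequences are hypothetical and are not taken from 2X3T.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def to_one_letter(comp_ids):
    """Encode component IDs as a one-letter string (assumed rule: unknown -> 'X')."""
    return "".join(THREE_TO_ONE.get(comp_id, "X") for comp_id in comp_ids)


# Two hypothetical SEQRES lists that differ only in their non-standard residues.
seq_a = ["LYS", "DAL", "MLE", "GLU", "DAL"]
seq_b = ["LYS", "MLE", "DAL", "GLU", "MLE"]

# Both collapse to the same string, so the original IDs cannot be recovered
# from the sequence alone.
print(to_one_letter(seq_a))
print(to_one_letter(seq_b))
```

If the non-standard residues were observed, their IDs could still be read from the corresponding groups; for unobserved ones the string is the only record.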

@josemduarte

A possible solution proposed by @pwrose is to store the full chemical component ID with the group data for unobserved residues here:
https://github.com/rcsb/mmtf/blob/master/spec.md#group-data

However, that requires either a new flag (observed yes/no) or making formalChargeList, elementList and atomNameList optional fields (they are currently required).

@gtauriello

Given that the group data in MMTF only lists the observed atoms, I would say that an unobserved residue could be represented by a group whose formalChargeList, atomNameList and elementList are 0-length arrays. I don't see a problem with those arrays being empty. At least the C++ decoder/encoder shouldn't have any issues with it.

Given that the fields are required, the arrays should always be written in the MMTF file, but there is no problem with writing 0-length arrays in msgpack.
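
As a rough illustration of the 0-length-array idea, the Python sketch below hand-encodes a group entry with a tiny msgpack subset (short strings, short arrays, small maps). The `pack` helper is written only for this example and is not part of any MMTF library. Using `groupName` to carry the chemical component ID is an assumption on my part, the value "DAL" is hypothetical, and the other group fields are left out for brevity.

```python
def pack(obj):
    """Encode a small subset of msgpack: fixstr, fixarray and fixmap only."""
    if isinstance(obj, str):
        data = obj.encode("utf-8")
        if len(data) >= 32:
            raise ValueError("string too long for fixstr")
        return bytes([0xA0 | len(data)]) + data
    if isinstance(obj, list):
        if len(obj) >= 16:
            raise ValueError("list too long for fixarray")
        return bytes([0x90 | len(obj)]) + b"".join(pack(item) for item in obj)
    if isinstance(obj, dict):
        if len(obj) >= 16:
            raise ValueError("dict too large for fixmap")
        body = b"".join(pack(key) + pack(value) for key, value in obj.items())
        return bytes([0x80 | len(obj)]) + body
    raise TypeError(f"unsupported type: {type(obj).__name__}")


# Hypothetical group entry for an unobserved residue: the component ID is kept,
# and the per-atom arrays are present but empty.
unobserved_group = {
    "groupName": "DAL",       # assumed location of the chemical component ID
    "formalChargeList": [],
    "atomNameList": [],
    "elementList": [],
}

packed = pack(unobserved_group)
print(pack([]).hex())  # a 0-length array is the single byte 0x90
print(len(packed), "bytes for the whole group entry")
```

Each empty array costs one byte, so keeping the fields required and writing them empty adds very little to the file.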
