This repository has been archived by the owner on Aug 10, 2021. It is now read-only.

Compound assignment feature request #143

Open
samseaver opened this issue Mar 6, 2012 · 2 comments

Comments

@samseaver
Contributor

As we work with teams to manually curate their metabolic models, we will find compounds that were erroneously merged, and others that should have been merged but were not. Sugars are a common culprit here because of stereoisomerism.

What I need, as we move forward and discover these, is a way of either merging or splitting compounds. A merge would be straightforward, as all aliases and reactions would be merged, but a split is a little more complex because one must make sure that the correct InChI strings and synonyms go with the correct compound.

In addition, all reactions in the database and in the models that use these compounds must be updated. If a compound is split into two compounds, then care must be taken to ask whether the new reactions that emerge actually exist.
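
To make the merge case concrete, here is a minimal sketch, assuming a compound is a record with an id, a name and a set of aliases, and a reaction maps compound ids to stoichiometric coefficients. The `Compound` and `Reaction` classes, the `merge_compounds` function and the ids used with them are hypothetical illustrations, not part of the existing code base.

```python
from dataclasses import dataclass, field


@dataclass
class Compound:
    id: str
    name: str
    aliases: set = field(default_factory=set)


@dataclass
class Reaction:
    id: str
    # compound id -> stoichiometric coefficient (negative means consumed)
    stoichiometry: dict = field(default_factory=dict)


def merge_compounds(keep, drop, reactions):
    """Fold `drop` into `keep` and repoint every reaction that used `drop`.

    Returns the ids of the reactions that were changed, so a curator can
    review them afterwards.
    """
    # Assumption: the dropped compound's id and name survive as aliases.
    keep.aliases |= drop.aliases | {drop.id, drop.name}
    touched = []
    for rxn in reactions:
        if drop.id in rxn.stoichiometry:
            coeff = rxn.stoichiometry.pop(drop.id)
            # Add to any existing coefficient in case both compounds
            # already appeared in the same reaction.
            rxn.stoichiometry[keep.id] = rxn.stoichiometry.get(keep.id, 0) + coeff
            touched.append(rxn.id)
    return touched
```

A reaction that interconverted the two merged compounds would end up with a net coefficient of zero, so the returned list is worth checking by hand rather than trusting blindly.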

@devoid
Contributor

devoid commented Mar 6, 2012

Would it be sufficient to start with a simple copy() command on the compound for the split? You could then manually add / remove stuff from each compound.

For reactions, compoundSets and media, I'm not sure what the correct base-level behavior would be for a copy() call. It would not be difficult to create new reactions, add the compound to the compoundSet and create new media conditions. However, I think these things should probably be handled independently--e.g. with specific functions for each of these areas.

Does this make sense? I'm trying to think of names for the "copyReactions", "copyMedia", "updateCompoundSets" functions...and how these would look.
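
One possible shape for this, as a rough sketch only: a plain copy for the compound, plus a separate helper that duplicates the reactions using it. The function names, the dict-based records and the `_copy` id suffix are all assumptions made for illustration; nothing here reflects an existing API.

```python
import copy


def copy_compound(compound, new_id):
    """Return an independent copy of a compound record under a new id."""
    clone = copy.deepcopy(compound)
    clone["id"] = new_id
    return clone


def copy_reactions(reactions, old_id, new_id):
    """Duplicate each reaction that uses old_id, substituting new_id.

    The originals are left untouched. The copies are only candidates:
    the curator still has to decide which of them actually exist.
    """
    copies = []
    for rxn in reactions:
        if old_id in rxn["stoichiometry"]:
            clone = copy.deepcopy(rxn)
            clone["id"] = rxn["id"] + "_copy"  # hypothetical naming scheme
            clone["stoichiometry"][new_id] = clone["stoichiometry"].pop(old_id)
            copies.append(clone)
    return copies
```

Keeping the two helpers separate means a split can stop after `copy_compound` when the new compound takes part in no reactions of its own. Equivalent helpers for media and compoundSets would follow the same pattern.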

@samseaver
Contributor Author

The copy()-then-manual-assignment strategy works. There are going to be very few people who do this, and as time goes by, we'll be doing it less and less. How would I manually alter the two compounds: by printing them, editing them, and re-loading them? I like this, because I can keep the edited files in my history.

Using a split will invariably mean separating a less commonly seen compound from a more common one (some stereoisomers occur less often than others). So I would keep the original id for the more common compound, which in turn means that the current set of reactions, compoundSets and media would stay the same. If, after the split, the less common compound does belong to its own reaction/compoundSet/media, then those too would have to be copied and altered.

Perhaps a generic copy() command for any biochemistry database object can be used?

It would be up to me as a biochemistry curator to be sure I don't miss anything, so if I do a copy() on the compound, I'd like the copy() function to print out to file the list of reactions/compoundSets/media that I would have to consider.
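
Something along these lines might cover that request. It is a sketch under assumed data shapes: every reaction, compoundSet or media record is taken to be a dict with an "id" and a "compounds" collection, and `copy_with_report` is a hypothetical name.

```python
import copy
import io


def copy_with_report(compound, new_id, referrers, out):
    """Copy `compound` under `new_id` and list every object that references the original.

    `referrers` maps a kind ("reaction", "compoundSet", "media") to a list of
    dicts, each with an "id" and a "compounds" collection of compound ids.
    This layout is an assumption made for the sketch.
    """
    clone = copy.deepcopy(compound)
    clone["id"] = new_id
    for kind, objects in referrers.items():
        for obj in objects:
            if compound["id"] in obj["compounds"]:
                # One tab-separated line per object the curator should review.
                out.write(f"{kind}\t{obj['id']}\n")
    return clone


# Usage with an in-memory buffer; a real run would pass an open file instead.
report = io.StringIO()
original = {"id": "cpdA", "name": "example sugar"}
referrers = {
    "reaction": [{"id": "rxn1", "compounds": ["cpdA", "cpdC"]}],
    "media": [{"id": "mediaX", "compounds": ["cpdC"]}],
}
rare = copy_with_report(original, "cpdB", referrers, report)
```

The original compound keeps its id, matching the idea of leaving the more common compound and its existing references in place, while the report lists only what might need a copy for the rarer one.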
