Computation framework of the metabolic tasks

Anne Richelle edited this page Jun 24, 2021 · 6 revisions

We define a metabolic task as the capacity to produce a defined list of output products when only a defined list of input substrates is available. For each available reference genome-scale model, we computed the list of tasks that the model successfully passes (i.e. the list of functions the model is able to describe). For each task that a model passes, we determined the set of reactions associated with that task. Based on the gene-protein-reaction (GPR) rules of these reactions, we can then enumerate the genes that may contribute to this metabolic function.

We developed a computation framework that attributes a score to each metabolic task based on the available transcriptomic data. To this end, we first attribute a gene activity level (GAL) to each gene that is present in the model and for which expression data have been provided.

GAL = 5 * log(1 + gene expression value / threshold)

We propose multiple options to define the threshold (see section Threshold Definition for more details).
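
As an illustration of the GAL formula above, here is a minimal Python sketch. It assumes that log denotes the natural logarithm (the text does not state the base) and that a single global threshold is used. The function name, gene names and values are hypothetical.

```python
import math


def gene_activity_level(expression, threshold):
    """Return GAL = 5 * log(1 + expression / threshold).

    Assumption: log is the natural logarithm.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return 5.0 * math.log(1.0 + expression / threshold)


# Hypothetical expression values and a hypothetical global threshold.
expression = {"geneA": 0.0, "geneB": 10.0, "geneC": 30.0}
threshold = 10.0

gal = {gene: gene_activity_level(value, threshold)
       for gene, value in expression.items()}
```

With this definition, a gene that is not expressed has a GAL of 0, and a gene expressed exactly at the threshold has a GAL of 5 * log(2).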

We then used the GPR rule associated with each reaction required for a task to decide which gene is the main determinant of the enzyme abundance for this reaction, and attributed the corresponding gene activity level to the reaction. Each reaction involved in a task is therefore associated with a reaction activity level (RAL), which is the GAL (i.e. the preprocessed gene expression value) of the gene selected as the main determinant for this reaction.
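
The selection of the main determinant can be sketched as follows. This page does not spell out the selection rule, so the sketch assumes a common convention: for an AND relation (enzyme complex) the gene with the lowest activity level is limiting, and for an OR relation (isoenzymes) the gene with the highest activity level dominates. The nested-tuple representation of GPR rules and the function name are hypothetical, and genes without expression data are not handled.

```python
def main_determinant(rule, gal):
    """Return (gene, activity level) for a GPR rule.

    A rule is either a gene identifier (str) or a tuple
    ("and" | "or", sub_rule, sub_rule, ...).
    Assumption: AND keeps the minimum, OR keeps the maximum.
    """
    if isinstance(rule, str):
        return rule, gal[rule]
    op, *operands = rule
    candidates = [main_determinant(sub_rule, gal) for sub_rule in operands]
    if op == "and":
        return min(candidates, key=lambda c: c[1])
    if op == "or":
        return max(candidates, key=lambda c: c[1])
    raise ValueError(f"unknown operator: {op}")


# Hypothetical gene activity levels and the GPR rule "(g1 and g2) or g3".
gal = {"g1": 2.0, "g2": 6.0, "g3": 4.0}
rule = ("or", ("and", "g1", "g2"), "g3")

gene, ral = main_determinant(rule, gal)
```

In this example the complex (g1 and g2) is limited by g1 (2.0), the isoenzyme g3 (4.0) is higher, so g3 is selected and the RAL of the reaction is 4.0.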

We also computed the significance of each selected gene with regard to its overall use in the observed condition. Some genes are mapped to multiple reactions (e.g. promiscuous enzymes), so we assume that there may be some competition between the reactions that use such a gene. We define the significance of a gene (S) by its specificity for a reaction. It is computed as the inverse of the number of reactions in which this gene is used as the main determinant.
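
A minimal sketch of the significance computation, assuming the main determinant of every reaction has already been selected. The mapping from reactions to genes and the function name are hypothetical.

```python
from collections import Counter


def gene_significance(determinants):
    """Return S for each gene: 1 / (number of reactions in which the
    gene is the main determinant).

    `determinants` maps a reaction identifier to its main determinant gene.
    """
    counts = Counter(determinants.values())
    return {gene: 1.0 / n for gene, n in counts.items()}


# Hypothetical example: g3 is the main determinant of two reactions.
determinants = {"R1": "g3", "R2": "g3", "R3": "g1"}
significance = gene_significance(determinants)
```

Here g3 determines two reactions and gets S = 0.5, whereas g1 determines a single reaction and gets S = 1.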

Finally, the metabolic task (MT) score is computed as the mean, over the reactions involved in the task, of the product of the activity level of each reaction and the significance of its associated gene:

MT score = sum(RAL * S) / number of reactions involved in the task
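
The MT score formula can be sketched in Python as below. All identifiers and values are hypothetical; the sketch assumes that the RAL of each reaction, its main determinant gene and the significance of that gene are already known.

```python
def mt_score(task_reactions, ral, determinants, significance):
    """Return sum(RAL * S) / number of reactions involved in the task."""
    if not task_reactions:
        raise ValueError("a task must involve at least one reaction")
    total = sum(ral[r] * significance[determinants[r]]
                for r in task_reactions)
    return total / len(task_reactions)


# Hypothetical inputs for a task that involves reactions R1 and R3.
ral = {"R1": 4.0, "R2": 4.0, "R3": 2.0}
determinants = {"R1": "g3", "R2": "g3", "R3": "g1"}
significance = {"g3": 0.5, "g1": 1.0}

score = mt_score(["R1", "R3"], ral, determinants, significance)
```

For this example the score is (4.0 * 0.5 + 2.0 * 1.0) / 2 = 2.0.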

The MT score provides a relative quantification of the activity of a metabolic task in a specific condition, and is therefore meaningful when data are available for multiple conditions. Indeed, it has been shown that some important housekeeping genes always have very low expression values. A metabolic function that relies entirely on such genes will therefore always have a very low MT score. Conversely, some tasks are associated with genes that have very high expression levels. MT scores therefore cannot be compared across tasks, only across samples. To partly overcome this problem, we also propose a binary version of this scoring approach, which determines whether a metabolic task is active or not based on a gene expression profile. In this version, the MT score no longer takes into account the significance of the gene determinant of each reaction and is simply computed as the mean of the reaction activity levels. A metabolic task is then considered active if its binary MT score is greater than 5*log(2), which is the GAL of a gene whose expression value equals the threshold.
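
A minimal sketch of the binary version described above, assuming the natural logarithm as in the GAL definition. Function names, reaction identifiers and values are hypothetical.

```python
import math

# GAL of a gene expressed exactly at its threshold (natural log assumed).
ACTIVE_CUTOFF = 5.0 * math.log(2)


def binary_mt_score(task_reactions, ral):
    """Return the mean RAL of the reactions involved in the task."""
    if not task_reactions:
        raise ValueError("a task must involve at least one reaction")
    return sum(ral[r] for r in task_reactions) / len(task_reactions)


def is_task_active(task_reactions, ral):
    """A task is active if its binary MT score exceeds 5 * log(2)."""
    return binary_mt_score(task_reactions, ral) > ACTIVE_CUTOFF


# Hypothetical reaction activity levels.
ral = {"R1": 4.0, "R2": 3.5, "R3": 2.0}

task_a_active = is_task_active(["R1", "R2"], ral)  # mean RAL = 3.75
task_b_active = is_task_active(["R1", "R3"], ral)  # mean RAL = 3.0
```

Since 5 * log(2) is roughly 3.47, the first task (mean RAL 3.75) is classified as active and the second (mean RAL 3.0) as inactive.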

References
Richelle A, Chiang AWT, Kuo CC, Lewis NE (2019) Increasing consensus of context-specific metabolic models by integrating data-inferred cell functions. PLOS Computational Biology 15(4): e1006867

Notes
The concept of “metabolic tasks” was originally introduced as a model benchmarking tool to evaluate the quality and capabilities of genome-scale metabolic models. However, these studies used various frameworks to define the cell’s capacity to sustain a metabolic task. As a result, the libraries of metabolic tasks differed across studies in content and form, which prevented the comparison of results between studies. We therefore unified the formalism of the metabolic tasks and the associated computational framework for their use in the modelling context. This framework is available in COBRA Toolbox 3.0 under the script checkMetabolicTasks (see Richelle et al., 2019 for more details).