License: MIT

refineGEMs

refineGEMs is a Python package intended to help with the curation of genome-scale metabolic models (GEMs).
The documentation can be found here.

Table of contents

  1. Overview
  2. Installation
  3. How to cite
  4. Repositories using refineGEMs

Overview

Currently, refineGEMs can be used to investigate a GEM. It can complete the following tasks:

  • load GEMs with COBRApy and libSBML
  • report the number of metabolites, reactions, and genes
  • report orphaned, dead-end, and disconnected metabolites
  • report mass- and charge-unbalanced reactions
  • report Memote score
  • compare the genes present in the model to the genes found in:
    • the KEGG Database (Note: This requires the GFF file and the KEGG identifier of your organism.)
    • or the BioCyc Database (Note: This requires that a database entry for your organism exists in BioCyc.)
  • compare the charges and masses of the metabolites present in the model to the charges and masses denoted in the ModelSEED Database.
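
The balance and connectivity checks listed above can be illustrated with a small, dependency-free Python sketch. It is not refineGEMs' implementation and does not use its API. The toy metabolites, the reaction IDs, and the helper names (parse_formula, imbalance, classify_metabolites) are invented for illustration. Reactions are treated as irreversible, and the definitions used here (orphaned = only consumed, dead-end = only produced, disconnected = in no reaction) are an assumption following common usage. In COBRApy, Reaction.check_mass_balance() performs a comparable per-reaction check.

```python
import re
from collections import Counter

# Toy data, invented for illustration: metabolite ID -> (formula, charge).
METABOLITES = {
    "atp": ("C10H12N5O13P3", -4),
    "h2o": ("H2O", 0),
    "adp": ("C10H12N5O10P2", -3),
    "pi": ("HO4P", -2),
    "h": ("H", 1),
    "glc": ("C6H12O6", 0),  # takes part in no reaction
}

# Reaction ID -> stoichiometry (negative = consumed, positive = produced).
# All reactions are treated as irreversible to keep the sketch short.
REACTIONS = {
    "ATPH": {"atp": -1, "h2o": -1, "adp": 1, "pi": 1, "h": 1},
    "ATPH_noH": {"atp": -1, "h2o": -1, "adp": 1, "pi": 1},  # proton left out on purpose
}


def parse_formula(formula):
    """Count atoms in a simple formula such as 'C6H12O6' (no brackets)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def imbalance(stoichiometry, metabolites):
    """Return net element and charge totals; an empty dict means balanced."""
    totals = Counter()
    for met_id, coeff in stoichiometry.items():
        formula, charge = metabolites[met_id]
        for element, n in parse_formula(formula).items():
            totals[element] += coeff * n
        totals["charge"] += coeff * charge
    return {key: value for key, value in totals.items() if value != 0}


def classify_metabolites(metabolites, reactions):
    """Group metabolites that are only consumed, only produced, or unused."""
    consumed = {m for s in reactions.values() for m, c in s.items() if c < 0}
    produced = {m for s in reactions.values() for m, c in s.items() if c > 0}
    return {
        "orphaned": sorted(consumed - produced),
        "dead_end": sorted(produced - consumed),
        "disconnected": sorted(set(metabolites) - consumed - produced),
    }


if __name__ == "__main__":
    for rxn_id, stoich in REACTIONS.items():
        print(rxn_id, imbalance(stoich, METABOLITES) or "balanced")
    print(classify_metabolites(METABOLITES, REACTIONS))
```

A real model also has reversible reactions and exchange reactions, so a production check has to take reaction bounds into account before flagging a metabolite.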

Other applications of refineGEMs to curate a given model include:

  • The correction of a model created with CarveMe v1.5.1 or v1.5.2 (for example moving all relevant information from the notes to the annotation field or automatically annotating the GeneProduct section of the model with the respective NCBI gene/protein identifiers from the GeneProduct identifiers),
  • The addition of KEGG Pathways as Groups (using the libSBML Groups Plugin),
  • Updating the SBO-Term annotations based on SBOannotator,
  • Updating the annotation of metabolites and extending the model with reactions (for the purpose of filling gaps) based on a table filled in by the user, data/manual_annotations.xlsx (Note: This only works when the structure of the example Excel file is used.),
  • And extending the model with all information surrounding reactions including the corresponding GeneProducts and metabolites by filling in the table data/modelName_gapfill_analysis_date_example.xlsx (Note: This also only works when the structure of the example Excel file is used).

Installation

You can install refineGEMs via pip:

pip install refineGEMs

or install it from this GitHub repository into a local conda environment. All dependencies are listed in the pyproject.toml file:

# clone or pull the latest source code
git clone https://github.com/draeger-lab/refinegems.git
cd refinegems

# create the environment with Python 3.10 or higher
conda create -n <EnvName> python=3.10

conda activate <EnvName>

# check that pip comes from <EnvName>
which pip

pip install .
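
As an optional sanity check before running pip install, the two environment requirements from the steps above (Python 3.10 or higher, and a pip that belongs to the active environment) can also be verified from Python. This is a minimal sketch using only the standard library. The function names are hypothetical and not part of refineGEMs, and the pip check assumes that pip lives somewhere below the environment's prefix directory.

```python
import shutil
import sys
from pathlib import Path

MIN_PYTHON = (3, 10)  # minimum version used in the conda command above


def python_is_supported(version=sys.version_info[:2]):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version) >= MIN_PYTHON


def pip_in_active_env():
    """Return True if the first pip on PATH lies below sys.prefix."""
    pip_path = shutil.which("pip")
    if pip_path is None:
        return False
    return Path(sys.prefix).resolve() in Path(pip_path).resolve().parents


if __name__ == "__main__":
    print("Python version OK:", python_is_supported())
    print("pip from active environment:", pip_in_active_env())
```

If the second check prints False, activate the environment again and re-run which pip before installing.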

refineGEMs depends on the tools MCC and BOFdat, which cannot be installed directly via PyPI or the pyproject.toml. Please install both tools before using refineGEMs:

# For MCC, until the hot fix is merged into main:
pip install "masschargecuration@git+https://github.com/Biomathsys/MassChargeCuration@installation-fix"

# For BOFdat, our fork with hot fix(es):
pip install "bofdat@git+https://github.com/draeger-lab/BOFdat"

How to cite

When using refineGEMs, please cite the latest publication:

Famke Bäuerle, Gwendolyn O. Döbel, Laura Camus, Simon Heilbronner, and Andreas Dräger. Genome-scale metabolic models consistently predict in vitro characteristics of Corynebacterium striatum. Front. Bioinform., Oct. 2023. doi:10.3389/fbinf.2023.1214074.

Repositories using refineGEMs

  • C_striatum_GEMs
  • draeger-lab/Shaemolyticus - private
  • draeger-lab/Ssanguinis - private