The gapseq paper was published in Genome Biology in 2021 (https://genomebiology.biomedcentral.com/articles/10.1186/s13059-021-02295-1).
All required tools have been installed on the lab server under /home/pubadmin/programs/gapseq.
To run gapseq properly, each user must install it in their own local folder as follows:
## Install gapseq locally
git clone https://github.com/jotech/gapseq && cd gapseq
## Download Bacteria and Archaea databases
src/update_sequences.sh
src/update_sequences.sh Archaea
## Test the gapseq installation
./gapseq test
## Download genome assemblies from NCBI
wget ftp://ftp.ncbi.nlm.nih.gov/genomes/refseq/bacteria/Lactobacillus_johnsonii/representative/GCF_014841035.1_ASM1484103v1/GCF_014841035.1_ASM1484103v1_genomic.fna.gz -P /home/xuejian/Evonik/Data/
mv /home/xuejian/Evonik/Data/GCF_014841035.1_ASM1484103v1_genomic.fna.gz /home/xuejian/Evonik/Data/Lactobacillus_johnsonii.fna.gz
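Before starting a long gapseq run, it can be worth confirming that the renamed download is intact. The sketch below is one way to do that; `check_assembly` is a hypothetical helper written for this note, not part of gapseq, and it assumes the assembly is a gzipped FASTA file as in the wget example above.

```shell
#!/bin/bash
# Sketch only: check_assembly is a hypothetical helper, not a gapseq command.
# It prints the number of FASTA records in a gzipped assembly and fails if
# the file is missing, empty, not valid gzip, or contains no records.
check_assembly() {
    local file="$1"
    if [ ! -s "$file" ]; then
        echo "missing or empty file: $file" >&2
        return 1
    fi
    if ! gzip -t "$file" 2>/dev/null; then
        echo "not a valid gzip archive: $file" >&2
        return 1
    fi
    # Each FASTA record starts with a '>' header line.
    local n
    n=$(gzip -dc "$file" | grep -c '^>')
    if [ "$n" -eq 0 ]; then
        echo "no FASTA records found in: $file" >&2
        return 1
    fi
    echo "$n"
}
```

For the file above, this would be called as `check_assembly /home/xuejian/Evonik/Data/Lactobacillus_johnsonii.fna.gz`. A non-zero exit status suggests the download should be repeated.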
Here is an example script that runs the gapseq pipeline:
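#!/bin/bash
# Set the path to the local gapseq installation (replace "username" with your own)
gapseqdir=/home/username/gapseq
# Working directory: input genomes are read from Data, results go to Results
workdir=/home/username/test
data=$workdir/Data
outdir=$workdir/Results
gapseq=$gapseqdir/gapseq
# Taxonomy value passed to the find and draft steps
taxonomy=Bacteria
# Genome name: the script expects the file $data/$modelA.fna.gz
modelA="Lactobacillus_johnsonii"
# Create the output directory if needed; the steps below expect their output files in $outdir
mkdir -p "$outdir"
cd "$outdir" || exit 1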
# Reaction & Pathway prediction
$gapseq find -p all -v 0 -b 200 -t $taxonomy -m $taxonomy $data/$modelA.fna.gz
# Transporter prediction
$gapseq find-transport -b 200 $data/$modelA.fna.gz
# Building Draft Model - based on Reaction-, Pathway-, and Transporter prediction
$gapseq draft -r $outdir/$modelA-all-Reactions.tbl -t $outdir/$modelA-Transporter.tbl -p $outdir/$modelA-all-Pathways.tbl -c $data/$modelA.fna.gz -u 200 -l 100 -b $taxonomy
# Predict the growth medium from the draft model and the pathway predictions
Rscript $gapseqdir/src/predict_medium.R -m $outdir/$modelA-draft.RDS -p $outdir/$modelA-all-Pathways.tbl
# Gapfilling
$gapseq fill -m $outdir/$modelA-draft.RDS -n $outdir/$modelA-medium.csv -c $outdir/$modelA-rxnWeights.RDS -g $outdir/$modelA-rxnXgenes.RDS -b 100
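Each step in the script above reads files written by an earlier step, so a failed step may only show up later as a confusing error. The sketch below reports which intermediate files are missing; `check_outputs` is a hypothetical helper, not a gapseq command, and the file names are taken only from the arguments used in the script above.

```shell
#!/bin/bash
# Sketch only: check_outputs is a hypothetical helper, not a gapseq command.
# It reports every intermediate file used by the script above that is
# missing or empty, and returns non-zero if any is absent.
check_outputs() {
    local outdir="$1"
    local model="$2"
    local missing=0
    local suffix
    for suffix in all-Reactions.tbl all-Pathways.tbl Transporter.tbl \
                  draft.RDS rxnWeights.RDS rxnXgenes.RDS medium.csv; do
        if [ ! -s "$outdir/$model-$suffix" ]; then
            echo "missing: $outdir/$model-$suffix" >&2
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}
```

In the script above, `check_outputs "$outdir" "$modelA" || exit 1` could be placed before the gapfilling step so the run stops with a list of the files that were not produced.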