-
Notifications
You must be signed in to change notification settings - Fork 0
/
Param.in
116 lines (115 loc) · 3.48 KB
/
Param.in
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
# Start: 1
# Stop: 8
#
# \\ 1 - Download the data
# \\ 2 - Perform adapter trimming on downloaded FASTQ files
# \\ 3 - Perform quality filtering on trimmed FASTQ files
# \\ 4 - Remove sequences that align to non-coding RNAs from the filtered FASTQ files
# \\ 5 - Align sequences to the genome from the ncRNA subtracted FASTQ files
# \\ 6 - Create Raw Assignment from the genome aligned sorted/indexed BAM file
# \\ 7 - Create Metagene tables
# \\ 8 - Create Metagene plots
# \\ ------------ Pipeline part 1 end -----------------
# \\ ------------ Pipeline part 2 start -----------------
# \\ 9 - Corrected assignment
# \\ 10 - Metagene tables from corrected assignment
# \\ 11 - Create Metagene plots from corrected assignemnt
# \\ 12 - Codon Tables for conditions A & B
# \\ 13 - Master Tables for all replicas
#
# Clean: gzip
#
# \\ bgzip - Gzip files after they are no longer necessary using multiple threads - 6 is hard coded
# \\ gzip - Gzip files after they are no longer necessary
# \\ rm - Remove files after they are no longer necessary
#
# Quality: 0.80
#
# \\ Quality filter threshold for retaining reads (0.0 - 1.0). Higher is more selective.
#
# Mapping: 5
#
# \\ 5 - Map reads according to their 5' end
# \\ 3 - Map reads according to their 3' end - Steps 9-13 are not checked yet !
#
# ReadLenMiN: 25
# ReadLenMaX: 35
#
# \\ 25 - ReadLenMiN length of reads included -! going to replace Length
# \\ 35 - ReadLenMaX length of reads included -! going to replace Length
#
# Normalised: rpm
#
# \\ rpm - use normalized data - reads per miljon
# \\ raw - raw counts
#
# MappedTwice: No
#
# \\ No or Yes IF Yes include reads mapped once and twice
#
# MetagThreshold: 30
#
# \\ 30 - raw counts around start or stop if less ignore a gene/region
# \\ If Normalized = "rpm" MetagThreshold is readjusted by dividing normalization_factor inside program
# \\ FILTER 3
#
# MetagSpan: 60
#
# \\ 60 - number of nucleotides before and after start/stop in metagene profiles
#
# MetagSpancorr: 90
#
# \\ 90 - number of nucleotides before and after start/stop in metagenomic profile from corrected data
#
#
# GeneRawMeanThr: 0.3
# \\ 0.3 -> 30 raw counts per 100 nt - low threshold
# \\ FILTER 1 - is used in functions codonTablesA & codonTablesB
# \\ converted to GeneRpmMeanThr
# \\ GeneRpmMeanThr = GeneRawMeanThr / norm_factor # FILTER 1
#
# CodonRawMeanThr: 1.6
# \\ 1.666 -> 6 raw counts per codon - more strict than previous
# \\ FILTER 2 - is used in functions codonTablesA & codonTablesB
# \\ converted to CodonRpmMeanThr
# \\ CodonRpmMeanThr = CodonRawMeanThr / norm_factor # FILTER 2
#
#
# OffsetFile: readlength_offsets_eEF3.txt
#
# \\ file with read length and offsets - only these readlength are included into corrected/final assignment
#
# cpu: 6
#
# \\ number of cores - have effect in some steps
#
#
# \\ Groups
# \\ Group refers to samples of different replicas but same conditions
# \\ Replica is defined as order of sample names under Group
# \\ An Example. GroupA S2;S4 GroupB S1;S3
# \\ Replica 1 - S1, S2
# \\ Replica 2 - S3, S4
#
# GroupA: MetS2;MetS4
#
# \\ Names of semicolon ";" separated samples incubated under specific conditions TREATED
# \\ S1;S2
#
#
# GroupB: WTS1;WTS3
#
# \\ Names of semicolon ";" separated samples used as CONTROL
# \\ S3;S4
#
# \\ affects sequence part of codon tables and Master table
#
# CodonsBeforePSite: 1
#
# \\ 1 - A-site included (default)
# \\ 2 - A-site + one codon before
#
ERR2660262 WTS1
ERR2660266 MetS2
ERR2660263 WTS3
ERR2660267 MetS4