docs/usage.rst (+30 -46)
@@ -16,19 +16,16 @@ Data files from each group of biological replicates should be placed into a uniq
 
 .. code-block:: console
 
-    reads
+    reads/
     ├── exp1
     │   ├── Dam.fastq.gz
-    │   ├── HIF1A.fastq.gz
-    │   └── HIF2A.fastq.gz
+    │   └── Piwi.fastq.gz
     ├── exp2
     │   ├── Dam.fastq.gz
-    │   ├── HIF1A.fastq.gz
-    │   └── HIF2A.fastq.gz
+    │   └── Piwi.fastq.gz
     └── exp3
         ├── Dam.fastq.gz
-        ├── HIF1A.fastq.gz
-        └── HIF2A.fastq.gz
+        └── Piwi.fastq.gz
 
 .. note::
 
@@ -44,66 +41,55 @@ In some cases the number of non-Dam and Dam samples might not match. In this cas
 
 .. code-block:: console
 
-    reads
+    reads/
     ├── Dam_1.fastq.gz
-    ├── HIF1A_1.fastq.gz
-    ├── HIF2A_1.fastq.gz
     ├── Dam_2.fastq.gz
-    ├── HIF1A_2.fastq.gz
-    ├── HIF2A_2.fastq.gz
-    ├── HIF1A_3.fastq.gz
-    └── HIF2A_3.fastq.gz
+    ├── Piwi_1.fastq.gz
+    ├── Piwi_2.fastq.gz
+    └── Piwi_3.fastq.gz
 
 When `damid-seq` is run in this case, it will create directories in reads/ for each Dam-only sample matching all non-Dam samples. Symlinks will be created in these directories to the original files in reads/:
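The pairing described in the line above can be sketched in Python. This is only an illustration under stated assumptions, not the pipeline's own code: the function name `link_unmatched_samples`, the rule that Dam-only files start with `Dam`, and the choice to name each directory after the Dam file (for example `Dam_1/`) are all hypothetical, since the rest of this hunk is not shown.

```python
from pathlib import Path


def link_unmatched_samples(reads_dir: Path) -> list[Path]:
    """Create one directory per Dam-only file, holding symlinks to that
    file and to every non-Dam file found directly in reads_dir."""
    fastqs = sorted(p for p in reads_dir.glob("*.fastq.gz") if p.is_file())
    # Assumption: Dam-only samples are recognised by a "Dam" file name prefix.
    dam = [p for p in fastqs if p.name.startswith("Dam")]
    non_dam = [p for p in fastqs if not p.name.startswith("Dam")]

    created = []
    for dam_file in dam:
        # Hypothetical naming: directory named after the Dam file, e.g. Dam_1/
        group_dir = reads_dir / dam_file.name.removesuffix(".fastq.gz")
        group_dir.mkdir(exist_ok=True)
        for src in [dam_file, *non_dam]:
            link = group_dir / src.name
            if not link.exists():
                # Relative target, so the links survive moving reads/ as a whole.
                link.symlink_to(Path("..") / src.name)
        created.append(group_dir)
    return created
```

Called on the `reads/` layout shown in this hunk, the sketch would produce `Dam_1/` and `Dam_2/`, each containing a link to its own Dam file plus links to all three Piwi files. The original files stay in place, so the step can be re-run safely.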
@@ -114,22 +100,20 @@ The config/ directory contains `samples.csv` with sample meta data as follows:
 +-----------+----------+-----------+
 | sample    | genotype | treatment |
 +===========+==========+===========+
-| HIF1A     | WT       | Hypoxia   |
+| Piwi      | Piwi_ko  | None      |
 +-----------+----------+-----------+
-| HIF2A     | WT       | Hypoxia   |
+| Dam       | WT       | None      |
 +-----------+----------+-----------+
-| Dam       | WT       | Hypoxia   |
-+-----------+----------+-----------+
 
 `config.yaml` in the same directory contains the settings for the analysis:
 
 .. code-block:: yaml
 
-    genome: hg38
+    genome: dm6
     ensembl_genome_build: 110
     plasmid_fasta: none # Path to plasmid fasta file with sequences to be removed
     fusion_genes:
-      genes: ENSG00000100644,ENSG00000116016 # Ensembl gene IDs for genes to be masked from the fasta file
+      genes: FBgn0004872 # Ensembl gene IDs for genes to be masked from the fasta file
       feature_to_mask: "exon" # Gene feature to mask from the fasta file (exon or gene)
     damidseq_pipeline:
       normalization: kde # kde, rpm or rawbins
@@ -210,7 +194,7 @@ A lot of the DamID signal can come from the plasmids that are used to express th
 
 To prevent this, two approaches are available:
 
-1. The genes (Ensembl gene IDs) fused to Dam can be set in config.yaml["fusion_genes"] (separated by commas if multiple plasmids are used). This will mask the genomic locations of these genes in the fasta file that will be used to build the Bowtie2 index, hence excluding these regions from the analysis.
+1. The genes (Ensembl gene IDs) fused to Dam can be set in config.yaml["fusion_genes"] (separated by commas if multiple plasmids are used). This will mask the features set in config > fusion_genes > feature_to_mask (exons or gene) of these genes in the fasta file that will be used to build the Bowtie2 index, hence excluding these regions from the analysis.
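To make the masking step concrete, here is a minimal Python sketch of what masking a feature means at the sequence level. It is not the pipeline's implementation, which this diff does not show: the function name `mask_intervals`, the 0-based half-open coordinates, and the toy sequence and intervals are all assumptions made for illustration.

```python
def mask_intervals(sequence: str, intervals: list[tuple[int, int]]) -> str:
    """Return sequence with every 0-based, half-open [start, end) interval
    replaced by 'N', so that reads can no longer align to those bases."""
    masked = list(sequence)
    for start, end in intervals:
        # Clamp to the sequence bounds; an empty or inverted interval is a no-op.
        for i in range(max(start, 0), min(end, len(masked))):
            masked[i] = "N"
    return "".join(masked)


# Hypothetical toy gene on a 10-base "chromosome".
chrom = "ACGTACGTAC"
gene = [(1, 8)]            # feature_to_mask: gene  -> the whole gene body
exons = [(1, 3), (6, 8)]   # feature_to_mask: exon  -> only the exons

print(mask_intervals(chrom, gene))   # ANNNNNNNAC
print(mask_intervals(chrom, exons))  # ANNTACNNAC
```

With `exon`, intronic sequence stays available for alignment, whereas `gene` removes the whole locus from the index. Which option suits a given experiment depends on whether the plasmid carries a cDNA or a genomic fragment.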