How does this work? #13

Open
evolgenomology opened this issue Aug 1, 2017 · 2 comments
@evolgenomology

Hi guys,

Unfortunately, I am having trouble understanding how the pipeline can be adapted to our dataset. For some initial tests I wanted to use bc_demultiplex.py, but I don't understand how the sample sheet should look or how the files need to be named. So, to start from the beginning:
We have 5 libraries (of ~20 tissue slices each, but I guess this is irrelevant), which were pooled and sequenced together on a HiSeq. We used these barcodes:

Primer_ID | Unique Barcode
1 | AGACTC
2 | AGCTAG
4 | AGCTTC
5 | CATGAG
9 | CAGATC
10 | TCACAG
11 | AGGATC
14 | TCCTAG
17 | TCGAAG
20 | GTACAG
23 | GTCTAG
25 | GTTGCA
26 | GTGACA
28 | ACAGTG
29 | ACCATG
31 | ACTCGA
32 | ACGTAC
35 | CTAGAC
40 | CTTCGA
46 | TGCAGA

and also UMIs of course.
What we got is 5 Illumina datasets, Lib-1 ... Lib-5, and each has two read sets, e.g.
Lib-2_S2_L002_R2_001.fastq.gz Lib-2_S2_L001_R1_001.fastq.gz Lib-2_S2_L001_R2_001.fastq.gz Lib-2_S2_L002_R1_001.fastq.gz.
I cleaned these with Trimmomatic (I think this is necessary in this case, since the sequencing quality is not great). My reads are now named, e.g.
Lib-4_S4_L002_reverse_paired.fq.gz Lib-4_S4_L002_forward_paired.fq.gz Lib-4_S4_L001_reverse_paired.fq.gz Lib-4_S4_L001_forward_paired.fq.gz.
But this can of course be changed.

Now my simple questions are:

  • How should I lay out the bc_index file? (I suspect like the barcode_umis.tab file, right?)
  • How should I lay out the sample sheet?
  • Would it make sense to just combine all reads into one giant fastq file?
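
For reference, here is a minimal sketch of how the barcode table above could be checked and written out as a two-column, tab-separated table. The layout (primer ID, then barcode, no header row) is only an assumption on my side and not the confirmed format of the bc_index file; the function names are made up for illustration.

```python
import csv
import io

# Primer ID -> barcode, copied from the table above.
BARCODES = {
    1: "AGACTC", 2: "AGCTAG", 4: "AGCTTC", 5: "CATGAG", 9: "CAGATC",
    10: "TCACAG", 11: "AGGATC", 14: "TCCTAG", 17: "TCGAAG", 20: "GTACAG",
    23: "GTCTAG", 25: "GTTGCA", 26: "GTGACA", 28: "ACAGTG", 29: "ACCATG",
    31: "ACTCGA", 32: "ACGTAC", 35: "CTAGAC", 40: "CTTCGA", 46: "TGCAGA",
}


def validate_barcodes(barcodes):
    """Raise ValueError on duplicate, unequal-length, or non-ACGT barcodes."""
    seqs = list(barcodes.values())
    if len(set(seqs)) != len(seqs):
        raise ValueError("duplicate barcode sequences")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("barcodes differ in length")
    for s in seqs:
        if set(s) - set("ACGT"):
            raise ValueError(f"unexpected character in barcode {s!r}")


def write_barcode_table(barcodes, handle):
    """Write 'primer_id<TAB>barcode' lines (assumed layout), sorted by ID."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for primer_id, seq in sorted(barcodes.items()):
        writer.writerow([primer_id, seq])


validate_barcodes(BARCODES)
buffer = io.StringIO()
write_barcode_table(BARCODES, buffer)
TABLE_TEXT = buffer.getvalue()
```

Passing an open file handle instead of the `io.StringIO` buffer would write the same lines to disk; the column order would need to be adjusted to whatever bc_demultiplex.py actually expects.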

Thanks for your help!

Cheers

Philipp

@evolgenomology
Author

I realised that at least the last option, combining all reads into one giant fastq file, would not make sense.

@Puriney
Member

Puriney commented Sep 25, 2017

[Image: experiment-design-old-pipeline-demo]

This is to visualize the sample sheet as defined here

The good news is that we are going to release a new version of the pipeline next week.

@Puriney Puriney self-assigned this Oct 9, 2017