already demultiplexed fastqs: mapping file + prep of fastq files #3

Open
elpape opened this issue Feb 10, 2017 · 9 comments

Comments

@elpape

elpape commented Feb 10, 2017

Hi Taruna or Holly!

I just wanted to make sure I understood how the mapping file for already demultiplexed fastqs should be set up.

  • BarcodeSequence is not needed in my case (since the reads are already demultiplexed), and this column can either be left empty (according to the description on the QIIME website) or contain a dummy string (e.g. NNNN).

  • LinkerPrimerSequence: is this the sequence of the forward gene-specific primer? (If so, why is it called LinkerPrimerSequence? Very confusing.) The QIIME website says: "each value in that column corresponds to the primer used to amplify that sample". In my case the primers were extended in a 2-step PCR, where the gene-specific primer was added with a special tag attached in the first step. Do I also need to include the sequence of the tag? My forward gene-specific primer should be SSU_F04, so I should just put the sequence of this primer in this column, right?

  • ReversePrimer: is this the reverse gene-specific primer? Should I include the tag? (In my case, the primer should be SSU_R22.)

In addition, I was wondering about step 1c in the tutorial: why do you only truncate the reverse primer? Should you not truncate the forward primer as well?

Thanks!
Ellen

@tarunaaggarwal

tarunaaggarwal commented Feb 14, 2017

Hi Ellen,

  1. I'd keep the dummy string in the barcode column. QIIME sometimes complains if a cell is empty.

  2. The LinkerPrimerSequence name is a little confusing. Yes, just put the primer sequence in the LinkerPrimerSequence column. If it helps, here is a sample mapping file from Holly. You will notice that she has both forward and reverse primer sequences in the LinkerPrimerSequence column, plus an additional column with the primer orientation info. You can create a mapping file like that as well.

  3. We truncate only the reverse primer because the forward primer should already have been removed during the demultiplexing step. You can check for the presence of your forward primer in your seqs by running the following, with primer-seq replaced by your forward primer sequence. The result should be 0.

grep -c "^primer-seq" file.fasta
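
If the primer contains degenerate (IUPAC) bases, a plain grep pattern will not match them. The same check can be sketched in Python, under a few assumptions: the FASTA has one sequence per line (as the grep command also assumes), and the function names and the example primer below are made up for illustration and are not the real SSU_F04 sequence.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def primer_to_regex(primer):
    """Compile a primer (possibly degenerate) into a regex anchored at the read start."""
    return re.compile("^" + "".join(IUPAC[base] for base in primer.upper()))


def count_reads_starting_with(fasta_lines, primer):
    """Count sequence lines (not '>' headers) that begin with the primer."""
    pattern = primer_to_regex(primer)
    return sum(
        1
        for line in fasta_lines
        if not line.startswith(">") and pattern.match(line.strip().upper())
    )


# Hypothetical two-read example; "ACGTY" is a placeholder primer.
example = [">read1", "ACGTTTGACC", ">read2", "GGGACGTTTG"]
print(count_reads_starting_with(example, "ACGTY"))  # prints 1
```

As with the grep command, a result of 0 on the real file would suggest the forward primer is no longer at the start of the reads.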

Hope this helps. Let me know if you have more questions. Thanks!
Taruna

@elpape
Author

elpape commented Feb 28, 2017

Hi Taruna,

Thanks for your feedback and my apologies for my late reply (I will hopefully find the time to continue working on this in April).

I am confused about putting both the forward and reverse primers in the same column (under LinkerPrimerSequence). As can be seen from the example mapping file, this means that you have two different sample IDs for the same sample (one with suffix F04 and another with suffix R22). Does that not complicate later processing and analyses, as they are in fact the same sample? I guess you could merge the two later on (using pattern matching on the IDs).

Thanks,
Ellen

@tarunaaggarwal

tarunaaggarwal commented Feb 28, 2017

Hi Ellen,

Right! I see how that is confusing. When I posted that reply, my understanding of the dataset was a little lacking. These data of ours contain non-overlapping amplicons, which makes things complicated. I agree with you that the sample IDs end up as two different IDs for the same sample.

So here is my new answer:

LinkerPrimerSequence is the forward primer and the ReversePrimer is the reverse primer. Given your primers, I believe I have the correct sequences for them. This is what your mapping file should look like. Please double check the primer sequences.
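
For readers without access to the attached file, the layout described above (forward primer in LinkerPrimerSequence, reverse primer in ReversePrimer, a dummy barcode, Description last) can be sketched as a small Python script. This is only an illustration: the sample IDs and primer strings are placeholders rather than the real SSU_F04 / SSU_R22 sequences, the function name is made up, and the output has not been checked against QIIME's mapping file validation.

```python
import csv
import io

# Column layout discussed in this thread; Description is kept as the last column.
HEADER = ["#SampleID", "BarcodeSequence", "LinkerPrimerSequence",
          "ReversePrimer", "Description"]


def build_mapping_file(sample_ids, forward_primer, reverse_primer,
                       dummy_barcode="NNNN"):
    """Return tab-separated mapping file text with one row per sample."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(HEADER)
    for sample_id in sample_ids:
        # The same dummy barcode and primer pair are used for every sample.
        writer.writerow([sample_id, dummy_barcode, forward_primer,
                         reverse_primer, sample_id])
    return buffer.getvalue()


# Placeholder values only; substitute your own sample IDs and primer sequences.
print(build_mapping_file(["Sample.1", "Sample.2"],
                         forward_primer="ACGTACGTACGT",
                         reverse_primer="TGCATGCATGCA"))
```

With this layout each sample has a single row and a single sample ID, which avoids the F04/R22 duplication raised earlier.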

@elpape
Author

elpape commented Mar 20, 2017 via email

@tarunaaggarwal

It is fixed now... Sorry about that.

@elpape
Author

elpape commented Mar 22, 2017 via email

@tarunaaggarwal

Weird! Okay, here it is. Hopefully this one works!

@elpape
Author

elpape commented Mar 22, 2017 via email

@jianshu93

What if the forward primer has not been removed in the per-sample (already demultiplexed) situation? How can I remove the forward primer in that case? How should I change the parameters of split_libraries_fastq.py to remove it? Add the 'mapping_fps' parameter?
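
One possible workaround, independent of split_libraries_fastq.py, is to strip the primer from the per-sample FASTQ files beforehand. The sketch below is only an assumption about how that could be done, not something confirmed in this thread: it handles exact matches at the start of a read only (no mismatches, no degenerate bases), expects standard 4-line FASTQ records, and uses made-up function names.

```python
def read_fastq(lines):
    """Group FASTQ lines into (header, sequence, quality) tuples (4 lines per record)."""
    stripped = [line.rstrip("\n") for line in lines]
    for i in range(0, len(stripped) - 3, 4):
        header, seq, _plus, qual = stripped[i:i + 4]
        yield header, seq, qual


def trim_forward_primer(records, primer):
    """Remove an exact leading primer from each read, trimming the quality string to match."""
    primer = primer.upper()
    n = len(primer)
    for header, seq, qual in records:
        if seq.upper().startswith(primer):
            seq, qual = seq[n:], qual[n:]
        yield header, seq, qual


# Hypothetical records; "ACGT" stands in for the real forward primer.
fastq_lines = [
    "@read1", "ACGTGGGG", "+", "IIIIJJJJ",
    "@read2", "TTTTGGGG", "+", "IIIIJJJJ",
]
for header, seq, qual in trim_forward_primer(read_fastq(fastq_lines), "ACGT"):
    print(header, seq, qual)
```

Reads that do not start with the primer are passed through unchanged, so it is worth counting how many reads were actually trimmed before continuing.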
