Frequency per feature after processing nanopore seq data in QIIME 2 #8

Open
mzakram219 opened this issue Nov 4, 2023 · 2 comments

mzakram219 commented Nov 4, 2023

Dear,
I am writing to get input from experienced users. I have nanopore data and am using the q2ONT command-line pipeline to process my 16S rRNA gene sequencing data. After demultiplexing, adapter removal, and trimming the reads to a length of 1400, I imported my sequencing data into QIIME 2.
I used these commands to dereplicate the sequences and to obtain the feature-table sequences and the feature-table summary.

Dereplication of sequences
qiime vsearch dereplicate-sequences \
  --i-sequences 4_single-end-demux.qza \
  --o-dereplicated-table 5_derep-table.qza \
  --o-dereplicated-sequences 5_derep-seqs.qza

Visualization files
qiime feature-table tabulate-seqs \
  --i-data 5_derep-seqs.qza \
  --o-visualization 5_derep-seqs.qzv

qiime feature-table summarize \
  --i-table 5_derep-table.qza \
  --o-visualization 5_derep-table.qzv

After these steps, I got two files, 5_derep-seqs.qzv and 5_derep-table.qzv.

Upon checking 5_derep-table.qzv in QIIME 2 View, I realized that something might have gone wrong, as the frequency per feature is shown as 1. A screenshot is attached.
[Image: Screenshot 2023-11-04 230458]

Could you please tell me what might have gone wrong to produce this outcome, or is it normal to get such results when processing nanopore sequencing data?
Thank you

Owner

DeniRibicic commented Nov 6, 2023

You got your output, which means nothing went wrong there.

If you read the pipeline description carefully, you will see that vsearch clusters OTUs at 85% similarity, which is the threshold that was used in this study.
That said, ONT has a fairly high error rate, up to 20% depending on the chemistry. In this case, that would mean the 85% threshold is too stringent, yielding a frequency of one per feature. One thing you can do is lower the threshold a little in the source code and explore how much that changes the output. But don't go too far: a looser threshold might simply cluster sequences from biologically different species into the same OTU, which is something you definitely want to avoid.
In my opinion, ONT reads with such high base-calling error rates are still not suitable for OTU clustering, and even less so for denoising. The best approach would be to dereplicate them at 100% and treat each sequence as a separate OTU.
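Dereplication at 100% collapses only reads that are identical base for base. A minimal Python sketch of why that tends to give a frequency of 1 per feature for long, error-prone reads is shown here. The read length of 1400 comes from the question; the random template, the substitution-only error model, the error rates, and all function names are illustrative assumptions, not part of q2ONT or vsearch.

```python
import random
from collections import Counter


def prob_error_free(read_length: int, error_rate: float) -> float:
    """Probability that a read has no errors, assuming independent per-base errors."""
    return (1.0 - error_rate) ** read_length


def simulate_reads(template, n_reads, error_rate, rng):
    """Copy `template` n_reads times, substituting each base with probability error_rate."""
    bases = "ACGT"
    reads = []
    for _ in range(n_reads):
        read = [
            rng.choice([b for b in bases if b != base])
            if rng.random() < error_rate
            else base
            for base in template
        ]
        reads.append("".join(read))
    return reads


def dereplicate(reads):
    """Exact (100% identity) dereplication: one feature per distinct sequence."""
    return Counter(reads)


rng = random.Random(0)  # fixed seed so the toy run is repeatable
template = "".join(rng.choice("ACGT") for _ in range(1400))  # hypothetical amplicon

for rate in (0.0, 0.01, 0.05):  # assumed error rates, for illustration only
    features = dereplicate(simulate_reads(template, 200, rate, rng))
    singletons = sum(1 for count in features.values() if count == 1)
    print(
        f"error rate {rate:.2f}: "
        f"P(error-free read) = {prob_error_free(1400, rate):.2e}, "
        f"{len(features)} features, {singletons} singletons"
    )
```

Under these assumptions, even a 1% per-base error rate makes an error-free 1400-base read very unlikely, so reads from the same organism almost never match exactly and each one becomes its own feature.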
In your case you already have that output, albeit with the default 85% threshold. Just look at the taxonomy and group your reads on that basis for any downstream statistics.
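Grouping reads by taxonomy amounts to summing the counts of all features that share the same taxonomic label. Within QIIME 2, `qiime taxa collapse` performs this kind of grouping. The Python sketch below only illustrates the idea: the table layout, the taxonomy strings, the sample names, and the function name `collapse_by_taxon` are hypothetical.

```python
from collections import defaultdict


def collapse_by_taxon(table, taxonomy, level):
    """Sum per-sample counts of all features that share the same taxonomy prefix.

    table:    {feature_id: {sample_id: count}}
    taxonomy: {feature_id: "rank1; rank2; ..."} (semicolon-separated ranks)
    level:    number of leading ranks to keep
    """
    collapsed = defaultdict(lambda: defaultdict(int))
    for feature_id, counts in table.items():
        label = taxonomy.get(feature_id, "Unassigned")
        ranks = [rank.strip() for rank in label.split(";")]
        taxon = "; ".join(ranks[:level])
        for sample_id, count in counts.items():
            collapsed[taxon][sample_id] += count
    return {taxon: dict(counts) for taxon, counts in collapsed.items()}


# Hypothetical toy data: four singleton features, as after 100% dereplication.
table = {
    "read_a": {"S1": 1},
    "read_b": {"S1": 1},
    "read_c": {"S2": 1},
    "read_d": {"S2": 1},
}
taxonomy = {
    "read_a": "d__Bacteria; p__Firmicutes; g__Lactobacillus",
    "read_b": "d__Bacteria; p__Firmicutes; g__Lactobacillus",
    "read_c": "d__Bacteria; p__Firmicutes; g__Lactobacillus",
    "read_d": "d__Bacteria; p__Proteobacteria; g__Escherichia",
}

genus_table = collapse_by_taxon(table, taxonomy, level=3)
for taxon, counts in genus_table.items():
    print(taxon, counts)
```

After collapsing, the singleton features become per-taxon counts per sample, which is the kind of table downstream statistics can work with.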

I would also advise you to explore other pipelines for analyzing 16S rRNA data generated by ONT. q2ONT is a fairly old pipeline and is no longer updated.

Author

mzakram219 commented Nov 10, 2023

Dear,
Thank you for your help with the issue above. I solved the problem and successfully performed the taxonomic assignment. Later, I exported all the required files to build a phyloseq object in R. I now have a couple of questions regarding downstream analysis. Could you please help?

1 - Do you think the data should be rarefied before analyzing alpha and beta diversity?
2 - What do you think about beta diversity? I observed that beta diversity based on Bray-Curtis dissimilarity did not give satisfactory results for Nanopore 16S data. Is that normal for Nanopore 16S data?
3 - Should I focus more on other measures, such as weighted and unweighted UniFrac?
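For context on the terms in these questions, the sketch below shows what rarefying and Bray-Curtis dissimilarity compute on a toy pair of samples. It is a plain-Python illustration, not the phyloseq or QIIME 2 implementation, and not a recommendation on whether to rarefy. The sample counts, the depth of 100, and the function names are hypothetical.

```python
import random


def rarefy(counts, depth, rng):
    """Subsample without replacement to `depth` reads; return None if too shallow."""
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    if len(pool) < depth:
        return None
    rarefied = {}
    for taxon in rng.sample(pool, depth):
        rarefied[taxon] = rarefied.get(taxon, 0) + 1
    return rarefied


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i) over all taxa."""
    taxa = set(a) | set(b)
    numerator = sum(abs(a.get(t, 0) - b.get(t, 0)) for t in taxa)
    denominator = sum(a.get(t, 0) + b.get(t, 0) for t in taxa)
    return numerator / denominator if denominator else 0.0


# Hypothetical samples with the same composition but a tenfold depth difference.
shallow = {"Lactobacillus": 60, "Escherichia": 40}
deep = {"Lactobacillus": 600, "Escherichia": 400}

rng = random.Random(0)  # fixed seed so the toy run is repeatable
shallow_rarefied = rarefy(shallow, 100, rng)
deep_rarefied = rarefy(deep, 100, rng)

print("raw counts:     ", round(bray_curtis(shallow, deep), 3))
print("rarefied to 100:", round(bray_curtis(shallow_rarefied, deep_rarefied), 3))
```

On raw counts, the two samples look very different purely because of sequencing depth. After subsampling to a common depth, the dissimilarity reflects composition instead.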

I look forward to an answer based on your expertise.
Thank you!
Regards
Muhammad
