Commit
Hi Rory,

We've been using umis for our single-cell RNA-seq and really like it, and we have made some modifications to it. We'll be putting the pipeline on GitHub and wanted to ask whether you have a preference for how we put the umis branch up, or whether you would perhaps include our changes in the master branch. I'll briefly describe our pipeline and the additions to umis.

1. CEL-Seq2 reads
2. umis fastqtransform
3. umis cb_filter
4. umis sb_filter (new function that corrects sample barcodes within edit distance 1 (nedit 1) of a true sample barcode)
5. umis mb_filter (new function that removes any reads whose UMI contains non-ACGT bases, N bases in our case)
6. umis add_uid (new function that adds a unique identifier (UID) tag to the read name, consisting of the concatenation of SB, CB and MB; this is needed to allow UMI-tools dedup on BAM files with multiple cells/samples)
7. Align with STAR
8. Count metafeatures with featureCounts and transfer the gene ID to the BAM file with the XF:Z: tag (HTSeq can also do this but is slower)
9. UMI-tools dedup on a per-gene basis using the gene ID tag in the BAM file
10. Expression matrix from the BAM file

We're now using our own script to get the expression matrix. If you would make umis tagcount compatible with this pipeline, I suggest having both the sample and the cell barcode in the column headers instead of only the cell barcode, and allowing tagcount to count the XF:Z: tags. We also changed the barcode hash in umis to include N bases, so that those also get corrected.

There seems to be interest in such a pipeline, as there are quite a few requests in the UMI-tools comment section on how to preprocess single-cell reads. umis is ideal for this, and that is why we modified it. The bc-bio pipeline currently only works with the pseudo-aligners, which don't work so well on UMI end-tagged libraries, so I think an option to do traditional alignment/gene counting in the bc-bio pipeline would be useful (at least we would have used that if it was available).

I'm mostly running experiments, so this is my first GitHub project and I'm not really sure how best to branch umis. I think our preferred way would be to have our changes in the umis master branch, and maybe a link on the bc-bio scRNA-seq page to our pipeline for people who would like to do traditional alignment/gene counting. But let us know what you prefer. We could also just have the branch at the umis page or on our own page:

https://github.com/MarinusVL/scRNApipe

The pull request contains umis.py and barcodes.py. They work on our CEL-Seq2 data, but I think there may be some bugs for data in other formats (most likely data without sample barcodes or with multiple cell barcodes).

Thanks,
Marinus
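
For illustration, the logic behind steps 4 to 6 can be sketched roughly as below. This is a simplified, hypothetical sketch and not the code in umis.py or barcodes.py: the function names, the `:SAMPLE_`/`:CELL_`/`:UMI_`/`:UID_` read-name layout and the use of substitutions only (Hamming distance 1) for the correction are all assumptions made for the example.

```python
import itertools
import re

def build_barcode_hash(barcodes, alphabet="ACGTN"):
    """Map each sequence within one substitution of a known barcode to it.

    Including N in the alphabet means a barcode read with a single N
    is still corrected. Sequences one substitution away from two
    different known barcodes are ambiguous and are left out.
    """
    barcodes = set(barcodes)
    lookup = {bc: bc for bc in barcodes}
    ambiguous = set()
    for bc in barcodes:
        for i, base in itertools.product(range(len(bc)), alphabet):
            mutant = bc[:i] + base + bc[i + 1:]
            if mutant in barcodes:
                continue  # exact matches always map to themselves
            if mutant in lookup and lookup[mutant] != bc:
                ambiguous.add(mutant)
            else:
                lookup[mutant] = bc
    for mutant in ambiguous:
        del lookup[mutant]
    return lookup

def has_valid_umi(umi):
    """True only if the UMI is non-empty and made of A, C, G, T."""
    return bool(umi) and set(umi) <= set("ACGT")

# Assumed read-name layout: ...:CELL_<cb>:UMI_<mb>:SAMPLE_<sb>
TAG_RE = {
    tag: re.compile(r":%s_([ACGTN]+)" % tag)
    for tag in ("SAMPLE", "CELL", "UMI")
}

def process_read_name(name, sample_lookup):
    """Correct the sample barcode, filter on the UMI, append a UID.

    Returns the new read name, or None if the read should be dropped.
    """
    found = {tag: rx.search(name) for tag, rx in TAG_RE.items()}
    if not all(found.values()):
        return None
    sb, cb, mb = (found[t].group(1) for t in ("SAMPLE", "CELL", "UMI"))
    corrected = sample_lookup.get(sb)
    if corrected is None or not has_valid_umi(mb):
        return None
    name = name.replace(":SAMPLE_" + sb, ":SAMPLE_" + corrected)
    # UID = sample barcode + cell barcode + UMI, concatenated
    return name + ":UID_" + corrected + cb + mb
```

With a UID built this way, reads from different cells or samples that happen to share a UMI still get distinct identifiers, which is what the deduplication step relies on when one BAM file holds many cells.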