Open
Description
When we run this tool on an RNA sample, some of the results are misleading: spliced reads inflate the reported depth at certain locations, particularly in intronic regions. Here's an example line from the output VCF:
1 157550066 . C T . . . DP:RD:AD:VF:DPP:DPN:RDP:RDN:ADP:ADN 8:0:1:0.125:6:2:0:0:1:0
Here the depth (DP) is 8, but the ref and alt depths (RD and AD) add up to only 1. At first I thought the difference was a third allele, but looking at the actual alignment shows that the reads are not really aligned there:
$ samtools tview -p 1:157550066 -d T /path/to/sample.Aligned.sortedByCoord.out.bam --reference /resources/genomes/GRCh37/fasta/b37.fasta
157550071 157550081 157550091 157550101 157550111 157550121 157550131
TGTACTCGGGAAACTAAAAAGGAATGGCAGAAACTGAGGTCTCACCTGGTTTCGTCTCCCAAGAAACCAACTCCTGCAAA
................................................................................
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<,,,,,,,,,,,,,,,,,,,,,,,,,,,,<<<<<<<
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<,,,,,,,,,,,,,,,,,,,,,,,,,,,,<<<<<<<
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>............................>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>............................>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>............................>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>...................................
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>...................................
................................................................................
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
....................................
....................................
..>>>>>>>
The depth seems to be inflated by the reads shown as <<< or >>>. I think this is misleading, at least for RNA. Do you know whether --filter_indel filters out these reads? My concern is that indel and spliced reads would both be filtered out. Is there any way to treat these two types of alignment differently?
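To make the distinction concrete, here is a minimal sketch of the kind of check I have in mind. It is not this tool's code, and the helper name `base_at` and the coordinates in the usage lines are made up for illustration. It walks a read's CIGAR string and reports whether a reference position falls in an aligned block (M, = or X), a deletion (D), or a reference skip (N, which is how spliced aligners typically encode introns), so that the two cases could be counted or filtered separately.

```python
import re

# One CIGAR element: a length followed by an operation code (SAM spec).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def base_at(ref_start, cigar, pos):
    """Classify how a read covers reference position `pos`.

    `ref_start` is the leftmost aligned reference coordinate of the read,
    in the same coordinate system as `pos`.
    Returns "match", "deletion", "skip", or None if the read does not
    span the position at all. (Hypothetical helper, for illustration.)
    """
    ref = ref_start
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op in "M=X":
            kind = "match"      # an actual base is aligned here
        elif op == "D":
            kind = "deletion"   # base deleted relative to the reference
        elif op == "N":
            kind = "skip"       # reference skip, e.g. a spliced intron
        else:
            continue            # I, S, H, P do not consume the reference
        if ref <= pos < ref + length:
            return kind
        ref += length
    return None


# Invented coordinates: a spliced read and a read with a 2 bp deletion.
print(base_at(100, "50M1000N50M", 600))  # inside the N block
print(base_at(100, "10M2D10M", 110))     # inside the D block
```

With something like this, only reads returning "match" would count toward depth, "deletion" reads could still be handled by --filter_indel, and "skip" reads could be ignored for depth without being treated as indels.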