rna genotyping #13

Open
@anoronh4

Description

When we run this tool on an RNA sample, some of the results are misleading: spliced reads inflate the depth at certain positions, particularly in intronic regions. Here's an example from the output VCF:

1	157550066	.	C	T	.	.	.	DP:RD:AD:VF:DPP:DPN:RDP:RDN:ADP:ADN	8:0:1:0.125:6:2:0:0:1:0
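
To make the mismatch in this record explicit, the FORMAT keys can be paired with the sample values. This is only a minimal sketch: it assumes, from the field names alone, that DP is the total depth, RD the reference-supporting depth, and AD the alternate-supporting depth; the tool's documentation is not quoted here.

```python
# Pair the FORMAT keys with the sample values copied from the record above.
format_keys = "DP:RD:AD:VF:DPP:DPN:RDP:RDN:ADP:ADN".split(":")
sample_vals = "8:0:1:0.125:6:2:0:0:1:0".split(":")
fields = dict(zip(format_keys, sample_vals))

# Assumed meanings (inferred from the names, not confirmed):
# DP = total depth, RD = ref depth, AD = alt depth.
dp, rd, ad = int(fields["DP"]), int(fields["RD"]), int(fields["AD"])

# Reads counted in DP that are in neither RD nor AD.
unaccounted = dp - (rd + ad)
print(f"DP={dp} RD={rd} AD={ad} unaccounted={unaccounted} AD/DP={ad / dp}")
```

Under those assumptions, 7 of the 8 reads counted in DP support neither allele, and the reported VF of 0.125 matches AD/DP.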

So here the depth is 8, but the ref and alt counts add up to only 1. At first I thought the remainder was a third allele, but the actual alignment shows that those reads are not really aligned at this position:

$ samtools tview -p 1:157550066 -d T /path/to/sample.Aligned.sortedByCoord.out.bam --reference /resources/genomes/GRCh37/fasta/b37.fasta
     157550071 157550081 157550091 157550101 157550111 157550121 157550131      
TGTACTCGGGAAACTAAAAAGGAATGGCAGAAACTGAGGTCTCACCTGGTTTCGTCTCCCAAGAAACCAACTCCTGCAAA
................................................................................
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<,,,,,,,,,,,,,,,,,,,,,,,,,,,,<<<<<<<
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<,,,,,,,,,,,,,,,,,,,,,,,,,,,,<<<<<<<
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>............................>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>............................>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>............................>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>...................................
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>...................................
................................................................................
                                     ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
                                            ....................................
                                            ....................................
                                                                       ..>>>>>>>

It seems the depth is inflated by the reads shown as <<< or >>>. I think this is misleading, at least for RNA. Do you know whether --filter_indel filters out these reads? My concern is that indel reads and spliced reads would both be filtered out. Is there any way to treat these two types of alignment differently?
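
For reference, the SAM format does distinguish the two cases at the CIGAR level: `N` marks a skipped region of the reference (a splice gap for RNA alignments), while `D` marks a deletion. The sketch below illustrates counting them separately at one position. It is not this tool's implementation and says nothing about what --filter_indel does; the helper names `op_at` and `depth_counts` and the example reads are hypothetical, invented for illustration.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR ops that advance along the reference


def op_at(read_start, cigar, pos):
    """Return the CIGAR op covering 1-based reference position pos, or None."""
    ref = read_start
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op in REF_CONSUMING:
            if ref <= pos < ref + length:
                return op
            ref += length
    return None


def depth_counts(reads, pos):
    """Count reads at pos by how they cover it: aligned, deletion, or splice gap."""
    counts = {"aligned": 0, "deletion": 0, "splice_gap": 0}
    for start, cigar in reads:
        op = op_at(start, cigar, pos)
        if op in ("M", "=", "X"):
            counts["aligned"] += 1
        elif op == "D":
            counts["deletion"] += 1
        elif op == "N":
            counts["splice_gap"] += 1
    return counts


# Hypothetical reads as (1-based start, CIGAR); not taken from the sample BAM.
reads = [
    (157550000, "50M200N50M"),  # spliced read: the position falls in the N gap
    (157550050, "15M3D80M"),    # the position falls inside a 3 bp deletion
    (157550060, "100M"),        # the position is covered by aligned bases
]
pos = 157550066
print(depth_counts(reads, pos))
```

With a split like this, a depth that excludes `splice_gap` reads but keeps `deletion` reads (or the reverse) becomes a reporting choice rather than a single combined filter.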
