Clarifications #1

kmhernan · 2016-09-13T17:45:58Z

I have been looking through the source code and I would like some clarifications.

I am unable to determine whether the software makes any assumptions about sample names in the VCF. It doesn't appear to really care about the GT columns in the VCF, but I wanted to confirm this. I am mostly concerned that the sample IDs in my BAM files aren't the same as the ones in my VCF (just TUMOR/NORMAL in my VCF).
In your docs it looks like the order of the disease and normal BAM is different from what the command-line help prompt states for the extract-observations part.

Edit:

One more question. Does prosic consider soft-clipped bases?

Thanks!

johanneskoester · 2016-09-15T08:01:04Z

Hi,
thanks for your interest in prosic! Regarding your questions:

Indeed, the sample columns in the VCF are not considered at all. With extract-observations, the first BAM is expected to be the healthy sample, the second the cancer sample. extract-observations then extracts evidence from the BAMs for each entry in the VCF, but it only considers the position, ref, alt and info columns. The calling then recalculates the posteriors for the variants.
Indeed, that was a mistake. Thanks for pointing that out!
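
To illustrate the first point, here is a minimal Python sketch of what "only the position, ref, alt and info columns are considered" means for a VCF record. This is not prosic's code: the `SiteRecord` type and the `parse_site_columns` function are hypothetical names made up for this example, and it assumes a plain tab-separated VCF data line with the standard fixed columns.

```python
from typing import Dict, List, NamedTuple


class SiteRecord(NamedTuple):
    """The per-variant fields that matter here (hypothetical type)."""
    chrom: str
    pos: int
    ref: str
    alts: List[str]
    info: Dict[str, str]


def parse_site_columns(vcf_line: str) -> SiteRecord:
    """Keep CHROM, POS, REF, ALT and INFO; ignore FORMAT and sample columns."""
    fields = vcf_line.rstrip("\n").split("\t")
    # The first eight VCF columns are fixed; anything after them
    # (FORMAT plus one column per sample) is dropped.
    chrom, pos, _id, ref, alt, _qual, _filter, info = fields[:8]
    info_dict: Dict[str, str] = {}
    if info != ".":
        for entry in info.split(";"):
            # Flag entries such as SOMATIC have no "=value" part.
            key, _, value = entry.partition("=")
            info_dict[key] = value
    return SiteRecord(chrom, int(pos), ref, alt.split(","), info_dict)


# Two records that differ only in their genotype columns.
with_gt = "chr1\t100\t.\tA\tT\t.\tPASS\tSOMATIC;DP=30\tGT\t0/0\t0/1"
other_gt = "chr1\t100\t.\tA\tT\t.\tPASS\tSOMATIC;DP=30\tGT\t0/1\t1/1"
same = parse_site_columns(with_gt) == parse_site_columns(other_gt)
```

Under this sketch, `same` is true because the genotype columns never enter the parsed record. Sample names live in the VCF header line rather than in the data lines, so they do not enter it either, which is why TUMOR/NORMAL labels in the VCF need not match the sample IDs in the BAM files.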

I should note that we are currently rewriting prosic into a much faster, more powerful and more solid version (see here). It should be ready within one to two months.
