Description
I have samples basecalled with dorado software (v0.4.1) with detection of 5mCG_5hmCG modifications enabled using either:
[email protected]
[email protected]
I've noticed that the two "batches" have distinct distributions of base modification probabilities. In particular, the R9.4.1 samples have a massive peak of C:m at the far right of the histogram, which looks like some sort of artefact (second plot).
What would explain this, and what is the best way to mitigate it?
command:
modkit sample-probs \
${input_bam} \
--log-filepath ${log} \
--percentiles 0.1,0.25,0.5,0.75,0.9 \
--out-dir ${output} \
--hist \
--prefix ${prefix} \
--suppress-progress \
--force
done
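
To check whether the peak is simply calls saturating at the top of the probability scale, it may help to look at the raw ML values directly. The SAM tags specification stores each modification probability as an integer N from 0 to 255, representing the range N/256 to (N+1)/256, so the right-most histogram bin corresponds to N = 255. The sketch below is only illustrative: the helper names `ml_to_prob_range` and `ml_top_bin_fraction` are hypothetical, and it assumes the ML values for C:m have already been extracted, one integer per line.

```shell
# Hypothetical helper: print the probability interval encoded by one ML value,
# following the SAM tags convention [N/256, (N+1)/256).
ml_to_prob_range() {
  awk -v n="$1" 'BEGIN { printf "%.4f %.4f\n", n / 256, (n + 1) / 256 }'
}

# Hypothetical helper: read one ML value (0-255) per line on stdin and print
# the fraction of values that sit in the top bin (255).
ml_top_bin_fraction() {
  awk '
    # Count only well-formed values in the valid range.
    $1 ~ /^[0-9]+$/ && $1 <= 255 { total++; if ($1 == 255) top++ }
    END {
      if (total > 0) printf "%.3f\n", top / total
      else print "NA"
    }
  '
}
```

A large value from `ml_top_bin_fraction` for the R9.4.1 samples compared with the other batch would suggest the peak comes from the model's own confidence distribution rather than from the plotting step. Note that the ML array interleaves probabilities for every modification code listed in the MM tag, so C:m and C:h values need to be separated before counting.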