Mutation randomization based on trinucleotide-context-specific mutation rates.
The following topics are covered:
- Trinucleotide context profile
- Mutation rate as normalized trinucleotide context profile
- Randomize mutations based on mutation rate
- Expectation and variance of a score based on mutation rate
- Sample-specific profiles based on prior profiles
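As a rough orientation, the first four topics can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the code from the notebooks: the function names (`mutation_rate`, `site_probabilities`, `randomize`, `score_moments`), the `(triplet, alternate base)` keys, and all numbers are hypothetical, and mutations are assumed to be drawn independently with replacement.

```python
import random

# Toy numbers for illustration only (not taken from the repository).
# Profile: observed mutation counts keyed by (reference triplet, alternate base).
profile = {("ACA", "A"): 30, ("ACG", "T"): 50, ("TCT", "G"): 20}
# Abundance of each reference triplet in the region of interest.
triplet_counts = {"ACA": 3000, "ACG": 500, "TCT": 2000}


def mutation_rate(profile, triplet_counts):
    """Divide each count by its triplet abundance, then rescale to sum to 1."""
    raw = {key: n / triplet_counts[key[0]] for key, n in profile.items()}
    total = sum(raw.values())
    return {key: r / total for key, r in raw.items()}


def site_probabilities(site_keys, rate):
    """Probability of each candidate site, proportional to its context rate."""
    weights = [rate[key] for key in site_keys]
    total = sum(weights)
    return [w / total for w in weights]


def randomize(site_keys, rate, n_mutations, seed=0):
    """Draw n_mutations site indices independently, with replacement."""
    probs = site_probabilities(site_keys, rate)
    rng = random.Random(seed)
    return rng.choices(range(len(site_keys)), weights=probs, k=n_mutations)


def score_moments(scores, probs, n_mutations):
    """Expectation and variance of the mean score of n independent draws."""
    mean = sum(p * s for p, s in zip(probs, scores))
    var_single = sum(p * s * s for p, s in zip(probs, scores)) - mean ** 2
    return mean, var_single / n_mutations


rate = mutation_rate(profile, triplet_counts)
# Candidate sites in a hypothetical region, each with a context key and a score.
site_keys = [("ACA", "A"), ("ACG", "T"), ("TCT", "G"), ("ACG", "T")]
scores = [0.0, 1.0, 2.0, 3.0]
probs = site_probabilities(site_keys, rate)
draws = randomize(site_keys, rate, n_mutations=5)
expected, variance = score_moments(scores, probs, n_mutations=5)
```

Because the draws are assumed independent, the expectation and variance follow in closed form from the site probabilities, so no simulation is needed to obtain them. The randomization itself is only required when the full distribution of the score is of interest.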
Main explainer:
/notebooks/randomization.ipynb
Test data:
/notebooks/create_test_data.ipynb
conda env create -f environment.yml
bgsignatures:
pip install bgsignatures
bgreference:
conda install -c conda-forge -c bbglab bgreference
The trinucleotide content of the reference exome was downloaded from the SigProfilerMatrixGenerator
GitHub repository:
- triplet counts per chromosome in the exome: context_counts_GRCh38_96_exome.csv
- other context categories: https://github.com/AlexandrovLab/SigProfilerMatrixGenerator/tree/master/SigProfilerMatrixGenerator/references/chromosomes/context_distributions
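Per-chromosome counts usually need to be summed into exome-wide totals before they can be used to normalize a profile. The sketch below shows one way to do that with the standard library. The column layout (a `triplet` column followed by one column per chromosome), the function name `exome_triplet_counts`, and the numbers are assumptions made for illustration; check the header of the actual file before adapting it.

```python
import csv
import io

# Hypothetical layout and toy numbers: the first column names the triplet,
# every other column holds the count for one chromosome.
toy_csv = """triplet,chr1,chr2,chrX
ACA,120,95,40
ACG,30,22,9
TCT,110,87,35
"""


def exome_triplet_counts(handle):
    """Sum the per-chromosome counts of each triplet into a single total."""
    totals = {}
    for row in csv.DictReader(handle):
        triplet = row.pop("triplet")
        totals[triplet] = sum(int(value) for value in row.values())
    return totals


counts = exome_triplet_counts(io.StringIO(toy_csv))
```

To read from disk instead of the in-memory sample, pass an open file handle in place of `io.StringIO(toy_csv)`.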