The objective of this Python module is to combine PCAngsd with lostruct to run local PCA on low-coverage data (genotype likelihoods).
Genotype likelihood files can be large and will often not fit into memory.
This module uses Xarray to store and access genotype likelihoods on disk, in a data structure comparable to that of sgkit.
Similarly, PCA results are stored as an xarray dataset for easy manipulation and storage.
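As a rough illustration of the local PCA idea, the sketch below splits a run of sites into fixed-size windows, each of which would get its own PCA. It assumes windows are defined by a number of sites; the function name iter_windows and its parameters are hypothetical and are not part of the local_pcangsd API.

```python
from typing import Iterator, Tuple


def iter_windows(n_sites: int, window_size: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open (start, stop) index ranges that cover n_sites sites.

    Hypothetical helper for illustration only; the last window may be
    shorter than window_size.
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    for start in range(0, n_sites, window_size):
        yield start, min(start + window_size, n_sites)


# Made-up example: 10 sites split into windows of 4 sites.
windows = list(iter_windows(n_sites=10, window_size=4))
print(windows)  # [(0, 4), (4, 8), (8, 10)]
```

Each range could then be used to slice the on-disk genotype likelihoods, so that only one window needs to be held in memory at a time.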
Requirements are listed in the conda_env.yaml file.
The easiest way to install local_pcangsd and its dependencies is through conda:
mamba env create -f conda_env.yaml
# OR conda env create -f conda_env.yaml
conda activate local_pcangsd
git clone https://github.com/alxsimon/local_pcangsd.git
pip install ./local_pcangsd
If you want to add the conda environment as a Jupyter kernel:
conda activate local_pcangsd
python -m ipykernel install --user --name local_pcangsd
Example code is presented in example.ipynb.