11 Aug 14:03

nebfield

28a0971

pgsc_calc v2.0.0-alpha.1

This patch fixes a bug when running the workflow directly from GitHub with the test profile (i.e. without cloning first). Thanks to @staedlern for reporting the problem.


07 Aug 16:14

smlmbrt

v2.0.0-alpha

af4882c

pgsc_calc v2.0.0-alpha

This is the alpha release of the pgsc_calc pipeline's major new feature: to compare samples to a reference population in order to adjust PGS with genetic ancestry data (see documentation for details). The normal calculation of PGS is largely unaffected and directly comparable with previous versions of the calculator and PGS calculated with other tools.

Features

Major

Breaking changes to samplesheet structure to provide more flexible support for extra genomic file types in the future.
When the --run_ancestry flag is supplied, the genetic ancestry similarity of each sample to the groups in a population reference panel (default: 1000 Genomes) is calculated. This uses PCA and projection as implemented in the fraposa_pgsc (v0.1.0) package.
Calculated PGS can be adjusted for genetic ancestry using empirical PGS distributions from the most similar reference panel population or continuous PCA-based regressions.

These new features are optional and don't run in the default workflow.
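To illustrate the idea of adjusting a PGS against an empirical reference distribution, here is a minimal sketch in Python. It is not the pipeline's implementation: the function names, the population labels, and the example numbers are all hypothetical, and it assumes the most similar reference population has already been chosen.

```python
from statistics import mean, stdev


def zscore_in_reference(target_pgs, reference_pgs):
    """Express a target PGS as a Z-score relative to a reference population.

    Hypothetical helper: uses the sample mean and standard deviation of the
    reference population's PGS distribution.
    """
    sigma = stdev(reference_pgs)
    if sigma == 0:
        raise ValueError("reference PGS distribution has zero variance")
    return (target_pgs - mean(reference_pgs)) / sigma


def percentile_in_reference(target_pgs, reference_pgs):
    """Percentage of reference samples with a PGS strictly below the target."""
    below = sum(1 for score in reference_pgs if score < target_pgs)
    return 100 * below / len(reference_pgs)


# Made-up reference PGS values, keyed by a made-up population label.
reference_panel = {"POP_A": [1.0, 1.2, 1.4], "POP_B": [2.0, 2.5, 3.0]}

# Assume this sample was found to be most similar to POP_A.
most_similar = "POP_A"
z = zscore_in_reference(1.6, reference_panel[most_similar])
pct = percentile_in_reference(1.3, reference_panel[most_similar])
```

The continuous PCA-based regression approach mentioned above is a different technique and is not shown in this sketch.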

Minor

Speed optimizations for PGS scoring (skipping allele frequency calculation). Thanks to @mglev1n for the suggestion!

Credits

Contributions from: @nebfield @smlmbrt @ens-lgil

Contributors

smlmbrt, ens-lgil, and 2 other contributors


27 Jan 21:21

nebfield

v1.3.2

bd1ca59

pgsc_calc v1.3.2

This patch fixes a bug that caused the effect weight column in some PGS Catalog scoring files to be read as strings instead of floats, which triggered an assertion error. Thanks to @j0n-a for reporting the problem.
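As a rough illustration of this class of bug, the sketch below reads a tab-separated scoring file and converts the effect weight column to floats explicitly rather than relying on type inference. It is an assumption-laden example, not the code that was patched: the helper name and the sample rows are hypothetical, and only the standard library is used.

```python
import csv
import io


def read_effect_weights(handle):
    """Read a tab-separated scoring file and return rows with float weights.

    Hypothetical helper: header comment lines starting with '#' are skipped,
    and the effect_weight column is converted to float explicitly, so a value
    that cannot be parsed raises ValueError immediately.
    """
    data_lines = (line for line in handle if not line.startswith("#"))
    rows = []
    for row in csv.DictReader(data_lines, delimiter="\t"):
        row["effect_weight"] = float(row["effect_weight"])
        rows.append(row)
    return rows


# Made-up scoring file content for demonstration only.
example = io.StringIO(
    "#example header comment\n"
    "rsID\teffect_allele\teffect_weight\n"
    "rs1\tA\t0.12\n"
    "rs2\tG\t-3.4e-05\n"
)
weights = [row["effect_weight"] for row in read_effect_weights(example)]
```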

Contributors

j0n-a


24 Jan 11:58

nebfield

v1.3.1

54a425a

pgsc_calc v1.3.1

This patch fixes a bug that breaks the workflow if all variants in one or more PGS scoring files match perfectly with the target genomes. Thanks to @lemieuxl for reporting the problem.

Contributors

lemieuxl


21 Nov 17:17

nebfield

v1.3.0

94d054e

pgsc_calc v1.3.0

This release is focused on improving scalability.

Features

Variant matching is now more efficient, using a split-apply-combine approach when the data is split across chromosomes. This supports parallel PGS calculation for the largest traits in the PGS Catalog (e.g. cancer, 418 PGS [avg 261,000 variants/score]) on big datasets such as UK Biobank.
Better support for running in offline environments:
- Internet access is only required to download scores by ID. Scores can be pre-downloaded using the utils package (https://pypi.org/project/pgscatalog-utils/)
- Scoring file metadata is read from headers and displayed in the report (removed API calls during report generation)
Implemented a flag (--efo_direct) to return only PGS tagged with the exact EFO term (e.g. no PGS for child/descendant terms in the ontology)
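The split-apply-combine pattern mentioned in the first feature can be sketched as follows. This is a simplified illustration under stated assumptions, not the pipeline's matching code: variants are plain tuples, all function names are hypothetical, and a scoring variant counts as matched when its effect allele equals either allele observed at the same position in the target data.

```python
from collections import defaultdict


def split_by_chromosome(variants):
    """Split: group variant tuples by their first field (the chromosome)."""
    groups = defaultdict(list)
    for variant in variants:
        groups[variant[0]].append(variant)
    return groups


def match_one_chromosome(scoring_variants, target_variants):
    """Apply: match scoring variants (chrom, pos, effect_allele) against
    target variants (chrom, pos, ref, alt) within a single chromosome."""
    alleles_by_pos = defaultdict(set)
    for _, pos, ref, alt in target_variants:
        alleles_by_pos[pos].update((ref, alt))
    return [v for v in scoring_variants if v[2] in alleles_by_pos.get(v[1], set())]


def match_variants(scoring, target):
    """Combine: concatenate the per-chromosome results."""
    scoring_groups = split_by_chromosome(scoring)
    target_groups = split_by_chromosome(target)
    matched = []
    for chrom in sorted(scoring_groups):
        matched.extend(
            match_one_chromosome(scoring_groups[chrom], target_groups.get(chrom, []))
        )
    return matched


# Made-up variants for demonstration only.
scoring = [("1", 100, "A"), ("2", 200, "T"), ("2", 300, "G")]
target = [("1", 100, "A", "G"), ("2", 200, "C", "G"), ("2", 300, "G", "A")]
matched = match_variants(scoring, target)
```

Because each call to match_one_chromosome only touches one chromosome's data, the calls are independent and could be run in parallel or one at a time to limit memory use.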


19 Oct 09:11

nebfield

v1.2.0

7e11510

pgsc_calc v1.2.0

This release is focused on improving memory and storage usage.

Features

Genotype dosages can now be imported from VCF by specifying the vcf_genotype_field of the samplesheet (default: GT / hard calls)
Makes use of durable caching when relabelling and recoding target genomes (--genotypes_cache)
Improvements to use less storage space:
- All intermediate files are now compressed by default
- Add parameter to support zstd compressed input files
Improved memory usage when matching variants

(updated tagged release to fix docs)


16 Sep 13:50

nebfield

v1.1.0

4952d21

pgsc_calc v1.1.0

The first public release of the pgsc_calc pipeline. This release adds compatibility
for every score published in the PGS Catalog. Each scoring file in the PGS Catalog
has been processed to provide consistent genomic coordinates in builds GRCh37 and GRCh38.
The pipeline has been updated to take advantage of the harmonised scoring files (see
PGS Catalog downloads for additional details).

Features

Many of the underlying software tools are now implemented within a pgscatalog_utils
package (v0.1.2, https://github.com/PGScatalog/pgscatalog_utils and
https://pypi.org/project/pgscatalog-utils/). The packaging allows for independent
testing and development of tools for downloading and working with the scoring files.
The output report has been improved to have more detailed metadata describing
the scoring files and how well the variants match the target sampleset(s).
Improvements to variant matching:
- More precise control of variant matching parameters is now possible, like
  ignoring strand flips
- match_variants should now use less RAM by default:
  - A laptop with 16GB of RAM should be able to comfortably calculate scores on
    the 1000 genomes dataset
  - Fast matching mode (--fast_match) is available if ~32GB of RAM is
    available and you'd like to calculate scores for larger datasets
Groups of scores from the PGS Catalog can be calculated by specifying a
--trait (EFO ID) or --publication (PGP ID), in addition to individual
scoring files with --pgs_id (PGS ID).
Score validation has been integrated with the test suite
Support for M1 Macs with --platform parameter (docker executor only)

Bug fixes

Implemented a more robust prioritisation procedure if a variant has multiple
candidate matches or duplicated IDs
Fixed processing multiple samplesets in parallel (e.g. 1000 Genomes + UK
Biobank)
When combining multiple scoring files, all variants are now kept to reflect the
correct denominator for % matching statistics.
Fixed the matched effect allele not being correctly complemented when
correcting for strand flips
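For readers unfamiliar with strand flips, the sketch below shows the general idea behind the last fix: if an effect allele does not match the target alleles directly, its complement is tried, and the complemented allele is the one reported as matched. This is a hypothetical illustration, not the pipeline's code. It assumes simple per-base complementing, and it does not handle strand-ambiguous variants (A/T or C/G) specially.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(allele):
    """Return the per-base complement of an allele (e.g. 'A' -> 'T')."""
    return allele.upper().translate(COMPLEMENT)


def match_effect_allele(effect_allele, ref, alt):
    """Return (matched_allele, was_flipped), or None if nothing matches.

    Hypothetical helper: a direct match is preferred; otherwise the
    complemented effect allele is compared against the target alleles.
    """
    if effect_allele in (ref, alt):
        return effect_allele, False
    flipped = complement(effect_allele)
    if flipped in (ref, alt):
        # Report the complemented allele, since that is what the target uses.
        return flipped, True
    return None


direct = match_effect_allele("A", "A", "G")
flipped = match_effect_allele("A", "T", "C")
unmatched = match_effect_allele("A", "C", "G")
```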


26 May 13:39

nebfield

v1.0.0

4949306

v1.0.0 Pre-release


This release reliably calculates scores that contain chromosomal positions (scores with only rsID information will fail). Significant effort has been made to validate scores on different reference datasets. In the next release we'll add score validation to our test suite to make sure calculated scores are consistent between releases.

Changelog

Add support for PLINK2 format (samplesheet structure changed)
Add support for allosomes (e.g. X, Y)
Improve PGS Catalog compatibility (e.g. missing other allele)
Add automatic liftover of scoring files to match target genome build
Performance improvements to support UK Biobank scale data (500,000 genomes)
Support calculation of multiple scores in parallel
Significantly improved test coverage (> 80%)
Lots of other small changes to improve correctness and handling edge cases

In Development

This is marked as a pre-release because it will fail for PGS Catalog scores that only have an rsID. Mapped positions will eventually be provided for existing scores via the PGS Catalog API and these will be integrated into the calculator pipeline.


04 Feb 10:16

nebfield

0.1.3dev

6f3f042

0.1.3dev Pre-release


[0.1.3dev] - 2022-02-04

pgsc_calc should run on GRCh37 scoring files from the PGS Catalog & GRCh37 target genomic data but 🚨 don't trust the output 🚨

This release is the final implementation of the MVP.

Changelog

Better support for calling pipeline via an API
Documentation(!)
Better schemas for validation


17 Jan 17:35

nebfield

0.1.2dev

581fc48

0.1.2dev Pre-release


[0.1.2dev] - 2022-01-17

pgsc_calc should run on GRCh37 scoring files from the PGS Catalog & GRCh37 target genomic data but 🚨 don't trust the output 🚨

Enhancements & fixes

#2: Set up GitHub Actions CI and linting
A lot of work to integrate with IGS4EU (e.g. JSON input)

Releases: PGScatalog/pgsc_calc