
pgsc_calc v1.3.0

@nebfield released this 21 Nov 17:17
· 365 commits to main since this release
94d054e

This release is focused on improving scalability.

Features

  • Variant matching is now more efficient: when the data is split across chromosomes, a split-apply-combine approach is used. This supports parallel PGS calculation for the largest traits in the PGS Catalog (e.g. cancer, 418 PGS [avg 261,000 variants/score]) on big datasets such as UK Biobank.
  • Better support for running in offline environments:
    • Internet access is only required to download scores by ID. Scores can be pre-downloaded using the utils package (https://pypi.org/project/pgscatalog-utils/)
    • Scoring file metadata is read from headers and displayed in the report (removed API calls during report generation)
  • Implemented a flag (--efo_direct) to return only PGS tagged with the exact EFO term (i.e. no PGS for child/descendant terms in the ontology)
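
The split-apply-combine idea behind the variant matching change can be illustrated with a minimal Python sketch. This is not the pgsc_calc implementation: the toy variant tuples, the function names (`split_by_chromosome`, `match_chromosome`, `match_variants`), and the use of a thread pool are all assumptions made for illustration only.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy data: (chromosome, position, effect_allele)
SCORING_VARIANTS = [
    ("1", 1000, "A"),
    ("1", 2000, "G"),
    ("2", 1500, "T"),
    ("2", 3000, "C"),
]
TARGET_VARIANTS = [
    ("1", 1000, "A"),
    ("2", 1500, "T"),
    ("2", 9999, "G"),
]


def split_by_chromosome(variants):
    """Split: group variants by their chromosome label."""
    groups = defaultdict(list)
    for variant in variants:
        groups[variant[0]].append(variant)
    return groups


def match_chromosome(scoring, target):
    """Apply: keep scoring variants that also appear in the target data."""
    target_set = set(target)
    return [variant for variant in scoring if variant in target_set]


def match_variants(scoring, target):
    """Combine: match each chromosome independently, then concatenate."""
    scoring_groups = split_by_chromosome(scoring)
    target_groups = split_by_chromosome(target)
    chromosomes = sorted(scoring_groups)
    with ThreadPoolExecutor() as pool:
        # Each chromosome is an independent unit of work.
        per_chromosome = pool.map(
            lambda chrom: match_chromosome(
                scoring_groups[chrom], target_groups.get(chrom, [])
            ),
            chromosomes,
        )
        return [variant for matches in per_chromosome for variant in matches]


matched = match_variants(SCORING_VARIANTS, TARGET_VARIANTS)
print(matched)
```

Because each chromosome is matched independently, the per-chromosome steps can in principle run in parallel, and only the final concatenation needs all results at once.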