v0.1.4

Latest
@shz9 released this 03 Dec 08:55

Changed

  • Updated the data type of the index pointer in the LDMatrix object to int64. int32 overflows
    for very large datasets with millions of variants.
  • Updated the way we determine the pandas chunksize when converting from plink tables to zarr.
  • Simplified the way we compute the quantization scale in model_utils.
  • Fixed major bug in how LD window thresholds that are passed to plink1.9 are computed.
  • Fixed in-place fillna in from_plink_table in LDMatrix to conform to latest pandas API.
  • Updated run_shell_script to check for and capture errors.
  • Refactored code to slightly reduce import/load times.
  • Cleaned up load_data method of LDMatrix and subsumed functionality in load_rows.
  • Fixed bugs in match_snp_tables.
  • Fixed bugs in the block LD estimator and rewrote how it is computed with both the plink and xarray backends.
  • Updated from_plink_table method in LDMatrix to handle cases where boundaries are different from what
    plink computes.
  • Fixed bug in the symmetrize_ut_csr_matrix utility function.
  • Changed default storage data type for LD matrices to int16.
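
The index pointer change above can be illustrated with a small sketch. It assumes a CSR-style layout in which the index pointer is the running total of stored entries per row; the helper names build_indptr and indptr_is_valid and the row counts are hypothetical and not taken from the package. With int32 the running total wraps around silently once it passes 2**31 - 1, while int64 keeps the pointer non-decreasing.

```python
import numpy as np


def build_indptr(row_counts, dtype):
    """Build a CSR-style index pointer (hypothetical helper) with a given dtype."""
    counts = np.asarray(row_counts).astype(dtype)
    indptr = np.zeros(len(counts) + 1, dtype=dtype)
    # Request the accumulator dtype explicitly; otherwise NumPy may upcast
    # small integer types to the platform integer and hide the overflow.
    indptr[1:] = np.cumsum(counts, dtype=dtype)
    return indptr


def indptr_is_valid(indptr):
    """Check that the pointer starts at 0 and never decreases."""
    indptr = np.asarray(indptr)
    # Compare neighbours directly: np.diff on int32 would wrap around too.
    return bool(indptr[0] == 0 and np.all(indptr[1:] >= indptr[:-1]))


# Made-up counts: three rows with one billion stored entries each.
row_counts = [1_000_000_000, 1_000_000_000, 1_000_000_000]

indptr32 = build_indptr(row_counts, np.int32)  # last entry wraps around
indptr64 = build_indptr(row_counts, np.int64)  # last entry is 3 billion

print(indptr_is_valid(indptr32), indptr_is_valid(indptr64))
```

The neighbour comparison in indptr_is_valid is deliberate: subtracting wrapped int32 values can itself wrap back to a positive number, so a check based on differences could miss the problem.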

Added

  • Added extra validation checks in LDMatrix to ensure that the index pointer is formatted correctly.
  • Added LDLinearOperator class to allow for efficient linear algebra operations on the LD matrix without
    representing the full symmetric matrix in memory.
  • Added utility methods to LDMatrix class to allow for computing eigenvalues, performing SVD, etc.
  • Added spectral properties to the attributes of LD matrices.
  • Added support to slice/retrieve entries of LD matrix by using SNP rsIDs.
  • Added support for reading LD matrices from AWS S3 storage.
  • Added utility method to detect if a file contains header information.
  • Added utility method to generate overlapping windows over a sequence.
  • Added compute_extremal_eigenvalues to allow the user to compute extremal (minimum and maximum) eigenvalues
    of LD matrices.
  • Added the utility function combine_ld_matrices to allow for combining LD matrices from different sources.
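
As a rough illustration of the idea behind a linear operator over an LD matrix, the sketch below wraps an upper-triangular CSR matrix in a generic SciPy LinearOperator and computes its extremal eigenvalues without forming the full symmetric matrix. This is not the LDLinearOperator or compute_extremal_eigenvalues implementation; the function name symmetric_operator_from_upper and the toy tridiagonal matrix are assumptions made for the example.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import LinearOperator, eigsh


def symmetric_operator_from_upper(upper):
    """Wrap an upper-triangular CSR matrix (diagonal included) as the
    symmetric operator A = U + U^T - diag(U). Hypothetical helper."""
    upper = csr_matrix(upper)
    diag = upper.diagonal()

    def matvec(x):
        x = np.asarray(x).ravel()
        # upper.T is a CSC view of the same stored entries, so the
        # full symmetric matrix is never materialized.
        return upper @ x + upper.T @ x - diag * x

    return LinearOperator(upper.shape, matvec=matvec, rmatvec=matvec,
                          dtype=upper.dtype)


# Toy symmetric "LD-like" matrix: ones on the diagonal, 0.4 next to it.
n = 8
dense = np.eye(n)
for i in range(n - 1):
    dense[i, i + 1] = dense[i + 1, i] = 0.4

op = symmetric_operator_from_upper(np.triu(dense))

# Extremal eigenvalues via ARPACK, using only matrix-vector products.
largest = eigsh(op, k=1, which="LA", return_eigenvectors=False)[0]
smallest = eigsh(op, k=1, which="SA", return_eigenvectors=False)[0]
print(smallest, largest)
```

Iterative solvers such as eigsh only need matrix-vector products, which is why storing one triangle of the matrix is enough here.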