Releases: lczech/genesis
genesis v0.33.0
Notable changes
- sequence
- Add k-mer classes and functions
- Extraction from sequences and strings
- Scanning microvariant neighborhood
- Minimal canonical encoding functionality
- Helper functions for canonical k-mers
- Add setting for Fasta Reader to not validate labels
- Allow empty sequences in Fastx Input View Stream
- taxonomy
- Change Taxon ID to uint instead of string
- Refactor Taxonomy for simplicity and speed
- Refactor NCBI taxonomy reader to improve speed and memory
- Add Accession Lookup table class
- Add Taxon Kmer Data class
- Add Taxonomy kmer grouping functions
- Add Taxonomy Json Reader and Writer
- utils
- Bitvector
- Refactor Bitvector to use more free functions
- Add functions offering different lengths policies
- Add find first and last bit set functions
- Add Jaccard similarity functions
- Add Bitvector serialization functionality
- Add hierarchical agglomerative clustering functionality
- Add Bitpacked Vector class
- Add Concurrent Vector Guard class
- Add Shannon entropy function
- Extend median and quartiles functions to other numerical types
- Refactor serialization functionality for simplicity
- Add Thread Local Cache helper class
- Add exception handler callback to Thread Pool
- Refine Thread Pool timed wait functions
- CMake and build
- Fix compatibility issues with C++17 and later
- Add auto-detection of C++ standard to CMake
- Activate automatic AVX detection in CMake
- Deprecate MacOS 12 in GitHub Actions CI
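The canonical k-mer helpers listed under sequence can be illustrated with a minimal, standalone sketch. It uses the common convention of taking the smaller of a k-mer and its reverse complement under a 2-bit encoding. The function names (`encode_base`, `encode_kmer`, `reverse_complement`, `canonical_kmer`) are hypothetical and are not the genesis API, and this convention is not necessarily the "minimal canonical encoding" that genesis implements.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (not genesis API): 2-bit code with A=0, C=1, G=2, T=3.
inline std::uint64_t encode_base(char c)
{
    switch (c) {
        case 'A': case 'a': return 0;
        case 'C': case 'c': return 1;
        case 'G': case 'g': return 2;
        case 'T': case 't': return 3;
        default: throw std::invalid_argument("Invalid nucleotide in k-mer");
    }
}

// Pack a k-mer (1 <= k <= 32) into an integer, first base in the highest bits.
inline std::uint64_t encode_kmer(std::string const& kmer)
{
    if (kmer.empty() || kmer.size() > 32) {
        throw std::invalid_argument("k must be between 1 and 32");
    }
    std::uint64_t value = 0;
    for (char c : kmer) {
        value = (value << 2) | encode_base(c);
    }
    return value;
}

// Reverse complement of a packed k-mer: read bases from the low end,
// complement each one (3 - code), and append it to the result.
inline std::uint64_t reverse_complement(std::uint64_t value, std::size_t k)
{
    std::uint64_t result = 0;
    for (std::size_t i = 0; i < k; ++i) {
        result = (result << 2) | (3 - (value & 3));
        value >>= 2;
    }
    return result;
}

// Canonical form: the smaller of the forward and reverse complement codes,
// so that both strands of the same k-mer map to one value.
inline std::uint64_t canonical_kmer(std::string const& kmer)
{
    std::uint64_t const fwd = encode_kmer(kmer);
    std::uint64_t const rev = reverse_complement(fwd, kmer.size());
    return std::min(fwd, rev);
}
```

With this convention, `canonical_kmer("GAT")` and `canonical_kmer("ATC")` yield the same value, because ATC is the reverse complement of GAT.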
Bug fixes
- Fix several issues with different compilers in CI
- Fix exception handling bug in Thread Pool
- Fix numerical epsilon for SVG color bar
- Fix nan lengths in Jplace Writer
- Fix taxonomy data check functions
- Fix taxonomy preorder iterator reference bugs
- Fix edge case in Newick Reader
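As a rough illustration of the Jaccard similarity functions added for Bitvector in v0.33.0, the sketch below computes the Jaccard index of two equal-length bit vectors using `std::vector<bool>`. The function name `jaccard_similarity` and the choice to return 0 when both vectors are empty of set bits are assumptions for this example. The genesis functions operate on its own Bitvector class and may handle differing lengths and empty sets differently.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical standalone function (not genesis API):
// Jaccard index = |A intersect B| / |A union B| over the set bits.
inline double jaccard_similarity(
    std::vector<bool> const& a,
    std::vector<bool> const& b
) {
    if (a.size() != b.size()) {
        throw std::invalid_argument("Bit vectors must have equal length");
    }
    std::size_t intersection_count = 0;
    std::size_t union_count = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] && b[i]) {
            ++intersection_count;
        }
        if (a[i] || b[i]) {
            ++union_count;
        }
    }
    // Assumed convention: two vectors without any set bits have similarity 0.
    if (union_count == 0) {
        return 0.0;
    }
    return static_cast<double>(intersection_count)
        / static_cast<double>(union_count);
}
```

A production implementation would typically work on whole machine words and use population counts instead of iterating bit by bit.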
genesis v0.32.0
Notable Changes
- population
- Add per-sample mask tags and functions
- Add mask for provided loci to window averaging function
- Add Genome Locus Set invert function
- Add Genome Window View chromosome lengths
- Exclude missing data from available loci window avg function
- tree
- Add tree drawing wrapper functions with node and edge shapes
- Refine tree postorder iterator for speed
- sequence
- Refactor and rename Fasta and Fastq Iterators to Streams
- Speed up Fasta and Fastq reading and offer string view reading
- utils
- Speed up Thread Pool by using Concurrent Queue
- Add several convenience functions to Thread Pool and threading
- Add hardware feature detection and current resource usage functions
- Add Sequential Output Buffer class
- Add stdin input source and stdout stderr output targets
- Refactor Input Stream get line functions to use AVX2
- Outsource int parsing from input stream
- bugfixes
- Fix reference base lower case comparison issue in population
- Fix VCF with non-SNP AD field entries and deletions
genesis v0.31.1
This release mainly fixes the window averaging approach for FST, which was accidentally left statistically nonsensical in the previous release.
Notable Changes
- Redesign FST window averaging implementation
- Refine diversity denominator for low read depths
- Add Window Stream begin and end callbacks
genesis v0.31.0
This release is a major clean-up of the population classes and functions. In particular: (1) "Iterator" classes have been renamed to the more appropriate "Stream", and (2) the Variant filtering approach has been completely redesigned to use tags instead of fully removing positions from the stream, allowing us to properly compute per-window averages of statistics.
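To make the idea behind tagging filters more concrete, here is a minimal sketch of why keeping filtered positions in the stream helps with per-window averages. All type and function names (`FilterTag`, `TaggedPosition`, `WindowSummary`, `summarize_window`, `window_average`) are hypothetical and do not reflect the actual genesis classes, and the choice of denominator (passing SNPs plus invariant positions with valid data) is an assumption made for illustration.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical tags: a position is labeled instead of being dropped.
enum class FilterTag { kPassed, kNotSnp, kMissing, kLowDepth };

struct TaggedPosition {
    std::size_t position;
    FilterTag   tag;
    double      statistic; // Only meaningful if tag == kPassed.
};

struct WindowSummary {
    std::size_t passed = 0;       // SNPs contributing to the statistic
    std::size_t invariant = 0;    // valid data, but not a SNP
    std::size_t filtered_out = 0; // missing data, low read depth, etc.
    double      sum = 0.0;
};

// Because filtered positions are still present, the window can count
// how many positions had valid data, not just how many SNPs it saw.
inline WindowSummary summarize_window(std::vector<TaggedPosition> const& window)
{
    WindowSummary summary;
    for (auto const& entry : window) {
        switch (entry.tag) {
            case FilterTag::kPassed:
                ++summary.passed;
                summary.sum += entry.statistic;
                break;
            case FilterTag::kNotSnp:
                ++summary.invariant;
                break;
            default:
                ++summary.filtered_out;
                break;
        }
    }
    return summary;
}

// Assumed averaging policy: divide by all positions with valid data.
inline double window_average(WindowSummary const& summary)
{
    std::size_t const denominator = summary.passed + summary.invariant;
    if (denominator == 0) {
        return std::numeric_limits<double>::quiet_NaN();
    }
    return summary.sum / static_cast<double>(denominator);
}
```

If failing positions were removed from the stream instead, the window would only see the passing SNPs and could not distinguish invariant positions from positions without usable data.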
Notable Changes
- General
- Rename all "Visitor" instances to "Observer" to follow the Observer design pattern
- Refactor observers to have on-enter and on-leave functionality
- population
- Rename Base Counts class to Sample Counts
- Rename all Variant and Window "Iterator" classes to "Stream"
- Rename Sliding Entries Window Stream to Queue Window Stream
- Rename usage of "coverage" to "read depth" in function names
- Major refactor of Variant and Sample Counts filter to use tagging filters
- Add filter categories and summaries, to simplify user output
- Refactor file formats and streams to use tagging filters
- Refactor statistics computations to use tagging filters
- Add proper window averaging support for statistics using tagging filters
- Refactor Queue Window Stream to use tagging filters
- Outsource Genome Stream from Chromosome Stream
- Add Position Window Stream
- Add Variant Gapless Input Stream
- Add Variant Input Stream that merges sample groups
- Add Diversity Processor helper class
- Add re-scaling and re-sampling functions for Sample Counts
- utils
- Add Kendall's Tau correlation functions
- Add multinomial and multivariate hypergeometric distribution functions
- Add betas and intercept coefficients estimation functions for GLM
- Refactor Thread Pool using Proactive Future for nested tasks
- Add thread-safe random engines
- Refine guess thread number functions
- Add auto waiting to parallel for loop functions
- Use global thread pool in gzip block compression
- Disable the local build of htslib if HTSLIB_DIR is provided
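For readers unfamiliar with the Kendall's Tau correlation mentioned under utils, the following is a minimal sketch of the tau-a variant with a simple quadratic-time loop. The function name `kendalls_tau_a` is hypothetical, and this is not the genesis implementation, which may use a faster algorithm and offer other variants with tie corrections.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical standalone function (not genesis API).
// Kendall's tau-a: (concordant - discordant) / (n * (n - 1) / 2).
// Tied pairs count as neither concordant nor discordant.
inline double kendalls_tau_a(
    std::vector<double> const& x,
    std::vector<double> const& y
) {
    if (x.size() != y.size() || x.size() < 2) {
        throw std::invalid_argument("Need two equal-length ranges with n >= 2");
    }
    std::size_t const n = x.size();
    long long concordant = 0;
    long long discordant = 0;
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = i + 1; j < n; ++j) {
            // Compare the ordering of the pair in x and in y via signs.
            int const sx = (x[i] < x[j]) - (x[i] > x[j]);
            int const sy = (y[i] < y[j]) - (y[i] > y[j]);
            int const product = sx * sy;
            if (product > 0) {
                ++concordant;
            } else if (product < 0) {
                ++discordant;
            }
        }
    }
    double const pairs = static_cast<double>(n) * static_cast<double>(n - 1) / 2.0;
    return static_cast<double>(concordant - discordant) / pairs;
}
```

Identically ordered inputs give a value of 1, and exactly reversed inputs give -1.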
Bug fixes
- Fix end of iteration bug in Lambda Iterator
- Fix cmake clang htslib incompatibility
- Fix virtual override destructors
- Fix Matrix output stream operator for char types
- Fix htslib lib64 issue lczech/grenedalf#12
- Add regression interaction test for lczech/gappa#29
genesis v0.30.0
Notable Changes
- population
- Add improved Tajima D empirical pool size estimators
- Add cathedral plot functions with efficient algorithm
- Add support for sample name header row in Sync Reader
- Improve Fst Pool Calculator classes
- Allow multiallelic SNPs in pool VCF and Karlsson Fst
- Refine automatic sample naming for formats without names
- Refine sample filter and numerical filter functions
- Improve input order and chromosome length check functionality
- utils
- Add Matrix inplace transpose function
- Add advanced compensated summation algorithms
- Add begin and end callbacks to Lambda Iterator
- Refine text join functions
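As an example of what a compensated summation algorithm does, the sketch below shows Neumaier's variant of Kahan summation, which tracks the rounding error lost in each addition and adds it back at the end. The function name `neumaier_sum` is hypothetical, and whether this particular variant is among the algorithms added to genesis in this release is an assumption.

```cpp
#include <cmath>
#include <vector>

// Hypothetical standalone function (not genesis API):
// Neumaier's improved Kahan summation.
inline double neumaier_sum(std::vector<double> const& values)
{
    double sum = 0.0;
    double compensation = 0.0; // running total of lost low-order bits
    for (double const value : values) {
        double const tentative = sum + value;
        if (std::abs(sum) >= std::abs(value)) {
            // Low-order bits of value were lost in the addition.
            compensation += (sum - tentative) + value;
        } else {
            // Low-order bits of sum were lost in the addition.
            compensation += (value - tentative) + sum;
        }
        sum = tentative;
    }
    return sum + compensation;
}
```

For the input `{1.0, 1e100, 1.0, -1e100}`, plain left-to-right summation in double precision returns 0, while this function returns 2. Note that aggressive floating point optimizations such as `-ffast-math` can optimize the compensation away.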
Bug fixes
- Fix Matrix output operator for char types
- Fix missing return statements in Lambda Iterator and Base Window Iterator
- Fix htslib check in Variant Input Iterator test case
- Fix Dataframe and Matrix string to double conversions
- Fix generic convert function
- Fix thread collision for cache in Reference Genome class
- Fix backslash escape bug
genesis v0.29.0
Notable changes
- population
- Add Reference Genome based ref and alt handling
- Add Chromosome/Genome Iterator classes
- Add Window, WindowView, and Iterator abstractions and helpers
- Add diversity pool calculator, refactor diversity functions
- Refine diversity measures, for speed and robustness for large coverages
- Refactor FST pool functions into classes, for streaming
- Refactor Variant filters and transformations
- Add Kapun-style missing data entries to Sync Reader
- sequence
- Add Reference Genome class and functions
- Add Sequence Dict class and functions
- utils
- Refine binomial functions for larger values, increase speed
- Add visitor functions to Lambda Iterator
- Add ranged pop count function to Bitvector
- Move exceptions back to utils namespace
- build
- Export the cmake include targets so that genesis can be used as a subproject
Bug fixes
- Fix MRU Cache copy constructor
- Fix thread pool nested deadlocks and seg faults
- Fix date time sprintf function and gcc macro test
- Fix string split default argument overload
genesis v0.28.1
Notable changes
- population
- Add Sliding Entries Window Iterator
- Add user-provided column names to Frequency Table Reader
- utils
- Compute proper bounding boxes for SVG Path objects
- Compute proper SVG bounding boxes with transformations
- Add pie chart SVG helper function
- Add cache stats to MRU Cache
- build
- Update htslib version to fix autoconf issues
- Add LTO/IPO support with CMake build
- Add GitHub Actions CI
- Deactivate OpenMP on MacOS by default, too much trouble
- Change CMakeLists to use an object library to speed up compilation
Bug fixes
- Fix various minor compiler warnings found due to CI
- Fix clang issue with std::tm initialization
- Proper linking against OpenMP for tests and apps
- Remove deprecated std dependency
- Fix Base Window Iterator categories
genesis v0.28.0
Notable changes
- Add generic Frequency Table Input Iterator
- Add generic Genome Region Reader
- Add Genome Locus Set for fast position queries
- Add whole chromosome coverage functionality to Genome Region List
- Add Genome Region Window Iterator
- Add Map/Bim Reader
- Make Fst functions more lenient for small pool sizes
- Add global thread pool for eliminating core oversubscription
Bug fixes
- Fix memory leak in Base Window Iterator
- Fix sliding window iterator for empty input
genesis v0.27.0
Notable Changes
- Add SAM/BAM/CRAM Input Iterator, with RG read group splitting and filtering, and SAM flags filters
- Refactor Variant Input Iterators for ease of use
- Add Variant Input Iterator for Parallel Input
- Refactor Genome Region List to use Interval Tree, and add surrounding functionality
- Rename and refactor Kofler and Karlsson F_ST pool functions for clarity
- Add our unbiased F_ST estimators for pool sequencing data
- Refactor and refine diversity measure settings
- Refactor Window Iterator
- Non-virtual iterator interface
- Base class abstraction for SlidingWindowIterator
- Deprecate SlidingWindowGenerator, use SlidingWindowIterator instead
- Deprecate Vcf Window Generator function
- Add BED Reader
- Add Genome Region List reader for GFF
- Speed improvements and async block buffering for Lambda Iterator
- Refine CMake setup for htslib, improve autotools compatibility
genesis v0.26.1
Notable Changes
- This is mostly a version bump because the version.hpp file did not get updated properly with genesis v0.26.0 due to the new year, but also:
- Add pendant length filters for placements