Releases: pmelsted/bifrost
v1.3.5
New querying options
-Q, --files-as-queries
: It is now possible to use a file containing one or multiple sequences as one query.
-E, --min-nb-colors
: Requires that a query occurs in a user-provided minimum number of colors to report the query as present in the graph
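The effect of a minimum-color requirement such as -E can be sketched as follows. This is an illustration only, not Bifrost code: the function names, the dictionary of per-color k-mer sets, and the `min_ratio` rule for deciding that a query is present in one color are all assumptions, and reverse complements are ignored.

```python
def query_kmers(seq, k):
    """Return all k-mers of seq in order (duplicates kept)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def colors_with_query(query, kmers_by_color, k, min_ratio=1.0):
    """Return the sorted names of the colors in which the query counts as present.

    kmers_by_color maps a color name to the set of k-mers carrying that color
    (a hypothetical stand-in for a colored graph). A query is considered present
    in a color when at least min_ratio of its k-mers occur there (assumed rule).
    """
    qk = query_kmers(query, k)
    if not qk:
        return []
    hits = []
    for color, kmer_set in kmers_by_color.items():
        found = sum(1 for km in qk if km in kmer_set)
        if found / len(qk) >= min_ratio:
            hits.append(color)
    return sorted(hits)


def present_in_graph(query, kmers_by_color, k, min_nb_colors, min_ratio=1.0):
    """Report the query as present only if enough colors contain it."""
    hits = colors_with_query(query, kmers_by_color, k, min_ratio)
    return len(hits) >= min_nb_colors
```

With three colors of which two contain the query, `present_in_graph(..., min_nb_colors=2)` is true and `min_nb_colors=3` is false.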
Bug fixes
- Fix bug with incorrect read positions in input files causing suboptimal construction time
- Fix bug with the resizing of MinimizerIndex and KmerHashtable causing the program to hang for very small input data
Refactor
- CLI updated. Querying works a little differently from previous versions. Previously, the default behavior was to report the presence/absence of each query. The new behavior is to report the number of k-mers from each query shared with the graph or with its colors, unless parameter -e is used explicitly, in which case presence/absence is reported. Parameter -p makes the querying report the ratio of found k-mers (with respect to the number of k-mers in each query) rather than a number.
- README updated: new "Benchmarking" section provides guidelines on how to properly compare other tools to Bifrost, use cases updated, citation updated, installation description simplified, etc.
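The three reporting modes described above (count, ratio, presence/absence) can be sketched on a plain set of k-mers. This is a simplified illustration, not the Bifrost implementation: `report_query`, its `mode` argument and the `min_ratio` threshold used for the presence/absence decision are hypothetical, and canonical k-mers (reverse complements) are ignored.

```python
def kmers(seq, k):
    """Return all k-mers of seq in order (duplicates kept)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def report_query(query, graph_kmers, k, mode="count", min_ratio=None):
    """Summarise one query against a set of graph k-mers.

    mode="count":    number of query k-mers found in the graph
    mode="ratio":    found k-mers divided by the number of k-mers in the query
    mode="presence": True if the ratio reaches min_ratio (assumed decision rule)
    """
    qk = kmers(query, k)
    found = sum(1 for km in qk if km in graph_kmers)
    if mode == "count":
        return found
    ratio = found / len(qk) if qk else 0.0
    if mode == "ratio":
        return ratio
    if mode == "presence":
        if min_ratio is None:
            raise ValueError("presence mode needs min_ratio")
        return ratio >= min_ratio
    raise ValueError(f"unknown mode: {mode}")
```

For a query with three k-mers of which one is in the graph, the count mode returns 1 and the ratio mode returns 1/3.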
v1.3.1
Minor upgrade to 1.3.0:
- Fix compilation issues for MacOS and FreeBSD
- Fix issue with reporting the ratio of k-mers found in each query for colored dBGs
- Fix issue with the inexact k-mer search in colored dBGs
v1.3.0
Bifrost memory usage and running time have improved by up to 30% (depending on the input dataset):
- The minimizer index and k-mer hash table now use Robin Hood hashing instead of linear probing. They default to 95% maximum bucket occupancy instead of 80% in previous versions. Furthermore, hash table resizing now uses a 20% memory increase instead of doubling the size.
- Blocked Bloom Filter uses a default 24 bits per key to enable a much faster joining step.
- Extracting the exact unitigs from the Blocked Bloom Filter is done in 3 steps instead of 2: approximate unitigs are extracted to disk, read from disk and indexed in memory before annotating k-mers with their respective counts. Reader-writer locks can be avoided this way. However, Bifrost now uses a bit of disk space.
- Code refactoring
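For readers unfamiliar with the hash-table changes listed above, here is a minimal sketch of a Robin Hood hash set with a 95% maximum occupancy and a 20% growth step. It is a toy illustration, not Bifrost's MinimizerIndex or KmerHashtable: the class, its method names and the use of Python's built-in `hash` are assumptions, and deletion is omitted.

```python
_EMPTY = object()  # sentinel marking a free slot


class RobinHoodSet:
    """Open-addressing hash set using Robin Hood displacement (toy sketch)."""

    def __init__(self, capacity=8, max_load=0.95, growth=1.2):
        self.slots = [_EMPTY] * capacity
        self.size = 0
        self.max_load = max_load  # maximum occupancy before resizing
        self.growth = growth      # 1.2 = grow by 20% instead of doubling

    def __len__(self):
        return self.size

    def _home(self, key):
        return hash(key) % len(self.slots)

    def _distance(self, key, pos):
        """How far pos is from the key's home slot, wrapping around."""
        return (pos - self._home(key)) % len(self.slots)

    def __contains__(self, key):
        pos, dist = self._home(key), 0
        while self.slots[pos] is not _EMPTY:
            if self.slots[pos] == key:
                return True
            # A resident closer to its home than we are to ours proves absence.
            if self._distance(self.slots[pos], pos) < dist:
                return False
            pos, dist = (pos + 1) % len(self.slots), dist + 1
        return False

    def add(self, key):
        if key in self:
            return
        while self.size + 1 > self.max_load * len(self.slots):
            self._grow()
        self._place(key)
        self.size += 1

    def _place(self, key):
        pos, dist = self._home(key), 0
        while self.slots[pos] is not _EMPTY:
            resident_dist = self._distance(self.slots[pos], pos)
            if resident_dist < dist:
                # Take the slot from the "richer" resident and keep probing
                # on behalf of the displaced key.
                self.slots[pos], key = key, self.slots[pos]
                dist = resident_dist
            pos, dist = (pos + 1) % len(self.slots), dist + 1
        self.slots[pos] = key

    def _grow(self):
        old = [k for k in self.slots if k is not _EMPTY]
        capacity = len(self.slots)
        new_capacity = max(capacity + 1, int(capacity * self.growth))
        self.slots = [_EMPTY] * new_capacity
        for k in old:
            self._place(k)
```

Robin Hood displacement keeps probe distances short and even, which is what makes a high occupancy such as 95% workable. Growing by a small factor trades more frequent rehashing for a lower peak memory footprint.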
New options for querying
-p, --nb-found-km
: Output the number of found k-mers for each query (disable parameter -e)
-P, --ratio-found-km
: Output the ratio of found k-mers for each query (disable parameter -e)
- Fix rounding issue in #67
This version breaks Bifrost Index File (.bfi) compatibility with previous Bifrost versions, i.e. v1.3.0 cannot read .bfi files generated by v1.2.1 and older versions.
v1.2.1
Update kseq to latest version:
- Faster reading from FASTA/FASTQ
- Reads sequences larger than 2 GB
- Experiments show lower memory usage
v1.2.0
- GFA and FASTA output are now compressed by default. Compressed GFA/FASTA can be read by Bifrost.
- Optional Bifrost binary output (instead of GFA/FASTA). Uses less disk space than GFA/FASTA and loads faster in memory.
- Bifrost index output by default. Enables loading a Bifrost graph (GFA/FASTA or binary) into memory much faster.
- Time and memory reduction on some datasets.
- Minor bug fixes.
API and CLI compatibility with previous versions is broken, but only by minor changes (see Changelog). The minimum required GCC version is now 5.1 (previously 4.8).
v1.0.6.5
- Major bugfix for single-threaded construction
- Minor bugfixes
v1.0.6.4
Minor update to 1.0.6.3:
- Remove XXHash dependency from install which had been put back by mistake in the cmake files.
v1.0.6.3
Minor update to v1.0.6.2:
- Add MAX_GMER_SIZE compile variable to choose the maximum size of g-mers (minimizers). Works exactly the same as MAX_KMER_SIZE. By default, MAX_GMER_SIZE is equal to MAX_KMER_SIZE.
- Fix an issue with the update function where the k-mer size of the input graph was not used to make the output graph (default k-mer size was used instead).
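As background on what a g-mer is: a minimizer is commonly defined as the smallest length-g substring of a k-mer under some fixed ordering. The sketch below uses plain lexicographic order purely for illustration; the ordering Bifrost uses is not stated in these notes, and the function name is hypothetical.

```python
def minimizer(kmer, g):
    """Return the lexicographically smallest g-mer contained in kmer.

    Lexicographic order is an assumption made for this sketch; real tools
    often order g-mers by a hash value instead.
    """
    if not 0 < g <= len(kmer):
        raise ValueError("g must satisfy 0 < g <= len(kmer)")
    return min(kmer[i:i + g] for i in range(len(kmer) - g + 1))
```

Since a g-mer is a substring of a k-mer, g can never exceed k.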
v1.0.6.2
Overall, Bifrost is now much faster and a lot more memory efficient compared to v1.0.5.
- Improvement on the rolling hash function RepHash (far fewer collisions for small values).
- Blocked Bloom Filter cannot use over-loaded blocks anymore. When using the 2-blocks hashing method, if the 2 blocks are at >65% capacity, the minimizer is rehashed to find 2 new blocks which are hopefully not over-loaded. After 8 iterations of 2-overloaded blocks, the k-mer hash is stored in an unsorted_set.
- The AVX2 version of the Blocked Bloom Filter is disabled at the moment. It no longer brings any performance gain over the non-AVX2 version and needs some rework.
- XXHash replaced with Wyhash
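The overload handling described above for the Blocked Bloom Filter can be sketched roughly as follows. This is not Bifrost's implementation. The block size, the number of bits set per key, the BLAKE2b-based hashing, the reading of "capacity" as the fraction of set bits, and the query strategy of checking every candidate block are all assumptions. The sketch also stores the key itself in the overflow set, whereas the notes say the k-mer hash is stored.

```python
import hashlib

BLOCK_BITS = 512     # bits per block (assumed)
HASHES_PER_KEY = 4   # bits set per key inside its block (assumed)
MAX_LOAD = 0.65      # a block above this fraction of set bits is over-loaded
MAX_ATTEMPTS = 8     # rehash attempts before falling back to the exact set


def _h(data, seed):
    """Deterministic 64-bit hash of a string for a given integer seed."""
    salt = seed.to_bytes(8, "little")
    digest = hashlib.blake2b(data.encode(), digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "little")


class BlockedBloomFilter:
    def __init__(self, n_blocks):
        self.blocks = [0] * n_blocks  # each block is an integer used as a bitmap
        self.overflow = set()         # exact fallback for keys that found no room

    def _mask(self, key):
        """Bit pattern the key sets inside whichever block it lands in."""
        mask = 0
        for i in range(HASHES_PER_KEY):
            mask |= 1 << (_h(key, 1000 + i) % BLOCK_BITS)
        return mask

    def _candidates(self, minimizer, attempt):
        """The two candidate blocks for a minimizer at a given rehash attempt."""
        n = len(self.blocks)
        return (_h(minimizer, 2 * attempt) % n,
                _h(minimizer, 2 * attempt + 1) % n)

    def _load(self, b):
        return bin(self.blocks[b]).count("1") / BLOCK_BITS

    def add(self, key, minimizer):
        mask = self._mask(key)
        for attempt in range(MAX_ATTEMPTS):
            a, b = self._candidates(minimizer, attempt)
            best = a if self._load(a) <= self._load(b) else b
            if self._load(best) <= MAX_LOAD:
                self.blocks[best] |= mask
                return
            # Both blocks are over-loaded: rehash the minimizer and retry.
        self.overflow.add(key)

    def might_contain(self, key, minimizer):
        """False means definitely absent; True means possibly present."""
        if key in self.overflow:
            return True
        mask = self._mask(key)
        for attempt in range(MAX_ATTEMPTS):
            for b in self._candidates(minimizer, attempt):
                if self.blocks[b] & mask == mask:
                    return True
        return False
```

Checking every candidate block at query time keeps the sketch free of false negatives at the cost of a higher false-positive rate; a real implementation would likely be more selective.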
v1.0.5
Fix major problem with unitig deletion in colored dBGs