Releases: muellan/metacache
MetaCache 2.2.0
- Fixed the NCBI genome download script (the ftp path can be empty for some genomes).
- Changed the default data type for storing reference sequence ids from 16 to 32 bits so that all complete bacterial, viral and archaeal genomes of the latest NCBI RefSeq releases fit.
- Fixed the error message in the build process that is supposed to report when the number of sequences exceeds the supported maximum.
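To illustrate why the id width matters, here is a minimal sketch of the capacity check implied by the two items above. It is not MetaCache's code: the function names are hypothetical, and it assumes that every value of the unsigned id type is usable (an implementation that reserves sentinel values would support slightly fewer sequences).

```python
def max_ids(id_bits: int) -> int:
    """Number of distinct values an unsigned integer with `id_bits` bits can hold."""
    return 2 ** id_bits


def check_sequence_count(num_sequences: int, id_bits: int) -> None:
    """Raise an error if `num_sequences` cannot be numbered with `id_bits`-bit ids.

    Hypothetical helper; assumes no id values are reserved.
    """
    limit = max_ids(id_bits)
    if num_sequences > limit:
        raise OverflowError(
            f"{num_sequences} sequences exceed the {limit} ids "
            f"representable with {id_bits} bits"
        )


if __name__ == "__main__":
    print(max_ids(16))  # capacity with 16-bit ids
    print(max_ids(32))  # capacity with 32-bit ids
```

With 16-bit ids the limit is 65,536 sequences; 32-bit ids raise it to 4,294,967,296.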
MetaCache 2.1.1
- More refactoring and code cleanup
- Bug fixes
MetaCache 2.1.0
Improvements
- Enabled longer reads for the GPU version (previous limit was 127 bp)
- Significantly reduced number of temporary allocations on host
- Some more refactoring and code cleanup
Fixes
- Skip empty lines in input files
- Fixed conversion warning
MetaCache 2.0.1
- Fixed wrong load factor in query mode resulting in high load times (#22)
MetaCache 2.0.0
New: MetaCache-GPU
- building and querying on CUDA-capable accelerators
- multi-GPU support for distributing and/or replicating databases across multiple GPUs
- ultra-fast database building: ~10 seconds for the complete RefSeq 202 (bacteria+viruses+archaea) on 4 NVIDIA(R) Tesla(R) V100 GPUs
- faster querying: ~300 million reads per minute for a RefSeq 202 database on 4 NVIDIA(R) Tesla(R) V100 GPUs
- GPU-built databases can be used with the CPU version and vice versa (for use on GPUs, a database needs to be partitioned so that each part fits in GPU memory)
- MetaCache-GPU will be presented at ICPP '21
Further Improvements
- support for reading gzipped FASTA/FASTQ files (requires zlib to be installed, see installation instructions)
- mode "build+query" for building and directly querying a database (great for the GPU version)
- improved CPU querying speed
- new database format (incompatible with previous versions of MetaCache)
- improved database reading/writing performance
- more useful progress indicators
MetaCache v1.1.1
- fixed a critical bug that led to increased memory consumption during database build
- control characters like \t are now properly handled in the -separator option
- fixed g++5 compilation bug
- fixed some smaller bugs
- improved the documentation explaining the output analysis and display options
- more consistent progress bar display across different operations
- some code reorganization
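As an illustration of the -separator fix listed above: a shell usually passes `\t` to a program as the two characters backslash and `t`, so the program has to translate the escape sequence itself. The following is a minimal sketch of such a translation, not MetaCache's implementation; the function name and the set of supported escapes are assumptions.

```python
# Escape sequences recognized by this sketch (an assumed, minimal set).
_ESCAPES = {"t": "\t", "n": "\n", "r": "\r", "\\": "\\"}


def unescape_separator(raw: str) -> str:
    """Translate backslash escapes in a separator string into control characters.

    Unknown escapes such as a backslash followed by 'x' are kept unchanged.
    """
    out = []
    i = 0
    while i < len(raw):
        ch = raw[i]
        if ch == "\\" and i + 1 < len(raw) and raw[i + 1] in _ESCAPES:
            out.append(_ESCAPES[raw[i + 1]])
            i += 2  # consume the backslash and the escape letter
        else:
            out.append(ch)
            i += 1
    return "".join(out)


if __name__ == "__main__":
    # Join three output columns with a real tab character.
    sep = unescape_separator("\\t")
    print(sep.join(["read_1", "species", "Escherichia coli"]))
```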
MetaCache v1.0.0
- rewritten command line interface with proper diagnostics; there shouldn’t be any breaking changes (regarding option names, etc.)
- updated and improved documentation
- removed the rarely used and brittle "annotate" mode, which was outside the scope of MetaCache anyway
- small performance improvements
- some minor bug fixes
MetaCache v0.9.0
- improved query speed, especially better thread scaling beyond 32 threads (ca. 50% faster with 88 threads)
- improved database loading speed
- database loading indicator
- small bug fixes
- improved code structure
Attention:
You need to rebuild your databases for this version, because it uses a new database binary format. Disk and RAM consumption are not affected.
MetaCache v0.8.0
New feature "Coverage Filter"
Option -cov-percentile <p> removes the p-th percentile of hit targets (reference genomes) with the lowest coverage. A first pass does the normal mapping of queries (reads) to targets (reference genomes). The actual classification is then done in a second pass using only the remaining hit targets.
This will lead to a very small increase in runtime and memory consumption but can improve accuracy by detecting and removing stray false positive hits.
The coverage filter is deactivated by default.
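The two-pass idea can be sketched as follows. This is only an illustration under stated assumptions, not MetaCache's implementation: the function names, the per-target coverage score, the rounding of the cutoff (floor of p percent of the hit targets) and the "most hits wins" classification rule are all hypothetical stand-ins.

```python
from __future__ import annotations

import math


def coverage_filter(coverage: dict[str, float], p: float) -> set[str]:
    """Return the targets that remain after dropping the lowest-coverage p percent.

    `coverage` maps each hit target to a coverage score from the first pass.
    """
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    # Rank targets from lowest to highest coverage (name breaks ties).
    ranked = sorted(coverage, key=lambda t: (coverage[t], t))
    n_drop = math.floor(len(ranked) * p / 100)
    return set(ranked[n_drop:])


def classify(read_hits: dict[str, dict[str, int]], kept: set[str]) -> dict[str, str | None]:
    """Second pass: assign each read to its best-hit target among the kept ones."""
    result: dict[str, str | None] = {}
    for read, hits in read_hits.items():
        remaining = {t: h for t, h in hits.items() if t in kept}
        if remaining:
            result[read] = max(remaining, key=lambda t: (remaining[t], t))
        else:
            result[read] = None  # no surviving target: read stays unclassified
    return result


if __name__ == "__main__":
    coverage = {"A": 0.9, "B": 0.5, "C": 0.01, "D": 0.7}
    read_hits = {"r1": {"A": 5, "C": 7}, "r2": {"C": 3}}
    kept = coverage_filter(coverage, 25)  # drops the lowest quarter of targets
    print(classify(read_hits, kept))
```

In this toy input, target C has many hits from read r1 but almost no coverage overall, so the filter removes it and r1 is assigned to A instead.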
Other Changes
- improved multi-threading in query mode
- improved database format (layout better suited for future loading on GPUs)
- code cleanup
MetaCache v0.6.2
- improved accession number / sequence id parsing
- file reading improvements
- code cleanup