v1.3.0
Summary
- A new graph object is introduced: GraphUnitigs, optimized to traverse unitigs but not to query individual kmers.
- A few graph API functions changed.
- Updated MPHF and HDF5.
- This release now requires a C++11-compatible compiler.
Details
Tech notice
- Compiling the GATB-Core library now requires a C++11-capable compiler.
- CMake 3.1.0 is the minimum version of CMake required to compile GATB-Core.
- The HDF5 library (used for data storage) has been upgraded to the latest release, 1.8.18.
- The parameters "-mphf none", "-mphf emphf" and "-mphf boophf" and the variable WITH_MPHF are deprecated. Please remove them from your applications (e.g. in Graph::create()). BooPHF is now the default MPHF object and is always compiled. Emphf has been removed from the library.
- Debug compilation is now done using the standard CMake rule "-DCMAKE_BUILD_TYPE=Debug" instead of "-Ddebug=1".
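For applications that assemble their option string at run time, the deprecated "-mphf ..." parameter can be filtered out before the string reaches the library. The following is only a minimal sketch in standard C++: stripMphfOption is a hypothetical helper, not part of GATB-Core, and it assumes options are separated by whitespace and that "-mphf" is always followed by a single value.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of GATB-Core): removes every
// "-mphf <value>" pair from a whitespace-separated option string.
std::string stripMphfOption(const std::string& options)
{
    std::istringstream in(options);
    std::vector<std::string> kept;
    std::string token;

    while (in >> token)
    {
        if (token == "-mphf")
        {
            // Drop the flag together with the value that follows it, if any.
            std::string value;
            in >> value;
            continue;
        }
        kept.push_back(token);
    }

    // Re-join the remaining tokens with single spaces.
    std::string result;
    for (std::size_t i = 0; i < kept.size(); ++i)
    {
        if (i > 0) result += ' ';
        result += kept[i];
    }
    return result;
}
```

The cleaned string can then be handed to Graph::create() as before. Where the option string is a literal in your source, simply deleting the "-mphf ..." part by hand is the simpler fix.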
API changes
- Developers, please pay attention to these breaking changes:
  - Graph::Vector is now GraphVector
  - Graph::Iterator is now GraphIterator
  - Graph::create() no longer accepts '-mphf ...' (see Tech notice, above)
New features
- New GraphUnitigs class that offers a de Bruijn graph representation based on unitigs (created by BCALM2) loaded in memory. It has the same API as the Graph class, although some functions are not implemented, because accessing a node that is not an extremity of a unitig is not supported in this representation. The representation is designed to traverse unitigs quickly, skipping over all non-branching nodes. It uses neither the Bloom filter nor the MPHF. To use this representation, have a look at Minia's code: https://github.com/GATB/minia/blob/ee00a34f1a49a1fcdd757e0bdaf7d03190896322/src/Minia.cpp#L116
- New functions to traverse the graph have been added; see simplePath* in Graph.hpp. These functions are mostly designed to take advantage of GraphUnitigs, and they have the same API in Graph too. They will also replace the Traversal class. Partial compatibility with the original Graph class has been implemented so far.
- BooPHF is now the default MPHF object used by GATB-Core.
- In addition to HDF5, we introduce experimental support for a raw file format. It was made for two reasons: to avoid potential memory leaks due to HDF5 (unclear at this point), and to avoid HDF5 file corruption (when a job is interrupted after kmer counting, the h5 file containing the kmer counts sometimes cannot be re-opened). The format is experimental, so use it at your own risk. Its content is basically the same as in the previous HDF5 format, but with each dataset stored in its own file. Also, JSON is used instead of XML for structured configuration. To enable this format, pass "-storage-type file" in your configuration string (e.g. in Graph::create()).
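To opt in to the experimental raw file storage from code that builds its configuration string programmatically, the flag can be appended in one place. This is a rough sketch in standard C++ only: withFileStorage is a hypothetical helper, not a GATB-Core function, and it assumes the configuration is a plain whitespace-separated string.

```cpp
#include <string>

// Hypothetical helper (not part of GATB-Core): appends
// "-storage-type file" to a configuration string, leaving any
// existing "-storage-type" choice untouched.
std::string withFileStorage(const std::string& options)
{
    // Respect a storage type that the caller has already selected.
    if (options.find("-storage-type") != std::string::npos)
    {
        return options;
    }

    // Avoid a leading space when the configuration is empty.
    if (options.empty())
    {
        return "-storage-type file";
    }

    return options + " -storage-type file";
}
```

The resulting string is what you would pass on to Graph::create(). Because the format is experimental, keeping the switch in a single helper like this makes it easy to revert to HDF5.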