This is a Redis module for the t-digest data structure, which can be used for accurate online accumulation of rank-based statistics such as quantiles and the cumulative distribution at a point. The implementation is based on the Merging Digest implementation by the t-digest author.
Before going ahead, make sure that the Redis server you're using has support for Redis modules.
First, you'll have to build the Redis t-digest module from source.
make
This should generate a shared library called tdigest.so in the root folder. You can now load it into Redis by using the following redis.conf configuration directive:
loadmodule /path/to/tdigest.so
Alternatively, you can load it on an already running Redis server by issuing the following command:
MODULE LOAD /path/to/tdigest.so
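If you would rather load the module from a client program, the same MODULE LOAD command can be sent over an ordinary connection. The sketch below assumes a client object that exposes an execute_command(*args) method, as redis-py's redis.Redis does. The helper name load_tdigest_module and the RecordingClient stub are hypothetical and exist only so the example runs without a server.

```python
class RecordingClient:
    """Hypothetical stand-in for a Redis client; it only records commands."""

    def __init__(self):
        self.sent = []

    def execute_command(self, *args):
        # A real client would send the command to the server here.
        self.sent.append(args)
        return "OK"


def load_tdigest_module(client, module_path):
    """Send MODULE LOAD with the path to the built shared library."""
    return client.execute_command("MODULE", "LOAD", module_path)


client = RecordingClient()
reply = load_tdigest_module(client, "/path/to/tdigest.so")
print(reply, client.sent)
```

With redis-py installed and a server running, passing a redis.Redis() instance in place of the stub should send the real command.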
Initializes a key to an empty t-digest structure with the compression provided or with the default compression of 400.
Reply: "OK"
Adds a value with the specified count. If key is missing, an empty t-digest structure is initialized with a default compression of 400. Returns the sum of counts for all values added.
Reply: long long
Merges one or more sourcekey into destkey. If destkey is missing, an empty t-digest structure is initialized with a default compression of 400.
Reply: "OK"
Returns the cumulative distribution for all provided values. Each value must be a double. Every cumulative distribution returned lies between 0 and 1.
Reply: array of doubles, or nil if key is missing
Returns the estimated values at all provided quantiles. Each quantile must be a double between 0 and 1.
Reply: array of doubles, or nil if key is missing
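To make the two rank-based queries concrete, here is a minimal sketch that computes them exactly on a small sorted sample. It is not the module's algorithm: a t-digest answers the same questions approximately from a compressed set of centroids, and its interpolation may give slightly different numbers. The function names exact_cdf and exact_quantile are hypothetical.

```python
from bisect import bisect_right
import math


def exact_cdf(sorted_values, x):
    """Fraction of samples that are less than or equal to x."""
    return bisect_right(sorted_values, x) / len(sorted_values)


def exact_quantile(sorted_values, q):
    """Smallest sample whose cumulative distribution is at least q."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("quantile must be between 0 and 1")
    index = max(math.ceil(q * len(sorted_values)) - 1, 0)
    return sorted_values[index]


samples = sorted([1.0, 2.0, 2.0, 3.0, 5.0, 8.0, 13.0, 21.0])

print(exact_cdf(samples, 2.0))       # 3 of 8 samples are <= 2.0
print(exact_quantile(samples, 0.5))  # 4th smallest sample
```

The two functions are inverses in the sense that exact_cdf(samples, exact_quantile(samples, q)) is never smaller than q.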
Prints debug information about the t-digest.
Reply: bulk strings array
The reply is of the form:
1) TDIGEST (<compression>, <num_centroids>, <memory size>)
2) CENTROID (<mean>, <weight>)
3) CENTROID (<mean>, <weight>)
4) CENTROID (<mean>, <weight>)
5) ...
Centroids are printed in sorted order with respect to their mean.
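A client that wants to inspect the centroids programmatically could parse this reply along the following lines. This is a sketch that assumes the fields are formatted as plain numbers separated by a comma and a space, which the form above suggests but does not guarantee. The sample reply is made up, the memory size is kept as a string because its unit is not specified here, and parse_debug_reply is a hypothetical helper name.

```python
import re

HEADER = re.compile(r"^TDIGEST \(([^,]+), ([^,]+), ([^,]+)\)$")
CENTROID = re.compile(r"^CENTROID \(([^,]+), ([^,]+)\)$")


def parse_debug_reply(lines):
    """Turn the debug reply into a dict with a list of (mean, weight) pairs."""
    # Some clients return bulk strings as bytes; decode them first.
    lines = [l.decode() if isinstance(l, bytes) else l for l in lines]
    header = HEADER.match(lines[0])
    if header is None:
        raise ValueError("unexpected header line: %r" % lines[0])
    compression, num_centroids, memory_size = header.groups()
    centroids = []
    for line in lines[1:]:
        match = CENTROID.match(line)
        if match is None:
            raise ValueError("unexpected centroid line: %r" % line)
        centroids.append((float(match.group(1)), float(match.group(2))))
    return {
        "compression": float(compression),
        "num_centroids": int(num_centroids),
        "memory_size": memory_size,
        "centroids": centroids,
    }


# Made-up reply, for illustration only.
sample_reply = [
    "TDIGEST (400, 3, 1024)",
    b"CENTROID (1.5, 2)",
    "CENTROID (4.0, 1)",
    "CENTROID (9.25, 3)",
]
parsed = parse_debug_reply(sample_reply)
print(parsed["centroids"])
```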
The integration tests require a running Redis server, so you must have redis-server on your PATH or pass its location in an environment variable called REDIS_SERVER. Tests are written in Python and use the pytest unit testing library.
make test
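The lookup described above can be sketched as follows. This is not taken from the test suite: the helper name find_redis_server is hypothetical, and giving REDIS_SERVER precedence over PATH is an assumption about how the two options interact.

```python
import os
import shutil


def find_redis_server(environ=None):
    """Return the redis-server location, or None if it cannot be found."""
    environ = os.environ if environ is None else environ
    # Assumption: an explicit REDIS_SERVER setting wins over the PATH search.
    override = environ.get("REDIS_SERVER")
    if override:
        return override
    return shutil.which("redis-server", path=environ.get("PATH"))


located = find_redis_server({"REDIS_SERVER": "/opt/redis/bin/redis-server"})
print(located)
```

Running it before make test is a quick way to check which binary would be picked up under these assumptions.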
Bug reports, feature requests, and pull requests are welcome! Please add tests for any non-trivial changes you submit.