The following constants (Macros) are defined as part of the protocol as they name frequently reccuring values:
%% dproto.hrl
%% number of bits used to encode the bucket size.
%% => buckets can be 255 byte at most!
-define(BUCKET_SS, 8).
%% The number of bits used to encode the size of the a single metric part.
%% => each part can be 255 byte at max!
-define(METRIC_ELEMENT_SS, 8).
%% The number of bits used to encode the size of a whole metric (All of its
%% parts)
%% => the maximum number of bytes in a whole metric can be 65,536, so a single
%% metric can hold at least 255 elments, more if their size is < 256 byte.
-define(METRIC_SS, 16).
%% The number of bits used to encode a list of metrics.
%% => this means a list operation can return at least 281,474,976,710,656
%% metrics.
%%
%% That is a lot! Good problem to have if we ever face it!
-define(METRICS_SS, 64).
%% The number of bits used to encode the length of th payload data.
%% => this means we can encode 4,294,967,296 byte or 536,870,912 points
%% at 8 byte / point in a single request.
%%
%% we should never do that!
-define(DATA_SS, 32).
%% Tye number of bits used for encoding the time.
-define(TIME_SIZE, 64).
%% The number of bits used for encoding the count.
-define(COUNT_SIZE, 32).
%% Number of bits used to encode the delay as part of the streaming protocol.
-define(DELAY_SIZE, 8).
%% The type used to encode sizes.
-define(SIZE_TYPE, unsigned-integer).
%% The type used to encode time.
-define(TIME_TYPE, unsigned-integer).
%% mmath.hrl
-define(BITS, 56).
-define(INT_TYPE, signed-integer).
-define(NONE, 0).
-define(INT, 1).
-define(TYPE_SIZE, 8).
-define(DATA_SIZE, ((?BITS + ?TYPE_SIZE) div 8)).
On both read and write datapoints are encoded as follows:
<<?INT:?TYPE_SIZE, Value:?BITS/?INT_TYPE>>.
<<?NONE:?TYPE_SIZE, 0:?BITS/?INT_TYPE>>.
Note
Not every language handles 56 bit integers as well as erlang does, however 32 bit integers can be used when padded from the left with three bytes (24 bit) of zeros (0
) for positive values, or minus one (-1
/ 255
) bytes for negative integers. An alternative is using 64 bit integers and discarding the left most byte.
<<10:56/signed-integer>> = <<0:24/signed-integer, 10:32/signed-integer>>.
<<0,0,0,0,0,0,10>>
<<-10:56/signed-integer>> = <<-1:24/signed-integer, -10:32/signed-integer>>.
<<"ÿÿÿÿÿÿö">>
<<-1,-1,-1,-1,-1,-1,-10>>
Metric names are not simple strings but a length prefixed list of elements. The upside of this is that there are no reserved characters (such as .
) and it allows faster parsing and matching against them.
An example would be:
<<2, "my", 3, "key">>.
<<3, "yet", 7, "another", 3, "one">>.
The TCP endpoint can only accept incoming data when switched to stream mode. This way a connection is dedicated to send data to a single bucket. Flushing can be handled either manually or automatically. Automatic flushing sets a maximum delta between the first data cached for the connection and the newest arrived bit of information.
It is possible to switch the TCP connection, this stream allows to specify a bucket for the stream and by that prevent it to be resent with every metric. Also, it makes it possible for the connection cache to have a specified maximal duration between the first and the last metric received before the data is flushed.
This will switch the TCP connection to stream mode. From then on, only payload and flush messages are accepted.
Warning
Once initialized there is no more 4 byte prefix! This allows for a more efficient way of streaming data since even partially arived packages can be handled in a way.
% Identifies entering stream mode.
<<4,
% We will flush when the delay is greater or equal Delay
Delay:?DELAY_SIZE/?SIZE_TYPE,
% All metrics on this stream will be stored in this bucket.
BucketSize:?BUCKET_SS/?SIZE_TYPE, Bucket/binary
>>.
In addition, it is possible to specify a resolution for a bucket when switching a connection to stream mode:
% Identifies entering stream mode.
<<4,
% We will flush when the delay is greater or equal Delay
Delay:?DELAY_SIZE/?SIZE_TYPE,
% DataPoints are measured every N number of seconds
Resolution:?TIME_SIZE/?TIME_TYPE,
% All metrics on this stream will be stored in this bucket.
BucketSize:?BUCKET_SS/?SIZE_TYPE, Bucket/binary
>>.
If a resolution is not supplied, the default value of 1s (1000) will be used.
Warning
Bucket resolutions cannot change once a bucket has been created. Attempting to set a different resolution will result in an error.
The metric packages automatically flush the connection cache when (Time - min(All Times)) > MaxDelay
.
The data can hold one or more metric values and it is possible to include 'unset'.
<<5, %% Identifies this as a metric package
Time:?TIME_SIZE/?SIZE_TYPE, %% The time offset
_MetricSize:?METRIC_SS/?SIZE_TYPE, %% Length of the metric name in bytes.
Metric:_MetricSize/binary, %% The metric.
_DataSize:?DATA_SS/?SIZE_TYPE, %% Length of the data in bytes.
Data:_DataSize/binary %% One or more metric points
>>.
It is possible to control the flush time outside of the timing by forcing a flush as part of the stream. To do that the flush
message can be used.
<<6>>. % Indicates that at this point the connection cache should be flushed.
It is possible to batch multiple inserts that are targeted at the same time, this allows to save some extra bandwith when transmitting data. The batching is only available in stream mode.
The batch is initialized with the following message:
<<10, %% Command code for batch start
Time:?TIME_SIZE/?SIZE_TYPE, %% The time offset
>>.
This can be followed by as many batch packages are desired, each package include one metric name and a single datapoint:
<<_MetricSize:?METRIC_SS/?SIZE_TYPE, %% Length of the metric name in bytes.
Metric:_MetricSize/binary, %% The metric.
Point:8/binary %% One or more metric points
>>.
When no more datapoints are desired for this batch, the batch can be terminated by sending a 2 0 byte (which would not be a valid payload package since the MetricSize must be at least 1).
<<0:?METRIC_SS/?SIZE_TYPE>>. %% This would not be a valid payload package.
This command list all buckets known to the system. The command is received and a reply send directly.
<<3>>.
The reply is prefixed with the total size of the whole reply in bytes (not including the size prefix itself). Then each bucket is prefixed by a size of the bucket name.
%% Outer wrapper
<<ReplySize:?BUCKETS_SS/?SIZE_TYPE, Reply:ReplySize/binary>>.
%% Elements of the reply
<<BucketSize:?BUCKET_SS/?SIZE_TYPE, Bucket:BucketSize/binary>>.
Lists all metrics in a bucket. The bucket to look for is prefixed by 1 byte size for the bucket name.
<<1,
%% The size and the bucket binary to read the metric list from
BucketSize:?BUCKET_SS/?SIZE_TYPE,
Bucket:BucketSize/binary
>>.
The reply is prefixed with the total size of the whole reply in bytes (not including the size prefix itself). Then each metric is prefixed by a size of the metric name.
%% Outer wrapper
<<ReplySize:32/integer, Reply:ReplySize/binary>>.
%% Elements of the reply
<<MetricSize:16/integer, Metric:MetricSize/binary>>.
Retrieves data for a metric, bucket and metric are size prefixed as strings, Time and count are unsigned integers.
<<2,
%% The Size of the bucket binary and the bucket itself
BucketSize:?BUCKET_SS/?SIZE_TYPE,
Bucket:BucketSize/binary,
%% The Size of the metric binary and the bucket itself
MetricSize:?BUCKET_SS/?SIZE_TYPE,
Metric:MetricSize/binary,
%% The start time to read from (given in bucket resolution)
Time:?TIME_SIZE/?SIZE_TYPE,
%% The number of points to read.
Count:?COUNT_SIZE/?SIZE_TYPE
>>.
The number of messages that will be returned will always be equal to Count
. If there is no data or insufficient data or the bucket/metric doesn't exist the missing data will be filled with blanks.
<<Reply:(?DATA_SIZE*Count)/binary>>.
where each of the elements looks like one of these:
<<?INT:?TYPE_SIZE, Value:?BITS/?INT_TYPE>>.
<<?NONE:?TYPE_SIZE, 0:?BITS/?INT_TYPE>>.
Gets information applicable to the given bucket, namely the resolution, expiry time and the points per file.
<<7,
%% The Size of the bucket binary and the bucket itself
BucketSize:?BUCKET_SS/?SIZE_TYPE, Bucket:BucketSize/binary
>>.
The reply will return the resolution and the points per file of the bucket.
<<
Resolution:?TIME_SIZE/?TIME_TYPE, %% The resolution of the bucket
PPF:?TIME_SIZE/?TIME_TYPE %% The points per file of the bucket
TTL:?TIME_SIZE/?TIME_TYPE %% The time a bucket will retain data, 0 indicates data is retained indefinitely
>>.
Adding a bucket can be achieved by the following call.
Warning
Not yet implemented.
<<8,
%% The Size of the bucket binary and the bucket itself
BucketSize:?BUCKET_SS/?SIZE_TYPE, Bucket/binary,
%% The resolution of data in this bucket.
Resolution:?TIME_SIZE/?TIME_TYPE,
%% The points per file in this bucket
PPF:?TIME_SIZE/?TIME_TYPE,
%% The time a bucket will retain data, 0 indicates indefinite
TTL:?TIME_SIZE/?TIME_TYPE
>>.
Deletes a bucket from the system.
Warning
Not yet implemented.
<<9,
%% The Size of the bucket binary and the bucket itself
BucketSize:?BUCKET_SS/?SIZE_TYPE, Bucket:BucketSize/binary
>>.