Distbench tests are specified as protocol buffer messages, which are defined in traffic_config.proto.
This document describes the available options. The format is still evolving and subject to change.
For examples of tests, a good starting point is the workloads already included with Distbench (see workloads/README.md).
For pointers to other Distbench Documents, see the Distbench README.
A test is defined by a DistributedSystemDescription message with the following
entities, each of which is described in more detail later in this document.
- services: describes the services and the number of instances to create. Each instance will be placed on a Distbench node_manager.
- node_service_bundles: a map of services to bundle; each bundle will share a Distbench node_manager.
- action_lists: defines a list of actions to execute.
- actions: defines an action to execute, such as running an RPC or calling another action_lists repetitively.
- rpc_descriptions: describes an RPC to perform, including the type of payload and fanout involved.
- payload_descriptions: defines a payload that can be associated with an RPC.
- attributes:
  - test_timeout: maximum time to run the test, in seconds.
- distribution_config: describes a distribution for random RPC payload_size, request_size, or response_size.
Note: by convention, repeated fields in the proto are described by plural
names. So a services
block describes a single service, but there may be
multiple of them.
- name (string): name of the service.
- count (int32): number of instances to start (each instance will occupy a node_manager unless it is bundled with other services).
- protocol_driver_options_name: name of the ProtocolDriverOptions to use for the service.
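As an illustration of the fields above, a minimal sketch of two services blocks in protobuf text format might look as follows. The service names and the ProtocolDriverOptions name are hypothetical placeholders, not values taken from an existing workload.

```textproto
# Two client instances; each occupies its own node_manager unless bundled.
services {
  name: "client"   # hypothetical service name
  count: 2
}
# One server instance using a named set of protocol driver options.
services {
  name: "server"   # hypothetical service name
  count: 1
  protocol_driver_options_name: "my_grpc_options"  # hypothetical name
}
```

Because repeated fields use plural names, each services block above describes exactly one service.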
- services (string, repeated): list of the services (by name) to bundle together.
- name (string): name of the ActionList. If the name matches a service, the action list will be automatically executed by the service.
- action_names (string, repeated): defines the list of actions to run.

Note: the actions specified are run in no specific order, unless dependencies
are specified in the Action itself.
- name (string): name of the action.
- dependencies (string, repeated): defines a dependency on another action. This action will wait until the specified action is complete.
- iterations (Iteration): optionally defines iterations (see next section).
- action: defines the action to execute, as one of the following:
  - rpc_name: run the RPC (defined in an rpc_descriptions).
  - action_lists: run another ActionList (defined in an action_lists).
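The relationship between action lists, actions, and dependencies can be sketched as below. This is only an illustrative fragment: the names "client", "warm_up", "run_queries", and "echo" are hypothetical, and the RPC named "echo" is assumed to be defined in an rpc_descriptions block elsewhere in the test.

```textproto
# The ActionList name matches a service name ("client"), so that service
# executes it automatically.
action_lists {
  name: "client"
  action_names: "warm_up"
  action_names: "run_queries"
}
# Runs the RPC named "echo" once.
actions {
  name: "warm_up"
  rpc_name: "echo"
}
# Waits for "warm_up" to complete before running, because of the dependency.
actions {
  name: "run_queries"
  dependencies: "warm_up"
  rpc_name: "echo"
}
```

Without the dependencies entry, the two actions would be run in no specific order.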
Iterate on an action (performs repetition of the action).
- max_iteration_count (int32): maximum number of iterations to perform.
- max_duration_us (int32): maximum duration in microseconds.
- max_parallel_iterations (int64, default=1): the number of iterations to perform in parallel (at the same time).
- open_loop_interval_ns (int64): interval, in nanoseconds, for open loop iterations.
- open_loop_interval_distribution (string, default=constant):
  - sync_burst: all the instances will try to perform the action at the same time.
  - constant: run at a constant interval.
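A sketch of an action that uses iterations is shown below. The action and RPC names are hypothetical, and the numeric values are arbitrary choices for illustration rather than recommended settings.

```textproto
actions {
  name: "run_queries"   # hypothetical action name
  rpc_name: "echo"      # hypothetical RPC, defined in an rpc_descriptions
  iterations {
    # Stop after 1000 iterations or 30 seconds.
    max_iteration_count: 1000
    max_duration_us: 30000000
    # Open loop: start a new iteration every millisecond.
    open_loop_interval_ns: 1000000
    open_loop_interval_distribution: "sync_burst"
  }
}
```

For closed-loop behavior, one would presumably leave out the open_loop_* fields and use max_parallel_iterations to control how many iterations are in flight at once.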
- name (string): name of the RPC.
- client (string): the client service (initiator of the RPC).
- server (string): the server service (target of the RPC).
- request_payload_name (string): PayloadSpec to use as the payload for the request.
- response_payload_name (string): PayloadSpec to use as the payload for the response.
- fanout_filter (string, default=all): selects the instance(s) of server to send the RPC to.
  - all: send the RPC to all the instances of server, every time.
  - random: choose a random instance.
  - round_robin: choose one instance in a round-robin fashion.
  - stochastic: specify a list of probabilities of reaching different numbers of instances.
    - Format: stochastic{probability:nb_targets,...}
    - Example: stochastic{0.7:1,0.2:3,0.1:5} will target:
      - A single instance of server with 70% chance
      - 3 random instances of server with 20% chance
      - 5 random instances of server with 10% chance
  - An unrecognized value will target the instance 0 of server.
- tracing_interval (int32):
  - 0: disable tracing.
  - >0: create a trace of the RPC in the report every tracing_interval times (rpc.id % tracing_interval == 0).
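Putting the RPC fields together, an rpc_descriptions block might look like the following sketch. The RPC, service, and payload names are hypothetical; the fanout_filter string is the stochastic example given above.

```textproto
rpc_descriptions {
  name: "echo"                               # hypothetical RPC name
  client: "client"                           # initiating service
  server: "server"                           # target service
  request_payload_name: "request_payload"    # hypothetical PayloadSpec name
  response_payload_name: "response_payload"  # hypothetical PayloadSpec name
  # 70%: one instance, 20%: three random instances, 10%: five random instances.
  fanout_filter: "stochastic{0.7:1,0.2:3,0.1:5}"
  # Trace every 100th RPC (rpc.id % 100 == 0).
  tracing_interval: 100
}
```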
Define the payload attached to an RPC.
- name (string): name of the PayloadSpec.
- size (int32): the size, in bytes, of the payload.
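As a small sketch, two payloads that an RPC could refer to through request_payload_name and response_payload_name might be declared like this. The names and sizes are arbitrary, hypothetical values.

```textproto
payload_descriptions {
  name: "request_payload"   # hypothetical name
  size: 196                 # bytes
}
payload_descriptions {
  name: "response_payload"  # hypothetical name
  size: 262144              # bytes
}
```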
Configure the protocol driver options. These options can be referred to by the service message.
- name (string): name of the ProtocolDriverOptions.
- protocol_name (string): name of the protocol driver to use (e.g. grpc, grpc_async_callback).
- netdev_name (string): name of the network device interface to use (e.g. eth0).
- server_settings: settings to apply to the protocol driver, for example:

      server_settings { name: "grpc.max_send_message_length" int_value: 2048 }
      server_settings { name: "grpc.per_message_compression" int_value: 1 }

  See GRPC Options for applicable options.
The grpc protocol driver has a server_type server_settings option to
configure the server:
- server_type: inline (requests are processed inline) or handoff (creates a thread and uses a reactor to respond to incoming RPCs).
The grpc protocol driver also provides a client_type client_settings option
to configure the client:
- client_type: polling (uses a completion thread that polls the completion queue) or callback (grpc performs a callback to signal completion).
The grpc_async_callback protocol driver behaves like grpc with
client_type=callback and server_type=handoff. It is deprecated; use the grpc
protocol driver with the appropriate client_type and server_type options
instead.
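The following sketch shows how the options above might be combined into one set of protocol driver options. Several details are assumptions not confirmed by this document: the enclosing field name protocol_driver_options, the string_value field used for textual settings (only int_value appears in the example above), and the options name itself.

```textproto
# Assumed top-level field name for a ProtocolDriverOptions message.
protocol_driver_options {
  name: "my_grpc_options"   # hypothetical; referenced by
                            # protocol_driver_options_name in a service
  protocol_name: "grpc"
  netdev_name: "eth0"
  # string_value is an assumed field name for string-typed settings.
  server_settings { name: "server_type" string_value: "inline" }
  client_settings { name: "client_type" string_value: "polling" }
}
```

Check traffic_config.proto for the exact field names before relying on this fragment.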
This describes a (possibly) multi-dimensional joint distribution. For
convenience, a one-dimensional distribution can be described as a CDF.
For the more general multi-dimensional case, each PMF point can describe
multiple dimensions of the distribution independently, with the meaning of
each dimension being described by the corresponding field_names.
- name (string): name of the DistributionConfig.
- pmf_points: used to define the probability mass function (PMF) of a distribution.
  - The pmf values of all the points in the distribution must add up to (or be near) 1.0.
  - The number of data_points must match the number of dimensions of the joint distribution.
- cdf_points: can be used to define the cumulative distribution function (CDF) of a distribution.
  - The cdf value of the last cdf_point must be equal to 1.0.
  - If the first point's cdf value is equal to zero, then the distribution is interpreted as piece-wise uniform, with the lower bound of each subsequent interval automatically being (the previous value + 1). E.g. values of 5, 10, 25, 40 would define inclusive intervals of [5, 10], [11, 25], [26, 40]. Note that N points define N-1 intervals.
- field_names (string): the name(s) describing the meaning of each dimension.
Examples of files with distributions in them can be found in the homa_cp_node_configs folder.
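To make the piece-wise uniform CDF case concrete, here is a sketch using the values 5, 10, 25, 40 from the description above. The inner field names value and cdf are assumptions inferred from the wording of this document, the intermediate cdf values are arbitrary, and the distribution name is hypothetical; the files in homa_cp_node_configs remain the authoritative examples.

```textproto
distribution_config {
  name: "small_payloads"        # hypothetical name
  field_names: "payload_size"   # meaning of the single dimension
  # First cdf is 0, so the points define inclusive intervals:
  # [5, 10], [11, 25], [26, 40].
  cdf_points { value: 5  cdf: 0.0 }
  cdf_points { value: 10 cdf: 0.25 }
  cdf_points { value: 25 cdf: 0.75 }
  cdf_points { value: 40 cdf: 1.0 }   # last cdf must equal 1.0
}
```

Four points define three intervals, matching the rule that N points define N-1 intervals.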
- default_protocol: selects the protocol driver to use (by default, grpc_async_callback is used).
  - grpc: use a completion thread to poll the completion queue.
  - grpc_async_callback: use the asynchronous API with a callback function.
The TestSequence RPC also has the following options:
- keep_instance_log (boolean, default=true): by default, the test results contain the instance log, which includes a trace of all the RPCs executed. If the output is too verbose, the instance log can be suppressed by setting this option to false:

      tests_setting { keep_instance_log: false }
      tests { ...

- shutdown_after_tests (boolean, default=false): if true, quit Distbench (node managers and test sequencers) when all the tests in the RPC are done.
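Both options can presumably be set in the same tests_setting block, as in this short sketch. It assumes that shutdown_after_tests lives alongside keep_instance_log in tests_setting; the test definitions themselves are left out.

```textproto
tests_setting {
  keep_instance_log: false     # drop the per-RPC instance log from the results
  shutdown_after_tests: true   # quit node managers and test sequencers when done
}
```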