CASH is a utility for administrators of large computer clusters to quickly run shell commands on all or a subset of the cluster nodes. CASH arranges the nodes in a cascading, or tree-like, topology, and is therefore much faster than tools that simply iterate over the nodes or try to access many nodes in parallel from a single machine.
CASH is supposed to be run from the administrator's machine, but may also be run from one of the cluster nodes. In the first case, all communication between the computer cluster and the admin machine is channelled over a gateway host.
Please see below for the execution/communication model.
CASH has the following requirements:
- python > 3.6 on each node
- password-less SSH access to and between all nodes
Please run pip install cascading-shell and use the cash command line tool on the admin machine. Then, configure your cluster(s). Nodes and node groups are configured in ~/.cash.topo.json like this:
{
"nodes": {
"group1": "clus1node001,clus1node002,clus1node003",
"all": {
"site1": {
"cluster1": {
"rack1": "clus1node[001-020]",
"rack2": "clus1node[021-040]",
"rack3": "clus1node[041-060]"
},
"cluster2": {
"rack1": "clus2node[001-020]",
"rack2": "clus2node[021-040]",
"rack3": "clus2node[041-060]"
}
},
"site2": {
"cluster3": {
"rack1": "clus3node[001-020]",
"rack2": "clus3node[021-040]",
"rack3": "clus3node[041-060]"
},
"cluster4": {
"rack1": "clus4node[001-020]",
"rack2": "clus4node[021-040]",
"rack3": "clus4node[041-060]"
},
"cluster5": "clus5node[001-020]"
}
}
}
}
The config file has the following rules:
- Right now, everything lives under the nodes object.
- The file format is standard JSON, where each key is a group name and each value is a comma-separated list of nodes.
- Nodes with sequential numbers can be shortened using square brackets, e.g., node[001-003] resolves to node001,node002,node003. Be careful with leading zeros here! You may also use a comma here, such as: node[001-003,005] -> node001,node002,node003,node005. You can also use multiple bracket instances: clus[1-3]node[001-003] -> clus1node001,clus1node002,clus1node003,clus2node001,clus2node002,clus2node003,clus3node001,clus3node002,clus3node003 and so on.
- Groups can be nested. The topology of the node tree is specified in the mandatory all group. It is wise to reflect network latency/bandwidth in the tree; for instance, as in the above example, you may divide your HPC into groups of site, cluster, rack if applicable.
- Aside from all, you can specify as many groups as you wish and nest them to your liking.
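The bracket rules above can be sketched in a few lines of Python. This is a minimal illustration, not CASH's actual parser: the function names expand_ranges, expand_pattern, and expand_nodes are hypothetical, and the zero-padding rule (pad each number to the width of the range's start value) is an assumption about how leading zeros are treated.

```python
import itertools
import re

# Matches one bracket instance such as "[001-003,005]" and captures its content.
BRACKET = re.compile(r"\[([^\]]+)\]")


def expand_ranges(spec):
    """Expand '001-003,005' into ['001', '002', '003', '005'].

    Assumption: numbers are zero-padded to the width of the range start.
    """
    values = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start, end = part.split("-", 1)
            width = len(start)
            values.extend(str(i).zfill(width) for i in range(int(start), int(end) + 1))
        else:
            values.append(part)
    return values


def expand_pattern(pattern):
    """Expand a single node pattern with any number of bracket instances."""
    # With one capturing group, re.split alternates literal text (even
    # indices) and bracket contents (odd indices).
    pieces = BRACKET.split(pattern)
    choices = [
        [piece] if index % 2 == 0 else expand_ranges(piece)
        for index, piece in enumerate(pieces)
    ]
    return ["".join(combo) for combo in itertools.product(*choices)]


def expand_nodes(value):
    """Expand a comma-separated group value, ignoring commas inside brackets."""
    nodes = []
    for pattern in re.split(r",(?![^\[]*\])", value):
        pattern = pattern.strip()
        if pattern:
            nodes.extend(expand_pattern(pattern))
    return nodes


if __name__ == "__main__":
    print(expand_pattern("node[001-003,005]"))
    print(expand_nodes("clus1node001,clus[1-2]node[01-02]"))
```

Note that under this padding assumption node[1-10] and node[01-10] expand to different host names, which is one reason to be careful with leading zeros.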
CASH communicates with each node in a cascading fashion, where CASH itself on each node acts as a proxy for its immediate children and forwards all messages from the children to its parent and vice versa. Let's try to understand this with an example. Imagine the following topology configuration:
{
"nodes": {
"all": {
"site1": {
"cluster1": {
"rack1": "clus1node[1-3]",
"rack2": "clus1node[4-6]"
},
"cluster2": {
"rack1": "clus2node[1-3]",
"rack2": "clus2node[4-6]"
}
},
"site2": {
"cluster3": {
"rack1": "clus3node[1-3]",
"rack2": "clus3node[4-6]"
},
"cluster4": {
"rack1": "clus4node[1-3]",
"rack2": "clus4node[4-6]"
}
}
}
}
}
We have a total of four clusters in two geographical sites; each cluster has two racks with three nodes each. We now want to execute a command on all nodes using CASH. First, CASH spawns an instance of itself on the gateway host, which can be specified via the DEFAULT_JUMP_HOST variable or via the command line parameter --jumphost. From the gateway, a connection to the first host of site1 and the first host of site2 is established, i.e., clus1node1 and clus3node1. From each of those two nodes, CASH hops to the first node of each cluster (e.g., clus1node2 for cluster1, as clus1node1 was already used, and clus2node1 for cluster2), from there to the first node of each rack, and then to the remaining nodes.
For example, clus4node5 is reached in the following way: ADMIN_MACHINE -> gateway -> clus3node1 (site) -> clus4node1 (cluster) -> clus4node4 (rack) -> clus4node5 (node). This tiered or cascading execution model of course makes sense only for a larger number of nodes than in this example. You can tell CASH to use a flat instead of cascading connection model with the --flatten parameter.
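The hop sequence described above can be reproduced with a small Python sketch. It is one reading of the prose, not CASH's implementation: each group takes its first still-unused node as head, the heads of its subgroups become that head's children, and the remaining nodes of a leaf group hang off the leaf's head. The function names take_first, build, cascade, and path_to are hypothetical, and the node lists are written out already expanded.

```python
import copy


def take_first(group):
    """Remove and return the first unused node of a group, or None if empty."""
    if isinstance(group, list):
        return group.pop(0) if group else None
    for sub in group.values():
        node = take_first(sub)
        if node is not None:
            return node
    return None


def build(group, parent, parents):
    """Pick a head for the group, attach it to parent, then descend."""
    head = take_first(group)
    if head is None:
        return
    parents[head] = parent
    if isinstance(group, list):
        # Leaf group (e.g. a rack): remaining nodes are children of the head.
        for node in group:
            parents[node] = head
    else:
        for sub in group.values():
            build(sub, head, parents)


def cascade(all_group, gateway="gateway"):
    """Return a child -> parent mapping for the whole tree."""
    parents = {}
    work = copy.deepcopy(all_group)  # build() consumes the node lists
    for sub in work.values():
        build(sub, gateway, parents)
    return parents


def path_to(node, parents):
    """Return the hops from the tree root down to node."""
    path = [node]
    while path[-1] in parents:
        path.append(parents[path[-1]])
    return path[::-1]


def rack(cluster, first, last):
    return [f"clus{cluster}node{i}" for i in range(first, last + 1)]


topology = {
    "site1": {
        "cluster1": {"rack1": rack(1, 1, 3), "rack2": rack(1, 4, 6)},
        "cluster2": {"rack1": rack(2, 1, 3), "rack2": rack(2, 4, 6)},
    },
    "site2": {
        "cluster3": {"rack1": rack(3, 1, 3), "rack2": rack(3, 4, 6)},
        "cluster4": {"rack1": rack(4, 1, 3), "rack2": rack(4, 4, 6)},
    },
}

parents = cascade(topology)
print(" -> ".join(path_to("clus4node5", parents)))
# gateway -> clus3node1 -> clus4node1 -> clus4node4 -> clus4node5
```

Under these assumptions the sketch yields the same route to clus4node5 as the example, and clus1node2 ends up as the head of cluster1 because clus1node1 is already taken by site1.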
The number of parallel connections on each node is limited by the --fansize parameter (env DEFAULT_FANSIZE = 50). When more than FANSIZE nodes are direct children of one node, they are grouped by FANSIZE and an additional layer is formed.
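The grouping step can be illustrated with a short Python sketch. It only shows how a long list of children might be cut into chunks of at most FANSIZE; how CASH attaches each chunk to the tree (for instance, which node heads the new layer) is not stated here, so that part is left out. The function name group_by_fansize is hypothetical.

```python
from typing import List


def group_by_fansize(children: List[str], fansize: int = 50) -> List[List[str]]:
    """Split children into consecutive chunks of at most fansize nodes."""
    if fansize < 1:
        raise ValueError("fansize must be at least 1")
    return [children[i:i + fansize] for i in range(0, len(children), fansize)]


children = [f"node{i:03d}" for i in range(1, 121)]  # 120 direct children
groups = group_by_fansize(children, fansize=50)
print([len(group) for group in groups])  # [50, 50, 20]
```

With 120 direct children and a fan size of 50, the parent would talk to three groups instead of 120 individual nodes.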
Every node that is part of the tree receives and forwards messages from/to its parent and its children, and also executes the desired shell command locally.
Here is a copy of cash --help:
usage: cash [-h] [-n NODES] [--jumphost JUMPHOST] [--ssh-timeout SSH_TIMEOUT]
[-s FANSIZE] [--flatten] [-p] [--json | --shell | --quiet]
{run,plan} ...
positional arguments:
{run,plan} Please use one of the following sub commands
run Run command
plan Print tree as json to stdout (view with, e.g.,
firefox)
optional arguments:
-h, --help show this help message and exit
-n NODES, --nodes NODES
Node or node groups.
--jumphost JUMPHOST Gateway host to cluster.
--ssh-timeout SSH_TIMEOUT
Define a timeout for SSH sessions. 0 = no timeout
-s FANSIZE, --fansize FANSIZE
Maximum number of parallel SSH sessions.
--flatten Disable tree mode.
-p, --progress Show progress of received answers
--json JSON output format
--shell Shell friendly output format
--quiet No output
- Node groups can be specified with @group_name in the --nodes parameter.
- You can exclude hosts by using -n "@group,-node01".
- You can use the square bracket syntax here, too: -n "node[1-9]".
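The selection rules above can be sketched as follows. This is an illustration under assumptions, not CASH's code: the function name select_nodes is hypothetical, groups are passed in as already flattened lists, bracket expansion is omitted, and exclusions are assumed to apply regardless of where they appear in the string.

```python
def select_nodes(spec, groups):
    """Resolve a --nodes style string into an ordered list of host names.

    spec   -- e.g. "@group1,-clus1node002" (bracket syntax not handled here)
    groups -- mapping of group name to a flat list of node names
    """
    selected = []
    excluded = set()
    for token in spec.split(","):
        token = token.strip()
        if not token:
            continue
        if token.startswith("-"):
            excluded.add(token[1:])             # "-node01" removes a host
        elif token.startswith("@"):
            selected.extend(groups[token[1:]])  # "@name" pulls in a group
        else:
            selected.append(token)              # plain host name
    result = []
    seen = set()
    for node in selected:
        if node not in excluded and node not in seen:
            seen.add(node)
            result.append(node)
    return result


groups = {"group1": ["clus1node001", "clus1node002", "clus1node003"]}
print(select_nodes("@group1,-clus1node002", groups))
# ['clus1node001', 'clus1node003']
```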
You can specify the defaults of the CLI parameters via the following environment variables:
DEFAULT_SSH_TIMEOUT = 30
DEFAULT_FANSIZE = 50
DEFAULT_NODES_STRING = "@all"
DEFAULT_OUT_FORMAT = "text"
DEFAULT_JUMP_HOST = "jumphost"
DEFAULT_FLATTEN = False
DEFAULT_RUN_SHELL = True
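As a rough illustration of how such defaults could be picked up, the sketch below reads the variables listed above from the environment and falls back to the listed values. The function name load_defaults and the parsing rules are assumptions, in particular which strings count as true for the boolean settings; CASH itself may interpret the values differently.

```python
import os

# Fallback values as listed above.
DEFAULTS = {
    "DEFAULT_SSH_TIMEOUT": 30,
    "DEFAULT_FANSIZE": 50,
    "DEFAULT_NODES_STRING": "@all",
    "DEFAULT_OUT_FORMAT": "text",
    "DEFAULT_JUMP_HOST": "jumphost",
    "DEFAULT_FLATTEN": False,
    "DEFAULT_RUN_SHELL": True,
}


def load_defaults(environ=None):
    """Return the effective defaults, with environment values taking priority."""
    environ = os.environ if environ is None else environ
    result = {}
    for name, fallback in DEFAULTS.items():
        raw = environ.get(name)
        if raw is None:
            result[name] = fallback
        elif isinstance(fallback, bool):
            # Assumption: "1", "true" and "yes" (any case) mean True.
            result[name] = raw.strip().lower() in ("1", "true", "yes")
        elif isinstance(fallback, int):
            result[name] = int(raw)
        else:
            result[name] = raw
    return result


if __name__ == "__main__":
    for key, value in load_defaults().items():
        print(f"{key} = {value!r}")
```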