A command line utility to run shell commands on a large number of HPC nodes

janoliver/cash

CASH - Cascading Shell

CASH is a utility for administrators of large computer clusters to quickly run shell commands on all or a subset of the cluster nodes. CASH arranges the nodes in a cascading, tree-like topology, and is therefore much faster than tools that simply iterate over the nodes or try to access many nodes in parallel from a single machine.

CASH is intended to be run from the administrator's machine, but may also be run from one of the cluster nodes. In the first case, all communication between the computer cluster and the admin machine is channelled through a gateway host.

Please see below for the execution/communication model.

Requirements

CASH has the following requirements:

  • python > 3.6 on each node
  • password-less SSH access to and between all nodes

Setup

Please run pip install cascading-shell and use the cash command line tool on the admin machine. Then, configure your cluster(s). Nodes and nodegroups are configured in ~/.cash.topo.json like this:

{
  "nodes": {
    "group1": "clus1node001,clus1node002,clus1node003",
    "all": {
      "site1": {
        "cluster1": {
          "rack1": "clus1node[001-020]",
          "rack2": "clus1node[021-040]",
          "rack3": "clus1node[041-060]"
        },
        "cluster2": {
          "rack1": "clus2node[001-020]",
          "rack2": "clus2node[021-040]",
          "rack3": "clus2node[041-060]"
        }
      },
      "site2": {
        "cluster3": {
          "rack1": "clus3node[001-020]",
          "rack2": "clus3node[021-040]",
          "rack3": "clus3node[041-060]"
        },
        "cluster4": {
          "rack1": "clus4node[001-020]",
          "rack2": "clus4node[021-040]",
          "rack3": "clus4node[041-060]"
        },
        "cluster5": "clus5node[001-020]"
      }
    }
  }
}

The config file has the following rules:

  • Right now, everything lives under the nodes object.
  • The file format is standard JSON, where each key is a group name and each value is either a comma-separated list of nodes or a nested object of further groups.
  • Nodes with sequential numbers can be shortened using square brackets, e.g., node[001-003] resolves to node001,node002,node003. Be careful with leading zeros here! You may also use a comma here, such as: node[001-003,005] -> node001,node002,node003,node005. You can also use multiple bracket instances: clus[1-3]node[001-003] -> clus1node001,clus1node002,clus1node003,clus2node001,clus2node002,clus2node003,clus3node001,clus3node002,clus3node003 and so on.
  • Groups can be nested. The topology of the node tree is specified in the mandatory all group. It is wise to reflect network latency/bandwidth in the tree; for instance, as in the above example, you may divide your HPC into groups of site, cluster, rack if applicable.
  • Aside from all, you can specify as many groups as you wish and nest them to your liking.
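
The square bracket rules above can be illustrated with a short Python sketch. This is not CASH's own parser; the function names expand_range_list, expand_pattern and expand_node_list are hypothetical, and the zero padding is assumed to follow the width of the range's start value.

```python
import itertools
import re

# Matches one bracket expression and captures its body, e.g. "001-003,005".
_BRACKET = re.compile(r"\[([^\]]*)\]")


def expand_range_list(body):
    """Expand the inside of one bracket, e.g. '001-003,005'."""
    values = []
    for item in body.split(","):
        item = item.strip()
        if "-" in item:
            start, end = item.split("-", 1)
            width = len(start)  # assumption: padding follows the start value
            for number in range(int(start), int(end) + 1):
                values.append(str(number).zfill(width))
        else:
            values.append(item)
    return values


def expand_pattern(pattern):
    """Expand one pattern such as 'clus[1-3]node[001-003]'."""
    # With a capturing group, re.split alternates literal text and bracket bodies.
    pieces = _BRACKET.split(pattern)
    literals = pieces[0::2]
    choices = [expand_range_list(body) for body in pieces[1::2]]
    names = []
    for combo in itertools.product(*choices):
        name = literals[0]
        for value, literal in zip(combo, literals[1:]):
            name += value + literal
        names.append(name)
    return names


def expand_node_list(node_list):
    """Expand a comma-separated list, ignoring commas inside brackets."""
    names = []
    for pattern in re.split(r",(?![^\[]*\])", node_list):
        names.extend(expand_pattern(pattern.strip()))
    return names


print(expand_pattern("node[001-003,005]"))
# ['node001', 'node002', 'node003', 'node005']
```

Note how node[1-3] and node[001-003] give different host names under this sketch, which is why the leading zeros matter.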

Cascading communication model

CASH communicates with each node in a cascading fashion: the CASH instance on each node acts as a proxy for its immediate children and forwards all messages from the children to its parent and vice versa. Let's try to understand this with an example. Imagine the following topology configuration:

{
  "nodes": {
    "all": {
      "site1": {
        "cluster1": {
          "rack1": "clus1node[1-3]",
          "rack2": "clus1node[4-6]"
        },
        "cluster2": {
          "rack1": "clus2node[1-3]",
          "rack2": "clus2node[4-6]"
        }
      },
      "site2": {
        "cluster3": {
          "rack1": "clus3node[1-3]",
          "rack2": "clus3node[4-6]"
        },
        "cluster4": {
          "rack1": "clus4node[1-3]",
          "rack2": "clus4node[4-6]"
        }
      }
    }
  }
}

We have a total of four clusters at two geographical sites; each cluster has two racks with three nodes each. We now want to execute a command on all nodes using CASH. First, CASH spawns an instance of itself on the gateway host, which can be specified via the DEFAULT_JUMP_HOST variable or the --jumphost command line parameter. From the gateway, a connection is established to the first host of site1 and the first host of site2, i.e., clus1node1 and clus3node1. From each of those two nodes, CASH hops to the first unused node of each cluster (e.g., clus1node2 for cluster1, as clus1node1 was already used, and clus2node1 for cluster2), from there to the first node of each rack, and then to the remaining nodes.

For example, clus4node5 is reached in the following way: ADMIN_MACHINE -> gateway -> clus3node1 (site) -> clus4node1 (cluster) -> clus4node4 (rack) -> clus4node5 (node). This tiered or cascading execution model of course makes sense only for a larger number of nodes than in this example. You can tell CASH to use a flat instead of cascading connection model with the --flatten parameter.
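
The walk-through above can be reproduced with a small Python sketch. It assumes that each group is headed by its first node that has not already been used further up the tree, which matches the description but is not taken from CASH's source. The helper names (names, flatten, build_subtree, find_path) are hypothetical, and the admin machine is left out of the path.

```python
def names(prefix, first, last):
    """Return prefix + number for every number in first..last."""
    return [f"{prefix}{i}" for i in range(first, last + 1)]


# The example topology with the bracket ranges already expanded.
TOPOLOGY = {
    "site1": {
        "cluster1": {"rack1": names("clus1node", 1, 3), "rack2": names("clus1node", 4, 6)},
        "cluster2": {"rack1": names("clus2node", 1, 3), "rack2": names("clus2node", 4, 6)},
    },
    "site2": {
        "cluster3": {"rack1": names("clus3node", 1, 3), "rack2": names("clus3node", 4, 6)},
        "cluster4": {"rack1": names("clus4node", 1, 3), "rack2": names("clus4node", 4, 6)},
    },
}


def flatten(group):
    """Return all node names below a group, in declaration order."""
    if isinstance(group, list):
        return list(group)
    found = []
    for child in group.values():
        found.extend(flatten(child))
    return found


def build_subtree(group, used):
    """Return (head, children); the head is the group's first unused node."""
    remaining = [node for node in flatten(group) if node not in used]
    if not remaining:
        return None
    head = remaining[0]
    used.add(head)
    children = []
    if isinstance(group, list):
        # Leaf group (a rack): the other nodes hang directly below the head.
        for node in remaining[1:]:
            used.add(node)
            children.append((node, []))
    else:
        for child in group.values():
            subtree = build_subtree(child, used)
            if subtree is not None:
                children.append(subtree)
    return (head, children)


def find_path(tree, target, prefix=()):
    """Return the list of hops from the root of the tree to target."""
    head, children = tree
    path = prefix + (head,)
    if head == target:
        return list(path)
    for child in children:
        found = find_path(child, target, path)
        if found:
            return found
    return None


used = set()
sites = [build_subtree(site, used) for site in TOPOLOGY.values()]
root = ("gateway", [site for site in sites if site is not None])
print(" -> ".join(find_path(root, "clus4node5")))
# gateway -> clus3node1 -> clus4node1 -> clus4node4 -> clus4node5
```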

The number of parallel connections on each node is limited by the --fansize parameter (env DEFAULT_FANSIZE = 50). When more than FANSIZE nodes are direct children of one node, they are split into groups of FANSIZE and an additional layer is formed.
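
As a rough illustration of the grouping step, the following Python sketch splits a list of direct children into chunks of at most the fan size. The function name group_by_fansize is hypothetical, and how CASH picks the node that heads each chunk in the additional layer is not covered here.

```python
def group_by_fansize(children, fan_size):
    """Split direct children into consecutive chunks of at most fan_size."""
    if fan_size < 1:
        raise ValueError("fan_size must be at least 1")
    return [children[i:i + fan_size] for i in range(0, len(children), fan_size)]


# 120 direct children with the default fan size of 50 give three chunks.
children = [f"node{i:03d}" for i in range(1, 121)]
chunks = group_by_fansize(children, 50)
print([len(chunk) for chunk in chunks])
# [50, 50, 20]
```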

Every node that is part of the tree receives and forwards messages from/to its parent and its children, and also executes the desired shell command locally.

Usage

Here is a copy of cash --help:

usage: cash [-h] [-n NODES] [--jumphost JUMPHOST] [--ssh-timeout SSH_TIMEOUT]
            [-s FANSIZE] [--flatten] [-p] [--json | --shell | --quiet]
            {run,plan} ...

positional arguments:
  {run,plan}            Please use one of the following sub commands
    run                 Run command
    plan                Print tree as json to stdout (view with, e.g.,
                        firefox)

optional arguments:
  -h, --help            show this help message and exit
  -n NODES, --nodes NODES
                        Node or node groups.
  --jumphost JUMPHOST   Gateway host to cluster.
  --ssh-timeout SSH_TIMEOUT
                        Define a timeout for SSH sessions. 0 = no timeout
  -s FANSIZE, --fansize FANSIZE
                        Maximum number of parallel SSH sessions.
  --flatten             Disable tree mode.
  -p, --progress        Show progress of received answers
  --json                JSON output format
  --shell               Shell friendly output format
  --quiet               No output

  • Node groups can be specified with @group_name in the --nodes parameter.
  • You can exclude hosts by using -n "@group,-node01".
  • You can use the square bracket syntax here, too: -n "node[1-9]".
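
To make the --nodes syntax concrete, here is a minimal Python sketch that splits a selection string into group references, plain node patterns and exclusions. It is an illustration only: the function names split_top_level and classify_selection are hypothetical, bracket patterns are kept unexpanded, and CASH's actual parsing may differ.

```python
def split_top_level(spec):
    """Split on commas that are not inside square brackets."""
    parts, depth, current = [], 0, ""
    for ch in spec:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append(current)
            current = ""
        else:
            current += ch
    if current:
        parts.append(current)
    return parts


def classify_selection(spec):
    """Sort the tokens of a --nodes string into groups, nodes and exclusions."""
    result = {"groups": [], "nodes": [], "excluded": []}
    for token in split_top_level(spec):
        token = token.strip()
        if token.startswith("@"):
            result["groups"].append(token[1:])
        elif token.startswith("-"):
            result["excluded"].append(token[1:])
        elif token:
            result["nodes"].append(token)
    return result


print(classify_selection("@group,-node01,node[1-3,5]"))
# {'groups': ['group'], 'nodes': ['node[1-3,5]'], 'excluded': ['node01']}
```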

You can set the defaults of the CLI parameters via the following environment variables:

DEFAULT_SSH_TIMEOUT = 30
DEFAULT_FANSIZE = 50
DEFAULT_NODES_STRING = "@all"
DEFAULT_OUT_FORMAT = "text"
DEFAULT_JUMP_HOST = "jumphost"
DEFAULT_FLATTEN = False
DEFAULT_RUN_SHELL = True
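
One way such defaults could be read from the environment is sketched below in Python. The variable names and fallback values come from the list above; the helper names parse_bool and load_defaults, the dictionary keys, and the accepted spellings of boolean values are assumptions, not CASH's documented behavior.

```python
import os


def parse_bool(raw):
    """Interpret common truthy spellings; anything else counts as False."""
    return raw.strip().lower() in ("1", "true", "yes", "on")


def load_defaults(environ=os.environ):
    """Return the defaults, overridden by environment variables if set."""
    def get(name, fallback, cast=str):
        raw = environ.get(name)
        return fallback if raw is None else cast(raw)

    return {
        "ssh_timeout": get("DEFAULT_SSH_TIMEOUT", 30, int),
        "fansize": get("DEFAULT_FANSIZE", 50, int),
        "nodes": get("DEFAULT_NODES_STRING", "@all"),
        "out_format": get("DEFAULT_OUT_FORMAT", "text"),
        "jump_host": get("DEFAULT_JUMP_HOST", "jumphost"),
        "flatten": get("DEFAULT_FLATTEN", False, parse_bool),
        "run_shell": get("DEFAULT_RUN_SHELL", True, parse_bool),
    }


# Passing a plain dict instead of os.environ makes the override visible.
print(load_defaults({"DEFAULT_FANSIZE": "20"})["fansize"])
# 20
```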
