Specific instructions to run harvester with ssh rpc middleware

Instructions

Basic description of SSH RPC middleware: see here

The queue configuration depends on the architecture of the cluster (HPC) and the use case.

Limitations

Plugins that run remotely through SSH + RPC (via rpc_bot) must be able to run independently of the harvester server. That is, they CANNOT access the Harvester DB (cannot call dbInterface methods), and they cannot access file paths on the harvester server when there is no shared file system between the harvester server and the HPC.

Generic use case

Consider an HPC with the following properties:

Login nodes and worker nodes have no outbound connectivity
Login nodes are accessible via SSH
Login nodes have the same environment and mount the same shared filesystem as worker nodes
Service processes are allowed to run on login nodes (no CPU-time limit per process or other restrictions)
DTNs (data transfer nodes) are available, with outbound connectivity and grid data transfer tools (globus, gfal, xroot, etc.)
DTNs are accessible from login nodes

Then it suffices to run the harvester rpc_bot process on a login node of the HPC and let all harvester plugins run remotely on that login node.

That is, harvester runs the following plugins remotely:

submitter
monitor
sweeper
messenger
preparator
stager
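In configuration terms, running a plugin remotely means adding "middleware": "rpc" to that plugin's section. A minimal Python sketch, assuming the queue configuration has already been loaded into a dict; the helper `mark_remote` and the constant `REMOTE_PLUGINS` are hypothetical names used for illustration and are not part of harvester:

```python
import copy

# Plugin sections that run on the login node in the generic use case.
REMOTE_PLUGINS = ("submitter", "monitor", "sweeper", "messenger", "preparator", "stager")


def mark_remote(queue_config, plugins=REMOTE_PLUGINS, middleware="rpc"):
    """Return a copy of queue_config where each listed plugin section uses the middleware.

    Plugin sections absent from queue_config are skipped, not created.
    """
    updated = copy.deepcopy(queue_config)
    for plugin in plugins:
        section = updated.get(plugin)
        if isinstance(section, dict):
            section["middleware"] = middleware
    return updated


if __name__ == "__main__":
    queue = {
        "submitter": {"name": "SlurmSubmitter"},
        "monitor": {"name": "SlurmMonitor"},
    }
    print(mark_remote(queue))
```

The function works on a copy so that the original configuration can still be compared against the result before anything is written back.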

The queue configuration (partial) may look like this:

                "preparator": {
                        "name": "SomePreparator",
                        "module": "pandaharvester.harvesterpreparator.some_preparator",
                        "basePath": "/some/remote/base/path",
                        "middleware": "rpc"
                },
                "submitter": {
                        "name": "SlurmSubmitter",
                        "module": "pandaharvester.harvestersubmitter.slurm_submitter",
                        "nCore": 9600,
                        "nCorePerNode": 48,
                        "templateFile": "/some/remote/template.sh",
                        "middleware": "rpc"
                },
                "messenger": {
                        "name": "SharedFileMessenger",
                        "module": "pandaharvester.harvestermessenger.shared_file_messenger",
                        "accessPoint": "/some/remote/path/${workerID}",
                        "middleware": "rpc"
                },
                "stager": {
                        "name": "SomeStager",
                        "module": "pandaharvester.harvesterstager.some_stager",
                        "middleware": "rpc"
                },
                "monitor": {
                        "name": "SlurmMonitor",
                        "module": "pandaharvester.harvestermonitor.slurm_monitor",
                        "middleware": "rpc"
                },
                "sweeper": {
                        "name": "SlurmSweeper",
                        "module": "pandaharvester.harvestersweeper.slurm_sweeper",
                        "middleware": "rpc"
                },
                "rpc": {
                        "name": "RpcHerder",
                        "module": "pandaharvester.harvestermiddleware.rpc_herder",
                        "remoteHost": "some.remote.host",
                        "remoteBindPort": 18861,
                        "numTunnels": 3,
                        "sshUserName": "someusername",
                        "sshPassword": null,
                        "privateKey": "/some/private/key",
                        "passPhrase": "somepassphrase",
                        "jumpHost": "some.jump.host",
                        "jumpPort": 22
                }

Note that paths set for plugins that use rpc (e.g. the messenger accessPoint) are remote paths, i.e. paths on the HPC side.
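A quick consistency check can catch a plugin that names a middleware with no matching section, or an rpc section that lacks connection settings. The sketch below is an illustration only. It assumes, as the example suggests, that the "middleware" value names the section that configures it. The function `check_middleware` is hypothetical, and the keys treated as required (`name`, `module`, `remoteHost`, `remoteBindPort`) are an assumption drawn from the example above, not from harvester's own validation.

```python
# Assumed minimum content of the "rpc" section; taken from the example above,
# not from harvester itself.
ASSUMED_RPC_KEYS = ("name", "module", "remoteHost", "remoteBindPort")


def check_middleware(queue_config):
    """Return a list of human-readable problems found in a queue configuration dict."""
    problems = []

    # Every plugin that names a middleware needs a section with that name.
    for section_name, section in queue_config.items():
        if not isinstance(section, dict):
            continue
        middleware = section.get("middleware")
        if middleware is None:
            continue
        if not isinstance(queue_config.get(middleware), dict):
            problems.append(
                f"{section_name}: middleware '{middleware}' has no matching section"
            )

    # If an "rpc" section exists, check the keys assumed to be required.
    rpc = queue_config.get("rpc")
    if isinstance(rpc, dict):
        for key in ASSUMED_RPC_KEYS:
            if key not in rpc:
                problems.append(f"rpc: missing '{key}'")

    return problems


if __name__ == "__main__":
    example = {
        "monitor": {"name": "SlurmMonitor", "middleware": "rpc"},
        "rpc": {"name": "RpcHerder", "remoteHost": "some.remote.host"},
    }
    for problem in check_middleware(example):
        print(problem)
```

An empty list means the fragment is internally consistent under these assumptions. It does not verify that remote paths exist or that the SSH credentials work.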
