CLI can run in a persistent worker mode and receive tasks from the Cirrus Cloud. This allows you to go beyond the cloud offerings and use your own infrastructure for running cloud tasks. The main use case is to run Cirrus tasks directly on hardware without any isolation, rather than in an ephemeral virtual environment.
Follow the instructions in INSTALL.md, but note that a Docker or Podman installation is not required.
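For example, on macOS the CLI can be installed via Homebrew (a sketch; refer to INSTALL.md for other platforms):
brew install cirruslabs/cli/cirrus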
The simplest invocation looks like this:
cirrus worker run --token <poll registration token>
This will start a persistent worker in the foreground that periodically polls for new tasks.
By default, the worker's name is equal to the host name of the current system. Specify --name to explicitly provide the worker's name:
cirrus worker run --token <poll registration token> --name z390-worker
Note that a persistent worker's name must be unique within a pool.
Note that by default a persistent worker has the privileges of the user that invoked it. Read more about isolation below to learn how to limit or extend persistent worker privileges.
The path to the YAML configuration file can be specified via the --file (or -f for short) command-line flag.
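For example (the configuration path here is hypothetical):
cirrus worker run --file /etc/cirrus/worker.yml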
Example configuration:
token: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
name: "MacMini-Rack-1-Slot-2"
labels:
  connected-device: iPhone12ProMax
resources:
  iphone-devices: 1
Currently, configuration files support the same set of options exposed via the command-line flags; the rest of the options can only be set via the configuration file and are documented here.
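For instance, the name and labels from the example above could presumably also be passed on the command line; a sketch, assuming the --labels flag accepts key=value pairs:
cirrus worker run --token <poll registration token> --name "MacMini-Rack-1-Slot-2" --labels connected-device=iPhone12ProMax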
The worker automatically populates the following labels:

- os — GOOS of the CLI binary (for example, linux, windows, darwin, solaris, etc.)
- arch — GOARCH of the CLI binary (for example, amd64, arm64, etc.)
- version — the CLI's version
- hostname — host name reported by the host kernel
- name — worker name passed via the --name flag or the YAML configuration; defaults to hostname with common suffixes like .local and .lan stripped
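These auto-populated labels can be combined with the custom labels from the worker's configuration when targeting workers. A sketch reusing the connected-device label defined earlier (the script is illustrative):
task:
  persistent_worker:
    labels:
      os: darwin
      connected-device: iPhone12ProMax
  script: make test-on-device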
Logging can be configured in the log section of the configuration file. The following options can be customized:
- level — logging level to use, either panic, fatal, error, warning, info, debug or trace (defaults to info)
- file — log to the specified file instead of the terminal
- rotate-size — rotate the log file if it reaches the specified size, e.g. 640 KB or 100 MiB (defaults to no rotation)
- max-rotations — how many already rotated files to keep (defaults to no limit)
Example:
log:
  level: warning
  file: cirrus-worker.log
  rotate-size: 100 MB
  max-rotations: 10
Persistent Worker supports resource management, similar to Kubernetes but somewhat simplified.
Resources are key-value pairs, where the key is an arbitrary resource name and the value is a floating-point number specifying how much of that resource the worker has.
When scheduling tasks, Cirrus CI ensures that the tasks a worker receives do not exceed the resources defined in its configuration file. For example, with the following configuration:
token: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
name: "mac-mini-1"
resources:
  tart-vms: 2
...a worker won't run more than 2 tasks simultaneously from the following .cirrus.yml:
persistent_worker:
  isolation:
    tart:
      image: ghcr.io/cirruslabs/macos-monterey-base:latest
      user: admin
      password: admin
  resources:
    tart-vms: 1

task:
  name: Test
  script: make test

task:
  name: Build
  script: make build

task:
  name: Release
  script: make release
By default, Persistent Worker allows running tasks with any isolation type, which is roughly equivalent to the following configuration:
security:
  allowed-isolations:
    none: {}
    container: {}
    tart: {}
    vetu: {}
To only allow running tasks inside of Tart VMs, for example, specify the following in your Persistent Worker configuration:
security:
  allowed-isolations:
    tart: {}
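With this policy in place, tasks requesting any other isolation type won't run on the worker. For example, a task like the following (a hypothetical sketch using container isolation) would not be allowed:
persistent_worker:
  isolation:
    container:
      image: debian:latest

task:
  script: make test  # not allowed: the worker only permits Tart isolation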
Further, you can also restrict which Tart VM images can be used (the wildcard character * is supported) and force Softnet usage for better network isolation:
security:
  allowed-isolations:
    tart:
      allowed-images:
        - "ghcr.io/cirruslabs/*"
      force-softnet: true
To restrict which volume paths can be mounted and in which mode, use allowed-volumes:
security:
  allowed-isolations:
    tart:
      allowed-volumes:
        # Allow mounting /Volumes/SSD and all directories inside of it
        - source: "/Volumes/SSD/*"
        # Allow mounting /var/src in read-only mode, but not directories inside of it
        - source: "/var/src"
          force-readonly: true
Similarly to Tart, you can also restrict which Vetu VM images can be used (the wildcard character * is supported):
security:
  allowed-isolations:
    vetu:
      allowed-images:
        - "ghcr.io/cirruslabs/*"
Here's an example of how to run a task on one of the persistent workers registered in the dashboard:
task:
  persistent_worker:
    labels:
      os: darwin
      arch: arm64
  script: echo "running on-premise"
By default, a persistent worker does not isolate execution of a task. All task instructions are executed directly on the worker, which can have side effects. This is intended, since the main use case for persistent workers is testing on bare metal.
Note that the task will run under the same user that started the Persistent Worker. You may create a separate user with limited access to restrict task privileges, or conversely grant tasks access to the whole machine by running the Persistent Worker as root:
sudo cirrus worker run --token <poll registration token>
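Conversely, a minimal sketch of running the worker under a dedicated, less-privileged user (assuming a Linux host; the user name is illustrative):
# Create a dedicated user with its own home directory
sudo useradd --create-home cirrus-worker

# Start the Persistent Worker under that user
sudo -u cirrus-worker cirrus worker run --token <poll registration token>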
To use this isolation type, install and configure a container engine like Docker or Podman (essentially any engine supported by the Cirrus CLI).
Here's an example that runs a task in a separate container with a couple of directories from the host machine made accessible:
persistent_worker:
  isolation:
    container:
      image: debian:latest
      cpu: 24
      memory: 128G
      volumes:
        - /path/on/host:/path/in/container
        - /tmp/persistent-cache:/tmp/cache:ro

task:
  script: uname -a
To use this isolation type, install Tart on the persistent worker's host machine.
Here's an example of a configuration that will run the task inside of a fresh macOS virtual machine created from a remote ghcr.io/cirruslabs/macos-ventura-base:latest VM image:
persistent_worker:
  isolation:
    tart:
      image: ghcr.io/cirruslabs/macos-ventura-base:latest
      user: admin
      password: admin

task:
  script: uname -a
Once the VM spins up, the persistent worker will connect to the VM's IP address over SSH using the user and password credentials and run the latest agent version.
To use this isolation type, install Vetu on the persistent worker's host machine. Example configuration:
persistent_worker:
  isolation:
    vetu:
      image: ghcr.io/cirruslabs/ubuntu:latest
      user: admin
      password: admin

task:
  script: uname -a
Once the VM spins up, the persistent worker will connect to the VM's IP address over SSH using the user and password credentials and run the latest agent version.
You can define a VM that the Persistent Worker will run even if no tasks are scheduled.
When an actual task is assigned to a Persistent Worker and the task's isolation specification matches the defined standby configuration, the standby VM will be used to run the task instead of creating a new VM from scratch. Otherwise, the standby VM will be terminated to simplify resource management.
This speeds up start times significantly, since the cloning, configuring and booting of a new VM will normally already be done by the time the request to run a new VM arrives at a Persistent Worker.
Here's an example of how to configure a standby VM for Tart:
standby:
  resources:
    tart-vms: 1
  isolation:
    tart:
      image: ghcr.io/cirruslabs/macos-sonoma-base:latest
      user: admin
      password: admin
      cpu: 4
      memory: 12
This corresponds to the following CI configuration:
persistent_worker:
  isolation:
    tart:
      image: ghcr.io/cirruslabs/macos-sonoma-base:latest
      user: admin
      password: admin
      cpu: 4
      memory: 12
Vetu configuration on arm64 is similar:
standby:
  isolation:
    vetu:
      image: ghcr.io/cirruslabs/ubuntu-runner-arm64:latest
      user: admin
      password: admin
      cpu: 16
      memory: 48
      networking:
        host: {}
      disk_size: 100
On amd64, simply replace the image with ghcr.io/cirruslabs/ubuntu-runner-amd64:latest.
Currently only Tart and Vetu isolations are supported for standby.
Resource modifiers allow you to change the behavior of the underlying isolation engine when a certain amount of resources is allocated to the task.
This comes in handy when you want to pass through one or more physical devices such as GPUs using e.g. Vetu, as vetu run expects a path to a PCI device in its --device argument.
As an example, let's say you've split a single physical GPU using vGPU or SR-IOV technology as follows:
- 1 × 1/2 of GPU (PCI device 01:00.0)
- 2 × 1/4 of GPU (PCI devices 02:00.0 and 03:00.0)
You can then ensure that each task that runs on the Persistent Worker and asks for a gpu resource will get the corresponding vGPU that matches its requirements:
token: <TOKEN>
name: "gpu-enabled-worker"
resources:
  gpu: 1
resource-modifiers:
  - match:
      gpu: 0.5
    append:
      run: ["--device", "/sys/bus/pci/devices/0000:01:00.0/,iommu=on"]
  - match:
      gpu: 0.25
    append:
      run: ["--device", "/sys/bus/pci/devices/0000:02:00.0/,iommu=on"]
  - match:
      gpu: 0.25
    append:
      run: ["--device", "/sys/bus/pci/devices/0000:03:00.0/,iommu=on"]
Note that the Persistent Worker keeps track of the resource modifiers in use, and no --device will be passed to vetu run more than once if a match occurs on a modifier that is already in use. Keep this in mind when assigning resources: to a Persistent Worker and writing resource-modifiers:.
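For completeness, here's a sketch of what the task side might look like in .cirrus.yml, reusing the Vetu isolation from earlier (the image and script are illustrative):
persistent_worker:
  isolation:
    vetu:
      image: ghcr.io/cirruslabs/ubuntu:latest
      user: admin
      password: admin
  resources:
    gpu: 0.25

task:
  script: nvidia-smi  # runs in a VM with one of the 1/4 vGPUs passed through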
Persistent worker produces some useful OpenTelemetry metrics. Metrics are scoped with the org.cirruslabs.persistent_worker prefix and include information about resource utilization, running tasks, and the VM images used.
By default, the telemetry is sent to https://localhost:4317 using the gRPC protocol and to http://localhost:4318 using the HTTP protocol.
You can override this by setting the standard OpenTelemetry environment variable OTEL_EXPORTER_OTLP_ENDPOINT.
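For example (the endpoint below is a placeholder for your own collector):
export OTEL_EXPORTER_OTLP_ENDPOINT=https://otel.example.com:4317
cirrus worker run --token <poll registration token>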
Please refer to the OTEL Collector documentation for instructions on how to install and configure the collector, or find out whether your SaaS monitoring solution has an available OTEL endpoint (see Datadog and Honeycomb as examples).
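As a starting point, here's a minimal OpenTelemetry Collector configuration sketch that receives the worker's OTLP metrics and prints them for inspection (the debug exporter ships with recent Collector releases):
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

exporters:
  debug: {}

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [debug]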
There are two standard options for ingesting the metrics produced by the persistent worker into GCP:
- OpenTelemetry Collector + Google Cloud Exporter — an open-source solution that can later be re-purposed to send metrics to any OTLP-compatible endpoint by swapping a single exporter (see the sketch below)
- Ops Agent — a Google-backed solution with a syntax similar to the OpenTelemetry Collector, but tied to GCP only
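A minimal sketch of the first option, using the googlecloud exporter from the Collector contrib distribution (the project ID is a placeholder):
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

exporters:
  googlecloud:
    project: my-gcp-project  # placeholder: your GCP project ID

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [googlecloud]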