examples: document expiry workflow design patterns
oliver-sanders committed Feb 12, 2025
1 parent 1593ef3 commit fd71a13
Showing 5 changed files with 307 additions and 0 deletions.
110 changes: 110 additions & 0 deletions cylc/flow/etc/examples/expiry/.validate
@@ -0,0 +1,110 @@
#!/bin/bash
# THIS FILE IS PART OF THE CYLC WORKFLOW ENGINE.
# Copyright (C) NIWA & British Crown (Met Office) & Contributors.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

set -euxo pipefail

test_one () {
ID="$(< /dev/urandom tr -dc A-Za-z | head -c6 || true)"

# start the workflow
cylc vip \
--check-circular \
--no-detach \
--final-cycle-point=P0D \
--no-run-name \
--workflow-name "$ID" \
./one

# the start task should have expired
grep 'start.*(internal)expired' "$HOME/cylc-run/$ID/log/scheduler/log"

# the following task(s) should not have run
! grep 'a.*running' "$HOME/cylc-run/$ID/log/scheduler/log"
! grep 'b.*running' "$HOME/cylc-run/$ID/log/scheduler/log"

# lint
cylc lint "$ID"

# clean up
cylc clean "$ID"
}


test_two () {
ID="$(< /dev/urandom tr -dc A-Za-z | head -c6 || true)"

# start the workflow
cylc vip \
--check-circular \
--no-detach \
--final-cycle-point=P0D \
--no-run-name \
--workflow-name "$ID" \
./two

# the start task should run
grep 'start.*running' "$HOME/cylc-run/$ID/log/scheduler/log"

# some other task in the chain should expire
grep '(internal)expired' "$HOME/cylc-run/$ID/log/scheduler/log"

# the housekeep task at the end of the cycle should not run
! grep 'housekeep.*running' "$HOME/cylc-run/$ID/log/scheduler/log"

# lint
cylc lint "$ID"

# clean up
cylc clean "$ID"
}


test_three () {
ID="$(< /dev/urandom tr -dc A-Za-z | head -c6 || true)"

# start the workflow
cylc vip \
--check-circular \
--no-detach \
--final-cycle-point=P0D \
--no-run-name \
--workflow-name "$ID" \
./three

# the start task should expire
grep 'start.*(internal)expired' "$HOME/cylc-run/$ID/log/scheduler/log"
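# no job should have been submitted for the expired "start" task
# (compgen -G expands the glob, which a [[ -f ]] test would not)
[[ -z "$(compgen -G "$HOME/cylc-run/$ID/log/job/*/start/NN/job" || true)" ]]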

# only the "a" and "housekeep" tasks should run
[[ $(cd "$HOME/cylc-run/$ID/log/job/"*; echo *) == 'a housekeep' ]]

# tasks b, c and d should skip
grep '\/b.*run mode=skip' "$HOME/cylc-run/$ID/log/scheduler/log"
grep '\/c.*run mode=skip' "$HOME/cylc-run/$ID/log/scheduler/log"
grep '\/d.*run mode=skip' "$HOME/cylc-run/$ID/log/scheduler/log"

# lint
cylc lint "$ID"

# clean up
cylc clean "$ID"
}


test_one
test_two
test_three
78 changes: 78 additions & 0 deletions cylc/flow/etc/examples/expiry/index.rst
@@ -0,0 +1,78 @@
.. _examples.expiry:

Expiring Tasks / Cycles
-----------------------

Cylc is often used to write workflows which monitor real-world events.

For example, this workflow will run the task ``foo`` every day at midnight (00:00):

.. code-block:: cylc

   [scheduling]
       initial cycle point = previous(T00)
       [[graph]]
           P1D = """
               @wall_clock => foo
           """

Sometimes such workflows might get behind, e.g. due to failures or slow task
execution. In this situation, it might be necessary to skip a few tasks in
order for the workflow to catch up with the real-world time.

Cylc has a concept called :ref:`expiry <ClockExpireTasks>` which allows tasks
to be automatically "expired" if they are running behind schedule.

.. seealso::

   :ref:`ClockExpireTasks`.


Example 1: Skip a whole cycle of tasks
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If the workflow gets behind, skip whole cycles of tasks until it catches up.

.. admonition:: Get a copy of this example
   :class: hint

   .. code-block:: console

      $ cylc get-resources examples/expiry/one

.. literalinclude:: one/flow.cylc
   :language: cylc
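
One way to confirm that the first cycle was skipped is to search the scheduler
log for the expiry message, as the validation script for these examples does.
A minimal sketch, assuming the workflow was installed with ``--no-run-name``
under the hypothetical name ``expiry-one``:

.. code-block:: console

   $ grep 'start.*(internal)expired' ~/cylc-run/expiry-one/log/scheduler/log

A matching line indicates that ``start`` expired rather than ran.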


Example 2: Skip the remainder of a cycle of tasks
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If the workflow gets behind, skip the remainder of the tasks in the cycle,
then skip whole cycles of tasks until it catches up.

.. admonition:: Get a copy of this example
   :class: hint

   .. code-block:: console

      $ cylc get-resources examples/expiry/two

.. literalinclude:: two/flow.cylc
   :language: cylc


Example 3: Skip selected tasks in a cycle
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If the workflow gets behind, turn off selected tasks to allow it to catch up
more quickly.

.. admonition:: Get a copy of this example
   :class: hint

   .. code-block:: console

      $ cylc get-resources examples/expiry/three

.. literalinclude:: three/flow.cylc
   :language: cylc
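
In this example the ``expired handlers`` setting on ``start`` does the work:
when ``start`` expires, the ``%(workflow)s`` and ``%(point)s`` template
variables are filled in and the resulting ``cylc broadcast`` command is run.
As a rough sketch, assuming a workflow named ``my-workflow`` and a cycle point
of ``20250212T0000Z`` (both hypothetical values chosen for illustration), the
handler would run something like:

.. code-block:: console

   $ cylc broadcast "my-workflow" -p "20250212T0000Z" -n b -n c -n d -s "run mode = skip"

The ``-p`` option restricts the broadcast to that one cycle point, so ``b``,
``c`` and ``d`` are only set to skip in the cycle whose ``start`` task expired.
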
33 changes: 33 additions & 0 deletions cylc/flow/etc/examples/expiry/one/flow.cylc
@@ -0,0 +1,33 @@
[meta]
description = """
If the workflow runs slowly and the cycle time gets behind the real
world (wallclock) time, then it will skip cycles until it catches up.
Either a cycle runs or it is skipped.
When you start this workflow, the first cycle will be at 00:00am this
morning so will immediately expire causing the workflow to move onto
tomorrow's cycle.
"""

[scheduler]
allow implicit tasks = True

[scheduling]
# start the workflow at 00:00am this morning
initial cycle point = previous(T00)

# the "start" task will "expire" if the cycle time falls behind
# the wallclock time
[[special tasks]]
clock-expire = start

[[graph]]
P1D = """
# the chain of tasks we want to run
start => a => b => c => d => housekeep

# wait for the previous cycle to either complete or expire before
# continuing onto the next cycle
housekeep[-P1D] | start[-P1D]:expired? => start
"""
40 changes: 40 additions & 0 deletions cylc/flow/etc/examples/expiry/three/flow.cylc
@@ -0,0 +1,40 @@
[meta]
description = """
If the workflow runs slowly and the cycle time gets behind the real
world (wallclock) time, then it will skip selected tasks until it
catches up.
In this case, the tasks "b", "c" and "d" will be skipped to help the
workflow to catch up more quickly.
When this workflow starts up, the first cycle will be at 00:00am today
so the "start" task will immediately expire. This will cause tasks
"b", "c" and "d" to be configured to "skip" rather than run.
"""

[scheduler]
allow implicit tasks = True

[scheduling]
# start the workflow at 00:00am this morning
initial cycle point = previous(T00)
final cycle point = +P0D

# the "start" task will "expire" if the cycle time falls behind
# the wallclock time
[[special tasks]]
clock-expire = start

[[graph]]
P1D = """
# the chain of tasks we want to run
start | start:expired? => a => b => c => d => housekeep
"""

[runtime]
[[start]]
# if this task expires, configure the tasks "b", "c" and "d" to
# "skip" rather than run
# Note: This task will also be "skipped" if it expires
[[[events]]]
expired handlers = cylc broadcast "%(workflow)s" -p "%(point)s" -n b -n c -n d -s "run mode = skip"
46 changes: 46 additions & 0 deletions cylc/flow/etc/examples/expiry/two/flow.cylc
@@ -0,0 +1,46 @@
[meta]
description = """
If the workflow runs slowly and the cycle time gets behind the real
world (wallclock) time, then it will skip tasks until it catches up.
A cycle may be skipped part way through to allow the workflow to catch
up faster.
When this workflow starts up, the first cycle will be one minute ahead
of the wallclock time. At some point in the cycle, the wallclock time
will overtake the cycle time and the next task in the chain will
expire. The workflow will then move onto the next cycle.
"""

[scheduler]
allow implicit tasks = True

[scheduling]
# start the workflow one minute ahead of the current wallclock time
initial cycle point = PT1M

# any task in the workflow will "expire" rather than run if the cycle
# time falls behind the wallclock time
[[special tasks]]
clock-expire = start, a, b, c, d, housekeep

[[graph]]
P1D = """
# the chain of tasks we want to run
start => a => b => c => d => housekeep

# start the next cycle as soon as the previous cycle has finished
# OR any task in the previous cycle has expired
housekeep[-P1D]
| start[-P1D]:expired?
| a[-P1D]:expired?
| b[-P1D]:expired?
| c[-P1D]:expired?
| d[-P1D]:expired?
| housekeep[-P1D]:expired?
=> start
"""

[runtime]
[[root]]
script = sleep 12
