Set up resource monitoring for tasks of cromwell runs #25

malachig · 2022-10-07T14:34:26Z

The Cromwell docs describe the capability to have monitoring for every step of your workflow. The docs I have been able to find are limited:

https://cromwell.readthedocs.io/en/stable/wf_options/Google/
Which states:

Specifies a GCS URL to a script that will be invoked prior to the user command being run. For example, if the value for monitoring_script is "gs://bucket/script.sh", it will be invoked as ./script.sh > monitoring.log &. The value monitoring.log file will be automatically de-localized.

https://cromwell.readthedocs.io/en/latest/backends/Google/
Which states:

In order to monitor metrics (CPU, Memory, Disk usage...) about the VM during Call Runtime, a workflow option can be used to specify the path to a script that will run in the background and write its output to a log file.

{
  "monitoring_script": "gs://cromwell/monitoring/script.sh"
}

The output of this script will be written to a monitoring.log file that will be available in the call gcs bucket when the call completes. This feature is meant to run a script in the background during long-running processes. It's possible that if the task is very short that the log file does not flush before de-localization happens and you will end up with a zero byte file.
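To make the flushing caveat concrete, here is a rough sketch of the kind of sampling loop such a script might run. It is not taken from the Cromwell docs: the function names, the disk-only metric, and the column names are assumptions for illustration. Because Cromwell invokes the script directly, a Python version would also need a shebang line and Python available in the task image. The point is the explicit flush after every row, so that even a short task leaves something in the log.

```python
import shutil
import time


def disk_percent(path="/"):
    """Percent of the filesystem at `path` currently in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def sample_rows(n_samples, interval_s=1.0, read_metric=disk_percent):
    """Yield (seconds, value, peak) tuples, tracking the running peak.

    `read_metric` is a hypothetical hook so another metric can be swapped in.
    """
    start = time.monotonic()
    peak = float("-inf")
    for i in range(n_samples):
        value = read_metric()
        peak = max(peak, value)
        yield int(time.monotonic() - start), value, peak
        if i < n_samples - 1:
            time.sleep(interval_s)


if __name__ == "__main__":
    # flush=True writes each row immediately instead of waiting for a full buffer.
    print("Seconds\tDisk_Percent\tDisk_Percent_Peak", flush=True)
    for seconds, value, peak in sample_rows(n_samples=3):
        print(f"{seconds}\t{value:.2f}\t{peak:.2f}", flush=True)
```

A real monitoring script would loop until the task ends rather than for a fixed number of samples.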

malachig · 2022-10-07T15:03:17Z

To test this idea in its simplest form, I created an example monitor script and ran it on an active Google instance that was running a compute-intensive step.
https://github.com/griffithlab/cloud-workflows/blob/main/scripts/monitor.sh

I manually logged into the GCP instance using the Google console to test it.

To test on a cromwell run I am attempting the following:

I placed this script in our public google bucket: gs://griffith-lab-workflow-inputs/scripts/monitor.sh
I started a cromwell VM and edited the workflow options config file on this system: sudo vim /shared/cromwell/workflow_options.json. I added the following block to that (at the top level, not nested in another block):

  "monitoring_script": "gs://griffith-lab-workflow-inputs/scripts/monitor.sh"
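For anyone who would rather script this edit than open vim, something like the following should work. It is only a sketch: the helper name add_monitoring_script is made up here, and it assumes the options file is a single top-level JSON object that the current user can write to (on the Cromwell VM that likely means running with sudo).

```python
import json
from pathlib import Path

# Script location mentioned in this thread.
MONITORING_SCRIPT = "gs://griffith-lab-workflow-inputs/scripts/monitor.sh"


def add_monitoring_script(options_path, script_url=MONITORING_SCRIPT):
    """Set the top-level "monitoring_script" key and return the updated options."""
    path = Path(options_path)
    options = json.loads(path.read_text())
    # Top level, not nested in another block.
    options["monitoring_script"] = script_url
    path.write_text(json.dumps(options, indent=2) + "\n")
    return options
```

Usage would be add_monitoring_script("/shared/cromwell/workflow_options.json"). Existing keys are preserved, although the file is rewritten with 2-space indentation.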

According to the Cromwell docs, if you modify this conf file you do NOT need to restart Cromwell. These settings should take effect with the next workflow you run.
https://cromwell.readthedocs.io/en/stable/wf_options/Overview/

However, if you DID need to restart Cromwell, based on the startup script (https://github.com/griffithlab/cloud-workflows/blob/main/manual-workflows/server_startup.py) I think you could do: sudo systemctl restart cromwell

malachig · 2022-10-07T15:04:58Z

If my testing works as expected and we want to add this so it happens automatically, then I think it would be added here:
https://github.com/griffithlab/cloud-workflows/blob/3822d66e6a0423ade093f48f9c2535b07adfbb6a/manual-workflows/resources.sh#L135-L143

malachig · 2022-10-07T15:18:41Z

In my first test I looked in a gcs_localization.sh script for an individual task and I now see this:

# Localize singleton file 'gs://griffith-lab-workflow-inputs/scripts/monitor.sh' to '/cromwell_root/monitoring.sh'.
singleton_file_to_localize_573998f91cb96365bcb9696ac6baf714=(
  "griffith-lab"
  "3"
  "gs://griffith-lab-workflow-inputs/scripts/monitor.sh"
  "/cromwell_root/monitoring.sh"
)

localize_singleton_file "${singleton_file_to_localize_573998f91cb96365bcb9696ac6baf714[@]}"
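If you want to check this across many tasks without reading each gcs_localization.sh by hand, a small parser along these lines could help. It is a sketch that assumes every singleton localization is announced by a comment in exactly the format shown above. That format is inferred from this one example, not from any Cromwell specification, and find_localized_files is a made-up name.

```python
import re

# Matches: # Localize singleton file '<source>' to '<destination>'.
LOCALIZE_RE = re.compile(r"^# Localize singleton file '([^']+)' to '([^']+)'\.?\s*$")


def find_localized_files(script_text):
    """Return (source, destination) pairs found in localization comments."""
    pairs = []
    for line in script_text.splitlines():
        match = LOCALIZE_RE.match(line.strip())
        if match:
            pairs.append((match.group(1), match.group(2)))
    return pairs
```

A task picked up the monitoring option if one of the returned destinations is /cromwell_root/monitoring.sh.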

malachig · 2022-10-07T20:49:53Z

And I see output like this (saved in the bucket as: monitoring.log) in a step that completed very quickly:

Seconds	Memory_Percent	Memory_Percent_Peak	Memory_GB	Memory_GB_Peak	Disk_Percent	Disk_Percent_Peak	Disk_GB	Disk_GB_Peak	CPU_Percent	CPU_Percent_Peak
0	8.86	8.86	0.34	0.34	23.00	23.00	7.43	7.43	2.29	2.29
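Since the log is tab-separated with a header row, pulling the peak values out of a downloaded monitoring.log is straightforward. The sketch below assumes the column layout shown above, in particular that peak columns end in _Peak; the function name is made up. It returns None for an empty or header-only file, which can happen for very short tasks.

```python
import csv
import io


def summarize_monitoring_log(text):
    """Return the maximum of each *_Peak column, or None if there are no data rows."""
    rows = list(csv.DictReader(io.StringIO(text), delimiter="\t"))
    if not rows:
        return None
    peaks = {}
    for row in rows:
        for name, value in row.items():
            # Skip malformed rows with missing or extra fields.
            if name and name.endswith("_Peak") and value:
                peaks[name] = max(peaks.get(name, float("-inf")), float(value))
    return peaks
```

Taking the maximum over all rows, rather than reading only the last row, avoids assuming that the peak columns are running maxima.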

malachig · 2022-10-11T14:04:36Z

This seems to be working as expected. To activate monitoring one can simply add this to /shared/cromwell/workflow_options.json on the head Cromwell VM:

  "monitoring_script": "gs://griffith-lab-workflow-inputs/scripts/monitor.sh"

Results for each task appear in that task's output directory in the Google bucket, in a file named monitoring.log.

Layth17 mentioned this issue Nov 10, 2022

Add use of monitoring script to cromwell runs wustl-oncology/analysis-wdls#63

Open

Layth17 linked a pull request Nov 23, 2022 that will close this issue

Adding Monitoring Script Option #28

Open

1 task
