rocotohold/rocotorelease #25

Open
samtrahan opened this issue Jul 30, 2018 · 5 comments

Comments

@samtrahan
Contributor

Sometimes when running a workflow, one needs to temporarily halt execution of part of it. For example, if the archiving system is down, the archiving-related jobs need to be suspended while the other jobs continue. The idea is that the user would run something like this:

rocotohold -w workflow.xml -d workflow.db -c 201705141800: -t /archive/

Then Rocoto would not run those jobs until some command like this is run:

rocotorelease -w workflow.xml -d workflow.db -c 201705141800: -t /archive/

There are two obvious ways of implementing this internally:

  1. Store a list of the exact tasks and cycles that are held. This is infeasible for non-trivial workflows.

  2. Store the -c, -t, -m, and -a options used to specify the list. Give each "hold" a name so it can be released by name.

rocotohold -w workflow.xml -d workflow.db -c 201705141800: -t /archive/ -n hold_archive
rocotorelease -w workflow.xml -d workflow.db -n hold_archive

The second option is made trivial by the developments in the feature/principle-of-least-surprise branch, specifically the WorkflowMgr::WorkflowSelection and WorkflowSubset classes. One can easily assemble the -t, -m, -c, and -a options into simple lists for later use in "holds".

Internally, this would be stored in a new database table in the workflow.db file.
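
As a rough illustration of the second option, the sketch below stores each named hold as one row holding the selection options it was created with. It is written in Python with the standard sqlite3 module purely for illustration. It is not Rocoto code, and the table name `holds`, its columns, and the helper functions are all hypothetical.

```python
import sqlite3


def open_holds(path=":memory:"):
    """Open the database and create the hypothetical 'holds' table if missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS holds ("
        " name TEXT PRIMARY KEY,"   # value given with -n
        " cycles TEXT,"             # value given with -c, stored verbatim
        " tasks TEXT,"              # value given with -t, stored verbatim
        " metatasks TEXT,"          # value given with -m, stored verbatim
        " all_tasks INTEGER)"       # 1 if -a was given, else 0
    )
    return conn


def add_hold(conn, name, cycles=None, tasks=None, metatasks=None, all_tasks=False):
    """Record a named hold; re-using a name replaces the earlier hold."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO holds VALUES (?, ?, ?, ?, ?)",
            (name, cycles, tasks, metatasks, int(all_tasks)),
        )


def release_hold(conn, name):
    """Delete a hold by name; return True if a hold with that name existed."""
    with conn:
        cur = conn.execute("DELETE FROM holds WHERE name = ?", (name,))
    return cur.rowcount > 0


def list_holds(conn):
    """Return the names of all active holds, sorted alphabetically."""
    return [row[0] for row in conn.execute("SELECT name FROM holds ORDER BY name")]
```

Storing the options verbatim, rather than the expanded list of tasks and cycles, keeps the table at one row per hold regardless of workflow size, and an open-ended cycle range such as `201705141800:` keeps applying to cycles that do not exist yet.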

@christopherwharrop
Owner

Can you supply more than that one use case?

How many people are requesting this specific feature and for what purposes?

@RichHammett

RichHammett commented Jul 30, 2018 via email

@christopherwharrop
Owner

What ECFlow does isn't relevant here. I want to know why this feature is being requested, how many people will use it, and what they will use it for.

The reason I'm being a bit of a stickler here is that the whole point of a workflow management system is to remove the burden of managing the workflow from the user so that s/he can focus on other things. This feature, as requested, introduces direct interaction with the moment-to-moment operation of the workflow through complicated commands that add a lot for the user to keep track of. Yes, it's an optional feature, but it exposes users to more complexity. I don't like that. The more complicated we make Rocoto, the less usable it will be. It needs to be as simple and straightforward to use as possible. And I really have to insist that features are added for good reasons, and not just "because ECFlow does it."

@samtrahan
Contributor Author

samtrahan commented Jul 31, 2018

Chris,

We frequently have to suspend part of our workflow when part of the machine is broken. The fact of the matter is that sometimes the user does have to manage the workflow, especially when the machine is acting up. Rich does not know this because he has not run large-scale retrospectives and real-time parallels yet. Right now, it is nearly impossible to temporarily disable parts of a Rocoto workflow. This would give us an easy way to do so.

We use nasty workarounds right now, such as commenting out sections of the workflow, or using dependencies. That is problematic, and less powerful than the feature I propose. With a "rocotohold" we could hold certain groups of jobs for certain cycles, rather than holding certain jobs for all cycles. An example is holding a scrubbing job until some verification work is done. We could even hold a cycle, but let the rest of the cycles continue. This is needed sometimes when there are many potential problems with one cycle's output, but not in the parts that are used by the next cycle. (The forecast output is huge.)
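
To make the "certain jobs for certain cycles" idea concrete, here is a minimal Python sketch of how a stored hold could be checked against a task and cycle. It is illustrative only and is not Rocoto code. It assumes a cycle option is a single cycle or a range `start:end` in which either end may be omitted, and that a task option is either a `/regex/` or a comma-separated list of exact names. The function names and the dictionary layout are hypothetical.

```python
import re
from datetime import datetime

CYCLE_FORMAT = "%Y%m%d%H%M"  # e.g. 201705141800


def cycle_matches(spec, cycle):
    """True if `cycle` is selected by `spec` ('C', 'C1:C2', 'C1:' or ':C2')."""
    when = datetime.strptime(cycle, CYCLE_FORMAT)
    if ":" not in spec:
        return when == datetime.strptime(spec, CYCLE_FORMAT)
    start, end = spec.split(":", 1)
    if start and when < datetime.strptime(start, CYCLE_FORMAT):
        return False
    if end and when > datetime.strptime(end, CYCLE_FORMAT):
        return False
    return True


def task_matches(spec, task):
    """True if `task` is selected by `spec` ('/regex/' or 'name1,name2')."""
    if len(spec) >= 2 and spec.startswith("/") and spec.endswith("/"):
        return re.search(spec[1:-1], task) is not None
    return task in spec.split(",")


def is_held(hold, task, cycle):
    """True if the hold covers this task in this cycle."""
    return cycle_matches(hold["cycles"], cycle) and task_matches(hold["tasks"], task)


# Hold every task whose name contains "archive", from 2017-05-14 18:00 onward.
hold_archive = {"cycles": "201705141800:", "tasks": "/archive/"}
print(is_held(hold_archive, "archive_gfs", "201705150000"))
print(is_held(hold_archive, "forecast", "201705150000"))
```

Holding a single cycle while later cycles continue is then just a hold whose cycle option names that one cycle.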

Sincerely,
Sam Trahan

@samtrahan
Contributor Author

Chris,

You have not objected to this idea, so I'm going to implement something and let you look at it to decide further. It should be straightforward to implement it on top of feature/principle-of-least-surprise, which adds powerful workflow subsetting capability within Rocoto, and the ability to represent subset requests (-c, -t, -a, -m) as simple objects for later reuse. That is 90% of the implementation of a rocotohold/rocotorelease feature.


3 participants