Add option to perform a dryrun with some rocoto commands #114
Comments
Thanks @aerorahul for your report and initial work. The existing design makes a dry-run submission a little tricky. Two subtleties need to be accounted for: jobs are submitted from within a detached daemon process, and each job is added to the database before the submission attempt even occurs (so that submit failures, delays, etc. can be tracked). The submit is expected to return a tuple of the jobid and the output of the submit command. If the submission succeeds, the jobid will be a valid jobid, and the output will be the usual output of the command, which is parsed to retrieve the jobid. If it fails, the jobid will be nil, and the output will be the error message. It might be better to create a new method just for dry-run submissions and to add logic to boot, run, rewind, etc. to handle that based on whether the dry-run option is active. There is a lot of room for improvement in how all of this is designed and handled.
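To make the return contract described above concrete, here is a minimal sketch in Python. It is an illustration only and not Rocoto's actual code: `submit`, `dry_run_submit`, `parse_jobid`, and the `run_submit` callable are hypothetical names, Python's `None` stands in for nil, and the jobid parser simply takes the first run of digits in the output.

```python
import re
from typing import Callable, Optional, Tuple

# (jobid, output): jobid is None when no job was submitted.
SubmitResult = Tuple[Optional[str], str]


def parse_jobid(output: str) -> Optional[str]:
    """Hypothetical parser: take the first run of digits in the output."""
    match = re.search(r"\d+", output)
    return match.group(0) if match else None


def submit(script: str, run_submit: Callable[[str], Tuple[int, str]]) -> SubmitResult:
    """Submit a script; run_submit returns (exit_status, output)."""
    exit_status, output = run_submit(script)
    if exit_status != 0:
        # Failure: no jobid, and the output carries the error message.
        return None, output
    # Success: the jobid is parsed out of the usual command output.
    return parse_jobid(output), output


def dry_run_submit(script: str) -> SubmitResult:
    """Separate dry-run path (an assumption): never contacts the scheduler."""
    # No job exists, so there is no jobid; the "output" is the script itself.
    return None, script
```

Because a dry run and a failed submission both yield no jobid in this sketch, the caller would still need to know that dry-run mode is active in order to tell them apart.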
Thanks @christopherwharrop-noaa.
Let me think about it more deeply. There might be a simpler way that I'm just not thinking of. Of course, any user can already get the submit script using the
The approach we have been using/brainstorming is inelegant and extremely hacky.
As you note, it likely breaks the provenance of the rocoto db, and I am sure there are unintended consequences.
I think your request for the dry-run (or whatever we want to call it) feature, to get the script that Rocoto will submit for a particular task, is totally reasonable. I strongly suspect this is something that many others would find useful. One other thing, though, is that whatever is implemented has to work for PBSPro and any other supported batch system. Right now, those are, realistically, the only ones in use. What makes it weird is Rocoto's way of submitting the jobs asynchronously in a daemon spawned by the main process. That daemon server process often outlives the main rocotorun process, and it is the thing that builds the submit script. I think we just need to make all parts aware of whether a dry-run is active. Some of the plumbing that runs just before job submission attempts are made needs modification so that, in dry-run mode, it doesn't do things like store the submit attempt in the database.
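The idea of making the pre-submission plumbing dry-run aware could look roughly like the Python sketch below. It is a simplified illustration under stated assumptions, not Rocoto's implementation: `handle_task`, `build_script`, the `submit` callable, the dict standing in for the database, and the state names are all hypothetical, and the asynchronous daemon is left out entirely.

```python
from typing import Callable, Dict, Optional, Tuple

JobRecord = Dict[str, Optional[str]]


def handle_task(
    task: str,
    build_script: Callable[[str], str],
    submit: Callable[[str], Tuple[Optional[str], str]],
    database: Dict[str, JobRecord],
    dry_run: bool = False,
) -> str:
    """Build the submit script for a task and, unless dry_run, submit it."""
    script = build_script(task)
    if dry_run:
        # Dry run: hand back the script; no database record, no submission.
        return script
    # Normal run: record the attempt before submitting so that submit
    # failures and delays can be tracked.
    database[task] = {"state": "SUBMITTING", "jobid": None}
    jobid, output = submit(script)
    state = "QUEUED" if jobid else "SUBMIT_FAILED"
    database[task] = {"state": state, "jobid": jobid}
    return output
```

In this sketch the dry-run check sits before the database write, so the database is left untouched when the flag is set.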
Thanks for explaining the work involved.
@christopherwharrop-noaa In the branch mentioned above, I added the dryrun for the other batch systems in the same spirit as If you can help me with:
I would greatly appreciate your help.
The ability to perform a dryrun without executing the underlying rocoto command would be valuable. One use case would be running rocotorun with a dryrun option to obtain the batch card without actually submitting the job. This would let the user visually validate the batch card. An effort towards achieving this is made here.
Is this the right track?