Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to output a new config file #63

Merged
merged 11 commits into from
Sep 24, 2024
Merged

Conversation

KevinSayers
Copy link
Collaborator

Issue #, if available:

Description of changes:
DRAFT: Need to add CWL and WDL formatting still.
Adds --config=<path> which if specified will create a configuration file formatted correctly for the workflow's engine. The below for example would be correctly formatted for NF. This would enable users to directly utilize recommended settings from the runanalyzer.

withName: TEST {
    cpu = 2
    memory = 4
}

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Copy link
Contributor

@markjschreiber markjschreiber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some suggestions

@@ -24,6 +25,7 @@
-o, --out=<path> Write output to file
-P, --plot=<directory> Plot a run timeline to a directory
-H, --headroom=<float> Adds a fractional buffer to the size of recommended memory and CPU. Values must be between 0.0 and 1.0.
-c, --config=<path> Output a config file with recommended resources
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

specify that it is a Nextflow style config file

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or Nextflow / CWL

""")
out.write(task_string)
elif engine == 'WDL':
pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Raise an error saying we don't support WDL

wfid = res['workflow'].split('/')[-1]
engine = omics.get_workflow(id=wfid)['engine']
if res['type'] == 'task':
task_name = res['name'].split(" ")[0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this always work? Do we know that our engines can't produce a task name like "[my workflow task]" with spaces?

It might make this more future proof to have a constant for the task name split string because if we need to change it due to new engines, this line of code doesn't advertise it's intent. Alternatively, perhaps have a function to split the name, perhaps with an engine name as an argument if different engines do different things.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add a function that can parse the base task name and enforce the format of it.

@KevinSayers
Copy link
Collaborator Author

@markjschreiber I made this Nextflow only for now. I will add in CWL and WDL once I have spent more time working out how to handle scattered tasks for those.

@KevinSayers KevinSayers marked this pull request as ready for review September 18, 2024 03:43
Copy link
Contributor

@markjschreiber markjschreiber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Before merging, can you update the README with information about the --config option.

Other suggestions for improvements:
docopt has a nice mechanism for separating mutually exclusive tags (see my most recent PR and docopt documentation). If --config shouldn't be logically used with some other options then you can include a new usage line such as omics-run-analyzer <runId> --config=<path>. This can also simplify the logic for the main method as conflicting options can be rejected by the arg parser before they get to the main logic.

--write-config might be a better long name? --config might suggest you can read a config file to change the behavior of the run analyzer. Not sure?

Can you see if you can split the config logic into a separate python file. I have been trying to do this for new features to declutter the __main__.py which has gotten large and confusing.

Copy link
Contributor

@markjschreiber markjschreiber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@KevinSayers KevinSayers merged commit e2a8286 into awslabs:main Sep 24, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants