reana.yaml: load parameters from an external file #699

Open
mdonadoni opened this issue Mar 1, 2023 · 1 comment

mdonadoni commented Mar 1, 2023

Parameters of a workflow are specified in the inputs.parameters section of reana.yaml, like this:

inputs:
  parameters:
    myparam1: myvalue1
    myparam2: myvalue2

However, for CWL and Snakemake workflows, it is also possible to load parameters from an external file by providing the special input parameter (see the source code):

inputs:
  parameters:
    input: my-parameters.yaml

We should introduce a new property in reana.yaml to support loading parameters from a file for serial and Yadage workflows as well. One possible name for this property is parameter_file (or parameterfile):

inputs:
  parameter_file: my-parameters.yaml

If we want to support multiple parameter files at the same time, then we should use parameter_files (or parameterfiles) instead:

inputs:
  parameter_files:
    - my-parameters-first-stage.yaml
    - my-parameters-second-stage.yaml

In this second case, we also need to decide how to handle parameters defined in different files but having the same name.
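To make the name-clash question concrete, here is a minimal sketch that only reports which parameters are defined in more than one file, without choosing a resolution policy. It assumes the files have already been parsed into plain dictionaries; the function name find_conflicting_parameters and the parameter names events, seed and output are hypothetical and are not part of REANA.

```python
from collections import defaultdict


def find_conflicting_parameters(parameter_sets):
    """Return {parameter name: [files defining it]} for names found in several files.

    `parameter_sets` maps a file name to the parameters already parsed from it.
    """
    defined_in = defaultdict(list)
    for filename, params in parameter_sets.items():
        for name in params:
            defined_in[name].append(filename)
    # Keep only the names that appear in more than one file.
    return {name: files for name, files in defined_in.items() if len(files) > 1}


# Hypothetical contents of the two parameter files from the example above.
parameter_sets = {
    "my-parameters-first-stage.yaml": {"events": 1000, "seed": 42},
    "my-parameters-second-stage.yaml": {"seed": 7, "output": "plot.png"},
}
conflicts = find_conflicting_parameters(parameter_sets)
print(conflicts)
```

Whatever policy is chosen (reject, override, or namespace per file), a report like this would be the input to it.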

@giuseppe-steduto

Making this "parameter input file" clearer and better structured is definitely a nice and needed improvement.
In the rest of reana.yaml, we seem to mostly use snake_case (see kubernetes_memory_limit, compute_backends, ...), so I would prefer parameter_files.

I also think that supporting multiple parameter files at the same time is better. In this case, parameters defined later should simply override the ones defined earlier, and a warning should be issued about the override. This is a commonly followed approach when dealing with duplicate keys (see Oracle DB, for example), and users seem to complain about the absence of warnings rather than about being allowed to use duplicate keys (see helm lint, prometheus). Note that this is something we have to address anyway: the current code does not complain about duplicate keys within the same parameter file, but just overrides the first value with the second (because this is how yaml.load behaves).
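A minimal sketch of the "later definition wins, with a warning" policy could look like the following. It is an illustration under assumptions, not REANA code: merge_parameters is a hypothetical helper, the parameter names are made up, and the files are assumed to be already parsed into dictionaries and passed in the order they are listed under parameter_files.

```python
import warnings


def merge_parameters(parameter_sets):
    """Merge parameters in order; later definitions override earlier ones.

    `parameter_sets` is a list of (file name, parameters dict) pairs.
    A warning is issued every time an existing parameter is overridden.
    """
    merged = {}
    origin = {}  # remembers which file last defined each parameter
    for filename, params in parameter_sets:
        for name, value in params.items():
            if name in merged:
                warnings.warn(
                    f"parameter '{name}' from {origin[name]} "
                    f"is overridden by {filename}"
                )
            merged[name] = value
            origin[name] = filename
    return merged


# Hypothetical contents of two parameter files.
merged = merge_parameters(
    [
        ("my-parameters-first-stage.yaml", {"events": 1000, "seed": 42}),
        ("my-parameters-second-stage.yaml", {"seed": 7}),
    ]
)
print(merged)
```

This only covers clashes across files. Duplicate keys inside a single file are already collapsed by the time the YAML has been loaded into a dictionary, so detecting those would need a check at load time instead.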
