Read IO plugin configuration (openPMD plugin) from reading applications (TOML) #3717

Closed
franzpoeschel opened this issue Aug 11, 2021 · 1 comment

Comments

@franzpoeschel
Contributor

franzpoeschel commented Aug 11, 2021

Status Quo: The openPMD plugin is configured via command-line parameters of PIConGPU, including the period of activation and the written data.

Goal: Reading applications (data sinks, analyses, visualizations, post-processing) state their data requirements at the start of the simulation, i.e. which data they need and how often. PIConGPU collects these requirements and merges them into one instance of the openPMD plugin. The openPMD plugin produces data only if a data source (species_all, e_all, fields_all, E, B, ...) has been requested by at least one reader for the current step. In contrast to the current behavior, the written data sources may vary across iterations within the same instance of the openPMD plugin.

Suggested schema:
The openPMD plugin currently has the following command line parameters:

--openPMD.period                   // per data sink
--openPMD.source                   // per data sink
--openPMD.compression              // deprecated, use JSON
--openPMD.file                     // per instance of the plugin
--openPMD.ext                      // per instance of the plugin
--openPMD.infix                    // per instance of the plugin
--openPMD.json                     // per instance of the plugin
--openPMD.dataPreparationStrategy  // per instance of the plugin

As noted above, some of these parameters make sense per data sink, while others should be given only once per instance of the openPMD plugin to avoid conflicts. For example, if JSON is specified twice, should the configurations be merged, and what happens if some readers specify a JSON file? Likewise, --openPMD.file must be the same for one instance of the plugin.

I propose the following configuration:
PIConGPU

--openPMD.period dynamic // maybe leave this one out
--openPMD.name simData
--openPMD.json @path/to/config.json
--openPMD.toml "path/to/sink1.toml;path/to/sink2.toml"

# second instance of the openPMD plugin if people like that
--openPMD.period dynamic
--openPMD.name whyAreYouDoingThis
--openPMD.toml "i/want/to/have/an/instance/for/myself.toml"

Sink 1 would write the following content to path/to/sink1.toml:

[period]
200 = ["E", "B"]
400 = ["fields_all"]

Sink 2 would write the following content to path/to/sink2.toml:

[period]
100 = ["e_all", "i_all"]
300 = ["species_all"]

Sink 3 would write the following content to i/want/to/have/an/instance/for/myself.toml:

[period]
1 = ["species_all", "fields_all"] # sink 3 is a greedy one

Explanation
A sink is linked to an instance of the openPMD plugin via the TOML file:

  • The sink writes the TOML file
  • The openPMD plugin is told about all of the TOML files that it should read via a command line parameter

No configuration is duplicated.
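
Because the TOML files are the only link between a sink and the plugin, the plugin has to cope with files that do not exist yet when it starts. A minimal Python sketch of that hand-off, assuming the semicolon-separated list format shown for --openPMD.toml above. The function names, the polling interval and the timeout are hypothetical and not part of PIConGPU, which is written in C++:

```python
import time
from pathlib import Path


def split_toml_paths(value):
    """Split a semicolon-separated --openPMD.toml value into paths."""
    return [Path(part.strip()) for part in value.split(";") if part.strip()]


def wait_for_files(paths, poll_seconds=0.5, timeout_seconds=None):
    """Block until every path exists as a regular file.

    The timeout is an assumption of this sketch; the proposal itself
    only says that the plugin waits until all files are present.
    """
    deadline = None
    if timeout_seconds is not None:
        deadline = time.monotonic() + timeout_seconds
    while True:
        missing = [p for p in paths if not p.is_file()]
        if not missing:
            return
        if deadline is not None and time.monotonic() >= deadline:
            names = ", ".join(str(p) for p in missing)
            raise TimeoutError("still waiting for: " + names)
        time.sleep(poll_seconds)
```

Polling the file system is the simplest option; a real implementation might also need to guard against reading a file that a sink has created but not finished writing.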

Upon startup, the openPMD plugin will wait until all TOML files are present in the file system and then read its data requirements from them. All other configuration stays the same as it is now. For each data requirement (period, sources), the openPMD plugin will output all of the configured sources in every step divisible by period. If multiple requirements apply to a single simulation step, the data is merged into a single output.
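
The merge rule can be illustrated with a short Python sketch. It assumes the TOML files have already been parsed into dictionaries (period keys arrive as strings, as a TOML parser would deliver them) and that "merged" means a plain union of the source names. The function name is hypothetical and this is not the actual plugin code:

```python
def sources_for_step(requirements, step):
    """Return the sorted union of all sources requested for `step`.

    `requirements` holds one parsed TOML document per sink, each of
    the form {"period": {"200": ["E", "B"], ...}}.
    """
    requested = set()
    for sink in requirements:
        for period_key, sources in sink.get("period", {}).items():
            period = int(period_key)  # TOML keys are strings
            if period <= 0:
                raise ValueError("period must be positive: " + period_key)
            if step % period == 0:
                requested.update(sources)
    return sorted(requested)


# Sink 1 and sink 2 from the example above, as parsed dictionaries.
sink1 = {"period": {"200": ["E", "B"], "400": ["fields_all"]}}
sink2 = {"period": {"100": ["e_all", "i_all"], "300": ["species_all"]}}

print(sources_for_step([sink1, sink2], 400))
# ['B', 'E', 'e_all', 'fields_all', 'i_all']
print(sources_for_step([sink1, sink2], 50))
# []
```

An empty result corresponds to a step in which the plugin produces no output. The sketch does not deduplicate overlapping sources such as E and fields_all; how such overlaps should be resolved is not specified in the proposal.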

@sbastrakov
Member

Implemented by #3720.
