Create prototype for YAML to python translation #434
Comments
@ashiklom can I ask you to give a simple example of how this would replace a YAML? Can you choose a known YAML from the SWELL configuration and illustrate (cryptically is OK) how this would work? Looking at the sites you point to above, I am under the impression that this is more complicated than managing the YAMLs directly, but I am sure I am missing something. Sorry.
The name here is somewhat misleading. I'm not demanding that we replace all (or even any) of our YAMLs with pure Python implementations. I'm just asking for some basic offline prototypes so we can compare the designs and see if there are any obvious advantages. That said, I'll try to come up with some basic examples to give you a sense of how this might look.
@rtodling Definitely needs more careful thought, but here are some quick sketches, based on the two YAMLs below:

# geos_ocean.yaml
jedi_interface: soca
total_processors: {{total_processors}}
executables:
  hofx3D: soca_hofx3d.x
  hofx4D: soca_hofx.x
  variational3D: soca_var.x
  variational4D: soca_var.x
  variational4DEnsVar: soca_var.x
  explicit_diffusion: soca_error_covariance_toolbox.x
variables:
  hocn: h
  socn: Salt
  ssh: ave_ssh
  tocn: Temp

# geos_marine.yaml
jedi_interface: soca
total_processors: {{total_processors}}
executables:
  hofx3D: soca_hofx3d.x
  hofx4D: soca_hofx.x
  variational3D: soca_var.x
  variational4D: soca_var.x
  variational4DEnsVar: soca_var.x
  explicit_diffusion: soca_error_covariance_toolbox.x
  convert_state_soca2cice: soca_convertstate.x
variables:
  hocn: h
  socn: Salt
  ssh: ave_ssh
  tocn: Temp

Now, a sketch of a pure Python representation, with (basic!) usage of dataclasses.

from dataclasses import dataclass

# Define the general structure.
# Note: A nice thing here is we can use type hinting to precisely define what
# kinds of values are allowed in the config.
@dataclass
class JediInterface:
    jedi_interface: str
    executables: dict[str, str]
    variables: dict[str, str]
    # Include a default value for this field, so we don't always have to set it
    total_processors: int = 1

# Now (perhaps in a separate file), you define specific instances of that structure.
swell_config = get_swell_config(...)
geos_ocean_interface = JediInterface(
    jedi_interface = "soca",
    # Can set things directly via variables
    total_processors = swell_config.total_processors,
    executables = {
        "hofx3D": "soca_hofx3d.x",
        "hofx4D": "soca_hofx.x",
        "variational3D": "soca_var.x",
        "variational4D": "soca_var.x",
        "variational4DEnsVar": "soca_var.x",
        "explicit_diffusion": "soca_error_covariance_toolbox.x"
    },
    variables = {
        "hocn": "h",
        "socn": "Salt",
        "ssh": "ave_ssh",
        "tocn": "Temp"
    }
)

# We can dynamically update instances of the classes based on conditions.
# Maybe a bad example, but, if doing tier 1 tests, reduce the complexity...
if swell_config.is_tier1_test:
    # Only use 1 processor
    geos_ocean_interface.total_processors = 1
    # ...and only consider two variables. Note that the lookup is by config
    # key ("tocn", "ssh"), not by model variable name ("Temp", "ave_ssh").
    geos_ocean_interface.variables = {
        key: geos_ocean_interface.variables[key]
        for key in ("tocn", "ssh")
    }

# Share information between interfaces. Less typing, less stuff to update, and
# clearer relationships between different interfaces.
ocean_vars = {
    "hocn": "h",
    "socn": "Salt",
    "ssh": "ave_ssh",
    "tocn": "Temp"
}
common_execs = {
    "hofx3D": "soca_hofx3d.x",
    "hofx4D": "soca_hofx.x",
    "variational3D": "soca_var.x",
    "variational4D": "soca_var.x",
    "variational4DEnsVar": "soca_var.x",
    "explicit_diffusion": "soca_error_covariance_toolbox.x"
}
geos_ocean_interface = JediInterface(
    jedi_interface = "soca",
    # Some dynamically-computed value, just to show off...
    # Here, cap the number of processors at 10
    total_processors = min(swell_config.total_processors, 10),
    executables = common_execs,
    variables = ocean_vars
)
geos_marine_interface = JediInterface(
    jedi_interface = "soca",
    # Don't cap this one...
    total_processors = swell_config.total_processors,
    executables = {
        # Same as geos_ocean...
        **common_execs,
        # ...except also add one more
        "convert_state_soca2cice": "soca_convertstate.x"
    },
    # Same as geos_ocean
    variables = ocean_vars
)

You can also get fancier (more precise) with your class definition if you want to, e.g., restrict specific fields to specific values. For example, the apparently Pythonic thing to do (which, I admit, still looks a bit ugly and verbose to me, but a lot of serious Python people recommend it!) is to use enums instead of strings wherever a parameter can take on one of a few specific values:

from dataclasses import dataclass
# StrEnum requires Python 3.11 or newer.
from enum import StrEnum

# Specify that there are only 2 valid types of JEDI interface...
class JediInterfaceType(StrEnum):
    SOCA = "soca"
    FV3 = "fv3-jedi"

# ...and only 8 possible keys for your executable.
class ExecutableKey(StrEnum):
    HOFX3D = "hofx3D"
    HOFX4D = "hofx4D"
    VAR3D = "variational3D"
    VAR4D = "variational4D"
    VAR4DENS = "variational4DEnsVar"
    EXPLICIT_DIFFUSION = "explicit_diffusion"
    LOCAL_ENS_DA = "localensembleda"
    ENS_VAR = "ensemble_variance"

# Now, restrict the interface and executable keys in a JediInterface to *only*
# the values defined above.
@dataclass
class JediInterface:
    jedi_interface: JediInterfaceType
    executables: dict[ExecutableKey, str]
    variables: dict[str, str]
    total_processors: int = 1

# Define an instance. This is code that a type checker will pass without errors.
valid_interface = JediInterface(
    jedi_interface = JediInterfaceType.SOCA,
    executables = {ExecutableKey.HOFX3D: "soca_hofx3d.x"},
    variables = {"hocn": "h"}
)

# NOTE: This will run, but a type checker will report argument type errors for
# jedi_interface and executables.
bad_interface = JediInterface(
    jedi_interface = "fv4",
    executables = {"hofx5D": "no_such_thing.x"},
    variables = {"hocn": "h"}
)

Python itself will not enforce dataclass types at runtime, so this would rely on a separate static type checker. Pydantic (linked in the issue description) data classes look very similar, except that they actually will throw meaningful runtime errors if you do the wrong thing.
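
To make the runtime-enforcement point concrete without pulling in Pydantic, here is a rough stdlib-only sketch that validates in `__post_init__`. The class name `CheckedJediInterface` and the specific checks are my own illustration, not anything that exists in SWELL, and I'm using `class ...(str, Enum)` rather than `StrEnum` only so the snippet also runs on Python versions before 3.11. Pydantic would do this kind of checking (and much more) for you.

```python
from dataclasses import dataclass
from enum import Enum


class JediInterfaceType(str, Enum):
    # Allowed values, mirroring the enum sketch in this thread.
    SOCA = "soca"
    FV3 = "fv3-jedi"


@dataclass
class CheckedJediInterface:
    # Hypothetical name; a hand-rolled stand-in for what Pydantic automates.
    jedi_interface: JediInterfaceType
    executables: dict[str, str]
    variables: dict[str, str]
    total_processors: int = 1

    def __post_init__(self):
        # Coerce plain strings to the enum. Enum lookup by value raises
        # ValueError for anything not listed above.
        self.jedi_interface = JediInterfaceType(self.jedi_interface)
        if not isinstance(self.total_processors, int) or self.total_processors < 1:
            raise ValueError(
                f"total_processors must be a positive int, got {self.total_processors!r}"
            )


# A plain string that matches an enum value is accepted and converted.
ok = CheckedJediInterface("soca", {"hofx3D": "soca_hofx3d.x"}, {"hocn": "h"})

# An unknown interface name now fails when the object is constructed.
try:
    CheckedJediInterface("fv4", {}, {})
except ValueError as err:
    error_message = str(err)
```

The trade-off is that every check has to be written by hand, which is probably the main argument for looking at Pydantic if we want runtime errors.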
Making a note to create prototype Python class(es) to replace configuration YAML files.
Recommendations:
dataclasses https://docs.python.org/3/library/dataclasses.html
pydantic data model https://docs.pydantic.dev/latest/
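
As a starting point for the prototype, one possible shape is sketched below using only the standard library. It assumes the YAML has already been parsed into a plain dict (the `parsed_yaml` literal is a hand-written stand-in with made-up values, not a real SWELL file), builds a dataclass from it, and converts it back with `dataclasses.asdict` so the two representations can be compared. The field names follow the interface YAMLs discussed in this issue; everything else is an assumption.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class JediInterface:
    jedi_interface: str
    executables: dict[str, str] = field(default_factory=dict)
    variables: dict[str, str] = field(default_factory=dict)
    total_processors: int = 1


# Stand-in for the dict a YAML parser would return (illustrative values only).
parsed_yaml = {
    "jedi_interface": "soca",
    "total_processors": 4,
    "executables": {"hofx3D": "soca_hofx3d.x"},
    "variables": {"tocn": "Temp"},
}

# Keys in the dict must match the dataclass field names exactly.
interface = JediInterface(**parsed_yaml)

# Convert back to a plain dict, e.g. to dump as YAML or diff against the original.
round_tripped = asdict(interface)

# A misspelled key is rejected at construction time instead of being silently kept.
try:
    JediInterface(jedi_interface="soca", executable={})
    typo_rejected = False
except TypeError:
    typo_rejected = True
```

Rejecting unknown keys is one small advantage over loading the YAML into an untyped dict, where a typo like `executable` would go unnoticed until something tried to read `executables`.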