Initial run attempting to work out of temporary worker directory instead of workdir provided in calibration config file - Causing failed execution. #85

Open
Ben-Choat opened this issue Dec 8, 2023 · 1 comment

Ben-Choat commented Dec 8, 2023

Short description explaining the high-level reason for the new issue.

Current behavior

On executing ngen-cal, I received the traceback below, which ends with an error indicating that the pandas merge function was applied to an object of NoneType. The error was raised in cal/search.py, line 25, in _objective_func, when the following was executed: pd.merge(simulated_hydrograph, observed_hydrograph, left_index=True, right_index=True).

Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/west/git_repositories/ngen_20231127_calib/ngen/venv/lib/python3.10/site-packages/ngen/cal/__main__.py", line 87, in <module>
    main(general, conf['model'])
  File "/home/west/git_repositories/ngen_20231127_calib/ngen/venv/lib/python3.10/site-packages/ngen/cal/__main__.py", line 63, in main
    func(start_iteration, general.iterations, agent)
  File "/home/west/git_repositories/ngen_20231127_calib/ngen/venv/lib/python3.10/site-packages/ngen/cal/search.py", line 190, in dds_set
    _evaluate(0, calibration_set, info=True)
  File "/home/west/git_repositories/ngen_20231127_calib/ngen/venv/lib/python3.10/site-packages/ngen/cal/search.py", line 56, in _evaluate
    score = _objective_func(calibration_object.output, calibration_object.observed, calibration_object.objective, calibration_object.evaluation_range)
  File "/home/west/git_repositories/ngen_20231127_calib/ngen/venv/lib/python3.10/site-packages/ngen/cal/search.py", line 25, in _objective_func
    df = pd.merge(simulated_hydrograph, observed_hydrograph, left_index=True, right_index=True)
  File "/home/west/git_repositories/ngen_20231127_calib/ngen/venv/lib/python3.10/site-packages/pandas/core/reshape/merge.py", line 74, in merge
    op = _MergeOperation(
  File "/home/west/git_repositories/ngen_20231127_calib/ngen/venv/lib/python3.10/site-packages/pandas/core/reshape/merge.py", line 593, in __init__
    _left = _validate_operand(left)
  File "/home/west/git_repositories/ngen_20231127_calib/ngen/venv/lib/python3.10/site-packages/pandas/core/reshape/merge.py", line 2066, in _validate_operand
    raise TypeError(
TypeError: Can only merge Series or DataFrame objects, a <class 'NoneType'> was passed
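
For anyone who wants to see the failure in isolation, here is a minimal sketch that reproduces the same TypeError by handing None to pd.merge as the left operand. The column names, the dates, and the merge_hydrographs helper are made up for illustration and are not part of ngen-cal; the guard only shows one way the missing simulated output could be reported earlier.

```python
import pandas as pd

# Illustrative data only; the real hydrographs come from ngen output and observations.
index = pd.date_range("2023-12-08", periods=3, freq="D")
observed = pd.DataFrame({"obs_flow": [1.0, 2.0, 3.0]}, index=index)
simulated = None  # stands in for a simulated hydrograph that was never loaded

# Same call shape as in _objective_func; pandas rejects the None operand.
try:
    pd.merge(simulated, observed, left_index=True, right_index=True)
    merge_error = None
except TypeError as err:
    merge_error = err

print(type(merge_error).__name__, merge_error)


def merge_hydrographs(simulated_hydrograph, observed_hydrograph):
    """Hypothetical guard (not ngen-cal code): fail early with a clearer message."""
    if simulated_hydrograph is None:
        raise ValueError("No simulated output was found; check the working directory.")
    return pd.merge(simulated_hydrograph, observed_hydrograph,
                    left_index=True, right_index=True)
```

With a guard like this, the message would point at the missing output rather than at the merge call.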

Expected behavior

The code runs, producing an automated approach to calibration.
On initial run, the working directory is workdir as defined in the calibration configuration file.

Steps to replicate behavior (include URLs)

  1. In Ubuntu 22.04

  2. Use build_ngen_calib.sh in the attached zip folder to build ngen and set up ngen-cal. You may wish to edit the file to specify where ngen is built, for example.

  3. Create a symlink in the attached folder to the ngen folder after it is built.

  4. Run ngen-cal with python -m ngen.cal calib_config_CAMELS_CFE_Calib_Sep_2.yaml

This should run, but let me know if it doesn't.

Proposed solution

After scouring the code, I found that JobMeta() in cal/meta.py takes arguments for both parent_workdir and workdir:
def __init__(self, name: str, parent_workdir: Path, workdir: Path=None, log=False):

But when JobMeta was called from the Agent() class, self._job was None, which triggered the following call to JobMeta with a value provided only for parent_workdir:
https://github.com/NOAA-OWP/ngen-cal/blob/master/python/ngen_cal/src/ngen/cal/agent.py#L80
self._job = JobMeta(model_conf['type'], workdir, log=log)

So the configured workdir was being passed to JobMeta() as parent_workdir, while JobMeta's own workdir parameter kept its default value of None, which caused the xxx_worker directory to become the main working directory.
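
The positional binding can be checked without running ngen-cal at all. The sketch below uses a stand-in function, job_meta_init, that only copies the parameter list quoted above (it is not the real JobMeta), and asks inspect.signature how the arguments of the existing call would bind. The path is a placeholder.

```python
import inspect
from pathlib import Path


def job_meta_init(name: str, parent_workdir: Path, workdir: Path = None, log=False):
    """Hypothetical stand-in that copies only the JobMeta.__init__ parameter list."""


configured_workdir = Path("/path/to/configured/workdir")  # placeholder path

# Mirror the existing call: JobMeta(model_conf['type'], workdir, log=log)
bound = inspect.signature(job_meta_init).bind("ngen", configured_workdir, log=False)
bound.apply_defaults()

# Show which parameter each argument ended up bound to.
for parameter, value in bound.arguments.items():
    print(f"{parameter} = {value!r}")
```

The second positional argument lands on parent_workdir, and workdir falls back to its default of None.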

After passing workdir twice to JobMeta, the calibration seems to run as expected:
self._job = JobMeta(model_conf['type'], workdir, workdir, log=log)
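
To illustrate the difference between the two calls, here is a rough sketch. JobMetaSketch is a hypothetical stand-in, and its fallback of creating a temporary "_worker" directory under parent_workdir when workdir is None is an assumption based on the behavior described in this issue, not a copy of the code in cal/meta.py.

```python
from pathlib import Path
from tempfile import TemporaryDirectory, mkdtemp
from typing import Optional


class JobMetaSketch:
    """Hypothetical stand-in for JobMeta; only the workdir selection is modelled."""

    def __init__(self, name: str, parent_workdir: Path,
                 workdir: Optional[Path] = None, log: bool = False):
        if workdir is None:
            # Assumed fallback: a fresh <name>_..._worker directory under parent_workdir.
            self.workdir = Path(mkdtemp(dir=parent_workdir, prefix=f"{name}_", suffix="_worker"))
        else:
            self.workdir = workdir


with TemporaryDirectory() as tmp:
    configured = Path(tmp)  # plays the role of workdir from the calibration config

    # Existing call: only parent_workdir is supplied.
    job_before = JobMetaSketch("ngen", configured, log=False)
    # Proposed call: workdir is supplied for both parameters.
    job_after = JobMetaSketch("ngen", configured, configured, log=False)

    before_is_configured = job_before.workdir == configured
    before_parent_is_configured = job_before.workdir.parent == configured
    after_is_configured = job_after.workdir == configured

print(before_is_configured, before_parent_is_configured, after_is_configured)
```

Writing the call with keywords, for example JobMeta(model_conf['type'], parent_workdir=workdir, workdir=workdir, log=log), would make the intent easier to read, though whether passing the same directory for both is the intended semantics is a question for the maintainers.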

Screenshots

RecreateIssue.zip

Ben-Choat added a commit to Ben-Choat/ngen-cal that referenced this issue Dec 18, 2023
Added a second call to workdir in JobMeta under if self._job is None in class Agent()
@hellkite500
Member

Forgot this issue was referenced in #88 when I wrote my comment there. I need to dive a little deeper into the intended semantics here and make sure there isn't a deeper bug somewhere based on your issue.
