bugfix/minor_updates #115

Merged: kels271828 merged 35 commits into release/1.0 from bugfix/minor_updates on Dec 9, 2024

kels271828
Member

@kels271828 kels271828 commented Dec 3, 2024

DESCRIPTION
Updates needed to get the codem pipeline running.

FIXED

  • Passed the config path instead of the config object in evaluate_jobmon.py
  • Commented out code in data.py that still needs to be updated for the new DataInterface class
  • Updated rover_stage.py and spxmod_stage.py for the new DataInterface class
  • Added ability to dump pandas data frames in interface.py
  • Added mkdir statement to DataIO.dump()
  • Updated logic for main.load_stage() when a custom stage has the same name as a built-in stage
  • Changed subsets.create_subsets() back to pandas and added sorting for consistent subset_id assignment
  • Fixed some linting errors, mypy errors, and tests (but not all!)
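
For reviewers, the mkdir change to DataIO.dump() can be pictured with the sketch below. It is only an illustration: the `dump` function, its signature, and the JSON format are assumptions, not the actual DataIO code. The point is that the parent directory is created before writing, so dumping to a path in a new directory does not fail.

```python
import json
from pathlib import Path
from typing import Any


def dump(obj: Any, path: Path) -> None:
    """Write obj as JSON, creating missing parent directories first.

    Hypothetical stand-in for DataIO.dump(); the real method may differ.
    """
    path = Path(path)
    # Create the parent directory tree if needed; no error if it exists.
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w") as f:
        json.dump(obj, f)
```

Calling `dump({"x": 1}, Path("results/stage/out.json"))` would then succeed even when `results/stage` does not exist yet.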

ADDED

  • Added ability to specify a directory as stage input/output in io/base.py and stage/base.py
  • Added notes on dataif syntax to the docstrings in stage/base.py
  • Added args to ModelStage.get_stage_subset() to make it more general
  • Added "name=" to pipeline and stage __repr__ methods
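
As a rough picture of the `__repr__` change, here is a minimal sketch. The class bodies are hypothetical and the exact output format in the package may differ; the only thing taken from this PR is that the repr now includes "name=".

```python
class Stage:
    """Hypothetical minimal stage; only the name attribute is modeled."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __repr__(self) -> str:
        # Include "name=" so the printed object reads like a constructor call.
        return f"{type(self).__name__}(name={self.name!r})"


class Pipeline(Stage):
    """Hypothetical pipeline; reuses the same repr pattern for brevity."""
```

With this sketch, `repr(Stage("rover"))` gives `Stage(name='rover')`.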

REMOVED

  • Removed PreprocessingStage and PreprocessingConfig (various files)

CHANGED

  • Updated some variable names for clarity in evaluate_jobmon.py
  • Changed the stages arg in evaluate_jobmon() and evaluate_local() from list to set for consistency; added a stages arg to Pipeline.get_execution_order() and moved the call to that function from Pipeline.evaluate() to evaluate_jobmon() and evaluate_local()
  • Renamed the stage_name arg to stages in main.py
  • Moved the error check for calling the collect method on a pipeline from main.py to pipeline.py; added an error check to stage/base.py for calling the collect method on a Stage instance (as opposed to ModelStage) or with the Jobmon backend
  • Moved Pipeline.check_upstream_output_exists() to Input.check_missing(); added calls to Input.check_missing() in Pipeline.evaluate() and Stage.evaluate()
  • Added full paths to Stage.input and Stage.output items in Stage.set_dataif
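
To illustrate the idea behind passing a set of stages to Pipeline.get_execution_order(), here is a sketch under stated assumptions. It is a standalone function rather than the actual method, the `dependencies` mapping (stage name to the set of its upstream stage names) is hypothetical, and it uses the standard library's graphlib, which the package may or may not use.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Optional, Set


def get_execution_order(
    dependencies: Dict[str, Set[str]],
    stages: Optional[Set[str]] = None,
) -> List[str]:
    """Return stage names in dependency order, optionally limited to stages.

    dependencies maps each stage name to the names of its upstream stages.
    """
    if stages is not None:
        unknown = stages - set(dependencies)
        if unknown:
            raise ValueError(f"Unknown stages: {sorted(unknown)}")
    # static_order() yields every stage after all of its upstream stages.
    order = list(TopologicalSorter(dependencies).static_order())
    if stages is None:
        return order
    # Keep the full pipeline's ordering, but only for the requested stages.
    return [name for name in order if name in stages]
```

Because the requested stages arrive as an unordered set, the ordering comes entirely from the dependency graph, so callers such as evaluate_local() do not need to pass stages in any particular order. The stage names used to try this out (for example "rover" and "spxmod") are illustrative only.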

TODO

  • Not all tests are passing yet; fix the remaining ones in a separate PR?

kels271828 and others added 30 commits November 22, 2024 08:10
…e customized model higher priority compare to buildin stage
@kels271828 kels271828 marked this pull request as ready for review December 6, 2024 16:12
@kels271828
Member Author

Some of these changes might need to be revisited in separate PRs. Will create tickets in JIRA.

Member

@zhengp0 zhengp0 left a comment

The changes look good to me! Thanks @kels271828! Indeed we need to fix the tests soon.

@kels271828 kels271828 merged commit aa648a9 into release/1.0 Dec 9, 2024
3 of 9 checks passed
@kels271828 kels271828 deleted the bugfix/minor_updates branch December 9, 2024 15:30
2 participants