Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fully implements db-centric rfc_max_forecast #621

Merged
merged 3 commits into from
Jan 31, 2024

Conversation

shawncrawley
Copy link
Collaborator

This PR, if merged, will replace the existing "rfc_max_forecast" service with a new version of the service whose backend data is now produced by direct access to the WRDS RFC Forecast database as opposed to the WRDS API. As a result:

  • The every_five_minute EventBridge rule that triggered the viz-wrds-api-handler Lambda has been modified to instead trigger the initialize-viz-pipeline Lambda directly. The viz-wrds-api-handler code is now deprecated and can be eventually removed.
    • The initialize-viz-pipeline Lambda code had to be modified to allow the "rfc" pipeline to be kicked off without referencing any input files - as there are none. This allows it to act on more of a "cron" basis and less of a "file-driven-event" basis.
  • The new products/rfc/rfc_max_forecast.sql query relies upon a few new database structures:
    • Two new ENUM types - flood_status and forecast_ts - are used for assigning trend and prioritizing "duplicate" forecasts. These are now created in the RDS Bastion postgresql_setup.sh.tftpl script.
    • Two new database views - rnr.stage_thresholds and rnr.flow_thresholds - reorganize data in the external/foreign WRDS Ingest DB threshold table (external.threshold) for more efficient use here and eventually in other places (e.g. RnR). The SQL for these views was committed for the record in the Core/LAMBDA/rnr_functions/rnr_domain_generator/sql/dba_stuff.sql file, but I believe this view should get copied over with the dump of the RnR schema on deployments and thus should not need to be recreated manually.
  • Forecasts that are distributed as flow-only (i.e. have no associated rating curve to produce stage) are now also included as a value-added win (addressing issue Add Flow-Only Gauges to rfc_max_stage #312). As a result of this:
    • The DB table produced by rfc_max_forecast.sql has new/modified column names, replacing every occurrence of "stage" with "value" (i.e. max_stage_timestamp to max_value_timestamp).
      • The rfc_max_forecast.mapx file was thus also modified to replace every occurrence of "stage" with "value" in both the "fieldName" and "alias" fields (i.e. max_stage_timestamp to max_value_timestamp and "Forecast Min Stage Timestamp" to "Forecast Min Value Timestamp")

This work will also allow the Replace and Route to be somewhat redesigned to be completely in-sync with this new rfc_max_forecast service - likely using the underlying publish.rfc_max_forecast table as its starting point.

Copy link
Contributor

@TylerSchrag-NOAA TylerSchrag-NOAA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been a long time coming, and I think a very exciting update to the pipeline. Much more transparent!

I'm a little surprised that the query takes as long to run as it does... but I'm guessing the forecasts table must be pretty large? In any case, this is soooo much better than having the wrds api handler (I'd just go ahead and remove that from the code in this PR as well).

I've left a few comments throughout. Way to go!

@nickchadwick-noaa nickchadwick-noaa merged commit 97e58a2 into ti Jan 31, 2024
1 check passed
@nickchadwick-noaa nickchadwick-noaa deleted the db-direct-rfc-max-forecast branch January 31, 2024 14:49
shawncrawley added a commit that referenced this pull request Jan 31, 2024
This change had been left out of my #621 PR on accident. This has already been hard-coded directly on TI to confirm it corrects my issue (so no need for an immediate redeploy). Everything is now working as expected.
nickchadwick-noaa pushed a commit that referenced this pull request Feb 16, 2024
This change had been left out of my #621 PR on accident. This has
already been hard-coded directly on TI to confirm it corrects my issue
(so no need for an immediate redeploy). Everything is now working as
expected.
nickchadwick-noaa pushed a commit that referenced this pull request Feb 21, 2024
This PR, if merged, will replace the existing "rfc_max_forecast" service
with a new version of the service whose backend data is now produced by
direct access to the WRDS RFC Forecast database as opposed to the WRDS
API. As a result:

* The `every_five_minute` EventBridge rule that triggered the
`viz-wrds-api-handler` Lambda has been modified to instead trigger the
`initialize-viz-pipeline` Lambda directly. The viz-wrds-api-handler code
is now deprecated and can be eventually removed.
* The `initialize-viz-pipeline` Lambda code had to be modified to allow
the "rfc" pipeline to be kicked off without referencing any input files
- as there are none. This allows it to act on more of a "cron" basis and
less of a "file-driven-event" basis.
* The new `products/rfc/rfc_max_forecast.sql` query relies upon a few
new database structures:
* Two new ENUM types - `flood_status` and `forecast_ts` - are used for
assigning trend and prioritizing "duplicate" forecasts. These are now
created in the RDS Bastion `postgresql_setup.sh.tftpl` script.
* Two new database views - `rnr.stage_thresholds` and
`rnr.flow_thresholds` - reorganize data in the external/foreign WRDS
Ingest DB threshold table (external.threshold) for more efficient use
here and eventually in other places (e.g. RnR). The SQL for these views
was committed for the record in the
`Core/LAMBDA/rnr_functions/rnr_domain_generator/sql/dba_stuff.sql` file,
but I believe this view should get copied over with the dump of the RnR
schema on deployments and thus should not need to be recreated manually.
* Forecasts that are distributed as flow-only (i.e. have no associated
rating curve to produce stage) are now also included as a value-added
win (addressing issue #312). As a result of this:
* The DB table produced by `rfc_max_forecast.sql` has new/modified
column names, replacing every occurrence of "stage" with "value" (i.e.
`max_stage_timestamp` to `max_value_timestamp`).
* The `rfc_max_forecast.mapx` file was thus also modified to replace
every occurrence of "stage" with "value" in both the "fieldName" and
"alias" fields (i.e. `max_stage_timestamp` to `max_value_timestamp` and
"Forecast Min Stage Timestamp" to "Forecast Min Value Timestamp")

This work will also allow the Replace and Route to be somewhat
redesigned to be completely in-sync with this new rfc_max_forecast
service - likely using the underlying `publish.rfc_max_forecast` table
as its starting point.
nickchadwick-noaa pushed a commit that referenced this pull request Feb 21, 2024
This change had been left out of my #621 PR on accident. This has
already been hard-coded directly on TI to confirm it corrects my issue
(so no need for an immediate redeploy). Everything is now working as
expected.
@nickchadwick-noaa nickchadwick-noaa added this to the V2.1.5 milestone Feb 23, 2024
@nickchadwick-noaa nickchadwick-noaa linked an issue Feb 23, 2024 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants