covid: evaluate current season #168

Merged: 20 commits merged into main from evaluate_current_season on Feb 21, 2025

Conversation

Contributor

@dshemetov dshemetov commented Feb 6, 2025

Backtesting our covid forecasters, re-organizing pipelines.

  • backtest on the current season
  • try to fix the breaking CI (stringfish compilation, missing header issues?)
    • updated our renv environment in hopes the problem just goes away
    • updated all packages, updated renv, snapshotted R at 4.4.1, and pruned unused packages (use renv::restore(clean = TRUE) to resync; see the sketch below)
    • targets 1.8.0 -> 1.10.1: there seem to have been a lot of fixes, including pipeline speedups for large pipelines
    • crew 0.10 -> 1.0.0: a number of changes, hopefully positive
      • launch_max was deprecated on 2024-11-04 (version 0.10.1.9000) with no alternative, so I'm going to remove our launch_max = 10000L setting; it wasn't clear it was helping us anyway
    • scoringutils -> 2.0: a number of breaking changes; I've updated the evaluate_predictions function
  • a bunch of bug fixes for forecasters-basics; flusion is still broken, waiting on Hotfix growth rate epipredict#437

closes #167
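
For collaborators syncing after this update, a minimal sketch of the resync step mentioned in the list above; these are standard renv calls and nothing repo-specific is assumed:

```r
# After pulling this branch, resync the project library with the updated lockfile.
# clean = TRUE also removes packages that are no longer recorded in the lockfile.
renv::restore(clean = TRUE)

# To reproduce the package update itself (roughly): upgrade everything, then re-lock.
# renv::update()
# renv::snapshot()
```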

@dshemetov dshemetov self-assigned this Feb 7, 2025
@dshemetov dshemetov force-pushed the evaluate_current_season branch from 2cb56f2 to c85549c on February 12, 2025 00:09
@dshemetov dshemetov marked this pull request as ready for review February 12, 2025 00:11
@dshemetov dshemetov requested a review from dsweber2 February 13, 2025 17:57
@dshemetov dshemetov force-pushed the evaluate_current_season branch from 2838296 to 3e9a8de on February 13, 2025 18:28
@dshemetov dshemetov force-pushed the evaluate_current_season branch from 3e9a8de to bd1d91e on February 13, 2025 18:29
* covid prod now has two modes: prod and backtest, new reports available
* fix prod pipelines for package updates
* add retry function for failing API calls
* add timestamps to targets output
* update forecast data Julia (ty David)
* add forecast data R code
* fix daily_to_weekly_archive for epiprocess update, add comments
* add tar_change to some data-dependent targets
* make flu_hosp_prod backtest_mode aware
* fix data_substitutions code
@dshemetov dshemetov force-pushed the evaluate_current_season branch from bd1d91e to e0de52d on February 13, 2025 18:31
R/utils.R Outdated


#' Print recent targets errors.
get_recent_targets_errors <- function(time_since = minutes(60)) {
Contributor

yeah, this is definitely nicer than the jank I've been doing to accomplish this

Contributor Author

@dshemetov dshemetov Feb 15, 2025

Thanks, yeah, I got tired of doing the same dplyr manipulation on the jobs df over and over. Unfortunately, this doesn't work about half the time, for various reasons depending on the target; for instance, it won't show borked notebook errors because, for some reason, a failed notebook target doesn't get its timestamp updated. I would open an issue on the targets repo about it, but I haven't had the energy.

In the cases where the above function doesn't work, this is handy (should probably note it in a README): tar_meta(complete_only = TRUE, fields = c("error", "time")) %>% slice_max(time, n = 5)
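
A minimal sketch of what a helper like this could look like, assuming dplyr and lubridate are attached and that tar_meta()'s time column is a POSIXct timestamp; this is an illustration, not the actual implementation in R/utils.R:

```r
library(dplyr)
library(lubridate)
library(targets)

# Sketch (assumed, not the repo's code): list targets whose most recently
# recorded error falls within the last `time_since` window, newest first.
get_recent_targets_errors <- function(time_since = minutes(60)) {
  tar_meta(complete_only = TRUE, fields = c("error", "time")) %>%
    filter(!is.na(error), time >= Sys.time() - time_since) %>%
    arrange(desc(time))
}
```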

Comment on lines +26 to +27
# If TRUE, we don't run the report notebook, which is (a) slow and (b) should be
# preserved as an ASOF snapshot of our production results for that week.
Contributor

So since we embed the generation date in the notebook, it is actually preserved as a snapshot regardless. Then again, given that we're introducing scoring notebooks with this PR, it may not matter too much to retroactively generate report notebooks

Contributor Author

@dshemetov dshemetov Feb 15, 2025

But our code is not snapshotted, and during the season we made a bunch of tweaks. So I did a bunch of digging on old machines to make sure we got snapshots of these notebooks as close as possible to the day of forecast, since there's no way we're going to rerun old GitHub commits.

command = {
create_nhsn_data_archive(disease = "nhsn_covid")
}
),
# TODO: tar_change maybe?
Contributor

we should either switch this to tar_change or leave it in "always" mode

Contributor Author

Makes sense. I'm hoping tar_change works; I think it should, unless there's something borked with the timestamp-getting code, since tar_change is just shorthand for a two-target chain.
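
As a reference for that switch, a minimal sketch of the tar_change() shape under discussion; the target name and the get_nhsn_updated_at() helper are assumptions for illustration, while create_nhsn_data_archive() is the command from the diff above:

```r
library(targets)
library(tarchetypes)

# tar_change() reruns `command` whenever the value of `change` differs from the
# previous run. Here the change trigger is a hypothetical helper that returns
# the upstream data's last-updated timestamp.
tar_change(
  name = nhsn_covid_archive,
  command = create_nhsn_data_archive(disease = "nhsn_covid"),
  change = get_nhsn_updated_at()
)
```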

Comment on lines +176 to +182
combined_forecasts <- tar_combine(
name = forecast_full,
forecast_targets[["forecast_res"]],
command = {
dplyr::bind_rows(!!!.x)
}
)
Contributor

An unfortunate thing this will result in: if any of the forecasts change for any forecast_date, every target downstream of this one gets invalidated. Not sure how long scoring takes, but this could get annoying.

Contributor

@dsweber2 dsweber2 Feb 15, 2025

Less sure about an immediate alternative. I guess if everything is part of the same map, the problem goes away.

Contributor Author

Hm, yeah. I did it this way because I think I needed it to resolve the "forecasts always rerunning" issue. I'm not 100% sure on that, but this is the way it's set up in the explore pipeline, and that pipeline doesn't have spurious cache invalidation issues.

Contributor

Oh, I forgot we actually did it this way in explore. My memory was that it did invalidate everything downstream of the single target, but it's been a while since I've run it.

Contributor Author

@dshemetov dshemetov Feb 18, 2025

No, I agree: I think things downstream of this will change if any component of it changes. But "things downstream of this aggregate sometimes rerun" is an upgrade over "everything reruns all the time".

dshemetov and others added 2 commits February 14, 2025 16:14
Contributor

Ideally this would have some plots of the quantiles for representative/best/worst locations. Judging by the inputs, that's definitely not included. May want to put it off for now, though, and/or rely on the corresponding prod notebooks.

Contributor Author

Yeah... somehow dealing with pipeline stuff takes about 90% of my time and the actual plotting/analysis task just gets scraps of my attention. But when we get back Monday, we can definitely get some more fine-grained views.

@@ -51,14 +54,15 @@ df %>%
y = "Total Confirmed Flu Admissions"
) +
theme(axis.text.x = element_text(angle = 90, hjust = 1))
ggplotly(p, tooltip = "text", height = 800, width = 1000)
Contributor

Is this how we get the hover text to actually work? The lack of that has meant needing to do some random tar_reads.

Contributor Author

I didn't actually try to fix that issue; this fix is to make the HTML self-contained rather than producing a folder of PNGs.

Comment on lines +9 to +10
# self_contained: False
# lib_dir: libs
Contributor

I'm wondering what these are about. They seem to be scattered across a couple of notebooks. I guess this is header copypasta?

Contributor Author

Yeah, it's copypasta. self_contained will bundle external artifacts (like images) inside the HTML; I'm not sure what lib_dir is about. I was vaguely trying to get some consistency in this header across our Rmds, but not trying too hard.
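
For reference, a minimal sketch of what a consistent self-contained Rmd front matter could look like; the title is a placeholder and only the self_contained option relates to the discussion above:

```yaml
---
title: "Placeholder report"
output:
  html_document:
    # Bundle images/JS into the HTML instead of writing a separate libs/ folder.
    self_contained: true
---
```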

@dshemetov dshemetov force-pushed the evaluate_current_season branch from 18d8391 to 5870ad8 on February 20, 2025 22:25
@dshemetov dshemetov merged commit cee1f2b into main Feb 21, 2025
1 check failed
@dshemetov dshemetov deleted the evaluate_current_season branch February 21, 2025 17:15
Development

Successfully merging this pull request may close these issues.

windowed_seasonal weights aren't being used
2 participants