covid: evaluate current season #168

Merged: 20 commits merged into main from evaluate_current_season on Feb 21, 2025

Conversation

Contributor

@dshemetov dshemetov commented Feb 6, 2025

Backtesting our covid forecasters, re-organizing pipelines.

  • backtest on the current season
  • try to fix the breaking CI (stringfish compilation, missing header issues?)
    • updated our renv environment in hopes the problem just goes away
    • updated all packages, updated renv, snapshotted R at 4.4.1, and pruned unused packages (use renv::restore(clean = TRUE) to resync; see the sketch below)
    • targets 1.8.0 -> 1.10.1: there seem to have been a lot of fixes, including pipeline speedups for large pipelines
    • crew 0.10 -> 1.0.0: a number of changes, hopefully positive
      • launch_max was deprecated on 2024-11-04 (version 0.10.1.9000) with no alternative, so I'm going to remove our launch_max = 10000L setting; it wasn't clear it was helping us anyway
    • scoringutils -> 2.0: a number of breaking changes; I've updated the evaluate_predictions function
  • a bunch of bug fixes for forecasters-basics; flusion is still broken, waiting on Hotfix growth rate epipredict#437

closes #167
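
For collaborators syncing after this update, a minimal sketch of the resync step mentioned in the list above; these are standard renv calls and nothing repo-specific is assumed:

```r
# After pulling this branch, resync the project library with the updated lockfile.
# clean = TRUE also removes packages that are no longer recorded in the lockfile.
renv::restore(clean = TRUE)

# To reproduce the package update itself (roughly): upgrade everything, then re-lock.
# renv::update()
# renv::snapshot()
```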

@dshemetov dshemetov self-assigned this Feb 7, 2025
@dshemetov dshemetov force-pushed the evaluate_current_season branch from 2cb56f2 to c85549c on February 12, 2025 00:09
@dshemetov dshemetov marked this pull request as ready for review February 12, 2025 00:11
@dshemetov dshemetov requested a review from dsweber2 February 13, 2025 17:57
@dshemetov dshemetov force-pushed the evaluate_current_season branch from 2838296 to 3e9a8de on February 13, 2025 18:28
@dshemetov dshemetov force-pushed the evaluate_current_season branch from 3e9a8de to bd1d91e on February 13, 2025 18:29
* covid prod now has two modes: prod and backtest, new reports available
* fix prod pipelines for package updates
* add retry function for failing API calls
* add timestamps to targets output
* update forecast data Julia (ty David)
* add forecast data R code
* fix daily_to_weekly_archive for epiprocess update, add comments
* add tar_change to some data-dependent targets
* make flu_hosp_prod backtest_mode aware
* fix data_substitutions code
@dshemetov dshemetov force-pushed the evaluate_current_season branch from bd1d91e to e0de52d on February 13, 2025 18:31
R/utils.R Outdated


#' Print recent targets errors.
get_recent_targets_errors <- function(time_since = minutes(60)) {
Contributor

yeah, this is definitely nicer than the jank I've been doing to accomplish this

Contributor Author

@dshemetov dshemetov Feb 15, 2025

Thanks, yeah, I got tired of doing the same dplyr manipulation on the jobs df over and over. Unfortunately, this doesn't work about half the time, for various reasons depending on the target; for instance, it won't show borked notebook errors because, for some reason, a failed notebook target doesn't get its timestamp updated. I would open an issue on the targets repo about it, but I haven't had the energy.

In the cases where the above function doesn't work, this is handy (should probably note it in a README): tar_meta(complete_only = TRUE, fields = c("error", "time")) %>% slice_max(time, n = 5)
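
A minimal sketch of what a helper like this could look like, assuming dplyr and lubridate are attached and that tar_meta()'s time column is a POSIXct timestamp; this is an illustration, not the actual implementation in R/utils.R:

```r
library(dplyr)
library(lubridate)
library(targets)

# Sketch (assumed, not the repo's code): list targets whose most recently
# recorded error falls within the last `time_since` window, newest first.
get_recent_targets_errors <- function(time_since = minutes(60)) {
  tar_meta(complete_only = TRUE, fields = c("error", "time")) %>%
    filter(!is.na(error), time >= Sys.time() - time_since) %>%
    arrange(desc(time))
}
```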

Comment on lines +26 to +27
# If TRUE, we don't run the report notebook, which is (a) slow and (b) should be
# preserved as an ASOF snapshot of our production results for that week.
Contributor

So since we embed the generation date in the notebook, it is actually preserved as a snapshot regardless. Then again, given that we're introducing scoring notebooks with this PR, it may not matter too much to retroactively generate report notebooks

Contributor Author

@dshemetov dshemetov Feb 15, 2025

But our code is not snapshotted, and during the season we made a bunch of tweaks. So I did a bunch of digging on old machines to make sure we got snapshots of these notebooks as close as possible to the day of forecast, since there's no way we're going to rerun old GitHub commits.

command = {
create_nhsn_data_archive(disease = "nhsn_covid")
}
),
# TODO: tar_change maybe?
Contributor

we should either switch this to tar_change or leave it in "always" mode

Contributor Author

Makes sense. I'm hoping tar_change works; I think it should, unless there's something borked with the timestamp-getting code, since tar_change is just shorthand for a two-target chain.
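
As a reference for that switch, a minimal sketch of the tar_change() shape under discussion; the target name and the get_nhsn_updated_at() helper are assumptions for illustration, while create_nhsn_data_archive() is the command from the diff above:

```r
library(targets)
library(tarchetypes)

# tar_change() reruns `command` whenever the value of `change` differs from the
# previous run. Here the change trigger is a hypothetical helper that returns
# the upstream data's last-updated timestamp.
tar_change(
  name = nhsn_covid_archive,
  command = create_nhsn_data_archive(disease = "nhsn_covid"),
  change = get_nhsn_updated_at()
)
```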

Comment on lines +176 to +182
combined_forecasts <- tar_combine(
name = forecast_full,
forecast_targets[["forecast_res"]],
command = {
dplyr::bind_rows(!!!.x)
}
)
Contributor

An unfortunate thing this will result in: if any of the forecasts change for any forecast_date, every target downstream of this one gets invalidated. Not sure how long scoring takes, but this could get annoying.

Contributor

@dsweber2 dsweber2 Feb 15, 2025

Less sure about an immediate alternative. I guess if everything is part of the same map, the problem goes away.

Contributor Author

Hm, yeah. I did it this way because I think I needed it to resolve the "forecasts always rerunning" issue. I'm not 100% sure on that, but this is the way it's set up in the explore pipeline, and that pipeline doesn't have spurious cache invalidation issues.

Contributor

Oh, I forgot we actually did it this way in explore. My memory was that it did invalidate everything downstream of the single target, but it's been a while since I've run it.

Contributor Author

@dshemetov dshemetov Feb 18, 2025

No, I agree: I think things downstream of this will change if any component of it changes. But "things downstream of this aggregate sometimes rerun" is an upgrade over "everything reruns all the time".

dshemetov and others added 2 commits February 14, 2025 16:14
Contributor

Ideally this would have some plots of the quantiles for representative/best/worst locations. Judging by the inputs, that's definitely not included. May want to put it off for now, though, and/or rely on the corresponding prod notebooks.

Contributor Author

Yeah... somehow dealing with pipeline stuff takes about 90% of my time and the actual plotting/analysis task just gets scraps of my attention. But when we get back Monday, we can definitely get some more fine-grained views.

@@ -51,14 +54,15 @@ df %>%
y = "Total Confirmed Flu Admissions"
) +
theme(axis.text.x = element_text(angle = 90, hjust = 1))
ggplotly(p, tooltip = "text", height = 800, width = 1000)
Contributor

Is this how we get the hover text to actually work? The lack of that has meant needing to do some random tar_reads.

Contributor Author

I didn't actually try to fix that issue; this fix is to make the HTML self-contained rather than producing a folder of PNGs.

Comment on lines +9 to +10
# self_contained: False
# lib_dir: libs
Contributor

I'm wondering what these are about. They seem to be scattered across a couple of notebooks. I guess this is header copypasta?

Contributor Author

Yeah, it's copypasta. self_contained will bundle external artifacts (like images) inside the HTML; I'm not sure what lib_dir is about. I was vaguely trying to get some consistency in this header across our Rmds, but not trying too hard.
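
For reference, a minimal sketch of what a consistent self-contained Rmd front matter could look like; the title is a placeholder and only the self_contained option relates to the discussion above:

```yaml
---
title: "Placeholder report"
output:
  html_document:
    # Bundle images/JS into the HTML instead of writing a separate libs/ folder.
    self_contained: true
---
```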

@dshemetov dshemetov force-pushed the evaluate_current_season branch from 18d8391 to 5870ad8 on February 20, 2025 22:25
@dshemetov dshemetov merged commit cee1f2b into main Feb 21, 2025
1 check failed
@dshemetov dshemetov deleted the evaluate_current_season branch February 21, 2025 17:15
Development

Successfully merging this pull request may close these issues.

windowed_seasonal weights aren't being used
2 participants