covid: evaluate current season #168
Commits:
- covid prod now has two modes: prod and backtest, new reports available
- fix prod pipelines for package updates
- add retry function for failing API calls
- add timestamps to targets output
- update forecast data Julia (ty David)
- add forecast data R code
- fix daily_to_weekly_archive for epiprocess update, add comments
- add tar_change to some data-dependent targets
- make flu_hosp_prod backtest_mode aware
- fix data_substitutions code
R/utils.R
#' Print recent targets errors.
get_recent_targets_errors <- function(time_since = minutes(60)) {
Yeah, this is definitely nicer than the jank I have been doing to do this.
Thanks, yea, got tired of the same dplyr manipulation on the jobs df over and over. Unfortunately this doesn't work like half the time, for various reasons for different targets; for instance it won't show borked notebook errors, because for some reason a failed notebook target doesn't get its timestamp updated. I would make an issue on the targets page about it, but haven't had the energy.
In the cases where the above function doesn't work, this is handy (should probably note it in a README): `tar_meta(complete_only = TRUE, fields = c("error", "time")) %>% slice_max(time, n = 5)`
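For reference, a minimal sketch of what a helper like this could look like, assuming it just filters `tar_meta()` output by timestamp (the actual implementation in R/utils.R may differ; the lubridate-based `time_since` default is taken from the snippet above):

```r
library(targets)
library(dplyr)
library(lubridate)

#' Print recent targets errors (sketch).
#' complete_only = TRUE drops rows with NA in the requested fields,
#' so only targets that actually errored remain.
get_recent_targets_errors <- function(time_since = minutes(60)) {
  tar_meta(fields = c("error", "time"), complete_only = TRUE) %>%
    filter(time >= Sys.time() - time_since) %>%
    arrange(desc(time))
}
```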
# If TRUE, we don't run the report notebook, which is (a) slow and (b) should be
# preserved as an ASOF snapshot of our production results for that week.
So since we embed the generation date in the notebook, it is actually preserved as a snapshot regardless. Then again, given that we're introducing scoring notebooks with this PR, it may not matter too much to retroactively generate report notebooks.
But our code is not snapshotted, and during the season we made a bunch of tweaks. So I did a bunch of digging on old machines to make sure we got as close as possible to day-of-forecast snapshots of these notebooks, since there's no way we're going to rerun old GitHub commits.
scripts/covid_hosp_prod.R
command = {
  create_nhsn_data_archive(disease = "nhsn_covid")
}
),
# TODO: tar_change maybe?
We should either switch this to tar_change or leave it in "always" mode.
Makes sense. I'm hoping tar_change works. I think it should, unless there's something borked with the timestamp-getting code, since tar_change is just shorthand for a two-target chain.
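For illustration, the `tar_change()` version might look roughly like this (a sketch: the target name and the `get_nhsn_last_updated()` helper are hypothetical stand-ins for whatever timestamp-getting code we'd use):

```r
library(targets)
library(tarchetypes)

# tar_change() expands to the two-target chain mentioned above: an always-run
# target that computes `change`, plus the real target, which reruns only when
# the computed value differs from last time.
tar_change(
  name = nhsn_archive_data, # illustrative name
  command = create_nhsn_data_archive(disease = "nhsn_covid"),
  change = get_nhsn_last_updated() # hypothetical timestamp-getting helper
)
```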
combined_forecasts <- tar_combine(
  name = forecast_full,
  forecast_targets[["forecast_res"]],
  command = {
    dplyr::bind_rows(!!!.x)
  }
)
An unfortunate thing this will result in: if any of the forecasts change for any forecast_date, every target downstream from this is going to get invalidated. Not sure how long scoring takes, but this could get annoying.
Less sure about an immediate alternative. I guess if everything is part of the same map, the problem goes away.
Hm, yea. I did it this way because I think I needed it to resolve the "forecasts always rerunning" issue. I'm not 100% on it, but this is the way it's set up in the explore pipeline, and that pipeline doesn't have spurious cache invalidation issues.
Oh, I forgot we actually did it this way in explore. My memory was that it did invalidate everything downstream of the single target, but it's been a while since I've run it.
No, I agree; I think things downstream of this will change if any component of this changes. But "things downstream of this aggregate sometimes rerun" is an upgrade over "everything reruns all the time".
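To make the tradeoff concrete, here's a hedged sketch of the alternative mentioned above, where scoring lives inside the same map so each forecast date's score depends only on its own branch (`run_forecast()` and `score_forecast()` are hypothetical helpers, not from this PR):

```r
library(targets)
library(tarchetypes)

forecast_targets <- tar_map(
  values = list(forecast_date = c("2024-11-06", "2024-11-13")),
  tar_target(forecast_res, run_forecast(forecast_date)),
  # scoring per branch: a changed forecast only reruns that date's score
  tar_target(score_res, score_forecast(forecast_res))
)

# The combine step still invalidates when any branch changes, but now only
# targets downstream of the aggregate (e.g. summary notebooks) are affected.
combined_scores <- tar_combine(
  name = score_full,
  forecast_targets[["score_res"]],
  command = dplyr::bind_rows(!!!.x)
)
```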
Ideally this would have quantile plots for some representative/best/worst locations. Judging by the inputs, that's definitely not included. May want to put it off for now, though, and/or rely on the corresponding prod notebooks.
Yea... somehow dealing with pipeline stuff takes like 90% of my time, and the actual plotting/analysis task just gets scraps of my attention. But when we get back Monday, we can def get some more fine-grained views.
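As a placeholder for those views, a sketch of what per-location quantile-band plots could look like (column names such as `geo_value`, `target_end_date`, `quantile`, and `value` are assumptions about the forecast data frame, not the repo's actual schema):

```r
library(dplyr)
library(ggplot2)

# Ribbon for the 5%-95% band, line for the median, one panel per location.
plot_quantile_bands <- function(forecast_full, locations) {
  forecast_full %>%
    filter(geo_value %in% locations, quantile %in% c(0.05, 0.5, 0.95)) %>%
    tidyr::pivot_wider(
      names_from = quantile, values_from = value, names_prefix = "q"
    ) %>%
    ggplot(aes(x = target_end_date)) +
    geom_ribbon(aes(ymin = `q0.05`, ymax = `q0.95`), alpha = 0.3) +
    geom_line(aes(y = `q0.5`)) +
    facet_wrap(~geo_value)
}
```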
@@ -51,14 +54,15 @@ df %>%
    y = "Total Confirmed Flu Admissions"
  ) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
ggplotly(p, tooltip = "text", height = 800, width = 1000)
Is this how we get the hovertext to actually work? The lack of that has meant needing to do some random tar_reads.
I didn't actually try to fix that issue; this fix is to make the HTML self-contained rather than producing a folder with PNGs.
# self_contained: False
# lib_dir: libs
I'm wondering what these are about. They seem to be scattered in a couple of notebooks. I guess this is header copypasta?
Yea, it's copypasta. self_contained will bundle external artifacts (like images) inside the HTML; not sure what lib_dir is about. I was vaguely trying to get some consistency in this header across our Rmds, but not too hard.
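For reference, a consistent self-contained header could look roughly like this (a sketch, not the repo's actual front matter; as far as I know, lib_dir is only where rmarkdown copies dependencies when self_contained is false, so it can be dropped here):

```yaml
output:
  html_document:
    self_contained: true
```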
Backtesting our covid forecasters, re-organizing pipelines.

Package updates (`renv::restore(clean = TRUE)` to resync):
- targets 1.8.0 -> 1.10.1: there seem to have been a lot of fixes, including pipeline speedups for large pipelines
- crew 0.10 -> 1.0.0: a number of changes, hopefully positive. One deprecation: "launch_max was deprecated on 2024-11-04 (version 0.10.1.9000). Alternative: none." So I'm going to remove our `launch_max = 10000L` setting; it wasn't clear that that was helping us anyway.
- scoringutils -> 2.0: a number of breaking changes; I've updated the `evaluate_predictions` function

Closes #167
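For context on the scoringutils 2.0 breaking changes: scoring now goes through validated forecast objects, so the updated `evaluate_predictions` presumably wraps calls along these lines (a sketch; `forecast_df` and its column names are assumptions about our forecast data, not the actual schema):

```r
library(scoringutils)
library(dplyr)

scores <- forecast_df %>%
  # scoringutils 2.0 replaced scoring raw data frames with explicit forecast
  # objects; as_forecast_quantile() maps our column names onto its schema
  as_forecast_quantile(
    observed = "true_value",
    predicted = "prediction",
    quantile_level = "quantile"
  ) %>%
  score() %>%
  summarise_scores(by = c("model", "target_end_date"))
```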