-
Notifications
You must be signed in to change notification settings - Fork 8
Step through pre-calculated start times for each group using closure rather than using .real
col in epi_slide
#397
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The test failures appear to be a problem with the new |
7ac9c6e
to
aae9dee
Compare
All the |
`slider::hop_index` doesn't require starts & stops to be in `.i`, and we aren't actually doing that anyway. Plus comment to help clarify that we're passing the group key to comps via `...`.
Rename `time_values` to `ref_time_values` or `kept_ref_time_values` depending on the context. Does not change the interface of `epi_slide`.
We checked them for nonzero length when we filtered `ref_time_values` down to those present in the `x$time_value`, but now we require `all(ref_time_values %in% unique(x$time_value))`.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, please see minor TODOs above.
Nice 4x!!! Was this on something like 7-day averaging? [I'm getting something like a 2x doing jhu_csse_daily_subset %>% group_by(geo_value) %>% epi_slide(before = 6, cases_7davish = mean(cases))
, 3x on an ungrouped version. Still super nice.]
My (main) test case was indeed a 7-dav, calling the
Surprisingly,
is twice as fast as
It looks like converting a quosure to fn is the bottleneck, with this line being particularly slow. |
.ref_time_value
faster.real
col in epi_slide
Simplify and clarify parts of `epi_slide` implementation
Instead of re-calculating
.ref_time_value
s, use pre-calculated values stored instarts
. For each group, use a closure to keep track of position instarts
vector. This avoids the slow.real
filtering and removal steps, and simplifies the code by removing various bits of.real
handling.This is ~4x faster than the old version.