[Refined scope: see these comments: A, B. We can then split off any other ideas into a separate issue.]
It's very easy to write an erroneous 7-day-average or 7-day-sum slide: e.g., if we `epi_slide` with the natural `mean` or `sum`, the result is computed over fewer than `n` * {n key values in group} observations for
- the first `n - 1L` `time_value`s
- any window that overlaps a gap in availability (e.g., a skipped `time_value` in one or all `geo_value`s, an unavailable `geo_value` across a wide span of `time_value`s, etc.)
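
The failure mode above can be reproduced without any package code. The following is a minimal base-R sketch, not the `epi_slide` implementation: the toy data, the 3-day window, and the `sum_3d`/`n_obs` column names are assumptions chosen for illustration.

```r
# Toy data: one geo_value, daily observations, with 2020-01-04 missing (a gap).
df <- data.frame(
  geo_value  = "ca",
  time_value = as.Date("2020-01-01") + c(0, 1, 2, 4, 5, 6),
  value      = c(1, 2, 3, 5, 6, 7)
)

n <- 3L  # window length in days (illustrative; the issue discusses 7)

# For each row, select whatever rows fall in the n-day window ending at that
# row's time_value, exactly as a naive time-indexed sum would.
in_window <- function(i) {
  df$time_value > df$time_value[i] - n & df$time_value <= df$time_value[i]
}

df$sum_3d <- vapply(seq_len(nrow(df)), function(i) sum(df$value[in_window(i)]), numeric(1))
df$n_obs  <- vapply(seq_len(nrow(df)), function(i) sum(in_window(i)), integer(1))

# Rows whose "3-day sum" silently covers fewer than 3 observations:
# the first n - 1 time_values, plus every window overlapping the gap.
incomplete <- df[df$n_obs < n, ]
print(incomplete)
```

Here four of the six windows are incomplete, yet `sum_3d` contains no `NA` or warning to signal it.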
Ideas:
- Have some sort of option to auto-complete the window data, as writing this is a little tedious. Furthermore, we probably need this to be the default behavior for `epi_slide`, or to have some sort of messaging unless the user explicitly says that they're okay without completion. When this completion is enabled, we should balk if the key columns don't form a unique key. The counterpart for `epix_slide` likely needs to complete a lot of leading edges through the `ref_time_value`, inclusive, and balk if there's anything beyond it (e.g., if the archive is holding forecasts).
- Have an option to skip over incomplete(-in-another-sense) windows for early `ref_time_value`s by filtering down `ref_time_values` (perhaps as an alternative to the default `ref_time_values` or manual specification, perhaps as a separate option that would allow this filtering to be applied even to manually-supplied `ref_time_values`). (An `epi_archive` equivalent might not be so simple due to the possibility of later versions extending the time series into the past or removing `time_value`s from the past, so `min(time_value)` is not a fixed property of an archive group.)
  - This seems closest to `.complete=TRUE` for `slider::slide_index*`, but I'm not sure we would want to ignore gaps like these functions do, and there might be details to check regarding cases where the set of available `time_value`s varies across groups and/or "epikey" (non-time, non-version key col) values.
- Have an option to skip over, or return some missing-result indicator for, windows that are incomplete in any sense (either because we are at the very beginning/end of the time series and the window extends past the edge, or because there are gaps).
- Reject `epi_df`s/`epi_archive`s with gaps / gaps in slide computation inputs unless the user explicitly allows them. (For the user to easily address these gaps, we want "Add a `complete` method for `epi_df`s" #250 + something for `epi_archive`s.)
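
To make a few of these ideas concrete, here is a rough base-R sketch of (1) balking on a non-unique key, (2) completing the data onto a full `geo_value` x `time_value` grid, (3) detecting gaps so they could be rejected, and (4) returning `NA` for any window that is incomplete in either sense. It is only an illustration under assumed toy data; `strict_sum`, `has_gaps`, and `sum_3d` are hypothetical names, not part of epiprocess, and a real implementation would more likely build on something like `tidyr::complete()`.

```r
dates <- as.Date("2020-01-01") + 0:5
df <- rbind(
  data.frame(geo_value = "ca", time_value = dates[-4], value = c(1, 2, 3, 5, 6)),
  data.frame(geo_value = "ny", time_value = dates, value = c(10, 20, 30, 40, 50, 60))
)

# (1) Balk if the key columns do not form a unique key.
stopifnot(anyDuplicated(df[c("geo_value", "time_value")]) == 0L)

# (2) Complete: every geo_value gets a row for every day in the overall range;
#     rows that were absent get value = NA.
grid <- expand.grid(
  geo_value = unique(df$geo_value),
  time_value = seq(min(df$time_value), max(df$time_value), by = "day"),
  stringsAsFactors = FALSE
)
completed <- merge(grid, df, all.x = TRUE)
completed <- completed[order(completed$geo_value, completed$time_value), ]

# (3) Gap detection: with a unique key, fewer rows than the grid means gaps.
has_gaps <- nrow(df) < nrow(grid)

# (4) Hypothetical strict window sum over completed, ordered values: NA when
#     the window runs past the start or contains a missing value.
strict_sum <- function(values, n) {
  vapply(seq_along(values), function(i) {
    if (i < n) return(NA_real_)
    sum(values[(i - n + 1L):i])
  }, numeric(1))
}

completed$sum_3d <- ave(
  completed$value, completed$geo_value,
  FUN = function(v) strict_sum(v, 3L)
)
print(completed)
```

Note that completing onto a shared date range also treats a `geo_value` that starts late or ends early as having gaps, which may or may not be the desired behavior.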
We might not really need all of these; some of them make others unnecessary or less useful, so some brainstorming is still needed here.