
(How) should we explicitly support partial version histories? #352

Open
@brookslogan

While developing epix_rbind(), @dsweber2 identified some gnarly edge cases that arise because NAs are overloaded in epi_archives, especially if we allow epi_archives to hold partial version histories. We don't currently allow partial version histories, and some other operations (beyond the epix_merge() output ambiguity) may malfunction on archives that hold them.

Augmenting the epi_archive format may help disambiguate sources of missingness in diff data.

One potential scheme: add an extra column per signal (like NA codes in the API) indicating, for each NA measurement, whether

  • it's an explicit NA
  • it's a missing row, or
  • it should be LOCF'd from the previous version
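To make the first scheme concrete, here is a minimal sketch in plain Python rather than in epiprocess itself. Everything in it is hypothetical: the `NACode` names, the `(geo, time_value, version)` keying, and `snapshot_as_of()` are illustration only, and it assumes a single signal with integer versions and `None` standing in for NA.

```python
from enum import Enum


class NACode(Enum):
    """Hypothetical per-measurement codes; names are illustrative only."""
    OBSERVED = "observed"        # a real, non-NA value
    EXPLICIT_NA = "explicit_na"  # the source reported NA for this measurement
    MISSING_ROW = "missing_row"  # the row does not exist as of this version
    LOCF = "locf"                # keep whatever the previous version held


def snapshot_as_of(updates, max_version):
    """Replay diff-style updates up to max_version.

    updates maps (geo, time_value, version) -> (value, NACode).
    Returns a dict mapping (geo, time_value) -> value, where None is an
    explicit NA and an absent key is a missing row.
    """
    snapshot = {}
    # Apply updates in version order so later versions override earlier ones.
    for (geo, time_value, version), (value, code) in sorted(
        updates.items(), key=lambda item: item[0][2]
    ):
        if version > max_version:
            continue
        key = (geo, time_value)
        if code in (NACode.OBSERVED, NACode.EXPLICIT_NA):
            snapshot[key] = value
        elif code is NACode.MISSING_ROW:
            snapshot.pop(key, None)
        # NACode.LOCF: leave the previous version's state untouched.
    return snapshot


updates = {
    ("ca", 1, 1): (5.0, NACode.OBSERVED),
    ("ca", 1, 2): (None, NACode.EXPLICIT_NA),
    ("ca", 1, 3): (None, NACode.LOCF),
    ("ca", 1, 4): (None, NACode.MISSING_ROW),
}
```

With a code column, the three NA updates above replay differently even though their stored values are identical, which is the disambiguation the scheme is after. The LOCF code would matter most with several signals per row, where one signal changes and another does not.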

Or maybe something like one of the original epi_archive formats considered: flagging every measurement (NA or otherwise) with:

  • add this measurement
  • change this measurement
  • remove this measurement
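One possible way to derive add/change/remove flags, assuming we have two consecutive full snapshots to compare, is a plain key-wise diff. The sketch below is plain Python with hypothetical names (`diff_snapshots()`, the `(geo, time_value)` keys), not an epiprocess function, and it again uses `None` for an explicit NA.

```python
def diff_snapshots(old, new):
    """Flag each measurement that differs between two snapshots.

    old and new map (geo, time_value) -> value, with None as explicit NA.
    Returns a dict mapping (geo, time_value) -> (flag, value), where flag is
    "add", "change", or "remove". Unchanged measurements are omitted.
    """
    flags = {}
    for key, value in new.items():
        if key not in old:
            flags[key] = ("add", value)
        elif old[key] != value:
            # None == None, so an NA that stays NA is not flagged.
            flags[key] = ("change", value)
    for key in old:
        if key not in new:
            flags[key] = ("remove", None)
    return flags


old = {("ca", 1): 5.0, ("ca", 2): None, ("ca", 3): 7.0}
new = {("ca", 1): 6.0, ("ca", 2): None, ("ca", 4): 1.0}
flags = diff_snapshots(old, new)
```

This only works when both snapshots are complete; with a partial version history there may be no earlier snapshot to diff against, so "add" and "change" cannot be told apart.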

Some things to think about:

  • How do we determine these flags? I'm not sure we get add/change/remove information out of issues queries from epidatr, for example. And we need to think about various epix_*() functions as well.
  • Would the user ever need to specify these flags, or would we be able to add these on automatically?
  • What if we try combining version histories randomly from different parts of history, or haphazardly alternating between doing rbinds and merges to arrive at a full archive? Will we need some extra metadata? Will it blow the entire thing up? Can we put some restrictions in to disallow any tricky cases?
  • Performance implications: time and memory cost of dealing with this. Do/can we make it opt-in if it's expensive?
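As one example of a restriction that might rule out the tricky cases, each partial history could carry the range of versions it covers, and an rbind could be refused unless the two ranges are adjacent. This is only a sketch under assumptions: `VersionRange` and `can_rbind()` are hypothetical names, and versions are taken to be consecutive integers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VersionRange:
    """Hypothetical metadata: the versions a partial history covers, inclusive."""
    first: int
    last: int


def can_rbind(earlier, later):
    """Allow binding only when `later` starts right after `earlier` ends."""
    ranges_valid = earlier.first <= earlier.last and later.first <= later.last
    # A gap would leave LOCF ambiguous; an overlap could yield conflicting updates.
    return ranges_valid and later.first == earlier.last + 1
```

For example, `can_rbind(VersionRange(1, 3), VersionRange(4, 6))` is accepted, while a gap (`VersionRange(5, 6)`) or an overlap (`VersionRange(3, 6)`) is rejected. The check itself costs almost nothing, so it would not need to be opt-in.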

Labels: bug (Something isn't working), performance, question (Further information is requested)
