More investigation on `epi_df` duplicate detection

[This comment](https://github.com/cmu-delphi/epiprocess/issues/560#issuecomment-2609406667) identifies some alternative approaches (conversion to data.table paired with other approaches, as well as `vctrs::vec_duplicate_any`) and includes some details on memory benchmarking. If `as_epi_df()` is still consuming a lot of time in some operations (I need to package up the archive -> archive slide mentioned in the issue), then we may want to look at these some more. (The memory aspect probably only matters for `epi_archive` duplicate-key detection, not `epi_df` duplicate-key detection.)

The first part of this is probably benchmarking some code to see whether further optimization is worth the time. (profvis may not show results properly if native code is involved; be sure to check / instrument properly.)
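
As a starting point for that benchmarking, here is a minimal sketch comparing three ways of answering "are there any duplicated keys?". Everything in it is an assumption made for illustration: the synthetic `keys` data frame, its size, and the helper names `dup_base`, `dup_vctrs`, and `dup_dt` are hypothetical, and none of this is claimed to match what `as_epi_df()` currently does internally. It uses `bench::mark()` instead of profvis so that time and memory are reported per expression.

```r
# Hypothetical benchmark sketch; the data is synthetic, not real epiprocess data.
library(data.table)

n_geo <- 50L
n_time <- 2000L
dates <- as.Date("2020-01-01") + seq_len(n_time) - 1L

# Synthetic key columns mimicking an epi_df's (geo_value, time_value) key.
keys <- data.frame(
  geo_value = rep(sprintf("g%02d", seq_len(n_geo)), each = n_time),
  time_value = rep(dates, times = n_geo),
  stringsAsFactors = FALSE
)

# Each candidate returns a single TRUE/FALSE so the results are comparable.
dup_base <- function(df) anyDuplicated(df) > 0L
dup_vctrs <- function(df) vctrs::vec_duplicate_any(df)
dup_dt <- function(df) anyDuplicated(as.data.table(df)) > 0L

# bench::mark() checks by default that all expressions return equal results.
results <- bench::mark(
  base = dup_base(keys),
  vctrs = dup_vctrs(keys),
  data.table = dup_dt(keys),
  min_iterations = 5
)
print(results[, c("expression", "median", "mem_alloc")])
```

One caveat in the same spirit as the profvis note: as far as I know, `mem_alloc` in `bench::mark()` only covers allocations made through R's allocator, so memory allocated directly by native code may be undercounted. The numbers will also depend heavily on key column types and row counts, so it would be worth rerunning this on data shaped like the archive -> archive slide case.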
