Make `epi_cor()` / summarizer compare apples to apples, and/or warn against usage #628

brookslogan · 2025-03-11T23:39:24Z

Users might reach for `epi_cor()` for feature selection. That may be fine for judging whether a signal is plausibly useful at all, but it is a poor tool for judging exact or relative usefulness:

- It doesn't provide a one-number summary of usefulness per signal-lag. (This also limits the basic plausibility judgment.)
- It doesn't compare apples to apples across signal-lags with different availability.
- It isn't tailored to the evaluation metric of whatever model is actually going to be fit.
- Selection and regularization should probably be handled by a modeling package, not by ad-hoc code atop an EDA utility.

We should address the points above and/or add appropriate cautions in docs.
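
To make the availability point concrete, here is a minimal, language-neutral sketch in plain Python. It is not epiprocess code and does not reflect how `epi_cor()` works internally; `pearson`, `lagged`, `lag_correlations`, and the `common_rows` flag are all hypothetical names made up for illustration. It assumes a single series with `None` marking unavailable values, and it returns one correlation per lag along with the number of rows that went into it.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined for a constant series
    return sxy / sqrt(sxx * syy)


def lagged(series, lag):
    """Value of `series` from `lag` steps earlier; None where unavailable."""
    return [None] * lag + list(series[: len(series) - lag])


def lag_correlations(signal, target, lags, common_rows=True):
    """Return {lag: (correlation, n_rows)} for each lag in `lags`.

    Hypothetical helper. With common_rows=True, every lag is scored on the
    same set of time points (those where all lags and the target are
    available). With common_rows=False, each lag uses whatever rows it
    happens to have, so the numbers are not directly comparable.
    """
    cols = {lag: lagged(signal, lag) for lag in lags}
    n = len(target)

    def ok(lag, t):
        return cols[lag][t] is not None and target[t] is not None

    if common_rows:
        shared = [t for t in range(n) if all(ok(lag, t) for lag in lags)]
        rows = {lag: shared for lag in lags}
    else:
        rows = {lag: [t for t in range(n) if ok(lag, t)] for lag in lags}

    return {
        lag: (
            pearson([cols[lag][t] for t in rows[lag]],
                    [target[t] for t in rows[lag]]),
            len(rows[lag]),
        )
        for lag in lags
    }


# Made-up toy data, purely for illustration.
signal = [1.0, 2.0, 4.0, 3.0, 5.0, 7.0, 6.0, 8.0]
target = [2.0, 1.0, 3.0, 5.0, 4.0, 6.0, 8.0, 7.0]

separate = lag_correlations(signal, target, lags=[0, 3], common_rows=False)
common = lag_correlations(signal, target, lags=[0, 3], common_rows=True)
print(separate)  # lag 0 and lag 3 are scored on different numbers of rows
print(common)    # both lags are scored on the same rows
```

Restricting to shared rows only addresses the second bullet. It still scores each lag with a correlation rather than the metric of the model that will eventually be fit, so the other bullets apply to this sketch too.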
