Make epi_cor() / summarizer compare apples to apples, and/or warn against usage #628

Open
brookslogan opened this issue Mar 11, 2025 · 0 comments

Users might reach for epi_cor() for feature selection. That may be fine for judging whether a signal is plausibly useful at all, but it is a poor tool for judging exact or relative usefulness:

  • It doesn't provide a one-number summary of usefulness per signal-lag. (This also applies to the basic plausibility judgment.)
  • It doesn't compare apples to apples across signal-lags with different availability.
  • It is not tailored to the evaluation metric of whatever model is actually going to be fit.
  • Selection and regularization should probably be handled by a modeling package, not by ad-hoc code atop an EDA utility.

We should address the points above and/or add appropriate cautions in docs.
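
To illustrate the availability point, here is a minimal sketch in plain Python rather than epiprocess code. It makes no claim about how epi_cor() handles missing data. The helper names (`pearson`, `lagged_pairs`, `lag_correlations`) and the toy series are hypothetical, and missing observations are modeled as `None`. The idea is that each lag's correlation is normally computed on whatever time points that lag happens to have, so lags get scored on different subsets; restricting every lag to the time points they all share puts them on equal footing.

```python
import math


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def lagged_pairs(signal, target, lag):
    """Map each target time t to (signal[t - lag], target[t]), skipping gaps."""
    pairs = {}
    for t in range(len(target)):
        s = t - lag
        if 0 <= s < len(signal) and signal[s] is not None and target[t] is not None:
            pairs[t] = (signal[s], target[t])
    return pairs


def lag_correlations(signal, target, lags, common_only):
    """Return {lag: (correlation, n_points)}.

    With common_only=True, every lag is restricted to the target times
    that are available for all lags, so the results are comparable.
    """
    per_lag = {lag: lagged_pairs(signal, target, lag) for lag in lags}
    if common_only:
        shared = set.intersection(*(set(p) for p in per_lag.values()))
        per_lag = {lag: {t: p[t] for t in shared} for lag, p in per_lag.items()}
    result = {}
    for lag, pairs in per_lag.items():
        xs = [v[0] for v in pairs.values()]
        ys = [v[1] for v in pairs.values()]
        result[lag] = (pearson(xs, ys), len(pairs))
    return result


# Toy data (made up for illustration); None marks a missing observation.
signal = [1.0, 2.0, None, 4.0, 5.0, 7.0, 6.0, 8.0]
target = [2.0, 1.0, 3.0, 5.0, 4.0, 6.0, 8.0, 7.0]

naive = lag_correlations(signal, target, lags=[0, 2], common_only=False)
fair = lag_correlations(signal, target, lags=[0, 2], common_only=True)

for lag in (0, 2):
    print(f"lag {lag}: naive n={naive[lag][1]}, common-subset n={fair[lag][1]}")
```

In the naive case the two lags are scored on 7 and 5 time points respectively; on the common subset both use the same 4. This only addresses the availability bullet. It is still not a one-number usefulness summary, and it is not tied to the evaluation metric of the model that would eventually be fit.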
