Add outlier rejection to postprocessing operations #54

Open
CharlesC30 opened this issue Nov 23, 2022 · 1 comment

Comments

@CharlesC30
Contributor

CharlesC30 commented Nov 23, 2022

Denis also requested we add outlier rejection to postprocessing. This could be incorporated into an AverageData operator, or there could be two separate operators for averaging with/without outlier rejection.

Determining outliers

First, the trimmed mean and trimmed standard deviation are calculated at each energy point (see here).
Then, for each spectrum, the following value is calculated:
1 / number of energy points * sum[((trimmed_mean - spectrum) / trimmed_stddev)**2]
This value is the mean squared deviation of the spectrum from the trimmed mean, measured in units of the trimmed standard deviation. It can be compared to a threshold to determine whether a given spectrum is an outlier in the group.

A few notes from Denis:

  • Different amounts of data can be trimmed during calculation. Typically trimming the top and bottom 20% works well and can be used by default.
  • A threshold value of ~10-25 typically works well for outlier determination. By default we can use 10 as a conservative threshold.
  • This method works best when the number of spectra is ≥10. A warning should be raised if it is run on a set of fewer than 10 spectra.
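
A minimal sketch of the procedure above, for discussion only. The function name `find_outliers`, its parameter names, and the array layout (one row per spectrum, one column per energy point) are assumptions, not an existing API. It also assumes that "trimmed" means sorting the values at each energy point across spectra and discarding the lowest and highest 20% before taking the mean and standard deviation.

```python
import warnings

import numpy as np


def find_outliers(spectra, trim_fraction=0.2, threshold=10.0, min_spectra=10):
    """Score each spectrum and flag those above `threshold`.

    `spectra` is assumed to have shape (n_spectra, n_energy_points).
    Returns (scores, is_outlier), both of length n_spectra.
    """
    spectra = np.asarray(spectra, dtype=float)
    n_spectra = spectra.shape[0]
    if n_spectra < min_spectra:
        warnings.warn(
            f"Outlier rejection run on {n_spectra} spectra; "
            f"it works best with at least {min_spectra}."
        )

    # Sort across spectra at each energy point, then drop the lowest and
    # highest `trim_fraction` of the values before computing statistics.
    n_cut = int(trim_fraction * n_spectra)
    kept = np.sort(spectra, axis=0)[n_cut:n_spectra - n_cut]
    trimmed_mean = kept.mean(axis=0)
    trimmed_stddev = kept.std(axis=0)

    # Mean over energy points of the squared deviation from the trimmed
    # mean, in units of the trimmed standard deviation.
    # Note: a zero trimmed_stddev at any energy point is not handled here.
    scores = np.mean(((trimmed_mean - spectra) / trimmed_stddev) ** 2, axis=1)
    return scores, scores > threshold


# Usage with synthetic data: 12 noisy copies of a ramp, one shifted upward.
rng = np.random.default_rng(0)
spectra = np.linspace(0.0, 1.0, 50) + 0.01 * rng.standard_normal((12, 50))
spectra[3] += 1.0
scores, is_outlier = find_outliers(spectra)
```

An `AverageData` operator could then average only the rows where `is_outlier` is false. Whether that belongs in one operator or two is still the open question from the top of this issue.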
@matthewcarbone
Contributor

Not entirely sure I understand what the trimmed_mean and trimmed_stddev are, but I'll follow your lead on this. Thanks for documenting and linking to some resources, I'll take a look. Keep me updated!
