Is your feature request related to a problem? Please describe.
As brought up in #2205, the sum kernel has been adapted to account for underflow issues when adding large numbers of values. However, summations are still done naively in several other places in the code, such as stdev/var and the more general reduction abstraction.
Describe the solution you'd like
I would prefer a more robust implementation (e.g. KahanBabushkaNeumaierSum, as in the sum kernel) to be added as a special implementation for the add operator within the general reduction implementation (coalesced/strided). We already have a specialization for the add reduction operator in the strided reduction. APIs like `sum` and `stdev` should delegate to these abstractions.
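To illustrate the idea, here is a minimal host-side sketch of Kahan-Babushka-Neumaier summation next to a naive loop. It is not the existing sum kernel; the function names `naive_sum` and `kbn_sum` are made up for this example, and a real device implementation would have to handle per-thread partial sums and their combination.

```cpp
#include <cmath>
#include <vector>

// Plain left-to-right accumulation: low-order bits of small addends are
// lost once the running sum becomes much larger than the addend.
template <typename T>
T naive_sum(const std::vector<T>& values)
{
  T sum = T(0);
  for (T v : values) { sum += v; }
  return sum;
}

// Kahan-Babushka-Neumaier summation (hypothetical helper, host-only sketch).
// A separate compensation term collects the rounding error of every addition
// and is applied once at the end.
template <typename T>
T kbn_sum(const std::vector<T>& values)
{
  T sum  = T(0);
  T comp = T(0);  // running compensation for lost low-order bits
  for (T v : values) {
    const T t = sum + v;
    if (std::abs(sum) >= std::abs(v)) {
      comp += (sum - t) + v;  // low-order bits of v were lost
    } else {
      comp += (v - t) + sum;  // low-order bits of sum were lost
    }
    sum = t;
  }
  return sum + comp;
}
```

For the input `{1.0, 1e100, 1.0, -1e100}` in double precision, `naive_sum` returns `0.0`, while `kbn_sum` returns the exact result `2.0`. Note that aggressive floating-point optimizations (e.g. `-ffast-math`) can optimize the compensation away.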
Additional context
See #2205 for the initial bug report.
This PR consists of multiple parts:
1. Redirect custom reduction kernels within the `stats` namespace to `linalg::reduce`
2. Specialize reduction kernels for addition using the _Kahan-Babushka-Neumaier sum_ ([link](https://en.wikipedia.org/wiki/Kahan_summation_algorithm))
3. Slightly adjust kernel heuristics for coalesced reductions
This should address #2366 and #2205. With the kernel heuristics adjusted, the maximum performance drop is 4%.
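The overall structure of parts 1 and 2 can be pictured with a small host-side sketch: a generic reduction with an overload selected for the add operator, and statistics helpers that delegate to it. Everything in the `sketch` namespace (`AddOp`, `MaxOp`, `reduce`, `mean`, `variance`) is hypothetical and chosen for illustration only; it does not reproduce the signatures of `linalg::reduce` or of the kernels touched by this PR.

```cpp
#include <cmath>
#include <vector>

namespace sketch {  // hypothetical names, not the library's API

struct AddOp {
  template <typename T>
  T operator()(T a, T b) const { return a + b; }
};

struct MaxOp {
  template <typename T>
  T operator()(T a, T b) const { return a > b ? a : b; }
};

// Generic reduction: apply `map` to each element, then fold with `op`.
template <typename T, typename MapOp, typename ReduceOp>
T reduce(const std::vector<T>& in, T init, MapOp map, ReduceOp op)
{
  T acc = init;
  for (const T& x : in) { acc = op(acc, map(x)); }
  return acc;
}

// More specialized overload, chosen whenever the reduce operator is AddOp:
// uses Kahan-Babushka-Neumaier compensation instead of naive accumulation.
template <typename T, typename MapOp>
T reduce(const std::vector<T>& in, T init, MapOp map, AddOp)
{
  T sum  = init;
  T comp = T(0);
  for (const T& x : in) {
    const T v = map(x);
    const T t = sum + v;
    comp += (std::abs(sum) >= std::abs(v)) ? (sum - t) + v : (v - t) + sum;
    sum = t;
  }
  return sum + comp;
}

// Statistics helpers delegate to the reduction instead of owning a loop.
template <typename T>
T mean(const std::vector<T>& in)
{
  return sketch::reduce(in, T(0), [](T v) { return v; }, AddOp{}) /
         static_cast<T>(in.size());
}

// Population variance (divides by N), computed in two passes.
template <typename T>
T variance(const std::vector<T>& in)
{
  const T mu = mean(in);
  return sketch::reduce(in, T(0), [mu](T v) { return (v - mu) * (v - mu); }, AddOp{}) /
         static_cast<T>(in.size());
}

}  // namespace sketch
```

With this layout, any caller that reduces with `AddOp` picks up the compensated path automatically, while other operators such as `MaxOp` keep the plain fold.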
FYI, @tfeher
Authors:
- Malte Förster (https://github.com/mfoerste4)
Approvers:
- Tamas Bela Feher (https://github.com/tfeher)
- Corey J. Nolet (https://github.com/cjnolet)
URL: #2381