[FEA] Consolidate SUM reductions across raft #2366

Open
mfoerste4 opened this issue Jun 25, 2024 · 0 comments
Labels
feature request New feature or request

Comments

@mfoerste4
Collaborator

Is your feature request related to a problem? Please describe.
Following #2205, the sum kernel was adapted to account for underflow issues when adding a large number of values. However, several other places in the code still sum naively, for example stdev/var and the more general reduction abstraction.
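
To illustrate one way this kind of accuracy loss can show up, here is a minimal host-side sketch (not raft code; `naive_sum` and `sum_of_ones` are hypothetical names used only for illustration). A plain single-precision accumulator stops growing once it reaches 2^24, because adding `1.0f` to it no longer changes the value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Plain left-to-right accumulation in single precision (illustrative only).
inline float naive_sum(const float* data, std::size_t n)
{
  assert(data != nullptr || n == 0);
  float sum = 0.0f;
  for (std::size_t i = 0; i < n; ++i) { sum += data[i]; }
  return sum;
}

// Hypothetical helper: sums `count` copies of 1.0f. The exact answer is
// `count`, but the float accumulator saturates at 2^24 = 16777216.
inline float sum_of_ones(std::size_t count)
{
  std::vector<float> ones(count, 1.0f);
  return naive_sum(ones.data(), ones.size());
}
```

Under IEEE 754 single-precision arithmetic without fast-math optimizations, `sum_of_ones((1u << 24) + 1000u)` returns 16777216, so the last 1000 elements do not contribute at all.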

Describe the solution you'd like
I would prefer a more robust implementation (e.g. KahanBabushkaNeumaierSum, as in the sum kernel) to be provided as a specialization for the add operator within the general reduction implementation (coalesced/strided). We already have a specialization for the add reduction operator in the strided reduction. APIs like sum and stdev should delegate to these abstractions.
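
A rough host-side sketch of the proposed structure, under stated assumptions: the `sketch` namespace, `add_op`, `max_op`, and the `reduce`/`sum` signatures are hypothetical and do not mirror raft's actual API or kernels. It only shows the idea of a generic reduction with a dedicated path for the add operator that uses Kahan-Babushka-Neumaier compensated summation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

namespace sketch {

// Hypothetical operator tags (illustrative names, not raft's).
struct add_op {
  template <typename T>
  T operator()(T a, T b) const { return a + b; }
};
struct max_op {
  template <typename T>
  T operator()(T a, T b) const { return a > b ? a : b; }
};

// Generic path: fold an arbitrary binary operator over the input.
template <typename T, typename Op>
T reduce(const T* data, std::size_t n, T init, Op op)
{
  assert(data != nullptr || n == 0);
  T acc = init;
  for (std::size_t i = 0; i < n; ++i) { acc = op(acc, data[i]); }
  return acc;
}

// Overload chosen when the operator is add_op: Kahan-Babushka-Neumaier
// summation, intended for floating-point T.
template <typename T>
T reduce(const T* data, std::size_t n, T init, add_op)
{
  assert(data != nullptr || n == 0);
  T total = init;
  T comp  = T(0);  // accumulates the low-order bits lost in each addition
  for (std::size_t i = 0; i < n; ++i) {
    const T x = data[i];
    const T t = total + x;
    // Recover the rounding error from whichever operand was smaller.
    comp += (std::abs(total) >= std::abs(x)) ? (total - t) + x : (x - t) + total;
    total = t;
  }
  return total + comp;
}

// API-level function that delegates to the shared reduction.
template <typename T>
T sum(const T* data, std::size_t n)
{
  return reduce(data, n, T(0), add_op{});
}

}  // namespace sketch
```

For the double-precision input `{1.0, 1e100, 1.0, -1e100}`, `sketch::sum` returns 2.0, whereas folding a plain `a + b` lambda through the generic path returns 0.0. A GPU implementation would additionally have to combine per-thread `(total, comp)` pairs, which this sketch leaves out.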

Additional context
See #2205 for the initial bug report.

@mfoerste4 mfoerste4 added the feature request New feature or request label Jun 25, 2024
@mfoerste4 mfoerste4 self-assigned this Jul 8, 2024
rapids-bot bot pushed a commit that referenced this issue Jul 24, 2024
This PR consists of multiple parts:

1. Redirect custom reduction kernels within the `stats` namespace to `linalg::reduce`
2. Specialize reduction kernels for addition using the _Kahan-Babushka-Neumaier sum_ ([link](https://en.wikipedia.org/wiki/Kahan_summation_algorithm))
3. Slightly adjust kernel heuristics for coalesced reductions

This should address #2366 and #2205. With the kernel heuristics adjusted, the maximum performance drop is 4%.
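
The first item in the list above, routing statistics through one shared reduction, could look roughly like the following host-side sketch. Everything here is an assumption made for illustration: the `sketch` namespace, the `reduce` signature with per-element, combining, and finalizing operations, and the `mean`/`variance` helpers are hypothetical and are not the actual `stats` or `linalg::reduce` code.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {

// Hypothetical stand-in for a shared reduction with three hooks:
// main_op transforms each element, reduce_op combines, final_op post-processes.
template <typename T, typename MainOp, typename ReduceOp, typename FinalOp>
T reduce(const T* data, std::size_t n, T init,
         MainOp main_op, ReduceOp reduce_op, FinalOp final_op)
{
  assert(data != nullptr || n == 0);
  T acc = init;
  for (std::size_t i = 0; i < n; ++i) { acc = reduce_op(acc, main_op(data[i])); }
  return final_op(acc);
}

// Mean expressed through the shared reduction instead of a custom loop.
template <typename T>
T mean(const T* data, std::size_t n)
{
  assert(n > 0);
  return reduce(data, n, T(0),
                [](T x) { return x; },
                [](T a, T b) { return a + b; },
                [n](T s) { return s / static_cast<T>(n); });
}

// Sample variance (divides by n - 1), also expressed through the reduction.
template <typename T>
T variance(const T* data, std::size_t n)
{
  assert(n > 1);
  const T mu = mean(data, n);
  return reduce(data, n, T(0),
                [mu](T x) { return (x - mu) * (x - mu); },
                [](T a, T b) { return a + b; },
                [n](T s) { return s / static_cast<T>(n - 1); });
}

}  // namespace sketch
```

The point of this layout is that once `mean` and `variance` only call the shared reduction, a more robust addition only needs to be implemented in that one place.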

FYI, @tfeher

Authors:
  - Malte Förster (https://github.com/mfoerste4)

Approvers:
  - Tamas Bela Feher (https://github.com/tfeher)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #2381