Add configurable Pwelch scaling and improve performance #897

tmartin-gh · 2025-03-03T22:53:05Z

This PR addresses concerns in issue #887:

Configurable scaling is supported: Density and Spectrum modes, similar to Matlab or scipy.signal, as well as fused dB modes frequently needed for spectrum-analyzer-like plotting.
A custom reduction kernel is used instead of MatX sum(), which uses CUB DeviceSegmentedReduce (DSR).
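
For readers unfamiliar with the two scaling conventions, here is a minimal host-side sketch of how the scale factors are usually defined, following the scipy.signal.welch convention. The enum and function names (`ScaleMode`, `welch_scale`, `apply_scale`) are hypothetical and are not taken from the MatX headers. The sketch assumes a two-sided spectrum of a complex input, so no one-sided doubling is applied.

```cpp
#include <cmath>
#include <vector>

// Hypothetical names for illustration only; not the MatX API.
enum class ScaleMode { Spectrum, Density, SpectrumDb, DensityDb };

// Factor applied to the segment-averaged |X[k]|^2 (scipy.signal.welch convention):
//   spectrum: 1 / (sum of window)^2           -> power spectrum
//   density:  1 / (fs * sum of window^2)      -> power spectral density
double welch_scale(const std::vector<double>& window, double fs, ScaleMode mode) {
  double sum = 0.0, sum_sq = 0.0;
  for (double w : window) {
    sum += w;
    sum_sq += w * w;
  }
  const bool density =
      (mode == ScaleMode::Density || mode == ScaleMode::DensityDb);
  return density ? 1.0 / (fs * sum_sq) : 1.0 / (sum * sum);
}

// Scales one averaged bin power and, for the dB modes, converts to decibels
// in the same step (the "fused" conversion mentioned above).
double apply_scale(double mean_power, const std::vector<double>& window,
                   double fs, ScaleMode mode) {
  const double p = mean_power * welch_scale(window, fs, mode);
  const bool db =
      (mode == ScaleMode::SpectrumDb || mode == ScaleMode::DensityDb);
  return db ? 10.0 * std::log10(p) : p;
}
```

With a rectangular window of length 4 and fs = 2, the spectrum factor is 1/16 and the density factor is 1/8.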

Three reasons why the custom reduction kernel is preferred:

In the nominal case where the input signal is a memory-backed complex-valued tensor and the output power spectrum is a memory-backed real-valued tensor, the intermediate overlapping signal is a 2D tensor {batches, nfft}. This layout is optimal for the FFTs but poorly suited to CUB DSR, which would prefer the permuted layout {nfft, batches} because CUB DSR uses one block of threads to read all the 'batches' of a single FFT bin. To improve CUB performance, we'd either need to have the batched FFTs write directly to a permuted tensor (about 4x slower for the scenario considered) or permute the intermediate tensor after the FFT (an extra round trip through global memory).
CUB currently needs begin/end indexes for each data segment, the generation of which takes around 50% longer than the custom reduction (ignoring the permutation issue).
The custom reduction allows for configurable scaling, such as fusing a dB conversion.
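
To make the layout argument concrete, here is a CPU reference sketch of a reduction over the batch dimension of a row-major {batches, nfft} array, with the scaling and optional dB conversion fused into the final step. It is not the kernel from this PR; the function name `reduce_bins` and its parameters are hypothetical. One plausible GPU mapping of the same loop assigns one thread per bin `k`, so that adjacent threads read adjacent addresses `X[b * nfft + k]` on every iteration over `b`.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference; X is row-major with shape {batches, nfft}.
// Returns, per FFT bin, the batch-averaged |X|^2 times `scale`,
// optionally converted to dB in the same pass.
std::vector<float> reduce_bins(const std::vector<std::complex<float>>& X,
                               std::size_t batches, std::size_t nfft,
                               float scale, bool to_db) {
  std::vector<float> out(nfft, 0.0f);
  // The inner loop walks contiguous memory, mirroring the access pattern
  // that a one-thread-per-bin GPU mapping would see across a warp.
  for (std::size_t b = 0; b < batches; ++b) {
    for (std::size_t k = 0; k < nfft; ++k) {
      out[k] += std::norm(X[b * nfft + k]);  // std::norm gives |z|^2
    }
  }
  const float factor = scale / static_cast<float>(batches);
  for (float& v : out) {
    v *= factor;
    if (to_db) {
      v = 10.0f * std::log10(v);
    }
  }
  return out;
}
```

No per-segment begin/end offsets are needed here, because the segment for bin `k` is fully described by the stride `nfft`.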

tmartin-gh · 2025-03-03T23:00:19Z

/build

tmartin-gh · 2025-03-04T04:04:02Z

/build

tmartin-gh · 2025-03-04T17:02:42Z

/build

tmartin-gh · 2025-03-04T18:31:19Z

/build

tmartin-gh · 2025-03-04T20:20:45Z

/build

tmartin-gh · 2025-03-04T23:45:55Z

/build

tmartin-gh · 2025-03-04T23:46:03Z

/build

tmartin-gh · 2025-03-05T18:07:15Z

/build

tmartin-gh requested a review from cliffburdick March 3, 2025 22:53

Add configurable scaling modes for pwelch, using custom reduction kernel that performs better than CUB when in memory FFT bin powers are {batches, nfft}

f568d00

tmartin-gh force-pushed the issue_887 branch from 7c49418 to f568d00 Compare March 3, 2025 22:59

tmartin-gh mentioned this pull request Mar 3, 2025

[BUG] Poor pwelch performance #887

Closed

cliffburdick reviewed Mar 4, 2025

examples/pwelch.cu (outdated, resolved)

Update pwelch documentation

7491b19

cliffburdick reviewed Mar 4, 2025

include/matx/operators/pwelch.h (outdated, resolved)

cliffburdick reviewed Mar 4, 2025

include/matx/transforms/pwelch.h (outdated, resolved)

cliffburdick approved these changes Mar 4, 2025

Move nvcc-specific features behind __CUDACC__ guards and add static_asserts for signal type

9bb79f1

tmartin-gh force-pushed the issue_887 branch from 340d7d3 to 9bb79f1 Compare March 4, 2025 18:31

cliffburdick reviewed Mar 4, 2025

include/matx/kernels/pwelch.cuh (outdated, resolved)

cliffburdick approved these changes Mar 4, 2025

Cleanup

405147a

tmartin-gh changed the title ~~Add configuration Pwelch scaling and improve performance~~ Add configurable Pwelch scaling and improve performance Mar 5, 2025

tmartin-gh merged commit 4a83974 into main Mar 5, 2025
1 check passed
