Add configurable Pwelch scaling and improve performance #897

Merged
merged 4 commits into from
Mar 5, 2025

tmartin-gh
Collaborator

This PR addresses concerns in issue #887:

  • Configurable scaling is now supported: density and spectrum modes, similar to MATLAB or scipy.signal, plus fused dB modes that are frequently needed for spectrum-analyzer-style plotting.

  • A custom reduction kernel replaces MatX sum(), which relies on CUB DeviceSegmentedReduce (DSR).
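
To make the two scaling modes concrete, here is a minimal sketch of the scale factors as scipy.signal.welch defines them. The function names `welch_scale` and `to_db`, the `fs` parameter, and the `floor` guard are illustrative assumptions and are not taken from the MatX API; whether MatX normalizes in exactly this way is not stated in this PR.

```python
import math

def welch_scale(window, fs=1.0, mode="density"):
    """Factor applied to averaged |FFT|^2 values (scipy.signal.welch convention).

    Hypothetical helper for illustration only; not a MatX function.
    """
    if mode == "density":
        # Power spectral density: normalize by sample rate and window energy.
        return 1.0 / (fs * sum(w * w for w in window))
    if mode == "spectrum":
        # Power spectrum: normalize by the squared sum of the window.
        return 1.0 / (sum(window) ** 2)
    raise ValueError("mode must be 'density' or 'spectrum'")

def to_db(power, floor=1e-30):
    """Convert a linear power value to dB; `floor` (an assumption) avoids log10(0)."""
    return 10.0 * math.log10(max(power, floor))
```

A fused dB mode would apply the equivalent of `to_db` to each scaled bin as it is written out, rather than in a separate pass over the output.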

Three reasons why the custom reduction kernel is preferred:

  1. In the nominal case, where the input signal is a memory-backed complex-valued tensor and the output power spectrum is a memory-backed real-valued tensor, the intermediate overlapping signal is a 2D tensor {batches, nfft}. This layout is optimal for the FFTs but poor for CUB DSR, which would prefer the permuted layout {nfft, batches} because it uses one block of threads to read all the 'batches' of a single FFT bin. To improve CUB performance, we would need either to have the batched FFTs write directly to a permuted tensor (about 4x slower for the scenario considered) or to permute the intermediate tensor after the FFT (an extra round trip through global memory).

  2. CUB currently needs begin/end indices for each data segment, and generating them takes around 50% longer than the custom reduction itself (ignoring the permutation issue).

  3. The custom reduction allows for configurable scaling, such as fusing a dB conversion.
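
The access pattern behind these three points can be sketched on the CPU. This is not the kernel from the PR, only an illustration under stated assumptions: `reduce_bins` is a hypothetical name, the input is taken to be already-computed bin powers stored row-major as {batches, nfft}, and the `floor` guard is an added safety value. In a GPU kernel the outer loop would map to one thread per bin, so neighbouring threads read neighbouring addresses in each row and no permutation or segment index array is needed.

```python
import math

def reduce_bins(power, batches, nfft, scale=1.0, db=False, floor=1e-30):
    """Average bin powers over batches and apply scaling in the same pass.

    power: flat list in row-major {batches, nfft} order (hypothetical input).
    """
    out = [0.0] * nfft
    for k in range(nfft):          # one GPU thread per FFT bin in a real kernel
        acc = 0.0
        for b in range(batches):   # stride of nfft between consecutive reads
            acc += power[b * nfft + k]
        val = scale * acc / batches
        # Fused output conversion: linear or dB, with no extra pass over memory.
        out[k] = 10.0 * math.log10(max(val, floor)) if db else val
    return out
```

For example, `reduce_bins([1.0, 2.0, 3.0, 3.0, 4.0, 5.0], batches=2, nfft=3)` averages the two rows bin by bin and returns `[2.0, 3.0, 4.0]`.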

@tmartin-gh tmartin-gh requested a review from cliffburdick March 3, 2025 22:53
…nel that performs better than CUB when in memory FFT bin powers are {batches, nfft}
@tmartin-gh
Collaborator Author

/build

@tmartin-gh tmartin-gh changed the title Add configuration Pwelch scaling and improve performance Add configurable Pwelch scaling and improve performance Mar 5, 2025
@tmartin-gh tmartin-gh merged commit 4a83974 into main Mar 5, 2025
1 check passed