Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SOLVED: An ESP32(-S3) 3-cycle/tap fp32 FIR filter of arbitrary length and decimation (AUD-5626) #1251

Open
f4lc0n-asm opened this issue Aug 19, 2024 · 6 comments

Comments

@f4lc0n-asm
Copy link

f4lc0n-asm commented Aug 19, 2024

Hello,

these 3-cycle/tap ASM FIR routines for ESP32(-S3) are drop-in replacements for the old 4-cycle/tap ones for ESP32. Just add preprocessor directives if needed. Turn off Task WDT before running the validation tests!
Benchmarks in the format: FIR_Length: new_ASM/old_ASM/C in cyc/tap (basically the number of cycles in a FIR loop for long FIRs):

 11: 6.05/6.47/17.10 |  21: 4.61/5.24/15.36 |   51: 3.66/4.50/14.26 |  101: 3.34/4.23/13.88
201: 3.17/4.12/13.69 | 501: 3.07/4.05/13.58 | 1001: 3.04/4.03/13.54 | 2001: 3.02/4.01/13.53

Cheers!

f4lc0n

FIR_fp32_3c_per_tap_v2.0.zip (7-Zip)
Added: decimating FIR with fixed input data length and with normal FIR coefficient order
Fixed: implemented C function prototype in the assembly
Fixed: C src code leftovers

@github-actions github-actions bot changed the title SOLVED: An ESP32(-S3) 3-cycle/tap fp32 FIR filter of arbitrary length and decimation SOLVED: An ESP32(-S3) 3-cycle/tap fp32 FIR filter of arbitrary length and decimation (AUD-5626) Aug 19, 2024
@hbler99
Copy link

hbler99 commented Aug 30, 2024

Thanks for your contribution! We will test it as soon!

@f4lc0n-asm
Copy link
Author

f4lc0n-asm commented Aug 30, 2024

Well, I just had a look at the actual FP32 FIR sources on ESP-DSP - there are notable changes like modified fir_f32_t in dsps_fir.h and decimation FIR ANSI-C source dsps_fird_f32_ansi.c. Both changes for the worse - making the FIR less flexible (e.g. no adjustable output delay and no arbitrary input length). Please compare the changes with the dsps_fir.h and dsps_fird_f32_ansi.c I included in my post - they are older and preferable. I also use uint instead of int in fir_f32_t - much more logical because of what info the struct members actually store… To conclude, dmitry1945 et al. somehow overthought it…

I just use ESP-DSP as an inspiration, always producing my own code, which suits me best! :)

Should you need any changes to my code to use it in ESP-ADF, just ask here!

P.S.: Looking at the new dsps_fird_f32_ansi.c - just think what happens at https://github.com/espressif/esp-dsp/blob/b3841d696950b2591cd84c94a0494c724a9f322e/modules/fir/float/dsps_fird_f32_ansi.c#L22 when e.g. input length=10 and fir->decim=20 - *input++ loads invalid data after 10-th for-loop iteration… In order to proceed correctly, the input length must be exactly len * fir->decim or len * 20 samples in my example. This is very non-flexible!

@hbler99
Copy link

hbler99 commented Sep 2, 2024

I think the issue of fixed input length can be resolved by adding a ring buffer when processing data. You're right, everyone's own code is best suited for themselves!

@f4lc0n-asm
Copy link
Author

f4lc0n-asm commented Sep 3, 2024

The way they programmed the new decimating FIR with fixed input length, it requires the FIR coefficients to be in reverse order because of coeff_pos++ starting with the oldest member of the delay line in the FIR computation loops - see https://github.com/espressif/esp-dsp/blob/b3841d696950b2591cd84c94a0494c724a9f322e/modules/fir/float/dsps_fird_f32_ansi.c#L30
By simple tweaking of the non-decimating FIR, I created a decimating one with fixed input length, but with normal FIR coefficient order, which is much more convenient. I added it to the original post (the fird2_fp32 subdir). Take it or leave it! :)

@hbler99
Copy link

hbler99 commented Sep 4, 2024

That's great! If you want to add this code to the DSP repository, I recommend submitting another issue in ESP-DSP to let them know.

@f4lc0n-asm
Copy link
Author

When I posted the 1½cycle/tap fp32 FIRs for ESP32-S3 here, which require FIR coefficients in normal order and accept any FIR length (including prime numbers), I also posted them in ESP-DSP. But they, without explaining anything, chose their solution, which requires FIR coefficients in reverse order and the FIR length must be a multiple of 4. Then they changed the fp32 FIRs for ESP32 to require the FIR coefficients in reversed order, too. Are they afraid of my competetive offer?
As you suggested, I posted the 3cycle/tap fp32 FIRs for ESP32, which require FIR coefficients in normal order, to ESP-DSP. Let's see how they will react…

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants