Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bi-directional I2S support #2775

Merged
merged 16 commits into from
Jan 24, 2025

Conversation

relic-se
Copy link
Contributor

@relic-se relic-se commented Jan 23, 2025

I've added a bi-directional mode to the I2S library using INPUT_PULLUP as the value of direction for the constructor. Certain resources of the object have been split into input and output variants (eg: pins, flags, audio buffers, and callbacks). Some minor API changes have been made to allow separate access to these resources as needed, but there shouldn't be any breaking changes (besides a slight change in constructor pin arguments). The documentation has been updated to reflect these changes. A simple loopback example has also been included.

Notes:

  • If doing too much computation in the loop which copies data between the input and output, audible stutters will occur. This can be mitigated by adjusting buffer sizes and potentially running audio loop on the second core.
  • Twice as much memory is consumed if running in bi-directional mode due to the separate audio buffer managers for input and output.
  • If there is a better direction value than INPUT_PULLUP to use here, I'd love the input. I think it'd be best avoid creating a specific enum for this unless it could be applied to other audio libraries.

@earlephilhower
Copy link
Owner

Nice, will look at it later today. However, looks like a Python cache file made it into the PR. Can you git rm tools/__pycache__/uf2conv.cpython-313.pyc?

@relic-se
Copy link
Contributor Author

Nice, will look at it later today. However, looks like a Python cache file made it into the PR. Can you git rm tools/__pycache__/uf2conv.cpython-313.pyc?

My bad! That one definitely slipped through.

I've been doing some playing around with this update and an RP2040 with a WM8960 codec. Results are quite good so far! I've been able to process the audio signal with simple integer-based processing and get clean output. However, once I throw in any floating point logic, the output begins to distort. I'm thinking I either need to look deeper into the buffer settings or just switch over to an RP2350 board. But I don't think that consideration has much of an effect on this PR except to potentially update the example.

@Andy2No
Copy link
Contributor

Andy2No commented Jan 23, 2025

@relic-se Looks interesting. The RP2040 has a hardware integer division unit, so fixed point would be a better choice than floating point - e.g. represent a 16 bit sample as a 32 bit number, with the fractional part as the lower 16 bits. That way you can do everything with integer operations.

The main thing is to always do any division as the last operation in a calculation, where possible, to keep as much precision as possible. E.g. don't do

a = (b/c)*d  in left to right order, do 

a= b*d/c instead.

@earlephilhower
Copy link
Owner

Re: performance, there's only soft-FP on the RP2040 so anything that's doing floating point will be pretty slow. Fixed point math is your friend there, though, and it's not hard to work with 17.15 (easier saturating math) or 16.16 (easier normalizing) numbers on a 32b device. Alternatively, look at the callback routines. You'll be delivered a block of bufsize l/r samples you can iterate over all at once. There's a lot of overhead doing individual 16bit read and writes plus you have the potential for cache spillage. You also have the 2nd core sitting there, unused...it's possible to grab data from the I2S running on core 0 and push work down to core1 for processing and re-output.

But a Pico2 is only $1.00 more and gives 2x flash and ~2x performance plus has single-precision FP in HW. Choices... 😆

FWIW, when I did block operations in BackgroundAudio (instead of per-sample ones like in ESP8266Audio) I went from an unusable webradio streamer (jitter, skips, just out of CPU oomph) to something that's rock solid and has ~30% free processing power left over each loop even while serving web pages and decoding MP3.

Copy link
Owner

@earlephilhower earlephilhower left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great work and very much appreciated. Been something sitting out there since #1055 over 2 years ago, and it'll be awesome to get it into the core.

Very minor nomenclature and DWIM changes and I think we're set.

I'll have to wire up a Pico (everything's stuck on about 8 different ESP32 boards now for another library I'm writing) and make sure it doesn't break anything existing that shouldn't be broken. Visually, seems solid!

Thx again!

docs/i2s.rst Show resolved Hide resolved
docs/i2s.rst Outdated Show resolved Hide resolved
docs/i2s.rst Outdated Show resolved Hide resolved
libraries/I2S/src/pio_i2s.pio.h Outdated Show resolved Hide resolved
@@ -26,11 +26,13 @@

class I2S : public Stream, public AudioOutputBase {
public:
I2S(PinMode direction = OUTPUT, pin_size_t bclk = 26, pin_size_t data = 28, pin_size_t mclk = 25);
I2S(PinMode direction = OUTPUT, pin_size_t bclk = 26, pin_size_t data_out = 28, pin_size_t data_in = 22, pin_size_t mclk = 25);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get the need to allow multiple pins, but I think we need 2 constructors here not one because in the common, unidirectional case you'd need to put in a dummy out pin on an INPUT

I2S(PinMode direction = OUTPUT, pin_size_t bclk = 26, pin_size_t data = 28, pin_size_t mclk = 25);
and
I2S(PinMode direction, pin_size_t bclk, pin_size_t data_out, pin_size_t data_in, pin_size_t mclk = 25);

(probably need a little munging w/removing some default args to allos C++ to disambiguate which one you mean)...

The dataconstructor would assign the internal pinDIN or pinDOUT as appropriate for direction probably by calling the other constructor with one pin_size_t set to -1 or something dummy...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I'll be able to use multiple constructors in this case because of the types and defaults of the argument list. Ie:

I2S(PinMode direction, pin_size_t bclk, pin_size_t data_out, pin_size_t data_in);
I2S(PinMode direction, pin_size_t bclk, pin_size_t data, pin_size_t mclk);

Instead, I've got the order of the arguments worked out so that it's compatible with existing code. The only issue is that it will be a little awkward when initializing a bi-directional I2S bus without the need for mclk in which case you would use -1 or something similar as you alluded.

I2S(PinMode direction, pin_size_t bclk, pin_size_t data, pin_size_t mclk, pin_size_t data_rx);

libraries/I2S/src/I2S.h Outdated Show resolved Hide resolved
libraries/I2S/src/I2S.cpp Outdated Show resolved Hide resolved
libraries/I2S/src/I2S.cpp Outdated Show resolved Hide resolved
libraries/I2S/src/I2S.cpp Outdated Show resolved Hide resolved
libraries/I2S/src/I2S.cpp Outdated Show resolved Hide resolved
@relic-se
Copy link
Contributor Author

relic-se commented Jan 23, 2025

@Andy2No @earlephilhower Just a quick comment on the floating point debate. I am aware of the architectural differences between RP2040 and RP2350. In the case of the RP2040, my understanding is that pico-sdk implemented integer calculations which approximate their floating point counterparts.

When experimenting with this feature with a basic loop I2S::available() check then I2S::read16() and I2S::write16, even basic calculations were causing output underflows (stutters in the audio output). Here's an example of this:

const float volume = 0.75;
void loop() {
  int16_t l, r;
  while (i2s.available()) {
    if (!i2s.read16(&l, &r)) {
      break;
    }
    l *= volume;
    r *= volume;
    i2s.write16(l, r);
  }
}

I'm sure an RP2350 with proper FPU would fix this, but I think better buffer management might also do the trick (with size_t I2S::write(const uint8_t *buffer, size_t size)). I'll report back if I have any success there. Otherwise, I don't think it's critical to this PR.

@earlephilhower
Copy link
Owner

earlephilhower commented Jan 23, 2025

The RP2040 has some helper FP functions in ROM so they're 1-ROM read cycle access (vs. XIP cache loads) (not 1 cycle in operation!). They're supposed to be optimized for speed not accuracy. But only some functions are in ROM and it's still doing floating point in software. Each iteration of your loop has to go through an integer->double(I *think...might only be a single) conversion, a FP multiply, and a FP->integer conversion to go back to int16_t (x2 for stereo). You might have enough CPU on average, but I bet there's enough jitter due to USB IRQs, XIP cache misses, etc., that you're missing some deadlines.

A pure integer version would be (top of my head so sorry for typos)

const float volume = 0.75;
const int32_t sf = (1<<16 ) * volume;
void loop() {
  int16_t l, r;
  while (i2s.available()) {
    if (!i2s.read16(&l, &r)) {
      break;
    }
    int32_t  ll = l;
    ll *= sf;
    l = ll >> 16;
    int32_t rr = r;
    rr *= sf;
    r = rr >> 16;
    i2s.write16(l, r);
  }
}

Doing a read(buffer, count) <process-a-bunch-in-situ> write(buffer,count) instead of 1-sample at a time would save a lot of overhead, too, in here and in the ABM.

@relic-se
Copy link
Contributor Author

Thanks for the insight! I'll definitely give some of that a whirl.

Copy link
Owner

@earlephilhower earlephilhower left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM now! Good compromise with the constructor. I've done some simple tests with other libraries that use I2S output and not seen any compile or function issues so I'm happy to get this merged it. Thx again!

@earlephilhower earlephilhower merged commit e133147 into earlephilhower:master Jan 24, 2025
26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants