-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
periph_common/init.c: unblock i2c bus #20787
base: master
Are you sure you want to change the base?
periph_common/init.c: unblock i2c bus #20787
Conversation
Both the already upstream ESP soft I2C bus as well as my PR for a soft I2C bus for based on GPIO LL have a recovery function to unblock the bus. They detect a locked bus at and recover it during normal operation, failing only the transfer that was affected by the issue. IMO this is the better approach, as a bus can lock up during normal operation as well when the peripheral misses a clock cycle or the controller misses clock stretching. IMO if a |
drivers/periph_common/init.c
Outdated
* we do not aim to mitigate here. | ||
*/ | ||
gpio_init(i2c_config[dev].sda_pin, GPIO_IN); | ||
gpio_init(i2c_config[dev].scl_pin, GPIO_OUT); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm afraid this will not work on all platforms as sda_pin
/scl_pin
is not used everywhere.
You can use i2c_pin_sda()
/i2c_pin_scl()
instead, but this requires the periph_i2c_reconfigure
feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice addition !
drivers/periph_common/init.c
Outdated
@@ -73,6 +122,7 @@ void periph_init(void) | |||
/* initialize configured I2C devices */ | |||
#ifdef MODULE_PERIPH_INIT_I2C | |||
for (unsigned i = 0; i < I2C_NUMOF; i++) { | |||
_i2c_sync_devices(i); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not move this into i2c_init() ?
There may be weird cases where users manually initialize their i2C peripherals without relying on auto_init
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True. I'll check that. It will however require touching the i2c_init files of all cpus. Maybe it makes more sense to include it in this auto_init but also provide it in the form of a function whenever periph_i2c_reconfigure is available so that people can deliberately use it at any time if they want to manually set up things?
drivers/periph_common/init.c
Outdated
" recovering...\n", dev); | ||
} | ||
else { | ||
DEBUG("periph_init: i2c bus #%u is ok\n", dev); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the bus is "OK" Aren't we suppose to skip the next clock ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The clock is intended to be skipped for the gpio_read returning 1 in the while loop below. This section is only for purposes of debug output.
Maybe I am not getting correctly what you refer to as "next clock".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I didn't get my morning coffee yet...
I mean "the next BLOCK".
If the bus is OK, couldn't we skip entirely the whole loop that comes after ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's step back a bit and reconsider the approach. I think this is motivated by a broken SAM I2C driver or broken SAM I2C peripheral.
Other I2C peripherals can detect a locked bus or at least have a timeout, that allows to unblock a blocked bus when it happens.
Let's not add a work around needed for broken SAM hardware to common code.
@benpicco Where should I properly document this added behavior? |
The whole idea can fit in a dedicated pseudomodule if we don't want to impact all MCUs. |
@maribu Just to verify that we're talking about the same thing here: It is not the SAM I2C peripheral which is broken here. The state machine of the remote EEPROM is locked in a state where it pulls the bus low. But if there are some i2c peripherals for other MCUs which automatically recover from this behavior, I agree with your comment. Where would you put it then? We could add it as a pseudo-module and put it in the dependencies of selected CPUs and additionally make it available as a function if |
That can happen with every peripheral I2C device, if it missing a clock transition. That most likely happens when the MCU is resetting in the middle of a transfer, but also at any point in time. As a result, the MCU controller must check for a locked bus and recover. That may look like an arbitration lost event at first, but if SDA stays low, the bus is locked. So: Either the I2C peripheral needs to monitor SDA after an arbitration lost even, or the driver should. This cannot be done in common code. |
Looking at it, we never check the |
I will rewrite this PR and include the unblocking into the driver in the sam0_common driver. As @benpicco indicates, the buserror interrupt/flag is present but currently not used. |
c06da25
to
a1c53d9
Compare
I brought the unlock mechanism to Touching the i2c driver I noticed that part of the implementation was not in sync with the register description of the data sheet. In the current state of the peripheral, the implementation seemed to make sense after examining the registers on the device. I still chose to alter it in UPDATE: for testing, please note that I also altered the instructions in the description of this PR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a quick peek, this looks good to me. I dismissed my prior review, as my concerns have been addressed.
cpu/sam0_common/periph/i2c.c
Outdated
DEBUG_PUTS("i2c.c: STATUS_ERR_TIMEOUT"); | ||
unblock_bus = true; | ||
} | ||
else if (dev->STATUS.bit.BUSSTATE == BUSSTATE_BUSY) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do like this verbose checking for readability.
I haven't checked, but I'm almost certain the compiler would under normal circumstances optimizes the more verbose code quite well, so that there is no performance penalty compared to less readable and more dense C code. However, these are not normal circumstances because SercomI2cm::STATUS
is volatile
and the compiler will be forces to re-load the memory on every access and not combine checks.
How about just reading out the status register into a local variable and check on that? I assume that this would yield optimal or near-optimal machine code. In addition, @dylad will be happy if we cease using the bitfield types (SERCOM_I2CM_STATUS_Type
), as the CMSIS header files provided by Microchip are no longer generated with bitfield types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to address your comments in my last commit. However, I also read the data sheet (same54/dt51 and saml21) more thoroughly and found a section on address transmission. I reorganized a few of the checks to adhere to that part. The original implementation on master does not completely conform to that part of the data sheet.
cpu/sam0_common/periph/i2c.c
Outdated
bool any_error = dev->STATUS.reg & ( | ||
SERCOM_I2CM_STATUS_SEXTTOUT /* XXX: do we need this? */ | | ||
SERCOM_I2CM_STATUS_ARBLOST | SERCOM_I2CM_STATUS_BUSERR | ||
/* not included as always active (hardware bug?) in the | ||
* current configuration: | ||
* SERCOM_I2CM_STATUS_MEXTTOUT, SERCOM_I2CM_STATUS_LOWTOUT | ||
* | ||
* not included as only useful with DMA: | ||
* SERCOM_I2CM_STATUS_LENERR | ||
*/ | ||
) || ( | ||
/* bus state should not be unknown at this point */ | ||
(dev->STATUS.reg & SERCOM_I2CM_STATUS_BUSSTATE_Msk) == BUSSTATE_UNKNOWN | ||
) || ( | ||
/* we don't handle multi-master layouts and treat this as error */ | ||
(dev->STATUS.reg & SERCOM_I2CM_STATUS_BUSSTATE_Msk) == BUSSTATE_BUSY | ||
) || | ||
( | ||
/* slave busy or address not on bus */ | ||
dev->STATUS.reg & SERCOM_I2CM_STATUS_RXNACK | ||
) || res; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Due to dev->STATUS
being volatile
, this will force the compiler to re-load that register for every check. Saving that into a temporary variable would allow the compiler to allocate this in a register and keep accessing that register for the bit checks without re-loading the register.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my last commit I copy the register for read access.
If this fixes the bug for you now, please squash! |
When back at home, I could check if I can integrate a test for a bus lock into the peripheral selftest using the Arduino shield. For |
EDITED on 2024-08-13 (changed description)
EDITED on 2024-08-27 (patch file upload for testing)
Contribution description
If a transaction is interrupted an external i2c device can remain in an unwanted state where it drives the bus to low
and will not release it without action. This can happen whenever a MCU has a software reset but no power cycle
is done on any part of the board.
This PR suggests a bus recovery for the i2c peripheral of sam0_common, during
i2c_init
or `_i2c_start.The basic mechanism for unblocking the bus is to "clock" it free until the state machine of the
corresponding device goes to idle and releases the bus. According to this reference (Section "2-wire software reset"),
a maximum of 9 clock cycles should be able to do this. The additional start and stop conditions might not be
required as other sources like this one
do not mention it. As they interfere with the error condition where a device has control over SDA by pulling it low,
this PR chooses only to use SDA as an input and intends to make a blocking i2c device release the bus.
Testing procedure
Reproduction was possible using a SAME54-XPRO, using
tests/drivers/at24cxxx
with the following changes applied:i2c_unblock_test.zip (patch file in archive due to upload restrictions)
With the test patch applied, the program will reset during a write transaction on the EEPROM. For testing the recovery during initialization has been removed by the patch and console output activated to initially report on the bus state before initialization. Recovery will still be executed after error detection in the first call of _i2c_start. For this reason the first test in the output below will fail for the first test. Activating recovery during initialization will make all tests pass at all times. Enabling more debug output will make it improbable to actually reproduce the error. This test has been tailored to one board and the current software changes.
The output after a POR can be found below. In the first iteration the bus is not blocked, in following iterations it is blocked after reset. Executing the tests requires a console input of
s<Enter>
to avoid an endless loop.Issues/PRs references
None.