Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Assertion fail on RP2040 with heap_4 #1187

Closed
ldursw opened this issue Nov 11, 2024 · 6 comments
Closed

[BUG] Assertion fail on RP2040 with heap_4 #1187

ldursw opened this issue Nov 11, 2024 · 6 comments
Labels
question Further information is requested

Comments

@ldursw
Copy link

ldursw commented Nov 11, 2024

When using heap_4 and pvPortMalloc is called from a critical section, an assertion fails.

pvPortMalloc calls vTaskSuspendAll here which in turn asserts that it is not in a critical section here.

The Pull Request #982 says

As an added bonus, it also allows the SMP implementation of vTaskSuspendAll() to be simplified (as we can assume the vTaskSuspendAll() is now never called in a critical section). As such, an extra assert has been added to vTaskSuspendAll() to ensure that it is never called in a critical section.

which may have caused the issue with pvPortMalloc.

I don't see why a memory allocation can't happen in a critical section. So I reverted part of the PR and it seems to work just fine.

@@ -3853,9 +3853,6 @@ void vTaskSuspendAll( void )
              * uxSchedulerSuspended since that will prevent context switches. */
             ulState = portSET_INTERRUPT_MASK();
 
-            /* This must never be called from inside a critical section. */
-            configASSERT( portGET_CRITICAL_NESTING_COUNT() == 0 );
-
             /* portSOFRWARE_BARRIER() is only implemented for emulated/simulated ports that
              * do not otherwise exhibit real time behaviour. */
             portSOFTWARE_BARRIER();
@@ -3867,7 +3864,14 @@ void vTaskSuspendAll( void )
              * it. */
             if( uxSchedulerSuspended == 0U )
             {
-                prvCheckForRunStateChange();
+                if( portGET_CRITICAL_NESTING_COUNT() == 0U )
+                {
+                    prvCheckForRunStateChange();
+                }
+                else
+                {
+                    mtCOVERAGE_TEST_MARKER();
+                }
             }
             else
             {

Target

  • Development board: Raspberry Pi Pico
  • Instruction Set Architecture: Dual Core ARM Cortex M0+
  • IDE and version: Visual Studio Code with PlatformIO
  • Toolchain and version: arm-none-eabi (xPack GNU Arm Embedded GCC x86_64) 12.3.1 20230626

To Reproduce
Configure a project with heap_4 and try to allocate memory in a critical section.

vTaskEnterCritical();
void *ptr = pvPortMalloc(16); // will assert here
vPortFree(ptr);
vTaskExitCritical();
@ldursw ldursw added the bug Something isn't working label Nov 11, 2024
@aggarg
Copy link
Member

aggarg commented Nov 11, 2024

I don't see why a memory allocation can't happen in a critical section.

The reason we do not do that is because "we do not perform non-deterministic operations in critical sections". Memory allocation involves walking the list of free blocks which is a non-deterministic operation as the length of the list varies.

Critical sections are implemented by masking interrupts which affects device's responsiveness if interrupts are masked for extended periods. We aim to keep the critical sections short and deterministic to not affect the device's responsiveness adversely.

Is this your application code or the kernel code that is attempting a memory allocation inside a critical section? If it is your application code, what problem are you trying to solve by doing memory allocation in critical section.

@aggarg aggarg added question Further information is requested and removed bug Something isn't working labels Nov 11, 2024
@ldursw
Copy link
Author

ldursw commented Nov 11, 2024

I hit the issue with the LVGL library in this function.

The call stack goes like this: lv_init -> lv_os_init -> lv_mutex_init -> prvCheckMutexInit (enters critical section) -> prvMutexInit -> xSemaphoreCreateRecursiveMutex -> xQueueCreateMutex -> xQueueGenericCreate -> pvPortMalloc (fails assertion).

By default the framework uses heap_3 and a critical section is needed because the RP2040 is a dual core microcontroller.

https://github.com/earlephilhower/arduino-pico/blob/6c8d62fdb80b3ce3e7205b29fc1be83e9ed6c983/cores/rp2040/malloc-lock.cpp#L42-L47

When I tried to switch from heap_3 to heap_4 I found this issue about critical section.

@aggarg
Copy link
Member

aggarg commented Nov 11, 2024

This actually is an issue in the LVGL - this critical section should be changed to suspend scheduler. The comment above this line says the following -

/* Mutex initialization must be in a critical section to prevent two threads
 * from initializing it at the same time. */

If the intention is to prevent another thread and not ISR (as the comment says), then the code should be changed to the following:

static void prvCheckMutexInit(lv_mutex_t * pxMutex)
{
    /* Check if the mutex needs to be initialized. */
    if(pxMutex->xIsInitialized == pdFALSE) {
        /* Mutex initialization must be in a critical section to prevent two threads
         * from initializing it at the same time. */
        vTaskSuspendAll();

        /* Check again that the mutex is still uninitialized, i.e. it wasn't
         * initialized while this function was waiting to enter the critical
         * section. */
        if(pxMutex->xIsInitialized == pdFALSE) {
            prvMutexInit(pxMutex);
        }

        /* Exit the critical section. */
        xTaskResumeAll();
    }
}

@ldursw
Copy link
Author

ldursw commented Nov 11, 2024

prvCheckMutexInit can also be called from ISR context here.

Thanks for your help so far. I'm not sure how to fix this issue.

@aggarg
Copy link
Member

aggarg commented Nov 11, 2024

Calling malloc from ISR is not a good idea. The correct way to fix it would be to fix the LVGL code. There are several options:

  1. Create mutex before creating any thread or starting interrupts. This will remove the need of prvCheckMutexInit altogether.
  2. OR Ensure that ISR is not called until the mutex is initialized and then prvCheckMutexInit (and hence malloc) wont be needed in ISR.
  3. OR Use static creation API so that malloc is not needed at all.

I do not know which option would be best for LVGL - LVGL maintainers can decide about that. In the short term, you can change heap_4 to use critical section instead of scheduler suspension. As I explained before, this masks interrupts for longer periods and therefore, may adversely affect responsiveness of your application.

@ldursw
Copy link
Author

ldursw commented Nov 11, 2024

Thanks a lot. I'll apply these suggestions to my application.

@ldursw ldursw closed this as completed Nov 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants