Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Core: add platform abort handler + stm32 SERC usage #7253

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

GseoC
Copy link
Contributor

@GseoC GseoC commented Jan 30, 2025

Implement a platform abort handler to handle use cases such as data aborts generated by peripherals on the bus.

These kinds of abort could be caused by platform-specific features, hence the weak function.

@@ -555,6 +560,8 @@ void abort_handler(uint32_t abort_type, struct thread_abort_regs *regs)

switch (get_fault_type(&ai)) {
case FAULT_TYPE_IGNORE:
/* Allow platform-specific handling */
plat_abort_handler(regs);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not so keen on adding hooks that the compiler can't optimize away if unused.
How about using a config variable so the function call can be optimized away if not needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok for using a config switch, how about CFG_EXTERNAL_ABORT_PLAT_HANDLER,

/* Print abort info + stack dump to the console */
void abort_print_error(struct abort_info *ai);

void abort_handler(uint32_t abort_type, struct thread_abort_regs *regs);

/* Platform overload, should be implemented in platform code */
void plat_abort_handler(struct thread_abort_regs *regs);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it only external aborts we care about here? If so, how about plat_external_abort_handler()?
This function should probably take a struct abort_info *ai argument instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is about external aborts. Maybe we can even restrain the call to platform code on external aborts that are not table walks or table updates?
I'm fine with your suggestions

@@ -555,6 +560,8 @@ void abort_handler(uint32_t abort_type, struct thread_abort_regs *regs)

switch (get_fault_type(&ai)) {
case FAULT_TYPE_IGNORE:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add a FAULT_TYPE_EXTERNAL_ABORT if we need special treatment for external aborts.

Copy link
Contributor Author

@GseoC GseoC Jan 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok so that type of fault should be returned when encountering CORE_MMU_FAULT_ASYNC_EXTERNAL if asynchronous and create another CORE_MMU_FAULT_SYNC_EXTERNAL then?

I think we also miss the synchronous external abort cases, looking at the documentation for AARCH64: "Synchronous External abort, not on translation tablewalk or hardware update of translation table." is 0x10 in FSR.

So 0x10 and 0x12->0x17 are all external aborts. We miss these descriptions in core/arch/arm/include/arm64.h

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I suppose it can be useful to tell synchronous and asynchronous extern aborts apart.

@GseoC
Copy link
Contributor Author

GseoC commented Jan 31, 2025

Updated with comments addressed.

I'm having thoughts on the sharing of bindings between AARCH64 ESR_FSC_x and AARCH32 DFSR because it's supposed to to be architecturally mapped. Please let me know what you think of it as it is now.

void __weak plat_abort_handler(struct abort_info *ai __unused)
{
/* Do nothing */
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this weak function needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For whatever reason, if the switch is enabled and the function is not defined? Given that the switch is intended for this purpose, I think there is indeed no need for this weak function.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the switch is enabled, the platform should implement the handler. Having a build error is a nice way to trigger that it's missing IMO.

/* Print abort info + stack dump to the console */
void abort_print_error(struct abort_info *ai);

void abort_handler(uint32_t abort_type, struct thread_abort_regs *regs);

/* Platform overload, should be implemented in platform code */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explicitly mention CFG_EXTERNAL_ABORT_PLAT_HANDLER must be enabled to have the platform handler be called. I think the function label should explicitly mention _external_abort_ if called only for external aborts.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, will change to that

@GseoC GseoC changed the title (RFC)Core: add platform abort handler + stm32 SERC usage Core: add platform abort handler + stm32 SERC usage Feb 7, 2025
@GseoC
Copy link
Contributor Author

GseoC commented Feb 7, 2025

Removed "RFC" from the P-R title, comments addressed

/* Print abort info + stack dump to the console */
void abort_print_error(struct abort_info *ai);

void abort_handler(uint32_t abort_type, struct thread_abort_regs *regs);

/*
* Platform overload, should be implemented in platform code.
* CFG_EXTERNAL_ABORT_PLAT_HANDLER must be set to have the platform handler
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"overload" is not applicable since there is no weak implementation. I suggest to rephrase to
"Platform specific handler for external abort exceptions.".

Prefer "enable" over "set" at line 38.

@@ -1444,6 +1444,8 @@ enum core_mmu_fault core_mmu_get_fault_type(uint32_t fault_descr)
assert(fault_descr & FSR_LPAE);

switch (fault_descr & FSR_STATUS_MASK) {
case 0x8: /* b01000 Asynchronous extern abort */
return CORE_MMU_FAULT_ASYNC_EXTERNAL;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm right, case 0x8 (0b010000) is a synchronous external abort.
while case 0x9 (0b010001) is an asynchronous external abort.

By the way, I think the debug event is wrong. Debug event is indeed 0b100010 but that's 0x22 not 0x12. Will deserve a specific commit (outside the scope of this P-R).

Copy link
Contributor Author

@GseoC GseoC Feb 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is indeed an error on the value here, it is 0x10 for synchronous external abort.
Extract from ARM A32 documentation when TTCBR.EAE == 1:

0b000000 Address size fault in translation table base register.
0b000001 Address size fault, level 1.
0b000010 Address size fault, level 2.
0b000011 Address size fault, level 3.
0b000101 Translation fault, level 1
0b000110 Translation fault, level 2.
0b000111 Translation fault, level 3.
0b001001 Access flag fault, level 1.
0b001010 Access flag fault, level 2.
0b001011 Access flag fault, level 3.
0b001101 Permission fault, level 1.
0b001110 Permission fault, level 2.
0b001111 Permission fault, level 3.
0b010000 Synchronous External abort, not on translation table walk.
0b010001 Asynchronous SError exception.
0b010101 Synchronous External abort on translation table walk,level 1.
0b010110 Synchronous External abort on translation table walk,level 2.
0b010111 Synchronous External abort on translation table walk,level 3.
0b011000 Synchronous parity or ECC error on memory access,not on translation table walk. FEAT_RAS is not implemented
0b011001 Asynchronous SError exception, from a parity or ECC error on memory access. FEAT_RAS is not implemented
0b011101 Synchronous parity or ECC error on memory access on translation table walk, level 1. FEAT_RAS is not implemented
0b011110 Synchronous parity or ECC error on memory access on translation table walk, level 2. FEAT_RAS is not implemented
0b011111 Synchronous parity or ECC error on memory access on translation table walk, level 3. FEAT_RAS is not implemented
0b100001 Alignment fault. 0b100010 Debug exception.
0b110000 TLB conflict abort.
0b110100 IMPLEMENTATION DEFINED fault (Lockdown).
0b110101 IMPLEMENTATION DEFINED fault (Unsupported Exclusive access).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will also add a patch to fix the debug event part

@@ -207,6 +207,10 @@ endif
# this in platform config if different.
CFG_MAX_CACHE_LINE_SHIFT ?= 6

# CFG_EXTERNAL_ABORT_PLAT_HANDLER is used to implement platform-specific
# handling of external abort using the plat_external_abort_handler() function.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to replace "using" with "implementing".


void plat_external_abort_handler(struct abort_info *ai __unused)
{
/* If fault is ignored, it could be due to a SERC event */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer /* External abort may be due to SERC events */.

Platforms may have specific code to handle an abort  when fault type
is FAULT_TYPE_IGNORE. Add plat_abort_handler() that can be overridden
at platform level.

Signed-off-by: Gatien Chevallier <[email protected]>
When a data abort occurs and its fault type is FAULT_TYPE_IGNORE, it
may be an abort generated by the SERC hardware block. Check if a
SERC Illegal Access was caught and print the SERC register and panic()
if that is the case.

Signed-off-by: Gatien Chevallier <[email protected]>
Differentiate Async/Sync external aborts, create a new fault type and
introduce CFG_EXTERNAL_ABORT_PLAT_HANDLER switch.

Signed-off-by: Gatien Chevallier <[email protected]>
Synchronize with the updates of the other patch.

Signed-off-by: Gatien Chevallier <[email protected]>
Type fixes and rename to plat_external_abort_handler

Signed-off-by: Gatien Chevallier <[email protected]>
Copyright fix and use renamed function.

Signed-off-by: Gatien Chevallier <[email protected]>
Fix ARM32 LPAE values for external aborts and add some for table walks.
Rework some comments.

Signed-off-by: Gatien Chevallier <[email protected]>
Rework comment

Signed-off-by: Gatien Chevallier <[email protected]>
Fix use case where stm32_serc_handle_ilac() is called before SERC
driver is probed. Nothing to do in this case.

Signed-off-by: Gatien Chevallier <[email protected]>
@GseoC
Copy link
Contributor Author

GseoC commented Feb 17, 2025

Comments addressed and added core: fix debug event fault value for ARM32 with LPAE.

I can squash the commits if necessary

According to ARM documentation, the debug event fault value is indeed
0b100010, which is 0x22, not 0x12. Fix this value in
core_mmu_get_fault_type().

Fixes: 0eff3e9 ("arm32: Adds LPAE support")
Signed-off-by: Gatien Chevallier <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants