-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
H7 dual core OTA fix #2966
H7 dual core OTA fix #2966
Conversation
I think we should partition flash differently, in order not to mess with the M4 and be able to update the M7. |
src/ota_stm32h7.c
Outdated
@@ -35,6 +35,9 @@ static struct mg_flash s_mg_flash_stm32h7 = { | |||
#define FLASH_OPTSR_PRG 0x20 | |||
#define FLASH_SIZE_REG 0x1ff1e880 | |||
|
|||
#define STM_DBGMCU_IDCODE 0x5C001000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this register present in all H7s ? We might break other OTAs by triggering a HardFault on invalid memory access
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I checked that. It appears to be on all H7s.
src/ota_stm32h7.c
Outdated
if (STM_DEV_ID == 0x450) { | ||
// H745/H755 and H747/H757 are running on dual core | ||
// Disable CM4 to prevent the old and the new firmwares from booting | ||
// simultaneously on each core. | ||
MG_SET_BITS(MG_REG(bank + FLASH_OPTSR_PRG), MG_BIT(22), 0); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a quick workaround, not a fix...
The user bought a dual-core chip for a reason, and we are rendering the second core unusable after first upgrade.
The problem with dual-core OTA is that any core firmware might want to be updated.
IMO, at a minimum, we should be able to update the core we are running in without disabling the other core.
Ideally, we should be able to update both cores, but I guess that would require a different schema, as each core has its own image.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, so in this case we could repartition the flash, like you mentioned. How about using only the first bank (only for these dual core boards)? We'll partition this instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So they designed a dual-flash-bank chip, where banks can be swapped, and assigned one bank for one core and the other bank for the other core, effectively rendering the swap feature unusable...
Can we change core base addresses ? If so, how would that work with Cube ? How will users flash their devices ?
Otherwise, we can only work with the portion that belongs to our core, or with both at the same time. The former is the way we've been doing it, the latter requires merging two images into one uploadable chunk; I'd rather avoid that.
One other issue: if we tinker with the M4 memory space, we need to stop that core before doing so, that is, probably the easiest path is to limit to the core we are running in.
When and if a customer request updating the M4, we'll devise a schema to do it; be it a single image or a selector for the core to be updated. For now, IMO, stick to the simplest way to update the M7 that coexists with what Cube will generate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To answer your questions,
Yes, we can change the base addresses for each core. There are some option bytes which control this and modifying the values seems to be persistent across reboots. I tried to figure out how to do it in Cube, but I can't find such an option. Cube seems to generate code linked at the default addresses for each core. (CM4 code is linked at 0x81.... and CM7 at 0x80...). You can change these option bytes though in STM32CubeProgrammer, but I think that is irrelevant for now.
For now, I don't think we should deal with M4 memory space. As long as we use the first bank only for OTA, everything should be fine, since we are updating only M7.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, if it can't be done in a way that is transparent to the user (does not require users to be aware of it and go changing stuff or programming in a different way), I think we'd better stick to a single-bank approach. That has already been done and it only requires thinking flash is half the size and there is no swap, so we manually overwrite the lower portion of what is physical bank 1, and don't mess with bank 2, that belongs to the M4.
Is that correct ?
That will let us easily handle updates for the M4 when and if a customer requests that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly! We'll treat it as a single bank scenario by assuming the 2nd bank does not exist.
src/ota_stm32h7.c
Outdated
@@ -121,6 +124,12 @@ MG_IRAM static bool mg_stm32h7_swap(void) { | |||
flash_clear_err(bank); | |||
// printf("OPTSR_PRG 1 %#lx\n", FLASH->OPTSR_PRG); | |||
MG_SET_BITS(MG_REG(bank + FLASH_OPTSR_PRG), MG_BIT(31), desired); | |||
if (STM_DEV_ID == 0x450) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like H745/H755 and H747/H757 aren't the only ones that use this ID. H743 uses 0x450 as well. I did not expect that, since H743 has a different RM. They shouldn't have allowed to have the same ID, it complicates things. How are we supposed to distinguish between a H743 and a H745 at runtime??
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can always do it at compile time, but let me dive into ST waters and let's see what I can find.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know. It would be cleaner to do it at runtime though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like single-core chips are in fact dual-core chips where the M4 has been disabled. This means that everything is the same, but probably we can check the BCM bits in the OPTION bytes.
Whether the chip is a single-core or a dual-core one, I think it is safe to say that if the CM4 is gated (no clock, BCM4 = 0 ), we can assume all flash is available.
WDYT ?
745 RM 13.3.14 SYSCFG_UR1
742 RM 12.3.12 shows this register is not present, notice the next register is UR2. The less-significant 16-bit portion of UR4 is reserved.
Worst case: we get a HardFault when accessing SYSCFG_UR1, this is an invasive way to detect a single core...
Best case: the option byte is readable
I don't see any other option if 743s really are 745s with the M4 disabled
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, but unfortunately UR1 reads as 0x10001 on both H743 and H745/H755... So no hard faults, but instead it returns the wrong value. It should have been 0x10000 instead.
I guess we'll have to decide at compile time if it's dual core or not. I did not find any other options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So 743s are in fact 745 chips with either a "broken" or removed M4 core...
The other options I can come up with, require collaboration between cores and a timeout... a big no no no...
Let's add a #define for all dual cores. Does this very same situation replicate with the other dual core families we handle ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's the same for all dual core boards we have (H745/H755 and H747/H757). They are all under the same reference manual, anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, I would add MG_OTA_H7DUALCORE or something like that (give or take one or two underscores)
000eb0a
to
665298a
Compare
I encountered an issue while testing that prevented OTA from working properly on dual core H7 devices (H755 and H745).
These devices normally boot from both cores. CM4 requires by default a firmware at address 0x08100000 and CM7 requires a firmware at address 0x08000000.
The issue is we are using address 0x08100000 to store the new firmware, which means that after the next reset, both cores will start executing their corresponding firmwares. I noticed this always leads to a crash and the board gets bricked after this...
I put up a fix to disable CM4 from executing before resetting.