-
Notifications
You must be signed in to change notification settings - Fork 6.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ZMS power cut corner case causing cycle count to wrap around and sector to be interpreted as a closed one #84874
Comments
Please use our bug template when reporting bugs. You need to edit this issue to include the information requested in https://github.com/zephyrproject-rtos/zephyr/blob/main/.github/ISSUE_TEMPLATE/001_bug_report.md |
Hi @Emill thank you for reporting this corner case bug. I will have a look at your proposal to fix this very soon |
Thanks for taking a look into this! Please also see my line comment at d4e246d#diff-bdbbc59aff22840e2b494ff2b05c3c2d0521617f259e84ead786a33e3f001f78R1229 for subsys/fs/zms/zms.c at line 1229. If you want to I can file a new separate issue for this. |
That one should be fixed as well. thanks |
1 similar comment
That one should be fixed as well. thanks |
It appears to be the same issue here: Line 1048 in e338657
|
Describe the bug
Consider a ZMS fs with > 2 sectors.
Assume sectors 0, 1 are closed, 2 is open and 3 is empty.
If sector 2 gets closed, garbage collection is initiated, moving items from sector 0 to sector 3. Assume a power cut happens during the move of items.
During re-initialization at the next boot, GC will be restarted. Before items are starting to be moved, the cycle count of sector 2 is incremented modulo 256.
Assume power cuts continue to happen during the GC is in progress 256 times in a row. At that time, the cycle count will eventually have wrapped around and now be equal to the cycle count of the close ATE, which now indicates that this is a valid closed sector instead of an empty one. That means all sectors will be marked as closed the next time the ZMS is initialized and thus cause corruption.
Suggested fix: in
zms_add_empty_ate
, make sure that the cycle counter that is to be written is not equal to whatever value is currently in the close ATE slot.In fact, I don't really see right now why 256 different values are needed. It seems that 2 values (0 or 1) would be enough, and every
zms_add_empty_ate
could simply set to the cycle counter to the opposite of the one present in the close ATE.To Reproduce
Initialize a system as above, set breakpoints in the flash_write function to be able to easily cut power when appropriate and perform the mentioned steps.
Expected behavior
Memory should not become corrupt.
Impact
The memory becomes corrupt.
Logs and console output
N/A
Environment (please complete the following information):
Additional context
N/A
@rghaddab
The text was updated successfully, but these errors were encountered: