DAOS-16927 pool: Fix upgrade DER_NOTSUPPORTEDs #15873
base: master
Ticket title is 'Special pool and container handles should be stored persistently in RDBs'
If a pool upgrade encounters an error in pool_upgrade_props, the upgrade global version remains at the old global version, while the upgrade status changes to FAILED. In this state, a repeated pool upgrade will fail with the following error in the PS leader engine log:

    ds_pool_upgrade_if_needed() b377db55: upgrading pool 3 -> 4 is unsupported because pool upgraded to 3 last time failed

This patch implements a quick fix that ensures the upgrade global version is updated to the new global version when the upgrade status changes to FAILED, so that a repeated upgrade command will attempt to upgrade the pool again.

Signed-off-by: Li Wei <[email protected]>
Required-githooks: true
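The failure mode and the fix described above can be sketched with a small toy model. This is not DAOS code: the `toy_*` names, the status enum, the return codes, and the exact form of the retry check are all assumptions, inferred only from the log message and the description in this PR. The `with_fix` flag toggles the behavior the patch adds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the real DAOS types or constants. */
enum toy_status { TOY_NOT_STARTED, TOY_IN_PROGRESS, TOY_COMPLETED, TOY_FAILED };

struct toy_pool {
	uint32_t        global_version;         /* layout version the pool has now */
	uint32_t        upgrade_global_version; /* version recorded for the last attempt */
	enum toy_status upgrade_status;
};

/*
 * Assumed model of the check behind the "unsupported because pool upgraded
 * to N last time failed" message: after a failed attempt, only a retry to
 * the recorded upgrade version is accepted.
 */
static bool
toy_upgrade_allowed(const struct toy_pool *p, uint32_t target)
{
	if (p->upgrade_status == TOY_FAILED && p->upgrade_global_version != target)
		return false;
	return true;
}

/*
 * One upgrade attempt. props_fail simulates an error in the property
 * upgrade step that happens before the upgrade version is recorded.
 * Returns 0 on success, -1 if the attempt is rejected, -2 if it fails.
 */
static int
toy_upgrade(struct toy_pool *p, uint32_t target, bool props_fail, bool with_fix)
{
	assert(target >= p->global_version);

	if (!toy_upgrade_allowed(p, target))
		return -1;

	p->upgrade_status = TOY_IN_PROGRESS;
	if (props_fail) {
		/* The patch's idea: record the target version on failure too. */
		if (with_fix)
			p->upgrade_global_version = target;
		p->upgrade_status = TOY_FAILED;
		return -2;
	}

	p->upgrade_global_version = target;
	p->global_version         = target;
	p->upgrade_status         = TOY_COMPLETED;
	return 0;
}
```

In this model, a pool at version 3 whose first upgrade to 4 fails without the fix keeps `upgrade_global_version == 3`, so the retry is rejected. With the fix, the failed attempt records 4 and the retry is allowed to proceed.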
Due to extremely long CI response times, I'm requesting reviews before CI results come in. The PR has passed my manual upgrade testing though.
Thanks for fixing the issue.
@wangshilong, thank you for the guidance.
 * Currently, the upgrade global version may have not been updated yet, if
 * pool_upgrade_props has encountered an error.
 */
d_iov_set(&value, &global_version, sizeof(global_version));
No change requested. Question (I must have forgotten how interop/upgrade is supposed to work) - is there logic anywhere that enforces an upgrade of one layout version at a time? For example, before an upgrade attempt, if we have a pool with global_version=2, upgrade_version=2, and the software installation has DAOS_POOL_VERSION=4 - does the upgrade code just try to upgrade the pool to the latest DAOS_POOL_VERSION(4), or is some logic in place to force an upgrade to global version 2 first, then another upgrade to global version 3?
Before requesting gatekeeper:
- `Features:` (or `Test-tag*`) commit pragma was used or there is a reason documented that there are no appropriate tags for this PR.
Gatekeeper: