(WIP) Add retry of EVE image update to API #562

Open
wants to merge 1 commit into
base: master


Contributor

@sadov sadov commented Mar 23, 2021

If an EVE image update fails due to some fluke (such as a power outage), we can currently retry by doing an eveimage-remove, waiting until the partition in EVE is marked as "unused", and then redoing the eveimage-update.

That three-step sequence is hard for the controller to carry out.
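
To make the cost of the current workaround concrete, here is a minimal sketch of the three steps as a controller would have to drive them. The three callables are hypothetical stand-ins for eveimage-remove, a partition status query, and eveimage-update. The polling interval and timeout are assumptions, not values taken from EVE or from this PR.

```python
import time
from typing import Callable


def retry_eve_update(
    remove_image: Callable[[], None],      # hypothetical wrapper for eveimage-remove
    partition_state: Callable[[], str],    # hypothetical status query
    update_image: Callable[[], None],      # hypothetical wrapper for eveimage-update
    poll_interval: float = 5.0,            # assumed value
    timeout: float = 600.0,                # assumed value
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Run the manual retry: remove, wait for "unused", update again."""
    remove_image()                         # step 1: remove the failed image

    waited = 0.0
    while partition_state() != "unused":   # step 2: wait for the partition
        if waited >= timeout:
            return False                   # give up without re-sending the update
        sleep(poll_interval)
        waited += poll_interval

    update_image()                         # step 3: redo the update
    return True
```

The wait in step 2 is what makes this awkward: the controller has to keep state across an unbounded polling period just to repeat a request it has already made.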

We can add a retry/version count for the EVE images.
One way to do that is to use the Version field in
message BaseOSConfig {
UUIDandVersion uuidandversion = 1;

and if that version changes we would ignore the fact that the image exists as INPROGRESS in some partition.
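
A rough sketch of the proposed check, assuming the device records the version it last attempted next to each partition. All class, field, and function names are hypothetical and only illustrate the idea; this is not EVE's actual baseos code.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class BaseOSRequest:
    uuid: str
    version: str           # stands in for the Version part of UUIDandVersion


@dataclass(frozen=True)
class Partition:
    image_uuid: str
    recorded_version: str  # assumption: version seen when the attempt started
    state: str             # e.g. "INPROGRESS" or "unused"


def inprogress_blocks_update(req: BaseOSRequest,
                             partitions: Iterable[Partition]) -> bool:
    """True if an INPROGRESS copy of the image should still block the request.

    Same version: the image is treated as already in progress.
    Changed version: the INPROGRESS marker is ignored, so the update is retried.
    """
    return any(
        p.state == "INPROGRESS"
        and p.image_uuid == req.uuid
        and p.recorded_version == req.version
        for p in partitions
    )
```

With a partition holding image `img-1` as INPROGRESS at version `"1"`, a request for version `"1"` is blocked, while the same image requested at version `"2"` is not.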

Contributor

@rvs rvs left a comment

LGTM. Should we merge this, @mydatascience?

Contributor

@eriknordmark eriknordmark left a comment

LGTM. But don't we also need to change EVE to react to the new version number?

@sadov
Contributor Author

sadov commented Apr 9, 2021

Yes, this test just repeats the use case with an interrupted update. We need to change EVE and then change the test to check the re-update.

@eriknordmark
Contributor

> Yes, this test just repeats the use case with an interrupted update. We need to change EVE and then change the test to check the re-update.

Are you doing a PR against EVE for that, or should we create a separate story for that?

@sadov
Contributor Author

sadov commented Apr 9, 2021

> Are you doing a PR against EVE for that, or should we create a separate story for that?

The baseos update machinery in EVE (and especially the suggestion to reuse UUIDandVersion to indicate a re-update) is not clear to me. Maybe a separate story would make sense.

@eriknordmark
Contributor

See lf-edge/eve#2013

@sadov
Contributor Author

sadov commented Apr 13, 2021

@eriknordmark the tests are updated according to lf-edge/eve#2013:

  • tests/update_eve_image/testdata/reupdate_eve_image.txt
  • tests/update_eve_image/testdata/reupdate_eve_image_oci.txt

This time I fetch the config that contains the current baseos update, send that config with an empty "base" array, and then send the original config again.
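
The sequence described above can be sketched roughly as follows, treating the device config as a plain dictionary. Only the "base" key comes from the description; the function name and any other keys are hypothetical, and the real tests are the testdata scripts listed above.

```python
import copy
from typing import Any, Dict, List


def reupdate_sequence(current_config: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the configs to send, in order, to trigger a re-update.

    1. The current config with its "base" array emptied.
    2. The original config, unchanged, sent again.
    """
    original = copy.deepcopy(current_config)   # keep the fetched config intact

    cleared = copy.deepcopy(current_config)
    cleared["base"] = []                        # drop the baseos entries

    return [cleared, original]
```

Deep copies are used so that emptying "base" in the first config cannot alter the config that is re-sent in the second step.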

At this point, the tests fail because no automatic reboot is performed after the configuration change. Logs:
reupdate_eve_image.048.log
reupdate_eve_image_oci.049.log

Maybe in this case we need to change something else in the EVE config?

4 participants