[RFE] Restraint should allow user to specify the action upon git fetch task failure #288

Open
jirka-h opened this issue Feb 21, 2023 · 6 comments

Comments


jirka-h commented Feb 21, 2023

Hello,

we have migrated the Linux Kernel Install task to the "git fetch" flow. During testing, we ran into this problem:

  1. When the "git fetch" URL is inaccessible, for example due to a service outage, the kernel_install task is marked as Aborted.
  2. However, the Beaker job continues and executes the follow-up tasks with the wrong kernel.

In such a situation, we need the Beaker job to be aborted immediately. We are looking for a way to specify, for a git fetch task, the action to take when it fails to download the test.

Thanks a lot

Jirka

Author

jirka-h commented Feb 22, 2023 via email


idorax commented Feb 23, 2023

Hi @jirka-h, you're welcome. I'll discuss this RFE with @jbastian since it's really needed.


wackrat commented Mar 1, 2023

The request here should be to abort the recipe in which the task fails, not the entire job (which might have other recipes which don't have the same problem; if they do, they can abort themselves).

Author

jirka-h commented Mar 1, 2023

Hi Eirik,

yes, you are right! We want to abort the recipe here. Thanks for pointing this out!

Jirka
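
The recipe-versus-job distinction discussed above can be sketched as a small simulation. This is only an illustration of the requested behavior, not restraint or Beaker code: the `Task` and `Recipe` classes, the `abort_recipe_on_fetch_failure` flag, and the status strings are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    fetch_ok: bool = True                        # simulated outcome of the fetch step
    abort_recipe_on_fetch_failure: bool = False  # hypothetical per-task flag
    status: str = "New"


@dataclass
class Recipe:
    tasks: list = field(default_factory=list)


def run_recipe(recipe):
    """Run tasks in order; a flagged fetch failure aborts the rest of this recipe only."""
    aborting = False
    for task in recipe.tasks:
        if aborting:
            # An earlier key task failed to fetch, so nothing else in this recipe runs.
            task.status = "Aborted"
        elif task.fetch_ok:
            task.status = "Completed"
        else:
            task.status = "Aborted"
            # Without the flag the recipe carries on, which is the behavior reported here.
            aborting = task.abort_recipe_on_fetch_failure
    return [task.status for task in recipe.tasks]


def run_job(recipes):
    """Recipes are handled independently, so one aborted recipe does not stop the others."""
    return [run_recipe(recipe) for recipe in recipes]
```

With the flag set on `/kernel/install`, a failed fetch stops the follow-up tasks in that recipe, while a second recipe in the same job still runs to completion.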

Contributor

lulinqing commented Apr 19, 2023

+1 on adding this "kill-switch", even per recipe (if per task is too much to ask). We rely on fetch-url to perform some key/core tasks instead of the legacy task RPMs, so this is critical.

Thanks in advance!

BTW it also makes sense to increase the number of fetch-url retries and to wait incrementally longer before each retry.
It's a common practice for mitigating temporary connection issues, and it needs only minimal changes in the restraint code.
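
A minimal sketch of the retry idea, assuming a plain HTTP download with a linearly growing delay. The function name `fetch_with_retries`, its defaults, and the `opener` and `sleep` parameters are made up for this illustration and say nothing about how restraint actually fetches tasks.

```python
import time
import urllib.request


def fetch_with_retries(url, retries=5, base_delay=2.0, timeout=30.0,
                       opener=urllib.request.urlopen, sleep=time.sleep):
    """Download url, waiting base_delay * attempt seconds after each failed attempt."""
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            with opener(url, timeout=timeout) as response:
                return response.read()
        except OSError as exc:
            # urllib.error.URLError and socket timeouts are both OSError subclasses.
            last_error = exc
            if attempt < retries:
                # Incremental delay: 1x, 2x, 3x ... the base delay.
                sleep(base_delay * attempt)
    raise RuntimeError(f"fetch failed after {retries} attempts: {url}") from last_error
```

Passing `opener` and `sleep` as parameters keeps the sketch testable without a network connection; the caller decides what to do when the final `RuntimeError` is raised, for example aborting the recipe.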

tcler added a commit to tcler/nrestraint that referenced this issue May 7, 2024
see: restraint-harness#288
and support customizing the task fetch options via the param _fetch_opts:
retry=N
timeo=N
keepchanges
abort_recipe|abort_recipe_when_fetch_fail

e.g. if the fetch of the key task /kernel/install fails, subsequent tasks
will not run:
<task name="/kernel/install">
  <fetch url="http://a.b.c/path/archive.tgz#rpath"/>
  <params>
    <param name="_fetch_opts" value="retry=16,timeo=16,abort_recipe"/>
    <param name="KERNEL_VER" value="kernel-6.7.0-119.el10"/>
  </params>
</task>
<task name="/kernel/fs/nfs/test1">
  <fetch url="http://a.b.c/path/archive.tgz#rpath1"/>
  <params>
    <param name="_fetch_opts" value="retry=8,keepchanges"/>
    <param name="KERNEL_VER" value="kernel-6.7.0-119.el10"/>
  </params>
</task>

Q: Why add these options to the task params?
A: Because the Beaker job XML schema does not accept a new XML tag,
so as a workaround I put these fetch options in task->params.

Signed-off-by: Jianhong Yin <[email protected]>
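
To make the `_fetch_opts` value format in the commit message concrete, here is a hedged sketch of how such a comma-separated string could be parsed. It is not the code from the referenced commit: the function name `parse_fetch_opts`, the `None` defaults, and the choice to reject unknown options are assumptions made for this example.

```python
def parse_fetch_opts(value):
    """Parse a string such as "retry=16,timeo=16,abort_recipe" into a dict."""
    opts = {"retry": None, "timeo": None, "keepchanges": False, "abort_recipe": False}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        key, sep, raw = item.partition("=")
        if key in ("retry", "timeo"):
            # Numeric options must look like key=N.
            if not sep or not raw.isdigit():
                raise ValueError(f"expected {key}=N, got {item!r}")
            opts[key] = int(raw)
        elif key in ("abort_recipe", "abort_recipe_when_fetch_fail") and not sep:
            # Both spellings listed in the commit message map to one flag here.
            opts["abort_recipe"] = True
        elif key == "keepchanges" and not sep:
            opts["keepchanges"] = True
        else:
            # Assumption: unknown options are an error rather than silently ignored.
            raise ValueError(f"unknown fetch option: {item!r}")
    return opts
```

For the two tasks in the example above, `parse_fetch_opts("retry=16,timeo=16,abort_recipe")` sets both numeric options and the abort flag, while `parse_fetch_opts("retry=8,keepchanges")` leaves `timeo` unset and `abort_recipe` off.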

4 participants