k8s: improve logging and debugging #432

Open

mdonadoni opened this issue Feb 15, 2024 · 0 comments

Labels

priority/soon type/enhancement

Member

mdonadoni commented Feb 15, 2024

It's currently very hard to understand what goes wrong when a workflow gets stuck in the "running" phase.

Let's improve the logging (in particular of the job monitor) to clearly understand:

- What events are coming from the cluster (e.g. pod evicted)
- What actions are being taken (e.g. storing logs, setting as failed, skipping as job is still running)
- Why these actions are being taken (e.g. cause of failure)
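
As a rough illustration of the three points above, each decision could be written as a single log line that states the event, the action and the reason together. This is only a sketch: the logger name `job_monitor_sketch` and the helpers `describe_decision` and `log_decision` are hypothetical and do not exist in the codebase.

```python
import logging

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("job_monitor_sketch")


def describe_decision(job_id, event, action, reason):
    """Build one log line stating the cluster event, the action taken and why."""
    return f"job={job_id} event={event!r} action={action!r} reason={reason!r}"


def log_decision(job_id, event, action, reason):
    """Log the decision at INFO level and return the message that was logged."""
    message = describe_decision(job_id, event, action, reason)
    logger.info(message)
    return message
```

For example, `log_decision("1234", "pod evicted", "setting as failed", "node out of memory")` would record the event, the action and its cause in one place, so a stuck workflow can be traced from a single line.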

Some additional ideas:

- Make sure that the id used in reana-run-job-<id> is the same as the job's id (no need for two different identifiers!)
- If multiple ids are used to identify the same job, then let's always print them together in the logs
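
If two identifiers have to stay, one possible way to always print them together is a `logging.LoggerAdapter` that prefixes every message with both. This is a minimal sketch under assumptions: the names `JobIdsAdapter`, `get_job_logger`, `job_id` and `backend_job_id` are hypothetical and not taken from the existing code.

```python
import logging


class JobIdsAdapter(logging.LoggerAdapter):
    """Prefix every log message with both identifiers of the same job."""

    def process(self, msg, kwargs):
        ids = self.extra
        prefix = f"[job_id={ids['job_id']} backend_job_id={ids['backend_job_id']}]"
        return f"{prefix} {msg}", kwargs


def get_job_logger(job_id, backend_job_id):
    """Return a logger bound to one job (both id names are assumptions)."""
    base = logging.getLogger("job_monitor_sketch")  # hypothetical logger name
    return JobIdsAdapter(base, {"job_id": job_id, "backend_job_id": backend_job_id})
```

With this, `get_job_logger(job_id, backend_job_id).info("storing logs")` emits a message that carries both ids, so grepping the logs for either one finds the same lines.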


