
Add the activeDeadlineSeconds attribute to the backup pod #1109

Open: wants to merge 1 commit into main

Funk66 commented Jan 2, 2025

The backup pod gets stuck while dumping the data from one of our clusters. It happens intermittently, in roughly a third of the hourly jobs. I have yet to figure out the issue - possibly a deadlock situation? In the meantime, I thought it would be helpful if the pod were removed automatically when the job hasn't succeeded within a reasonable amount of time. That way, a stuck pod doesn't block the CronJob from scheduling new pods.
I'm opening this PR to illustrate my request. I do not know whether these code changes are complete and functional. A test for the newly added field would probably be desirable.

mmontes11 (Member) commented Jan 23, 2025

> I have yet to figure out the issue - possibly a deadlock situation?

Just to let you know, we have merged a new change to prevent deadlocks while backups are being taken:

This will be released shortly in 0.37.0.

Regarding next steps in your PR, you need to set `activeDeadlineSeconds` in the backup `Job` here:

job := &batchv1.Job{
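
To illustrate the suggestion above, here is a minimal, self-contained sketch of passing an optional deadline through to a Job spec. It is not the operator's actual code: `JobSpec` and `Job` are local stand-ins for `batchv1.JobSpec` and `batchv1.Job` from `k8s.io/api/batch/v1` (where `ActiveDeadlineSeconds` is a `*int64`), and `BackupOpts` and `buildBackupJob` are hypothetical names invented for this example.

```go
package main

import "fmt"

// JobSpec is a local stand-in for batchv1.JobSpec, reduced to the single
// field discussed in this thread. In the real type the field is also *int64.
type JobSpec struct {
	ActiveDeadlineSeconds *int64
}

// Job is a local stand-in for batchv1.Job.
type Job struct {
	Name string
	Spec JobSpec
}

// BackupOpts is a hypothetical options struct; the operator's real builder
// takes different inputs.
type BackupOpts struct {
	Name                  string
	ActiveDeadlineSeconds *int64 // nil means "no deadline"
}

// buildBackupJob (hypothetical) copies the optional deadline into the Job
// spec. Leaving the field nil keeps the default behavior of no deadline.
func buildBackupJob(opts BackupOpts) *Job {
	job := &Job{Name: opts.Name}
	if opts.ActiveDeadlineSeconds != nil {
		// Copy the value so the Job does not alias the caller's pointer.
		deadline := *opts.ActiveDeadlineSeconds
		job.Spec.ActiveDeadlineSeconds = &deadline
	}
	return job
}

func main() {
	deadline := int64(1800) // assumed value: 30 minutes for an hourly backup
	job := buildBackupJob(BackupOpts{Name: "backup", ActiveDeadlineSeconds: &deadline})
	fmt.Println(job.Name, *job.Spec.ActiveDeadlineSeconds)
}
```

With the deadline set on the Job rather than only on the pod, Kubernetes terminates the Job's running pods and marks the Job as failed with reason `DeadlineExceeded` once the limit is reached, which is the behavior the PR description asks for.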

github-actions bot commented Apr 7, 2025

This PR is stale because it has been open 60 days with no activity.

github-actions bot added the stale label Apr 7, 2025