allow keeping input folder during job cleanup #199

Merged
merged 3 commits into from
Jul 18, 2024

Conversation

vieting
Contributor

@vieting vieting commented Jul 17, 2024

When sisyphus cleans up a job folder, the input folder is deleted. This can be impractical if you want to find some job's dependencies because instead of just going through the input folders, you need to open each finished.tar.gz file and check for input jobs in the info file. I therefore propose to add the config option JOB_CLEANUP_KEEP_INPUT. If set to True, the input folder will remain even after cleaning.

Edit: As suggested in the PR, we set JOB_CLEANUP_KEEP_INPUT = True as the default setting. This changes the default behavior, but is not expected to hurt anyone.
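
For illustration, the effect of the option can be sketched roughly as follows. This is not the code merged in this PR: the function name `cleanup_job_dir` and the set of entries that stay outside the archive (`info`, `output`) are assumptions made for the example. Only the option name `JOB_CLEANUP_KEEP_INPUT` and the archive name `finished.tar.gz` come from the discussion.

```python
import os
import shutil
import tarfile

# Option name taken from this PR; True is the default agreed on in the discussion.
JOB_CLEANUP_KEEP_INPUT = True


def cleanup_job_dir(job_dir, keep_input=JOB_CLEANUP_KEEP_INPUT):
    """Pack a finished job directory into finished.tar.gz (illustrative sketch only).

    Returns the sorted list of entries that were archived and removed.
    """
    # Assumption: these entries stay next to the archive and are never packed.
    keep = {"finished.tar.gz", "info", "output"}
    if keep_input:
        keep.add("input")

    to_archive = sorted(name for name in os.listdir(job_dir) if name not in keep)

    archive_path = os.path.join(job_dir, "finished.tar.gz")
    with tarfile.open(archive_path, "w:gz") as tar:
        for name in to_archive:
            tar.add(os.path.join(job_dir, name), arcname=name)

    # Remove the originals only after the archive has been written.
    for name in to_archive:
        path = os.path.join(job_dir, name)
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path)
        else:
            os.remove(path)
    return to_archive
```

With `keep_input=True` the `input` folder stays in place and is not added to the archive. With `keep_input=False` this sketch packs it into the archive and then deletes it, like any other entry.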

Contributor

@michelwi michelwi left a comment

This changes the default behavior from "add input folder to tar archive" to "delete input folder". But I even slightly prefer the new behavior.

@vieting
Contributor Author

vieting commented Jul 18, 2024

> This changes the default behavior from "add input folder to tar archive" to "delete input folder". But I even slightly prefer the new behavior.

Right, should have mentioned this in the description. I'd expect that an input folder in the tar archive is hardly ever used. But we can also keep it in there if others prefer that.

@albertz
Member

albertz commented Jul 18, 2024

> you need to open each finished.tar.gz file and check for input jobs in the info file

I'm not sure I fully understand. You don't need to open the finished.tar.gz. You can just open the info file directly, which is not removed?

@vieting
Contributor Author

vieting commented Jul 18, 2024

> > you need to open each finished.tar.gz file and check for input jobs in the info file
>
> I'm not sure I fully understand. You don't need to open the finished.tar.gz. You can just open the info file directly, which is not removed?

Right, the info file is not inside the tar archive. Still, I prefer to easily step through the input dependencies by going through the input folders. This is way easier than checking the input in the info file, then going to that job's info file and so forth.
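
As a rough illustration of stepping through dependencies via the input folders, a small helper could look like the sketch below. It assumes that each entry in a job's `input` folder is a symlink to the directory of the job it depends on. That layout and the helper name `iter_input_jobs` are assumptions for the example, not something confirmed in this thread.

```python
import os


def iter_input_jobs(job_dir, seen=None):
    """Yield the directories of all direct and transitive input jobs (sketch).

    Assumes every entry in <job_dir>/input is a symlink to another job directory.
    """
    seen = set() if seen is None else seen
    input_dir = os.path.join(job_dir, "input")
    if not os.path.isdir(input_dir):
        return
    for name in sorted(os.listdir(input_dir)):
        link = os.path.join(input_dir, name)
        if not os.path.islink(link):
            continue  # skip anything that is not a symlink
        target = os.path.realpath(link)
        if target in seen or not os.path.isdir(target):
            continue  # already visited, or a dangling link
        seen.add(target)
        yield target
        # Recurse into the input folder of the dependency itself.
        yield from iter_input_jobs(target, seen)
```

A traversal like this only works after cleanup if the `input` folders are still present, which is the case the new option is meant to cover.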

Member

@albertz albertz left a comment

I would mention in the description of the commit and PR that this changes the default behavior. But I'm fine with that.

I'm also fine with the other changes otherwise.

@vieting vieting merged commit a22e923 into master Jul 18, 2024
3 checks passed
@vieting vieting deleted the peter_cleanup_keep_input branch July 18, 2024 13:50