[ci] Enabled slurmdbd in docker compose #3391

Open: teojgo wants to merge 1 commit into develop

teojgo (Contributor) commented Feb 13, 2025

Closes #3382

Signed-off-by: Theofilos Manitaras <[email protected]>
teojgo requested a review from vkarak February 13, 2025 15:55
teojgo self-assigned this Feb 13, 2025
vkarak added this to the ReFrame 4.8 milestone Feb 13, 2025
vkarak (Contributor) left a comment

Looks good to me, except that I think there should be a dependency to prevent the frontend from starting before the other Slurm containers. If you pull and create all the Docker Compose containers from scratch and try to run one of the tutorial examples with the slurm backend, you will get:

sbatch: error: Batch job submission failed: Unable to contact slurm controller (connect failure)

If you shut down and rerun docker compose, the frontend will be brought up last and the test will run fine.

teojgo (Contributor, Author) commented Feb 16, 2025

> Looks good to me, except that I think there should be a dependency to prevent the frontend from starting before the other Slurm containers. If you pull and create all the Docker Compose containers from scratch and try to run one of the tutorial examples with the slurm backend, you will get:
>
> sbatch: error: Batch job submission failed: Unable to contact slurm controller (connect failure)
>
> If you shut down and rerun docker compose, the frontend will be brought up last and the test will run fine.

Yes, I need to look into this a bit more to ensure that the Slurm cluster is up before the frontend proceeds, and add a health check for it.
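
For illustration only, here is a minimal sketch of what such a readiness check could look like. It is not the change proposed in this PR. The helper names `wait_until_ready` and `slurmctld_responds` are hypothetical, and the probe assumes that `scontrol ping` is available in the container, exits with 0, and prints `UP` when the controller is reachable.

```python
import subprocess
import time


def wait_until_ready(probe, timeout=60.0, interval=2.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Call probe() repeatedly until it returns True or the timeout expires.

    Returns True as soon as the probe succeeds, False on timeout.
    `clock` and `sleep` are injectable so the loop can be tested quickly.
    """
    deadline = clock() + timeout
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def slurmctld_responds():
    """Hypothetical probe for the Slurm controller.

    Assumption: `scontrol ping` exits with 0 and reports "UP" when
    slurmctld is reachable. Adjust to whatever the image actually provides.
    """
    try:
        result = subprocess.run(['scontrol', 'ping'], capture_output=True,
                                text=True, timeout=10)
    except (OSError, subprocess.TimeoutExpired):
        # scontrol is missing or hung: treat the controller as not ready
        return False
    return result.returncode == 0 and 'UP' in result.stdout
```

Calling `wait_until_ready(slurmctld_responds)` from the frontend entrypoint would block until the controller answers or a minute has passed. The same probe could instead back a Compose `healthcheck`, with the frontend declaring `depends_on` using `condition: service_healthy`; which service should carry the check is left open here.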

Projects
Status: In Progress
Development

Successfully merging this pull request may close these issues.

Enrich docker-compose based unit testing to cover Slurm with accounting
2 participants