[JobInfo] Fix the retrieval of job info by making the SSM command store its output in CloudWatch Logs to prevent truncation. #405

Closed

gmarciani
Collaborator

@gmarciani gmarciani commented Apr 2, 2025

Description

Fix the retrieval of job info by making the SSM command store its output in CloudWatch Logs, which prevents the output from being truncated.
This change fixes #376
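
To illustrate the idea, here is a minimal sketch of sending an SSM command with CloudWatch output enabled and then reading the full output back from the log stream. It is not the code in this PR. The helper names, the `AWS-RunShellScript` document, and the log stream naming (`<command-id>/<instance-id>/aws-runShellScript/stdout`) are assumptions for illustration, and the clients are passed in so the logic can be exercised without AWS access.

```python
def build_send_command_kwargs(instance_id, command, log_group):
    """Build keyword arguments for SSM send_command with CloudWatch output enabled.

    Hypothetical helper: the document name and parameter layout are assumptions.
    """
    return {
        "InstanceIds": [instance_id],
        "DocumentName": "AWS-RunShellScript",
        "Parameters": {"commands": [command]},
        "CloudWatchOutputConfig": {
            "CloudWatchLogGroupName": log_group,
            "CloudWatchOutputEnabled": True,
        },
    }


def stdout_stream_name(command_id, instance_id):
    """Assumed log stream name used by SSM for a shell script's stdout."""
    return f"{command_id}/{instance_id}/aws-runShellScript/stdout"


def read_full_output(logs_client, log_group, log_stream):
    """Read every event in the stream, following nextForwardToken.

    The end of the stream is reached when the returned token equals the
    token that was sent with the request.
    """
    chunks = []
    token = None
    while True:
        kwargs = {
            "logGroupName": log_group,
            "logStreamName": log_stream,
            "startFromHead": True,
        }
        if token is not None:
            kwargs["nextToken"] = token
        resp = logs_client.get_log_events(**kwargs)
        chunks.extend(event["message"] for event in resp["events"])
        next_token = resp.get("nextForwardToken")
        if next_token == token:
            break
        token = next_token
    # Joining with newlines is an assumption about how the output is split.
    return "\n".join(chunks)
```

With real clients, this would be used roughly as `ssm.send_command(**build_send_command_kwargs(...))` followed by `read_full_output(logs, log_group, stdout_stream_name(command_id, instance_id))`, where `ssm` and `logs` are boto3 clients for SSM and CloudWatch Logs. A real implementation would also need to wait for the command to finish before reading the stream.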

[Screenshot, 2025-04-02 4:48:08 PM]

How Has This Been Tested?

Verified that PCUI can now show job information when 200+ jobs are submitted.
In particular, tested with 9999 jobs, which is the maximum number of queued jobs that Slurm accepts for a single node.

Current Limitation
Unit tests have been implemented but are commented out: enabling them requires refactoring the logging packages to prevent test failures, a more invasive change that we want to decouple from this bugfix. This may seem unreasonable, but it is caused by the PCUI logging utilities clashing with Python's logging library, which disturbs pytest.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Development

Successfully merging this pull request may close these issues.

Large numbers of jobs cause slow loading and many error messages in Job status tab