add integration tests for kueue for A3 high #3236

Open
wants to merge 9 commits into
base: develop

ighosh98
Contributor

@ighosh98 ighosh98 commented Nov 8, 2024

  • Added tests for integrating Kueue and running Kueue jobs on A3 High.
  • The Ansible scripts for running Kueue jobs are generic.
  • Added a new blueprint for A3 High instead of patching an existing one with sed, because multiple lines of Kueue configuration were added.

Submission Checklist

NOTE: Community submissions can take up to 2 weeks to be reviewed.

Please take the following actions before submitting this pull request.

  • Fork your PR branch from the Toolkit "develop" branch (not main)
  • Test all changes with pre-commit in a local branch
  • Confirm that "make tests" passes all tests
  • Add or modify unit tests to cover code changes
  • Ensure that unit test coverage remains above 80%
  • Update all applicable documentation
  • Follow Cluster Toolkit Contribution guidelines

@ighosh98 ighosh98 added the test-enhancement and release-improvements labels Nov 8, 2024
@annuay-google annuay-google marked this pull request as draft November 11, 2024 06:40
cdunbar13 and others added 7 commits November 11, 2024 12:41
The use of check blocks requires Terraform 1.5 and above.
The gke-node-pool module uses older "attribute" syntax for the
GPU-related arguments that has been removed in the google Terraform
plugin 6.x. This commit replaces attribute syntax with block syntax.
The key to understanding this change is that a dynamic block iterating
over a list is equivalent to null when the list is empty (no dynamic
blocks are inserted).

The gpu_sharing_config and gpu_driver_installation_config settings are
not (and never were) list(object) in the Terraform plugin. They could only
ever take on length 0 or 1. These are therefore being converted to object
format as they are in the API.
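
The pattern described above can be sketched roughly as follows. This is an illustrative fragment, not the module's actual code: the variable name `guest_accelerator`, the resource name `example`, the cluster name, and the machine type are all assumptions. It relies on two documented Terraform behaviors: a `dynamic` block emits nothing when its `for_each` collection is empty, and the splat operator `[*]` turns a null object into an empty list and a non-null object into a single-element list.

```hcl
# Hypothetical input: the nested driver config is a single optional object,
# not a list(object). Optional attributes need Terraform 1.3 or later.
variable "guest_accelerator" {
  type = list(object({
    type  = string
    count = number
    gpu_driver_installation_config = optional(object({
      gpu_driver_version = string
    }))
  }))
  default = []
}

resource "google_container_node_pool" "example" {
  name    = "example-pool"    # placeholder name
  cluster = "example-cluster" # placeholder cluster

  node_config {
    machine_type = "a3-highgpu-8g" # placeholder machine type

    # Block syntax: one guest_accelerator block per list element.
    # An empty list inserts no blocks, which behaves like leaving it unset.
    dynamic "guest_accelerator" {
      for_each = var.guest_accelerator
      content {
        type  = guest_accelerator.value.type
        count = guest_accelerator.value.count

        # [*] yields [] for null and [obj] for a non-null object,
        # so the nested block appears zero or one times.
        dynamic "gpu_driver_installation_config" {
          for_each = guest_accelerator.value.gpu_driver_installation_config[*]
          content {
            gpu_driver_version = gpu_driver_installation_config.value.gpu_driver_version
          }
        }
      }
    }
  }
}
```

Under the same assumptions, `gpu_sharing_config` would presumably be handled with another nested `dynamic` block of the same shape.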

https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/container_cluster#nested_guest_accelerator
https://developer.hashicorp.com/terraform/language/attr-as-blocks
This commit fixes the documentation and examples to align with changes
introduced in a9c2a69 to make gke-node-pool module compatible with TPG
6.x.
@ighosh98 ighosh98 changed the title add integration tests for kueue for A3 high add TAS support, version bump kueue and add more kueue integration tests for A3 high Nov 11, 2024
@ighosh98 ighosh98 changed the title add TAS support, version bump kueue and add more kueue integration tests for A3 high add TAS support, version bump kueue and improve integration test coverage for A3 high Nov 11, 2024
@ighosh98 ighosh98 marked this pull request as ready for review November 13, 2024 12:52
@ighosh98
Contributor Author

Changed the PR from draft to ready for review, as the A3U integration tests will be directed at the experimental branch. Aligned with @annuay-google.

@ighosh98 ighosh98 changed the title add TAS support, version bump kueue and improve integration test coverage for A3 high add integration tests for kueue for A3 high Nov 14, 2024
Labels
release-improvements: Added to release notes under the "Improvements" heading.
test-enhancement: Tests enhancement or coverage improvement
5 participants