Support JobBackoffLimitPerIndex feature gate fields #2421

Merged
3 changes: 3 additions & 0 deletions .changelog/2421.txt
@@ -0,0 +1,3 @@
```release-note:enhancement
Add `backoff_limit_per_index` and `max_failed_indexes` fields in `structure_job.go`
```
2 changes: 2 additions & 0 deletions docs/resources/cron_job.md
@@ -102,6 +102,8 @@ Optional:

- `active_deadline_seconds` (Number) Optional duration in seconds the pod may be active on the node relative to StartTime before the system will actively try to mark it failed and kill associated containers. Value must be a positive integer.
- `backoff_limit` (Number) Specifies the number of retries before marking this job failed. Defaults to 6
- `backoff_limit_per_index` (Number) Specifies the limit for the number of retries within an index before marking this index as failed. When enabled, the number of failures per index is kept in the pod's batch.kubernetes.io/job-index-failure-count annotation. It can only be set when the Job's completionMode=Indexed and the Pod's restart policy is Never. The field is immutable.
- `max_failed_indexes` (Number) Specifies the maximal number of failed indexes before marking the Job as failed, when backoffLimitPerIndex is set. Once the number of failed indexes exceeds this number the entire Job is marked as Failed and its execution is terminated. It can only be specified when backoffLimitPerIndex is set.
- `completion_mode` (String) Specifies how Pod completions are tracked. It can be `NonIndexed` (default) or `Indexed`. More info: https://kubernetes.io/docs/concepts/workloads/controllers/job/#completion-mode
- `completions` (Number) Specifies the desired number of successfully finished pods the job should be run with. Setting to nil means that the success of any pod signals the success of all pods, and allows parallelism to have any positive value. Setting to 1 means that parallelism is limited to 1 and the success of that pod signals the success of the job. More info: https://kubernetes.io/docs/concepts/workloads/controllers/jobs-run-to-completion/
- `manual_selector` (Boolean) Controls generation of pod labels and pod selectors. Leave unset unless you are certain what you are doing. When false or unset, the system pick labels unique to this job and appends those labels to the pod template. When true, the user is responsible for picking unique labels and specifying the selector. Failure to pick a unique label may cause this and other jobs to not function correctly. More info: https://git.k8s.io/community/contributors/design-proposals/selector-generation.md
2 changes: 2 additions & 0 deletions docs/resources/cron_job_v1.md
@@ -97,6 +97,8 @@ Optional:

- `active_deadline_seconds` (Number) Optional duration in seconds the pod may be active on the node relative to StartTime before the system will actively try to mark it failed and kill associated containers. Value must be a positive integer.
- `backoff_limit` (Number) Specifies the number of retries before marking this job failed. Defaults to 6
- `backoff_limit_per_index` (Number) Specifies the limit for the number of retries within an index before marking this index as failed. When enabled, the number of failures per index is kept in the pod's batch.kubernetes.io/job-index-failure-count annotation. It can only be set when the Job's completionMode=Indexed and the Pod's restart policy is Never. The field is immutable.
- `max_failed_indexes` (Number) Specifies the maximal number of failed indexes before marking the Job as failed, when backoffLimitPerIndex is set. Once the number of failed indexes exceeds this number the entire Job is marked as Failed and its execution is terminated. It can only be specified when backoffLimitPerIndex is set.
- `completion_mode` (String) Specifies how Pod completions are tracked. It can be `NonIndexed` (default) or `Indexed`. More info: https://kubernetes.io/docs/concepts/workloads/controllers/job/#completion-mode
- `completions` (Number) Specifies the desired number of successfully finished pods the job should be run with. Setting to nil means that the success of any pod signals the success of all pods, and allows parallelism to have any positive value. Setting to 1 means that parallelism is limited to 1 and the success of that pod signals the success of the job. More info: https://kubernetes.io/docs/concepts/workloads/controllers/jobs-run-to-completion/
- `manual_selector` (Boolean) Controls generation of pod labels and pod selectors. Leave unset unless you are certain what you are doing. When false or unset, the system pick labels unique to this job and appends those labels to the pod template. When true, the user is responsible for picking unique labels and specifying the selector. Failure to pick a unique label may cause this and other jobs to not function correctly. More info: https://git.k8s.io/community/contributors/design-proposals/selector-generation.md
2 changes: 2 additions & 0 deletions docs/resources/job_v1.md
@@ -55,6 +55,8 @@ Optional:

- `active_deadline_seconds` (Number) Optional duration in seconds the pod may be active on the node relative to StartTime before the system will actively try to mark it failed and kill associated containers. Value must be a positive integer.
- `backoff_limit` (Number) Specifies the number of retries before marking this job failed. Defaults to 6
- `backoff_limit_per_index` (Number) Specifies the limit for the number of retries within an index before marking this index as failed. When enabled, the number of failures per index is kept in the pod's batch.kubernetes.io/job-index-failure-count annotation. It can only be set when the Job's completionMode=Indexed and the Pod's restart policy is Never. The field is immutable.
- `max_failed_indexes` (Number) Specifies the maximal number of failed indexes before marking the Job as failed, when backoffLimitPerIndex is set. Once the number of failed indexes exceeds this number the entire Job is marked as Failed and its execution is terminated. It can only be specified when backoffLimitPerIndex is set.
- `completion_mode` (String) Specifies how Pod completions are tracked. It can be `NonIndexed` (default) or `Indexed`. More info: https://kubernetes.io/docs/concepts/workloads/controllers/job/#completion-mode
- `completions` (Number) Specifies the desired number of successfully finished pods the job should be run with. Setting to nil means that the success of any pod signals the success of all pods, and allows parallelism to have any positive value. Setting to 1 means that parallelism is limited to 1 and the success of that pod signals the success of the job. More info: https://kubernetes.io/docs/concepts/workloads/controllers/jobs-run-to-completion/
- `manual_selector` (Boolean) Controls generation of pod labels and pod selectors. Leave unset unless you are certain what you are doing. When false or unset, the system pick labels unique to this job and appends those labels to the pod template. When true, the user is responsible for picking unique labels and specifying the selector. Failure to pick a unique label may cause this and other jobs to not function correctly. More info: https://git.k8s.io/community/contributors/design-proposals/selector-generation.md
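The constraints attached to the two new fields can be made concrete with a small check. The following is a minimal sketch, not the provider's or the API server's validation: `indexedJobSpec` and `validateIndexedFields` are hypothetical names introduced for illustration. It encodes only what this page states: `backoff_limit_per_index` needs `completion_mode = "Indexed"` and a pod restart policy of `Never`, and the API server error quoted in the review thread rejects `maxFailedIndexes` outside indexed completion mode.

```go
package main

import (
	"errors"
	"fmt"
)

// indexedJobSpec is a hypothetical, trimmed-down stand-in for the fields
// discussed here; it is not the provider's or the Kubernetes API's type.
type indexedJobSpec struct {
	CompletionMode       string // "NonIndexed" or "Indexed"
	RestartPolicy        string // pod restart policy, e.g. "Never"
	BackoffLimitPerIndex *int32 // nil means unset
	MaxFailedIndexes     *int32 // nil means unset
}

// validateIndexedFields collects every constraint the spec violates and
// returns nil when none is violated.
func validateIndexedFields(s indexedJobSpec) error {
	var errs []error
	if s.BackoffLimitPerIndex != nil {
		if s.CompletionMode != "Indexed" {
			errs = append(errs, errors.New("backoff_limit_per_index requires indexed completion mode"))
		}
		if s.RestartPolicy != "Never" {
			errs = append(errs, errors.New("backoff_limit_per_index requires restart policy Never"))
		}
	}
	if s.MaxFailedIndexes != nil && s.CompletionMode != "Indexed" {
		errs = append(errs, errors.New("max_failed_indexes requires indexed completion mode"))
	}
	return errors.Join(errs...)
}

func main() {
	three, four := int32(3), int32(4)

	// Per-index fields on a NonIndexed job: rejected.
	bad := indexedJobSpec{CompletionMode: "NonIndexed", RestartPolicy: "Never",
		BackoffLimitPerIndex: &three, MaxFailedIndexes: &four}
	fmt.Println(validateIndexedFields(bad))

	// Same values on an Indexed job with restart policy Never: accepted.
	good := bad
	good.CompletionMode = "Indexed"
	fmt.Println(validateIndexedFields(good) == nil)
}
```

Collecting all violations before returning mirrors the shape of the API server message in the review thread, which reports both fields in a single error.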
105 changes: 105 additions & 0 deletions kubernetes/resource_kubernetes_cron_job_v1_test.go
@@ -222,6 +222,61 @@ func TestAccKubernetesCronJobV1_minimalWithPodFailurePolicy(t *testing.T) {
	})
}

func TestAccKubernetesCronJobV1_minimalWithBackoffLimitPerIndex(t *testing.T) {
	var conf1, conf2 batchv1.CronJob

	name := fmt.Sprintf("tf-acc-test-%s", acctest.RandStringFromCharSet(10, acctest.CharSetAlphaNum))
	resourceName := "kubernetes_cron_job_v1.test"
	imageName := busyboxImage

	resource.ParallelTest(t, resource.TestCase{
		PreCheck: func() {
			testAccPreCheck(t)
			skipIfClusterVersionLessThan(t, "1.29.0")
		},
		IDRefreshName:     resourceName,
		IDRefreshIgnore:   []string{"metadata.0.resource_version"},
		ProviderFactories: testAccProviderFactories,
		CheckDestroy:      testAccCheckKubernetesCronJobV1Destroy,
		Steps: []resource.TestStep{
			{
				Config: testAccKubernetesCronJobV1ConfigMinimal(name, imageName),
				Check: resource.ComposeAggregateTestCheckFunc(
					testAccCheckKubernetesCronJobV1Exists(resourceName, &conf1),
					resource.TestCheckResourceAttrSet(resourceName, "metadata.0.generation"),
					resource.TestCheckResourceAttrSet(resourceName, "metadata.0.resource_version"),
					resource.TestCheckResourceAttrSet(resourceName, "metadata.0.uid"),
					resource.TestCheckResourceAttrSet(resourceName, "metadata.0.namespace"),
					resource.TestCheckResourceAttr(resourceName, "spec.0.job_template.0.metadata.0.namespace", ""),
				),
			},
			{
				Config: testAccKubernetesCronJobV1ConfigMinimalWithBackoffLimitPerIndex(name, imageName),
				Check: resource.ComposeAggregateTestCheckFunc(
					testAccCheckKubernetesCronJobV1Exists(resourceName, &conf2),
					resource.TestCheckResourceAttrSet(resourceName, "metadata.0.generation"),
					resource.TestCheckResourceAttrSet(resourceName, "metadata.0.resource_version"),
					resource.TestCheckResourceAttrSet(resourceName, "metadata.0.uid"),
					resource.TestCheckResourceAttrSet(resourceName, "metadata.0.namespace"),
					resource.TestCheckResourceAttr(resourceName, "spec.0.job_template.0.spec.0.backoff_limit_per_index", "3"),
					resource.TestCheckResourceAttr(resourceName, "spec.0.job_template.0.spec.0.max_failed_indexes", "4"),
					resource.TestCheckResourceAttr(resourceName, "spec.0.job_template.0.spec.0.pod_failure_policy.0.rule.#", "2"),
					resource.TestCheckResourceAttr(resourceName, "spec.0.job_template.0.spec.0.pod_failure_policy.0.rule.0.action", "FailJob"),
					resource.TestCheckResourceAttr(resourceName, "spec.0.job_template.0.spec.0.pod_failure_policy.0.rule.0.on_exit_codes.0.container_name", "test"),
					resource.TestCheckResourceAttr(resourceName, "spec.0.job_template.0.spec.0.pod_failure_policy.0.rule.0.on_exit_codes.0.values.#", "3"),
					resource.TestCheckResourceAttr(resourceName, "spec.0.job_template.0.spec.0.pod_failure_policy.0.rule.0.on_exit_codes.0.values.0", "1"),
					resource.TestCheckResourceAttr(resourceName, "spec.0.job_template.0.spec.0.pod_failure_policy.0.rule.0.on_exit_codes.0.values.1", "2"),
					resource.TestCheckResourceAttr(resourceName, "spec.0.job_template.0.spec.0.pod_failure_policy.0.rule.0.on_exit_codes.0.values.2", "42"),
					resource.TestCheckResourceAttr(resourceName, "spec.0.job_template.0.spec.0.pod_failure_policy.0.rule.1.action", "Ignore"),
					resource.TestCheckResourceAttr(resourceName, "spec.0.job_template.0.spec.0.pod_failure_policy.0.rule.1.on_pod_condition.0.type", "DisruptionTarget"),
					resource.TestCheckResourceAttr(resourceName, "spec.0.job_template.0.spec.0.pod_failure_policy.0.rule.1.on_pod_condition.0.status", "False"),
					testAccCheckKubernetesCronJobV1ForceNew(&conf1, &conf2, true),
				),
			},
		},
	})
}

func testAccCheckKubernetesCronJobV1Destroy(s *terraform.State) error {
	conn, err := testAccProvider.Meta().(KubeClientsets).MainClientset()

@@ -461,6 +516,56 @@ func testAccKubernetesCronJobV1ConfigMinimal(name, imageName string) string {
`, name, imageName)
}

func testAccKubernetesCronJobV1ConfigMinimalWithBackoffLimitPerIndex(name, imageName string) string {
	return fmt.Sprintf(`resource "kubernetes_cron_job_v1" "test" {
  metadata {
    name = "%s"
  }
  spec {
    schedule = "*/1 * * * *"
    job_template {
      metadata {}
      spec {
        backoff_limit_per_index = 3
        max_failed_indexes      = 4
        completions             = 4
        completion_mode         = "Indexed"
        pod_failure_policy {
          rule {
            action = "FailJob"
            on_exit_codes {
              container_name = "test"
              operator       = "In"
              values         = [1, 2, 42]
            }
          }
          rule {
            action = "Ignore"
            on_pod_condition {
              status = "False"
              type   = "DisruptionTarget"
            }
          }
        }
        template {
          metadata {}
          spec {
            container {
              name    = "test"
              image   = "%s"
              command = ["sleep", "5"]
            }
            termination_grace_period_seconds = 1
          }
        }
      }
    }
  }
}
`, name, imageName)
}

func testAccKubernetesCronJobV1ConfigMinimalWithPodFailurePolicy(name, imageName string) string {
	return fmt.Sprintf(`resource "kubernetes_cron_job_v1" "test" {
  metadata {
14 changes: 14 additions & 0 deletions kubernetes/schema_job_spec.go
@@ -57,6 +57,13 @@ func jobSpecFields(specUpdatable bool) map[string]*schema.Schema {
			ValidateFunc: validateNonNegativeInteger,
			Description:  "Specifies the number of retries before marking this job failed. Defaults to 6",
		},
		"backoff_limit_per_index": {
Contributor

Looks like there are errors with the PR, specifically due to the default value being used. Could you address these failures and also include a test case for these new fields?

This is now supported with the feature gate being set to true by default. Let me know if anything is unclear about adding tests.

 === NAME  TestAccKubernetesCronJobV1_minimalWithPodFailurePolicy
    resource_kubernetes_cron_job_v1_test.go:179: Step 1/2 error: Error running apply: exit status 1
        
        Error: CronJob.batch "tf-acc-test-jk4onc8cdw" is invalid: [spec.jobTemplate.spec.backoffLimitPerIndex: Invalid value: 6: requires indexed completion mode, spec.jobTemplate.spec.maxFailedIndexes: Invalid value: 6: requires indexed completion mode]
        
          with kubernetes_cron_job_v1.test,
          on terraform_plugin_test.tf line 1, in resource "kubernetes_cron_job_v1" "test":
           1: resource "kubernetes_cron_job_v1" "test" {

Contributor

any help needed on this @theloneexplorerquest ?

Contributor

just a heads-up I'll be wrapping this up to get it merged as this is part of our v2.33.0 milestone @theloneexplorerquest

			Type:         schema.TypeInt,
			Optional:     true,
			ForceNew:     true,
			ValidateFunc: validateNonNegativeInteger,
			Description:  "Specifies the limit for the number of retries within an index before marking this index as failed. When enabled the number of failures per index is kept in the pod's batch.kubernetes.io/job-index-failure-count annotation. It can only be set when Job's completionMode=Indexed, and the Pod's restart policy is Never. The field is immutable.",
		},
		// This field is immutable in Jobs.
		"completions": {
			Type: schema.TypeInt,
@@ -83,6 +90,13 @@ func jobSpecFields(specUpdatable bool) map[string]*schema.Schema {
			ForceNew:    false,
			Description: "Controls generation of pod labels and pod selectors. Leave unset unless you are certain what you are doing. When false or unset, the system pick labels unique to this job and appends those labels to the pod template. When true, the user is responsible for picking unique labels and specifying the selector. Failure to pick a unique label may cause this and other jobs to not function correctly. More info: https://git.k8s.io/community/contributors/design-proposals/selector-generation.md",
		},
		"max_failed_indexes": {
			Type:         schema.TypeInt,
			Optional:     true,
			ForceNew:     false,
			ValidateFunc: validateNonNegativeInteger,
			Description:  "Specifies the maximal number of failed indexes before marking the Job as failed, when backoffLimitPerIndex is set. Once the number of failed indexes exceeds this number the entire Job is marked as Failed and its execution is terminated. It can only be specified when backoffLimitPerIndex is set.",
		},
		"parallelism": {
			Type:     schema.TypeInt,
			Optional: true,
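Both new schema entries reuse `validateNonNegativeInteger`, whose body is not part of this diff. As a rough sketch of what such a validator might look like, the function below follows the `func(interface{}, string) ([]string, []error)` shape of the SDK's `schema.SchemaValidateFunc`; the name `nonNegativeInt` and the error wording are assumptions made for illustration, not the provider's implementation.

```go
package main

import "fmt"

// nonNegativeInt is a hypothetical validator with the same signature shape as
// schema.SchemaValidateFunc. It accepts 0 and any positive int.
func nonNegativeInt(value interface{}, key string) (warnings []string, errs []error) {
	v, ok := value.(int)
	if !ok {
		errs = append(errs, fmt.Errorf("%s must be an integer, got %T", key, value))
		return warnings, errs
	}
	if v < 0 {
		errs = append(errs, fmt.Errorf("%s must be greater than or equal to 0, got %d", key, v))
	}
	return warnings, errs
}

func main() {
	// Report how many errors each candidate value produces.
	for _, v := range []int{4, 0, -1} {
		_, errs := nonNegativeInt(v, "max_failed_indexes")
		fmt.Printf("max_failed_indexes = %d: %d error(s)\n", v, len(errs))
	}
}
```

A validator of this kind only checks the value in isolation. Cross-field rules, such as requiring indexed completion mode, are not expressible here, which is consistent with the API server being the component that rejected the request in the review thread.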
16 changes: 16 additions & 0 deletions kubernetes/structure_job.go
@@ -23,6 +23,14 @@ func flattenJobV1Spec(in batchv1.JobSpec, d *schema.ResourceData, meta interface
		att["backoff_limit"] = *in.BackoffLimit
	}

	if in.BackoffLimitPerIndex != nil {
		att["backoff_limit_per_index"] = *in.BackoffLimitPerIndex
	}

	if in.MaxFailedIndexes != nil {
		att["max_failed_indexes"] = *in.MaxFailedIndexes
	}

	if in.Completions != nil {
		att["completions"] = *in.Completions
	}
@@ -79,6 +87,10 @@ func expandJobV1Spec(j []interface{}) (batchv1.JobSpec, error) {
		obj.BackoffLimit = ptr.To(int32(v))
	}

	if v, ok := in["backoff_limit_per_index"].(int); in["completion_mode"] == "Indexed" && ok && v >= 0 {
		obj.BackoffLimitPerIndex = ptr.To(int32(v))
	}

	if v, ok := in["completions"].(int); ok && v > 0 {
		obj.Completions = ptr.To(int32(v))
	}
@@ -92,6 +104,10 @@ func expandJobV1Spec(j []interface{}) (batchv1.JobSpec, error) {
		obj.ManualSelector = ptr.To(v.(bool))
	}

	if v, ok := in["max_failed_indexes"].(int); in["completion_mode"] == "Indexed" && ok && v >= 0 {
		obj.MaxFailedIndexes = ptr.To(int32(v))
	}

	if v, ok := in["parallelism"].(int); ok && v >= 0 {
		obj.Parallelism = ptr.To(int32(v))
	}
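The expand and flatten changes in `structure_job.go` can be exercised in isolation. The sketch below is a simplified stand-in, not the provider's code: `jobSpec`, `expand`, and `flatten` are hypothetical names, and the local struct replaces `batchv1.JobSpec` so the example needs only the standard library. It keeps the same guard as the diff, copying the per-index fields only when `completion_mode` is `"Indexed"`.

```go
package main

import "fmt"

// jobSpec is a hypothetical stand-in for the two batchv1.JobSpec fields
// touched by this change; the real type lives in k8s.io/api/batch/v1.
type jobSpec struct {
	BackoffLimitPerIndex *int32
	MaxFailedIndexes     *int32
}

// expand copies the per-index fields only when completion_mode is "Indexed",
// using the same condition as the guard in the diff.
func expand(in map[string]interface{}) jobSpec {
	var obj jobSpec
	indexed := in["completion_mode"] == "Indexed"
	if v, ok := in["backoff_limit_per_index"].(int); indexed && ok && v >= 0 {
		n := int32(v)
		obj.BackoffLimitPerIndex = &n
	}
	if v, ok := in["max_failed_indexes"].(int); indexed && ok && v >= 0 {
		n := int32(v)
		obj.MaxFailedIndexes = &n
	}
	return obj
}

// flatten writes a field back only when the API object carries it.
func flatten(in jobSpec) map[string]interface{} {
	att := map[string]interface{}{}
	if in.BackoffLimitPerIndex != nil {
		att["backoff_limit_per_index"] = *in.BackoffLimitPerIndex
	}
	if in.MaxFailedIndexes != nil {
		att["max_failed_indexes"] = *in.MaxFailedIndexes
	}
	return att
}

func main() {
	cfg := map[string]interface{}{
		"completion_mode":         "Indexed",
		"backoff_limit_per_index": 3,
		"max_failed_indexes":      4,
	}
	// Indexed: both fields survive the round trip.
	fmt.Println(flatten(expand(cfg)))

	// NonIndexed: the guard drops both fields before they reach the API object.
	cfg["completion_mode"] = "NonIndexed"
	fmt.Println(len(flatten(expand(cfg))))
}
```

One thing the sketch makes visible is that the `v >= 0` guard cannot distinguish an explicit `0` from a value that arrives as `0` because it was never set. Whether unset fields reach `expandJobV1Spec` as `0` depends on the SDK and is worth checking before relying on this behavior.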