Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify the wait_for_xx bash functions #2440

Merged
merged 3 commits into from
Sep 18, 2020
Merged

Conversation

chizhg
Copy link
Member

@chizhg chizhg commented Sep 17, 2020

What this PR does, why we need it:

  • Use kubectl wait pod --for=condition=Ready -n "$1" -l '!job-name' --timeout=5m for wait_for_pods_running
  • Use kubectl wait job --for=condition=Complete --all -n "$1" for wait_until_batch_job_complete

Which issue(s) this PR fixes:
Fixes #1611

@knative-prow-robot
Copy link
Collaborator

@chizhg: GitHub didn't allow me to request PR reviews from the following users: chizhg.

Note that only knative members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

What this PR does, why we need it:

Which issue(s) this PR fixes:
Fixes #

Special notes to reviewers:

User-visible changes in this PR:

/cc

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@knative-prow-robot knative-prow-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 17, 2020
@googlebot googlebot added the cla: yes Indicates the PR's author has signed the CLA. label Sep 17, 2020
@knative-prow-robot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: chizhg

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow-robot knative-prow-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Sep 17, 2020
@chizhg chizhg changed the title [WIP] test Simplify the wait_for_xx bash functions Sep 18, 2020
@knative-prow-robot knative-prow-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 18, 2020
@chizhg
Copy link
Member Author

chizhg commented Sep 18, 2020

/cc @chaodaiG

fi
return 1
echo "Waiting until all pods in namespace $1 are up"
kubectl wait pod --for=condition=Ready -n "$1" -l '!job-name' --timeout=5m || return 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to make sure that Ready covers Running, Completed, etc right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing this out.

Actually the function name is not correct - other than wait_until_pods_running it also wait_until_job_pods_completed. But since the function is being used everywhere, let's also wait for the job pods to be completed to avoid regressions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So my $0.02, we should just use this one line: Ready and -l '!jobname'.

This doesn't properly handle Job pods today, which are what result in Completed because we need one pod to be Completed, not all (this waits for all).

@chaodaiG
Copy link
Contributor

I believe you have tested it already, so
/lgtm
/hold
In case my comment makes sense to you

@knative-prow-robot knative-prow-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 18, 2020
@knative-prow-robot knative-prow-robot added lgtm Indicates that a PR is ready to be merged. and removed lgtm Indicates that a PR is ready to be merged. labels Sep 18, 2020
kubectl wait pod --for=condition=Ready -n "$1" -l '!job-name' --timeout=5m || return 1
# Also wait for all the job pods to be completed.
# This is mainly for backward compatibility and should not throw out an error is there is no job in the ns.
kubectl wait job --for=condition=Complete --all -n "$1" --timeout=5m || true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be Completed (with a d), I believe.

We definitely don't want --all with Completed, maybe -ljobname (no !).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This kubectl command I added is waiting on job instead of pod

The status of a completed job pod is like this:

conditions:
  - lastProbeTime: null
    lastTransitionTime: "2020-09-17T19:41:07Z"
    reason: PodCompleted
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2020-09-17T19:41:21Z"
    reason: PodCompleted
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2020-09-17T19:41:21Z"
    reason: PodCompleted
    status: "False"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2020-09-17T19:41:07Z"
    status: "True"
    type: PodScheduled

It does not have a Completed condition, so the command will simply fail:

chizhg@chizhg-macbookpro:~$ kubectl wait pod --for=Completed -n istio-system -l 'job-name' --timeout=5m
error: unrecognized condition: "Completed"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

misread here as well, thanks for clarifying 🤦

echo -e "\n\nERROR: timeout waiting for jobs to complete\n${jobs}"
return 1
echo "Waiting until all batch jobs in namespace $1 run to completion."
kubectl wait job --for=condition=Complete --all -n "$1" --timeout=5m || return 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Completed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-ljobname as well

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This command is waiting on job, I have tested the command on my cluster and it works:

chizhg@chizhg-macbookpro:~$ kubectl wait job --for=condition=Complete --all -n istio-system --timeout=5m
job.batch/istio-init-crd-10-1.4.6 condition met
job.batch/istio-init-crd-11-1.4.6 condition met
job.batch/istio-init-crd-14-1.4.6 condition met
job.batch/istio-security-post-install-1.4.6 condition met

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok, not pods, I misread

Copy link
Member

@mattmoor mattmoor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@knative-prow-robot knative-prow-robot added the lgtm Indicates that a PR is ready to be merged. label Sep 18, 2020
@chizhg
Copy link
Member Author

chizhg commented Sep 18, 2020

/unhold

@knative-prow-robot knative-prow-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 18, 2020
@knative-prow-robot knative-prow-robot merged commit 22269d5 into knative:master Sep 18, 2020
@chizhg chizhg deleted the test branch September 18, 2020 02:23
chizhg added a commit to chizhg/test-infra that referenced this pull request Sep 20, 2020
knative-prow-robot pushed a commit that referenced this pull request Sep 21, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cla: yes Indicates the PR's author has signed the CLA. lgtm Indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
5 participants