Github and Build Cop Rotation

David Oppenheimer edited this page Nov 27, 2015 · 20 revisions

Kubernetes "Github and Build-cop" Rotation

Prerequisites

Traffic sources and responsibilities

  • GitHub https://github.com/kubernetes/kubernetes/issues and https://github.com/kubernetes/kubernetes/pulls: Your job is to be the first responder to all new issues and PRs. If you are not equipped to do this (which is fine!), it is your job to seek guidance!
    • Support issues should be closed and redirected to Stack Overflow (see example response below).
    • All incoming issues should be tagged with a team label (team/{api,ux,control-plane,node,cluster,csi,redhat,mesosphere,gke,release-infra,test-infra,none}); for issues that overlap teams, you can use multiple team labels
      • There is a related concept of "Github teams", which allow you to @ mention a set of people; feel free to @ mention a Github team if you wish, but this is not a substitute for adding a team/* label, which is required.
      • If the issue is reporting broken builds, broken e2e tests, or other obvious P0 issues, label the issue with priority/P0 and assign it to someone. This is the only situation in which you should add a priority/* label
        • non-P0 issues do not need a reviewer assigned initially
    • All incoming PRs should be assigned a reviewer.
    • Keep in mind that you can @ mention people in an issue/PR to bring it to their attention without assigning it to them. You can also @ mention github teams, such as @kubernetes/goog-ux or @kubernetes/kubectl
    • If you need help triaging an issue or PR, consult with (or assign it to) @brendandburns, @thockin, @bgrant0607, @quinton-hoole, @davidopp, @dchen1107, @lavalamp (all U.S. Pacific Time) or @fgrzadkowski (Central European Time).
    • At the beginning of your shift, please add team/* labels to any issues that have fallen through the cracks and don't have one. Likewise, be fair to the next person in rotation: try to ensure that every issue that gets filed while you are on duty is handled. The Github query to find issues with no team/* label is: here.
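
The labeling rules above can be sketched as a small check. This is only an illustration: representing an issue as a plain list of label strings, and the function names `missing_team_label` and `triage_problems`, are assumptions made for the example and are not part of any Kubernetes tooling.

```python
# Hypothetical helper; the team names are copied from the list above.
TEAM_LABELS = {
    "team/" + name
    for name in (
        "api", "ux", "control-plane", "node", "cluster", "csi", "redhat",
        "mesosphere", "gke", "release-infra", "test-infra", "none",
    )
}


def missing_team_label(labels):
    """Return True if none of the labels is a recognized team/* label."""
    return not any(label in TEAM_LABELS for label in labels)


def triage_problems(labels, assignee=None):
    """List the ways an issue deviates from the triage rules above."""
    problems = []
    if missing_team_label(labels):
        problems.append("needs at least one team/* label")
    # P0 is the only priority the rotation assigns, and it needs an owner.
    if "priority/P0" in labels and assignee is None:
        problems.append("P0 issues must be assigned to someone")
    return problems
```

For instance, `triage_problems(["kind/bug"])` would flag the missing team label, while an issue labeled `team/node` with no priority passes without an assignee.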

Example response for support issues:

Please re-post your question to [stackoverflow](http://stackoverflow.com/questions/tagged/kubernetes). 

We are trying to consolidate the channels to which questions for help/support are posted so that we can improve our efficiency in responding to your requests, and to make it easier for you to find answers to frequently asked questions and how to address common use cases. 

We regularly see messages posted in multiple forums, with the full response thread only in one place or, worse, spread across multiple forums. Also, the large volume of support issues on github is making it difficult for us to use issues to identify real bugs.

The Kubernetes team scans stackoverflow on a regular basis, and will try to ensure your questions don't go unanswered.

Before posting a new question, please search stackoverflow for answers to similar questions, and also familiarize yourself with:
   * [the user guide](http://kubernetes.io/v1.0/)
   * [the troubleshooting guide](http://kubernetes.io/v1.0/docs/troubleshooting.html)

Again, thanks for using Kubernetes.

The Kubernetes Team

Build-copping

  • The merge-bot submit queue (source) should auto-merge all eligible PRs for you once they've passed all the relevant checks mentioned below and all [critical e2e tests](https://goto.google.com/k8s-test/view/Critical%20Builds/) are passing. If the merge-bot has been disabled for some reason, or tests are failing, you might need to do some manual merging to get things back on track.
  • It's a good idea to check the flaky test builds once a day or so; if they are timing out, clusters are failing to start, or tests are consistently failing (instead of just flaking), file an issue to get things back on track.
  • If you are a weekday oncall, ensure that PRs conforming to the following prerequisites are being merged at a reasonable rate:
    • Have been LGTMd
    • Pass Travis and Shippable.
    • Author has signed CLA if applicable.
  • If you are a weekend oncall, never merge PRs manually. Instead, add the label "lgtm" to the PRs once they have been LGTMd and passed Travis and Shippable; this will cause merge-bot to merge them automatically (or make them easy to find by the next oncall, who will merge them).
  • When the build is broken, roll back the PRs responsible ASAP.
  • When E2E tests are unstable, a "merge freeze" may be instituted. During a merge freeze:
    • Whoever is assigned to review a PR should label it "lgtm" but not merge it.
    • Oncall should slowly merge LGTMd changes throughout the day while monitoring E2E to ensure stability.
    • Ideally the E2E run should be green, but some tests are flaky and can fail randomly (not as a result of a particular change).
      • If a large number of tests fail, or tests that normally pass fail, that is an indication that one or more of the PRs in that build might be problematic (and should be reverted).
      • Use the Test Results Analyzer to see individual test history over time.
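
The merge rules above can be summarized as a decision function. This is a rough sketch, not how merge-bot works: the `PullRequest` record, its field names, and the returned action strings are all hypothetical, and applying the CLA check on weekends is an assumption carried over from the weekday prerequisites.

```python
from dataclasses import dataclass


@dataclass
class PullRequest:
    # Hypothetical record; field names are illustrative only.
    lgtm: bool
    travis_passed: bool
    shippable_passed: bool
    cla_signed: bool
    cla_required: bool = True  # "if applicable" in the prerequisites


def meets_prerequisites(pr):
    """LGTMd, passing Travis and Shippable, and CLA signed where applicable."""
    cla_ok = pr.cla_signed or not pr.cla_required
    return pr.lgtm and pr.travis_passed and pr.shippable_passed and cla_ok


def oncall_action(pr, weekend=False, merge_freeze=False):
    """Suggest what the oncall should do with a single PR."""
    if not meets_prerequisites(pr):
        return "wait"
    if weekend:
        # Weekend oncalls never merge manually; they only apply the label.
        return "label-lgtm"
    if merge_freeze:
        # During a freeze, merge slowly and watch E2E between merges.
        return "merge-while-monitoring-e2e"
    return "merge"
```

The weekend check comes first so that a weekend oncall is never told to merge manually, even during a merge freeze.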

Contact information

@k8s-oncall will reach the current person on call.