This collects test results scattered across a variety of GCS buckets, stores them in a local SQLite database, and outputs newline-delimited JSON files for import into BigQuery.
Results are stored in the k8s-gubernator:build BigQuery dataset, which is publicly accessible.
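Because the dataset is public, you can inspect it with the `bq` CLI from any project you have configured. A minimal sketch (the table name `k8s-gubernator.build.all` is an assumption about the dataset layout, not something documented here):

```
# Count rows in the public dataset; the table name is an assumption.
bq query --nouse_legacy_sql \
  'SELECT COUNT(*) AS total_builds FROM `k8s-gubernator.build.all`'
```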
Use `pip install -r requirements.txt` to install dependencies.
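For local development you may want to do this in an isolated environment, e.g. (the virtualenv is only a suggestion and assumes a Python 3 interpreter; it is not required by the repo):

```
# Optional: create and activate a virtualenv, then install Kettle's dependencies.
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```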
Kettle runs as a pod in the `k8s-gubernator/g8r` cluster.
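To confirm where Kettle is running, you can fetch credentials for that cluster and list its pod, e.g.:

```
# Point kubectl at the cluster Kettle runs in, then find its pod.
make get-cluster-credentials
kubectl get pod -l app=kettle
```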
If you change:
- `buckets.yaml`: do nothing, it's automatically fetched from GitHub
- `deployment.yaml`: deploy with `make push deploy`
- any code: deploy with `make push update`, revert with `make rollback` if it fails (see the sketch below)
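For code changes, a typical deploy-and-verify cycle might look like the following sketch, built from the make targets above (the exact order is a suggestion, not a documented workflow):

```
# Build/push a new image and update the running deployment.
make push update

# Watch the rollout; revert only if the new pod misbehaves.
kubectl rollout status deployment/kettle
make rollback
```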
You can check that Kettle is working, e.g. by looking at the logs:
```
make get-cluster-credentials
kubectl logs -l app=kettle
# ...
==== 2018-07-06 08:19:05 PDT ========================================
PULLED 174
ACK irrelevant 172
EXTEND-ACK 2
gs://kubernetes-jenkins/pr-logs/pull/kubeflow_kubeflow/1136/kubeflow-presubmit/2385 True True 2018-07-06 07:51:49 PDT FAILED
gs://kubernetes-jenkins/logs/ci-cri-containerd-e2e-ubuntu-gce/5742 True True 2018-07-06 07:44:17 PDT FAILURE
ACK "finished.json" 2
Downloading JUnit artifacts.
```
Alternatively, navigate to the Gubernator BigQuery page (click on "all" on the left, then "Details") to see a table showing the last date/time the metrics were collected.
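If you prefer the command line, a rough equivalent is to ask BigQuery for the most recent build Kettle has uploaded (the table name `k8s-gubernator.build.all` and the `started` column are assumptions about the schema):

```
# Show the start time of the newest record; table/column names are assumptions.
bq query --nouse_legacy_sql \
  'SELECT MAX(started) AS latest_started FROM `k8s-gubernator.build.all`'
```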
```
kubectl delete pod -l app=kettle
kubectl rollout status deployment/kettle  # monitor pod restart status
kubectl get pod -l app=kettle             # should show a new pod name
```
You can watch the pod start up and collect data from the various GCS buckets by looking at its logs:
```
kubectl logs -f $(kubectl get pod -l app=kettle -oname)
```
It might take a couple of hours for Kettle to become fully functional and start updating BigQuery again. You can always go back to the Gubernator BigQuery page to check whether data collection has resumed. Backfill should happen automatically.
- Occasionally data from Kettle stops updating; we suspect this is due to a transient hang when contacting GCS (#8800). If this happens, restart Kettle as described above.