ML Testing Accelerators

A set of tools and examples for running machine learning tests on ML hardware accelerators (TPUs or GPUs) on Google Cloud Platform.

This is not an officially supported Google product.

Getting Started (full-featured standalone mode)

In this mode, your tests and/or models run on an automated schedule in Google Kubernetes Engine (GKE). Results are collected by the "Metrics Handler" and written to BigQuery.

This route is recommended if you have many tests that run for a long time and produce many metrics that you want to monitor for regressions.
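
To illustrate what monitoring a metric for regressions can look like, here is a minimal sketch in Python. It is not the Metrics Handler's actual logic: the function name `is_regression`, the standard-deviation threshold, and the sample accuracy values are all assumptions made up for this example.

```python
from statistics import mean, stdev


def is_regression(history, latest, num_stddevs=3.0, higher_is_better=True):
    """Flag `latest` if it falls outside the band seen in earlier runs.

    Hypothetical helper: `history` holds one metric's values from previous
    runs, and the band is mean +/- num_stddevs * sample standard deviation.
    """
    if len(history) < 2:
        # Not enough data points to estimate a spread, so never alert.
        return False
    center = mean(history)
    spread = stdev(history)
    if higher_is_better:
        return latest < center - num_stddevs * spread
    return latest > center + num_stddevs * spread


# Illustrative accuracy values from four earlier runs (made-up numbers).
accuracy_history = [0.760, 0.762, 0.758, 0.761]

print(is_regression(accuracy_history, 0.700))  # well below the band
print(is_regression(accuracy_history, 0.759))  # inside the band
```

A threshold based on past runs adapts to each metric's normal noise, which matters when many tests report many metrics and fixed per-metric limits would be tedious to maintain.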

  1. Install all of our development prerequisites.
  2. Follow the instructions in the deployments directory to set up a Kubernetes cluster.
  3. Follow the instructions in the images directory to set up the Docker image that your tests will run in.
  4. Deploy the Metrics Handler to Google Cloud Functions.
  5. See the templates directory for a Jsonnet template library that generates test config files.
  6. (Optional) Set up a dashboard to view test results. See the dashboard directory for instructions.
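
As a rough picture of what a generated config for a scheduled test might contain, the Python sketch below builds a Kubernetes `CronJob` manifest as a plain dictionary. This is an assumption for illustration only: the project generates its configs from Jsonnet templates, and their actual output may differ. The helper `make_cronjob`, the image path, the test name, and the command are hypothetical; `nvidia.com/gpu` is the standard Kubernetes resource name for NVIDIA GPUs.

```python
import json


def make_cronjob(name, image, schedule, command, gpu_count=0):
    """Build a batch/v1 CronJob manifest as a plain dict (hypothetical helper)."""
    container = {"name": "test", "image": image, "command": command}
    if gpu_count:
        # Request GPUs through the standard Kubernetes resource limit.
        container["resources"] = {"limits": {"nvidia.com/gpu": gpu_count}}
    return {
        "apiVersion": "batch/v1",
        "kind": "CronJob",
        "metadata": {"name": name},
        "spec": {
            "schedule": schedule,  # standard cron syntax
            "concurrencyPolicy": "Forbid",  # do not overlap runs of one test
            "jobTemplate": {
                "spec": {
                    "template": {
                        "spec": {
                            "restartPolicy": "Never",
                            "containers": [container],
                        }
                    }
                }
            },
        },
    }


# All values below are placeholders, not names used by this project.
manifest = make_cronjob(
    name="example-model-convergence",
    image="gcr.io/my-project/my-test-image:latest",
    schedule="0 6 * * *",  # once a day at 06:00
    command=["python3", "train.py", "--epochs=1"],
    gpu_count=1,
)
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied with `kubectl apply -f`, since Kubernetes accepts manifests in JSON as well as YAML.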

Getting Started (lighter-weight Continuous Integration mode)

In this mode, your tests run on GKE but are tied to a CI platform such as GitHub Actions or CircleCI. Tests can run as presubmits for pending PRs, as postsubmit checks on submitted PRs, or on a timed schedule.

This route is recommended if you want integration with GitHub and your tests are relatively short-running.

  1. Install all of our development prerequisites.
  2. Follow the instructions in the deployments directory to set up a Kubernetes cluster.
  3. See the ci_pytorch directory for the last few setup steps.

Are you interested in using ML Testing Accelerators? Email [email protected] and tell us about your use case. We're happy to help you get started.