CockroachDB Runbook Template

Overview

This document is a template for creating a custom CockroachDB runbook, a.k.a. a CockroachDB operation manual.

A runbook is a reference document that describes a CockroachDB deployment in a specific application environment, along with its related tasks, checklists, and operational procedures.

This template provides an overall structure and implementation outlines for common CockroachDB operating procedures. It is meant to expedite the creation of a custom runbook, an important deliverable of the overall IT system that helps ensure the required state of preparedness.

Customers who already have a CockroachDB runbook can use this template to check their existing manual for completeness.

In practice, CockroachDB operators will strive to automate most of the checks and procedures. This template, however, focuses on documenting the detailed checklists and steps that make up individual operational procedures. Automating these procedures is outside the scope of this document.


Terms Used in this Document

CockroachDB Node is an instance of a cockroach server process. To underscore this point: a node is not a [virtual] server, an instance of an OS, or a container. Cockroach Labs strongly recommends running one CockroachDB node per OS instance or per container.

CockroachDB Cluster is a set of connected CockroachDB Nodes that form a single system that works together on all tasks.

Platform is a set of compatible hardware, or virtualized or containerized hardware, together with related structures, on which CockroachDB can run. Examples of platforms are bare metal x86_64, AWS EC2, Google Cloud Platform, Microsoft Azure, VMware vSphere, Docker, and Kubernetes.
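
The distinction between a node and the OS instance or container it runs on can be made concrete with a small check. The following is a minimal sketch, not part of the template itself: the `inventory` mapping, its node names, and its host names are hypothetical and would come from your own deployment records. It flags any OS instance or container that hosts more than one CockroachDB node, which would go against the one-node-per-OS-instance recommendation.

```python
from collections import Counter


def hosts_with_multiple_nodes(node_to_host):
    """Return the hosts (OS instances or containers) that run more than one node."""
    counts = Counter(node_to_host.values())
    return sorted(host for host, n in counts.items() if n > 1)


# Hypothetical inventory: CockroachDB node name -> OS instance or container.
inventory = {
    "node1": "vm-a",
    "node2": "vm-b",
    "node3": "vm-b",  # shares an OS instance with node2
}

print(hosts_with_multiple_nodes(inventory))
```

An empty result means every node in the inventory has its own OS instance or container.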


Contents

  1. The Most Common Problems Experienced by CockroachDB Users
  2. System Overview
  3. Routine Maintenance Procedures
  4. Monitoring and Alerting
  5. Diagnostic and Support
  6. Emergency Procedures / Operation Continuity
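
For readers who want to check an existing runbook against this outline, a rough first pass could look like the sketch below. It is only an illustration under stated assumptions: it assumes the existing runbook is available as plain text and that it uses the same section titles as this template, which may not hold in practice. The function name `missing_sections` and the sample text are hypothetical.

```python
# Section titles taken from the Contents list of this template.
TEMPLATE_SECTIONS = [
    "The Most Common Problems experienced by CockroachDB users",
    "System Overview",
    "Routine Maintenance Procedures",
    "Monitoring and Alerting",
    "Diagnostic and Support",
    "Emergency Procedures / Operation Continuity",
]


def missing_sections(runbook_text, sections=TEMPLATE_SECTIONS):
    """Return template section titles that do not appear in the runbook text.

    Matching is a case-insensitive substring search, so it only detects
    sections whose titles match the template wording.
    """
    text = runbook_text.casefold()
    return [title for title in sections if title.casefold() not in text]


# Hypothetical excerpt of an existing runbook.
sample_runbook = """
System Overview
...
Monitoring and Alerting
...
"""

for title in missing_sections(sample_runbook):
    print("Missing section:", title)
```

A title match says nothing about whether the section's content is complete, so a manual review of each section is still needed.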