
Kubernetes Liveness and Readiness Probes


Background

Applications can exhibit instability due to various factors, including temporary loss of connectivity, misconfigurations, or internal application faults. Kubernetes ensures application health through liveness and readiness probes, facilitating automatic container restarts when necessary. However, for detailed insights into application status, resource usage, and errors, developers should integrate monitoring and observability tools alongside Kubernetes.

A probe in Kubernetes is a mechanism that performs periodic health checks on containers to manage their lifecycle. These checks help determine when to restart a container (liveness probe) or when a container is ready to handle traffic (readiness probe). Developers can define probes in their service deployments using YAML configuration files or the kubectl command-line interface, with the YAML approach being highly recommended for its clarity and version control capabilities.

Types of Probes in Kubernetes

  1. Liveness Probe
  • This determines whether the application running in a container is in a healthy state. If the liveness probe detects an unhealthy state, Kubernetes kills the container and restarts it.
  2. Readiness Probe
  • This determines whether a container is ready to handle requests or receive traffic. When this probe fails, Kubernetes stops sending traffic to that container by removing its IP from the service endpoints, without restarting the container. The expectation is that the application will eventually pass the readiness probe through internal recovery, configuration changes, or by completing its initialization tasks; if it never does, the readiness failure has to be troubleshot directly.
  • This is useful when an application performs time-consuming initial tasks, such as establishing network connections, loading files, and warming caches.
  3. Startup Probe
  • This determines whether the application within a container has started. Startup probes are crucial for applications with lengthy startup times, ensuring that liveness and readiness probes do not interfere prematurely.
  • A startup probe runs before any other probes and disables them until it succeeds. If a container fails its startup probe, the container is killed and handled according to the pod's restartPolicy. An illustrative pod spec declaring all three probe types follows this list.
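As a sketch only: the pod spec below shows how the three probe types might be declared together. The image name, endpoint paths, port, and timing values are illustrative placeholders, not VRO settings.

```yaml
# Illustrative pod spec; image, paths, port, and timings are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo
spec:
  containers:
    - name: app
      image: example.org/app:1.0   # placeholder image
      ports:
        - containerPort: 8080
      startupProbe:                # runs first; liveness/readiness wait until this succeeds
        httpGet:
          path: /healthz
          port: 8080
        failureThreshold: 30       # allow up to 30 * 10s = 300s for startup
        periodSeconds: 10
      livenessProbe:               # container is restarted after repeated failures
        httpGet:
          path: /healthz
          port: 8080
        periodSeconds: 10
      readinessProbe:              # pod is removed from service endpoints while failing
        httpGet:
          path: /ready
          port: 8080
        periodSeconds: 5
```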

K8s Probe Configuration for VRO Spring Boot Applications

All VRO Spring Boot applications (BIE Kafka and BIP API), the API gateway, and the Lighthouse API are configured the same way. This section provides a comprehensive guide to configuring Kubernetes liveness and readiness probes for Spring Boot applications.

Pre-configuration Investigation

Before configuring liveness and readiness probes for our Spring Boot applications, we need the following information:

  • Health Check Port: Identify the port used for health checks. Found in either the application.yaml file or gradle.properties file of the VRO microservice.
  • Health Check URL Path: Determine the path to the health check URL. This is often specified in the Dockerfile using a HEALTHCHECK CMD directive like this: HEALTHCHECK CMD curl --fail http://localhost:${HEALTHCHECK_PORT}/actuator/health || exit 1
  • Actuator Dependency: Verify if the VRO application includes the Spring Boot Actuator dependency by checking the build.gradle file for this line: implementation 'org.springframework.boot:spring-boot-starter-actuator'.

Configuration Steps

Step 1: Configure Liveness and Readiness probe endpoints

In the application.yaml file (located in the resources directory of the Spring Boot VRO microservice), configure the existing Spring Boot Actuator's health endpoint to include liveness and readiness probes, which are then accessible via specific paths.

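The original screenshot is not reproduced here; below is a minimal sketch of what the Actuator section of application.yaml might look like. The management server port (10301) is only an example borrowed from the testing step later on, and the exact property layout in each VRO microservice may differ.

```yaml
# Minimal sketch, not the exact VRO configuration.
management:
  server:
    port: 10301                # example health check port; use the port found during pre-configuration
  endpoint:
    health:
      probes:
        enabled: true          # exposes /actuator/health/liveness and /actuator/health/readiness
  endpoints:
    web:
      exposure:
        include: health
  health:
    livenessState:
      enabled: true            # registers the liveness state health indicator
    readinessState:
      enabled: true            # registers the readiness state health indicator
```

With probes enabled, GET requests to /actuator/health/liveness and /actuator/health/readiness report UP or DOWN based on the application's internal liveness and readiness state.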

Step 2: Helm Chart values.yaml Configuration

In the Helm chart for the application, modify the values.yaml file to configure the ports, livenessProbe, and readinessProbe. This is where the paths for liveness checks (/actuator/health/liveness) and readiness checks (/actuator/health/readiness) are specified.

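In place of the original screenshot, here is a hedged sketch of what the values.yaml entries might look like. The key names and the health check port (10301) are assumptions for illustration; the timing values match the explanations below, and the readiness values simply mirror the liveness values in this sketch.

```yaml
# Illustrative values.yaml entries; key names depend on how the chart's templates consume them.
ports:
  healthcheck: 10301             # example health check port

livenessProbe:
  httpGet:
    path: /actuator/health/liveness
    port: 10301
  initialDelaySeconds: 120       # wait 120s after container start before the first check
  periodSeconds: 10              # check every 10s
  timeoutSeconds: 10             # fail the check if no response within 10s
  failureThreshold: 3            # restart only after 3 consecutive failures

readinessProbe:
  httpGet:
    path: /actuator/health/readiness
    port: 10301
  initialDelaySeconds: 120
  periodSeconds: 10
  timeoutSeconds: 10
  failureThreshold: 3
```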

initialDelaySeconds: This setting delays the start of the liveness probe checks by 120 seconds after the container has started. This delay allows the application within the container enough time to initialize and start up before Kubernetes begins checking its liveness.

periodSeconds: This configuration specifies the frequency at which the liveness probe checks are performed. With periodSeconds set to 10, Kubernetes will check the liveness of the container every 10 seconds.

timeoutSeconds: This parameter defines the time after which the probe check is considered failed if no response is received. Here, if the liveness probe does not receive a response from the /actuator/health/liveness endpoint within 10 seconds, the check will fail. Setting an appropriate timeout prevents spurious failures in situations where the application or system is temporarily slow.

failureThreshold: This setting determines the number of consecutive failures required to consider the probe failed. With a failureThreshold of 3, Kubernetes will mark the liveness probe as failed and restart the container only after three consecutive failures. This threshold helps in avoiding unnecessary restarts for transient or short-lived issues.

Step 3: Helm Chart deployment.yaml Configuration

In the deployment.yaml file, add the configurations for ports, livenessProbe, and readinessProbe within spec.containers:

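The original screenshot is not reproduced here; the following is a rough sketch of how the container section of the chart's deployment.yaml might wire in those values. The template helpers and value paths are assumptions and will differ depending on how the VRO chart is structured.

```yaml
# Illustrative sketch only; the real VRO chart's helpers and value paths will differ.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Chart.Name }}
spec:
  template:
    spec:
      containers:
        - name: {{ .Chart.Name }}
          ports:
            - name: healthcheck
              containerPort: {{ .Values.ports.healthcheck }}
          # Probe blocks are rendered from the values.yaml entries shown in Step 2
          livenessProbe:
            {{- toYaml .Values.livenessProbe | nindent 12 }}
          readinessProbe:
            {{- toYaml .Values.readinessProbe | nindent 12 }}
```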

Step 4: Incrementing Chart.yaml and appVersion

To track changes and updates, increment the chart version in the Chart.yaml file with every change to the chart and its Helm templates. VRO has yet to decide how to automate this process (future work).
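As a hedged illustration (the chart name, description, and version numbers below are placeholders, not the real VRO values), the bump might look like this:

```yaml
# Hypothetical Chart.yaml excerpt; name and versions are placeholders.
apiVersion: v2
name: vro-example-service
description: Example VRO microservice chart
version: 0.1.1        # chart version, incremented from 0.1.0 for this change
appVersion: "1.0.1"   # bump when the packaged application itself changes
```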

Step 5: Testing endpoints

Ensure the application's health endpoints (/actuator/health/liveness and /actuator/health/readiness) are correctly implemented and return the expected statuses. Locally, spin up the specific VRO microservice and confirm the service container is running in Docker Desktop. Then, in a browser, open a URL of this format: http://localhost:${HEALTHCHECK_PORT}/actuator/health

For example, http://localhost:10301/actuator/health

It is best practice to regularly review and test the liveness and readiness configurations to ensure they accurately reflect the application's health and readiness states.

K8s Probe Configuration for VRO Ruby Applications

bIS-api (formerly BGS-api) pending...
