Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
feat: Frank/build reliable java with aca (#56)
## Purpose
<!-- Describe the intention of the changes being proposed. What problem does it solve or functionality does it add? -->

add guidance to build reliable Java, including
- Graceful shutdown
- Health probes

## Does this introduce a breaking change?
<!-- Mark one with an "x". -->
```
[ ] Yes
[ ] No
```

## Pull Request Type
What kind of change does this Pull Request introduce?
<!-- Please check the one that applies to this PR using "x". -->
```
[ ] Bugfix
[ ] Feature
[ ] Code style update (formatting, local variables)
[ ] Refactoring (no functional changes, no api changes)
[ ] Documentation content changes
[ ] Other... Please describe:
```

## How to Test
* Get the code

```
git clone [repo-address]
cd [repo-name]
git checkout [branch-name]
npm install
```

* Test the code
<!-- Add steps to run the tests suite and/or manually test -->
```
```

## What to Check
Verify that the following are valid
* ...

## Other Information
<!-- Add any other helpful information that may be needed here. -->

---------

Co-authored-by: Frank Liu <[email protected]>
1 parent cff9669, commit 2633429
Showing 8 changed files with 279 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,121 @@ | ||
--- | ||
title: '1. Add shutdown hook to a Java application' | ||
layout: default | ||
nav_order: 1 | ||
parent: 'Lab 10: Build reliable Java application on ACA' | ||
--- | ||
|
||
# Graceful shutdown on Java Container Apps | ||
To achieve zero downtime during a rolling update, a Java application must shut down gracefully. Graceful shutdown refers to the "window of opportunity" an application has to programmatically clean up resources between the time a `SIGTERM` signal is sent to a container app and the time the app actually shuts down (on receiving `SIGKILL`). See [Container App Lifecycle Shutdown](https://learn.microsoft.com/en-us/azure/container-apps/application-lifecycle-management#shutdown).
|
||
The cleanup behavior may include logic such as: | ||
- Closing database connections | ||
- Waiting for any long-running operations to finish | ||
- Clearing out a message queue | ||
- Etc. | ||
|
||
`SIGTERM` can be sent to containers during various shutdown events, including management operations (such as scale in/down or any [revision-scope changes](https://learn.microsoft.com/en-us/azure/container-apps/revisions#revision-scope-changes)) and internal platform upgrades (e.g., a node upgrade). Therefore, Java applications must handle the `SIGTERM` signal properly to achieve zero downtime.
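
Outside Spring, the JVM exposes `SIGTERM` handling through shutdown hooks. The following is a minimal sketch, not code from the Pet Clinic sample: `CleanupRegistry` and its method names are hypothetical, and it relies only on `Runtime.addShutdownHook` from the JDK. It runs registered cleanup steps in reverse order of registration and keeps going if one step fails.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Collects cleanup steps and runs them once, newest first (hypothetical helper). */
class CleanupRegistry {
    private final Deque<Runnable> steps = new ArrayDeque<>();
    private boolean ran = false;

    /** Registers a cleanup step; steps registered later run earlier. */
    public synchronized void register(Runnable step) {
        steps.push(step);
    }

    /** Runs every registered step at most once; a failing step does not block the rest. */
    public synchronized void runAll() {
        if (ran) {
            return;
        }
        ran = true;
        while (!steps.isEmpty()) {
            Runnable step = steps.pop();
            try {
                step.run();
            } catch (RuntimeException e) {
                System.err.println("Cleanup step failed: " + e);
            }
        }
    }

    /** Installs a JVM shutdown hook; the JVM runs it on normal exit or SIGTERM, but not on SIGKILL. */
    public void installShutdownHook() {
        Runtime.getRuntime().addShutdownHook(new Thread(this::runAll, "cleanup-hook"));
    }
}
```

A typical use would be to call `register(...)` for each resource as it is opened and `installShutdownHook()` once at startup. Because hooks do not run on `SIGKILL`, the steps need to finish within the grace period.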
|
||
|
||
## Step by step guidance | ||
|
||
### 1. Handle SIGTERM signal | ||
In this lab, we will guide you on how to properly handle Eureka cache issues and long HTTP requests when receiving the `SIGTERM` signal. | ||
|
||
#### 1.1 Add a Shutdown Hook for Spring Cloud Applications | ||
|
||
Eureka is designed to be an eventually consistent system. When an upstream service app is shut down, the service-consuming app (i.e., client app) will not see the registry update immediately and will continue to send requests to the upstream app. If the upstream app shuts down immediately, the client app will receive `5xx` network/IO exceptions. | ||
|
||
To gracefully shut down an upstream app that offers services via Eureka service discovery, you need to catch the `SIGTERM` signal and follow the deregister-then-wait pattern: | ||
1) Deregister the instance from the Eureka server. | ||
2) Wait until all client apps refresh their Eureka cache. | ||
3) Shut down the application. | ||
|
||
To implement this, you may refer to the sample code [EurekaGracefulShutdown.java](https://github.com/Azure-Samples/java-microservices-aca-lab/blob/main/src/spring-petclinic-customers-service/src/main/java/org/springframework/samples/petclinic/customers/shutdown/EurekaGracefulShutdown.java).
|
||
```java
import com.netflix.appinfo.ApplicationInfoManager;
import com.netflix.appinfo.InstanceInfo.InstanceStatus;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.event.ContextClosedEvent;
import org.springframework.context.event.EventListener;
import org.springframework.stereotype.Component;

@Component
@Slf4j
public class EurekaGracefulShutdown {

    private static final int WAIT_SECONDS = 30;

    @Autowired
    private ApplicationInfoManager applicationInfoManager;

    @EventListener
    public void onShutdown(ContextClosedEvent event) {
        log.info("Caught shutdown event");
        // Report this instance as DOWN so Eureka clients stop selecting it for new requests
        log.info("Marking instance as DOWN on the Eureka server");
        applicationInfoManager.setInstanceStatus(InstanceStatus.DOWN);

        // Keep serving traffic until all Eureka clients have refreshed their cache
        try {
            log.info("Waiting {} seconds before shutting down the application", WAIT_SECONDS);
            Thread.sleep(1000L * WAIT_SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        log.info("Shutting down the application now.");
    }
}
```
|
||
Note: when setting `WAIT_SECONDS`, consider the maximum possible Eureka cache intervals, including
- Eureka server cache: `eureka.server.responseCacheUpdateIntervalMs`
- Eureka client cache: `eureka.client.registryFetchIntervalSeconds`
- Ribbon load balancer cache: `ribbon.ServerListRefreshInterval`, if Ribbon is used
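
One conservative way to reason about `WAIT_SECONDS` is to treat these caches as chained, so that in the worst case their refresh intervals add up. The sketch below only does that arithmetic; `EurekaWaitBudget` is a hypothetical helper, and the 30-second values in the usage note are illustrative inputs, not values read from the lab's configuration.

```java
import java.time.Duration;

/** Estimates how long callers may still see an instance after it is marked DOWN (hypothetical helper). */
final class EurekaWaitBudget {
    private EurekaWaitBudget() {}

    /**
     * Adds the refresh intervals of the caches that sit between the registry and a caller.
     * Pass Duration.ZERO for any layer that is not in use (for example, when Ribbon is absent).
     */
    static Duration worstCase(Duration serverResponseCache,
                              Duration clientRegistryFetch,
                              Duration ribbonServerListRefresh) {
        return serverResponseCache.plus(clientRegistryFetch).plus(ribbonServerListRefresh);
    }
}
```

For example, `EurekaWaitBudget.worstCase(Duration.ofSeconds(30), Duration.ofSeconds(30), Duration.ofSeconds(30))` yields 90 seconds, and passing `Duration.ZERO` for the Ribbon interval yields 60 seconds.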
|
||
In the Pet Clinic sample, we have already added `EurekaGracefulShutdown` to all the microservices that use Eureka as the service discovery server.
|
||
|
||
{: .note } | ||
> Azure Container Apps provides built-in service discovery for microservice applications within the same container environment. You can call a container app by sending a request to `http(s)://<CONTAINER_APP_NAME>` from another app in the environment. For more details, see [Call a container app by name](https://learn.microsoft.com/en-us/azure/container-apps/connect-apps?tabs=bash#call-a-container-app-by-name). If your microservice applications are in the same container environment, you can use this feature to avoid the Eureka cache issue. | ||
#### 1.2 Config graceful shutdown for Spring-boot application | ||
Another common scenario is handling long HTTP operations. Before shutting down the application, the web server needs to finish processing all received HTTP requests. | ||
|
||
If the Java application is a Spring Boot application and does not offer services via Eureka, you can simply use Spring Boot’s built-in [graceful shutdown support](https://docs.spring.io/spring-boot/reference/web/graceful-shutdown.html). Otherwise, use the above approach. | ||
|
||
```properties
server.shutdown=graceful
spring.lifecycle.timeout-per-shutdown-phase=30s
```
This configuration uses a timeout to provide a grace period during which existing requests are allowed to complete, but no new requests will be permitted. | ||
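
The same idea (stop taking new work, then give in-flight work a bounded time to finish) can be illustrated with a plain `ExecutorService`. This is an analogy built from JDK classes only, not how Spring Boot implements graceful shutdown; `GracefulDrain` is a hypothetical name.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

/** Drains a thread pool with a bounded grace period (hypothetical helper). */
final class GracefulDrain {
    private GracefulDrain() {}

    /**
     * Stops accepting new tasks, then waits up to the timeout for submitted tasks to finish.
     * Returns true if everything finished in time; otherwise interrupts what is left and returns false.
     */
    static boolean drain(ExecutorService pool, long timeout, TimeUnit unit) {
        pool.shutdown(); // new submissions are rejected from here on
        try {
            if (pool.awaitTermination(timeout, unit)) {
                return true;
            }
            pool.shutdownNow(); // grace period expired: interrupt the remaining tasks
            return false;
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Here the `timeout` argument plays the role of the 30-second `timeout-per-shutdown-phase` value: work already accepted may complete within it, while anything submitted after `shutdown()` is rejected.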
|
||
In the Pet Clinic sample, we are using Spring Boot’s graceful shutdown support for the `gateway` application. | ||
|
||
|
||
### 2. Config terminationGracePeriodSeconds | ||
After adding proper shutdown hooks, configure `terminationGracePeriodSeconds` in Container Apps to match the cleanup wait time. `terminationGracePeriodSeconds` defaults to 30 seconds; update it to 60 seconds for the `customers-service` app.
|
||
- Portal: Go to the `Revisions blade` -> Create a new revision -> Save this new revision. | ||
![lab 10 grace periods](../../images/lab10-grace.png) | ||
|
||
{: .note } | ||
> You can set a maximum value of 600 seconds (10 minutes) for terminationGracePeriodSeconds. If an application needs upwards of 10 minutes for cleanup, it is highly recommended to revisit the application’s design to reduce this time. | ||
### 3. Test the application | ||
Use [wrk](https://github.com/wg/wrk) or any other traffic-emitting tool to verify that there is no downtime during application shutdown.

Open a terminal and emit traffic with wrk:
```bash
endpoint=$api_gateway_FQDN/api/customer/owners
wrk -t1 -c1 -d300s "$endpoint"
```
|
||
Open a new terminal and restart the revision:
```bash
# Quote the jq filter so the shell does not treat [] as a glob pattern
revision=$(az containerapp revision list \
    --name customers-service \
    --resource-group "$RESOURCE_GROUP" | jq -r '.[].name')

az containerapp revision restart \
    --revision "$revision" \
    --resource-group "$RESOURCE_GROUP"
```
|
||
In the first terminal, you should see no `5xx` errors during the entire application restart window.
![lab 10 no downtime](../../images/lab10-no-downtime.png)
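
If wrk is not available, a small JDK-only client can serve as a rough substitute. The sketch below is a stand-in under stated assumptions rather than part of the lab: `DowntimeProbe` is a hypothetical class that sends sequential GET requests with `java.net.http.HttpClient` (Java 11 or later) and counts responses with a status of 500 or above, as well as requests that fail at the connection level.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

/** Sends sequential GET requests and counts failed or 5xx responses (hypothetical helper). */
final class DowntimeProbe {
    private DowntimeProbe() {}

    static int countFailures(URI endpoint, int requests, Duration pause) throws InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(5))
                .build();
        HttpRequest request = HttpRequest.newBuilder(endpoint)
                .timeout(Duration.ofSeconds(10))
                .GET()
                .build();
        int failures = 0;
        for (int i = 0; i < requests; i++) {
            try {
                HttpResponse<Void> response =
                        client.send(request, HttpResponse.BodyHandlers.discarding());
                if (response.statusCode() >= 500) {
                    failures++; // server-side error
                }
            } catch (IOException e) {
                failures++; // connection refused, reset, or timed out
            }
            Thread.sleep(pause.toMillis());
        }
        return failures;
    }
}
```

Calling `DowntimeProbe.countFailures(URI.create(url), 300, Duration.ofSeconds(1))` while the revision restarts should return 0 if the shutdown is graceful; `url` here stands for the gateway endpoint used above, including its scheme.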
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,98 @@ | ||
--- | ||
title: '2. Config health probes for a Java Application' | ||
layout: default | ||
nav_order: 2 | ||
parent: 'Lab 10: Build reliable Java application on ACA' | ||
--- | ||
|
||
# Config health probes for a Java Application | ||
Azure Container Apps health probes allow the Container Apps runtime to regularly inspect the status of your container apps. | ||
|
||
Container Apps supports the following probes: | ||
|
||
- `Startup`. Checks if your application has successfully started. This check is separate from the liveness probe and executes during the initial startup phase of your application. | ||
- `Liveness`. Checks if your application is still running and responsive. | ||
- `Readiness`. Checks to see if a replica is ready to handle incoming requests. | ||
|
||
You can find more information in the Azure documentation: [Health probes in Azure Container Apps](https://learn.microsoft.com/en-us/azure/container-apps/health-probes?tabs=arm-template).
|
||
Health probes can help work around performance issues related to timeouts during container startup, deadlocks when running the container, and serving traffic when the container is not ready to accept traffic. | ||
|
||
## Step by step guidance | ||
|
||
### 1. Expose health probes in Spring Boot Application | ||
|
||
Spring Boot has out-of-the-box support to [manage your application availability state](https://docs.spring.io/spring-boot/docs/2.3.0.RELEASE/reference/html/production-ready-features.html#production-ready-kubernetes-probes). | ||
|
||
Add the configuration below to `customers-service.yml`:
|
||
```yml
management:
  health:
    livenessState:
      enabled: true
    readinessState:
      enabled: true
  endpoint:
    health:
      probes:
        enabled: true
```
With the above configuration, two health endpoints are exposed via Spring Boot Actuator:
- `/actuator/health/liveness` for application liveness
- `/actuator/health/readiness` for application readiness
|
||
### 2. Define a customized health indicator in Spring Boot Application | ||
In a Spring Boot app, you can define a customized `HealthIndicator`. Here is a sample `HealthIndicator` from the `customers-service` project.
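```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import lombok.extern.slf4j.Slf4j;
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;

// OwnerRepository is the Spring Data repository defined in customers-service.
@Component
@Slf4j
public class ServiceHealthIndicator implements HealthIndicator {

    private final ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
    // Written by the scheduler thread and read by request threads, hence volatile
    private volatile boolean isHealthy = false;

    private final OwnerRepository ownerRepo;

    public ServiceHealthIndicator(OwnerRepository ownerRepo) {
        this.ownerRepo = ownerRepo;
        // First check after 10 seconds, then every 5 seconds until the database is ready
        scheduler.scheduleAtFixedRate(() -> {
            checkDatabaseStatus();
            if (isHealthy) {
                scheduler.shutdown();
            }
        }, 10, 5, TimeUnit.SECONDS);
    }

    private void checkDatabaseStatus() {
        try {
            boolean databaseReady = ownerRepo.findAll().size() > 0;
            if (databaseReady) {
                isHealthy = true;
                log.info("Database is healthy. Stopping checks.");
            } else {
                log.info("Database is not healthy. Checking again in 5 seconds.");
            }
        } catch (RuntimeException e) {
            // An uncaught exception would cancel all further scheduled runs
            log.warn("Database check failed. Checking again in 5 seconds.", e);
        }
    }

    @Override
    public Health health() {
        return isHealthy ? Health.up().build() : Health.down().build();
    }
}
```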
In this sample, `ServiceHealthIndicator` reports health status `UP` only after the database query returns data. This can be helpful in scenarios where your application needs some warm-up time (e.g., cache or database preload) before receiving traffic.
|
||
|
||
### 3. Config Health probes in Azure Container Apps | ||
Health probes can be configured via either the Portal or an [ARM template](https://learn.microsoft.com/en-us/azure/container-apps/health-probes?tabs=arm-template).
|
||
|
||
- Portal: Find the application `customers-service` -> Go to the `Revisions blade` -> Create a new revision -> Save this new revision. | ||
![lab 10 health probes](../../images/lab10-liveness-probe.png) | ||
![lab 10 readiness probes](../../images/lab10-readiness-probe.png) | ||
|
||
Here, we set `initial delay seconds` in the readiness probe to 10 seconds, which aligns with the health check logic in `ServiceHealthIndicator` above.
|
||
{: .note } | ||
> Azure Container Apps is built on top of Kubernetes, and its health probes feature maps closely to Kubernetes probes. For a deeper understanding of probes, see [Kubernetes probes](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/).
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
--- | ||
title: '3. Review' | ||
layout: default | ||
nav_order: 3 | ||
parent: 'Lab 10: Build reliable Java application on ACA' | ||
--- | ||
|
||
# Review | ||
|
||
In this lab, you implemented some design practices for building reliable Java applications with Azure Container Apps. In this lab you

- Added an appropriate shutdown hook to a Java application
- Configured health probes for a Java application
|
||
|
||
The image below illustrates the end state you have built in this lab.
|
||
![lab 5 overview](../../images/lab5.png) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
--- | ||
title: 'Lab 10: Build reliable Java application on ACA' | ||
layout: default | ||
nav_order: 12 | ||
has_children: true | ||
--- | ||
|
||
# Lab 10: Build reliable Java application on ACA | ||
|
||
# Student manual | ||
|
||
## Lab scenario | ||
|
||
Azure Container Apps has a rich set of features to help you build reliable Java applications. In this lab, you will learn how to design and maintain your app for long-term health and stability. You can find more information in
- [Management and operations for the Azure Container Apps - Landing Zone Accelerator](https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/scenarios/app-platform/container-apps/management) | ||
|
||
|
||
## Objectives | ||
|
||
After you complete this lab, you will be able to: | ||
|
||
- Gracefully shut down a Java application | ||
- Config health probes for a Java Application | ||
|
||
The below image illustrates the end state you will be building in this lab. | ||
|
||
![lab 5 overview](../../images/lab5.png) | ||
|
||
## Lab Duration | ||
|
||
- **Estimated Time**: 60 minutes | ||
|
||
## Instructions | ||
|
||
During this lab, you will: | ||
|
||
- Add an appropriate shutdown hook to a Java application
- Config health probes for a Java application | ||
|
||
|
||
{: .note } | ||
> The instructions provided in this exercise assume that you successfully completed the previous exercise and are using the same lab environment, including your Git Bash session with the relevant environment variables already set. |