-
Notifications
You must be signed in to change notification settings - Fork 122
/
maintaining-high-availability.html.md.erb
44 lines (25 loc) · 2.62 KB
/
maintaining-high-availability.html.md.erb
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
---
title: How Cloud Foundry maintains high availability
owner: Release Integration
---
<%# Reset page title based on product title %>
<% if vars.platform_code != 'CF' %>
<% set_title("How", vars.app_runtime_abbr, "Maintains High Availability") %>
<% end %>
<%= vars.app_runtime_first %> deployments make up several layers of high availability to keep your apps running during system failure.
These layers include AZs, app health management, process monitoring, and VM resurrection.
## <a id='azs-2'></a> Availability zones
<%= vars.platform_name %> supports deploying apps instances across multiple AZs. This level of high availability requires that you define AZs in your IaaS. <%= vars.platform_name %> balances the apps you deploy across the AZs you defined. If an AZ goes down, you still have app instances running in another.
<%= vars.azs %>
## <a id='health-management-apps'></a> Health management for app instances
If you lose app instances for any reason, such as a bug in the app or an AZ going down, <%= vars.platform_name %> restarts new instances to maintain capacity. Under Diego architecture, the nsync, BBS, and Cell Rep components track the number of instances of each app that are running across all of the Diego cells. When these components detect a discrepancy between the actual state of the app instances in the cloud and the desired state as known by the Cloud Controller, they advise the Cloud Controller of the difference and the Cloud Controller initiates the deployment of new app instances.
For more information about the nsync, BBS, and Cell Rep components, see the [nsync, BBS, and Cell Rep](./architecture/index.html#nsync-bbs) section of the _<%= vars.app_runtime_abbr %> Components_ topic.
## <a id='monitored-processes'></a> Process monitoring
<%= vars.platform_name %> uses a BOSH agent, monit, to monitor the processes on the component VMs that work together to keep your apps running, such as nsync, BBS, and Cell Rep. If monit detects a failure, it restarts the process and notifies the BOSH agent on the VM. The BOSH agent notifies the BOSH Health Monitor, which starts the responders through plug-ins. For example, email notifications or paging.
## <a id='health-monitoring'></a> Resurrector VMs
BOSH detects if a VM is present by listening for heartbeat messages that
are sent from the BOSH agent every 60 seconds. The BOSH Health Monitor listens for
those heartbeats. When the Health Monitor finds that a VM is not responding, it passes an alert
to the Resurrector component. If the Resurrector is enabled,
it sends the IaaS a request to create a new VM instance to replace the one that failed.
<%= vars.resurrector %>