Commit 7218f12

Add docs for scale-to-zero
(cherry picked from commit 296d0b9)
1 parent 5f14c9c commit 7218f12

4 files changed: +6 -5 lines changed
docs/workloads/async/async.md

+1 -1

@@ -10,7 +10,7 @@ Async APIs are a good fit for users who want to submit longer workloads (such as
 * retrieve status and response via HTTP endpoint
 * autoscale based on queue length
 * avoid cold starts
-* scale to 0
+* scale to zero
 * perform rolling updates
 * automatically recover from failures and spot instance termination

docs/workloads/async/autoscaling.md

+2 -2

@@ -6,11 +6,11 @@ Cortex auto-scales AsyncAPIs on a per-API basis based on your configuration.

 ### Autoscaling configuration

-**`min_replicas`**: The lower bound on how many replicas can be running for an API.
+**`min_replicas`** (default: 1): The lower bound on how many replicas can be running for an API. Scale-to-zero is supported.

 <br>

-**`max_replicas`**: The upper bound on how many replicas can be running for an API.
+**`max_replicas`** (default: 100): The upper bound on how many replicas can be running for an API.

 <br>

docs/workloads/realtime/autoscaling.md

+2 -2

@@ -18,11 +18,11 @@ In addition to the autoscaling configuration options (described below), there ar

 ### Autoscaling configuration

-**`min_replicas`**: The lower bound on how many replicas can be running for an API.
+**`min_replicas`** (default: 1): The lower bound on how many replicas can be running for an API. Scale-to-zero is supported (experimental).

 <br>

-**`max_replicas`**: The upper bound on how many replicas can be running for an API.
+**`max_replicas`** (default: 100): The upper bound on how many replicas can be running for an API.

 <br>
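
Note on the `min_replicas` / `max_replicas` bounds documented above: the sketch below illustrates how a lower and upper bound of this kind could be applied to a desired replica count, and why setting `min_replicas` to 0 is what permits scale-to-zero. It is not Cortex's implementation. The `AutoscalingConfig` class, the `clamp_replicas` function, and the validation rules are hypothetical names and assumptions introduced for illustration; only the two field names and their defaults (1 and 100) come from this commit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AutoscalingConfig:
    """Hypothetical container for the two bounds; defaults follow the docs in this commit."""

    min_replicas: int = 1
    max_replicas: int = 100

    def __post_init__(self) -> None:
        # Assumed validation rules, not taken from Cortex itself.
        if self.min_replicas < 0:
            raise ValueError("min_replicas must be >= 0")
        if self.max_replicas < self.min_replicas:
            raise ValueError("max_replicas must be >= min_replicas")


def clamp_replicas(desired: int, cfg: AutoscalingConfig) -> int:
    """Restrict a desired replica count to the range [min_replicas, max_replicas]."""
    return max(cfg.min_replicas, min(desired, cfg.max_replicas))


if __name__ == "__main__":
    default_cfg = AutoscalingConfig()
    zero_cfg = AutoscalingConfig(min_replicas=0)

    # With the default lower bound, an idle API keeps one replica running.
    print(clamp_replicas(0, default_cfg))
    # With min_replicas set to 0, the same idle API may scale to zero.
    print(clamp_replicas(0, zero_cfg))
    # Demand above the upper bound is capped at max_replicas.
    print(clamp_replicas(250, default_cfg))
```

Under these assumptions, an API only reaches zero replicas when `min_replicas` is explicitly set to 0, since the documented default of 1 keeps one replica running.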

docs/workloads/realtime/realtime.md

+1

@@ -9,6 +9,7 @@ Realtime APIs are a good fit for users who want to run stateless containers as a
 * respond to requests synchronously
 * autoscale based on request volume
 * avoid cold starts
+* scale to zero
 * perform rolling updates
 * automatically recover from failures and spot instance termination
 * perform A/B tests and canary deployments
