Scaling

Click the Scaling link to define the number of pods or instances of the application you want to deploy initially.

If you are creating a serverless deployment, you can also configure the following settings:

Min Pods determines the lower limit for the number of pods that must be running at any given time for a Knative service. This is also known as the minScale setting.
Max Pods determines the upper limit for the number of pods that can be running at any given time for a Knative service. This is also known as the maxScale setting.
Concurrency target determines the number of concurrent requests desired for each instance of the application at a given time.
Concurrency limit determines the limit for the number of concurrent requests allowed for each instance of the application at a given time.
Concurrency utilization determines the percentage of the concurrent requests limit that must be met before Knative scales up additional pods to handle additional traffic.
Autoscale window defines the time window over which metrics are averaged to provide input for scaling decisions when the autoscaler is not in panic mode. A service is scaled-to-zero if no requests are received during this window. The default duration for the autoscale window is 60s. This is also known as the stable window.

Provide feedback

Saved searches