The package provides two types of server-side throttling for HTTP services:
- In-flight throttling limits the total number of currently served (in-flight) HTTP requests.
- Rate-limit throttling limits the rate of HTTP requests using either the leaky bucket or the sliding window algorithm (configurable; the leaky bucket algorithm is used by default).
Throttling is implemented as a typical Go HTTP middleware and can be configured from code or from a JSON/YAML configuration file.
Please see the testable example to understand how to configure and use the middleware.
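For orientation, below is a minimal sketch of wiring the middleware into a plain `net/http` server. It is a hedged sketch, not the package's canonical example: only `MiddlewareWithOpts` and `MiddlewareOpts` are taken from the examples later in this document, while `cfg`, `throttleMetrics`, the acceptability of empty `MiddlewareOpts`, and the `func(http.Handler) http.Handler` middleware shape are assumptions here.

```go
// Assumption: cfg and throttleMetrics are prepared as the package's testable
// example shows (e.g. cfg unmarshaled from the YAML configuration described below).
mw := MiddlewareWithOpts(cfg, "my-app-domain", throttleMetrics, MiddlewareOpts{})

mux := http.NewServeMux()
mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
	_, _ = w.Write([]byte("ok"))
})

// Wrap the whole mux so every request passes through the throttling middleware,
// assuming it follows the usual func(http.Handler) http.Handler shape.
log.Fatal(http.ListenAndServe(":8080", mw(mux)))
```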
The throttling configuration is usually stored in a JSON or YAML file. The configuration consists of the following parts:
- `rateLimitZones`. Each zone has a rate limit, burst limit, and other parameters.
- `inFlightLimitZones`. Each zone has an in-flight limit, backlog limit, and other parameters.
- `rules`. Each rule contains a list of routes and the rate/in-flight limiting zones that should be applied to these routes.
Global throttling assesses all traffic coming into an API from all sources and ensures that the overall rate/concurrency limit is not exceeded. Overwhelming an endpoint with traffic is an easy and efficient way to carry out a denial-of-service attack. By using a global rate/concurrency limit, you can ensure that all incoming requests are within a specific limit.
```yaml
rateLimitZones:
  rl_total:
    rateLimit: 5000/s
    burstLimit: 10000
    responseStatusCode: 503
    responseRetryAfter: 5s

inFlightLimitZones:
  ifl_total:
    inFlightLimit: 5000
    backlogLimit: 10000
    backlogTimeout: 30s
    responseStatusCode: 503
    responseRetryAfter: 1m

rules:
  - routes:
      - path: "/"
    rateLimits:
      - zone: rl_total
    inFlightLimits:
      - zone: ifl_total
```
With this configuration, all HTTP requests will be limited by rate (no more than 5000 requests per second, `rateLimit`). Excessive requests within the burst limit (10000 here, `burstLimit`) will be served immediately regardless of the specified rate; requests above the burst limit will be rejected with the 503 error (`responseStatusCode`) and the "Retry-After: 5" HTTP header (`responseRetryAfter`).

Additionally, there is a concurrency limit. If 5000 requests (`inFlightLimit`) are being processed right now, new incoming requests will be backlogged (suspended). If there are more than 10000 such backlogged requests (`backlogLimit`), the rest will be rejected immediately. A request can stay backlogged for no more than 30 seconds (`backlogTimeout`); after that, it will be rejected. The response for a rejected request contains the 503 HTTP error code (`responseStatusCode`) and the "Retry-After: 60" HTTP header (`responseRetryAfter`).

`backlogLimit` and `backlogTimeout` can be specified for rate-limiting zones too.
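For example, a request rejected by the `rl_total` rate-limiting zone above would receive a response along these lines (only the status line and the header discussed above are shown):

```
HTTP/1.1 503 Service Unavailable
Retry-After: 5
```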
Per-client throttling is focused on controlling traffic from individual sources and making sure that API clients are staying within their prescribed limits. Per-client throttling helps avoid situations where a single client exhausts the resources of the entire backend service (for example, by using all connections from the DB pool) while all other clients have to wait for their release.
To implement per-client throttling, the package uses the concept of "identity". If the client is identified by a unique key, the package can throttle requests per this key.
The `MiddlewareOpts` struct has a `GetKeyIdentity` callback that should return the key for the current request. It may be a user ID, a JWT "sub" claim, or any other unique identifier. If the `key.type` field of any rate/in-flight limiting zone is set to `identity`, the `GetKeyIdentity` callback must be implemented.
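A minimal sketch of such a callback is shown below. The exact `GetKeyIdentity` signature, the meaning of its second return value (assumed here to mean "bypass throttling for this request"), and the `userIDFromContext` helper are assumptions for illustration; check the package's Go documentation for the real definitions.

```go
opts := MiddlewareOpts{
	// Assumed signature: return the per-client key for the request.
	// The key is taken from a hypothetical userIDFromContext helper that
	// extracts the authenticated user ID (e.g. the JWT "sub" claim).
	GetKeyIdentity: func(r *http.Request) (string, bool, error) {
		userID, ok := userIDFromContext(r.Context())
		if !ok {
			// No identity available: report an error
			// (or bypass throttling, depending on the policy you need).
			return "", false, fmt.Errorf("missing user identity")
		}
		return userID, false, nil
	},
}
```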
Example of per-client throttling configuration:
```yaml
rateLimitZones:
  rl_identity:
    rateLimit: 50/s
    burstLimit: 100
    responseStatusCode: 429
    responseRetryAfter: auto
    key:
      type: identity
    maxKeys: 50000

inFlightLimitZones:
  ifl_identity:
    inFlightLimit: 64
    backlogLimit: 128
    backlogTimeout: 30s
    responseStatusCode: 429
    key:
      type: identity
    maxKeys: 50000
  ifl_identity_expensive_op:
    inFlightLimit: 4
    backlogLimit: 8
    backlogTimeout: 30s
    responseStatusCode: 429
    key:
      type: identity
    maxKeys: 50000
    excludedKeys:
      - "150853ab-322c-455d-9793-8d71bf6973d9" # Exclude root admin.

rules:
  - routes:
      - path: "/"
    rateLimits:
      - zone: rl_identity
    inFlightLimits:
      - zone: ifl_identity
    alias: per_identity
  - routes:
      - path: "= /api/v1/do_expensive_op_1"
        methods: POST
      - path: "= /api/v1/do_expensive_op_2"
        methods: POST
    inFlightLimits:
      - zone: ifl_identity_expensive_op
    alias: per_identity_expensive_ops
```
All throttling counters are stored in an in-memory LRU cache (`maxKeys` determines its size).

For a rate-limiting zone, `responseRetryAfter` may be specified as `"auto"`. In this case, the time when a client may retry the request will be calculated automatically.

Each throttling rule may contain an unlimited number of rate/in-flight limiting zones. All of a rule's zones are applied to all of its routes. A route is described as a path plus a list of HTTP methods. Routes are selected using exactly the same algorithm that Nginx uses to select a location (http://nginx.org/en/docs/http/ngx_http_core_module.html#location). A route may also have an alias that is used in the Prometheus metrics label (see the example below).
```yaml
rateLimitZones:
  rl_identity:
    alg: sliding_window
    rateLimit: 15/m
    responseStatusCode: 429
    responseRetryAfter: auto
    key:
      type: identity
    maxKeys: 50000

rules:
  - routes:
      - path: "/"
    rateLimits:
      - zone: rl_identity
    alias: per_identity
```
In this example, the sliding window algorithm is used for rate limiting (the `alg` parameter has the `"leaky_bucket"` value by default). It means that only 15 requests are allowed per minute. They may even be sent simultaneously, but all exceeding requests received within the same minute will be rejected.
Throttling can also be keyed by the value of an HTTP header. In the following example, requests with an empty or known "bad" User-Agent header value are rate-limited:

```yaml
rateLimitZones:
  rl_bad_user_agents:
    rateLimit: 500/s
    burstLimit: 1000
    responseStatusCode: 503
    responseRetryAfter: 15s
    key:
      type: header
      headerName: "User-Agent"
      noBypassEmpty: true
    includedKeys:
      - ""
      - "Go-http-client/1.1"
      - "python-requests/*"
      - "Python-urllib/*"
    maxKeys: 1000

rules:
  - routes:
      - path: "/"
    rateLimits:
      - zone: rl_bad_user_agents
```
Another option is to throttle requests by the client's remote address:

```yaml
rateLimitZones:
  rl_by_remote_addr:
    rateLimit: 100/s
    burstLimit: 1000
    responseStatusCode: 503
    responseRetryAfter: auto
    key:
      type: remote_addr
    maxKeys: 10000

rules:
  - routes:
      - path: "/"
    rateLimits:
      - zone: rl_by_remote_addr
```
The package collects several metrics in the Prometheus format:
- `rate_limit_rejects_total`. Type: counter; labels: `dry_run`, `rule`.
- `in_flight_limit_rejects_total`. Type: counter; labels: `dry_run`, `rule`, `backlogged`.
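As an illustration only (the counter values and label values below are made up; the `rule` label carries the rule's alias or route path), a scrape of these counters might look like this:

```
rate_limit_rejects_total{dry_run="no",rule="per_identity"} 17
in_flight_limit_rejects_total{backlogged="no",dry_run="no",rule="per_identity"} 3
in_flight_limit_rejects_total{backlogged="yes",dry_run="no",rule="per_identity"} 1
```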
Tags are useful when different rules of the same configuration should be used by different middlewares. For example, suppose you want to have two different throttling rules:
- A rule for all requests.
- A rule for all identity-aware (authorized) requests.
```yaml
# ...
rules:
  - routes:
      - path: "/hello"
        methods: GET
    rateLimits:
      - zone: rl_zone1
    tags: all_reqs
  - routes:
      - path: "/feedback"
        methods: POST
    inFlightLimits:
      - zone: ifl_zone1
    tags: all_reqs
  - routes:
      - path: /api/1/users
        methods: PUT
    rateLimits:
      - zone: rl_zone2
    tags: require_auth_reqs
# ...
```
In your code, you will have two middlewares that will be executed at different steps of the HTTP request serving process. Each middleware should only apply its own throttling rule.
```go
allMw := MiddlewareWithOpts(cfg, "my-app-domain", throttleMetrics, MiddlewareOpts{Tags: []string{"all_reqs"}})
requireAuthMw := MiddlewareWithOpts(cfg, "my-app-domain", throttleMetrics, MiddlewareOpts{Tags: []string{"require_auth_reqs"}})
```
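How these two middlewares are placed in the handler chain is application-specific; a hypothetical arrangement (the `authMw` and `apiHandler` names are placeholders, and the middlewares are assumed to follow the usual `func(http.Handler) http.Handler` shape) could be:

```go
// allMw sees every incoming request; requireAuthMw is placed after the
// authentication middleware, so it throttles only identity-aware (authorized) requests.
handler := allMw(authMw(requireAuthMw(apiHandler)))
log.Fatal(http.ListenAndServe(":8080", handler))
```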
Before configuring real-life throttling, it's usually a good idea to try the dry-run mode first. It doesn't affect the request processing flow; however, all excessive requests are still counted and logged. Dry-run mode allows you to better understand how your API is used and determine the right throttling parameters.
The dry-run mode can be enabled using the `dryRun` configuration parameter. Example:
```yaml
rateLimitZones:
  rl_identity:
    rateLimit: 50/s
    burstLimit: 100
    responseStatusCode: 429
    responseRetryAfter: auto
    key:
      type: identity
    maxKeys: 50000
    dryRun: true

inFlightLimitZones:
  ifl_identity:
    inFlightLimit: 64
    backlogLimit: 128
    backlogTimeout: 30s
    responseStatusCode: 429
    key:
      type: identity
    maxKeys: 50000
    dryRun: true

rules:
  - routes:
      - path: "/"
    rateLimits:
      - zone: rl_identity
    inFlightLimits:
      - zone: ifl_identity
    alias: per_identity
```
If the specified limits are exceeded, the corresponding messages will be logged.

For rate limiting:

```json
{"msg": "too many requests, serving will be continued because of dry run mode", "rate_limit_key": "ee9a0dd8-7396-5478-8b83-ab7402d6746b"}
```

For in-flight limiting:

```json
{"msg": "too many in-flight requests, serving will be continued because of dry run mode", "in_flight_limit_key": "3c00e780-5721-59f8-acad-f0bf719777d4"}
```
Copyright © 2024 Acronis International GmbH.
Licensed under MIT License.