Rate and concurrency limiting

Rate limiting consists of counting how many requests a server accepts over a period of time, and rejecting new ones once a limit is reached.

Concurrency limiting consists of counting how many requests the web server is currently serving for the same remote user, and rejecting new ones once that count reaches a defined threshold. Since many requests can reach the server simultaneously, a concurrency limiter needs a small allowance above its threshold.

Both are implemented using the same technique. Let's look at how to build a rate limiter.

OpenResty ships with a rate limiting library written in Lua called lua-resty-limit-traffic (https://github.com/openresty/lua-resty-limit-traffic); you can use it in an access_by_lua_block section.

The library uses a Lua shared dict, which is a memory zone shared by all the worker processes of the same nginx instance. Using an in-memory dict means that the rate limiting state is local to each nginx instance.

Since we typically deploy one nginx per service node, the rate limiting happens per web server. So, if you deploy several nodes for the same microservice behind a load balancer, you have to take this into account when you set the threshold. For example, if the service should accept at most 800 requests per second globally and is served by four load-balanced nodes, each node's limiter should be set to 200.

In the following example, we're adding a lua_shared_dict definition and an access_by_lua_block section to activate the rate limiting. Note that this example is a simplified version of the one in the project's documentation:

    ...
    http {
      ...
      lua_shared_dict my_limit_req_store 100m;

      server {
        access_by_lua_block {
          local limit_req = require "resty.limit.req"

          -- limit each client to 200 requests/sec,
          -- with a burst allowance of 100 extra requests/sec
          local lim, err = limit_req.new("my_limit_req_store", 200, 100)
          if not lim then
            return ngx.exit(500)
          end

          local key = ngx.var.binary_remote_addr
          local delay, err = lim:incoming(key, true)
          if not delay then
            if err == "rejected" then
              return ngx.exit(503)
            end
            return ngx.exit(500)
          end

          if delay >= 0.001 then
            ngx.sleep(delay)
          end
        }

        location / {
          proxy_pass ...
        }
      }
    }

The access_by_lua_block section can be considered as a Lua function, and it can use some of the variables and functions OpenResty exposes. For instance, ngx.var is a table containing all the nginx variables, and ngx.exit() is a function that can be used to immediately return a response to the caller; in our case, a 503 when we need to reject a call because of rate limiting.
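To give a feel for these primitives outside of rate limiting, here's a small, hypothetical access_by_lua_block that rejects requests missing an X-Api-Key header; the header name and the 403 status are arbitrary choices for the example:

    access_by_lua_block {
      -- nginx exposes request headers through ngx.var with an http_ prefix
      local api_key = ngx.var.http_x_api_key
      if not api_key then
        -- immediately return a response without hitting the upstream
        return ngx.exit(403)
      end
    }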

The library uses the my_limit_req_store dict, whose name is passed to resty.limit.req's new() function. Every time a request reaches the server, the code calls the incoming() method with the binary_remote_addr value, which is the client address.

The incoming() method uses the shared dict to maintain the request rate per remote address, and returns a rejected error when that rate exceeds the threshold plus the burst allowance; in this example, when a client sends more than 300 requests per second.

If the request is accepted, the incoming() method returns a delay value, and the code holds the request for that long using the non-blocking ngx.sleep() function. The delay is 0 as long as the remote client stays under the rate of 200 requests per second, and grows as the rate climbs from 200 toward 300, giving the server a chance to absorb the pending requests.

This elegant design is quite effective at preventing a service from being overwhelmed by a flood of requests. Setting up a ceiling like this is also a good way to avoid reaching the point where you know your microservice will start to break.

For instance, if your benchmarks concluded that your service cannot serve more than 100 simultaneous requests before starting to crash, you can set the limit so that it's nginx that rejects the excess, instead of letting your Flask microservice pile up error logs and burn CPU just to handle rejections.
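If what you want to cap is precisely the number of simultaneous requests, the same library provides a resty.limit.conn class that counts in-flight requests instead of the request rate. The following is a minimal sketch based on the pattern documented by the project; the my_limit_conn_store name and the thresholds (100 concurrent requests plus a burst of 20, delaying each excess request by roughly half a second) are arbitrary values for illustration:

    lua_shared_dict my_limit_conn_store 100m;

    server {
      access_by_lua_block {
        local limit_conn = require "resty.limit.conn"

        -- allow 100 concurrent requests per client, with a burst
        -- allowance of 20; each excess request is delayed by ~0.5s
        local lim, err = limit_conn.new("my_limit_conn_store", 100, 20, 0.5)
        if not lim then
          return ngx.exit(500)
        end

        local key = ngx.var.binary_remote_addr
        local delay, err = lim:incoming(key, true)
        if not delay then
          if err == "rejected" then
            return ngx.exit(503)
          end
          return ngx.exit(500)
        end

        -- remember the key so we can decrement the counter later
        if lim:is_committed() then
          ngx.ctx.limit_conn = lim
          ngx.ctx.limit_conn_key = key
        end

        if delay >= 0.001 then
          ngx.sleep(delay)
        end
      }

      log_by_lua_block {
        -- the request is done; decrement the concurrency counter
        local lim = ngx.ctx.limit_conn
        if lim then
          lim:leaving(ngx.ctx.limit_conn_key, tonumber(ngx.var.request_time))
        end
      }
    }

Unlike the rate limiter, a concurrency limiter needs to be told when each request finishes, which is what the leaving() call in the log phase does.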

The key used to compute the rate in this example is the remote address of the connection. If your nginx server is itself behind a proxy, that address is the proxy's, so you would end up rate limiting a single remote client: the proxy server. In that case, make sure you build the key from the real client address, which is usually carried in the X-Forwarded-For header.
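A sketch of what that key lookup could look like follows. Note that X-Forwarded-For can carry a comma-separated chain of addresses; the first entry is the original client, assuming you trust the proxies that appended the others:

    access_by_lua_block {
      -- use the first X-Forwarded-For entry when present,
      -- and fall back to the connection's remote address
      local xff = ngx.var.http_x_forwarded_for
      local key = xff and xff:match("^[^,]+") or ngx.var.binary_remote_addr
      -- ... pass key to lim:incoming() as before ...
    }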

If you want a WAF with more features, the lua-resty-waf (https://github.com/p0pr0ck5/lua-resty-waf) project works like lua-resty-limit-traffic, but offers a lot of other protections. It's also able to read ModSecurity rule files, so you can use the rule files from the OWASP project without having to use ModSecurity itself.
