7

Securing Your Services

So far in this book, all the interactions between services were done without any form of authentication or authorization; each HTTP request would happily return a result. This cannot happen in production for two simple reasons: we need to know who is calling the service (authentication), and we need to make sure that the caller is allowed to perform the call (authorization). For instance, we probably don't want an anonymous caller to delete entries in a database.

In a monolithic web application, simple authentication can happen with a login form, and once the user is identified a cookie is set with a session identifier so that the client and server can collaborate on all subsequent requests. In a microservice-based architecture, we cannot use this scheme everywhere because services are not users and won't use web forms for authentication. We need a way to accept or reject calls between services automatically.

The OAuth2 authorization protocol gives us the flexibility to add authentication and authorization in our microservices, which can be used to authenticate both users and services. In this chapter, we will discover the essential features of OAuth2 and how to implement an authentication microservice. This service will be used to secure service-to-service interactions.

A few things can be done at the code level to protect your services, such as controlling system calls, or making sure HTTP redirects are not ending up on hostile web pages. We will discuss how to add protection against badly formed data, some common pitfalls to avoid, and demonstrate how you can scan your code against potential security issues.

Lastly, securing services also means we want to filter out any suspicious or malicious network traffic before it reaches our application. We will look at setting up a basic web application firewall to defend our services.

The OAuth2 protocol

If you are reading this book, you are in all likelihood someone who has logged in to a web page with a username and password. It's a straightforward model to confirm who you are, but there are drawbacks.

Many different websites exist, and each needs to properly handle someone's identity and password. The potential for security leaks multiplies with the number of different places an identity is stored, and how many routes a password can take through the different systems involved. It also becomes easier for attackers to create fake sites, as people become used to entering their username and password in multiple different places that may all look slightly different. Instead, you have probably come across websites that let you "Login with Google," Microsoft, Facebook, or GitHub. This feature uses OAuth2, or tools built on top of it.

OAuth2 is a standard that is widely adopted for securing web applications and their interactions with users and other web applications. Only one service ever gets told your password or multi-factor authentication codes, and any site that needs to authenticate you directs you there. There are two types of grant that we will cover here, the first being the Authorization Code Grant, which is initiated by a human using a browser or mobile app.

The process for a user-driven Authorization Code Grant, depicted in Figure 7.1, looks complicated, but it serves an important purpose. Following the figure through, when a client requests a resource—whether it is a web page or some data, say—that they must log in to view, the application sends a 302 redirection to visit the authentication service. In that URL will be another address that the authentication service can use to send the client back to the application.

Once the client connects, the authentication service does the things you might expect—it asks for a username, password, and multi-factor authentication codes, and some may even display a picture or some text to demonstrate that you are visiting the right place. After logging in correctly, the authentication service redirects the client back to the application, this time with a token to present.

The application can validate the token with the authentication service, and can remember that result until the token expires, or for some other configurable length of time, occasionally rechecking it to check that the token hasn't been revoked. This way the application never has to deal with a username or password, and only has to learn enough to uniquely identify the client.

Figure 7.1: The OAuth2 authentication flow

When setting up OAuth2 for a program to use, so that one service can connect to another, there is a similar process called Client Credentials Grant (CCG) in which a service can connect to the authentication microservice and ask for a token that it can use. You can refer to the CCG scenario described in section 4.4 of the OAuth2 Authorization Framework for more information: https://tools.ietf.org/html/rfc6749#section-4.4.

This works like the authorization code, but the service is not redirected to a web page as a user is. Instead, it's implicitly authorized with a secret key that can be traded for a token.

For a microservices-based architecture, using these two types of grants will let us centralize every aspect of authentication and authorization of the system. Building a microservice that implements part of the OAuth2 protocol to authenticate services and keep track of how they interact with each other is a good solution to reduce security issues—everything is centralized in a single place.

The CCG flow is by far the most interesting aspect to look at in this chapter, because it allows us to secure our microservice interactions independently from the users. It also simplifies permission management, since we can issue tokens with different scopes depending on the context. The applications are still responsible for enforcing what those scopes can and cannot do.

If you do not want to implement and maintain the authentication part of your application and you can trust a third party to manage this process, then Auth0 is an excellent commercial solution that provides all the APIs needed for a microservice-based application: https://auth0.com/.

X.509 certificate-based authentication

The X.509 standard (https://datatracker.ietf.org/doc/html/rfc5280) is used to secure the web. Every website using TLS—the ones with https:// URLs—has an X.509 certificate on its web server, and uses it to verify the server's identity and set up the encryption the connection will use.

How does a client verify a server's identity when it is presented with such a certificate? Each properly issued certificate is cryptographically signed by a trusted authority. A Certificate Authority (CA) will often be the one issuing the certificate to you and will be the ultimate organization that browsers rely on to know who to trust. When the encrypted connection is being negotiated, a client will examine the certificate it's given and check who has signed it. If it is a trusted CA and the cryptographic checks are passed, then we can assume the certificate represents who it claims to. Sometimes the signer is an intermediary, so this step should be repeated until the client reaches a trusted CA.

It is possible to create a self-signed certificate, and this can be useful in test suites or for local development environments—although it is the digital equivalent of saying, "trust me, because I said so." A production service should not use self-signed certificates, and if the browser issues a warning, the human sitting in front of it will be right to be wary of the site they're accessing.

Obtaining a good certificate is significantly easier than it used to be, thanks to Let's Encrypt (https://letsencrypt.org/). Organizations that charge money for a certificate still offer value—there are features such as Extended Validation that aren't easily automated, and sometimes that extra display in the browser, often as a green padlock in the address bar, is worth it.

Let us generate a certificate using Let's Encrypt, and use some command-line tools to examine it. On the Let's Encrypt website there are instructions to install a utility called certbot. The instructions will vary slightly depending on the platform being used, so we won't include them here. Once certbot is installed, obtaining a certificate for a web server such as nginx is simple:

$ sudo certbot --nginx
No names were found in your configuration files. Please enter in your domain
name(s) (comma and/or space separated)  (Enter 'c' to cancel): certbot-test.mydomain.org
Requesting a certificate for certbot-test.mydomain.org
Performing the following challenges:
http-01 challenge for certbot-test.mydomain.org
Waiting for verification...
Cleaning up challenges
Deploying Certificate to VirtualHost /etc/nginx/sites-enabled/default
Redirecting all traffic on port 80 to ssl in /etc/nginx/sites-enabled/default
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Congratulations! You have successfully enabled https://certbot-test.mydomain.org
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Now we can examine our nginx configuration, and the site mentioned in the certbot output—/etc/nginx/sites-enabled/default. We can also see that the certificates have been set up for us, and we could have told certbot to just generate some certificates and let us install them if we wanted more fine-grained control over what happens with our configuration. In the following snippet of nginx configuration, we see the parts that certbot has added in order to secure the web service:

listen [::]:443 ssl ipv6only=on; # managed by Certbot
listen 443 ssl; # managed by Certbot
ssl_certificate /etc/letsencrypt/live/certbot-test.mydomain.org/fullchain.pem; # managed by Certbot
ssl_certificate_key /etc/letsencrypt/live/certbot-test.mydomain.org/privkey.pem; # managed by Certbot
include /etc/letsencrypt/options-ssl-nginx.conf; # managed by Certbot
ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem; # managed by Certbot

We can use the OpenSSL toolkit to examine our certificate, both by looking at the file and by sending a query to the web server. Examining the certificate will provide a lot of information, although the important pieces for us include the sections on Validity and Subject. A certificate expiring without being renewed is a common error condition when running a service; certbot includes helpers to automatically refresh certificates that are about to expire, and so if we use the provided tools, this should not be a problem.
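Most certbot packages install that renewal check as a scheduled task for you. You can exercise it safely with the dry-run mode of the real renewal command, which rehearses the renewal against the Let's Encrypt staging environment without replacing any files:

$ sudo certbot renew --dry-run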

A certificate subject describes the entity that the certificate has been created for, and in this instance, that is a hostname. The certificate presented here has a subject Common Name (CN) of certbot-test.mydomain.org, but if that's not the hostname we are using then the clients connecting to our service will rightfully complain.

In order to examine a certificate's details, including the subject, we can use the openssl utility to display the certificate:

$ sudo openssl x509 -in /etc/letsencrypt/live/certbot-test.mydomain.org/fullchain.pem  -text -noout
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            04:92:e3:37:a4:83:77:4f:b9:d7:5c:62:24:74:7e:a4:5a:e0
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: C = US, O = Let's Encrypt, CN = R3
        Validity
            Not Before: Mar 13 14:43:12 2021 GMT
            Not After : Jun 11 14:43:12 2021 GMT
        Subject: CN = certbot-test.mydomain.org
...

It is also possible to connect to a running web server using the openssl utility, which may be useful to confirm that the correct certificate is being used, to run monitoring scripts for certificates that will soon expire, or for other such diagnostics. Using the nginx instance we configured above, we can establish an encrypted session over which we can send HTTP commands:

$ openssl s_client -connect localhost:443
CONNECTED(00000003)
Can't use SSL_get_servername
depth=2 O = Digital Signature Trust Co., CN = DST Root CA X3
verify return:1
depth=1 C = US, O = Let's Encrypt, CN = R3
verify return:1
depth=0 CN = certbot-test.mydomain.org
verify return:1
---
Certificate chain
 0 s:CN = certbot-test.mydomain.org
   i:C = US, O = Let's Encrypt, CN = R3
 1 s:C = US, O = Let's Encrypt, CN = R3
   i:O = Digital Signature Trust Co., CN = DST Root CA X3
---
Server certificate
-----BEGIN CERTIFICATE-----
MII  
# A really long certificate has been removed here
-----END CERTIFICATE-----
subject=CN = certbot-test.mydomain.org
issuer=C = US, O = Let's Encrypt, CN = R3
---
New, TLSv1.3, Cipher is TLS_AES_256_GCM_SHA384
Server public key is 2048 bit
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
Early data was not sent
Verify return code: 0 (ok)
---

We can easily read the public certificate in this exchange, and confirm it is the one we are expecting the server to use, from its configuration file. We can also discover which encryption suites have been negotiated between the client and server, and identify any that might be a cause of problems if older client libraries or web browsers are being used.

So far, we have only discussed the server using certificates to verify its identity and to establish a secure connection. It is also possible for the client to present a certificate to authenticate itself. The certificate would allow our application to verify that the client is who they claim to be, but we should be careful, as it does not automatically mean that the client is allowed to do something—that control still lies with our own application. Managing these certificates, setting up a CA to issue the appropriate certificates for clients, and how to properly distribute the files, are beyond the scope of this book. If it is the right choice for an application you are creating, a good place to start is the nginx documentation at http://nginx.org/en/docs/http/ngx_http_ssl_module.html#ssl_verify_client.
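To give a flavor of what the nginx side involves, here is a minimal sketch, assuming the CA certificate that signed your clients' certificates is stored in /etc/nginx/client-ca.pem. It adds two directives to the server block:

ssl_client_certificate /etc/nginx/client-ca.pem;
ssl_verify_client on;

With ssl_verify_client set to on, nginx rejects any connection that does not present a certificate signed by that CA, and the application behind it can inspect variables such as $ssl_client_verify and $ssl_client_s_dn that nginx exposes for proxied requests.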

Let's take a look at authenticating clients that use our services, and how we can set up a microservice dedicated to validating client access.

Token-based authentication

As we said earlier, when one service wants to get access to another without any user intervention, we can use a CCG flow. The idea behind CCG is that a service can connect to an authentication service and ask for a token that it can then use to authenticate against other services.

Authentication services could issue multiple tokens in systems where different sets of permissions are needed, or identities vary.

Tokens can hold any information that is useful for the authentication and authorization process. Some of these are as follows:

  • The username or ID, if it's pertinent to the context
  • The scope, which indicates what the caller can do (read, write, and so on)
  • A timestamp indicating when the token was issued
  • An expiration timestamp, indicating how long the token is valid for

A token is usually built as a complete proof that you have permission to use a service. It is complete because it can be validated without knowing anything else about the caller and, when the verification key is available locally, without having to query an external resource. Depending on the implementation, a token can also be used to access different microservices.

OAuth2 uses the JWT standard for its tokens. There is nothing in OAuth2 that requires the use of JWT—it just happens to be a good fit for what OAuth2 wants to do.

The JWT standard

The JSON Web Token (JWT) described in RFC 7519 is a standard that is commonly used to represent tokens: https://tools.ietf.org/html/rfc7519.

A JWT is a long string composed of three dot-separated parts:

  • A header: This provides information on the token, such as which hashing algorithm is used
  • A payload: This is the actual data
  • A signature: This is a signed hash of the header and payload to verify that it is legitimate

JWTs are encoded using the URL-safe variant of Base64, so they can be safely used in query strings. Here's a JWT in its encoded form:

eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9
.
eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IlNpbW9uIEZyYXNlciIsImlhdCI6MTYxNjQ0NzM1OH0
.
K4ONCpK9XKtc4s56YCC-13L0JgWohZr5J61jrbZnt1M

Each part in the token above is separated by a line break for display purposes—the original token is a single line. You can experiment with JWT encoding and decoding using a utility provided by Auth0 at https://jwt.io/.

If we use Python to decode it, the data is simply in Base64:

>>> import base64
>>> def decode(data):
...     # adding extra = for padding if needed
...     pad = len(data) % 4
...     if pad > 0:
...         data += "=" * (4 - pad)
...     return base64.urlsafe_b64decode(data)
...
>>> decode("eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9")
b'{"alg":"HS256","typ":"JWT"}'
>>> decode("eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IlNpbW9uIEZyYXNlciIsImlhdCI6MTYxNjQ0NzM1OH0")
b'{"sub":"1234567890","name":"Simon Fraser","iat":1616447358}'
>>> decode("K4ONCpK9XKtc4s56YCC-13L0JgWohZr5J61jrbZnt1M")
b"+\x83\x8d\n\x92\xbd\\\xab\\\xe2\xcez` \xbe\xd7r\xf4&\x05\xa8\x85\x9a\xf9'\xadc\xad\xb6g\xb7S"

Every part of the JWT is a JSON mapping except the signature. The header usually contains just the typ and the alg keys: the typ key says that it is a JWT, and the alg key indicates which hashing algorithm is used. In the following header example, we have HS256, which stands for HMAC-SHA256:

{"typ": "JWT",  "alg": "HS256"} 

The payload contains whatever you need, and each field is called a JWT claim in the RFC 7519 jargon. The RFC has a predefined list of claims that a token may contain, called Registered Claim Names. Here's a subset of them:

  • iss: This is the issuer, which is the name of the entity that generated the token. It's typically the fully qualified hostname, so the client can use it to discover its public keys by requesting /.well-known/jwks.json.
  • exp: This is the expiration time, which is a timestamp after which the token is invalid.
  • nbf: This stands for not before time, which is a timestamp before which the token is invalid.
  • aud: This indicates the audience, which is the recipient for whom the token was issued.
  • iat: This stands for issued at, which is a timestamp for when the token was issued.

In the following payload example, we're providing a custom user_id value along with timestamps that make the token valid for the 24 hours after it was issued:

{
  "iss": "https://tokendealer.mydomain.org", 
  "aud": "mydomain.org", 
  "iat": 1616447358, 
  "nbt": 1616447358, 
  "exp": 1616533757, 
  "user_id": 1234
} 

These claims give us a lot of flexibility to control how long our tokens will stay valid. Depending on the nature of the microservice, the token Time-To-Live (TTL) can be anything from very short to infinite. For instance, a microservice that interacts with others within your system should probably rely on tokens that are valid for long enough to avoid regenerating them unnecessarily. On the other hand, if your tokens are distributed in the wild, or if they relate to changing something highly important, it's a good idea to make them short-lived.
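As an illustration, a small helper of our own (make_time_claims is our name, not part of any library) can produce the timestamp claims for a configurable TTL:

import time

def make_time_claims(ttl_seconds=3600 * 24):
    # iat/nbf/exp claims for a token valid from now for ttl_seconds
    now = int(time.time())
    return {"iat": now, "nbf": now, "exp": now + ttl_seconds}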

The last part of a JWT is the signature. It contains a signed hash of the header and the payload. There are several algorithms used to sign the hash; some are based on a secret key, while others are based on a public and private key pair.

PyJWT

In Python, the PyJWT library provides all the tools you need to generate and read back JWTs: https://pyjwt.readthedocs.io/.

Once you've pip-installed pyjwt (and cryptography), you can use the encode() and the decode() functions to create tokens. In the following example, we're creating a JWT using HMAC-SHA256 and reading it back. The signature is verified when the token is read, by providing the secret:

>>> import jwt
>>> def create_token(alg="HS256", secret="secret", data=None):
...     return jwt.encode(data, secret, algorithm=alg)
...
>>> def read_token(token, secret="secret", algs=["HS256"]):
...     return jwt.decode(token, secret, algorithms=algs)
...
>>> token = create_token(data={"some": "data", "inthe": "token"})
>>> print(token)
eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJzb21lIjoiZGF0YSIsImludGhlIjoidG9rZW4ifQ.vMHiSS_vk-Z3gMMxcM22Ssjk3vW3aSmJXQ8YCSCwFu4
>>> print(read_token(token))
{'some': 'data', 'inthe': 'token'}

When executing this code, the token is displayed in both its encoded and decoded forms. If you use one of the registered claims, PyJWT will validate them. For instance, if the exp field is provided and the token has expired, the library will raise an error.
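For example, decoding a token whose exp timestamp is in the past raises jwt.ExpiredSignatureError, which a caller can catch in order to request a fresh token:

>>> import time
>>> old = jwt.encode({"exp": int(time.time()) - 10}, "secret", algorithm="HS256")
>>> jwt.decode(old, "secret", algorithms=["HS256"])
Traceback (most recent call last):
  ...
jwt.exceptions.ExpiredSignatureError: Signature has expired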

Using a secret for signing and verifying the signature is great when you have a few services running, but it can soon become a problem due to it requiring you to share the secret among all services that need to verify the signature. So, when the secret needs to be changed, it can be a challenge to change it across your stack securely. Basing your authentication on a secret that you are sharing around is also a weakness. If a single service is compromised and the secret is stolen, your whole authentication system is compromised.

A better technique is to use an asymmetric key composed of a public key and a private key. The private key is used by the token issuer to sign the tokens, and the public key can be utilized by anyone to verify that the signature was signed by that issuer. Of course, if an attacker has access to the private key, or can convince clients that a forged public key is the legitimate one, you would still be in trouble.

But using a public/private key pair does still reduce the attack surface of your authentication process, often sufficiently to discourage most attackers; and, since the authentication microservice will be the only place that contains the private key, you can focus on adding extra security to it. For instance, such sensitive services are often deployed in a firewalled environment where all access is strictly controlled. Let us now see how we can create asymmetric keys in practice.

Using a certificate with JWT

To simplify matters for this example, we will use the Let's Encrypt certificates we generated for nginx earlier on. If you are developing on a laptop or container that is not reachable from the internet, you may need to generate those certificates using a cloud instance or a certbot DNS plugin, and copy them to the right place.

If certbot generated the certificates directly, they will be available in /etc/letsencrypt/live/your-domain/. To start with, we are interested in these two files:

  • cert.pem, which contains the certificate
  • privkey.pem, which has the RSA private key

In order to use these with PyJWT, we need to extract the public key from the certificate:

openssl x509 -pubkey -noout -in cert.pem  > pubkey.pem

RSA stands for Rivest, Shamir, and Adleman, its three inventors. The RSA algorithm generates cryptographic keys that are commonly 2,048 or 4,096 bits long, and these key sizes are considered secure.

From there, we can use pubkey.pem and privkey.pem in our PyJWT script to sign and verify the signature of the token, using the RSASSA-PKCS1-v1_5 signature algorithm and the SHA-512 hash algorithm:

import jwt

with open("pubkey.pem") as f:
    PUBKEY = f.read()

with open("privkey.pem") as f:
    PRIVKEY = f.read()

def create_token(**data):
    return jwt.encode(data, PRIVKEY, algorithm="RS512")

def read_token(token):
    return jwt.decode(token, PUBKEY, algorithms=["RS512"])

token = create_token(some="data", inthe="token")
print(token)
read = read_token(token)
print(read)

The result is similar to the previous run, except that we get a much bigger token:

eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzUxMiJ9.eyJzb21lIjoiZGF0YSIsImludGhlIjoidG9rZW4ifQ.gi5p3k4PAErw8KKrghRjsi8g1IXnflivXiwwaZdFEh84zvgw9RJRa50uJe778A1CBelnmo2iapSWOQ9Mq5U6gpv4VxoVYv6QR2zFNO13GB_tce6xQOhjpAd-hRxouy3Ozj4oNmvwLpCT5dYPsCvIiuYrLt4ScK5S3q3a0Ny64VXy3CcISNkyjs7fnxyMMkCMZq65Z7jOncf1RXpzNNIt546aJGsCcpCPGHR1cRjuvV_uxPAMd-dfy2d5AfiCXOgvmwQhNdaxYIM0gPgz9_yHPzgaPjtgYoJMc9iKZdOLz2-8pLc1D3r_uP3P-4mfxP7mOhQHYBrY9nv5MTSwFC3JDA
{'some': 'data', 'inthe': 'token'}

Adding that much extra data to each request can have consequences for the amount of network traffic generated, so the secret-based JWT technique is an option to keep in mind if you need to reduce the network overhead.

The TokenDealer microservice

Our first step in building the authentication microservice will be to implement everything needed to perform a CCG flow. For that, the app receives requests from services that want a token and generates them on demand, assuming the request has a known secret in it. The generated tokens will have a lifespan of one day. This approach has the most flexibility, without the complexity of generating our own X.509 certificates, while allowing us to have one service responsible for generating the tokens.

This service will be the only service to possess the private key that is used to sign the tokens, and will expose the public key for other services that want to verify tokens. This service will also be the only place where all the client IDs and secret keys are kept.

We will greatly simplify the implementation by stating that once a service gets a token, it can access any other service in our ecosystem. When a service is accessed with a token, it can verify that token locally or call the TokenDealer to perform the verification. The choice between a network request and some CPU usage in the microservice will depend on what the application does and where its bottlenecks are. When balancing the security and performance requirements it might be necessary to validate the token, at most, once every few minutes, rather than every single time. This will cause a delay if the token needs to be invalidated, though, so we should consult the user stories and, if necessary, discuss the topic with the people who will be using the service to see which is most important.
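A minimal sketch of that kind of time-based caching, assuming a verify_token() callable that asks the TokenDealer to validate the token, might look like this:

import time

_VERIFIED = {}  # token -> (checked_at, payload)

def cached_verify(token, verify_token, max_age=300):
    # reuse a successful verification for up to max_age seconds
    entry = _VERIFIED.get(token)
    if entry is not None and time.time() - entry[0] < max_age:
        return entry[1]
    payload = verify_token(token)
    _VERIFIED[token] = (time.time(), payload)
    return payload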

To implement everything we've described, three endpoints will be created in this microservice:

  • GET /.well-known/jwks.json: This is the public key published in the JSON Web Key (JWK) format, as described in RFC 7517, when other microservices want to verify tokens on their own. For more information, see the following: https://tools.ietf.org/html/rfc7517.
  • POST /oauth/token: This endpoint accepts a request with credentials and returns a token. Adding the /oauth prefix is a widely adopted convention, since it is used in the OAuth RFC.
  • POST /verify_token: This endpoint returns the token payload, given a token. If the token is not valid, it returns an HTTP 400 error code.

Using the microservice skeleton, we can create a very simple Quart application that implements these three views. The skeleton is available at https://github.com/PacktPublishing/Python-Microservices-Development-2nd-Edition/.

Let's look at these three OAuth views.

The OAuth implementation

For the CCG flow, the service that wants a token sends a POST request with a URL-encoded body that contains the following fields:

  • client_id: This is a unique string identifying the requester.
  • client_secret: This is a secret key that authenticates the requester. It should be a random string generated upfront and registered with the auth service.
  • grant_type: This is the grant type, which here must be client_credentials.

We'll make a few assumptions to simplify the implementation. Firstly, we will keep the list of secrets in a Python data structure, for demonstration purposes. In a production service, they should be encrypted at rest and kept in a resilient data store. We will also assume that client_id is the name of the calling microservice, and for now we will generate secrets using binascii.hexlify(os.urandom(16)).
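As a rough sketch, with a hypothetical worker1 client, the registry and the is_authorized_app() helper used by the view below could look like this:

import binascii
import os

def generate_secret():
    # 16 random bytes, hex-encoded as a 32-character string
    return binascii.hexlify(os.urandom(16)).decode()

# for demonstration only; production secrets belong in an
# encrypted, resilient data store
_SECRETS = {"worker1": generate_secret()}

def is_authorized_app(client_id, client_secret):
    return _SECRETS.get(client_id) == client_secret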

The first view will be the one that actually generates the tokens needed by the other services. In our example we are reading in the private key each time we create a token—this may be better stored in the application configuration for a real service, just to reduce the time spent waiting to read a file from the disk. We make sure the client has sent us a reasonable request, and that it wants some client_credentials. The error handling functions and utilities are available in the full source code samples for this chapter.

The token itself is a data structure with several fields: the issuer (iss) of the token, commonly the URL of the service; the intended audience (aud) for the token, that is, who the token is intended for; the time the token was issued (iat); as well as its expiry (exp) time. We then sign this data using the jwt.encode method and return it to the requesting client:

@app.route("/oauth/token", methods=["POST"])
async def create_token():
    with open(current_app.config["PRIVATE_KEY_PATH"]) as f:
        key = f.read().strip()
    try:
        data = await request.form
        if data.get("grant_type") != "client_credentials":
            return bad_request(f"Wrong grant_type {data.get('grant_type')}")
 
        client_id = data.get("client_id")
        client_secret = data.get("client_secret")
        aud = data.get("audience", "")
 
        if not is_authorized_app(client_id, client_secret):
            return abort(401)
 
        now = int(time.time())
 
        token = {
            "iss": current_app.config["TOKENDEALER_URL"],
            "aud": aud,
            "iat": now,
            "exp": now + 3600 * 24,
        }
        token = jwt.encode(token, key, algorithm="RS512")
        return {"access_token": token}
    except HTTPException:
        # let abort(401) bubble up instead of being masked as a 400
        # (HTTPException comes from werkzeug.exceptions)
        raise
    except Exception:
        return bad_request("Unable to create a token")

The next view to add is a function that returns the public keys used by our token generation, so that any client can verify the tokens without making further HTTP requests. This is often located at a well-known URL—the address literally contains the string .well-known/, which is a practice encouraged by the IETF to provide a way for a client to discover metadata about a service. Here we are responding with the JWKS.

In the data returned are the key type (kty), the algorithm (alg), the public key use (use)—here a signature—and two values used by the RSA algorithm that our cryptographic keys are generated with:

@app.route("/.well-known/jwks.json")
async def _jwks():
    """Returns the public key in the Json Web Key Set (JWKS) format"""
    with open(current_app.config["PUBLIC_KEY_PATH"]) as f:
        key = f.read().strip()
    data = {
        "alg": "RS512",
        "e": "AQAB",
        "n": key,
        "kty": "RSA",
        "use": "sig",
    }
 
    return jsonify({"keys": [data]})

The final view lets clients verify a token without doing the work themselves. Much more straightforward than the token generation, we simply extract the right fields from the input data and call the jwt.decode function to provide the values. Note that this function verifies the token is valid, but not that the token allows any particular access — that part is up to the service that has been presented with the token:

@app.route("/verify_token", methods=["POST"])
async def verify_token():
    with open(current_app.config["PUBLIC_KEY_PATH"]) as f:
        key = f.read()
    try:
        input_data = await request.form
        token = input_data["access_token"]
        audience = input_data.get("audience", "")
        return jwt.decode(token, key, algorithms=["RS512"], audience=audience)
    except Exception as e:
        return bad_request("Unable to verify the token") 

The whole source code of the TokenDealer microservice can be found on GitHub: https://github.com/PacktPublishing/Python-Microservices-Development-2nd-Edition.

The microservice could offer more features around token generation. For instance, the ability to manage scopes and make sure microservice A is not allowed to generate a token that can be used in microservice B, or managing a whitelist of services that are authorized to ask for some tokens. A client could also request a token that is intended for read-only use. Despite this, however, the pattern we have implemented is the basis for a simple token-based authentication system in a microservice environment that you can develop on your own, while also being good enough for our Jeeves app.

Looking back at our example microservice, TokenDealer now sits as a separate microservice in the ecosystem, creating and verifying keys that allow access to our data service, and authorizing access to the third-party tokens and API keys we need to query other sites:

Figure 7.2: The microservice ecosystem with the CCG TokenDealer

Those services that require a JWT may validate it by calling the TokenDealer microservice. The Quart app in Figure 7.2 needs to obtain tokens from TokenDealer on behalf of its users.

Now that we have a TokenDealer service that implements CCG, let us see how it can be used by our services in the next section.

Using TokenDealer

In Jeeves, the Data Service is a good example of a place where authentication is required. Adding information via the Data Service needs to be restricted to authorized services:


Figure 7.3: Requesting a CCG workflow

Adding authentication for that link is done in four steps:

  1. TokenDealer manages a client_id and client_secret pair for the Strava worker and shares it with the Strava worker developers
  2. The Strava worker uses client_id and client_secret to retrieve a token from TokenDealer
  3. The worker adds the token to the header for each request to the Data Service
  4. The Data Service verifies the token by calling the verification API of TokenDealer or by performing a local JWT verification

In a full implementation, the first step can be partially automated. Generating a client secret is usually done through a web administration panel in the authentication service. That secret is then provided to the client microservice developers. Each microservice that requires a token can now get one, whether it is the first time connecting, or because the tokens it has already obtained have expired. All they need to do to use it is add that token to the Authorization header when calling the Data Service.

The following is an example of such a call using the requests library—assuming our TokenDealer is already running on localhost:5000:

# fetch_token.py
import requests

TOKENDEALER_SERVER = "http://localhost:5000"
SECRET = "f0fdeb1f1584fd5431c4250b2e859457"

def get_token():
    data = {
        "client_id": "worker1",
        "client_secret": SECRET,
        "audience": "jeeves.domain",
        "grant_type": "client_credentials",
    }
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    url = TOKENDEALER_SERVER + "/oauth/token"
    response = requests.post(url, data=data, headers=headers)
    return response.json()["access_token"]

The get_token() function retrieves a token that can then be used in the Authorization header when the code calls the Data Service, which we assume is listening on port 5001 for this example:

# auth_caller.py
import requests

from fetch_token import get_token

_TOKEN = None
def get_auth_header(new=False):
    global _TOKEN
    if _TOKEN is None or new:
        _TOKEN = get_token()
    return "Bearer " + _TOKEN
_dataservice = "http://localhost:5001"
def _call_service(endpoint, token):
    # not using session and other tools, to simplify the code
    url = _dataservice + "/" + endpoint
    headers = {"Authorization": token}
    return requests.get(url, headers=headers)
def call_data_service(endpoint):
    token = get_auth_header()
    response = _call_service(endpoint, token)
    if response.status_code == 401:
        # the token might be revoked, let's try with a fresh one
        token = get_auth_header(new=True)
        response = _call_service(endpoint, token)
    return response

The call_data_service() function will try to get a new token if the call to the Data Service leads to a 401 response. This refresh-token-on-401 pattern can be used in all your microservices to automate token generation.

This covers service-to-service authentication. You can find the full implementation in the example GitHub repository to play with this JWT-based authentication scheme and use it as a basis for building your authentication process.

The next section looks at another important aspect of securing your web services, and that is securing the code itself.

Securing your code

Whatever we do, an application must receive data and act on it, somehow, or it will not be very useful. If a service receives data, then as soon as you expose your app to the world, it is open to numerous possible types of attack, and your code needs to be designed with this in mind.

Anything that is published to the web can be attacked, although we have the advantage that most microservices are not exposed to the public internet, which reduces the possible ways they could be exploited. The expected inputs and outputs of the system are narrower, and often better defined using specification tools, such as OpenAPI.

Attacks are not always due to hostile intent, either. If the caller has a bug or is just not calling your service correctly, the expected behavior should be to send back a 4xx response and explain to the client why the request was rejected.

The Open Web Application Security Project (OWASP) (https://www.owasp.org) is an excellent resource to learn about ways to protect your web apps from bad behaviors. Let's look at some of the most common forms of attack:

  • Injection: In an application that receives data, an attacker sends SQL statements, shell commands, or some other instructions inside the request. If your application is not careful about how it uses that data, you can end up running code that is meant to damage your application. In Python, SQL injection attacks can be avoided by using SQLAlchemy, which constructs the SQL statements for you in a safe way. If you do use SQL directly, or provide arguments to shell scripts, LDAP servers, or some other structured query, you must make sure that every variable is quoted correctly.
  • Cross-Site Scripting (XSS): This attack happens only on web pages that display some HTML. The attacker uses some of the query attributes to try to inject their piece of HTML on the page to trick the user into performing some set of actions, thinking they are on the legitimate website.
  • Cross-Site Request Forgery (XSRF/CSRF): This attack is based on attacking a service by reusing the user's credentials from another website. The typical CSRF attack happens with POST requests. For instance, a malicious website displays a link to a user to trick that user into performing the POST request on your site using their existing credentials.

Things such as Local File Inclusion (LFI), Remote File Inclusion (RFI), or Remote Code Execution (RCE) are all attacks that trick the server into executing something via client input, or revealing server files. They can happen of course in applications written in most languages and toolkits, but we will examine some of Python's tools to protect against these attacks.

The idea behind secure code is simple, yet hard to do well in practice. The two fundamental principles are:

  • Every request from the outside world should be carefully assessed before it does something in your application and data.
  • Everything your application is doing on a system should have a well-defined and limited scope.

Let's look at how to implement these principles in practice.

Limiting your application scope

Even if you trust the authentication system, you should make sure that whoever connects has the minimum level of access required to perform their work. If there is a client that connects to your microservice and can authenticate themselves, that doesn't mean they should be allowed to perform any action. If they only need read-only access, then that's all they should be granted.

This isn't just protecting against malicious code, but also bugs and accidents. Any time you think, "the client should never call this endpoint," there should be something in place that actively prevents the client from using it.

That scope limitation can be done with JWTs by defining roles (such as read/write) and adding that information in the token under a permissions or scope key, for example. The target microservice will then be able to reject a call on a POST that is made with a token that is supposed to only read data.
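As an illustration, a small decorator along these lines could guard write endpoints. This is a sketch, not a library API: get_token_payload() stands in for whatever verified-claims lookup your authentication middleware provides:

from functools import wraps

from quart import abort

def require_scope(scope):
    def decorator(view):
        @wraps(view)
        async def wrapper(*args, **kwargs):
            # hypothetical helper returning the verified JWT claims
            payload = get_token_payload()
            if scope not in payload.get("scope", "").split():
                abort(403)
            return await view(*args, **kwargs)
        return wrapper
    return decorator

@app.route("/entries", methods=["POST"])
@require_scope("write")
async def add_entry():
    ...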

This is what happens when you grant access to an application on your GitHub account, or on your Android phone. A detailed list of what the app wants to do is displayed, and you can grant or reject access.

This is in addition to network-level controls and firewalls. If you are controlling all parts of your microservices ecosystem, you can also use strict firewall rules at the system level to whitelist the IPs that are allowed to interact with each microservice, but that kind of setup greatly depends on where you are deploying your application. In the Amazon Web Services (AWS) cloud environment, you don't need to configure a Linux firewall; all you have to do is set up the access rules in the AWS Console. Chapter 10, Deploying on AWS, covers the basics of deploying your microservices on the Amazon cloud.

Besides network access, any other resource your application can access should be limited whenever possible. Running the application as a root user on Linux is not a good idea because if your application has full administrative privileges, then so does an attacker who successfully breaks in.

In essence, if a layer of security fails, there should be another behind it. If an application's web server is successfully attacked, any attacker should ideally be as limited as possible in what they can do, as they only have access to the well-defined interfaces between the services in the application—instead of full administrative control over the computer running the code. Root access to a system has become an indirect threat in modern deployments, since most applications are running in containers or a Virtual Machine (VM), but a process can still do a lot of damage even if its abilities are limited by the VM it is running in. If an attacker gains access to one of your VMs, they have achieved the first step in getting control over the whole system. To mitigate the problem, there are two rules you should follow:

  1. All software should run with the smallest set of permissions possible
  2. Be very cautious when executing processes from your web service, and avoid it if you can

For the first rule, the default behavior for web servers such as nginx is to run its processes using the www-data user and group, so that standard user controls prevent the server accessing other files, and the account itself can be set up to not be allowed to run a shell or any other interactive commands. The same rules apply to your Quart processes. We will see in Chapter 9, Packaging and Running Python, the best practices to run a stack in the user space on a Linux system.

For the second rule, any Python call to os.system() should be avoided unless absolutely necessary, as it creates a new user shell on the computer, adding risks associated with badly formed commands being run, and increasing the risk of uncontrolled access to the system. The subprocess module is better, although it, too, must be used carefully to avoid unwanted side effects—avoid using the shell=True argument, which will result in the same trouble as os.system(), and avoid using input data as arguments and commands. This is also true for high-level network modules that send emails or connect to third-party servers via FTP, via the local system.
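For instance, if a service needs to run an external tool on a user-supplied filename, passing the arguments as a list keeps the shell out of the picture entirely. Here is a minimal sketch:

import subprocess

def count_lines(filename):
    # no shell=True: the filename is passed as a single argument and is
    # never interpreted by a shell, even if it contains ; or &&
    result = subprocess.run(
        ["wc", "-l", filename], capture_output=True, text=True, check=True
    )
    return int(result.stdout.split()[0])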

Untrusted incoming data

The majority of applications accept data as input: whose account to look up; which city to fetch a weather report for; which account to transfer money into, and so forth. The trouble is that data that comes from outside our system is not easily trusted.

Earlier, we discussed SQL injection attacks; let us now consider a very naive example, where we use a SQL query to look up a user. We have a function that treats the query as a string to be formatted, and fills it in using standard Python syntax:

import pymysql

connection = pymysql.connect(host='localhost', db='book')

def get_user(user_id):
    query = f"select * from user where id = {user_id}"
    with connection.cursor() as cursor:
        cursor.execute(query)
        result = cursor.fetchone()
        return result

This looks fine when the user_id is always a sensible value. However, what if someone presents a carefully crafted malicious value? If we allow people to enter data for get_user(), above, and instead of entering a number as a user_id, they enter:

'1'; insert into user(id, firstname, lastname, password) values (999, 'pwnd', 'yup', 'somehashedpassword')

Now our SQL statement is really two statements:

select * from user where id = '1'
insert into user(id, firstname, lastname, password) values (999, 'pwnd', 'yup', 'somehashedpassword')

get_user will perform the expected query, plus a second query that adds a new user! It could also delete a table, or perform any other action available to SQL statements. Some damage limitation is there if the authenticated client has limited permissions, but a large amount of data could still be exposed. This scenario can be prevented by quoting any value used to build raw SQL queries. In PyMySQL, you just need to pass the values as parameters to the execute() method to avoid this problem:

def get_user(user_id):
    query = 'select * from user where id = %s'
    with connection.cursor() as cursor:
        cursor.execute(query, (user_id,))
        result = cursor.fetchone()
        return result

Every database library has this feature, so as long as you are correctly using these libraries when building raw SQL, you should be fine. Better still is to avoid using raw SQL completely, and instead use a database model through SQLAlchemy.
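Even short of a full model layer, SQLAlchemy's textual queries use bound parameters, so the quoting is handled for us. A minimal sketch, assuming the same MySQL database as above:

from sqlalchemy import create_engine, text

engine = create_engine("mysql+pymysql://localhost/book")

def get_user(user_id):
    query = text("select * from user where id = :user_id")
    with engine.connect() as connection:
        # :user_id is a bound parameter, quoted by the driver
        return connection.execute(query, {"user_id": user_id}).fetchone()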

If you have a view that grabs JSON data from the incoming request and uses it to push data to a database, you should verify that the incoming request has the data you are expecting, and not blindly pass it over to your database backend. That's why it can be useful to describe your data as schemas using OpenAPI (formerly known as Swagger), and use those schemas to validate incoming data. Microservices usually use JSON, but if you happen to use templates to provide formatted output, that's yet another place where you need to be careful with respect to what the template is doing with variables.

Server-Side Template Injection (SSTI) is a possible attack in which your templates blindly execute Python statements. In 2016, such an injection vulnerability was found on Uber's website on a Jinja2 template, because raw formatting was done before the template was executed. See more at https://hackerone.com/reports/125980.

The code was something similar to this small app:

from quart import Quart, request, render_template_string
app = Quart(__name__)
SECRET = "oh no!"
_TEMPLATE = """
    Hello %s
    Welcome to my API!
    """
class Extra:
    def __init__(self, data):
        self.data = data
@app.route("/")
async def my_microservice():
    user_id = request.args.get("user_id", "Anonymous")
    tmpl = _TEMPLATE % user_id
    return await render_template_string(tmpl, extra=Extra("something"))
app.run()

By doing this preformatting on the template with a raw % formatting syntax, the view creates a huge security hole in the app, since it allows attackers to inject what they want into the Jinja script before it is executed. In the following example, the user_id variable security hole is exploited to read the value of the SECRET global variable from the module:

# Here we URL encode the following:
# http://localhost:5000/?user_id={{extra.__class__.__init__.__globals__["SECRET"]}} 
$ curl http://localhost:5000/?user_id=%7B%7Bextra.__class__.__init__.__globals__%5B%22SECRET%22%5D%7D%7D
Hello oh no!
Welcome to my API!

That's why it is important to avoid string formatting with input data unless there is a template engine or some other layer that provides protection.
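One way to fix the small app above is to let Jinja handle the untrusted value itself, by passing user_id as a template variable instead of preformatting the string:

_TEMPLATE = """
    Hello {{ user_id }}
    Welcome to my API!
    """

@app.route("/")
async def my_microservice():
    user_id = request.args.get("user_id", "Anonymous")
    return await render_template_string(_TEMPLATE, user_id=user_id)

With this version, the value is rendered as literal text, and the {{...}} probe from the previous example no longer reaches the template engine as code.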

If you need to evaluate untrusted code in a template, you can use Jinja's sandbox; refer to http://jinja.pocoo.org/docs/latest/sandbox/. This sandbox will reject any access to methods and attributes from the object being evaluated. For instance, if you're passing a callable in your template, you will be sure that its attributes, such as __class__, cannot be used.

That said, Python sandboxes are tricky to get right due to the nature of the language. It's easy to misconfigure a sandbox, and the sandbox itself can be compromised with a new version of the language. The safest bet is to avoid evaluating untrusted code altogether and make sure you're not directly relying on incoming data for templates.

Redirecting and trusting queries

The same precaution applies when dealing with redirects. One common mistake is to create a login view that makes the assumption that the caller will be redirected to an internal page and use a plain URL for that redirect:

@app.route('/login') 
def login(): 
    from_url = request.args.get('from_url', '/') 
    # do some authentication 
    return redirect(from_url) 

This view can redirect the caller to any website, which is a significant threat—particularly during the login process. Good practice involves avoiding free strings when calling redirect(), and instead using the url_for() function, which will create a link within your app's domain. If you need to redirect to third parties, url_for() cannot help you, and an unchecked redirect() can potentially send your clients to unwanted places.

One solution is to create a restricted list of third-party domains that your application is allowed to redirect to and make sure any redirection done by your application or underlying third-party libraries is checked against that list.

This can be done with the after_request() hook that will be called after our views have generated a response, but before Quart has sent it back to the client. If the application tries to send back a 302, you can check that its location is safe, given a list of domains and ports:

# quart_after_response.py
from quart import Quart, redirect
from quart.helpers import make_response
from urllib.parse import urlparse
app = Quart(__name__)
@app.route("/api")
async def my_microservice():
    return redirect("https://github.com:443/")
# domain:port
SAFE_DOMAINS = ["github.com:443", "google.com:443"]
@app.after_request
async def check_redirect(response):
    if response.status_code != 302:
        return response
    url = urlparse(response.location)
    netloc = url.netloc
    if netloc not in SAFE_DOMAINS:
        # not using abort() here or it'll break the hook
        return await make_response("Forbidden", 403)
    return response
if __name__ == "__main__":
    app.run(debug=True)

Sanitizing input data

In addition to the other practices for handling untrusted data, we can ensure the fields themselves match what we expect. Faced with the examples above, it is tempting to think that we should filter out any semicolons, or perhaps all the curly braces, but this leaves us in the position of having to think of all the ways in which the data could be malformed, and trying to outwit the inventiveness of both malicious programmers and also random bugs.

Instead, we should concentrate on what we know about what our data should look like—instead of what it should not. This is a much narrower question, and the answer can often be much easier to define. As an example, if we know that an endpoint accepts an ISBN to look up a book, then we know that we should only expect a sequence of either 10 or 13 digits (an ISBN-10 may end in an X), perhaps with dashes as separators. When it comes to people, however, data is much harder to clean up.
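Staying with the ISBN case for a moment before we turn to human data, a loose sketch of that kind of shape check might be:

import re

_ISBN10 = re.compile(r"^\d{9}[\dX]$")
_ISBN13 = re.compile(r"^\d{13}$")

def looks_like_isbn(value):
    # strip separator dashes, then check the remaining shape
    digits = value.replace("-", "").upper()
    return bool(_ISBN10.match(digits) or _ISBN13.match(digits))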

There are several fantastic lists of falsehoods that programmers believe about various topics at https://github.com/kdeldycke/awesome-falsehood. These lists are not meant to be exhaustive or authoritative, but they are helpful in reminding us that we may have false notions about how human information works. Human names, postal addresses, phone numbers: we should not make assumptions about what any of this data looks like, how many lines it has, or what order the elements are in. The best we can do is ensure that the human entering the information has the best chance to check that it is all correct, and then use the quoting and sandboxing techniques described earlier to avoid any incidents.

Even an email address is extremely complicated to validate. The permitted format has a lot of different parts to it, and not all of them are supported by every email system. An oft-quoted saying is that the best way to validate an email address is to try sending it an email, and this validation method is used by both legitimate websites—sending an email and informing you that they "have sent an email to confirm your account"—and by spammers who send nonsensical messages to millions of addresses and record which ones don't return an error.

To summarize, you should always treat incoming data as a potential threat, as a source of attacks to be injected into your system. Escape or remove any special characters, avoid using the data directly in database queries or templates without a layer of isolation between them, and ensure your data looks as you would expect it to.

There is also a way to continuously check your code for potential security issues using the Bandit linter, explored in the next section.

Using Bandit linter

Managed by the Python Code Quality Authority, Bandit (https://github.com/PyCQA/bandit) is another tool to scan your source code for potential security risks. It can be run in CI systems for the automatic testing of any changes before they get deployed. The tool uses the ast module to parse the code in the same way that flake8 and pylint do. Bandit will also scan for some known security issues in your code. Once you have installed it with the pip install bandit command, you can run it against your Python module using the bandit command.
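As a hypothetical example, given a module that builds a shell command from its input:

# risky.py
import subprocess

def run(command):
    # Bandit reports this as B602: subprocess call with shell=True
    return subprocess.run(command, shell=True)

running bandit risky.py will flag the shell=True call, along with its severity and the offending line number.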

Adding Bandit to your continuous integration pipeline alongside other checks, as described in Chapter 3, Coding, Testing, and Documentation: the Virtuous Cycle, is a good way to catch potential security issues in your code.

Dependencies

Most projects will use other libraries, as programmers build on the work of others, and oftentimes there is not enough time to keep a close eye on how those other projects are doing. If there's a security vulnerability in one of our dependencies, we want to know about it quickly so that we can update our own software, without manually checking.

Dependabot (https://dependabot.com/) is a tool that will perform security sweeps of your project's dependencies. Dependabot is a built-in component of GitHub, and its reports should be visible in your project's Security tab. Turning on some extra features in the project's Settings page allows Dependabot to automatically create pull requests with any changes that need making to remain secure.

PyUp has a similar set of features but requires manually setting up—as does Dependabot if you're not using GitHub.

Web application firewall

Even with the safest handling of data, our application can still be vulnerable to attack. When you're exposing HTTP endpoints to the world, this is always a risk. You will be hoping for callers to behave as intended, with each HTTP conversation following a scenario that you have programmed in the service.

A client can send legitimate requests and just hammer your service with it, leading to a Denial of Service (DoS) due to all the resources then being used to handle requests from the attacker. When many hundreds or thousands of clients are used to do this, it's known as a Distributed Denial of Service (DDoS) attack. This problem sometimes occurs within distributed systems when clients have replay features that are automatically recalling the same API. If nothing is done on the client side to throttle calls, you might end up with a service overloaded by legitimate clients.

Adding protection on the server side to make such zealous clients back off is usually not hard to do, and goes a long way to protect your microservice stack. Some cloud providers also supply protection against DDoS attacks and a lot of the features mentioned here.

OWASP, mentioned earlier in this chapter, provides a set of rules for the ModSecurity toolkit's WAF that can be used to avoid many types of attacks: https://github.com/coreruleset/coreruleset/.

In this section, we will focus on creating a basic WAF that will explicitly reject a client that's making too many requests on our service. The intention of this section is not to create a full WAF, but rather to give you a good understanding of how WAFs are implemented and used. We could build our WAF in a Python microservice, but it would add a lot of overhead if all the traffic has to go through it. A much better solution is to rely directly on the web server.

OpenResty: Lua and nginx

OpenResty (http://openresty.org/en/) is an nginx distribution that embeds a Lua (http://www.lua.org/) interpreter that can be used to script the web server. We can then use scripts to apply rules and filters to the traffic.

Lua is an excellent, dynamically typed programming language with a lightweight, fast interpreter. It offers a complete feature set and has asynchronous programming built in: coroutines are part of vanilla Lua.

If you install Lua (refer to http://www.lua.org/start.html), you can play with the language using the Lua Read-Eval-Print Loop (REPL), exactly as you would with Python:

$ lua 
Lua 5.4.2  Copyright (C) 1994-2020 Lua.org, PUC-Rio
> io.write("Hello world
")
Hello world
file (0x7f5a66f316a0)
> mytable = {}
> mytable["user"] = "simon"
> = mytable["user"]
simon
> = string.upper(mytable["user"])
SIMON
>

To discover the Lua language, http://www.lua.org/docs.html is a good starting point.

Lua is often the language of choice for embedding in compiled applications. Its memory footprint is ridiculously small, and it allows for fast, dynamic scripting, which is exactly what is happening in OpenResty: instead of building nginx modules, you can extend the web server with Lua scripts and deploy them directly with OpenResty.

When you invoke Lua code from your nginx configuration, the LuaJIT (http://luajit.org/) just-in-time compiler that OpenResty employs runs it at a speed comparable to that of nginx's own code. Some performance benchmarks even find that LuaJIT can beat C or C++ in some cases; refer to http://luajit.org/performance.html.

Lua functions run as coroutines, and so execute asynchronously in nginx. This keeps the overhead low even when the server receives many concurrent requests, which is exactly what a WAF needs.

OpenResty comes as a Docker image and a package for some Linux distributions. It can also be compiled from the source code if needed; refer to http://openresty.org/en/installation.html.

On macOS, you can use Homebrew and the brew install openresty command.

Once OpenResty is installed, you will get an openresty command, which can be used exactly like nginx to serve your applications. In the following example, the nginx configuration will proxy requests to a Quart application running on port 5000:

# resty.conf
daemon off;
worker_processes  1;
error_log /dev/stdout info;
events {
  worker_connections  1024;
}
http {
  access_log /dev/stdout;
  server {
    listen   8888;
    server_name  localhost;
    location / {
      proxy_pass http://localhost:5000;
      proxy_set_header Host $host;
      proxy_set_header X-Real-IP $remote_addr;
      proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
  }
}

This configuration can be used with the openresty command line; it will run in the foreground (daemon off) on port 8888 and proxy all requests to the Quart app running on port 5000:

$ openresty -p $(pwd) -c resty.conf
2021/07/03 16:11:08 [notice] 44691#12779096: using the "kqueue" event method
2021/07/03 16:11:08 [warn] 44691#12779096: 1024 worker_connections exceed open file resource limit: 256
nginx: [warn] 1024 worker_connections exceed open file resource limit: 256
2021/07/03 16:11:08 [notice] 44691#12779096: openresty/1.19.3.2
2021/07/03 16:11:08 [notice] 44691#12779096: built by clang 12.0.0 (clang-1200.0.32.2)
2021/07/03 16:11:08 [notice] 44691#12779096: OS: Darwin 19.6.0
2021/07/03 16:11:08 [notice] 44691#12779096: hw.ncpu: 12
2021/07/03 16:11:08 [notice] 44691#12779096: net.inet.tcp.sendspace: 131072
2021/07/03 16:11:08 [notice] 44691#12779096: kern.ipc.somaxconn: 128
2021/07/03 16:11:08 [notice] 44691#12779096: getrlimit(RLIMIT_NOFILE): 256:9223372036854775807
2021/07/03 16:11:08 [notice] 44691#12779096: start worker processes
2021/07/03 16:11:08 [notice] 44691#12779096: start worker process 44692

Note that this configuration can also be used in a plain nginx server since we are not using any Lua yet. That's one of the great things about OpenResty: it's a drop-in replacement for nginx, and can run your existing configuration files.

The code and configuration demonstrated in this section can be found at https://github.com/PacktPublishing/Python-Microservices-Development-2nd-Edition/tree/main/CodeSamples.

Lua can be invoked at different moments when a request comes in; the two that are most attractive to this chapter are:

  • access_by_lua_block: This is called on every incoming request before a response is built, and is where we can build access rules in our WAF
  • content_by_lua_block: This uses Lua to generate a response, as in the sketch below
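
As a quick illustration of the second hook, here is a minimal sketch of a location block that answers directly from Lua, without contacting the upstream application; the /health endpoint and its JSON body are illustrative choices, not part of OpenResty:

location /health {
    content_by_lua_block {
        -- build the entire response in Lua; the proxied app is never called
        ngx.status = 200
        ngx.header["Content-Type"] = "application/json"
        ngx.say('{"status": "ok"}')
    }
}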

Let us now see how we can rate-limit incoming requests.

Rate and concurrency limiting

Rate limiting consists of counting how many requests a server accepts within a given period of time, and rejecting new ones when a limit is reached.

Concurrency limiting consists of counting how many concurrent requests are being served by the web server to the same remote user, and rejecting new ones when it reaches a defined threshold. Since many requests can reach the server simultaneously, a concurrency limiter needs to have a small allowance in its threshold.

These techniques protect our application when we know there is an upper limit to how many requests it can respond to concurrently, and that limit can be a factor in load balancing across multiple instances of our app. Both limiters are implemented using the same technique. Let's look at how to build a rate limiter first.

OpenResty ships with a rate-limiting library written in Lua called lua-resty-limit-traffic; you can use it in an access_by_lua_block section: https://github.com/openresty/lua-resty-limit-traffic.

The library uses a Lua Shared Dict, a region of memory that is shared by all the nginx worker processes of a single server. Using an in-memory dictionary means that the rate-limiting state is shared across workers, but stays local to each host.

Since we typically deploy one nginx per service node, rate limiting happens per web server. So, if you deploy several nodes for the same microservice, the effective rate limit is the per-node limit multiplied by the number of nodes; this is important to take into account when deciding on the overall rate limit and how many concurrent requests the microservices can process.

In the following example, we're adding a lua_shared_dict definition and a section called access_by_lua_block to activate the rate limiting. Note that this example is a simplified version of the example in the project's documentation:

# resty_limiting.conf
daemon off;
worker_processes  1;
error_log /dev/stdout info;
events {
    worker_connections  1024;
}
http {
    # shared memory zone used by all workers to track request rates
    lua_shared_dict my_limit_req_store 100m;

    server {
        listen   8888;
        server_name  localhost;
        access_log /dev/stdout;
        location / {
            access_by_lua_block {
                local limit_req = require "resty.limit.req"

                -- allow 200 requests/second with a burst of 100 more
                local lim, err = limit_req.new("my_limit_req_store", 200, 100)

                -- rate limit per client address
                local key = ngx.var.binary_remote_addr

                -- true means the new request is committed to the store
                local delay, err = lim:incoming(key, true)
                if not delay then
                    if err == "rejected" then
                        return ngx.exit(503)
                    end
                    ngx.log(ngx.ERR, "failed to limit req: ", err)
                    return ngx.exit(500)
                end

                -- in the burst zone: hold the request briefly so the
                -- backend never sees more than the configured rate
                if delay >= 0.001 then
                    -- on success, err holds the excess requests/second
                    local excess = err
                    ngx.sleep(delay)
                end
            }
            proxy_pass http://localhost:5000;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }
}

The access_by_lua_block section can be considered a Lua function, and it can use the variables and functions that OpenResty exposes. For instance, ngx.var is a table containing all the nginx variables, and ngx.exit() is a function that immediately returns a response to the client: in our case, a 503 when we need to reject a call because of rate limiting.

The library uses the my_limit_req_store dictionary, whose name is passed to the resty.limit.req module's new() function; every time a request reaches the server, the script calls the incoming() function with the binary_remote_addr value, which is the client address.

The incoming() function uses the shared dictionary to track the request rate for each remote address, and returns a rejected value when that rate exceeds the limit plus the allowed burst; with the values above, that happens beyond 300 requests per second.

If the request is accepted, the incoming() function returns a delay value. The script holds the request for that long using the asynchronous ngx.sleep() function. The delay is 0 while the remote client stays under the threshold of 200 requests per second, and a small value between 200 and 300, so the server has a chance to work through the pending requests.

This design is quite efficient at preventing a service from being overwhelmed by too many requests. Setting a ceiling like this is also a good way to avoid reaching the point at which you know your microservice will start to break. For instance, if your benchmarks concluded that your service cannot serve more than 100 simultaneous requests before it starts to fail, you can set the rate limit so that it is nginx that rejects the excess, instead of letting your Quart microservice accept all those incoming connections only to refuse them.

The key used to compute the rate in this example is the remote address of the request. If your nginx server is itself behind a proxy, make sure you use a header that contains the real client address, usually X-Forwarded-For in that case; otherwise, you will end up rate limiting a single remote client: the proxy server itself.
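
The same library also provides resty.limit.conn for the concurrency limiting described at the start of this section. The following is a simplified sketch adapted from the library's documentation; it assumes a lua_shared_dict named my_limit_conn_store has been declared, the 100 and 20 thresholds are illustrative, and a log_by_lua_block is required so the connection count is decremented when a request finishes:

location / {
    access_by_lua_block {
        local limit_conn = require "resty.limit.conn"

        -- allow 100 concurrent requests per client with a burst of 20;
        -- 0.5 is the estimated time a request holds its slot
        local lim, err = limit_conn.new("my_limit_conn_store", 100, 20, 0.5)
        local key = ngx.var.binary_remote_addr
        local delay, err = lim:incoming(key, true)
        if not delay then
            if err == "rejected" then
                return ngx.exit(503)
            end
            ngx.log(ngx.ERR, "failed to limit conn: ", err)
            return ngx.exit(500)
        end

        -- remember the limiter so the log phase can release the slot
        if lim:is_committed() then
            ngx.ctx.limit_conn = lim
            ngx.ctx.limit_conn_key = key
        end

        if delay >= 0.001 then
            ngx.sleep(delay)
        end
    }

    log_by_lua_block {
        local lim = ngx.ctx.limit_conn
        if lim then
            -- decrement the concurrency counter for this client,
            -- passing the measured request time to refine the estimate
            local latency = tonumber(ngx.var.request_time)
            local conn, err = lim:leaving(ngx.ctx.limit_conn_key, latency)
            if not conn then
                ngx.log(ngx.ERR, "failed to record leaving: ", err)
            end
        end
    }

    proxy_pass http://localhost:5000;
}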

If you want a WAF with more features, the lua-resty-waf (https://github.com/p0pr0ck5/lua-resty-waf) project works like lua-resty-limit-traffic, but offers a lot of other protections. It is also able to read ModSecurity rule files, so you can use the rule files from the OWASP project without having to use ModSecurity itself.

Other OpenResty features

OpenResty comes with many Lua scripts that can be useful to enhance nginx. Some developers are even using it to serve their data directly. The following components page contains some useful tools for having nginx interact with databases, cache servers, and so on: http://openresty.org/en/components.html.

There's also a website for the community to publish OpenResty components: https://opm.openresty.org/.

If you are using OpenResty in front of your Quart microservices, there will probably be other use cases where you can transfer some code that is in the Quart app to a few lines of Lua in OpenResty. The goal should not be to move the app's logic to OpenResty, but rather to leverage the web server to do anything that can be done before or after your Quart app is called. Let Python focus on the application logic and OpenResty work on a layer of protection.

For instance, if you are using a Redis or a Memcached server to cache some of your GET resources, you can directly call them from Lua to add or fetch a cached version for a given endpoint. The srcache-nginx-module (https://github.com/openresty/srcache-nginx-module) is an implementation of such behavior, and will reduce the number of GET calls made to your Quart apps if you can cache them.
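
As a sketch of that pattern, trimmed from the typical setup shown in the srcache documentation, GET responses can be stored in Memcached through the memc-nginx-module that is bundled with OpenResty; the /memc location name, the key scheme, and the five-minute expiry are illustrative choices:

location /api {
    set $key $uri$args;
    srcache_fetch GET /memc $key;    # try the cache before hitting Quart
    srcache_store PUT /memc $key;    # store the response on the way out
    proxy_pass http://localhost:5000;
}

location /memc {
    internal;
    set $memc_key $query_string;
    set $memc_exptime 300;           # keep entries for five minutes
    memc_pass 127.0.0.1:11211;       # local Memcached server
}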

To conclude this section about WAFs: OpenResty is a powerful nginx distribution that can be used to create a simple WAF to protect your microservices. It also offers abilities that go beyond firewalling. In fact, if you adopt OpenResty to run your microservices, it opens a whole new world of possibilities, thanks to Lua.

Summary

In this chapter, we have looked at how to centralize authentication and authorization in a microservices-based application environment using OAuth2 and JWTs. Tokens give us the ability to limit what a caller can do with one of the microservices, and for how long they can do it.

When tokens are signed with public and private keys, they also limit the damage an attacker can inflict if one component of the whole application is compromised, and they ensure that each connection is cryptographically validated.

A secure code base is the first step to a secure application. You should follow good coding practices and make sure your code does not do anything bad when interacting with incoming user data and resources. While a tool like Bandit will not guarantee the safety and security of your code, it will catch the most obvious potential security issues, so there should be no hesitation about continuously running it on your code base.

Lastly, a WAF is also a good way to prevent some fraud and abuse of your endpoints, and one is straightforward to set up with a tool such as OpenResty, thanks to the power of the Lua programming language.

OpenResty is also an excellent way to empower and speed up your microservices by doing a few things at the web server level when they do not need to be done within the Quart application.
