
8. Cryptography

Moshe Zadka¹

(1)

Belmont, CA, USA

Cryptography is a necessary component in many parts of a secure architecture. However, just adding cryptography to the code does not make it more secure; care must be given to such topics as secrets generation, secret storage, and plain-text management. Properly designing secure software is complicated, especially when cryptography is involved.

Designing for security is beyond the scope here. This chapter only teaches Python’s basic tools for cryptography and how to use them.

8.1 Fernet

The cryptography module supports the Fernet cryptography standard. It is named after an Italian, not French, wine; the t is pronounced. A good approximation for the pronunciation is fair-net.

Fernet works for symmetric cryptography. It does not support partial or streaming decryption. It expects to read in the whole ciphertext and return the whole plain text. This makes it suitable for names, text documents, or even pictures. However, videos and disk images are a poor fit for Fernet.

The cryptographic parameters for Fernet were chosen by domain experts who researched available encryption methods and the best known attacks against them. One advantage of using Fernet is that it avoids the need to become an expert yourself. However, for completeness, note that the Fernet standard uses AES-128 in CBC mode with PKCS7 padding for encryption, and HMAC with SHA256 for authentication.

The Fernet standard is also supported by Go, Ruby, and Erlang and so is sometimes suitable for data exchange with other languages. It was specifically designed so that using it insecurely is harder than using it correctly.

>>> from cryptography import fernet

>>> k = fernet.Fernet.generate_key()

>>> type(k)

<class 'bytes'>

The key is a short string of bytes. Securely managing the key is important; cryptography is only as good as its keys. If it is kept in a file, for example, the file should have minimal permissions and ideally be hosted on an encrypted file system.

The generate_key class method takes care to generate the key securely, using an operating system–level source of random bytes. However, it is still vulnerable to operating system–level flaws; for example, when cloning virtual machines, care must be taken that when starting the clone, it refreshes the source of randomness. This is admittedly an esoteric case, and whatever virtualization system is being used should have documentation on how to refresh the randomness source in its virtual machines.
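As a rough illustration of the advice about key files, the following sketch writes a key to a file that only its owner can read or write. It uses only the standard library. The key is a stand-in with the same shape as a Fernet key (32 random bytes from the operating system, URL-safe base64 encoded). The helper name write_secret_file and the file location are hypothetical, and the permission check assumes a POSIX system.

```python
import base64
import os
import stat
import tempfile


def write_secret_file(path, secret):
    """Create a new file that only its owner can read or write."""
    # O_EXCL refuses to reuse an existing file, so looser permissions
    # left over from an older file cannot leak through.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as fpout:
        fpout.write(secret)


# Stand-in for a Fernet key: 32 bytes from the operating system's
# random source, URL-safe base64 encoded.
key = base64.urlsafe_b64encode(os.urandom(32))

# Hypothetical location; a real deployment would choose its own path.
key_path = os.path.join(tempfile.mkdtemp(), "fernet.key")
write_secret_file(key_path, key)

# Permission bits of the new file (meaningful on POSIX systems).
mode = stat.S_IMODE(os.stat(key_path).st_mode)
```

Creating the file with the restrictive mode from the start avoids a window in which the key is on disk with default permissions.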

>>> frn = fernet.Fernet(k)

The Fernet class is initialized with a key. It makes sure that the key is valid.

>>> encrypted = frn.encrypt(b"x marks the spot")

>>> encrypted[:10]

b'gAAAAABb1'

Encryption is simple. It takes a string of bytes and returns an encrypted string. Note that the encrypted string is longer than the source string. It is also signed with the secret key, which means that tampering with the encrypted string is detectable, and the Fernet API handles that by refusing to decrypt the string. The value gotten back from decryption is trustworthy. It was indeed encrypted by someone who had access to the secret key.


>>> frn.decrypt(encrypted)

b'x marks the spot'

Decryption is done in the same way as encryption. Fernet tokens contain a version marker, so if vulnerabilities are found in the underlying algorithms, it is possible to move the standard to a different encryption and hashing system.
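The claim that tampering is detected can be checked with a small sketch. It assumes the cryptography package is installed. The helper try_decrypt and the choice of which character to alter are illustrative only.

```python
from cryptography.fernet import Fernet, InvalidToken

frn = Fernet(Fernet.generate_key())
other = Fernet(Fernet.generate_key())
token = frn.encrypt(b"x marks the spot")

# Change a single character in the middle of the token.
replacement = b"A" if token[20:21] != b"A" else b"B"
tampered = token[:20] + replacement + token[21:]


def try_decrypt(f, tok):
    """Return the plain text, or None if the token is rejected."""
    try:
        return f.decrypt(tok)
    except InvalidToken:
        return None


original = try_decrypt(frn, token)       # decrypts normally
modified = try_decrypt(frn, tampered)    # rejected: token was altered
wrong_key = try_decrypt(other, token)    # rejected: different secret key
```

Both failure cases raise the same InvalidToken exception, so calling code does not need to distinguish a modified token from one made with another key.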

Fernet encryption always adds the current time to the signed, encrypted information. Because of this, it is possible to limit the age of a message when decrypting.

>>> frn.decrypt(encrypted, ttl=5)

This fails if the encrypted information (sometimes referred to as the token) is older than five seconds. This is useful for preventing replay attacks, in which a previously captured token is replayed instead of a new valid token. For example, if the encrypted token has a list of usernames that are allowed some access, and is retrieved using a subvertible medium, a user who is no longer allowed in can substitute the older token.

Ensuring token freshness would mean that no such list would be decoded, and everybody would be denied, which is no worse than if the medium was tampered with without having a previously valid token.

This can also be used to ensure good secret rotation hygiene. By refusing to decrypt anything older than, say, a week, you make sure that if the secret rotation infrastructure broke, you would fail loudly instead of succeeding silently and thus fix it.
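To see the age limit in action without waiting, the sketch below uses the encrypt_at_time and decrypt_at_time methods, which take the clock value explicitly. It assumes a release of cryptography recent enough to provide them. The fixed timestamp and the token contents are arbitrary.

```python
from cryptography.fernet import Fernet, InvalidToken

frn = Fernet(Fernet.generate_key())

# An arbitrary fixed timestamp (seconds since the epoch) for the example.
issued_at = 1_700_000_000
token = frn.encrypt_at_time(b"alice,bob", current_time=issued_at)

# Three seconds later the token is still within the five-second limit.
fresh = frn.decrypt_at_time(token, ttl=5, current_time=issued_at + 3)

# Ten seconds later the same token is refused.
try:
    frn.decrypt_at_time(token, ttl=5, current_time=issued_at + 10)
    stale_rejected = False
except InvalidToken:
    stale_rejected = True
```

In production code, decrypt with the ttl argument does the same check against the real clock.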

The Fernet module also has a MultiFernet class to support seamless key rotation. MultiFernet takes a list of Fernet objects, one per secret. It encrypts with the first secret but tries decrypting with each secret in turn.

If you add a new key to the end, it is at first not used for encryption. Once that addition has been synchronized everywhere, you can remove the first key. Now all encryptions are done via the second key, and even instances that have not yet synchronized the removal have the decryption key available.

This two-step process is designed to have zero invalid decryption errors while still allowing key rotation, which is important as a precautionary measure. A well-tested rotation procedure means that if keys are leaked, the rotation procedure can minimize the harm they do.
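The two-step rotation can be sketched as follows, assuming the cryptography package is installed. The variable names stand for the configuration an instance holds at each stage and are illustrative only.

```python
from cryptography.fernet import Fernet, MultiFernet

old = Fernet(Fernet.generate_key())
new = Fernet(Fernet.generate_key())

# Before rotation: every instance knows only the old key.
before = MultiFernet([old])
old_token = before.encrypt(b"x marks the spot")

# Step 1: the new key is added to the end. Encryption still uses the
# old key, but tokens made with either key can be decrypted.
step1 = MultiFernet([old, new])
step1_token = step1.encrypt(b"x marks the spot")

# Step 2: the old key is removed, so encryption moves to the new key.
step2 = MultiFernet([new])
new_token = step2.encrypt(b"x marks the spot")

# An instance that is still on step 1 can read tokens from every stage.
readable = [step1.decrypt(t) for t in (old_token, step1_token, new_token)]
```

Note that after step 2, tokens encrypted with the old key can no longer be decrypted by updated instances, which is the point of rotating.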

8.2 PyNaCl

PyNaCl is a library wrapping the libsodium C library, which is a fork of Daniel J. Bernstein’s libnacl. This is why PyNaCl is named the way it is. (NaCl, or sodium chloride, is the chemical formula for salt. The fork took the name of the first element.)

PyNaCl supports both symmetric and asymmetric encryption. However, since cryptography supports symmetric encryption with Fernet, the main use of PyNaCl is for asymmetric encryption.

The idea of asymmetric encryption is that there is a private and a public key. The public key can easily be calculated from the private key, but not vice versa; that is the asymmetry it refers to. The public key is published, while the private key must remain a secret.

There are, in general, two basic operations supported with public-key cryptography. You can encrypt with the public key in a way that can only be decrypted with the private key. You can also sign with the private key in a way that can be verified with the public key.

As discussed earlier, modern cryptographic practice places as much value on authentication as it does on secrecy. This is because if the medium the secret is transmitted on is vulnerable to eavesdropping, it is often vulnerable to modification. Secret modification attacks have had enough impact on the field that a cryptographic system is not considered complete if it does not guarantee both authenticity and secrecy.

Because of that, libsodium, and by extension PyNaCl, do not support encryption without signing or decryption without signature verification.

To generate a private key, you just use the class method.

>>> from nacl.public import PrivateKey

>>> k = PrivateKey.generate()

The type of k is PrivateKey. However, at some point, you usually want to persist the private key.

>>> type(k.encode())

<class 'bytes'>

The encode method encodes the secret key as a stream of bytes.

>>> kk = PrivateKey(k.encode())

>>> kk == k

True

You can generate a private key from the byte stream, and it is identical. This means you can again keep the private key in a way you decide is secure enough; a secret manager, for example.

In order to encrypt, you need a public key. Public keys can be generated from private keys.

>>> from nacl.public import PublicKey

>>> target = PrivateKey.generate()

>>> public_key = target.public_key

Of course, in a more realistic scenario, public keys need to be stored somewhere—in a file, in a database, or just sent via the network. For that, you need to convert the public key into bytes.

>>> encoded = public_key.encode()

>>> encoded[:4]

b'\xb91>\x95'

When you get the bytes, you can regenerate the public key. It is identical to the original public key.

>>> public_key_2 = PublicKey(encoded)

>>> public_key_2 == public_key

True

These bytes can be written to a file.

>>> with open("target.pubkey", "wb") as fpout:

...     fpout.write(encoded)

The PyNaCl Box class represents a pair of keys: the first private, the second public. Box signs with the private key, then encrypts with the public key. Every message that you encrypt always gets signed.

>>> from nacl.public import PrivateKey, PublicKey, Box

>>> source = PrivateKey.generate()

>>> with open("target.pubkey", "rb") as fpin:

...     target_public_key = PublicKey(fpin.read())

>>> enc_box = Box(source, target_public_key)

>>> result = enc_box.encrypt(b"x marks the spot")

>>> result[:4]

b'\xe2\x1c0\xa4'

This signs with the source private key and encrypts using the target public key.

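When you decrypt, you need to build the inverse box. This happens on a different computer, one that has the target private key but only the source public key. The example assumes both were previously written, as encoded bytes, to the files source.pubkey and target.private_key.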

>>> from nacl.public import PrivateKey, PublicKey, Box

>>> with open("source.pubkey", "rb") as fpin:

...     source_public_key = PublicKey(fpin.read())

>>> with open("target.private_key", "rb") as fpin:

...     target = PrivateKey(fpin.read())

>>> dec_box = Box(target, source_public_key)

>>> dec_box.decrypt(result)

b'x marks the spot'

The decryption box decrypts with the target private key and verifies the signature using the source public key. If the information has been tampered with, the decryption operation automatically fails. This means that it is impossible to access plain-text information that is not correctly signed.

Another piece of functionality that is useful inside of PyNaCl is cryptographic signing. It is sometimes useful to sign without encryption; for example, you can make sure to only use approved binary files by signing them. This allows the permissions for storing the binary file to be loose if you trust that the permissions on keeping the signing key secure are strong enough.

Signing also involves asymmetric cryptography. The private key is used to sign, and the public key is used to verify the signatures. This means that you can, for example, check the public key into source control and avoid needing any further configuration of the verification part.

You first must generate the private signing key. This is similar to generating a key for decryption.

>>> from nacl.signing import SigningKey

>>> key = SigningKey.generate()

You usually need to store this key (securely) somewhere for repeated use; for this, you can encode it as bytes. Again, it is worth remembering that anyone who can access the signing key can sign whatever data they want.

>>> encoded = key.encode()

>>> type(encoded)

<class 'bytes'>

The key can be reconstructed from the encoded version. That produces an identical key.

>>> key_2 = SigningKey(encoded)

>>> key_2 == key

True

For verification, you need to have the verification key. Since this is asymmetric cryptography, the verification key can be calculated from the signing key, but not vice versa.

>>> verify_key = key.verify_key

You usually need to store the verification key somewhere, so you need to be able to encode it as bytes.

>>> verify_encoded = verify_key.encode()

>>> verify_encoded[:4]

b'\x08\xb1\x9e\xf4'

You can reconstruct the verification key. That gives an identical key. Like all ...Key classes, it supports a constructor that accepts an encoded key and returns a key object.

>>> from nacl.signing import VerifyKey

>>> verify_key_2 = VerifyKey(verify_encoded)

>>> verify_key == verify_key_2

True

When you sign a message, you get an interesting object back.

>>> message = b"The number you shall count is three"

>>> result = key.sign(message)

>>> result

b'\x1a\xd38[....'

It displays as bytes, but it is not a plain bytes object.

>>> type(result)

<class 'nacl.signing.SignedMessage'>

You can extract the message and the signature from it separately.

>>> result.message

b'The number you shall count is three'

>>> result.signature

b'\x1a\xd38[...'

This is useful if you want to save the signature in a separate place. For example, if the original is in object storage, mutating it might be undesirable. In those cases, you can keep the signatures on the side. Another reason is to maintain different signatures for different purposes or allow key rotation.

If you want to write the whole signed message, it is best to explicitly convert the result to bytes.

>>> encoded = bytes(result)

The verification returns the verified message, which is the best way to use signatures. This way, it is impossible for the code to handle an unverified message.

>>> verify_key.verify(encoded)

b'The number you shall count is three'

However, if it is necessary to read the signature from somewhere else and then pass it into the verifier, that is also easy.

>>> verify_key.verify(b'The number you shall count is three',

...                   result.signature)

b'The number you shall count is three'

Finally, you can just use the result object as is to verify.

>>> verify_key.verify(result)

b'The number you shall count is three'

8.3 Passlib

Secure storage of passwords is a delicate matter. It is so subtle because it must deal with people who do not follow password best practices. If all passwords were strong and people never reused passwords from site to site, password storage would be straightforward.

However, people usually choose passwords with little entropy (123456 is still unreasonably popular, as is password), and they often have a standard password that they use for all websites. They are also often vulnerable to phishing and social engineering attacks, in which they divulge the password to an unauthorized third party.

Not all threats can be stopped by correctly storing passwords, but many can. At the very least, they can be mitigated.

The Passlib library is written by people who are well versed in software security. It tries to eliminate the most obvious mistakes when saving passwords. Passwords are never saved in plain text; they are always hashed.

Note that hashing algorithms for passwords are optimized for different use cases than hashing algorithms used for other purposes; for example, one of the things they try to deny is brute-force attacks that map a hash back to its source password.

Passlib hashes passwords with the latest vetted algorithms, which are optimized for password storage and intended to resist side-channel attacks. In addition, a salt is always used when hashing the passwords.

Although Passlib can be used without understanding these things, it is worthwhile to understand them to avoid mistakes while using Passlib.

Hashing means taking the users’ passwords and running them through a function that is reasonably easy to compute but hard to invert. This means that even if an attacker gets access to the password database, they cannot recover users’ passwords and pretend to be them.

One way that the attacker can attempt to get the original passwords is to try all combinations of passwords they can come up with, hash them, and see if any result equals a stored hash. To avoid this, special algorithms are used that are computationally hard. This means that an attacker would have to use a lot of resources to try many passwords, so that even if, say, only a few million passwords are tried, it would take a long time to compare.

Finally, attackers can use rainbow tables to pre-compute many hashes of common passwords and compare them all at once against a password database. To avoid that, passwords are salted before they are hashed: a random prefix (the salt) is added, the password is hashed, and the salt is prefixed to the hash value. When the user enters a password, the salt is retrieved from the beginning of the stored hash value and used to hash the entered password for comparison.
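To make the salting scheme concrete, here is a minimal sketch using only the standard library. It swaps in PBKDF2 with SHA256 as the deliberately slow function; this is an illustration of the idea and not what Passlib does internally. The iteration count, salt length, and function names are assumptions chosen for the example, not recommendations.

```python
import hashlib
import hmac
import os

SALT_LENGTH = 16
ITERATIONS = 100_000  # illustrative work factor, not a recommendation


def hash_password(password, salt=None):
    """Return the salt followed by the slow hash of the password."""
    if salt is None:
        salt = os.urandom(SALT_LENGTH)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    # The salt is stored as a prefix so it can be recovered later.
    return salt + digest


def check_password(password, stored):
    """Re-hash the candidate with the stored salt and compare."""
    salt = stored[:SALT_LENGTH]
    candidate = hash_password(password, salt)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, stored)


stored = hash_password("good password")
again = hash_password("good password")
```

Because each call draws a fresh salt, hashing the same password twice gives different stored values, which is what defeats pre-computed tables.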

Doing all of this from scratch is hard, and getting it right is even harder. Getting it right does not just mean having users log in but being resilient to the password database being stolen. Since there is no feedback about that aspect, it is best to use a well-tested library.

The library is storage agnostic. It does not care where the passwords are being stored. However, it does care that it is possible to update the hashed passwords. This way, hashed passwords can get updated to newer hashing schemes as the need arises. While Passlib does support various low-level interfaces, it is best to use the high-level interface of the CryptContext. The name is misleading since it does no encryption. It refers to vaguely similar (and largely deprecated) functionality built into Unix.

The first thing to do is decide on a list of supported hashes. Not all of them have to be good hashes; if you have supported bad hashes in the past, they still have to be on the list. In this example, you choose argon2 as the preferred hash but allow a few more options.

To use the argon2 hash, an extra dependency needs to be installed. Use pip install argon2_cffi to install it.

After installing argon2_cffi, construct a context for hashing passwords based on the guidelines discussed earlier.

>>> hashes = ["argon2", "pbkdf2_sha256", "md5_crypt", "des_crypt"]

Note that md5 and des have serious vulnerabilities and are not suitable for real applications. You added them because there might be old hashes using them. In contrast, even though pbkdf2_sha256 is probably worse than argon2, there is no urgent need to update it. You want to mark md5 and des as deprecated.

>>> deprecated = ["md5_crypt", "des_crypt"]

Finally, after having made the decisions, you build the crypto context.

>>> from passlib.context import CryptContext

>>> ctx = CryptContext(schemes=hashes, deprecated=deprecated)

It is possible to configure other details, such as the number of rounds. This is almost always unnecessary, as the defaults should be good enough.

Sometimes you want to keep this information in some configuration (for example, an environment variable or a file) and load it; this way, you can update the list of hashes without modifying the code.

>>> serialized = ctx.to_string()

>>> new_ctx = CryptContext.from_string(serialized)

When saving the string, note that it does contain newlines; this might impact where it can be saved. If needed, it is always possible to convert it to base64.

When a user creates or changes a password, you need to hash the password before storing it. This is done via the hash method in the context.

>>> res = ctx.hash("good password")

When logging in, the first step is to retrieve the hash from storage. After retrieving the hash and having the users’ passwords from the user interface, you need to check that they match and possibly update the hash if it is using a deprecated protocol.

>>> ctx.verify_and_update("good password", res)

(True, None)

If the second element were not None, you would need to replace the stored hash with it. It is better not to specify a hash algorithm explicitly and instead to trust the context defaults. However, you can force the context to hash with a weak algorithm to showcase the update.

>>> res = ctx.hash("good password", scheme="md5_crypt")

In that case, verify_and_update would let you know you should update the hash.

>>> ctx.verify_and_update("good password", res)

(True, '$argon2...')

In that case, you would need to store the second element in the password hash storage.

8.4 TLS Certificates

Transport Layer Security (TLS) is a cryptographic way to protect data in transit. Since man-in-the-middle attacks are a potential threat, it is important to be able to verify that the endpoints are correct. For this reason, certificate authorities sign the public keys. Sometimes, it is useful to have a local certificate authority.

One case where that can be useful is in micro-service architectures, where verifying each service is the right one allows a more secure installation. Another useful case is for putting together an internal test environment, where using real certificate authorities is sometimes not worth the effort. It is easy enough to install the local certificate authority as locally trusted and sign the relevant certificates with it.

Another place this can be useful is in running tests. You want to set up a realistic integration environment when running integration tests. Ideally, some tests would check that TLS is used rather than plain text. This is impossible to test if you downgrade to plain-text communication for testing purposes. Indeed, the root cause of many production security breaches is that plain-text communication code inserted for testing was accidentally (or maliciously) enabled. Furthermore, it was impossible to test that such bugs did not exist because the testing environment did have plain-text communication.

For the same reason, allowing TLS connections without verification in the testing environment is dangerous. It means that the code has a non-verification flow, which can be turned on in production, accidentally or maliciously, and which testing cannot rule out.

Manually creating a certificate requires access to the hazmat layer in cryptography. This is so named because this is dangerous. You must judiciously choose encryption algorithms and parameters, and the wrong choices can lead to insecure modes.

To perform cryptography, you need a back end. This is because originally, it was intended to support multiple back ends. This design is mostly deprecated, but you still need to create it and pass it around.

>>> from cryptography.hazmat.backends import default_backend

Finally, you are ready to generate a private key. For this example, you use 2048 bits, which is considered reasonably secure as of 2018. A complete discussion of which sizes provide how much security is beyond the scope of this chapter.

>>> from cryptography.hazmat.primitives.asymmetric import rsa

>>> private_key = rsa.generate_private_key(

... public_exponent=65537,

... key_size=2048,

... backend=default_backend()

... )

As always in asymmetric cryptography, it is possible (and fast) to calculate the public key from the private key.

>>> public_key = private_key.public_key()

This is important since the certificate only refers to the public key. Since the private key is never shared, it is not worthwhile and actively dangerous to make any assertions about it.

The next step is to create a certificate builder. The certificate builder adds assertions about the public key. In this case, you finish by self-signing the certificate since CA certificates are self-signed.

>>> from cryptography import x509

>>> builder = x509.CertificateBuilder()

You add names. Some names are required, though it is not important to have specific contents in them.

>>> from cryptography.x509.oid import NameOID

>>> builder = builder.subject_name(x509.Name([

... x509.NameAttribute(NameOID.COMMON_NAME, 'Simple Test CA'),

... ]))

>>> builder = builder.issuer_name(x509.Name([

... x509.NameAttribute(NameOID.COMMON_NAME, 'Simple Test CA'),

... ]))

You need to decide a validity range. For this, it is useful to have a day interval for easy calculation.

>>> import datetime

>>> one_day = datetime.timedelta(days=1)

You want to make the validity range start slightly before now. This way, it is valid for clocks with some amount of skew.

>>> today = datetime.datetime.now()

>>> yesterday = today - one_day

>>> builder = builder.not_valid_before(yesterday)

Since this certificate is for testing, you do not need to have it be valid for a long time. You make it valid for thirty days.

>>> next_month = today + (30 * one_day)

>>> builder = builder.not_valid_after(next_month)

The serial number needs to uniquely identify the certificate. Since remembering serial numbers is difficult, let’s use random serial numbers. The probability of choosing the same serial numbers twice is extremely low.

>>> builder = builder.serial_number(x509.random_serial_number())

You then add the public key that you generated. This certificate is made of assertions about this public key.

>>> builder = builder.public_key(public_key)

Since this is a CA certificate, it needs to be marked as such.

>>> builder = builder.add_extension(

... x509.BasicConstraints(ca=True, path_length=None),

... critical=True)

Finally, after adding all the assertions to the builder, you need to generate the hash and sign it.

>>> from cryptography.hazmat.primitives import hashes

>>> certificate = builder.sign(

... private_key=private_key, algorithm=hashes.SHA256(),

... backend=default_backend()

... )

That’s it! You now have a private key and a self-signed certificate that claims to be a CA. However, you need to store them in files.

The PEM file format is friendly to simple concatenation. Indeed, usually this is how certificates are stored; in the same file with the private key (since they are useless without it).

>>> from cryptography.hazmat.primitives import serialization

>>> private_bytes = private_key.private_bytes(

... encoding=serialization.Encoding.PEM,

... format=serialization.PrivateFormat.TraditionalOpenSSL,

... encryption_algorithm=serialization.NoEncryption())

>>> public_bytes = certificate.public_bytes(

... encoding=serialization.Encoding.PEM)

>>> with open("ca.pem", "wb") as fout:

...     fout.write(private_bytes + public_bytes)

>>> with open("ca.crt", "wb") as fout:

...     fout.write(public_bytes)

This gives you the capability to now be a CA.

For real certificate authorities, you generally need to generate a certificate signing request (CSR) to prove that the owner of the private key wants that certificate. However, since you are the certificate authority, you can just create the certificate directly.

There is no difference between creating a private key for a certificate authority and a private key for a service.

>>> service_private_key = rsa.generate_private_key(

... public_exponent=65537,

... key_size=2048,

... backend=default_backend()

... )

Since you need to sign the public key, you must again calculate it from the private key.

>>> service_public_key = service_private_key.public_key()

You create a new builder for the service certificate.

>>> builder = x509.CertificateBuilder()

For services, the COMMON_NAME is important; this is what the clients verify the domain name against.

>>> builder = builder.subject_name(x509.Name([

... x509.NameAttribute(NameOID.COMMON_NAME, 'service.test.local')

... ]))

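You assume that the service is accessed as service.test.local, using some local test resolution. The issuer name must match the subject name of the CA certificate, and the certificate needs its own serial number; without these, signing the builder fails.

>>> builder = builder.issuer_name(x509.Name([

... x509.NameAttribute(NameOID.COMMON_NAME, 'Simple Test CA'),

... ]))

>>> builder = builder.serial_number(x509.random_serial_number())

Once again, you limit the certificate validity to about a month.

>>> builder = builder.not_valid_before(yesterday)

>>> builder = builder.not_valid_after(next_month)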

This time, you sign the service public key.

>>> builder = builder.public_key(service_public_key)

However, you sign with the private key of the CA; you do not want this certificate to be self-signed.

>>> certificate = builder.sign(

... private_key=private_key, algorithm=hashes.SHA256(),

... backend=default_backend()

... )

Again, you write a PEM file with the key and the certificate.

>>> private_bytes = service_private_key.private_bytes(

... encoding=serialization.Encoding.PEM,

... format=serialization.PrivateFormat.TraditionalOpenSSL,

... encryption_algorithm=serialization.NoEncryption())

>>> public_bytes = certificate.public_bytes(

... encoding=serialization.Encoding.PEM)

>>> with open("service.pem", "wb") as fout:

...     fout.write(private_bytes + public_bytes)

The service.pem file is in a format that the most popular web servers can use: Apache, Nginx, HAProxy, and more. It can also be used directly by the Twisted web server through the txsni extension.

If you add the ca.crt file to the trusted root, and run, say, an Nginx server on an IP that the client would resolve from service.test.local, then when you connect clients to https://service.test.local, they verify that the certificate is valid.
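As a compact recap, the sketch below builds both certificates in memory and checks that the service certificate really was signed by the CA key. It assumes a release of cryptography recent enough that the backend argument can be omitted. Two things here are additions not taken from the steps above: the helper make_name, and a subject alternative name extension, which is included on the assumption that many current clients match the host name against that extension rather than the common name.

```python
import datetime

from cryptography import x509
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa
from cryptography.x509.oid import NameOID


def make_name(common_name):
    """Build an X.509 name holding only a common name."""
    return x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, common_name)])


now = datetime.datetime.now(datetime.timezone.utc)
not_before = now - datetime.timedelta(days=1)
not_after = now + datetime.timedelta(days=30)

# Self-signed CA certificate.
ca_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
ca_cert = (
    x509.CertificateBuilder()
    .subject_name(make_name("Simple Test CA"))
    .issuer_name(make_name("Simple Test CA"))
    .not_valid_before(not_before)
    .not_valid_after(not_after)
    .serial_number(x509.random_serial_number())
    .public_key(ca_key.public_key())
    .add_extension(
        x509.BasicConstraints(ca=True, path_length=None), critical=True
    )
    .sign(private_key=ca_key, algorithm=hashes.SHA256())
)

# Service certificate: its own public key, signed with the CA key.
service_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
service_cert = (
    x509.CertificateBuilder()
    .subject_name(make_name("service.test.local"))
    .issuer_name(ca_cert.subject)
    .not_valid_before(not_before)
    .not_valid_after(not_after)
    .serial_number(x509.random_serial_number())
    .public_key(service_key.public_key())
    .add_extension(
        x509.SubjectAlternativeName([x509.DNSName("service.test.local")]),
        critical=False,
    )
    .sign(private_key=ca_key, algorithm=hashes.SHA256())
)

# Raises InvalidSignature if the CA key did not sign this certificate.
ca_cert.public_key().verify(
    service_cert.signature,
    service_cert.tbs_certificate_bytes,
    padding.PKCS1v15(),
    service_cert.signature_hash_algorithm,
)
```

This check covers only the signature; a real TLS client also validates dates, names, and the chain up to a trusted root.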

8.5 Summary

Cryptography is a powerful tool, but one which is easy to misuse. Using well-understood high-level functions reduces many of the risks in using cryptography. While this is no substitute for proper risk analysis and modeling, it does make that exercise somewhat easier.

Python has several third-party libraries with well-vetted code, and it is a good idea to use them.
