Chapter 3

Compute Services in AWS

This chapter covers the following subjects:

Networking in AWS: Having the ability to create your own networks within AWS allows you to build highly available and highly scalable applications. This section describes the VPC service, which is used to create, manage, and secure networks in AWS.

Computing in AWS: Delivering compute services on demand has always been one of the basic components of AWS. This section looks at the EC2 and ECS services and how you can use them to deliver flexible computing solutions.

Storing Persistent Data: One of the typical requirements of any application is the ability to store data persistently. This section looks at the EBS service, which provides block storage support for your compute instances.

Scalability and High Availability: This section explores how to design scalable and highly available solutions in EC2 by combining the functionality of the AWS Elastic Load Balancer, Auto Scaling, and Amazon Route 53 services.

Orchestration and Automation: The final part of this chapter takes a look at how to automate and orchestrate the creation and management of applications. This section covers the Elastic Beanstalk and AWS CloudFormation services.

This chapter covers content important to the following exam domains:

Domain 1: Deployment

  • 1.1 Deploy written code in AWS using existing CI/CD pipelines, processes, and patterns.

  • 1.2 Deploy applications using Elastic Beanstalk.

  • 1.3 Prepare the application deployment package to be deployed to AWS.

  • 1.4 Deploy serverless applications.

Domain 3: Development with AWS Services

  • 3.1 Write code for serverless applications.

  • 3.2 Translate functional requirements into application design.

  • 3.3 Implement application design into application code.

  • 3.4 Write code that interacts with AWS services by using APIs, SDKs, and AWS CLI.

One of the key aspects of cloud computing is the ability to consume compute services and design applications that are highly scalable, highly available, and highly resilient. This chapter discusses all aspects of EC2 and ECS services in AWS and how these services can be integrated with other AWS components.

“Do I Know This Already?” Quiz

The “Do I Know This Already?” quiz allows you to assess whether you should read the entire chapter. Table 3-1 lists the major headings in this chapter and the “Do I Know This Already?” quiz questions covering the material in those headings so you can assess your knowledge of these specific areas. The answers to the “Do I Know This Already?” quiz appear in Appendix A, “Answers to the ‘Do I Know This Already?’ Quizzes and Q&A Sections.”

Table 3-1 “Do I Know This Already?” Foundation Topics Section-to-Question Mapping

Foundation Topics Section             Questions
Networking in AWS                     1, 6
Computing in AWS                      2, 10
Storing Persistent Data               3, 8
Scalability and High Availability     4, 7
Orchestration and Automation          5, 9

Caution

The goal of self-assessment is to gauge your mastery of the topics in this chapter. If you do not know the answer to a question or are only partially sure of the answer, you should mark that question as wrong for purposes of the self-assessment. Giving yourself credit for an answer you correctly guess skews your self-assessment results and might provide you with a false sense of security.

1. VPC A is peered to VPC B. VPC B is peered to VPC C. You have set up routing in VPC A, which lists the VPC C subnet as a subnet of VPC B. You are trying to ping an instance in VPC C from VPC A, but you are not getting a response. Why?

  1. Transitive connections are not supported by VPC peering.

  2. Your security groups of the VPC C instance do not allow incoming pings.

  3. You need to create an NACL in VPC C that will allow pings from VPC A.

  4. The NAT service in VPC B is not configured correctly.

2. You are tasked with migrating an EC2 instance from one availability zone to another. Which approach would be the best to achieve full data consistency?

  1. Shut down the instance. Then restart the instance and select the new availability zone.

  2. Keep the instance running. Select Migrate to AZ in the instance actions and select the new availability zone.

  3. Shut down the instance, create a snapshot, start a new instance from the snapshot, and select the new availability zone.

  4. Keep the instance running. Create a snapshot with the no-shutdown option. Start a new instance from the snapshot and select the new availability zone.

3. You are required to select a storage location for your MySQL database server on an EC2 instance. What AWS service would be the most appropriate for this purpose?

  1. RDS

  2. EBS

  3. EFS

  4. S3

4. With ECS, what allows you to control high availability of a containerized application?

  1. Placement of ECS tasks across ECS instances

  2. Placement of ECS tasks into an ECS cluster

  3. Placement of ECS instances across regions

  4. Placement of ECS instances across availability zones

5. Which scripting languages are supported in a CloudFormation template? (Choose two.)

  1. YAML

  2. Ruby DSL

  3. JSON

  4. Python

6. To set up a route from an on-premises location to a VPC subnet through Direct Connect, which of the following do you need to use?

  1. RIPv2

  2. RIPv1

  3. Static routing

  4. BGP

7. To change the number of instances in an Auto Scaling group from 1 to 3, which count do you set to 3?

  1. Percentage

  2. Maximum instances

  3. Desired instances

  4. Running instances

8. To maximize IOPS in an EBS volume, which of the following would you need to select?

  1. Provisioned IOPS volume

  2. General purpose volume

  3. Disk-backed volume

  4. Dedicated IOPS volume

9. To automate the infrastructure deployment of a three-tier application, which of the following options could you use? (Choose all that apply.)

  1. CloudFormation

  2. CLI

  3. CloudTrail

  4. OpsWorks Stacks

10. Which of the following compute options would be best suited for a tiny 100 MB microservices platform that needs to run in response to a user action?

  1. Lambda

  2. EC2

  3. ECS

  4. EKS

Foundation Topics


Computing Basics

Before diving into specifics, you need to understand some basics and requirements of modern applications and the infrastructure components that they rely on to deliver functionality. This chapter puts a lot of focus on Infrastructure as a Service (IaaS) as the underlying technology.

Note

Chapter 1, “Overview of AWS,” looks more closely at Infrastructure as a Service (IaaS).

For an application to be able to provide any kind of service, it needs to be able to connect to clients. The ability to communicate in a unified manner on a network is thus a core requirement for any application, regardless of the environment in which it resides.

Modern networking requirements are typically divided into two categories:

  • Local area networks (LANs): These are private networks that allow communication only within a certain limited set of network addresses (usually) within one organization.

  • Wide area networks (WANs): These are either private or public networks that are designed to allow communication at a distance with multiple parties. When these networks are public, the term WAN is usually replaced with the Internet.

The industry standard network protocol used in all of these networks is Internet Protocol (IP). Internet Protocol comes in two versions: version 4 (IPv4) and version 6 (IPv6). The two protocol versions can be used simultaneously to allow for a transition from the older IPv4 to the newer IPv6. The main difference between these protocols is the number of addresses that can be used in each:

  • The IPv4 protocol has a 32-bit addressing field, which means the maximum number of potential IP addresses is limited to about 4.3 billion. The pool of public IPv4 addresses across the Internet has been virtually exhausted.

  • The IPv6 protocol has a 128-bit addressing field, which means the maximum number of potential IP addresses is approximately 340 undecillion, or 340 billion-billion-billion-billion addresses. That should last humanity for a while.

The other major difference between the two protocol versions is that IPv4 has several clearly defined private address ranges that are not routable on the Internet, as defined in IETF RFC 1918. This RFC was published in 1996 to mitigate the growing shortage of public IP addresses and extend the number of devices that can connect to the Internet. The largest RFC 1918 private network range can accommodate up to 16 million network-attached devices; this means that behind each public IPv4 address, you could potentially host one such private network range.

The private IP address ranges are not routable on their own and are required to be connected to the Internet via an intermediary device that performs Network Address Translation (NAT). A NAT device allows for the traffic from the private network to be passed to the Internet and uses the public IP address of the NAT device for all the devices in the private network. Figure 3-1 illustrates the operation of a NAT device that connects all clients in a private network to the public IP space. As demonstrated, the NAT device has one public IP address and allows all private computers to connect to the Internet through that one IP address.


Figure 3-1 IPv4 and NAT

The IPv6 approach is a bit different as IPv6 has no clearly defined private range. Essentially, all the IPv6 addresses are public, and they are all instantly routable when connected to the Internet. You need to understand this fact because implementing IPv6 addressing could pose a completely different challenge to the way you access the Internet and secure your application.

Devices that are connected to the network are divided into clients and servers. A client is any device creating a request on a system, and the system responding is a server since it is serving the content. There are intermediary devices between the clients and servers, including gateways, routers, load balancers, and other network equipment.

A client and the server need to agree on an application protocol. There are many different protocols in use on the Internet; the most common are Hypertext Transfer Protocol (HTTP) and its secure version, HTTPS. These protocols run on top of the Transmission Control Protocol (TCP), which itself runs over Internet Protocol; the combination is denoted TCP/IP. TCP/IP gives the client and server the ability to determine which packets have been delivered over the network and which packets need to be re-sent. This allows the client to make only one request and the server to respond to that request in chunks, which is perfect for what HTTP was designed for: delivering web pages. This functionality means that the application essentially works in an asynchronous manner, taking requests and sending responses in no particular order. Figure 3-2 illustrates the client/server approach in HTTP. The client requests a page, and the server either responds with that page or returns an error code.


Figure 3-2 HTTP Client/Server Architecture

The benefit of asynchronous communication is that the server can perform many tasks at the same time and respond to the clients when the response is ready. This means the server can serve multiple clients at the same time as the next client is not required to wait for the previous client to get a response.

Synchronous communication protocols work differently from asynchronous protocols. For example, some database query languages use synchronous communication and are severely limited in the number of concurrent connections they can process by the number of threads that the central processing units (CPUs) can handle at once.

An application can perform numerous different tasks, but the core functionality of any application depends on the compute capacity and the CPU's ability to process requests. A request for processing is also referred to as a thread. A thread is essentially a set of commands, together with the data being processed, that is sent to the CPU. A modern CPU usually has a buffer that lines up threads for processing and a cache that can remember commonly used instructions and data. The cache can be extended into several layers, each containing more memory but offering slower performance. For example, a modern CPU in 2019 had

  • 1 to 2 MB of Level 1 (or L1) cache that operates at the speed of the CPU and stores the most frequently used instructions and the most frequently used data.

  • 2 to 8 MB of L2 cache that is a bit slower, perhaps working at half the CPU speed, but that can hold many more commonly used instructions and much more data.

  • 8 to 64 MB of L3 cache operating at the slowest speeds that allows the CPU to remember much larger chunks of data and therefore speeds up operations.

When the data or the instruction being requested is found in the cache, the CPU can compute the task much faster. This is called a cache hit. The opposite is true when there is a cache miss: The data needs to be retrieved from the working memory or disk. Every year the random access memory (RAM) modules and disk systems get faster, and the cache on the CPU grows; together these changes add up to faster performance and lower latencies for your applications.

At the end of the caching chain is the disk. The disk is a representation of where you store your data in your compute system. While the name disk comes from old-school magnetic disks, the reality today is that most “disks” are solid-state drives (SSDs). The SSD has revolutionized the performance of the storage subsystems as the ability to access data in a random manner and the throughput of the devices has dramatically increased.

Even though SSDs are impressive, they are not even close to being the pinnacle of computing storage. Currently we are seeing improvements in the way data is stored with non-volatile memory devices being introduced both at the disk level and at the RAM level. The Non-Volatile Memory Express (NVMe) standard has brought us disks that now surpass SSD performance by multiple times. On top of that, Intel has recently introduced its Optane technology, with non-volatile modules that can be used as standard RAM modules. These modules are now matching the performance of memory while providing the storage capacities of NVMe drives. This technology is currently being tested by developers from all over the world so we can better understand how to take advantage of such an environment. An Optane module is a device that you start, that has no start time, that processes every request at the performance of typical memory modules, and that can be turned off at any time without “saving” anything to disk—and every zero and one is retained when the power is unplugged. This technology will revolutionize the way we store and process data and even use devices in years to come. This type of technology will surely enable the rise of the real-time operating system (RTOS). Figure 3-3 illustrates different levels of latencies in computing.


Figure 3-3 Comparison of Latencies in Computing

But for now you still need to consider the current technologies, which are not very well suited to disruptions. With current technologies, all of your data is still being processed in volatile memory, which will be completely lost when the application loses power. In addition, any disruption in network availability and the availability of services your application relies on can degrade the state of an application. You therefore need to build your applications to be highly available, highly resilient to disruption, and highly scalable so that they can take on any load you can give them.

The storage of data is the biggest obstacle to making sure an application can withstand any disruption and ensuring that it can always deliver the freshest data to clients. For example, many of us remember the days when we used to use hardware servers to run applications. We did it because it was the only option around. We had no cloud computing, no IaaS. We used to buy a box of metal that had some shiny bits inside it, install an operating system, connect the device to the network, and then install the application so that it would perform the task we needed it to. Sooner or later, a power outage, a network switch failing, or our colleague Dan from the server room team tripping over the extension cord powering the server caused an outage. Janice from accounting would usually be ranting about the outage, but all we could say is, “Sorry, can’t do anything about it. The server is down.” We knew we could get the server back up and running, we knew the data was still (mostly) present on the disks, and we were (mostly) able to get away with it.

But why did we all get so used to the phrase “The server is down” and put up with it? We always had the option to replicate the data, we always had the option to synchronize the state of the server to another server, and we always had the ability to make applications highly available. What was stopping us? It was either the complexity or the cost or the lack of skills or all of the above.

And then along came the cloud, and everything changed. All of our servers became ephemeral instances of read-only images. All our data being stored within those instances would not survive, we were told. We needed to start thinking about how to get this data off the instances, how to get the logs off, how to get the session state out of the scope of one operating system of one single instance. Basically, we were on a quest to make the instances “stateless.” With stateful servers, a user session will be lost if the server is lost. Figure 3-4 illustrates how stateless systems can share a data store that records the client sessions. In case of a failure, scaling, or other change in the server tier, the client can be reconnected to another server because the stateless systems share the data store.


Figure 3-4 Stateful Versus Stateless Design

As you can see in Figure 3-4, the stateless design has quite a few benefits over the stateful design. When implementing a stateful design, each user is tied to a specific server that contains the state of that user’s session. In the example in Figure 3-4, user A is talking to server X, which means servers Y and Z do not have any information of the state of the session for user A. The opposite is true for users B and C: The state of user B is recorded on server Z, and the state of user C is recorded on server Y.

With a stateless design, users A, B, and C can access any of the servers, and their state and data will be retrievable because all the state information is stored on a shared service outside the servers themselves; in the example in Figure 3-4, this is represented by an RDS database service.

Running in a cloud environment poses unique challenges and benefits, as discussed in Chapter 1. The stateless approach to building processing units has brought the ability to have more than one server serve the content and draw it from a back-end source that can be available to all instances. A “problem” actually turned out to be the solution to our high availability, reliability, and scalability issues. In addition, we gained the ability to automate environments much more easily. Having the components that process the application requests be stateless and having the data stored in a universally reachable back end allowed us to start architecting solutions that would be able to automatically respond to incoming requests by increasing the number of instances responding to those requests. Being able to orchestrate the deployment, scaling, and decommissioning of the application in an automated manner is a crucial benefit of the cloud.

Networking in AWS

One of the ways to deliver applications from the cloud is through Infrastructure as a Service (IaaS). The core requirement for any IaaS environment is the ability to control the network environment and connectivity so that you can expose the application to the Internet and connect to other private networks. However, you also need to be able to control the security aspects of any application running on an IaaS platform.

AWS provides several different networking-related services that allow you to design and deliver secure, highly available, and reliable applications from the cloud. The following are the most important networking tools available in AWS:

  • Amazon Virtual Private Cloud (VPC): A service for creating logically isolated networks in the cloud

  • VPC network ACLs and security groups: Tools for securing network and instance access in VPC

  • AWS Direct Connect and VPN gateways: Tools for connecting your on-premises networks with AWS

  • Amazon Route 53: A next-generation DNS service with an innovative API that allows for programmatic access to the DNS services

  • Amazon CloudFront: A dynamic caching and CDN service in the AWS cloud

  • Amazon Elastic Load Balancing (ELB): Load balancing as a service in the AWS cloud

  • Amazon Web Application Firewall (WAF): A tool that protects web applications from external attacks using exploits and security vulnerabilities

  • AWS Shield: An AWS managed DDoS service

Amazon Virtual Private Cloud (VPC)

The Virtual Private Cloud (VPC) service in AWS gives you the ability to arbitrarily define your private network environment. The VPC service essentially gives you complete control over the network configuration, including the routing, the security, and the access to that network.

With every account, a default VPC and default subnets are created in each region. The default VPC is a prebuilt solution that you can easily deploy without having to manage the VPC; however, I recommend using the default VPC only for learning, testing, and proof-of-concept deployments. In all other cases, it is recommended that you configure your own VPC(s).

When configuring a VPC, you need to define a network address. The network address can then be further segmented into subnets. You define the addresses by using Classless Inter-Domain Routing (CIDR) notation. With CIDR notation, each address is composed of two groups of address bits: the static bits that represent the network and the dynamic bits that represent the host. To define the number of bits used in a network address, use a / (slash) with a number. For example, consider the IP address 192.168.0.0/24. An IPv4 address has 32 bits, and each decimal number between 0 and 255 represents 8 bits of the address; the bits are separated by dots for readability. For an address where the first 24 bits (192.168.0) are fixed and represent the network address, you use the CIDR notation /24. The remaining 8 bits are dynamic and represent the host addresses. Because each bit can be 0 or 1, those 8 bits give you 2^8 = 256 available addresses in the range 0 to 255. The first address (192.168.0.0) is the address of the network, and the last address (192.168.0.255) is the broadcast address. Packets sent to the broadcast address will in turn be sent to all hosts in the network. The number of usable addresses is thus 254. In AWS, three additional addresses are reserved for the AWS services, so the actual number of usable addresses for your hosts is 251.

Table 3-2 provides some examples of network ranges in CIDR format and their characteristics.


Table 3-2 CIDR Range Examples

CIDR                 Host Addresses                Broadcast Address   Number of Hosts             Private or Public?
10.0.0.0/8           10.0.0.1–10.255.255.254       10.255.255.255      16,777,214                  Private (RFC 1918)
172.24.0.0/16        172.24.0.1–172.24.255.254     172.24.255.255      65,534                      Private (RFC 1918)
0.0.0.0/0            0.0.0.1–255.255.255.254       255.255.255.255     Approximately 4.3 billion   Public (all addresses)
54.219.0.0/22        54.219.0.1–54.219.3.254       54.219.3.255        1,022                       Public
18.176.17.16/28      18.176.17.17–18.176.17.30     18.176.17.31        14                          Public
52.219.84.155/32     52.219.84.155                 52.219.84.155       1                           Public

As you can see, CIDR notation allows you to define a network as any range of IP addresses from one single host to the whole Internet. You also use CIDR notation for security rules, so it is crucial that you understand the notation format.


To create a VPC, you can use the aws ec2 create-vpc command. The following example creates a VPC with the CIDR address 192.168.100.0/24:

aws ec2 create-vpc --cidr-block 192.168.100.0/24

The output of this command looks like the output shown in Example 3-1.

Example 3-1 Output from the aws ec2 create-vpc Command

{
     "Vpc": {
          "VpcId": "vpc-abcdef0123456789",                            
          "InstanceTenancy": "default",
          "Tags": [],
          "CidrBlockAssociationSet": [
              {
                  "AssociationId": "vpc-cidr-assoc-123abc456def",
                  "CidrBlock": "192.168.100.0/24",
                  "CidrBlockState": {
                      "State": "associated"
                  }
              }
          ],
          "Ipv6CidrBlockAssociationSet": [],
          "State": "pending",
          "DhcpOptionsId": "dopt-1234abcd",
          "OwnerId": "123123123123",
          "CidrBlock": "192.168.100.0/24",
          "IsDefault": false
     }
}

The important part of the output is the shaded VpcId information, which you need to know to configure other features.

Connecting a VPC to the Internet


When creating VPC networks and subnets, you typically use IPv4 private subnets. These subnets are usually defined as either private or public. It might sound confusing, but both private and public subnets in AWS typically use network ranges from the RFC 1918 private assignments. The only difference between a public subnet and a private subnet is that a public subnet has an Internet gateway (IGW) attached to it. The IGW allows the instances running in the public subnet to access the Internet and gives you the ability to either automatically assign public IP addresses or attach previously allocated Elastic IP addresses to the instances.

To create a public subnet, you need to create a subnet, create an IGW, and attach the IGW to your VPC. Then you create a new routing table, add to the newly created routing table the route to the IGW, and then associate the routing table with the subnet. Let’s look at an example.

To create a subnet in the VPC, you use the aws ec2 create-subnet command and specify the VPC ID and the CIDR address. This example uses the first half of the 192.168.100.0/24 CIDR address for the public subnet, which is represented by the CIDR address 192.168.100.0/25:

aws ec2 create-subnet --vpc-id vpc-abcdef0123456789 --cidr-block
192.168.100.0/25

Remember to record the subnet ID from the output, as you will need it later on. Next, you need to create an Internet gateway:

aws ec2 create-internet-gateway

Use the IGW ID in the output to attach the IGW to the VPC with the aws ec2 attach-internet-gateway command:

aws ec2 attach-internet-gateway --internet-gateway-id
igw-01234ab3456ef0123 --vpc-id vpc-abcdef0123456789

Now you need to take care of the routing. First, you create the routing table in the VPC:

aws ec2 create-route-table --vpc-id vpc-abcdef0123456789

Be sure to record the routing table ID so that you can create a default route to the Internet (0.0.0.0/0) in the new routing table:

aws ec2 create-route --route-table-id rtb-00112233aabbccdd
--destination-cidr-block 0.0.0.0/0 --gateway-id igw-01234ab3456ef0123

This command should return a true response if it completes successfully. The last step is to associate the newly created routing table to the public subnet:

aws ec2 associate-route-table --route-table-id rtb-00112233aabbccdd
--subnet-id subnet-1234abcd0987fe

Note

While you are allowed to define public IPv4 network ranges in a VPC, these ranges will be routable only when advanced hybrid scenarios with BGP routing are in place. You should always define private ranges when you want to route the traffic through AWS routers.

When a VPC subnet is configured to automatically assign public IP addresses, each instance receives an address from an AWS-owned pool of public addresses whenever it is started. Any kind of disruption of the instance other than a reboot will cause the public IP address assignment to be released. This means that even when you keep the state of the instance on persistent storage and shut it down, the public IP address will be disassociated.

If you need to be able to retain the IP address in your application, you can choose to attach an Elastic IP address to your instance. These addresses are assigned to your AWS account and are persistent regardless of the state of instances. This gives you the ability to maintain the same IP address, regardless of the state of the instance. It also allows you to detach and reattach the Elastic IP address to another instance, which is very useful in case of a failure of the old instance, for example.
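
As a minimal sketch of this workflow (the instance ID and allocation ID shown are placeholders), you can allocate an Elastic IP address and associate it with a running instance from the CLI:

aws ec2 allocate-address --domain vpc

aws ec2 associate-address --instance-id i-0123456789abcdef0 --allocation-id eipalloc-11112222aaaa

If that instance later fails, running aws ec2 associate-address again with the ID of the replacement instance moves the address over to the new instance.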

An attachment of an address is in reality just a logical mapping between the public or Elastic IP address and the instance to which it is attached. The address is not visibly assigned to any network adapter, as all AWS-owned public addresses are attached only to the IGW. A DNAT or 1:1 NAT rule then maps the public IP address to the private IP address of the instance. Because the public address is not visible from within the instance, you need to examine the instance metadata to retrieve it. You can do this by simply browsing or sending a curl command to the following address:

http://169.254.169.254/latest/meta-data/public-ipv4

On the other hand, a private subnet is defined by having no access from the Internet. The private subnet can be connected to an on-premises environment through a Direct Connect connection or a virtual private gateway (VGW) via an IPsec VPN. You can also allow access from the private network to the Internet by implementing a NAT gateway or NAT instance to perform NAT for outgoing traffic only.

To create a private subnet in the second half of the VPC CIDR address, represented by the CIDR address 192.168.100.128/25, you can just create a second subnet in the VPC by using the following command:

aws ec2 create-subnet --vpc-id vpc-abcdef0123456789 --cidr-block
192.168.100.128/25

While you are always able to create your own NAT instance and run any kind of NAT software on it, AWS offers you the ability to use a NAT gateway, which is a very effective solution and has the following characteristics:

  • Scales in performance from 5 Gbps up to 45 Gbps

  • Supports up to 55,000 simultaneous connections

  • Consumes one of your Elastic IP addresses and an automatically assigned private IP address in the public subnet

  • Does not support IPv6 traffic

Note

AWS strongly recommends not using NAT instances unless absolutely necessary (for example, when a custom or vendor device is specifically required in this role).

To create a NAT gateway, you need to first allocate a new Elastic IP address by running the following command:

aws ec2 allocate-address

Record the allocation ID for use in the NAT gateway creation command:

aws ec2 create-nat-gateway --subnet-id subnet-9876abcd0123
--allocation-id eipalloc-11112222aaaa
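
For the private subnet to actually send its outgoing traffic through the NAT gateway, the routing table associated with the private subnet also needs a default route that points at the gateway. The following sketch assumes the NAT gateway ID returned by the previous command and a routing table dedicated to the private subnet (both IDs are placeholders):

aws ec2 create-route --route-table-id rtb-4455667700aabbcc --destination-cidr-block 0.0.0.0/0 --nat-gateway-id nat-0123456789abcdef0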

In the case of IPv6, the definition of private and public subnets is technically the same, but because the IPv6 protocol is designed as a public address-oriented protocol, all your instances are immediately reachable from the Internet as soon as an Internet gateway is attached. If you would like to keep the IPv6 instances private but still allow them to access the Internet (for updates, patches, other services, and so on), you need to use an egress-only Internet gateway. The egress-only Internet gateway automatically blocks any incoming requests to IPv6 addresses in the address range behind the gateway while still allowing outbound connections.
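
A brief sketch of this setup follows; the VPC and routing table IDs are placeholders, and the commands assume that an IPv6 CIDR block is already associated with the VPC:

aws ec2 create-egress-only-internet-gateway --vpc-id vpc-abcdef0123456789

aws ec2 create-route --route-table-id rtb-4455667700aabbcc --destination-ipv6-cidr-block ::/0 --egress-only-internet-gateway-id eigw-0123456789abcdef0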

Connecting the VPC to Other Private Networks

Not all applications are connected to the Internet. Your application might require a connection to private resources in your on-premises site. Or you might need to connect multiple private networks together and have them communicate to each other to make the application work. AWS provides several different services to achieve the right type of connection and cover all your application needs.

As previously mentioned, you can use Direct Connect and a virtual private gateway (VGW) to connect your VPC to your on-premises network. AWS Direct Connect provides the ability to establish a dedicated low-latency private network connection between your on-premises environment and AWS. Direct Connect uses dedicated 1 Gbps or 10 Gbps optical fibers that can be partitioned into multiple virtual interfaces. The virtual interfaces provide the connectivity to both public and private AWS resources and enable you to encrypt the link with a VPN connection on top of Direct Connect.

There are also several VPN services available from AWS. For example, you can create a virtual private gateway (VGW) to establish a site-to-site IPsec VPN across the public Internet between your on-premises environment and AWS VPC resources. The VPN is a more economical solution, but it has higher latency and lower throughput (up to 1.25 Gbps) than a Direct Connect solution.

Other VPN services from AWS include the AWS Client VPN service, which allows you to connect your client devices to the cloud, and the AWS VPN CloudHub feature, which allows you to interconnect multiple sites with a VPN and manage the connectivity centrally.

The VPN can also be used as a backup connection to the Direct Connect link to provide services in case the primary link suffers a failure. Traffic is automatically prioritized over the Direct Connect link whenever that connection is available and fails over to the VPN when the Direct Connect link is down. Figure 3-5 illustrates a Direct Connect link with a backup VPN connection. This solution provides performance when the Direct Connect link is available, and it also provides cost-effective redundancy.


Figure 3-5 A VPC Connected via a Direct Connect Connection with a Backup VPN

When private connections between VPCs are required, you can use VPC peering, which allows you to connect completely private VPC subnets to other VPC subnets, regardless of their location. This means you can arbitrarily connect resources in multiple regions via private addresses. The only real limitations of VPC peering are the requirement of non-overlapping IP address ranges for each peered VPC and the inability of VPC peering connections to carry transitive traffic. For example, when VPC A is connected to VPC B, which in turn is connected to VPC C, only communication with the direct peer is allowed. In the example in Figure 3-6, VPC A cannot communicate with VPC C because transitive peering is not supported.


Figure 3-6 Transitive Traffic in VPC Peering Is Not Supported
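
As a sketch of how a single peering connection is established (all of the IDs and the destination CIDR block are placeholders), you request the peering from one VPC, accept it in the other, and then add a route on each side pointing at the peering connection:

aws ec2 create-vpc-peering-connection --vpc-id vpc-aaaa1111 --peer-vpc-id vpc-bbbb2222

aws ec2 accept-vpc-peering-connection --vpc-peering-connection-id pcx-0123456789abcdef0

aws ec2 create-route --route-table-id rtb-4455667700aabbcc --destination-cidr-block 10.20.0.0/16 --vpc-peering-connection-id pcx-0123456789abcdef0

For a cross-region peering, you would add the --peer-region option to the first command and accept the connection in the peer region.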

You also have the ability to connect private networks to other AWS services that normally respond on public addresses. For example, the S3 service always responds on a public IP address, regardless of whether the request is coming from within AWS or from the Internet. If your instances do not have access to the Internet, they aren't able to use S3. This can be mitigated by using VPC endpoints. VPC endpoints allow you to connect other AWS services directly to your subnets so that you can pass traffic to the AWS service via the private network. The S3 and DynamoDB services are connected via gateway endpoints, which add a route for the service to the routing tables of your subnets, whereas other AWS services and supported AWS Marketplace solutions are connected via an interface endpoint. An interface endpoint is supported by the AWS PrivateLink service and allows you to connect to the service through an elastic network interface that is attached straight into one of your VPC subnets and has an IP address from that subnet range.
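
For example, a gateway endpoint for S3 can be created and associated with a subnet routing table as follows; the VPC and routing table IDs are placeholders, and the service name assumes the us-east-2 region:

aws ec2 create-vpc-endpoint --vpc-id vpc-abcdef0123456789 --service-name com.amazonaws.us-east-2.s3 --route-table-ids rtb-4455667700aabbcc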

Computing in AWS

Once you have configured your network, you can start your compute instances and connect them to each other, the Internet, other private networks, and, of course, AWS services. AWS essentially has three services that are designed for general computing needs:

  • Amazon Elastic Compute Cloud (EC2): An AWS curated service that gives you the ability to run virtual machines (called instances) with the Linux and Windows operating systems

  • Amazon Elastic Container Service (ECS): An orchestration service that gives you the ability to run containerized applications (called tasks) and manage the resources that support the running of containers

  • AWS Lambda: An AWS managed service that gives you the ability to run single snippets of code (called functions) to perform serverless processing for event-based applications

AWS provides a lot of flexibility when it comes to compute options. The services provided cover basically every need for every application out there and allow you to choose the right service for the right application or solution.

The EC2 service is a solution geared toward the typical consumer of IaaS and is designed to allow for complete control of the operating system and its configuration. In the shared security model, EC2 lays the most responsibility on the consumer.

With ECS, you can run tasks several ways:

  • Build your own: You can build your own container engine EC2 instances and register them to the ECS cluster. You can run tasks on top of the container engine. This approach is the most flexible but requires the most management.

  • Use ECS as the orchestrator: ECS can dynamically create and manage the EC2 instances on which to run tasks. This still gives you a lot of control but lets the ECS service take over EC2 instance orchestration.

  • Use the Fargate service: When you run tasks on the Fargate service, you are not required to run any container engines on EC2 as the Fargate service automatically provisions the infrastructure on which to run tasks. This service is fully managed and requires the least effort.

With AWS Lambda, you just need to bring your code. You never need to manage any infrastructure, and there will never be any idle compute power that you need to pay for. You simply pay per request and for the time and memory required to process the request. Lambda uses a simple approach that can be very cost-effective.

Note

Chapter 5, “Going Serverless in AWS,” discusses Lambda in more detail.

Having the flexibility to consume all these different types of compute resources in an untethered on-demand fashion is one of the biggest benefits of the AWS cloud.

Amazon EC2

As mentioned earlier, the EC2 service gives you the most flexibility. This is the crucial benefit of the EC2 service compared to other AWS solutions. You have the ability to choose:

  • The instance type: The instance type determines the amount of CPU and memory (and in some cases the size and number of disks) that will be attached to the instance.

  • The price: You can choose from several pricing models to get the most bang for your buck.

  • The operating system: You can use Linux or Windows as the operating system, and a massive number of EC2 appliances are available in the AWS Marketplace.

  • Network connectivity: You can connect the instance to one or many VPC subnets by using an elastic network interface.

  • The way your instances are started: With user data scripts, you can customize an instance every time it is started.

The instance type determines the configuration of the instance. Each instance type has a certain name that defines the instance family, the generation, and the size of the instance. For example, the m5.large instance belongs to the M type family, is the fifth generation, and is a large size (2 CPUs and 8 GB RAM).
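
If you want to check these characteristics yourself, the CLI can report the CPU and memory configuration of any instance type; the query expression shown here is just one way of trimming the output:

aws ec2 describe-instance-types --instance-types m5.large --query "InstanceTypes[0].{vCPUs:VCpuInfo.DefaultVCpus,MemoryMiB:MemoryInfo.SizeInMiB}"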

Several families of instances in AWS have characteristics that make them suitable for special purposes:

  • General purpose: M type instances are designed for stable day-to-day tasks, and T type, burstable instances are designed for spiky and unstable tasks. T types have a certain baseline of operation but are allowed to burst above the baseline for short periods of time. Whenever bursting, T types consume burst credits. Once the credits are consumed, the instances are throttled. The newest generation (t3) introduced unlimited burstable instances, which can continue bursting above the baseline after their credits are consumed, with the additional usage billed instead of throttled. The third type is A type instances, which are low-cost ARM-based instances.

  • Compute oriented: C type instances are designed for compute-intensive tasks. They have double the CPU density per amount of RAM compared to M type instances (for example, c5.large has 2 CPUs and 4 GB RAM).

  • Memory oriented: R type instances have double the memory compared to M type instances (for example, r5.large has 2 CPUs and 16 GB RAM). R type instances are intended for in-memory databases and similar applications. For extreme requirements, X type instances provide capacity of up to 128 CPUs and 2 TB of RAM.

  • Accelerated compute instances: G type and P type instances have GPU acceleration enabled, and F type instances have programmable FPGAs.

  • Disk oriented: H type (high throughput), I type (high IOPS), and D type (disk space) instances are designed for disk-intensive operations.

For each type, you can select the operating system by choosing an Amazon Machine Image (AMI), which contains all the components needed to run the instance:

  • Default block device mappings and boot sectors

  • The operating system and any additional applications

  • Cloud-init components to configure your instance at launch

  • Launch permissions that control who can run the instance

For each instance type you run, you can choose from several different pricing models to use the resources in the most economical way possible:

  • On-demand instances

  • Reserved instances

  • Spot instances

  • Dedicated instances

  • Dedicated hosts

On-demand instances are very well suited for any kind of random task for which you have no timeline. You can consume them when required and as long as required. There is no commitment, and no reservation is required. You simply log in to AWS and use them. On-demand instances excel at

  • Processing event-based tasks and message queue content

  • Running development and proofs of concept

  • Covering sudden and temporary capacity requirements in EC2 clusters

  • Longer-running event-based tasks

On-demand instances are flexible, but that flexibility comes at a cost disadvantage. For tasks that run every day or on a scheduled basis, you should use reserved instances (RIs), which are reserved for a one- or three-year term. There are several reserved instance types available in AWS:

  • Standard RI: Any kind of day-to-day or long-running tasks on one exact type of instance. Delivers up to 75% savings compared to an on-demand instance.

  • Convertible RI: Any kind of day-to-day or long-running tasks where the instance type is projected to change or grow (only the same size and bigger are supported). Delivers up to 54% savings compared to an on-demand instance.

  • Scheduled RI: For scheduled tasks such as monthly billing.

When deploying reserved instances, you need to match the region or availability zone of your instances to the reservation to be able to take advantage of RI pricing. A typical error clients make is buying an RI in one region and then migrating the application to another region. This means that the RI pricing no longer applies, and the AWS monthly bill is usually where the client finds this out. In such a case, the client is able to use the Reserved Instance Marketplace, where reservations already purchased by AWS consumers can be traded when no longer needed. This gives AWS customers the option to buy even more flexible reservations.
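
Before committing to a reservation, you can browse the available offerings, including those listed on the Reserved Instance Marketplace, directly from the CLI; the filters used here are only an example:

aws ec2 describe-reserved-instances-offerings --instance-type m5.large --offering-class standard --product-description "Linux/UNIX" --include-marketplace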

One of the biggest benefits of AWS is that you can bid on the spare capacity in AWS that is intended for new clients or existing clients to scale out into. Instances launched on this capacity are called spot instances, and they come at a discount of up to 90%. The only caveat is that when this capacity is requested by any on-demand client, spot instances need to give up their resources. AWS gives you a 2-minute warning before interrupting your instance, so you have time to commit any information to stateful storage. AWS also lets you control the interruption behavior, giving you the option to terminate, stop, or hibernate the instance. If you choose to hibernate, the instance retains its memory content and continues processing when the spot price falls below your bid again. The spot instance interruption message is available from the instance metadata at http://169.254.169.254/latest/meta-data/spot/instance-action.
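
A minimal sketch of requesting a spot instance through the CLI follows; the bid price is a placeholder, and specification.json is a hypothetical file containing the same kind of parameters (AMI ID, instance type, key pair, subnet) used in the earlier launch examples:

aws ec2 request-spot-instances --spot-price "0.05" --instance-count 1 --type "one-time" --launch-specification file://specification.json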

In some cases you might need a completely secure and isolated environment in the cloud. In such a case, you can choose to use dedicated instances and dedicated hosts. Both of these will run on isolated and dedicated EC2 hardware. You would use these two types for compliance and governance reasons or to comply with any laws governing your computing.

To control the operating system configuration, you can choose from quite a selection of AMIs provided by Amazon. You can also find AMIs on the AWS Marketplace and launch them directly into your VPC. You can also create your own AMIs by launching a new instance, customizing it, and then saving it as a custom AMI.

An AMI is regionally bound and gets a region-specific image ID. You can launch an AMI in any availability zone within its region. You can also copy any AMI to another region, if required; the copy in the new region receives a new AMI ID. You can also deregister an AMI to essentially delete the image from AWS. It is a good practice to deregister any old custom AMIs, as AWS charges you by the gigabyte of storage consumed by the custom AMIs on S3.
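
The AMI lifecycle described here maps to a handful of CLI calls. In the following sketch the instance ID and image IDs are placeholders, and the copy is made from us-east-2 to us-west-2 purely as an example:

aws ec2 create-image --instance-id i-0123456789abcdef0 --name "my-custom-ami"

aws ec2 copy-image --source-region us-east-2 --source-image-id ami-0aaaabbbbcccc1111 --region us-west-2 --name "my-custom-ami-copy"

aws ec2 deregister-image --image-id ami-0aaaabbbbcccc1111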

Because EC2 is addressable via the API, you can integrate any kind of application to run an EC2 instance. To show how this works, Example 3-2 uses a node.js script and runs it via the AWS SDK.


Example 3-2 node.js Script That Creates an EC2 Instance

// set the AWS variable for the SDK
var AWS = require('aws-sdk');
// define the AWS region as us-east-2
AWS.config.update({region: 'us-east-2'});
// source the credentials from the .aws/credentials file in your home directory
var credentials = new AWS.SharedIniFileCredentials({profile: 'default'});
AWS.config.credentials = credentials;
// Create the EC2 service object
var ec2 = new AWS.EC2({apiVersion: '2016-11-15'});
// define the parameters of the instance to be launched
var instanceParams = {
   ImageId: 'ami-0d8f6eb4f641ef691',
   InstanceType: 't3.small',
   KeyName: 'mykeypair',
   MinCount: 1,
   MaxCount: 1
};
// create a promise for the EC2 runInstances call
var instancePromise = ec2.runInstances(instanceParams).promise();
// handle the response and any errors returned by the call
instancePromise.then(
  function(data) {
    console.log(data);
    var instanceId = data.Instances[0].InstanceId;
    console.log("Created instance", instanceId);
  }).catch(
  function(err) {
    console.error(err, err.stack);
  });

Save this file as ec2.js. To test whether the script will run, you can run the node ec2.js command, and an EC2 instance should start in your EC2 environment.

You can save the state of an instance by creating a snapshot. A snapshot of an instance is simply a point-in-time copy of the instance volume. Snapshots are incremental in nature, and each one stores only the blocks that have changed from the previous snapshot. Figure 3-7 illustrates the incremental nature of snapshots.


Figure 3-7 The Incremental Nature of Elastic Block Storage (EBS) Snapshots
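
Creating a snapshot is a single CLI call against the EBS volume that backs the instance; the volume ID here is a placeholder:

aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 --description "web server root volume snapshot"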

You can also create an AMI straight from the snapshot for supported operating systems. When you deploy an instance from a snapshot, you need to be aware that the volume is loaded lazily from the snapshot, so the application might experience slow performance on the first read. You can mitigate this by reading all the disk sectors and “warming up” the volume in the process.

Each instance is started with a primary network interface. This interface is designed to be attached only to this exact instance; otherwise, it has all the characteristics of an elastic network interface (ENI). You can create an ENI separately from an instance and then attach it to the instance as a secondary interface. This is useful when you need to connect an instance to several subnets (for example, for security devices, firewalls, or NAT instances). The secondary ENI gives the instance a direct Layer 2 connection to the subnet instead of routing through the default VPC router.

Because an ENI is created independently of an instance, you are also able to maintain the state of the network connection on the ENI independently of an instance’s state. This comes in handy when you need to use a fixed IP address or an unchangeable MAC address for licensing purposes. You can assign a static private IP address or an Elastic IP address to the ENI and then attach the device to the instance. The instance automatically inherits all these characteristics. When a MAC-dependent license is present in an instance, you should tie it to the ENI MAC. This way, you can recover the license even if this instance fails by simply reattaching the ENI to a new instance created from a snapshot or an AMI instance of the previous instance.
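
A short sketch of creating a secondary ENI with a fixed private address and attaching it to an existing instance follows (the subnet, ENI, and instance IDs are placeholders; the private IP address falls within the private subnet created earlier):

aws ec2 create-network-interface --subnet-id subnet-9876abcd0123 --description "secondary interface" --private-ip-address 192.168.100.140

aws ec2 attach-network-interface --network-interface-id eni-0123456789abcdef0 --instance-id i-0123456789abcdef0 --device-index 1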

You also have the ability to completely customize the startup process of your instance by modifying the startup commands in the operating system when the instance starts. You can add a user data script that can execute Linux bash or Windows command-line or PowerShell scripts upon start. The script can include any kind of common task associated with starting an operating system, such as updating the system upon start or creating users. The script is executed with the root or administrator account, so this gives you complete control when you launch instances, even if your account does not have the same level of permissions after the instance is created. The instance will become available only when the user data script completes.
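
As an illustration, assuming an Amazon Linux 2 AMI (the AMI ID, key pair, and subnet ID below are placeholders), a simple user data file could look like this:

#!/bin/bash
# contents of bootstrap.sh: update the system and install a web server at launch
yum update -y
yum install -y httpd
systemctl enable --now httpd

The instance can then be launched with the script attached as user data; the AWS CLI takes care of base64-encoding the file contents before passing them to the API:

aws ec2 run-instances --image-id ami-0d8f6eb4f641ef691 --instance-type t3.small --key-name mykeypair --subnet-id subnet-1234abcd0987fe --user-data file://bootstrap.sh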


Example 3-3 uses a Python 2.7 script to run an instance from a Lambda function. A script like this can be used to automate recovery from the loss of an instance or to add an instance to an environment that you would like to scale intelligently.

Example 3-3 Python Script That Creates an EC2 Instance

import boto3  # boto3 is the AWS SDK for Python
# define the EC2 client that we will use in the Lambda handler, with the region set to us-east-2
EC2 = boto3.client('ec2', region_name='us-east-2')

def lambda_handler(event, context):  # this is the Lambda handler
    instance = EC2.run_instances(  # the instance definition
        ImageId='ami-0d8f6eb4f641ef691',  # the image ID
        InstanceType='t2.micro',  # instance type
        MinCount=1,  # a minimum is always required by boto3
        MaxCount=1,  # a maximum would be useful for an Auto Scaling group
        InstanceInitiatedShutdownBehavior='terminate'  # the default shutdown behavior
    )
    # the instance ID is printed and returned in the Lambda output
    instance_id = instance['Instances'][0]['InstanceId']
    print instance_id
    return instance_id

To be able to run this Lambda function, you need to give it the appropriate permissions with an IAM policy like the one shown in Example 3-4.

Example 3-4 Lambda Security Policy Required to Run the Python Script

{
     "Version": "2012-10-17",
     "Statement": [
         {
             "Effect": "Allow",
             "Action": [
                 "logs:CreateLogGroup",
                 "logs:CreateLogStream",
                 "logs:PutLogEvents"
             ],
             "Resource": "arn:aws:logs:*:*:*"
         },
         {
             "Effect": "Allow",
             "Action": [
                 "ec2:RunInstances"
             ],
             "Resource": "*"
         }
     ]
}

This policy essentially allows the Lambda service to interact with the CloudWatch logs to record actions against it and also launch instances with the ec2:RunInstances action.

Amazon ECS and Fargate

Modern applications are becoming more modular in design, and that is influencing how you use compute resources both on premises and in the cloud. The idea behind the modular approach is to create small units of compute that perform one task really fast and efficiently and then scale those units horizontally by introducing more of them to a compute cluster or service layer. This way, your application can dynamically and very precisely use only the resources that are actually required.

While virtual machine instances are still very much a part of modern applications, more and more emphasis is being placed on containers. Containers allow you to perform computing at a much smaller scale than do typical virtual machines. Containers also make possible what virtualization was never really able to deliver: the ability to run an application on any platform seamlessly with no configuration changes to either the environment or the container.

So what are containers? Essentially, a container is divided into two parts:

  • Container image (or just image): This is the package that contains the application and all the prerequisites, dependencies, libraries, and other underlying components an application requires to run.

  • Container instance (or just container): This is an instance of the container image that has been executed on a container engine.

Containers can run on a very wide range of operating systems, including on bare-metal container engines and in serverless container environments. The only prerequisite is that the image being used must comply with a certain image standard. The image standard defines both the structure of the container image storage and the runtime specification for running the container instance.

AWS uses the Docker standard, which is compliant with the Open Container Initiative (OCI). Docker is such a popular containerization engine that the terms container and Docker are almost used interchangeably. Several different containerization services are available from AWS:

  • Amazon Elastic Container Service (ECS): An orchestration service for EC2 instances and container tasks that allows for both full control and full automation of the container infrastructure

  • AWS Fargate: A serverless containerization managed service in AWS that allows you to run container tasks without having to worry about your infrastructure

  • Amazon Elastic Container Service for Kubernetes (EKS): An orchestration service for creating and managing Kubernetes clusters in AWS

  • Amazon Elastic Container Registry (ECR): A Docker-compatible hosted container registry

There are differences between ECS, Fargate, and EKS. While all three are able to deliver container infrastructure services for your application, each has a different approach to running containers and a different market niche focus:

  • ECS is focused on running fleets of Docker container engines. Customers who are familiar with Docker or who are looking for an option to run their existing Docker environments in the cloud will be attracted primarily to ECS.

  • Fargate is focused on spiky, infrequent, or event-based containerized application tasks. This service is very much tailored to new applications that are being built in a serverless manner with an event-based computing approach. Fargate provides the ability to run custom container images in the cloud completely on demand.

  • EKS is focused on providing customers with access to a Kubernetes environment in the cloud. Customers who are familiar with Kubernetes or who are looking for an option to migrate their Kubernetes-based applications to the cloud will be attracted primarily to EKS.

Both ECS and EKS have been designed to use EC2 instances as their underlying units of compute or Fargate to supplement ECS with even smaller units of compute (container tasks). Figure 3-8 illustrates the different cluster manager options (EKS or ECS) and the different container platform options (EC2 or Fargate).


Figure 3-8 AWS Container Orchestration and Execution Choices

Currently ECS is the gold standard approach to containerization in AWS, and Fargate is basically the next evolutionary step from ECS.


To create an ECS cluster, you can simply use the CLI aws ecs create-cluster command:

aws ecs create-cluster --cluster-name everyonelovesaws

As mentioned earlier in this chapter, Amazon ECS has adopted the Docker standard for running containers, and the ECS service is completely compatible with the standard approach to building and running containers in Docker. The ECS command line allows you to essentially substitute the Docker commands with ECS commands but use the same build files, the same images, and the same scripts in ECS that you would use in Docker. This means that the learning curve for migrating from Docker to ECS is virtually nonexistent. Alongside the standard definition of containers and images, the following components are defined in ECS and Fargate:

  • Task definition: A JSON file that describes the containerized environment. You can specify a complete application setup with up to 10 container images to be used to run up to 10 different services. Within a task definition, you also specify the services and volumes, and you can even implement features such as resource constraints for the running task.

  • Task: A task is essentially a running instance of a task definition. Tasks can be run manually or upon a certain trigger. You have the ability to set the number of tasks to run and also scale this number according to the requirements of the incoming requests.

  • Scheduling: The ECS task scheduler ensures that tasks are placed in the environment according to the definition of the scheduling you specify.

  • Cluster: A cluster is a logical grouping of tasks and services in both ECS and Fargate. You always require a cluster to be selected for running each task, and you can mix and match tasks from ECS and Fargate in one cluster.

  • Container agent: The agent is an application component running in the EC2 instance that allows communication with the ECS service and the scheduling of tasks on the instance. You are only required to maintain container agents when running tasks on ECS.

  • Container EC2 instance: This is an EC2 instance with a container agent installed that has been registered to ECS to run container tasks. You can either manually register container EC2 instances or leave the registration up to the ECS service, as it can both provision and register EC2 instances into the ECS environment.

  • Service and autodiscovery: These two mechanisms can group together the containers performing the same task. The service definition can be designed to scale automatically, and the autodiscovery can seamlessly add the newly created containers to the service to respond to requests.


To create a task definition, you can use the aws ecs register-task-definition command. It requires you to create a JSON file with the task definition that you will use to create the task. The task in this example simply runs an Ubuntu container and issues a ping request to amazonaws.com to keep the task alive. Remember that when an ECS task completes its job, it automatically terminates.

To see how this works, you can write a file called mypingtask.json that will define the task. You need to define the following characteristics, as shown in Example 3-5:

  • name: The name of your task definition.

  • image: The container image to use.

  • cpu: The number of CPU shares to give to this container. Shares are allocated in units of 1024 per vCPU, so giving the container 512 shares entitles it to half of one vCPU on an ECS instance.

  • command: The command to run, with each part of the command separated into its own quoted value in the array.

  • memory: The amount of memory, in megabytes, to give to this task. 512 MB is enough for the Ubuntu image to run.

  • family: A name used to group related task definitions, typically the name of the application whose microservices the task definitions belong to.

Example 3-5 Container Task Definition

{
     "containerDefinitions": [
         {
             "name": "mypingtask",
             "image": "ubuntu",
             "cpu": 128,
             "command": [
                 "ping",
                 "amazonaws.com"
             ],
             "memory": 512,
          }
     ],
     "family": "demo"
}

Once a container is initialized from this task definition, it runs an Ubuntu container and issues a ping to amazonaws.com. To register the task definition, you can run the following CLI command:

aws ecs register-task-definition --cli-input-json file://mypingtask.json

You can then reference the task by name to spin up a container. The .NET code snippet in Example 3-6 allows you to run an ECS task on your everyonelovesaws ECS cluster. You can include this snippet in your code to give the application the ability to spin up AWS ECS containers.

Example 3-6 .NET Code That Runs a Task Definition

using System.Collections.Generic;
using Amazon.ECS;
using Amazon.ECS.Model;

// Create an ECS client using the default credentials and region
var client = new AmazonECSClient();

var response = client.RunTask(new RunTaskRequest
{
    Cluster = "everyonelovesaws",
    TaskDefinition = "mypingtask:1"
});

List<Task> tasks = response.Tasks;
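
If you prefer the AWS CLI over an SDK, an equivalent task can be started with the aws ecs run-task command. The following is a minimal sketch that assumes the everyonelovesaws cluster and the mypingtask task definition registered earlier:

aws ecs run-task --cluster everyonelovesaws \
    --task-definition mypingtask:1 --count 1

You can then list the running tasks in the cluster with aws ecs list-tasks --cluster everyonelovesaws.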

Storing Persistent Data

Whether you are running EC2 instances or containers in AWS, the best practice is to keep your computing environments stateless. This means that any information that needs to be retained beyond the instance’s lifespan should not be stored on the instance itself.

With EC2, an instance is always run off a root device; this represents a disk attached to the instance that will either be sourced from the instance storage or from the EBS back end. As mentioned previously in this chapter, the instance type determines whether the instance is able to use the instance storage and how much instance storage is allocated to it.

An Amazon instance store is perfect for the stateless computing approach as the devices in the instance store are designed to be completely ephemeral. An instance store is sourced from physical storage devices attached directly to the hypervisor, and due to its ephemeral nature, it is ideal for any kind of temporary storage purpose, such as the paging file, various buffers, caches, scratch space, and other temporary data. When an instance using instance storage is part of a distributed cluster that replicates data across the network, you can also utilize the ephemeral disks for short-lived services that do not require durable local storage, such as temporary NoSQL database clusters and MapReduce operations on a Hadoop cluster.

Possibly the biggest benefit of an instance store is that the performance of the directly attached storage subsystem is much higher than with the EBS back end; in addition, the price of the instance store volume is included in the instance price. In Linux EC2 instances, all instance store volumes are automatically mounted in the operating system, and in Windows, the disks are attached, but you need to create the volumes in Disk Management or by using the diskpart utility. When instance store volumes are included, you should make sure to use them.

The only drawback of using an instance store is that the data on the instance store is available only while the instance is running. All data is lost if a hardware failure or a user-initiated shutdown operation causes the instance to be stopped on the hypervisor. As soon as the instance is stopped, all blocks from the volume are flushed, and when the instance is started again, all changes made before the shutdown are discarded. This functionality is completely logical as you should expect that the instance will probably be started on another hypervisor the next time, and the physical storage from the previous session will not be attached to it.

Amazon EBS

To store data persistently, you can use the EBS service to create network-attached volumes. The EBS volumes are designed to deliver block device volumes that are inherently highly available within one availability zone. The data volumes are replicated to two disk subsystem facilities in the same availability zone. Each disk subsystem is designed to withstand multiple disk failures without losing data. The EBS volumes are also designed to persist the data, regardless of the instance state. You also have the ability to detach an EBS volume from one instance and attach it to another one. This can be very useful when you’re updating or upgrading a component in the operating system.

For example, consider an EC2 instance with a secondary EBS volume attached. The operating system on the primary device also has a database server installed. The database itself is stored on the secondary volume. Instead of updating the database engine to the newest version within the operating system and rebooting the server, you can perform the update by spinning up a new instance and making sure it is operational. When you determine that the new operating system is ready to take over the job, you simply stop the database service on the original instance, detach the secondary EBS volume from the original instance, and attach it to the updated instance. Once it is attached, you start the database service, and the database is served directly from the secondary EBS volume. In case of issues, you have an easy way to roll back to the old instance as you can just reverse the process and fall back to the previous setup.
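
The volume move described in this example can also be scripted with the AWS CLI. The following is a minimal sketch; the volume and instance IDs are placeholders that you would replace with your own, and the target instance must be in the same availability zone as the volume:

aws ec2 detach-volume --volume-id vol-0123456789abcdef0

aws ec2 attach-volume --volume-id vol-0123456789abcdef0 \
    --instance-id i-0fedcba9876543210 --device /dev/sdf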

But what if the new version corrupts your database, and you aren’t able to roll back? For backing up, you can simply create an EBS snapshot. By creating a snapshot, you are creating a point-in-time copy of the whole volume. The snapshots are stored in an AWS managed S3 bucket and are incremental in nature: Only the blocks that have changed since your most recent snapshot are stored in the new snapshot. A snapshot can be used to restore the volume to any of the available points in time.

The only drawback of snapshots is that they are performed without notifying the operating system. This means that any software running within the operating system will not automatically commit any outstanding I/O operations to disk when the snapshot creation starts, which could leave the data in a snapshot inconsistent or corrupted. To prevent this from happening, you need to implement a script (a minimal sketch follows the list) that

  • Connects to the operating system and commits any outstanding I/Os

  • Momentarily freezes the disk in preparation for the snapshot

  • Initiates the snapshot so the data copy process is started

  • Releases the momentary disk freeze as soon as the snapshot has started
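
On a Linux instance, such a script could look like the following minimal sketch. It assumes an ext4 or XFS file system mounted at /data, uses a placeholder volume ID, and requires root privileges for fsfreeze:

# Commit outstanding I/O and freeze the file system
sync
fsfreeze -f /data

# Start the snapshot; the copy continues in the background
aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 \
    --description "Consistent snapshot of /data"

# Release the freeze as soon as the snapshot has been initiated
fsfreeze -u /data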

During the snapshot creation phase, the disk is fully accessible, but you might observe a slight performance impact in applications with very high I/O requirements. Best practice is to perform snapshots during low application utilization periods.

Scalability and High Availability

Say that you have an EC2 instance running in your VPC, it has an EBS volume attached for the persistent data, and you are making snapshots of the volume to make sure you can recover even from the worst-case scenario. But as you now know, an instance can run in only one availability zone at a time. This means that the availability of such an instance is limited to the SLA for the services in AWS that are tied to one availability zone. While this might meet the needs of your application, there is a way of making this setup even more highly available and resilient to failure. In addition, once you introduce high availability, you are also able to introduce scalability with exactly the same approach.

When analyzing the requirements for making an application highly available, you need to assess the following areas:

  • Persistent data storage: Where is the data being generated by the application stored? Is it being stored on the instances (that is, are the instances stateful) or in a separate back-end environment?

  • Application state information: Is the data about the user sessions available uniformly across all instances, or is it unique to each instance?

  • The SLA: What kind of high availability does your SLA determine? What kind of recovery time objective (RTO) and recovery point objective (RPO) are acceptable when a failure occurs?

High Availability Design Patterns

When your instances are stateful, the only way to ensure high availability is through the use of clustering. Any data being served or stored by the instances should be seamlessly distributed either to all nodes of the cluster or to a subset of nodes that is primarily responsible for serving that data. Many clustering tools exist for different types of applications. The cloud might not initially seem to be an ideal place to deploy clustered applications, but clustered applications can in fact work quite well in the cloud, and there are many service examples in the AWS cloud (DynamoDB, Elastic MapReduce, Redshift, and so on) that depend on different types of clustering to provide data consistency and data resiliency across multiple nodes.

If data distribution across the instances is not an option, then a rigorous backup schedule for each instance’s EBS volume and a fast approach to recovery are required. You still need to be able to run multiple instances without replicating data between them, which means running the instances behind a load balancer and employing sticky sessions to direct a specific user to the specific server that can handle that user’s requests. Such a design should not be considered for modern applications due to its obvious drawbacks in the event of an instance failure.

Given all the requirements, you should always strive to conform to the best practices outlined in AWS and think of all of your AWS resources as disposable pieces of the application that can be easily replaced. In fact, the best practice in AWS is to keep all computing instances completely stateless. This means moving all persistent data, even the logs, off the instance and storing it in an environment that is external to the compute. The compute should only do the computing and not the storing. If you keep your instances stateless, it becomes much easier to create highly available applications that can also be made highly scalable. All you need in this case is two or more instances in two or more availability zones behind ELB.

AWS Elastic Load Balancer

The Elastic Load Balancer (ELB) service is designed to deliver higher availability and even load distribution across a number of EC2 instances or ECS containers. The ELB service is automatically integrated with other AWS services and allows you to push metrics and notifications to other services to facilitate intelligent responses to ELB events. For example, you can integrate the ELB service with the Auto Scaling service to automatically scale the number of instances to meet the incoming traffic demands. There are three types of load balancers available in AWS:

  • Classic Load Balancer

  • Application Load Balancer

  • Network Load Balancer

Classic Load Balancer was the first load balancer offering from AWS. The service provides a robust and simple-to-use load balancer primarily designed to load balance Layer 4 (TCP-based) traffic with some Layer 7 capabilities, such as the ability to use a cookie to bind the user to the target server and create a sticky session. Classic Load Balancer has been at the core of pretty much every major application since the early days of AWS and is the only load balancer supported on the EC2 classic network.

In 2016, AWS introduced the Application Load Balancer as its next-generation load balancing solution. It is designed to be a pure Layer 7 load balancer: The service can read and understand an application request and can route the request to multiple back-end target groups. Each target group can have a separate traffic rule assigned to it. With the traffic rule, you define the pattern in the request that directs the request to the appropriate back end. Consider this example:

  • The front-end target group can have a pattern that directs any request for the website directly to the front-end target group that hosts only the front-end HTML, CSS, and so on.

  • The images target group can have a pattern matching any string containing *.JPG or *.JPEG, and all requests for images are automatically redirected to a separate back end that hosts only the images.

  • A request made by a mobile browser would automatically be detected by the Application Load Balancer, and that request would be sent to a third target group that would specifically serve only the mobile content.

This kind of behavior is especially useful for any application designed around microservices, where each target group can represent a separate microservice layer.
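
As a rough sketch, a path-based rule like the images example above could be added to an existing Application Load Balancer listener with the CLI. The listener and target group ARNs below are placeholders for your own resources:

aws elbv2 create-rule \
    --listener-arn arn:aws:elasticloadbalancing:us-east-1:111122223333:listener/app/my-alb/1234567890abcdef/fedcba0987654321 \
    --priority 10 \
    --conditions Field=path-pattern,Values='/images/*' \
    --actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:us-east-1:111122223333:targetgroup/images/0123456789abcdef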

The load balancers ensure high availability of traffic coming to the instances by deploying two redundant endpoints in two availability zones. This ensures that even if a failure occurs on the load balancer hardware, the service is not disrupted, and any requests coming in to the load balancer service can always be redirected to another availability zone. Figure 3-9 illustrates a highly available application deployed across two availability zones in a VPC.


Figure 3-9 High Availability of a Web Application Provided by the AWS ELB Service

In 2017, AWS delivered its second next-generation load balancer service: Network Load Balancer. It is a pure Layer 4 load balancer, designed to deliver very high network throughput and very low latencies of responses to client requests. It can serve tens of millions of requests per second and deliver consistent performance at any scale and with any access pattern. The service is designed for high-performance microservices environments and exposes a single static IP address per availability zone.

Auto Scaling

Now that you understand all the prerequisites for making an application scalable, you need to take a look at Auto Scaling. The AWS Auto Scaling service is designed to deliver automatic scalability to your applications running on AWS. The idea behind Auto Scaling is that you can take the performance data from an application and increase the number of instances (scale out) when more of them are required to meet the demand or decrease the number of instances (scale in) when the demand disappears. This approach can help your application make much better use of resources and can also save you quite a sizable amount of your budget because you only need to run as much compute capacity as required by the current usage.

The Auto Scaling service works hand in hand with the CloudWatch service, which provides the metrics, and with the ELB service, which is notified of any changes in the size of the instance target group. The Auto Scaling service can scale the following AWS environments:

  • EC2: You can add or remove instances from an EC2 Auto Scaling group.

  • EC2 Spot Fleets: You can add or remove instances from a Spot Fleet request.

  • ECS: You can increase or decrease the number of containers in an ECS service.

  • DynamoDB: You can increase or decrease the provisioned read and write capacity.

  • RDS Aurora: You can add or remove Aurora read replicas from an Aurora DB cluster.

For the purpose of demonstrating the Auto Scaling capabilities of a compute cluster, the EC2 service is the primary focus here. The principles and definitions set out for the EC2 service can easily be extrapolated to the other services in the preceding list. When scaling EC2 instances, you first need to create an EC2 launch configuration, which specifies the instance type and AMI to use, the key pair to add to the instance, one or more security groups to assign to the instances, and the block device mapping with which the instances should be created. The launch configuration is then applied to an EC2 Auto Scaling group that defines the scaling limits (the minimum, maximum, and desired numbers of instances) and the scaling policy to use when a scaling event occurs. The scaling policy defines a trigger that specifies a metric ceiling (the maximum allowed usage before scaling out) and a metric floor (the minimum usage before scaling in). The scaling policy also defines how long that ceiling or floor can be breached before an alarm is triggered and what happens when the alarm triggers Auto Scaling.
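
As a minimal CLI sketch matching the example that follows, the launch configuration, the Auto Scaling group, and a simple scaling policy could be created like this. The security group and subnet IDs are placeholders; the AMI shown is the us-east-1 Amazon Linux image also used later in Example 3-7:

aws autoscaling create-launch-configuration \
    --launch-configuration-name web-lc \
    --image-id ami-0ff8a91507f77f867 \
    --instance-type t3.micro \
    --key-name ec2-user \
    --security-groups sg-0123456789abcdef0

aws autoscaling create-auto-scaling-group \
    --auto-scaling-group-name web-asg \
    --launch-configuration-name web-lc \
    --min-size 2 --max-size 6 --desired-capacity 2 \
    --vpc-zone-identifier "subnet-aaaa1111,subnet-bbbb2222"

aws autoscaling put-scaling-policy \
    --auto-scaling-group-name web-asg \
    --policy-name scale-out-50-percent \
    --adjustment-type PercentChangeInCapacity \
    --scaling-adjustment 50

A CloudWatch alarm on the group's aggregate CPU utilization would then invoke the returned policy ARN when the 80% ceiling is breached for 10 minutes.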

For example, consider the following setup:

  • An Auto Scaling group with a minimum of two and a maximum of six instances

  • A CPU % metric with a ceiling of 80% for scale-out events

  • A CPU % metric with a floor of 20% for scale-in events

  • A breach duration of 10 minutes

  • A scaling definition of +/– 50% capacity on each scaling event

If the application is running between 20% and 80% aggregate CPU usage across the Auto Scaling group, there will be no triggers and thus no Auto Scaling. You start your application with a minimum of two instances. The application usage starts growing and grows above 80%. The scaling policy allows for 10 minutes for the application to settle before scaling out. Once the alarm is triggered, the application is expanded with one more instance, which represents 50% of the previous capacity of two instances. Figure 3-10 illustrates the operation of a scaling policy triggered by a CloudWatch alarm.

Now you have three instances. The application usage grows, and you are required to scale out. You will be adding two more instances because one would not be enough to fulfill the +50% scaling of the group. The environment now has five instances. On the trigger of ceiling breach, you would expect three instances to be added, but that would mean that you have more than six instances running, so only one is added, and you are at the maximum of six. When the application usage is reduced below 20% for 10 or more minutes, the application is scaled in, and the number of instances is decreased by 50%, leaving you with three instances. If the usage falls further, below 20%, your application can scale in but only by one more instance because the application minimum is set to two instances.


Figure 3-10 Example of Auto Scaling Policies and CloudWatch in Operation

Amazon Route 53

As you have seen, making an application highly available within one region is fairly straightforward. But an application running in the cloud will not be very usable if you have no way of directing traffic at it. Distributing a list of IP addresses is a thing of the past, especially in the cloud, where both private and public IP addresses are considered disposable resources. So how can you deliver the users to the right service that has an ever-changing IP address on which it responds? The answer is DNS.

Amazon’s Route 53 service is essentially a next-generation managed DNS cloud service. Gone are the days of editing bind zone files. With Route 53, the approach is based on the standard way you communicate with all AWS services: through the API. The service allows you to dynamically update DNS records and deliver DNS responses to clients with a 100% SLA. Basically the only way Route 53 would go down is if there were no electricity anymore—and in that case, you really wouldn’t require any DNS services anyway.

While providing standard DNS request/response functionality is at the core of Route 53, the service also allows you to register domains, create public and private zones, and, possibly most importantly, provide traffic shaping functionalities through DNS responses. Route 53 doesn’t just respond with the first record in the list; it can perform health checks of the DNS targets as well as sense the latency from the user to the target. The Route 53 service has the following routing policies that help you shape the traffic:

  • Simple routing

  • Multivalue answer routing

  • Failover routing

  • Weighted routing

  • Latency-based routing

  • Geolocation and geoproximity routing

Simple routing is the default routing used to serve Route 53 DNS responses. It returns a single response for a single request. For example, when looking up www.pearson.com, a simple A record would return one IP address even if multiple values were recorded for that A record. The simple routing mechanism picks one IP address at random if multiple values are present.

To augment the capabilities when multiple responses are required, multivalue answer routing can be used. In this case, each request for one address returns up to eight possible responses. For each response, the Route 53 service can also perform a health check that can determine whether the source is responding. Any applications where a list of possible servers is required to perform tasks in a distributed manner would make good use of multivalue answer routing (for example, peer-to-peer applications, video streaming, HPC).

With certain routing policies, Route 53 also supports health checks. A health check allows the DNS service to automatically determine whether the target that is being returned in the response is healthy. The DNS service can perform TCP port checks as well as HTTP and HTTPS health checks. When configuring an HTTP/HTTPS health check, the service can perform a simple response check, or it can also check for certain strings in your website. For example, an HTTP health check can be configured to look for a 2xx or 3xx HTTP response while also looking for a specific string in the website, such as the name of your company or any unique string you can hide in the HTML code. On top of the health check, a response time threshold can be introduced. Using a response time threshold, the DNS service can also determine whether the site is responding too slowly to be of use to clients. Figure 3-11 illustrates a failover routing configuration with a health check in Route 53 DNS.


Figure 3-11 A Route 53 Health Check Determining Whether an Application Is Healthy

The only time a health check is not optional is with failover routing. Failover routing provides you with the ability to serve content from several AWS regions at the same time. When configuring failover routing, you need to determine the active endpoint that will be receiving all the traffic. In addition, a failover endpoint needs to be configured as a passive site, intended for failover of traffic in case the active endpoint fails the health check. This approach is useful when a backup or disaster recovery site is set up in a different region and the data is synchronized only in the direction from the active endpoint to the passive one.

When you are operating two or more endpoints with a two-way or multidirectional data replication approach, you can use weighted routing. This routing approach provides you with the ability to deliver traffic to several endpoints simultaneously and allows you to decide how much load to deliver to each of the endpoints. You can specify the weight of each endpoint according to the capacity of the endpoint. This can also be very useful when performing A/B testing or deploying applications with a blue/green deployment approach.
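
As a minimal sketch, a weighted record could be created with the CLI as shown below. The hosted zone ID, record name, and IP address are placeholders; a second record with the same name, a different SetIdentifier, and its own weight would complete the weighted set:

aws route53 change-resource-record-sets \
    --hosted-zone-id Z1D633PJN98FT9 \
    --change-batch '{
      "Changes": [ {
        "Action": "UPSERT",
        "ResourceRecordSet": {
          "Name": "www.example.com",
          "Type": "A",
          "SetIdentifier": "primary-endpoint",
          "Weight": 80,
          "TTL": 60,
          "ResourceRecords": [ { "Value": "203.0.113.10" } ]
        }
      } ]
    }'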

Note

A blue/green deployment approach allows you to deploy a completely new environment (green) and switch over the traffic from the existing environment (blue). This is very useful when switching to a whole new version of the application. You can also use a gradual approach for testing, which is schematically the same but allows you to release a new feature to a subset of users in the wild. This approach is usually called A/B testing, where A is the old version and B is the new version. You designate a subset of traffic to go to B (for example, 5%) to test new features in the wild.

When performance of an application is key, the latency-based routing approach provides the best user experience by determining the endpoint that has the lowest latency to the user and delivering the response in the fastest manner. It involves issuing a latency check between endpoints in the same geographic area as the requestor and determining which front end will respond faster. It is not an estimation; the service actually performs a latency check before serving a response to the end user.

In a similar manner, geolocation and geoproximity routing can be used to determine the endpoint to which the user needs to be routed. Geolocation determines the location of the user according to the source IP address and allows you to deliver content based on the country or region of the source IP address. This means you can deliver the application in the user’s language or comply with any laws and regulations in the country or region. For example, when running a global application where users store their personal data, data sovereignty principles will apply differently to your application in each country or region where the application is deployed. Thus, you can route the users within a region or country to a regional AWS location to ensure that you comply with the laws and regulations governing data sovereignty.

Geoproximity, on the other hand, uses the longitude and latitude of the requestor to determine which AWS region will serve the content for that user. You can also assign different weights to the regions that are part of the geoproximity group. The bigger the weight, the larger the geographic area each region serves. This approach can be very useful when you want to deliver the lowest latency but also take into account the difference in provisioned capacity for your application in different regions.

Orchestration and Automation

When you know how to make your applications highly available and highly scalable in AWS, it is fairly easy to extrapolate how to make the applications fully automated. The fact that all of the AWS infrastructure components are easy to deploy and simple to use has led to a sprawl in the numbers of instances, containers, and other services used in the cloud. To control this sprawl and bring order to your application infrastructure, you need a unified and automated way to approach the deployment, updating, life cycling, and decommissioning of your applications.

Orchestration is one of the key aspects that enables you to control your environment without performing manual tasks. What you look for in orchestration is the ability to write a specification document that outlines the services to be deployed and push that document to the orchestration service, which in turn creates the resources. This ability improves the repeatability of your deployments and the reliability of the application. You can deploy the same script into your development, test, QA, staging, production, and other environments in a unified manner. Because the same infrastructure is being built in all of these environments, the reliability of the final deployment will be much higher, as orchestration and the cloud remove many of the unknowns.

Basics of Cloud Orchestration and Automation

Orchestration tools allow you to implement the infrastructure as code approach to delivering capacity required by your application from the cloud. Instead of creating the cloud objects manually, you simply create an orchestration script that can be stored in a version repository and maintained in the same way as your development code. From the repository, you have the ability to deploy the script into the cloud, and the orchestration service will simply create the infrastructure for you. Once the infrastructure is deployed, you can simply deploy the software required in the same manner.

Infrastructure as code is very beneficial in DevOps and CI/CD environments as the orchestration script can be put through the same test cycles as the software by the CI environment as soon as any changes are detected. This means you can easily verify the infrastructure orchestration script in an automated and rapid manner.

Several open standards have been created to support interoperability, but probably the most widely adopted standard is Topology and Orchestration Specification for Cloud Applications (TOSCA), defined by the Organization for the Advancement of Structured Information Standards (OASIS). TOSCA defines a standard language that allows for building topologies, creating services, and defining any other components in the cloud. TOSCA-compliant orchestration tools allow for interoperability with configuration management tools and cloud management platforms, providing you with the ability to achieve a high level of unification in cloud orchestration.

Several services in AWS provide you with the ability to control the infrastructure, environment, and capacity that your application consumes:

  • AWS CloudFormation: CloudFormation provides a fully TOSCA-compliant infrastructure orchestration service that can perform any deployment of an AWS service in a very customizable manner.

  • AWS Elastic Beanstalk: This simple and easy-to-use managed service automatically orchestrates the infrastructure deployment and allows developers to focus on their code instead of the infrastructure.

  • AWS CodeDeploy: CodeDeploy gives you the ability to (automatically) deploy code and perform updates from your development cycle in a new or existing environment.

  • AWS OpsWorks: This managed environment provides a fully configurable Chef or Puppet configuration management service.

  • AWS Systems Manager: Systems Manager provides a set of tools to manage and control instances in the cloud, including the ability to run remote shell scripts and perform automated updates and installations.

The next sections look at the Elastic Beanstalk and CloudFormation services in a bit more detail. AWS CodeDeploy is covered in more detail in Chapter 6, “AWS Development Tools.” Because the OpsWorks and Systems Manager tools do not apply to the AWS Certified Developer certification, this book does not cover these tools in detail. If you would like to learn more about them, though, pick up the AWS Certified SysOps Administrator Associate Certification Guide, which discusses them in detail.

AWS Elastic Beanstalk

The Elastic Beanstalk service is designed to empower developers by automatically performing the day-to-day management tasks of infrastructure deployment and configuring the appropriate features to run the code. Elastic Beanstalk can automatically deploy an EC2 instance with an operating system, the appropriate programming language interpreter, any required application services, prerequisites, frameworks, runtimes, libraries, modules, and so on, as well as an HTTP or HTTPS service that can present the application on a standard HTTP port. Elastic Beanstalk can also configure any external components, including load balancers, databases, message queues, and object storage.

Elastic Beanstalk is the right solution for any environment where the business driver is the reduction of overhead due to architecting, operating, maintaining, updating, and patching the infrastructure. With Elastic Beanstalk, the resources are created in your account with full transparency. You can see each instance, each load balancer, each RDS database, and so on. To run your application, you simply need to provide the code and the specification for the Elastic Beanstalk environment. This allows the developers to focus on the code and simply deploy the application. Elastic Beanstalk takes care of the rest.

Elastic Beanstalk deployment can be broken down into the following components:

  • Application: A logical grouping of environments.

  • Environment: The specific platform the code needs to run on. An application can have multiple environments with multiple different sets of platforms to run on. Each environment within an application has a unique endpoint where it can be addressed.

  • Tier: The type of environment:

    • Web tier: An application front end that responds to client requests

    • Worker tier: An application back end that processes tasks queued up in a message queue

  • Configuration: All the information about the application. A configuration file can be saved and can be used to redeploy and customize the existing application. The configuration can also be applied to a new application to clone the existing environment because it contains all the specifics about the infrastructure and all the code. The configuration can also be used as a point-in-time backup.

  • Application version: A version of the code running within an environment. Multiple concurrent versions of the code can be uploaded into an Elastic Beanstalk application and deployed to multiple concurrently running environments.

To deploy an application with Elastic Beanstalk, you only need a package containing your code. Elastic Beanstalk is very flexible as it allows you to deploy the code by using the Management Console, the AWS CLI, the Elastic Beanstalk CLI, or any of the SDKs or by addressing the Elastic Beanstalk API directly. When deploying your code, you of course have to have code that is compatible with one of the Elastic Beanstalk–supported platforms, such as one of these:

  • Packer: An open-source tool for creating and managing custom AMIs in AWS

  • Docker: The Docker container engine, which allows you to deploy single or multiple containers into an Elastic Beanstalk environment

  • Java: Java SE 7 and 8 code and Java 6, 7, and 8 on Tomcat

  • .NET: .NET Framework and .NET Core on Windows Server 2008 to 2016 and IIS versions 7.5 to 10

  • Node.js: Node.js language versions 4, 5, 6, 7, 8, and 10

  • PHP: PHP language versions 5.4 to 7.2

  • Python: Python language versions 2.6 to 3.6

  • Ruby: Ruby language versions 1.9 to 2.6

  • Go: Go language version 1.11

Alongside the code, you can specify any other AWS services that will be required to run the application. Elastic Beanstalk has a built-in interface that allows you to control the following services that are commonly used in application development:

  • Elastic Load Balancers: To make web tier applications highly available, you can implement a load balancer that can be automatically configured and deployed by Elastic Beanstalk.

  • SQS queues: These are automatically deployed to worker tiers and are the single point of contact with the workers. You post messages for the workers to process, and the workers listen to the message queue.

  • CloudWatch: You have the ability to control the delivery of metrics and logs to CloudWatch. This means you can deliver only the metrics that matter.

  • S3: Any S3 buckets to be used by the application for storage and log delivery. One thing to note is that when deleting applications where S3 buckets have been created through Elastic Beanstalk, the contents of the buckets and the buckets themselves are deleted also!

  • RDS databases: Any RDS databases to be used by either the web tier or the worker tier. Be careful because the RDS database is also deleted with the application if deployed by Elastic Beanstalk.

  • AWS X-Ray: An X-Ray daemon can be automatically installed on all the instances to allow for integration with the X-Ray distributed tracing.

This set of integrations gives you a lot of flexibility, but you also have the ability to perform fairly deep customization of the infrastructure components, including the packages, prerequisites, libraries, frameworks, and so on. You also have the ability to run commands during the environment initialization. The customization can be performed within the package by including configuration files in an .ebextensions directory.
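
As a minimal sketch, an .ebextensions configuration file (for example, .ebextensions/custom.config, a hypothetical name) could install an operating system package and run a command during environment initialization. Configuration files can be written in YAML or JSON; this sketch uses JSON:

{
  "packages": {
    "yum": {
      "git": []
    }
  },
  "commands": {
    "01_show_kernel": {
      "command": "uname -r > /tmp/kernel-version.txt"
    }
  }
}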


Elastic Beanstalk has its own CLI, called EB CLI, that can be used to deploy an Elastic Beanstalk environment. Let’s look at an example of creating an environment.

To install the EB CLI, first install the prerequisites (for Linux in this example):

Note

For instructions on how to install prerequisites and the tools on your operating system, take a look at https://github.com/aws/aws-elastic-beanstalk-cli-setup.

yum group install "Development Tools"
yum install zlib-devel openssl-devel ncurses-devel libffi-devel \
    sqlite-devel.x86_64 readline-devel.x86_64 bzip2-devel.x86_64

Next, clone the repository:

yum install git
git clone https://github.com/aws/aws-elastic-beanstalk-cli-setup.git

Next, enter the directory and run the bundled installer:

cd aws-elastic-beanstalk-cli-setup/scripts

./bundled_installer

The eb command should be added to $PATH automatically. This sometimes does not work, so if you cannot run eb directly, look for the eb executable under the ~/.ebcli-virtual-env/ directory and add its location to your $PATH.

Now you can start using the EB CLI. First, you should run the eb init command to configure the EB CLI:

eb init --interactive

You are prompted to select a default region for deploying the Elastic Beanstalk environment, the credentials to be used, and the Elastic Beanstalk application to be created. In this example, you will name the new application everyonelovesaws and use your preferred programming platform. When you run the init command, an empty application is created.

Next, you need to create an environment. You can specify a name for the environment and a load balancer type, or you can just press Enter to select the defaults.

You can enter the following command to create a sample application that you can then access through the DNS name shown in the output of the deployment:

eb create --sample

Running the eb create command in your code repository deploys the code present in the directory into the application. As you can see, this greatly simplifies the deployment of the environment.
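
Once the deployment finishes, you can check the state of the environment and open the application URL directly from the EB CLI; both commands accept the environment name as an argument:

eb status everyonelovesaws-dev2
eb open everyonelovesaws-dev2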

Now you can log in to the AWS Management Console and browse to the EC2 management section. By clicking Instances, you should be able to see the instance that was created for your EB application, as shown in Figure 3-12.


Figure 3-12 An EC2 Instance Is Created That Matches the Application Name Everyonelovesaws-dev2

Next, click Load Balancers. You should see a load balancer that has been created for your application, as shown in Figure 3-13. If you need to filter the selection, you can use the filter elasticbeanstalk:environment-name: everyonelovesaws-dev2, replacing the environment name with the name of your own Elastic Beanstalk deployment.


Figure 3-13 A Load Balancer Is Visible in the Load Balancer’s Section

Next, you can browse to the Elastic Beanstalk Management Console and select the everyonelovesaws application and the environment. You should see the environment in an OK state, as shown in Figure 3-14.


Figure 3-14 The Elastic Beanstalk Application in the AWS Management Console

Finally, by clicking the URL in the name of the application, you can see that the sample application is operational, as shown in Figure 3-15.


Figure 3-15 The Application Is Deployed and Available on the URL Shown in the EB Management Console

You can now terminate the environment with the following command, replacing everyonelovesaws-dev2 with your environment name:

eb terminate everyonelovesaws-dev2

AWS CloudFormation


When you require a bit more control of the environment than Elastic Beanstalk gives you, you might want to use CloudFormation. CloudFormation was designed as the core infrastructure as code service in AWS and is fully TOSCA compliant, allowing the service to integrate with external tools and configuration management.

CloudFormation allows you to create a JSON or YAML document that specifies the cloud objects you would like to build. It allows you to customize the application with very fine granularity and deliver infrastructure services for your application in a unified, repeatable, and reliable manner. CloudFormation can be accessed through the Management Console, the CLI, the SDKs, or directly through the CloudFormation API.

CloudFormation deploys all the resources in a template in parallel and allows you to control any resources specified in the template. For example, if you create a VPC through the CloudFormation service but then add instances manually, the service will not be able to reconfigure or delete the VPC while other cloud objects depend on it. When you deploy a template, CloudFormation checks the syntax and validates the template before deploying. Failures during deployment are always possible, and because a template is treated as one unit, any failure causes CloudFormation to automatically roll back the resources created up to the point where the failure occurred. This means you are never required to clean up any failed CloudFormation deployments in your AWS account.

In CloudFormation, you use templates to deploy stacks. Templates are the specification documents that define the resources, and a stack is a running set of connected resources in AWS. When making changes to templates, you can create change sets, which allow you to compare the current state with the proposed changes before applying the changes to the stack.

An AWS CloudFormation template has the following sections:

  • Template Version: An optional component that specifies the CloudFormation template version that the template conforms to.

  • Description: An optional component that allows you to identify what the template does. The description can be any arbitrary string of text that makes sense to you.

  • Metadata: An optional component that can include metadata to pass to the cloud objects being deployed.

  • Parameters: An optional part of the template used to provide parameters to be passed to the resources upon creation. Can contain default values or a list of valid responses.

  • Mappings: An optional component that can map the resource parameters to the appropriate environment or region. For example, you can specify the instance type, image, key pair and so on to use depending on the environment tag; you can specify different parameters for the Dev, Test, and Prod environment tags.

  • Conditions: An optional component that allows you to control the conditions under which the cloud object can or cannot be created by CloudFormation.

  • Transform: An optional component for declaring the AWS Serverless Application Model (SAM) version to be used in the template. Used with Lambda, API Gateway, and so on.

  • Resources: The mandatory part of the template that defines the resources that CloudFormation should create.

  • Outputs: An optional component that gives you the ability to output important information about the stack once the deployment is completed.

Example 3-7 shows an example of a CloudFormation template that deploys an EC2 instance.


Example 3-7 CloudFormation Template Example

{
  "AWSTemplateFormatVersion" : "2010-09-09",
  "Description" : "Allows you to create a t3.micro EC2 instance using the Amazon
Linux AMI in any US region. Uses a predefined ssh key named ec2-user. Please create
this key or replace the keyname parameter. Creates a security group to allow SSH
access.",
  "Parameters" : {
    "KeyName": {
      "Description" : "Name of the keypair to use with the instance",
      "Type": "AWS::EC2::KeyPair::KeyName",
      "Default" : "ec2-user"
    },
    "InstanceType" : {
      "Description" : "Define the instance type as t3.micro",
      "Type" : "String",
      "Default" : "t3.micro"
    }
  },
  "Mappings" : {
     "T3type" : {
      "t3.micro"    : { "Arch" : "HVM64" }
          },
        "RegionID" : {
      "us-east-1"        : {"HVM64" : "ami-0ff8a91507f77f867", "HVMG2" :
                           "ami-0a584ac55a7631c0c"},
      "us-east-2"        : {"HVM64" : "ami-0b59bfac6be064b78", "HVMG2" :
                           "NOT_SUPPORTED"},
      "us-west-2"        : {"HVM64" : "ami-a0cfeed8", "HVMG2" :
                           "ami-0e09505bc235aa82d"},
      "us-west-1"        : {"HVM64" : "ami-0bdb828fd58c52235", "HVMG2" :
                           "ami-066ee5fd4a9ef77f1"}
  }
  },
  "Resources" : {
    "EC2Instance" : {
      "Type" : "AWS::EC2::Instance",
      "Properties" : {
        "InstanceType" : { "Ref" : "InstanceType" },
        "SecurityGroups" : [ { "Ref" : "SecurityGroup" } ],
        "KeyName" : { "Ref" : "KeyName" },
        "ImageId" : { "Fn::FindInMap" : [ "RegionID", { "Ref" : "AWS::Region" },
                          { "Fn::FindInMap" : [ "T3type", { "Ref" : "InstanceType"
}, "Arch" ] } ] }
      }
    },
    "SecurityGroup" : {
      "Type" : "AWS::EC2::SecurityGroup",
      "Properties" : {
        "GroupDescription" : "Open port 22 for SSH access",
        "SecurityGroupIngress" : [ {
          "IpProtocol" : "tcp",
          "FromPort" : "22",
          "ToPort" : "22",
          "CidrIp" : "0.0.0.0/0"
        } ]
      }
    }
  },
  "Outputs" : {
    "InstanceId" : {
      "Description" : "Output the instance ID",
      "Value" : { "Ref" : "EC2Instance" }
    },
    "PublicIP" : {
      "Description" : "Output the Public IP of the EC2 instance",
      "Value" : { "Fn::GetAtt" : [ "EC2Instance", "PublicIp" ] }
    }
  }
}

To deploy the template, you save it as a file named t3.json in your working directory and run the following AWS CLI commands. First, create a key pair named ec2-user:

aws ec2 create-key-pair --key-name ec2-user \
    --query 'KeyMaterial' --output text > ec2-user.key

The --query option extracts the private key material, and the output is redirected to ec2-user.key, which saves the private key so you can later SSH to the instance.

Next, use the following command to deploy the stack:

aws cloudformation deploy --template-file t3.json \
    --stack-name everyonelovesaws

To list the stack properties, simply issue the following CLI command:

aws cloudformation describe-stacks --stack-name everyonelovesaws

You should see output similar to that in Figure 3-16, which shows the status of the stack. Look for the "StackStatus": "CREATE_COMPLETE", which indicates that the stack deployment is complete.


Figure 3-16 Output of the describe-stacks CLI Command

You can now connect to the instance you deployed via SSH by using the saved ec2-user key and the public IP address found in the Outputs section of the describe-stacks output. In this example, the public IP would be 18.220.196.151.
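
Instead of reading the address out of the JSON by hand, you can extract it with a --query filter, for example:

aws cloudformation describe-stacks --stack-name everyonelovesaws \
    --query "Stacks[0].Outputs[?OutputKey=='PublicIP'].OutputValue" \
    --output text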

Secure the user key by modifying the permissions to make it readable only to the owner:

chmod 400 ec2-user.key

Next, use the ssh command to log in to the server with the ec2-user key:

ssh -i ec2-user.key ec2-user@18.220.196.151

When you are done with this exercise, you can remove the resources created by the stack and delete the stack by running the following command:

aws cloudformation delete-stack --stack-name everyonelovesaws

When the template is deployed, a stack is created. A stack can represent any part of an application or even a complete set of resources for a particular application. The best practice for stacks is to create stacks with separate functions, such as a single stack to create the VPC, another stack to deploy the security rules, another stack to deploy the EC2 instances, and so on. Each stack can output and feed information to the stack that is being deployed after it. You can chain the stacks together by specifying a stack within a stack. You can simply specify a path to a stack (an S3 path to the stack, for example) and embed the reference into another stack.

Because the CloudFormation service builds the resources in a stack in parallel, it can be tricky to deploy complex environments. This is why CloudFormation templates also support a DependsOn attribute that can be added to a resource definition. The DependsOn attribute gives you the ability to serialize the creation of the stack and ensure that the resources being relied upon are fully created before the dependent resources are deployed. For example, for a template with a VPC, security groups, and EC2 instances, you can use DependsOn to wait for the VPC to be completed before the security groups are deployed and then again use DependsOn so the security groups are created before the EC2 deployment is started. The output of each previous operation can also be fed into the next operation by using the Ref function. Ref allows you to retrieve information produced during creation and use it in the next steps. For example, the VPC ID is generated upon creation, and the Ref function can be used to retrieve the VPC ID when creating the subnets, deploying security groups and instances, and so on.
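
The following fragment is a minimal sketch of how DependsOn and Ref could be combined in a template; the resource names are arbitrary examples:

"Resources" : {
  "MyVPC" : {
    "Type" : "AWS::EC2::VPC",
    "Properties" : { "CidrBlock" : "10.0.0.0/16" }
  },
  "WebSecurityGroup" : {
    "Type" : "AWS::EC2::SecurityGroup",
    "DependsOn" : "MyVPC",
    "Properties" : {
      "GroupDescription" : "Created only after the VPC exists",
      "VpcId" : { "Ref" : "MyVPC" }
    }
  }
}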

To implement changes after the stack is deployed, you can use change sets. You can simply create the changes to the stack within the CloudFormation Management Console, and before applying them to the running stack, you can preview the effects of the changes. Three types of changes can be created during a change set:

  • Non-disruptive: Changes that do not impact running services. For example, changing an Auto Scaling policy’s maximum instance setting does not have any effect on running services.

  • Disruptive: Changes that impact running services. For example, changing instance types requires the instances to reboot and makes them unavailable during the reboot.

  • Replacement: Changes that terminate and redeploy a resource. For example, changing the AMI used by your instances terminates the instances running the existing AMI and deploys new ones from the new AMI.

Using change sets is a good way to maintain your templates in accordance with the infrastructure as code approach while also giving you the ability to easily determine the impact of the changes to the running stack. Using change sets is the recommended approach to versioning and delivering CloudFormation updates to your application.
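
As a minimal sketch of the change set workflow in the CLI, assuming an updated template saved as t3-updated.json (a hypothetical file name), you could create, review, and then execute a change set against the running stack:

aws cloudformation create-change-set --stack-name everyonelovesaws \
    --change-set-name update-instance-type \
    --template-body file://t3-updated.json

aws cloudformation describe-change-set --stack-name everyonelovesaws \
    --change-set-name update-instance-type

aws cloudformation execute-change-set --stack-name everyonelovesaws \
    --change-set-name update-instance-type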

Exam Preparation Tasks

To prepare for the exam, use this section to review the topics covered and the key aspects that will allow you to gain the knowledge required to pass the exam. To gain the necessary knowledge, complete the exercises, examples, and questions in this section in combination with Chapter 9, “Final Preparation,” and the exam simulation questions in the Pearson Test Prep Software Online.

Review All Key Topics

Review the most important topics in this chapter, noted with the Key Topics icon in the outer margin of the page. Table 3-3 lists these key topics and the page number on which each is found.


Table 3-3 Key Topics for Chapter 3

Key Topic Element | Description | Page Number
Foundation topic | Computing Basics | 65
Figure 3-3 | Comparison of latencies in computing | 68
Table 3-2 | CIDR range examples | 71
Tutorial | Creating a VPC | 72
Section | Connecting a VPC to the Internet | 72
Example 3-2 | Creating an EC2 instance with Node.js | 80
Example 3-3 | Creating an EC2 instance with Python 2.7 using a Lambda function | 82
Example 3-4 | Lambda function IAM policy to create an EC2 instance | 82
Tutorial | Creating an ECS cluster | 85
Tutorial | Creating a task definition | 86
Example 3-6 | Deploying a task in .NET | 87
Tutorial | Installing the EB CLI on Amazon Linux and deploying a sample Elastic Beanstalk app | 99
Section | AWS CloudFormation | 101
Example 3-7 | CloudFormation template used in the CF tutorial | 103
Tutorial | Deploying the CF template in the AWS CLI | 104

Define Key Terms

Define the following key terms from this chapter and check your answers in the glossary:

EC2

ECS

Lambda

VPC

subnet

security group

NACL

HA

SLA

RPO

RTO

YAML

JSON

Q&A

The answers to these questions appear in Appendix A. For more practice with exam format questions, use the Pearson Test Prep Software Online.

1. What approach lays the underlying foundation for high availability of EC2 instances within a region?

2. When an IPv4 instance needs a connection to the Internet, what AWS resource options are available for private and public subnets?

3. In ECS, what is a task definition, and what does it specify?

4. To deploy an EC2 instance in an automated manner, what tools could you use?

5. When a network with more than 500 instances needs to be created, which CIDR range would be recommended?

6. Which service would allow you to make an application highly available across two regions by adding a DNS health check?

7. Name the three types of ELB load balancers.

8. Complete the sentence: An EC2 or EBS snapshot is ______ by nature, meaning it consumes space that is equal to ___________.

9. You need to deliver a one-click solution that will be simple for your web developers to use when they are deploying new applications and changes to those applications. Which tool would you recommend?

10. What is the purpose of the Parameters section in CloudFormation?
