Chapter 3

Compute Services in AWS

This chapter covers the following subjects:

Networking in AWS: Having the ability to create your own networks within AWS allows you to build highly available and highly scalable applications. This section describes the VPC service, which is used to create, manage, and secure networks in AWS.

Computing in AWS: Delivering compute services on demand has always been one of the basic components of AWS. This section looks at the EC2 and ECS services and how you can use them to deliver flexible computing solutions.

Storing Persistent Data: One of the typical requirements of any application is the ability to store data persistently. This section looks at the EBS service, which provides block storage support for your compute instances.

Scalability and High Availability: This section explores how to design scalable and highly available solutions in EC2 by combining the functionality of the AWS Elastic Load Balancer, Auto Scaling, and Amazon Route 53 services.

Orchestration and Automation: The final part of this chapter takes a look at how to automate and orchestrate the creation and management of applications. This section covers the Elastic Beanstalk and AWS CloudFormation services.

This chapter covers content important to the following exam domains:

Domain 1: Deployment

  • 1.1 Deploy written code in AWS using existing CI/CD pipelines, processes, and patterns.

  • 1.2 Deploy applications using Elastic Beanstalk.

  • 1.3 Prepare the application deployment package to be deployed to AWS.

  • 1.4 Deploy serverless applications.

Domain 3: Development with AWS Services

  • 3.1 Write code for serverless applications.

  • 3.2 Translate functional requirements into application design.

  • 3.3 Implement application design into application code.

  • 3.4 Write code that interacts with AWS services by using APIs, SDKs, and AWS CLI.

One of the key aspects of cloud computing is the ability to consume compute services and design applications that are highly scalable, highly available, and highly resilient. This chapter discusses all aspects of EC2 and ECS services in AWS and how these services can be integrated with other AWS components.

“Do I Know This Already?” Quiz

The “Do I Know This Already?” quiz allows you to assess whether you should read the entire chapter. Table 3-1 lists the major headings in this chapter and the “Do I Know This Already?” quiz questions covering the material in those headings so you can assess your knowledge of these specific areas. The answers to the “Do I Know This Already?” quiz appear in Appendix A, “Answers to the ‘Do I Know This Already?’ Quizzes and Q&A Sections.”

Table 3-1 “Do I Know This Already?” Foundation Topics Section-to-Question Mapping

Foundation Topics Section             Questions
Networking in AWS                     1, 6
Computing in AWS                      2, 10
Storing Persistent Data               3, 8
Scalability and High Availability     4, 7
Orchestration and Automation          5, 9

Caution

The goal of self-assessment is to gauge your mastery of the topics in this chapter. If you do not know the answer to a question or are only partially sure of the answer, you should mark that question as wrong for purposes of the self-assessment. Giving yourself credit for an answer you correctly guess skews your self-assessment results and might provide you with a false sense of security.

1. VPC A is peered to VPC B. VPC B is peered to VPC C. You have set up routing in VPC A, which lists the VPC C subnet as a subnet of VPC B. You are trying to ping an instance in VPC C from VPC A, but you are not getting a response. Why?

  1. Transitive connections are not supported by VPC peering.

  2. Your security groups of the VPC C instance do not allow incoming pings.

  3. You need to create an NACL in VPC C that will allow pings from VPC A.

  4. The NAT service in VPC B is not configured correctly.

2. You are tasked with migrating an EC2 instance from one availability zone to another. Which approach would be the best to achieve full data consistency?

  1. Shut down the instance. Then restart the instance and select the new availability zone.

  2. Keep the instance running. Select Migrate to AZ in the instance actions and select the new availability zone.

  3. Shut down the instance, create a snapshot, start a new instance from the snapshot, and select the new availability zone.

  4. Keep the instance running. Create a snapshot with the no-shutdown option. Start a new instance from the snapshot and select the new availability zone.

3. You are required to select a storage location for your MySQL database server on an EC2 instance. What AWS service would be the most appropriate for this purpose?

  1. RDS

  2. EBS

  3. EFS

  4. S3

4. With ECS, what allows you to control high availability of a containerized application?

  1. Placement of ECS tasks across ECS instances

  2. Placement of ECS tasks into an ECS cluster

  3. Placement of ECS instances across regions

  4. Placement of ECS instances across availability zones

5. Which scripting languages are supported in a CloudFormation template? (Choose two.)

  1. YAML

  2. Ruby DSL

  3. JSON

  4. Python

6. To set up a route from an on-premises location to a VPC subnet through Direct Connect, which of the following do you need to use?

  1. RIPv2

  2. RIPv1

  3. Static routing

  4. BGP

7. To change the number of instances in an Auto Scaling group from 1 to 3, which count do you set to 3?

  1. Percentage

  2. Maximum instances

  3. Desired instances

  4. Running instances

8. To maximize IOPS in an EBS volume, which of the following would you need to select?

  1. Provisioned IOPS volume

  2. General purpose volume

  3. Disk-backed volume

  4. Dedicated IOPS volume

9. To automate the infrastructure deployment of a three-tier application, which of the following options could you use? (Choose all that apply.)

  1. CloudFormation

  2. CLI

  3. CloudTrail

  4. OpsWorks Stacks

10. Which of the following compute options would be best suited for a tiny 100 MB microservices platform that needs to run in response to a user action?

  1. Lambda

  2. EC2

  3. ECS

  4. EKS

Foundation Topics


Computing Basics

Before diving into specifics, you need to understand some basics and requirements of modern applications and the infrastructure components that they rely on to deliver functionality. This chapter puts a lot of focus on Infrastructure as a Service (IaaS) as the underlying technology.

Note

Chapter 1, “Overview of AWS,” looks more closely at Infrastructure as a Service (IaaS).

For an application to be able to provide any kind of service, it needs to be able to connect to clients. The ability to communicate in a unified manner on a network is thus a core requirement for any application, regardless of the environment in which it resides.

Modern networking requirements are typically divided into two categories:

  • Local area networks (LANs): These are private networks that allow communication only within a certain limited set of network addresses (usually) within one organization.

  • Wide area networks (WANs): These are either private or public networks that are designed to allow communication at a distance with multiple parties. When these networks are public, the term WAN is usually replaced with the Internet.

The industry standard network protocol used in all of these networks is Internet Protocol (IP). Internet Protocol comes in two versions: version 4 (IPv4) and version 6 (IPv6). The two protocol versions can be used simultaneously to allow for a transition from the older IPv4 to the newer IPv6. The main difference between these protocols is the number of addresses that can be used in each:

  • The IPv4 protocol has a 32-bit addressing field, which means the maximum number of potential IP addresses is limited to about 4.3 billion. The pool of public IPv4 addresses across the Internet has been virtually exhausted.

  • The IPv6 protocol has a 128-bit addressing field, which means the maximum number of potential IP addresses is approximately 340 undecillion, or 340 billion-billion-billion-billion addresses. That should last humanity for a while.

The other major difference between the two protocol versions is that IPv4 has several clearly defined private address ranges that are not routable on the Internet, as defined in IETF RFC 1918. This RFC was published in 1996 to mitigate the growing shortage of public IP addresses and extend the number of devices that can connect to the Internet. The largest RFC 1918 private network range can accommodate up to 16 million network-attached devices; this means that behind each public IPv4 address, you could potentially host one such private network range.

The private IP address ranges are not routable on their own and are required to be connected to the Internet via an intermediary device that performs Network Address Translation (NAT). A NAT device allows for the traffic from the private network to be passed to the Internet and uses the public IP address of the NAT device for all the devices in the private network. Figure 3-1 illustrates the operation of a NAT device that connects all clients in a private network to the public IP space. As demonstrated, the NAT device has one public IP address and allows all private computers to connect to the Internet through that one IP address.


Figure 3-1 IPv4 and NAT

The IPv6 approach is a bit different as IPv6 has no clearly defined private range. Essentially, all the IPv6 addresses are public, and they are all instantly routable when connected to the Internet. You need to understand this fact because implementing IPv6 addressing could pose a completely different challenge to the way you access the Internet and secure your application.

Devices that are connected to the network are divided into clients and servers. A client is any device creating a request on a system, and the system responding is a server since it is serving the content. There are intermediary devices between the clients and servers, including gateways, routers, load balancers, and other network equipment.

A client and the server need to agree on an application protocol. There are many different protocols in use on the Internet; the most common are Hypertext Transfer Protocol (HTTP) and its secure version, HTTPS. These protocols run on top of the Transmission Control Protocol (TCP), which itself runs over Internet Protocol; the combination is denoted TCP/IP. TCP/IP gives the client and server the ability to determine which packets have been delivered over the network and which packets need to be re-sent. This allows the client to make only one request and the server to respond to that request in chunks, which is perfect for what HTTP was designed for: delivering web pages. This functionality means that the application essentially works in an asynchronous manner, taking requests and sending responses in no particular order. Figure 3-2 illustrates the client/server approach in HTTP. The client requests a page, and the server either responds with that page or returns an error code.


Figure 3-2 HTTP Client/Server Architecture

The benefit of asynchronous communication is that the server can perform many tasks at the same time and respond to the clients when the response is ready. This means the server can serve multiple clients at the same time as the next client is not required to wait for the previous client to get a response.

Synchronous communication protocols work differently from asynchronous protocols. For example, some database query languages use synchronous communication and are severely limited in the number of concurrent connections they can process by the number of threads that the central processing units (CPUs) can handle at once.

An application can perform numerous different tasks, but the core functionality of any application depends on the compute capacity and the CPU's ability to process requests. A request for processing is also referred to as a thread. A thread is essentially a set of commands, together with the data being processed, that is sent to the CPU. A modern CPU usually has a buffer that lines up threads for processing and a cache that can remember commonly used instructions and data. The cache can be extended into several layers, each containing more memory but offering slower performance. For example, a modern CPU in 2019 had

  • 1 to 2 MB of Level 1 (or L1) cache that operates at the speed of the CPU and stores the most frequently used instructions and the most frequently used data.

  • 2 to 8 MB of L2 cache that is a bit slower, perhaps working at half the CPU speed, but that can hold many more commonly used instructions and much more data.

  • 8 to 64 MB of L3 cache operating at the slowest speeds that allows the CPU to remember much larger chunks of data and therefore speeds up operations.

When the data or the instruction being requested is found in the cache, the CPU can compute the task much faster. This is called a cache hit. The opposite is true when there is a cache miss: The data needs to be retrieved from the working memory or disk. Every year the random access memory (RAM) modules and disk systems get faster, and the cache on the CPU grows; together these changes add up to faster performance and lower latencies for your applications.

At the end of the caching chain is the disk. The disk is a representation of where you store your data in your compute system. While the name disk comes from old-school magnetic disks, the reality today is that most “disks” are solid-state drives (SSDs). The SSD has revolutionized the performance of the storage subsystems as the ability to access data in a random manner and the throughput of the devices has dramatically increased.

Even though SSDs are impressive, they are not even close to being the pinnacle of computing storage. Currently we are seeing improvements in the way data is stored with non-volatile memory devices being introduced both at the disk level and at the RAM level. The Non-Volatile Memory Express (NVMe) standard has brought us disks that now surpass SSD performance by multiple times. On top of that, Intel has recently introduced its Optane technology, with non-volatile modules that can be used as standard RAM modules. These modules are now matching the performance of memory while providing the storage capacities of NVMe drives. This technology is currently being tested by developers from all over the world so we can better understand how to take advantage of such an environment. An Optane module is a device that you start, that has no start time, that processes every request at the performance of typical memory modules, and that can be turned off at any time without “saving” anything to disk—and every zero and one is retained when the power is unplugged. This technology will revolutionize the way we store and process data and even use devices in years to come. This type of technology will surely enable the rise of the real-time operating system (RTOS). Figure 3-3 illustrates different levels of latencies in computing.


Figure 3-3 Comparison of Latencies in Computing

But for now you still need to consider the current technologies, which are not very well suited to disruptions. With current technologies, all of your data is still being processed in volatile memory, which will be completely lost when the application loses power. In addition, any disruption in network availability and the availability of services your application relies on can degrade the state of an application. You therefore need to build your applications to be highly available, highly resilient to disruption, and highly scalable so that they can take on any load you can give them.

The storage of data is the biggest obstacle to making sure an application can withstand any disruption and ensuring that it can always deliver the freshest data to clients. For example, many of us remember the days when we used to use hardware servers to run applications. We did it because it was the only option around. We had no cloud computing, no IaaS. We used to buy a box of metal that had some shiny bits inside it, install an operating system, connect the device to the network, and then install the application so that it would perform the task we needed it to. Sooner or later, a power outage, a network switch failing, or our colleague Dan from the server room team tripping over the extension cord powering the server caused an outage. Janice from accounting would usually be ranting about the outage, but all we could say is, “Sorry, can’t do anything about it. The server is down.” We knew we could get the server back up and running, we knew the data was still (mostly) present on the disks, and we were (mostly) able to get away with it.

But why did we all get so used to the phrase “The server is down” and put up with it? We always had the option to replicate the data, we always had the option to synchronize the state of the server to another server, and we always had the ability to make applications highly available. What was stopping us? It was either the complexity or the cost or the lack of skills or all of the above.

And then along came the cloud, and everything changed. All of our servers became ephemeral instances of read-only images. All our data being stored within those instances would not survive, we were told. We needed to start thinking about how to get this data off the instances, how to get the logs off, how to get the session state out of the scope of one operating system of one single instance. Basically, we were on a quest to make the instances “stateless.” With stateful servers, a user session will be lost if the server is lost. Figure 3-4 illustrates how stateless systems can share a data store that records the client sessions. In case of a failure, scaling, or other change in the server tier, the client can be reconnected to another server because the stateless systems share the data store.


Figure 3-4 Stateful Versus Stateless Design

As you can see in Figure 3-4, the stateless design has quite a few benefits over the stateful design. When implementing a stateful design, each user is tied to a specific server that contains the state of that user’s session. In the example in Figure 3-4, user A is talking to server X, which means servers Y and Z do not have any information of the state of the session for user A. The opposite is true for users B and C: The state of user B is recorded on server Z, and the state of user C is recorded on server Y.

With a stateless design, users A, B, and C can access any of the servers, and their state and data will be retrievable because all the state information is stored on a shared service outside the servers themselves; in the example in Figure 3-4, this is represented by an RDS database service.

Running in a cloud environment poses unique challenges and benefits, as discussed in Chapter 1. The stateless approach to building processing units has brought the ability to have more than one server serve the content and draw it from a back-end source that can be available to all instances. A “problem” actually turned out to be the solution to our high availability, reliability, and scalability issues. In addition, we gained the ability to automate environments much more easily. Having the components that process the application requests be stateless and having the data stored in a universally reachable back end allowed us to start architecting solutions that would be able to automatically respond to incoming requests by increasing the number of instances responding to those requests. Being able to orchestrate the deployment, scaling, and decommissioning of the application in an automated manner is a crucial benefit of the cloud.

Networking in AWS

One of the ways to deliver applications from the cloud is through Infrastructure as a Service (IaaS). The core requirement for any IaaS environment is the ability to control the network environment and connectivity so that you can expose the application to the Internet and connect to other private networks. However, you also need to be able to control the security aspects of any application running on an IaaS platform.

AWS provides several different networking-related services that allow you to design and deliver secure, highly available, and reliable applications from the cloud. The following are the most important networking tools available in AWS:

  • Amazon Virtual Private Cloud (VPC): A service for creating logically isolated networks in the cloud

  • VPC network ACLs and security groups: Tools for securing network and instance access in VPC

  • AWS Direct Connect and VPN gateways: Tools for connecting your on-premises networks with AWS

  • Amazon Route 53: A next-generation DNS service with an innovative API that allows for programmatic access to the DNS services

  • Amazon CloudFront: A dynamic caching and CDN service in the AWS cloud

  • Amazon Elastic Load Balancing (ELB): Load balancing as a service in the AWS cloud

  • Amazon Web Application Firewall (WAF): A tool that protects web applications from external attacks using exploits and security vulnerabilities

  • AWS Shield: An AWS managed DDoS service

Amazon Virtual Private Cloud (VPC)

The Virtual Private Cloud (VPC) service in AWS gives you the ability to arbitrarily define your private network environment. The VPC service essentially gives you complete control over the network configuration, including the routing, the security, and the access to that network.

With every account, a default VPC and default subnets are created in each region. The default VPC is a prebuilt solution that you can easily deploy without having to manage the VPC; however, I recommend using the default VPC only for learning, testing, and proof-of-concept deployments. In all other cases, it is recommended that you configure your own VPC(s).

When configuring a VPC, you need to define a network address. The network address can then be further segmented into subnets. You define the addresses by using Classless Inter-Domain Routing (CIDR) notation. With CIDR notation, each address is composed of two groups of address bits: the static bits that represent the network and the dynamic bits that represent the host. To define the number of bits used in a network address, use a / (slash) with a number. For example, consider the IP address 192.168.0.0/24. An IPv4 address has 32 bits, and each decimal number between 0 and 255 represents 8 bits of the address; the bits are separated by dots for readability. For an address where the first 24 bits (192.168.0) are fixed and represent the network address, you use the CIDR notation /24. The remaining 8 bits are dynamic and represent the host addresses. Because each bit can be 0 or 1, those 8 bits give you 2^8 = 256 available addresses in the range 0 to 255. The first address (192.168.0.0) is the address of the network, and the last address (192.168.0.255) is the broadcast address. Packets sent to the broadcast address will in turn be sent to all hosts in the network. The number of usable addresses is thus 254. In AWS, three additional addresses are reserved for the AWS services, so the actual number of usable addresses for your hosts is 251.

Table 3-2 provides some examples of network ranges in CIDR format and their characteristics.


Table 3-2 CIDR Range Examples

CIDR                 Host Addresses                Broadcast Address   Number of Hosts             Private or Public?
10.0.0.0/8           10.0.0.1–10.255.255.254       10.255.255.255      16,777,214                  Private (RFC 1918)
172.24.0.0/16        172.24.0.1–172.24.255.254     172.24.255.255      65,534                      Private (RFC 1918)
0.0.0.0/0            0.0.0.1–255.255.255.254       255.255.255.255     Approximately 4.3 billion   Public (all addresses)
54.219.0.0/22        54.219.0.1–54.219.3.254       54.219.3.255        1,022                       Public
18.176.17.16/28      18.176.17.17–18.176.17.30     18.176.17.31        14                          Public
52.219.84.155/32     52.219.84.155                 52.219.84.155       1                           Public

As you can see, CIDR notation allows you to define a network as any range of IP addresses from one single host to the whole Internet. You also use CIDR notation for security rules, so it is crucial that you understand the notation format.


To create a VPC, you can use the aws ec2 create-vpc command. The following example creates a VPC with the CIDR address 192.168.100.0/24:

aws ec2 create-vpc --cidr-block 192.168.100.0/24

The output of this command looks like the output shown in Example 3-1.

Example 3-1 Output from the aws ec2 create-vpc Command

{
     "Vpc": {
          "VpcId": "vpc-abcdef0123456789",                            
          "InstanceTenancy": "default",
          "Tags": [],
          "CidrBlockAssociationSet": [
              {
                  "AssociationId": "vpc-cidr-assoc-123abc456def",
                  "CidrBlock": "192.168.100.0/24",
                  "CidrBlockState": {
                      "State": "associated"
                  }
              }
          ],
          "Ipv6CidrBlockAssociationSet": [],
          "State": "pending",
          "DhcpOptionsId": "dopt-1234abcd",
          "OwnerId": "123123123123",
          "CidrBlock": "192.168.100.0/24",
          "IsDefault": false
     }
}

The important part of the output is the shaded VpcId information, which you need to know to configure other features.

Connecting a VPC to the Internet


When creating VPC networks and subnets, you typically use IPv4 private subnets. These subnets are usually defined as either private or public. It might sound confusing, but both private and public subnets in AWS typically use network ranges from the RFC 1918 private assignments. The only difference between a public subnet and a private subnet is that a public subnet has an Internet gateway (IGW) attached to it. The IGW allows the instances running in the public subnet to access the Internet and gives you the ability to either automatically assign public IP addresses or attach previously allocated Elastic IP addresses to the instances.

To create a public subnet, you need to create a subnet, create an IGW, and attach the IGW to your VPC. Then you create a new routing table, add to the newly created routing table the route to the IGW, and then associate the routing table with the subnet. Let’s look at an example.

To create a subnet in the VPC, you use the aws ec2 create-subnet command and specify the VPC ID and the CIDR address. This example uses the first half of the 192.168.100.0/24 CIDR address for the public subnet, which is represented by the CIDR address 192.168.100.0/25:

aws ec2 create-subnet --vpc-id vpc-abcdef0123456789 --cidr-block
192.168.100.0/25

Remember to record the subnet ID from the output, as you will need it later on. Next, you need to create an Internet gateway:

aws ec2 create-internet-gateway

Use the IGW ID in the output to attach the IGW to the VPC with the aws ec2 attach-internet-gateway command:

aws ec2 attach-internet-gateway --internet-gateway-id
igw-01234ab3456ef0123 --vpc-id vpc-abcdef0123456789

Now you need to take care of the routing. First, you create the routing table in the VPC:

aws ec2 create-route-table --vpc-id vpc-abcdef0123456789

Be sure to record the routing table ID so that you can create a default route to the Internet (0.0.0.0/0) in the new routing table:

aws ec2 create-route --route-table-id rtb-00112233aabbccdd
--destination-cidr-block 0.0.0.0/0 --gateway-id igw-01234ab3456ef0123

This command should return a true response if it completes successfully. The last step is to associate the newly created routing table to the public subnet:

aws ec2 associate-route-table --route-table-id rtb-00112233aabbccdd
--subnet-id subnet-1234abcd0987fe

Note

While you are allowed to define public IPv4 network ranges in a VPC, these ranges will be routable only when advanced hybrid scenarios with BGP routing are in place. You should always define private ranges when you want to route the traffic through AWS routers.

When a VPC subnet is configured to automatically assign public IP addresses, each instance receives an address from an AWS-owned pool of public addresses whenever it is started. Any kind of disruption of the instance other than a reboot will cause the public IP address assignment to be released. This means that even when you keep the state of the instance on persistent storage and shut it down, the public IP address will be disassociated.

If you need to be able to retain the IP address in your application, you can choose to attach an Elastic IP address to your instance. These addresses are assigned to your AWS account and are persistent regardless of the state of instances. This gives you the ability to maintain the same IP address, regardless of the state of the instance. It also allows you to detach and reattach the Elastic IP address to another instance, which is very useful in case of a failure of the old instance, for example.
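
As a minimal sketch of this workflow (the instance ID and allocation ID shown are placeholders), you can allocate an Elastic IP address and associate it with a running instance from the CLI:

aws ec2 allocate-address --domain vpc

aws ec2 associate-address --instance-id i-0123456789abcdef0 --allocation-id eipalloc-11112222aaaa

If that instance later fails, running aws ec2 associate-address again with the ID of the replacement instance moves the address over to the new instance.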

An attachment of an address is in reality just a logical mapping between the public or Elastic IP address and the instance to which it is attached. The address is not visibly assigned to any network adapter, as all AWS-owned public addresses are attached only to the IGW. A DNAT or 1:1 NAT rule then maps the public IP address to the private IP address of the instance. Because the public address is not visible from within the instance, you need to examine the instance metadata to retrieve it. You can do this by simply browsing or sending a curl command to the following address:

http://169.254.169.254/latest/meta-data/public-ipv4

On the other hand, a private subnet is defined by having no access from the Internet. The private subnet can be connected to an on-premises environment through a Direct Connect connection or a virtual private gateway (VGW) via an IPsec VPN. You can also allow access from the private network to the Internet by implementing a NAT gateway or NAT instance to perform NAT for outgoing traffic only.

To create a private subnet in the second half of the VPC CIDR address, represented by the CIDR address 192.168.100.128/25, you can just create a second subnet in the VPC by using the following command:

aws ec2 create-subnet --vpc-id vpc-abcdef0123456789 --cidr-block
192.168.100.128/25

While you are always able to create your own NAT instance and run any kind of NAT software on it, AWS offers you the ability to use a NAT gateway, which is a very effective solution and has the following characteristics:

  • Scales in performance from 5 Gbps up to 45 Gbps

  • Supports up to 55,000 simultaneous connections

  • Consumes one of your Elastic IP addresses and an automatically assigned private IP address in the public subnet

  • Does not support IPv6 traffic

Note

AWS strongly recommends not using NAT instances unless absolutely necessary (for example, when a custom or vendor device is specifically required in this role).

To create a NAT gateway, you need to first allocate a new Elastic IP address by running the following command:

aws ec2 allocate-address

Record the allocation ID for use in the NAT gateway creation command:

aws ec2 create-nat-gateway --subnet-id subnet-9876abcd0123
--allocation-id eipalloc-11112222aaaa
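
For the private subnet to actually send its outgoing traffic through the NAT gateway, the routing table associated with the private subnet also needs a default route that points at the gateway. The following sketch assumes the NAT gateway ID returned by the previous command and a routing table dedicated to the private subnet (both IDs are placeholders):

aws ec2 create-route --route-table-id rtb-4455667700aabbcc --destination-cidr-block 0.0.0.0/0 --nat-gateway-id nat-0123456789abcdef0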

In the case of IPv6, the definition of private and public subnets is technically the same, but because the IPv6 protocol is designed as a public address-oriented protocol, all your instances are immediately reachable from the Internet as soon as an Internet gateway is attached. If you would like to keep the IPv6 instances private but still allow them to access the Internet (for updates, patches, other services, and so on), you need to use an egress-only Internet gateway. The egress-only Internet gateway automatically blocks any incoming requests to IPv6 addresses in the address range behind the gateway while still allowing outbound connections.
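
A brief sketch of this setup follows; the VPC and routing table IDs are placeholders, and the commands assume that an IPv6 CIDR block is already associated with the VPC:

aws ec2 create-egress-only-internet-gateway --vpc-id vpc-abcdef0123456789

aws ec2 create-route --route-table-id rtb-4455667700aabbcc --destination-ipv6-cidr-block ::/0 --egress-only-internet-gateway-id eigw-0123456789abcdef0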

Connecting the VPC to Other Private Networks

Not all applications are connected to the Internet. Your application might require a connection to private resources in your on-premises site. Or you might need to connect multiple private networks together and have them communicate to each other to make the application work. AWS provides several different services to achieve the right type of connection and cover all your application needs.

As previously mentioned, you can use Direct Connect and a virtual private gateway (VGW) to connect your VPC to your on-premises network. AWS Direct Connect provides the ability to establish a dedicated low-latency private network connection between your on-premises environment and AWS. Direct Connect uses dedicated 1 Gbps or 10 Gbps optical fibers that can be partitioned into multiple virtual interfaces. The virtual interfaces provide the connectivity to both public and private AWS resources and enable you to encrypt the link with a VPN connection on top of Direct Connect.

There are also several VPN services available from AWS. For example, you can create a virtual private gateway (VGW) to establish a site-to-site IPsec VPN across the public Internet between your on-premises environment and AWS VPC resources. The VPN is a more economical solution, but it has higher latency and lower throughput (up to 1.25 Gbps) than a Direct Connect solution.

Other VPN services from AWS include the AWS Client VPN service, which allows you to connect your client devices to the cloud, and the AWS VPN CloudHub feature, which allows you to interconnect multiple sites with a VPN and manage the connectivity centrally.

The VPN can also be used as a backup connection to the Direct Connect link to provide services in case the primary link suffers a failure. Traffic is automatically prioritized over the Direct Connect link whenever that connection is available and fails over to the VPN when the Direct Connect link is down. Figure 3-5 illustrates a Direct Connect link with a backup VPN connection. This solution provides performance when the Direct Connect link is available, and it also provides cost-effective redundancy.


Figure 3-5 A VPC Connected via a Direct Connect Connection with a Backup VPN

When private connections between VPCs are required, you can use VPC peering, which allows you to connect completely private VPC subnets to other VPC subnets, regardless of their location. This means you can arbitrarily connect resources in multiple regions via private addresses. The only real limitations of VPC peering are the requirement of non-overlapping IP address ranges for each peered VPC and the inability of VPC peering connections to carry transitive traffic. For example, when VPC A is connected to VPC B, which in turn is connected to VPC C, only communication with the direct peer is allowed. In the example in Figure 3-6, VPC A cannot communicate with VPC C because transitive peering is not supported.


Figure 3-6 Transitive Traffic in VPC Peering Is Not Supported
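
As a sketch of how a single peering connection is established (all of the IDs and the destination CIDR block are placeholders), you request the peering from one VPC, accept it in the other, and then add a route on each side pointing at the peering connection:

aws ec2 create-vpc-peering-connection --vpc-id vpc-aaaa1111 --peer-vpc-id vpc-bbbb2222

aws ec2 accept-vpc-peering-connection --vpc-peering-connection-id pcx-0123456789abcdef0

aws ec2 create-route --route-table-id rtb-4455667700aabbcc --destination-cidr-block 10.20.0.0/16 --vpc-peering-connection-id pcx-0123456789abcdef0

For a cross-region peering, you would add the --peer-region option to the first command and accept the connection in the peer region.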

You also have the ability to connect private networks to other AWS services that normally respond on public addresses. For example, the S3 service always responds on a public IP address, regardless of whether the request is coming from within AWS or from the Internet. If your instances do not have access to the Internet, they aren't able to use S3. This can be mitigated by using VPC endpoints. VPC endpoints allow you to connect other AWS services directly to your subnets so that you can pass traffic to the AWS service via the private network. The S3 and DynamoDB services are connected via gateway endpoints, which add a route for the service to the routing tables of your subnets, whereas other AWS services and supported AWS Marketplace solutions are connected via an interface endpoint. An interface endpoint is supported by the AWS PrivateLink service and allows you to connect to the service through an elastic network interface that is attached straight into one of your VPC subnets and has an IP address from that subnet range.
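
For example, a gateway endpoint for S3 can be created and associated with a subnet routing table as follows; the VPC and routing table IDs are placeholders, and the service name assumes the us-east-2 region:

aws ec2 create-vpc-endpoint --vpc-id vpc-abcdef0123456789 --service-name com.amazonaws.us-east-2.s3 --route-table-ids rtb-4455667700aabbcc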

Computing in AWS

Once you have configured your network, you can start your compute instances and connect them to each other, the Internet, other private networks, and, of course, AWS services. AWS essentially has three services that are designed for general computing needs:

  • Amazon Elastic Compute Cloud (EC2): An AWS curated service that gives you the ability to run virtual machines (called instances) with the Linux and Windows operating systems

  • Amazon Elastic Container Service (ECS): An orchestration service that gives you the ability to run containerized applications (called tasks) and manage the resources that support the running of containers

  • AWS Lambda: An AWS managed service that gives you the ability to run single snippets of code (called functions) to perform serverless processing for event-based applications

AWS provides a lot of flexibility when it comes to compute options. The services provided cover basically every need for every application out there and allow you to choose the right service for the right application or solution.

The EC2 service is a solution geared toward the typical consumer of IaaS and is designed to allow for complete control of the operating system and its configuration. In the shared security model, EC2 lays the most responsibility on the consumer.

With ECS, you can run tasks several ways:

  • Build your own: You can build your own container engine EC2 instances and register them to the ECS cluster. You can run tasks on top of the container engine. This approach is the most flexible but requires the most management.

  • Use ECS as the orchestrator: ECS can dynamically create and manage the EC2 instances on which to run tasks. This still gives you a lot of control but lets the ECS service take over EC2 instance orchestration.

  • Use the Fargate service: When you run tasks on the Fargate service, you are not required to run any container engines on EC2 as the Fargate service automatically provisions the infrastructure on which to run tasks. This service is fully managed and requires the least effort.

With AWS Lambda, you just need to bring your code. You never need to manage any infrastructure, and there will never be any idle compute power that you need to pay for. You simply pay per request and for the time and memory required to process the request. Lambda uses a simple approach that can be very cost-effective.

Note

Chapter 5, “Going Serverless in AWS,” discusses Lambda in more detail.

Having the flexibility to consume all these different types of compute resources in an untethered on-demand fashion is one of the biggest benefits of the AWS cloud.

Amazon EC2

As mentioned earlier, the EC2 service gives you the most flexibility. This is the crucial benefit of the EC2 service compared to other AWS solutions. You have the ability to choose:

  • The instance type: The instance type determines the amount of CPU and memory (and in some cases the size and number of disks) that will be attached to the instance.

  • The price: You can choose from several pricing models to get the most bang for your buck.

  • The operating system: You can use Linux or Windows as the operating system, and a massive number of EC2 appliances are available in the AWS Marketplace.

  • Network connectivity: You can connect the instance to one or many VPC subnets by using an elastic network interface.

  • The way your instances are started: With user data scripts, you can customize an instance every time it is started.

The instance type determines the configuration of the instance. Each instance type has a certain name that defines the instance family, the generation, and the size of the instance. For example, the m5.large instance belongs to the M type family, is the fifth generation, and is a large size (2 CPUs and 8 GB RAM).
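
If you want to check these characteristics yourself, the CLI can report the CPU and memory configuration of any instance type; the query expression shown here is just one way of trimming the output:

aws ec2 describe-instance-types --instance-types m5.large --query "InstanceTypes[0].{vCPUs:VCpuInfo.DefaultVCpus,MemoryMiB:MemoryInfo.SizeInMiB}"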

Several families of instances in AWS have characteristics that make them suitable for special purposes:

  • General purpose: M type instances are designed for stable day-to-day tasks, and T type, burstable instances are designed for spiky and unstable tasks. T types have a certain baseline of operation but are allowed to burst above the baseline for short periods of time. Whenever bursting, T types consume burst credits. Once the credits are consumed, the instances are throttled. The newest generation (t3) introduced unlimited burstable instances, which can continue bursting above the baseline after their credits are consumed, with the additional usage billed instead of throttled. The third type is A type instances, which are low-cost ARM-based instances.

  • Compute oriented: C type instances are designed for compute-intensive tasks. They have double the CPU density per amount of RAM compared to M type instances (for example, c5.large has 2 CPUs and 4 GB RAM).

  • Memory oriented: R type instances have double the memory compared to M type instances (for example, r5.large has 2 CPUs and 16 GB RAM). R type instances are intended for in-memory databases and similar applications. For extreme requirements, X type instances provide capacity of up to 128 CPUs and 2 TB of RAM.

  • Accelerated compute instances: G type and P type instances have GPU acceleration enabled, and F type instances have programmable FPGAs.

  • Disk oriented: H type (high throughput), I type (high IOPS), and D type (disk space) instances are designed for disk-intensive operations.

For each type, you can select the operating system by choosing an Amazon Machine Image (AMI), which contains all the components needed to run the instance:

  • Default block device mappings and boot sectors

  • The operating system and any additional applications

  • Cloud-init components to configure your instance at launch

  • Launch permissions that control who can run the instance

For each instance type you run, you can choose from several different pricing models to use the resources in the most economical way possible:

  • On-demand instances

  • Reserved instances

  • Spot instances

  • Dedicated instances

  • Dedicated hosts

On-demand instances are very well suited for any kind of random task for which you have no timeline. You can consume them when required and as long as required. There is no commitment, and no reservation is required. You simply log in to AWS and use them. On-demand instances excel at

  • Processing event-based tasks and message queue content

  • Running development and proofs of concept

  • Covering sudden and temporary capacity requirements in EC2 clusters

  • Longer-running event-based tasks

On-demand instances are flexible, but that flexibility comes at a cost disadvantage. For tasks that run every day or on a scheduled basis, you should use reserved instances (RIs), which are reserved for a one- or three-year term. There are several reserved instance types available in AWS:

  • Standard RI: Any kind of day-to-day or long-running tasks on one exact type of instance. Delivers up to 75% savings compared to an on-demand instance.

  • Convertible RI: Any kind of day-to-day or long-running tasks where the instance type is projected to change or grow (only the same size and bigger are supported). Delivers up to 54% savings compared to an on-demand instance.

  • Scheduled RI: For scheduled tasks such as monthly billing.

When deploying reserved instances, you need to match the region or availability zone of your instances to the reservation to be able to take advantage of RI pricing. A typical error clients make is buying an RI in one region and then migrating the application to another region. This means that the RI pricing no longer applies, and the AWS monthly bill is usually where the client finds this out. In such a case, the client is able to use the Reserved Instance Marketplace, where reservations already purchased by AWS consumers can be traded when no longer needed. This gives AWS customers the option to buy even more flexible reservations.
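
Before committing to a reservation, you can browse the available offerings, including those listed on the Reserved Instance Marketplace, directly from the CLI; the filters used here are only an example:

aws ec2 describe-reserved-instances-offerings --instance-type m5.large --offering-class standard --product-description "Linux/UNIX" --include-marketplace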

One of the biggest benefits of AWS is that you can bid on the spare capacity in AWS that is intended for new clients or existing clients to scale out into. Instances launched on this capacity are called spot instances, and they come at a discount of up to 90%. The only caveat is that when this capacity is requested by any on-demand client, spot instances need to give up their resources. AWS gives you a 2-minute warning before interrupting your instance, so you have time to commit any information to stateful storage. AWS also lets you control the interruption behavior, giving you the option to terminate, stop, or hibernate the instance. If you choose to hibernate, the instance retains its memory content and continues processing when the spot price falls below your bid again. The spot instance interruption message is available from the instance metadata at http://169.254.169.254/latest/meta-data/spot/instance-action.
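
A minimal sketch of requesting a spot instance through the CLI follows; the bid price is a placeholder, and specification.json is a hypothetical file containing the same kind of parameters (AMI ID, instance type, key pair, subnet) used in the earlier launch examples:

aws ec2 request-spot-instances --spot-price "0.05" --instance-count 1 --type "one-time" --launch-specification file://specification.json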

In some cases you might need a completely secure and isolated environment in the cloud. In such a case, you can choose to use dedicated instances and dedicated hosts. Both of these will run on isolated and dedicated EC2 hardware. You would use these two types for compliance and governance reasons or to comply with any laws governing your computing.

To control the operating system configuration, you can choose from quite a selection of AMIs provided by Amazon. You can also find AMIs on the AWS Marketplace and launch them directly into your VPC. You can also create your own AMIs by launching a new instance, customizing it, and then saving it as a custom AMI.

An AMI is regionally bound and gets a region-specific image ID. You can launch an AMI in any availability zone within its region. You can also copy any AMI to another region, if required; the copy in the new region receives a new AMI ID. You can also deregister an AMI to essentially delete the image from AWS. It is a good practice to deregister any old custom AMIs, as AWS charges you by the gigabyte of storage consumed by the custom AMIs on S3.
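
The AMI lifecycle described here maps to a handful of CLI calls. In the following sketch the instance ID and image IDs are placeholders, and the copy is made from us-east-2 to us-west-2 purely as an example:

aws ec2 create-image --instance-id i-0123456789abcdef0 --name "my-custom-ami"

aws ec2 copy-image --source-region us-east-2 --source-image-id ami-0aaaabbbbcccc1111 --region us-west-2 --name "my-custom-ami-copy"

aws ec2 deregister-image --image-id ami-0aaaabbbbcccc1111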

Because EC2 is addressable via the API, you can integrate any kind of application to run an EC2 instance. To show how this works, Example 3-2 uses a node.js script and runs it via the AWS SDK.


Example 3-2 node.js Script That Creates an EC2 Instance

// set the AWS variable for the SDK
var AWS = require('aws-sdk');
// define the AWS region as us-east-2
AWS.config.update({region: 'us-east-2'});
// source the credentials from the .aws/credentials file in your home directory
var credentials = new AWS.SharedIniFileCredentials({profile: 'default'});
AWS.config.credentials = credentials;
// Create the EC2 service object
var ec2 = new AWS.EC2({apiVersion: '2016-11-15'});
// define the parameters of the instance to be launched
var instanceParams = {
   ImageId: 'ami-0d8f6eb4f641ef691',
   InstanceType: 't3.small',
   KeyName: 'mykeypair',
   MinCount: 1,
   MaxCount: 1
};
// create a promise for the EC2 runInstances call
var instancePromise = ec2.runInstances(instanceParams).promise();
// handle the response and any errors returned by the call
instancePromise.then(
  function(data) {
    console.log(data);
    var instanceId = data.Instances[0].InstanceId;
    console.log("Created instance", instanceId);
  }).catch(
  function(err) {
    console.error(err, err.stack);
  });

Save this file as ec2.js. To test whether the script will run, you can run the node ec2.js command, and an EC2 instance should start in your EC2 environment.

You can save the state of an instance by creating a snapshot. A snapshot of an instance is simply a point-in-time copy of the instance volume. Snapshots are incremental in nature, and each one stores only the blocks that have changed from the previous snapshot. Figure 3-7 illustrates the incremental nature of snapshots.


Figure 3-7 The Incremental Nature of Elastic Block Storage (EBS) Snapshots
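
Creating a snapshot is a single CLI call against the EBS volume that backs the instance; the volume ID here is a placeholder:

aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 --description "web server root volume snapshot"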

You can also create an AMI straight from the snapshot for supported operating systems. When you deploy an instance from a snapshot, you need to be aware that the volume is loaded lazily from the snapshot, so the application might experience slow performance on the first read. You can mitigate this by reading all the disk sectors and “warming up” the volume in the process.

Each instance is started with a primary network interface. This interface is designed to be attached only to this exact instance; otherwise, it has all the characteristics of an elastic network interface (ENI). You can create an ENI separately from an instance and then attach it to the instance as a secondary interface. This is useful when you need to connect an instance to several subnets (for example, for security devices, firewalls, or NAT instances). The secondary ENI gives the instance a direct Layer 2 connection to the subnet instead of routing through the default VPC router.

Because an ENI is created independently of an instance, you are also able to maintain the state of the network connection on the ENI independently of an instance’s state. This comes in handy when you need to use a fixed IP address or an unchangeable MAC address for licensing purposes. You can assign a static private IP address or an Elastic IP address to the ENI and then attach the device to the instance. The instance automatically inherits all these characteristics. When a MAC-dependent license is present in an instance, you should tie it to the ENI MAC. This way, you can recover the license even if this instance fails by simply reattaching the ENI to a new instance created from a snapshot or an AMI instance of the previous instance.
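
A short sketch of creating a secondary ENI with a fixed private address and attaching it to an existing instance follows (the subnet, ENI, and instance IDs are placeholders; the private IP address falls within the private subnet created earlier):

aws ec2 create-network-interface --subnet-id subnet-9876abcd0123 --description "secondary interface" --private-ip-address 192.168.100.140

aws ec2 attach-network-interface --network-interface-id eni-0123456789abcdef0 --instance-id i-0123456789abcdef0 --device-index 1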

You also have the ability to completely customize the startup process of your instance by modifying the startup commands in the operating system when the instance starts. You can add a user data script that can execute Linux bash or Windows command-line or PowerShell scripts upon start. The script can include any kind of common task associated with starting an operating system, such as updating the system upon start or creating users. The script is executed with the root or administrator account, so this gives you complete control when you launch instances, even if your account does not have the same level of permissions after the instance is created. The instance will become available only when the user data script completes.
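
As an illustration, assuming an Amazon Linux 2 AMI (the AMI ID, key pair, and subnet ID below are placeholders), a simple user data file could look like this:

#!/bin/bash
# contents of bootstrap.sh: update the system and install a web server at launch
yum update -y
yum install -y httpd
systemctl enable --now httpd

The instance can then be launched with the script attached as user data; the AWS CLI takes care of base64-encoding the file contents before passing them to the API:

aws ec2 run-instances --image-id ami-0d8f6eb4f641ef691 --instance-type t3.small --key-name mykeypair --subnet-id subnet-1234abcd0987fe --user-data file://bootstrap.sh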


Example 3-3 uses a Python 2.7 script to run an instance from a Lambda function. A script like this can be used to automate recovery from the loss of an instance or to add an instance to an environment that you would like to scale intelligently.

Example 3-3 Python Script That Creates an EC2 Instance

import boto3  # boto3 is the AWS SDK for Python
# define the EC2 client that we will use in the Lambda handler, with the region set to us-east-2
EC2 = boto3.client('ec2', region_name='us-east-2')

def lambda_handler(event, context):  # this is the Lambda handler
    instance = EC2.run_instances(  # the instance definition
        ImageId='ami-0d8f6eb4f641ef691',  # the image ID
        InstanceType='t2.micro',  # instance type
        MinCount=1,  # a minimum is always required by boto3
        MaxCount=1,  # a maximum would be useful for an Auto Scaling group
        InstanceInitiatedShutdownBehavior='terminate'  # the default shutdown behavior
    )
    # the instance ID is printed and returned in the Lambda output
    instance_id = instance['Instances'][0]['InstanceId']
    print instance_id
    return instance_id

To be able to run this Lambda function, you need to give it the appropriate permissions with an IAM policy like the one shown in Example 3-4.

Example 3-4 Lambda Security Policy Required to Run the Python Script

{
     "Version": "2012-10-17",
     "Statement": [
         {
             "Effect": "Allow",
             "Action": [
                 "logs:CreateLogGroup",
                 "logs:CreateLogStream",
                 "logs:PutLogEvents"
             ],
             "Resource": "arn:aws:logs:*:*:*"
         },
         {
             "Effect": "Allow",
             "Action": [
                 "ec2:RunInstances"
             ],
             "Resource": "*"
         }
     ]
}

This policy essentially allows the Lambda service to interact with the CloudWatch logs to record actions against it and also launch instances with the ec2:RunInstances action.

Amazon ECS and Fargate

Modern applications are becoming more modular in design, and that is influencing how you use compute resources both on premises and in the cloud. The idea behind the modular approach is to create small units of compute that perform one task really fast and efficiently and then scale those units horizontally by introducing more of them to a compute cluster or service layer. This way, your application can dynamically and very precisely use only the resources that are actually required.

While virtual machine instances are still very much a part of modern applications, more and more emphasis is being placed on containers. Containers allow you to perform computing at a much smaller scale than do typical virtual machines. Containers also make possible what virtualization was never really able to deliver: the ability to run an application on any platform seamlessly with no configuration changes to either the environment or the container.

So what are containers? Essentially, a container is divided into two parts:

  • Container image (or just image): This is the package that contains the application and all the prerequisites, dependencies, libraries, and other underlying components an application requires to run.

  • Container instance (or just container): This is an instance of the container image that has been executed on a container engine.

Containers can run on a very wide range of operating systems, including on bare-metal container engines and in serverless container environments. The only prerequisite is that the image being used must comply with a certain image standard. The image standard defines both the structure of the container image storage and the runtime specification for running the container instance.

AWS uses the Docker standard, which is compliant with the Open Container Initiative (OCI). Docker is such a popular containerization engine that the terms container and Docker are almost used interchangeably. Several different containerization services are available from AWS:

  • Amazon Elastic Container Service (ECS): An orchestration service for EC2 instances and container tasks that allows for both full control and full automation of the container infrastructure

  • AWS Fargate: A serverless containerization managed service in AWS that allows you to run container tasks without having to worry about your infrastructure

  • Amazon Elastic Container Service for Kubernetes (EKS): An orchestration service for creating and managing Kubernetes clusters in AWS

  • Amazon Elastic Container Registry (ECR): A Docker-compatible hosted container registry

There are differences between ECS, Fargate, and EKS. While all three are able to deliver container infrastructure services for your application, each has a different approach to running containers and a different market niche focus:

  • ECS is focused on running fleets of Docker container engines. Customers who are familiar with Docker or who are looking for an option to run their existing Docker environments in the cloud will be attracted primarily to ECS.

  • Fargate is focused on spiky, infrequent, or event-based containerized application tasks. This service is very much tailored to new applications that are being built in a serverless manner with an event-based computing approach. Fargate provides the ability to run custom container images in the cloud completely on demand.

  • EKS is focused on providing customers with access to a Kubernetes environment in the cloud. Customers who are familiar with Kubernetes or who are looking for an option to migrate their Kubernetes-based applications to the cloud will be attracted primarily to EKS.

Both ECS and EKS have been designed to use EC2 instances as their underlying units of compute or Fargate to supplement ECS with even smaller units of compute (container tasks). Figure 3-8 illustrates the different cluster manager options (EKS or ECS) and the different container platform options (EC2 or Fargate).


Figure 3-8 AWS Container Orchestration and Execution Choices

Currently ECS is the gold standard approach to containerization in AWS, and Fargate is basically the next evolutionary step from ECS.


To create an ECS cluster, you can simply use the CLI aws ecs create-cluster command:

aws ecs create-cluster --cluster-name everyonelovesaws

As mentioned earlier in this chapter, Amazon ECS has adopted the Docker standard for running containers, and the ECS service is completely compatible with the standard approach to building and running containers in Docker. The ECS command line allows you to essentially substitute the Docker commands with ECS commands but use the same build files, the same images, and the same scripts in ECS that you would use in Docker. This means that the learning curve for migrating from Docker to ECS is virtually nonexistent. Alongside the standard definition of containers and images, the following components are defined in ECS and Fargate:

  • Task definition: A JSON file that describes the containerized environment. You can specify a complete application setup with up to 10 container images to be used to run up to 10 different services. Within a task definition, you also specify the services and volumes, and you can even implement features such as resource constraints for the running task.

  • Task: A task is essentially a running instance of a task definition. Tasks can be run manually or upon a certain trigger. You have the ability to set the number of tasks to run and also scale this number according to the requirements of the incoming requests.

  • Scheduling: The ECS task scheduler ensures that tasks are placed in the environment according to the definition of the scheduling you specify.

  • Cluster: A cluster is a logical grouping of tasks and services in both ECS and Fargate. You always require a cluster to be selected for running each task, and you can mix and match tasks from ECS and Fargate in one cluster.

  • Container agent: The agent is an application component running in the EC2 instance that allows communication with the ECS service and the scheduling of tasks on the instance. You are only required to maintain container agents when running tasks on ECS.

  • Container EC2 instance: This is an EC2 instance with a container agent installed that has been registered to ECS to run container tasks. You can either manually register container EC2 instances or leave the registration up to the ECS service, as it can both provision and register EC2 instances into the ECS environment.

  • Service and autodiscovery: These two mechanisms can group together the containers performing the same task. The service definition can be designed to scale automatically, and the autodiscovery can seamlessly add the newly created containers to the service to respond to requests.


To create a task definition, you can use the aws ecs register-task-definition command. It requires you to create a JSON file with the task definition that you will use to create the task. The task in this example simply runs an Ubuntu container and issues a ping request to amazonaws.com to keep the task alive. Remember that when an ECS task completes its job, it automatically terminates.

To see how this works, you can write a file called mypingtask.json that will define the task. You need to define the following characteristics, as shown in Example 3-5:

  • name: The name of your task definition.

  • image: The container image to use.

  • cpu: The number of CPU shares to give to this container. Shares are allocated in units of 1024 per vCPU, so giving the container 512 shares entitles it to half of one vCPU on an ECS instance.

  • command: The command to run, with each part of the command separated into its own quoted value in the array.

  • memory: The amount of memory, in megabytes, to give to this task. 512 MB is enough for the Ubuntu image to run.

  • family: A name used to group related task definitions, typically the name of the application whose microservices the task definitions belong to.

Example 3-5 Container Task Definition

{
     "containerDefinitions": [
         {
             "name": "mypingtask",
             "image": "ubuntu",
             "cpu": 128,
             "command": [
                 "ping",
                 "amazonaws.com"
             ],
             "memory": 512,
          }
     ],
     "family": "demo"
}

Once a container is initialized from this task definition, it runs an Ubuntu container and issues a ping to amazonaws.com. To register the task definition, you can run the following CLI command:

aws ecs register-task-definition --cli-input-json file://mypingtask.json

You can then reference the task by name to spin up a container. The .NET code snippet in Example 3-6 allows you to run an ECS task on your everyonelovesaws ECS cluster. You can include this snippet in your code to give the application the ability to spin up AWS ECS containers.

Example 3-6 .NET Code That Runs a Task Definition

using System.Collections.Generic;
using Amazon.ECS;
using Amazon.ECS.Model;

// Create an ECS client using the default credentials and region
var client = new AmazonECSClient();

var response = client.RunTask(new RunTaskRequest
{
    Cluster = "everyonelovesaws",
    TaskDefinition = "mypingtask:1"
});

List<Task> tasks = response.Tasks;
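
If you prefer the AWS CLI over an SDK, an equivalent task can be started with the aws ecs run-task command. The following is a minimal sketch that assumes the everyonelovesaws cluster and the mypingtask task definition registered earlier:

aws ecs run-task --cluster everyonelovesaws \
    --task-definition mypingtask:1 --count 1

You can then list the running tasks in the cluster with aws ecs list-tasks --cluster everyonelovesaws.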

Storing Persistent Data

Whether you are running EC2 instances or containers in AWS, the best practice is to keep your computing environments stateless. This means that any information that needs to be retained beyond the instance’s lifespan should not be stored on the instance itself.

With EC2, an instance is always run off a root device; this represents a disk attached to the instance that will either be sourced from the instance storage or from the EBS back end. As mentioned previously in this chapter, the instance type determines whether the instance is able to use the instance storage and how much instance storage is allocated to it.

An Amazon instance store is perfect for the stateless computing approach as the devices in the instance store are designed to be completely ephemeral. An instance store is sourced from physical storage devices attached directly to the hypervisor, and due to its ephemeral nature, it is ideal for any kind of temporary storage purpose, such as the paging file, various buffers, caches, scratch space, and other temporary data. When an instance using instance storage is part of a distributed cluster that replicates data across the network, you can also utilize the ephemeral disks for short-lived services that do not require durable local storage, such as temporary NoSQL database clusters and MapReduce operations on a Hadoop cluster.

Possibly the biggest benefit of an instance store is that the performance of the directly attached storage subsystem is much higher than with the EBS back end; in addition, the price of the instance store volume is included in the instance price. In Linux EC2 instances, all instance store volumes are automatically mounted in the operating system, and in Windows, the disks are attached, but you need to create the volumes in Disk Management or by using the diskpart utility. When instance store volumes are included, you should make sure to use them.

The only drawback of using an instance store is that the data on the instance store is available only while the instance is running. All data is lost if a hardware failure or a user-initiated shutdown operation causes the instance to be stopped on the hypervisor. As soon as the instance is stopped, all blocks from the volume are flushed, and when the instance is started again, all changes made before the shutdown are discarded. This functionality is completely logical as you should expect that the instance will probably be started on another hypervisor the next time, and the physical storage from the previous session will not be attached to it.

Amazon EBS

To store data persistently, you can use the EBS service to create network-attached volumes. The EBS volumes are designed to deliver block device volumes that are inherently highly available within one availability zone. The data volumes are replicated to two disk subsystem facilities in the same availability zone. Each disk subsystem is designed to withstand multiple disk failures without losing data. The EBS volumes are also designed to persist the data, regardless of the instance state. You also have the ability to detach an EBS volume from one instance and attach it to another one. This can be very useful when you’re updating or upgrading a component in the operating system.

For example, consider an EC2 instance with a secondary EBS volume attached. The operating system on the primary device also has a database server installed. The database itself is stored on the secondary volume. Instead of updating the database engine to the newest version within the operating system and rebooting the server, you can perform the update by spinning up a new instance and making sure it is operational. When you determine that the new operating system is ready to take over the job, you simply stop the database service on the original instance, detach the secondary EBS volume from the original instance, and attach it to the updated instance. Once it is attached, you start the database service, and the database is served directly from the secondary EBS volume. In case of issues, you have an easy way to roll back to the old instance as you can just reverse the process and fall back to the previous setup.
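
The volume move described in this example can also be scripted with the AWS CLI. The following is a minimal sketch; the volume and instance IDs are placeholders that you would replace with your own, and the target instance must be in the same availability zone as the volume:

aws ec2 detach-volume --volume-id vol-0123456789abcdef0

aws ec2 attach-volume --volume-id vol-0123456789abcdef0 \
    --instance-id i-0fedcba9876543210 --device /dev/sdf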

But what if the new version corrupts your database, and you aren’t able to roll back? For backing up, you can simply create an EBS snapshot. By creating a snapshot, you are creating a point-in-time copy of the whole volume. The snapshots are stored in an AWS managed S3 bucket and are incremental in nature: Only the blocks that have changed since your most recent snapshot are stored in the new snapshot. A snapshot can be used to restore the volume to any of the available points in time.

The only drawback of snapshots is that they are performed without notifying the operating system. This means that any software running within the operating system will not automatically commit any outstanding I/O operations to disk when the snapshot creation starts, which could leave the data in a snapshot inconsistent or corrupted. To prevent this from happening, you need to implement a script (a minimal sketch follows the list) that

  • Connects to the operating system and commits any outstanding I/Os

  • Momentarily freezes the disk in preparation for the snapshot

  • Initiates the snapshot so the data copy process is started

  • Releases the momentary disk freeze as soon as the snapshot has started
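
On a Linux instance, such a script could look like the following minimal sketch. It assumes an ext4 or XFS file system mounted at /data, uses a placeholder volume ID, and requires root privileges for fsfreeze:

# Commit outstanding I/O and freeze the file system
sync
fsfreeze -f /data

# Start the snapshot; the copy continues in the background
aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 \
    --description "Consistent snapshot of /data"

# Release the freeze as soon as the snapshot has been initiated
fsfreeze -u /data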

During the snapshot creation phase, the disk is fully accessible, but you might observe a slight performance impact in applications with very high I/O requirements. Best practice is to perform snapshots during low application utilization periods.

Scalability and High Availability

Say that you have an EC2 instance running in your VPC, it has an EBS volume attached for the persistent data, and you are making snapshots of the volume to make sure you can recover even from the worst-case scenario. But as you now know, an instance can run in only one availability zone at a time. This means that the availability of such an instance is limited to the SLA for the services in AWS that are tied to one availability zone. While this might meet the needs of your application, there is a way of making this setup even more highly available and resilient to failure. In addition, once you introduce high availability, you are also able to introduce scalability with exactly the same approach.

When analyzing the requirements for making an application highly available, you need to assess the following areas:

  • Persistent data storage: Where is the data being generated by the application stored? Is it being stored on the instances (that is, are the instances stateful) or in a separate back-end environment?

  • Application state information: Is the data about the user sessions available uniformly across all instances, or is it unique to each instance?

  • The SLA: What kind of high availability does your SLA determine? What kind of recovery time objective (RTO) and recovery point objective (RPO) are acceptable when a failure occurs?

High Availability Design Patterns

When your instances are stateful, the only way to ensure high availability is through the use of clustering. Any data being served or stored by the instances should be seamlessly distributed either to all nodes of the cluster or to a subset of nodes that is primarily responsible for serving that data. Many clustering tools exist for different types of applications. The cloud might not initially seem to be an ideal place to deploy clustered applications, but clustered applications can in fact work quite well in the cloud, and there are many service examples in the AWS cloud (DynamoDB, Elastic MapReduce, Redshift, and so on) that depend on different types of clustering to provide data consistency and data resiliency across multiple nodes.

If data distribution across the instances is not an option, then a rigorous backup schedule for each instance’s EBS volume and a fast approach to recovery are required. You still need to be able to run multiple instances without replicating data between them, which means running the instances behind a load balancer and employing sticky sessions to direct a specific user to the specific server that can handle that user’s requests. Such a design should not be considered for modern applications due to its obvious drawbacks in the event of an instance failure.

Given all the requirements, you should always strive to conform to the best practices outlined in AWS and think of all of your AWS resources as disposable pieces of the application that can be easily replaced. In fact, the best practice in AWS is to keep all computing instances completely stateless. This means moving all persistent data, even the logs, off the instance and storing it in an environment that is external to the compute. The compute should only do the computing and not the storing. If you keep your instances stateless, it becomes much easier to create highly available applications that can also be made highly scalable. All you need in this case is two or more instances in two or more availability zones behind ELB.

AWS Elastic Load Balancer

The Elastic Load Balancer (ELB) service is designed to deliver higher availability and even load distribution across a number of EC2 instances or ECS containers. The ELB service is automatically integrated with other AWS services and allows you to push metrics and notifications to other services to facilitate intelligent responses to ELB events. For example, you can integrate the ELB service with the Auto Scaling service to automatically scale the number of instances to meet the incoming traffic demands. There are three types of load balancers available in AWS:

  • Classic Load Balancer

  • Application Load Balancer

  • Network Load Balancer

Classic Load Balancer was the first load balancer offering from AWS. The service provides a robust and simple-to-use load balancer primarily designed to load balance Layer 4 (TCP-based) traffic with some Layer 7 capabilities, such as the ability to use a cookie to bind the user to the target server and create a sticky session. Classic Load Balancer has been at the core of pretty much every major application since the early days of AWS and is the only load balancer supported on the EC2 classic network.

In 2016, AWS introduced the Application Load Balancer as its next-generation load balancing solution. It is designed to be a pure Layer 7 load balancer: The service can read and understand an application request and can route the request to multiple back-end target groups. Each target group can have a separate traffic rule assigned to it. With the traffic rule, you define the pattern in the request that directs the request to the appropriate back end. Consider this example:

  • The front-end target group can have a pattern that directs any request for the website directly to the front-end target group that hosts only the front-end HTML, CSS, and so on.

  • The images target group can have a pattern matching any string containing *.JPG or *.JPEG, and all requests for images are automatically redirected to a separate back end that hosts only the images.

  • A request made by a mobile browser would automatically be detected by the Application Load Balancer, and that request would be sent to a third target group that would specifically serve only the mobile content.

This kind of behavior is especially useful for any application designed around microservices, where each target group can represent a separate microservice layer.
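
As a rough sketch, a path-based rule like the images example above could be added to an existing Application Load Balancer listener with the CLI. The listener and target group ARNs below are placeholders for your own resources:

aws elbv2 create-rule \
    --listener-arn arn:aws:elasticloadbalancing:us-east-1:111122223333:listener/app/my-alb/1234567890abcdef/fedcba0987654321 \
    --priority 10 \
    --conditions Field=path-pattern,Values='/images/*' \
    --actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:us-east-1:111122223333:targetgroup/images/0123456789abcdef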

The load balancers ensure high availability of traffic coming to the instances by deploying two redundant endpoints in two availability zones. This ensures that even if a failure occurs on the load balancer hardware, the service is not disrupted, and any requests coming in to the load balancer service can always be redirected to another availability zone. Figure 3-9 illustrates a highly available application deployed across two availability zones in a VPC.


Figure 3-9 High Availability of a Web Application Provided by the AWS ELB Service

In 2017, AWS delivered its second next-generation load balancer service: Network Load Balancer. It is a pure Layer 4 load balancer, designed to deliver very high network throughput and very low latencies of responses to client requests. It can serve tens of millions of requests per second and deliver consistent performance at any scale and with any access pattern. The service is designed for high-performance microservices environments and exposes a single static IP address per availability zone.

Auto Scaling

Now that you understand all the prerequisites for making an application scalable, you need to take a look at Auto Scaling. The AWS Auto Scaling service is designed to deliver automatic scalability to your applications running on AWS. The idea behind Auto Scaling is that you can take the performance data from an application and increase the number of instances (scale out) when more of them are required to meet the demand or decrease the number of instances (scale in) when the demand disappears. This approach can help your application make much better use of resources and can also save you quite a sizable amount of your budget because you only need to run as much compute capacity as required by the current usage.

The Auto Scaling service works hand in hand with the CloudWatch service, which provides the metrics, and with the ELB service, which is notified of any changes in the size of the instance target group. The Auto Scaling service can scale the following AWS environments:

  • EC2: You can add or remove instances from an EC2 Auto Scaling group.

  • EC2 Spot Fleets: You can add or remove instances from a Spot Fleet request.

  • ECS: You can increase or decrease the number of containers in an ECS service.

  • DynamoDB: You can increase or decrease the provisioned read and write capacity.

  • RDS Aurora: You can add or remove Aurora read replicas from an Aurora DB cluster.

For the purpose of demonstrating the Auto Scaling capabilities of a compute cluster, the EC2 service is the primary focus here. The principles and definitions set out for the EC2 service can easily be extrapolated to the other services in the preceding list. When scaling EC2 instances, you first need to create an EC2 launch configuration, which specifies the instance type and AMI to use, the key pair to add to the instance, one or more security groups to assign to the instances, and the block device mapping with which the instances should be created. The launch configuration is then applied to an EC2 Auto Scaling group that defines the scaling limits (the minimum, maximum, and desired numbers of instances) and the scaling policy to use when a scaling event occurs. The scaling policy defines a trigger that specifies a metric ceiling (the maximum allowed usage before scaling out) and a metric floor (the minimum usage before scaling in). The scaling policy also defines how long that ceiling or floor can be breached before an alarm is triggered and what happens when the alarm triggers Auto Scaling.
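
As a minimal CLI sketch matching the example that follows, the launch configuration, the Auto Scaling group, and a simple scaling policy could be created like this. The security group and subnet IDs are placeholders; the AMI shown is the us-east-1 Amazon Linux image also used later in Example 3-7:

aws autoscaling create-launch-configuration \
    --launch-configuration-name web-lc \
    --image-id ami-0ff8a91507f77f867 \
    --instance-type t3.micro \
    --key-name ec2-user \
    --security-groups sg-0123456789abcdef0

aws autoscaling create-auto-scaling-group \
    --auto-scaling-group-name web-asg \
    --launch-configuration-name web-lc \
    --min-size 2 --max-size 6 --desired-capacity 2 \
    --vpc-zone-identifier "subnet-aaaa1111,subnet-bbbb2222"

aws autoscaling put-scaling-policy \
    --auto-scaling-group-name web-asg \
    --policy-name scale-out-50-percent \
    --adjustment-type PercentChangeInCapacity \
    --scaling-adjustment 50

A CloudWatch alarm on the group's aggregate CPU utilization would then invoke the returned policy ARN when the 80% ceiling is breached for 10 minutes.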

For example, consider the following setup:

  • An Auto Scaling group with a minimum of two and a maximum of six instances

  • A CPU % metric with a ceiling of 80% for scale-out events

  • A CPU % metric with a floor of 20% for scale-in events

  • A breach duration of 10 minutes

  • A scaling definition of +/– 50% capacity on each scaling event

If the application is running between 20% and 80% aggregate CPU usage across the Auto Scaling group, there will be no triggers and thus no Auto Scaling. You start your application with a minimum of two instances. The application usage starts growing and grows above 80%. The scaling policy allows for 10 minutes for the application to settle before scaling out. Once the alarm is triggered, the application is expanded with one more instance, which represents 50% of the previous capacity of two instances. Figure 3-10 illustrates the operation of a scaling policy triggered by a CloudWatch alarm.

Now you have three instances. The application usage grows, and you are required to scale out. You will be adding two more instances because one would not be enough to fulfill the +50% scaling of the group. The environment now has five instances. On the trigger of ceiling breach, you would expect three instances to be added, but that would mean that you have more than six instances running, so only one is added, and you are at the maximum of six. When the application usage is reduced below 20% for 10 or more minutes, the application is scaled in, and the number of instances is decreased by 50%, leaving you with three instances. If the usage falls further, below 20%, your application can scale in but only by one more instance because the application minimum is set to two instances.


Figure 3-10 Example of Auto Scaling Policies and CloudWatch in Operation

Amazon Route 53

As you have seen, making an application highly available within one region is fairly straightforward. But an application running in the cloud will not be very usable if you have no way of directing traffic at it. Distributing a list of IP addresses is a thing of the past, especially in the cloud, where both private and public IP addresses are considered disposable resources. So how can you deliver the users to the right service that has an ever-changing IP address on which it responds? The answer is DNS.

Amazon’s Route 53 service is essentially a next-generation managed DNS cloud service. Gone are the days of editing bind zone files. With Route 53, the approach is based on the standard way you communicate with all AWS services: through the API. The service allows you to dynamically update DNS records and deliver DNS responses to clients with a 100% SLA. Basically the only way Route 53 would go down is if there were no electricity anymore—and in that case, you really wouldn’t require any DNS services anyway.

While providing standard DNS request/response functionality is at the core of Route 53, the service also allows you to register domains, create public and private zones, and, possibly most importantly, provide traffic shaping functionalities through DNS responses. Route 53 doesn’t just respond with the first record in the list; it can perform health checks of the DNS targets as well as sense the latency from the user to the target. The Route 53 service has the following routing policies that help you shape the traffic:

  • Simple routing

  • Multivalue answer routing

  • Failover routing

  • Weighted routing

  • Latency-based routing

  • Geolocation and geoproximity routing

Simple routing is the default routing used to serve Route 53 DNS responses. It returns a single response for a single request. For example, when looking up www.pearson.com, a simple A record would return one IP address even if multiple values were recorded for that A record. The simple routing mechanism picks one IP address at random if multiple values are present.

To augment the capabilities when multiple responses are required, multivalue answer routing can be used. In this case, each request for one address returns up to eight possible responses. For each response, the Route 53 service can also perform a health check that can determine whether the source is responding. Any applications where a list of possible servers is required to perform tasks in a distributed manner would make good use of multivalue answer routing (for example, peer-to-peer applications, video streaming, HPC).

With certain routing policies, Route 53 also supports health checks. A health check allows the DNS service to automatically determine whether the target that is being returned in the response is healthy. The DNS service can perform TCP port checks as well as HTTP and HTTPS health checks. When configuring an HTTP/HTTPS health check, the service can perform a simple response check, or it can also check for certain strings in your website. For example, an HTTP health check can be configured to look for a 2xx or 3xx HTTP response while also looking for a specific string in the website, such as the name of your company or any unique string you can hide in the HTML code. On top of the health check, a response time threshold can be introduced. Using a response time threshold, the DNS service can also determine whether the site is responding too slowly to be of use to clients. Figure 3-11 illustrates a failover routing configuration with a health check in Route 53 DNS.


Figure 3-11 A Route 53 Health Check Determining Whether an Application Is Healthy

The only time a health check is not optional is with failover routing. Failover routing provides you with the ability to serve content from several AWS regions at the same time. When configuring failover routing, you need to determine the active endpoint that will be receiving all the traffic. In addition, a failover endpoint needs to be configured as a passive site, intended for failover of traffic in case the active endpoint fails the health check. This approach is useful when a backup or disaster recovery site is set up in a different region and the data is synchronized only in the direction from the active endpoint to the passive one.

When you are operating two or more endpoints with a two-way or multidirectional data replication approach, you can use weighted routing. This routing approach provides you with the ability to deliver traffic to several endpoints simultaneously and allows you to decide how much load to deliver to each of the endpoints. You can specify the weight of each endpoint according to the capacity of the endpoint. This can also be very useful when performing A/B testing or deploying applications with a blue/green deployment approach.
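
As a minimal sketch, a weighted record could be created with the CLI as shown below. The hosted zone ID, record name, and IP address are placeholders; a second record with the same name, a different SetIdentifier, and its own weight would complete the weighted set:

aws route53 change-resource-record-sets \
    --hosted-zone-id Z1D633PJN98FT9 \
    --change-batch '{
      "Changes": [ {
        "Action": "UPSERT",
        "ResourceRecordSet": {
          "Name": "www.example.com",
          "Type": "A",
          "SetIdentifier": "primary-endpoint",
          "Weight": 80,
          "TTL": 60,
          "ResourceRecords": [ { "Value": "203.0.113.10" } ]
        }
      } ]
    }'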

Note

A blue/green deployment approach allows you to deploy a completely new environment (green) and switch over the traffic from the existing environment (blue). This is very useful when switching to a whole new version of the application. You can also use a gradual approach for testing, which is schematically the same but allows you to release a new feature to a subset of users in the wild. This approach is usually called A/B testing, where A is the old version and B is the new version. You designate a subset of traffic to go to B (for example, 5%) to test new features in the wild.

When performance of an application is key, the latency-based routing approach provides the best user experience by determining the endpoint that has the lowest latency to the user and delivering the response in the fastest manner. It involves issuing a latency check between endpoints in the same geographic area as the requestor and determining which front end will respond faster. It is not an estimation; the service actually performs a latency check before serving a response to the end user.

In a similar manner, geolocation and geoproximity routing can be used to determine the endpoint to which the user needs to be routed. Geolocation determines the location of the user according to the source IP address and allows you to deliver content based on the country or region of the source IP address. This means you can deliver the application in the user’s language or comply with any laws and regulations in the country or region. For example, when running a global application where users store their personal data, data sovereignty principles will apply differently to your application in each country or region where the application is deployed. Thus, you can route the users within a region or country to a regional AWS location to ensure that you comply with the laws and regulations governing data sovereignty.

Geoproximity, on the other hand, uses the longitude and latitude of the requestor to determine which AWS region will serve the content for that user. You can also assign different weights to the regions that are part of the geoproximity group. The bigger the weight, the larger the geographic area each region serves. This approach can be very useful when you want to deliver the lowest latency but also take into account the difference in provisioned capacity for your application in different regions.

Orchestration and Automation

When you know how to make your applications highly available and highly scalable in AWS, it is fairly easy to extrapolate how to make the applications fully automated. The fact that all of the AWS infrastructure components are easy to deploy and simple to use has led to a sprawl in the numbers of instances, containers, and other services used in the cloud. To control this sprawl and bring order to your application infrastructure, you need a unified and automated way to approach the deployment, updating, life cycling, and decommissioning of your applications.

Orchestration is one of the key aspects that enables you to control your environment without performing manual tasks. What you look for in orchestration is the ability to write a specification document that outlines the services to be deployed and push that document to the orchestration service, which in turn creates the resources. This ability improves the repeatability of your deployments and the reliability of the application. You can deploy the same script into your development, test, QA, staging, production, and other environments in a unified manner. Because the same infrastructure is being built in all of these environments, the reliability of the final deployment will be much higher, as orchestration and the cloud remove many of the unknowns.

Basics of Cloud Orchestration and Automation

Orchestration tools allow you to implement the infrastructure as code approach to delivering capacity required by your application from the cloud. Instead of creating the cloud objects manually, you simply create an orchestration script that can be stored in a version repository and maintained in the same way as your development code. From the repository, you have the ability to deploy the script into the cloud, and the orchestration service will simply create the infrastructure for you. Once the infrastructure is deployed, you can simply deploy the software required in the same manner.

Infrastructure as code is very beneficial in DevOps and CI/CD environments as the orchestration script can be put through the same test cycles as the software by the CI environment as soon as any changes are detected. This means you can easily verify the infrastructure orchestration script in an automated and rapid manner.

Several open standards have been created to support interoperability, but probably the most widely adopted standard is Topology and Orchestration Specification for Cloud Applications (TOSCA), defined by the Organization for the Advancement of Structured Information Standards (OASIS). TOSCA defines a standard language that allows for building topologies, creating services, and defining any other components in the cloud. TOSCA-compliant orchestration tools allow for interoperability with configuration management tools and cloud management platforms, providing you with the ability to achieve a high level of unification in cloud orchestration.

Several services in AWS provide you with the ability to control the infrastructure, environment, and capacity that your application consumes:

  • AWS CloudFormation: CloudFormation provides a fully TOSCA-compliant infrastructure orchestration service that can perform any deployment of an AWS service in a very customizable manner.

  • AWS Elastic Beanstalk: This simple and easy-to-use managed service automatically orchestrates the infrastructure deployment and allows developers to focus on their code instead of the infrastructure.

  • AWS CodeDeploy: CodeDeploy gives you the ability to (automatically) deploy code and perform updates from your development cycle in a new or existing environment.

  • AWS OpsWorks: This managed environment provides a fully configurable Chef or Puppet configuration management service.

  • AWS Systems Manager: Systems Manager provides a set of tools to manage and control instances in the cloud, including the ability to run remote shell scripts and perform automated updates and installations.

The next sections look at the Elastic Beanstalk and CloudFormation services in a bit more detail. AWS CodeDeploy is covered in more detail in Chapter 6, “AWS Development Tools.” Because the OpsWorks and Systems Manager tools do not apply to the AWS Certified Developer certification, this book does not cover these tools in detail. If you would like to learn more about them, though, pick up the AWS Certified SysOps Administrator Associate Certification Guide, which discusses them in detail.

AWS Elastic Beanstalk

The Elastic Beanstalk service is designed to empower developers by automatically performing the day-to-day management tasks of infrastructure deployment and configuring the appropriate features to run the code. Elastic Beanstalk can automatically deploy an EC2 instance with an operating system, the appropriate programming language interpreter, any required application services, prerequisites, frameworks, runtimes, libraries, modules, and so on, as well as an HTTP or HTTPS service that can present the application on a standard HTTP port. Elastic Beanstalk can also configure any external components, including load balancers, databases, message queues, and object storage.

Elastic Beanstalk is the right solution for any environment where the business driver is the reduction of overhead due to architecting, operating, maintaining, updating, and patching the infrastructure. With Elastic Beanstalk, the resources are created in your account with full transparency. You can see each instance, each load balancer, each RDS database, and so on. To run your application, you simply need to provide the code and the specification for the Elastic Beanstalk environment. This allows the developers to focus on the code and simply deploy the application. Elastic Beanstalk takes care of the rest.

Elastic Beanstalk deployment can be broken down into the following components:

  • Application: A logical grouping of environments.

  • Environment: The specific platform the code needs to run on. An application can have multiple environments with multiple different sets of platforms to run on. Each environment within an application has a unique endpoint where it can be addressed.

  • Tier: The type of environment:

    • Web tier: An application front end that responds to client requests

    • Worker tier: An application back end that processes tasks queued up in a message queue

  • Configuration: All the information about the application. A configuration file can be saved and can be used to redeploy and customize the existing application. The configuration can also be applied to a new application to clone the existing environment because it contains all the specifics about the infrastructure and all the code. The configuration can also be used as a point-in-time backup.

  • Application version: A version of the code running within an environment. Multiple concurrent versions of the code can be uploaded into an Elastic Beanstalk application and deployed to multiple concurrently running environments.

To deploy an application with Elastic Beanstalk, you only need a package containing your code. Elastic Beanstalk is very flexible as it allows you to deploy the code by using the Management Console, the AWS CLI, the Elastic Beanstalk CLI, or any of the SDKs or by addressing the Elastic Beanstalk API directly. When deploying your code, you of course have to have code that is compatible with one of the Elastic Beanstalk–supported platforms, such as one of these:

  • Packer: An open-source tool for creating and managing custom AMIs in AWS

  • Docker: The Docker container engine, which allows you to deploy single or multiple containers into an Elastic Beanstalk environment

  • Java: Java SE 7 and 8 code and Java 6, 7, and 8 on Tomcat

  • .NET: .NET Framework and .NET Core on Windows Server 2008 to 2016 and IIS versions 7.5 to 10

  • Node.js: Node.js language versions 4, 5, 6, 7, 8, and 10

  • PHP: PHP language versions 5.4 to 7.2

  • Python: Python language versions 2.6 to 3.6

  • Ruby: Ruby language versions 1.9 to 2.6

  • Go: Go language version 1.11

Alongside the code, you can specify any other AWS services that will be required to run the application. Elastic Beanstalk has a built-in interface that allows you to control the following services that are commonly used in application development:

  • Elastic Load Balancers: To make web tier applications highly available, you can implement a load balancer that can be automatically configured and deployed by Elastic Beanstalk.

  • SQS queues: These are automatically deployed to worker tiers and are the single point of contact with the workers. You post messages for the workers to process, and the workers listen to the message queue.

  • CloudWatch: You have the ability to control the delivery of metrics and logs to CloudWatch. This means you can deliver only the metrics that matter.

  • S3: Any S3 buckets to be used by the application for storage and log delivery. One thing to note is that when deleting applications where S3 buckets have been created through Elastic Beanstalk, the contents of the buckets and the buckets themselves are deleted also!

  • RDS databases: Any RDS databases to be used by either the web tier or the worker tier. Be careful because the RDS database is also deleted with the application if deployed by Elastic Beanstalk.

  • AWS X-Ray: An X-Ray daemon can be automatically installed on all the instances to allow for integration with the X-Ray distributed tracing.

This set of integrations gives you a lot of flexibility, but you also have the ability to perform fairly deep customization of the infrastructure components, including the packages, prerequisites, libraries, frameworks, and so on. You also have the ability to run commands during the environment initialization. The customization can be performed within the package by including configuration files in an .ebextensions directory.
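
As a minimal sketch, an .ebextensions configuration file (for example, .ebextensions/custom.config, a hypothetical name) could install an operating system package and run a command during environment initialization. Configuration files can be written in YAML or JSON; this sketch uses JSON:

{
  "packages": {
    "yum": {
      "git": []
    }
  },
  "commands": {
    "01_show_kernel": {
      "command": "uname -r > /tmp/kernel-version.txt"
    }
  }
}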


Elastic Beanstalk has its own CLI, called EB CLI, that can be used to deploy an Elastic Beanstalk environment. Let’s look at an example of creating an environment.

To install the EB CLI, first install the prerequisites (for Linux in this example):

Note

For instructions on how to install prerequisites and the tools on your operating system, take a look at https://github.com/aws/aws-elastic-beanstalk-cli-setup.

yum group install "Development Tools"
yum install zlib-devel openssl-devel ncurses-devel libffi-devel \
    sqlite-devel.x86_64 readline-devel.x86_64 bzip2-devel.x86_64

Next, clone the repository:

yum install git
git clone https://github.com/aws/aws-elastic-beanstalk-cli-setup.git

Next, enter the directory and run the bundled installer:

cd aws-elastic-beanstalk-cli-setup/scripts

./bundled_installer

The eb command should be added to $PATH automatically. This sometimes does not work, so if you cannot run eb directly, look for the eb executable under the ~/.ebcli-virtual-env/ directory and add its location to your $PATH.

Now you can start using the EB CLI. First, you should run the eb init command to configure the EB CLI:

eb init --interactive

You are prompted to select a default region for deploying the Elastic Beanstalk environment, the credentials to be used, and the Elastic Beanstalk application to be created. In this example, you will name the new application everyonelovesaws and use your preferred programming platform. When you run the init command, an empty application is created.

Next, you need to create an environment. You can specify a name for the environment and a load balancer type, or you can just press Enter to select the defaults.

You can enter the following command to create a sample application that you can then access through the DNS name shown in the output of the deployment:

eb create --sample

Running the eb create command in your code repository deploys the code present in the directory into the application. As you can see, this greatly simplifies the deployment of the environment.
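
Once the deployment finishes, you can check the state of the environment and open the application URL directly from the EB CLI; both commands accept the environment name as an argument:

eb status everyonelovesaws-dev2
eb open everyonelovesaws-dev2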

Now you can log in to the AWS Management Console and browse to the EC2 management section. By clicking Instances, you should be able to see the instance that was created for your EB application, as shown in Figure 3-12.


Figure 3-12 An EC2 Instance Is Created That Matches the Application Name Everyonelovesaws-dev2

Next, click Load Balancers. You should see a load balancer that has been created for your application, as shown in Figure 3-13. If you need to filter the selection, you can use the filter elasticbeanstalk:environment-name: everyonelovesaws-dev2, replacing the environment name with the name of your own Elastic Beanstalk deployment.


Figure 3-13 A Load Balancer Is Visible in the Load Balancer’s Section

Next, you can browse to the Elastic Beanstalk Management Console and select the everyonelovesaws application and the environment. You should see the environment in an OK state, as shown in Figure 3-14.


Figure 3-14 The Elastic Beanstalk Application in the AWS Management Console

Finally, by clicking the URL in the name of the application, you can see that the sample application is operational, as shown in Figure 3-15.


Figure 3-15 The Application Is Deployed and Available on the URL Shown in the EB Management Console

You can now terminate the environment with the following command, replacing everyonelovesaws-dev2 with your environment name:

eb terminate everyonelovesaws-dev2

AWS CloudFormation


When you require a bit more control of the environment than Elastic Beanstalk gives you, you might want to use CloudFormation. CloudFormation was designed as the core infrastructure as code service in AWS and is fully TOSCA compliant, allowing the service to integrate with external tools and configuration management.

CloudFormation allows you to create a JSON or YAML document that specifies the cloud objects you would like to build. It allows you to customize the application with very fine granularity and deliver infrastructure services for your application in a unified, repeatable, and reliable manner. CloudFormation can be accessed through the Management Console, the CLI, the SDKs, or directly through the CloudFormation API.

CloudFormation deploys all the resources in a template in parallel and allows you to control any resources specified in the template. For example, if you create a VPC through the CloudFormation service but then add instances manually, the service will not be able to reconfigure or delete the VPC while other cloud objects depend on it. When you deploy a template, CloudFormation checks the syntax and validates the template before deploying. Failures during deployment are always possible, and because a template is treated as one unit, any failure causes CloudFormation to automatically roll back the resources created up to the point where the failure occurred. This means you are never required to clean up any failed CloudFormation deployments in your AWS account.

In CloudFormation, you use templates to deploy stacks. Templates are the specification documents that define the resources, and a stack is a running set of connected resources in AWS. When making changes to templates, you can create change sets, which allow you to compare the current state with the proposed changes before applying the changes to the stack.

An AWS CloudFormation template has the following sections:

  • Template Version: An optional component that specifies the CloudFormation template version that the template conforms to.

  • Description: An optional component that allows you to identify what the template does. The description can be any arbitrary string of text that makes sense to you.

  • Metadata: An optional component that can include metadata to pass to the cloud objects being deployed.

  • Parameters: An optional part of the template used to provide parameters to be passed to the resources upon creation. Can contain default values or a list of valid responses.

  • Mappings: An optional component that can map the resource parameters to the appropriate environment or region. For example, you can specify the instance type, image, key pair and so on to use depending on the environment tag; you can specify different parameters for the Dev, Test, and Prod environment tags.

  • Conditions: An optional component that allows you to control the conditions under which the cloud object can or cannot be created by CloudFormation.

  • Transform: An optional component for declaring the AWS Serverless Application Model (SAM) version to be used in the template. Used with Lambda, API Gateway, and so on.

  • Resources: The mandatory part of the template that defines the resources that CloudFormation should create.

  • Outputs: An optional component that gives you the ability to output important information about the stack once the deployment is completed.

Example 3-7 shows an example of a CloudFormation template that deploys an EC2 instance.


Example 3-7 CloudFormation Template Example

{
  "AWSTemplateFormatVersion" : "2010-09-09",
  "Description" : "Allows you to create a t3.micro EC2 instance using the Amazon
Linux AMI in any US region. Uses a predefined ssh key named ec2-user. Please create
this key or replace the keyname parameter. Creates a security group to allow SSH
access.",
  "Parameters" : {
    "KeyName": {
      "Description" : "Name of the keypair to use with the instance",
      "Type": "AWS::EC2::KeyPair::KeyName",
      "Default" : "ec2-user"
    },
    "InstanceType" : {
      "Description" : "Define the instance type as t3.micro",
      "Type" : "String",
      "Default" : "t3.micro"
    }
  },
  "Mappings" : {
     "T3type" : {
      "t3.micro"    : { "Arch" : "HVM64" }
          },
        "RegionID" : {
      "us-east-1"        : {"HVM64" : "ami-0ff8a91507f77f867", "HVMG2" :
                           "ami-0a584ac55a7631c0c"},
      "us-east-2"        : {"HVM64" : "ami-0b59bfac6be064b78", "HVMG2" :
                           "NOT_SUPPORTED"},
      "us-west-2"        : {"HVM64" : "ami-a0cfeed8", "HVMG2" :
                           "ami-0e09505bc235aa82d"},
      "us-west-1"        : {"HVM64" : "ami-0bdb828fd58c52235", "HVMG2" :
                           "ami-066ee5fd4a9ef77f1"}
  }
  },
  "Resources" : {
    "EC2Instance" : {
      "Type" : "AWS::EC2::Instance",
      "Properties" : {
        "InstanceType" : { "Ref" : "InstanceType" },
        "SecurityGroups" : [ { "Ref" : "SecurityGroup" } ],
        "KeyName" : { "Ref" : "KeyName" },
        "ImageId" : { "Fn::FindInMap" : [ "RegionID", { "Ref" : "AWS::Region" },
                          { "Fn::FindInMap" : [ "T3type", { "Ref" : "InstanceType"
}, "Arch" ] } ] }
      }
    },
    "SecurityGroup" : {
      "Type" : "AWS::EC2::SecurityGroup",
      "Properties" : {
        "GroupDescription" : "Open port 22 for SSH access",
        "SecurityGroupIngress" : [ {
          "IpProtocol" : "tcp",
          "FromPort" : "22",
          "ToPort" : "22",
          "CidrIp" : "0.0.0.0/0"
        } ]
      }
    }
  },
  "Outputs" : {
    "InstanceId" : {
      "Description" : "Output the instance ID",
      "Value" : { "Ref" : "EC2Instance" }
    },
    "PublicIP" : {
      "Description" : "Output the Public IP of the EC2 instance",
      "Value" : { "Fn::GetAtt" : [ "EC2Instance", "PublicIp" ] }
    }
  }
}

To deploy the template, you save it as a file named t3.json in your working directory and run the following AWS CLI commands. First, create a key pair named ec2-user:

aws ec2 create-key-pair --key-name ec2-user \
    --query 'KeyMaterial' --output text > ec2-user.key

The --query option extracts the private key material, and the output is redirected to ec2-user.key, which saves the private key so you can later SSH to the instance.

Next, use the following command to deploy the stack:

aws cloudformation deploy --template-file t3.json \
    --stack-name everyonelovesaws

To list the stack properties, simply issue the following CLI command:

aws cloudformation describe-stacks --stack-name everyonelovesaws

You should see output similar to that in Figure 3-16, which shows the status of the stack. Look for the "StackStatus": "CREATE_COMPLETE", which indicates that the stack deployment is complete.


Figure 3-16 Output of the describe-stacks CLI Command

You can now connect to the instance you deployed via SSH by using the saved ec2-user key and the public IP address found in the Outputs section of the describe-stacks output. In this example, the public IP would be 18.220.196.151.
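
Instead of reading the address out of the JSON by hand, you can extract it with a --query filter, for example:

aws cloudformation describe-stacks --stack-name everyonelovesaws \
    --query "Stacks[0].Outputs[?OutputKey=='PublicIP'].OutputValue" \
    --output text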

Secure the user key by modifying the permissions to make it readable only to the owner:

chmod 400 ec2-user.key

Next, use the ssh command to log in to the server with the ec2-user key:

ssh -i ec2-user.key ec2-user@18.220.196.151

When you are done with this exercise, you can remove the resources created by the stack and delete the stack by running the following command:

aws cloudformation delete-stack --stack-name everyonelovesaws

When the template is deployed, a stack is created. A stack can represent any part of an application or even a complete set of resources for a particular application. The best practice for stacks is to create stacks with separate functions, such as a single stack to create the VPC, another stack to deploy the security rules, another stack to deploy the EC2 instances, and so on. Each stack can output and feed information to the stack that is being deployed after it. You can chain the stacks together by specifying a stack within a stack. You can simply specify a path to a stack (an S3 path to the stack, for example) and embed the reference into another stack.

Because the CloudFormation service builds the resources in a stack in parallel, it can be tricky to deploy complex environments. This is why CloudFormation templates also support a DependsOn attribute that can be added to a resource definition. The DependsOn attribute gives you the ability to serialize the creation of the stack and ensure that the resources being relied upon are fully created before the dependent resources are deployed. For example, for a template with a VPC, security groups, and EC2 instances, you can use DependsOn to wait for the VPC to be completed before the security groups are deployed and then again use DependsOn so the security groups are created before the EC2 deployment is started. The output of each previous operation can also be fed into the next operation by using the Ref function. Ref allows you to retrieve information produced during creation and use it in the next steps. For example, the VPC ID is generated upon creation, and the Ref function can be used to retrieve the VPC ID when creating the subnets, deploying security groups and instances, and so on.
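
The following fragment is a minimal sketch of how DependsOn and Ref could be combined in a template; the resource names are arbitrary examples:

"Resources" : {
  "MyVPC" : {
    "Type" : "AWS::EC2::VPC",
    "Properties" : { "CidrBlock" : "10.0.0.0/16" }
  },
  "WebSecurityGroup" : {
    "Type" : "AWS::EC2::SecurityGroup",
    "DependsOn" : "MyVPC",
    "Properties" : {
      "GroupDescription" : "Created only after the VPC exists",
      "VpcId" : { "Ref" : "MyVPC" }
    }
  }
}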

To implement changes after the stack is deployed, you can use change sets. You can simply create the changes to the stack within the CloudFormation Management Console, and before applying them to the running stack, you can preview the effects of the changes. Three types of changes can be created during a change set:

  • Non-disruptive: Changes that do not impact running services. For example, changing an Auto Scaling policy’s maximum instance setting does not have any effect on running services.

  • Disruptive: Changes that impact running services. For example, changing instance types requires the instances to reboot and makes them unavailable during the reboot.

  • Replacement: Changes that terminate and redeploy a resource. For example, changing the AMI used by your instances terminates the instances running the existing AMI and deploys new ones from the new AMI.

Using change sets is a good way to maintain your templates in accordance with the infrastructure as code approach while also giving you the ability to easily determine the impact of the changes to the running stack. Using change sets is the recommended approach to versioning and delivering CloudFormation updates to your application.
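
As a minimal sketch of the change set workflow in the CLI, assuming an updated template saved as t3-updated.json (a hypothetical file name), you could create, review, and then execute a change set against the running stack:

aws cloudformation create-change-set --stack-name everyonelovesaws \
    --change-set-name update-instance-type \
    --template-body file://t3-updated.json

aws cloudformation describe-change-set --stack-name everyonelovesaws \
    --change-set-name update-instance-type

aws cloudformation execute-change-set --stack-name everyonelovesaws \
    --change-set-name update-instance-type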

Exam Preparation Tasks

To prepare for the exam, use this section to review the topics covered and the key aspects that will allow you to gain the knowledge required to pass the exam. To gain the necessary knowledge, complete the exercises, examples, and questions in this section in combination with Chapter 9, “Final Preparation,” and the exam simulation questions in the Pearson Test Prep Software Online.

Review All Key Topics

Review the most important topics in this chapter, noted with the Key Topics icon in the outer margin of the page. Table 3-3 lists these key topics and the page number on which each is found.


Table 3-3 Key Topics for Chapter 3

Key Topic Element | Description | Page Number
Foundation topic | Computing Basics | 65
Figure 3-3 | Comparison of latencies in computing | 68
Table 3-2 | CIDR range examples | 71
Tutorial | Creating a VPC | 72
Section | Connecting a VPC to the Internet | 72
Example 3-2 | Creating an EC2 instance with Node.js | 80
Example 3-3 | Creating an EC2 instance with Python 2.7 using a Lambda function | 82
Example 3-4 | Lambda function IAM policy to create an EC2 instance | 82
Tutorial | Creating an ECS cluster | 85
Tutorial | Creating a task definition | 86
Example 3-6 | Deploying a task in .NET | 87
Tutorial | Installing the EB CLI on Amazon Linux and deploying a sample Elastic Beanstalk app | 99
Section | AWS CloudFormation | 101
Example 3-7 | CloudFormation template used in the CF tutorial | 103
Tutorial | Deploying the CF template in the AWS CLI | 104

Define Key Terms

Define the following key terms from this chapter and check your answers in the glossary:

EC2

ECS

Lambda

VPC

subnet

security group

NACL

HA

SLA

RPO

RTO

YAML

JSON

Q&A

The answers to these questions appear in Appendix A. For more practice with exam format questions, use the Pearson Test Prep Software Online.

1. What approach lays the underlying foundation for high availability of EC2 instances within a region?

2. When an IPv4 instance needs a connection to the Internet, what AWS resource options are available for private and public subnets?

3. In ECS, what is a task definition, and what does it specify?

4. To deploy an EC2 instance in an automated manner, what tools could you use?

5. When a network with more than 500 instances needs to be created, which CIDR range would be recommended?

6. Which service would allow you to make an application highly available across two regions by adding a DNS health check?

7. Name the three types of ELB load balancers.

8. Complete the sentence: An EC2 or EBS snapshot is ______ by nature, meaning it consumes space that is equal to ___________.

9. You need to deliver a one-click solution that will be simple for your web developers to use when they are deploying new applications and changes to those applications. Which tool would you recommend?

10. What is the purpose of the Parameters section in CloudFormation?
