CHAPTER 1

Introduction

In this chapter we briefly discuss the concept of infrastructure as code and DevOps. We also touch upon Chef and Ruby and cover some of the use cases of Opscode Chef and how it is being leveraged to solve technical problems faced by IT (information technology) departments.

Infrastructure as Code

The advent of public cloud computing has revolutionized the software development world. Small companies with a good idea can leverage the pay-per-use model provided by public cloud providers and set up their infrastructure quickly, without any upfront costs.

For traditional IT enterprises, the public cloud brings cost advantages, flexibility, and the agility to set up infrastructure environments very quickly, without waiting for the ordering, procurement, and setup cycles involved in a traditional datacenter.

Most of the public cloud providers deliver APIs (application programming interfaces), which expose the features and functionality of the underlying cloud. Thus the infrastructure that typically used to be a setup and configuration activity in traditional datacenters has now become programmable through APIs.

The infrastructure components like Network, Firewalls, Compute, and Storage are exposed to programmers through APIs and can be consumed through command lines, REST API calls, and so on.

The large-scale infrastructure used by cloud providers and Internet scale companies like Google, Facebook, and Twitter needs a very different approach to setup, monitoring, and management from a typical enterprise with a few thousand servers.

Some of the provisioning and deployment models applicable for large-scale Internet infrastructure are very different from the typical enterprise use cases. The applications and servers of an online business are more homogeneous than the diverse applications and infrastructure found in an enterprise.

Although AWS (Amazon Web Services) does not share details on its capacity or the addition of capacity, it states that every day it adds capacity equivalent to what Amazon.com had in 2005. This kind of massive capacity buildup, and the management of millions of virtual machines, is not possible with technologies, processes, and tools built for a smaller scale.

The public cloud is built on the principles of scale-out architecture. Thus, rather than adding compute resources to a virtual machine, applications quickly spin up new machines when demand increases and gracefully shut down machines when demand decreases. This has become essential since cloud providers charge the customer on the basis of metered usage of services. Thus, if you are using a virtual machine in a cloud environment for a few hours, you will be billed only for the hours of usage.

The cloud providers provide integrations and APIs that make upscaling and downscaling of resources simple. Customers benefit by having capacity when needed and being billed only for what they use.

Today, a range of new technologies has emerged which makes the task of managing large-scale infrastructure and application landscape much easier.

Infrastructure as code emerged in the last few years because of advances in two technologies and the rise of consumer IT companies. Cloud computing and new web frameworks made it simpler to develop scale-out applications and created the technologies that enabled infrastructure as code.

The cloud and the new web frameworks have essentially democratized innovation and IT. No longer do you need expensive equipment and a datacenter setup to start your innovative company. The cloud provides seemingly limitless capacity to fulfill the needs of developers and startups with zero capital expenditure. You can be up and running on a prototype using your credit card. Thus smaller companies can now compete with their larger competitors, and the advantage that large organizations have by virtue of capital and infrastructure no longer remains a differentiator.

The idea of the cloud and the newer web development languages and frameworks was all about simplicity. The cloud made it simple for organizations to set up infrastructure, and the new web frameworks and languages like Ruby on Rails made it simpler, easier, and faster to develop applications.

Startup companies also have to operate within tight budgets; they do not have the luxury of spending money on operations and operations teams. Thus, the developers had to find a way to make operations as automated as possible, and the convergence of all the new technologies, along with the needs of developer communities and large-scale Internet companies, resulted in the fructification of the concepts of DevOps and infrastructure as code.

A lot of changes have led to this new breed of configuration management tools that help in automating your infrastructure. These tools help you in maintaining a blueprint of your infrastructure by breaking it down into components that interact with each other so that you can deploy it whenever you want.

It is important to understand that “infrastructure” does not mean infrastructure in the traditional IT definition, which is network devices, servers, firewalls, and so on. By infrastructure, we mean a collection of components that are used to deliver a service to the end user. The components can be virtual machines, network settings, configuration files, software packages, applications, processes, users, and so on.

Jesse Robbins describes the goal of infrastructure as code:

“Enable the reconstruction of the business from nothing but a source code repository, an application data backup, and bare metal resources.”

Thus, infrastructure as code tools like chef came into the picture. Chef enables developers to assemble and consume infrastructure components similarly to the way software components are designed, assembled, and consumed.

Figure 1-1 shows the different types of components of infrastructure.

9781430262954_Fig01-01.jpg

Figure 1-1. Infrastructure components

Infrastructure components are abstracted similarly to the way abstract classes and interfaces work in a software module.

Chef and other automation tools allow you to define objects and the methods available on them; as an example, a package object may provide methods for installing and removing the package.

The beauty of this approach is that the administrators of the end systems do not have to worry about the implementation details of how each component is deployed by the system and can focus on the exact task to be achieved.

Infrastructure is created as a blueprint in a software system which is executed by a provider on the end device. The provider provides the execution code based on the capabilities of the end device. Thus, the abstraction of the provider brings simplicity, and the developers can reuse the providers as per the needs of the application. The provider model encapsulates the execution aspects of the end system, and thus it greatly simplifies the work of the administrator.
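The split between a blueprint and the provider that executes it can be sketched in plain Ruby. This is only an illustration under stated assumptions, not chef's own code: the class names PackageResource, AptProvider, and YumProvider, the PROVIDERS table, and the command_for helper are all hypothetical, and the providers merely return the command they would run instead of executing it.

```ruby
# A minimal sketch of the resource/provider split, in plain Ruby.
# All class and method names here are hypothetical; this is not chef's code.

# The resource is the blueprint: it says WHAT should exist.
class PackageResource
  attr_reader :name, :action

  def initialize(name, action = :install)
    @name = name
    @action = action
  end
end

# Providers know HOW to carry out the blueprint on one kind of system.
# They return the command as a string instead of running it.
class AptProvider
  def run(resource)
    verb = resource.action == :install ? 'install' : 'remove'
    "apt-get #{verb} -y #{resource.name}"
  end
end

class YumProvider
  def run(resource)
    verb = resource.action == :install ? 'install' : 'remove'
    "yum #{verb} -y #{resource.name}"
  end
end

# Map each platform to the provider that handles it.
PROVIDERS = {
  'ubuntu' => AptProvider.new,
  'centos' => YumProvider.new
}.freeze

def command_for(resource, platform)
  PROVIDERS.fetch(platform).run(resource)
end

puts command_for(PackageResource.new('httpd'), 'centos')
puts command_for(PackageResource.new('apache2', :remove), 'ubuntu')
```

The administrator only writes the resource; which provider runs it is decided by the platform of the end device.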

Once the blueprint has been created, the same model can be applied multiple times to multiple similar endpoints.

The automation aspects of these tools also allow the endpoints to be audited against a specific baseline, and if an endpoint's state is different from what it should be, systems like chef can automatically bring the endpoint back to the expected state of configuration.

The blueprint can be used to create various environments easily and quickly, and you can easily provision development, test, QA, and production environments using chef.

Without infrastructure as code and tools like chef, it would take days of effort from multiple teams to create these environments.

The additional benefit of this approach is that the complete environment becomes documented and modeled in a tool. Thus, using chef as a tool helps organizations to have a scalable and agile approach to configuration management and the deployment of infrastructure components. Automation using configuration automation tools like chef would save precious man-hours, which can be utilized for service improvement and the creation of new services. This also leads to significant cost savings as well as higher quality of service because of fewer human errors.

Overview

Chef is a framework that makes it easy to manage your infrastructure. Chef was initially written in Ruby, but the latest version is a mixture of Erlang and Ruby. A single chef server can handle up to 10,000 nodes.

With chef, we can

  • Manage both our physical and cloud servers.
  • Create perfect clones of our environments.
  • Easily configure applications that require knowledge about your infrastructure via ‘Search.’

Once we have automated our infrastructure with chef, we can replicate the whole infrastructure very easily. Chef can be mainly broken down into three components.

  • Server: The chef server holds the configuration data for each and every node registered with it.
  • Workstation: A workstation basically holds the local chef repository.
  • Node: A node is a client that is registered with the chef server. It has an agent known as chef client installed on it.

Cookbooks, covered in Chapter 7, are also a very important part of chef. Cookbooks are the basic building blocks of chef. They hold the type of configuration that needs to be done on a node. Each cookbook defines a complete scenario, like package installation and configuration.

Nodes

A node is a server, either physical or virtual, that is managed by chef. A node can also be on the cloud. A node needs to have an agent, known as chef client, installed on it. The agent is used to interact with the chef server. Ohai is a built-in tool that comes with chef and is used to provide node attributes to the chef client so that a node can be configured. There are basically two types of nodes that chef can manage.

  1. Cloud-based: It is basically a node that is hosted on any of the cloud providers (e.g., Amazon or Windows Azure). There is a chef CLI (command line interface) known as knife which can be used to create instances on the cloud. Once deployed, these nodes can be managed with the help of chef.
  2. Physical: It can be hardware or a virtual machine that exists in our own environment.

There are mainly two important components of a node.

  1. Chef client: An agent that runs on each node. The agent contacts the chef server and pulls the configuration that needs to be done on the node. Its main functions include
    1. Registering the node with the chef server.
    2. Downloading the required cookbook in the local cache.
    3. Compiling the required recipes.
    4. Configuring the node and bringing it to the expected state.
  2. Ohai: Chef client requires some information about the node whenever it runs. Ohai is a built-in tool that comes with chef and is used to detect certain attributes of that particular node and then provide them to the chef client whenever required. Ohai can also be used as a stand-alone component for discovery purposes. Ohai can provide a variety of details from networking to platform information.
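To make the idea of node attributes concrete, here is a minimal sketch that gathers a few facts about the local machine using only the Ruby standard library. It is not Ohai and does not use Ohai's API; the collect_node_attributes helper and the key names in the hash are assumptions made for this illustration.

```ruby
require 'socket'
require 'rbconfig'

# Collect a few facts about the local machine into a hash.
# The key names are made up for this sketch; they are not Ohai's attribute names.
def collect_node_attributes
  {
    hostname: Socket.gethostname,
    os: RbConfig::CONFIG['host_os'],
    cpu_architecture: RbConfig::CONFIG['host_cpu'],
    ruby_version: RUBY_VERSION
  }
end

collect_node_attributes.each do |key, value|
  puts "#{key}: #{value}"
end
```

A real discovery tool reports far more detail, but the principle is the same: detect attributes on the node and hand them to whatever needs them.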

Workstation

A workstation is a system that is used to manage chef. There can be multiple workstations for a single chef server. A workstation has the following functionalities:

  • Developing cookbooks and recipes.
  • Managing nodes.
  • Synchronizing the chef repository.
  • Uploading cookbooks and other items to the chef server.

There are mainly two important components of a workstation.

  1. Knife: A command line tool used to interact with the chef server. The complete management of the chef server is done using knife. Some of the functions of knife include
    1. Managing nodes
    2. Uploading cookbooks and recipes
    3. Managing roles and environments
  2. Local chef repository: Chef repository is a repository where everything related to the chef server/nodes is stored.

Server

There is a centrally located chef server which holds all the configuration data; this includes cookbooks, the node objects, and metadata for each and every node registered to the chef server.

The agent (chef client) runs on each and every node, and it gets the configuration data from the server and then applies the configuration to a particular node. This approach is quite helpful in distributing the effort throughout the organization rather than on a single server.

There are three different flavors of chef; the first two are types of chef server, while chef solo runs without a server.

  • Enterprise chef
  • Open source chef
  • Chef solo

Enterprise Chef

Enterprise chef is the paid version of the chef server which comes with two types of installations: one is on-premise installation (i.e., in your datacenter behind your own firewall) and the other is the hosted version in which chef is offered as a service hosted and managed by Opscode.

The major difference between the enterprise version and the open source version is that the enterprise version comes with high-availability deployment support and has additional features on reporting and security.

Open Source Chef

The open source chef has most of the capabilities of the enterprise version. However, this version of chef server also has certain limitations. The open source version of chef can be installed only in stand-alone mode (i.e., it is not available in the hosted model). The open source chef components need to be installed on a single server, and it doesn’t offer the levels of security available in the enterprise version. It also doesn’t provide reporting capabilities like the enterprise version.

Chef Solo

Chef solo comes with the chef client package and is used to manage a node without any access to the server. It runs locally on any node, and it requires the cookbook and all of its dependencies to be present on the node itself. This is generally used for testing purposes.

Cookbooks

A cookbook is a basic unit of configuration and policy definition in chef. A cookbook essentially defines a complete scenario. As an example, a cookbook for Apache or Tomcat would provide all details to install and configure a fully configured Apache or Tomcat server.

A cookbook contains all the components that are required to support the installation and configuration of an application or component, including

  • Files that need to be distributed for that component.
  • Attribute values that should be present on the nodes.
  • Definitions so that we need not write the same code again and again.
  • Libraries which can be used to extend the functionality of chef.
  • Recipes that specify the resources and the order of execution of code.
  • Templates for file configurations.
  • Metadata which can be used to specify any kind of dependency, version constraints, and so on.

Chef mainly uses Ruby as its reference language for writing cookbooks and recipes. For writing specific resources, we use an extended DSL (Domain Specific Language).

Chef provides an extensive library of resources which are required to support various infrastructure automation scenarios. The DSL provided by chef can also be extended to support additional capabilities or requirements.

Figure 1-2 shows the basic chef components and how they are used in automation.

9781430262954_Fig01-02.jpg

Figure 1-2. Basic structure of chef

Figure 1-3 shows the chef components in detail.

9781430262954_Fig01-03.jpg

Figure 1-3. Chef components in detail

The Value of Chef

With chef, you can automate your whole infrastructure and rebuild the whole environment very easily. Chef can automate every task that we perform manually in our datacenter in our daily routine and can save lots of time. Figure 1-4 shows a typical environment. We can delete and launch any instance at any point in time; without chef this is done manually, but with chef we can automate the whole process.

9781430262954_Fig01-04.jpg

Figure 1-4. A fully automated infrastructure

Why Chef?

As explained previously, chef gives your infrastructure the flexibility, speed, and efficiency you have always wanted. Automation through chef can provide the speed and agility needed by business today to compete. Chef can be used to quickly provide IT solutions and repeatable configurations with minimal human intervention.

Automating your infrastructure with chef could help you to deploy features in minutes rather than days. Chef can manage any number of servers without much complexity, and thus it helps you in managing your infrastructure easily, at less cost, and while avoiding human errors.

Chef helps your enterprise in moving to public clouds and complements the public cloud model by providing integrations with major public cloud providers.

Core Principles of Chef

Chef is a highly configurable and extensible tool with immense power in the hands of administrators to automate their infrastructure. It provides flexibility, agility, and speed to administrators, and they can leverage the tool the way they best deem fit in their scenarios.

The main principles on which chef works are

  • Idempotence
  • Thick client, thin server
  • Order of execution

Idempotence

Idempotence means that a chef recipe can run multiple times on the same system and the result will be identical. Chef ensures that configuration changes are made to the end system (node) only when the underlying configuration differs from the desired state; no changes are made to the system if they are not needed.

Thus, administrators can define the end configurations, and chef will ensure that the nodes have the desired configuration on them.
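Idempotence can be illustrated with a minimal plain-Ruby sketch. It does not use chef; the converge method and the state hashes are hypothetical names chosen for this example. The method changes a setting only when it differs from the desired value and reports which keys it touched.

```ruby
# A minimal sketch of idempotent convergence, in plain Ruby.
# `converge` and the state hashes are hypothetical names for illustration.

# Bring `current` in line with `desired` and return the keys that changed.
def converge(current, desired)
  changed = []
  desired.each do |key, value|
    next if current[key] == value # already in the desired state: do nothing

    current[key] = value
    changed << key
  end
  changed
end

node_state = { 'apache_installed' => false, 'port' => 8080 }
desired    = { 'apache_installed' => true,  'port' => 80 }

first_run  = converge(node_state, desired)
second_run = converge(node_state, desired)

puts "first run changed:  #{first_run.inspect}"
puts "second run changed: #{second_run.inspect}"
```

The first run changes both settings; the second run finds nothing to do and reports an empty list, which is exactly the behavior idempotence describes.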

Thick Client, Thin Server

Chef uses an agent known as chef client to interact with the chef server.

The chef agent does the heavy lifting; it downloads the required files from the chef server onto a local cache. The chef client is responsible for compiling the client-side code, and then the code is executed by the agent on the node.

The thick client approach of chef makes it highly scalable, since the heavy lifting is done by the agent on each node and not on the server. This makes chef an ideal candidate for large-scale Internet application deployment and management.

Order of Execution

The compilation of recipes on the node is done in the exact order that is specified. The code execution of the agent is also done in the order that it is specified.

Thus, it is important to ensure that the correct order of execution is followed in the creation of recipes, so that the desired results are achieved.

This approach makes sure that a prerequisite is met first so it becomes easier to manage.
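The following plain-Ruby sketch mimics the idea of collecting steps in the order they are written and then executing them in that same order. The MiniRecipe class and its step and run methods are hypothetical and exist only for this illustration; they are not part of chef.

```ruby
# A minimal sketch of "compile, then execute in order", in plain Ruby.
# MiniRecipe and its `step` method are hypothetical; they only mimic the idea.
class MiniRecipe
  attr_reader :log

  def initialize
    @steps = [] # steps are stored in the order they are declared
    @log = []
  end

  # Compile phase: remember the step, but do not run it yet.
  def step(name, &block)
    @steps << [name, block]
  end

  # Execution phase: run every step in declaration order.
  def run
    @steps.each do |name, block|
      block.call
      @log << name
    end
    @log
  end
end

recipe = MiniRecipe.new
installed = false

# The package must be installed before the service can start,
# so it is declared first.
recipe.step('install package') { installed = true }
recipe.step('start service') do
  raise 'package missing' unless installed
end

puts recipe.run.inspect
```

If the two steps were declared in the opposite order, the second block would fail, which is why the prerequisite is always written first.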

Who Uses Chef?

Chef is being used very widely. One of chef’s biggest customers is Facebook. Many Internet companies and enterprises use chef today to automate their infrastructure environments.

Key Technologies

In this section, we discuss some of the technologies that are used in chef—mainly, Ruby and Erlang.

Ruby

Ruby is a simple object-oriented programming language designed to be easy to read and understand and to behave in a predictable fashion. Ruby was designed and developed by Yukihiro “Matz” Matsumoto of Japan and is influenced by languages like Python, Perl, Smalltalk, Eiffel, Ada, and Lisp. Ruby borrows heavily from Perl, and the class library is an object-oriented reorganization of Perl’s functionality. Ruby was released to the general public in 1995, and since then it has drawn devoted coders worldwide. Ruby became popular in 2006 and has been widely used since then.

Chef mainly uses Ruby as its reference language for writing cookbooks and recipes, with an extended DSL. Here we discuss some of the basic concepts of Ruby that might be needed while using chef.

Variables

Variables are used to store a value, such as a string or an integer, which can then be referenced later. We create a variable by assigning a value to it with the assignment operator (=). For example, if we need to assign a numeric value to a variable, x, we would do the following:

x = 20

This would create a variable, x, and would assign a value of 20 to it. Note that variable names start with a lowercase letter or an underscore; a name that starts with an uppercase letter, such as X, is treated by Ruby as a constant.

Figure 1-5 shows assigning values to four different variables. It would create four variables (a, b, c, and d) with values of 10, 20, 30, and 40, respectively.

9781430262954_Fig01-05.jpg

Figure 1-5. Assigning values to variables

Ruby also supports parallel assignment of variables. The same result can be achieved more quickly, using parallel assignment.

Figure 1-6 shows this operation.

9781430262954_Fig01-06.jpg

Figure 1-6. Assigning values to variables using parallel assignment
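As a short sketch of both styles of assignment (the variable names w, x, y, and z and the swap at the end are our own additions for illustration):

```ruby
# One assignment per line.
a = 10
b = 20
c = 30
d = 40

# Parallel assignment: the same four values in a single statement.
w, x, y, z = 10, 20, 30, 40

# Parallel assignment can also swap two variables without a temporary.
a, b = b, a

puts [a, b, c, d].inspect   # [20, 10, 30, 40]
puts [w, x, y, z].inspect   # [10, 20, 30, 40]
```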

Working with Strings

Ruby uses the String class to store strings. A String object provides a number of methods that can be used to manipulate the string in many ways. To create a new empty string, we use the new method of the String class as shown in Figure 1-7.

9781430262954_Fig01-07.jpg

Figure 1-7. Creating an empty string

If we want to create a new string with some value, we can pass an argument in the new method as shown in Figure 1-8.

9781430262954_Fig01-08.jpg

Figure 1-8. Creating a string with some value

There is another way to create a string which uses the String method provided by Kernel, as shown in Figure 1-9.

9781430262954_Fig01-09.jpg

Figure 1-9. Creating a string with some value (kernel method)

Ruby also takes care of many things for us. We can create a string by simply writing it as a literal, as shown in Figure 1-10.

9781430262954_Fig01-10.jpg

Figure 1-10. Initializing a string with some value (direct declaration)
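The four ways of creating a string can be sketched side by side. The variable names and the value 'chef' are our own choices for this example.

```ruby
empty    = String.new          # an empty string
from_new = String.new('chef')  # a string created with an initial value
from_fn  = String('chef')      # the String conversion method provided by Kernel
literal  = 'chef'              # a plain string literal

puts empty.empty?                                # true
puts from_new == from_fn && from_fn == literal   # true
puts literal.upcase                              # CHEF
puts literal.length                              # 4
```

The last two lines show methods being called on the string object itself.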

We can use both single quotes (‘) and double quotes (“) to delimit strings in Ruby. However, there is a difference between the two. Double quotes are used when we want escaped characters like tabs or newlines to be interpreted, while single quotes are used when we need to print the actual sequence.

Figure 1-11 depicts the difference between the two.

9781430262954_Fig01-11.jpg

Figure 1-11. Working with single and double quotes

Ruby can be easily embedded in a string. Figure 1-12 illustrates this process.

9781430262954_Fig01-12.jpg

Figure 1-12. Accessing a variable

We need to use double quotes if we want to embed Ruby in a string. Single quotes won’t work in this case.
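A minimal sketch of the difference between the two kinds of quotes, using a variable called name that we introduce only for this example:

```ruby
name = 'chef'

double = "Hello,\t#{name}"   # \t becomes a tab and the variable is substituted
single = 'Hello,\t#{name}'   # kept exactly as typed

puts double
puts single
puts "2 + 3 = #{2 + 3}"      # any Ruby expression can be embedded: 2 + 3 = 5
```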

Arrays

Like a string, a Ruby array is also an object, which can contain one or more items. These items can be strings, integers, or any other objects. We can create an array in Ruby using a number of mechanisms. We can create an empty array using the new method of the Array class, as shown in Figure 1-13.

9781430262954_Fig01-13.jpg

Figure 1-13. Initializing an empty array

Figure 1-13 creates an array named days_of_month with nothing in it.

We can also create an array with a fixed number of elements in it by passing the size as an argument (see Figure 1-14).

9781430262954_Fig01-14.jpg

Figure 1-14. Initializing an array with five elements

Figure 1-14 will create an array of five elements, each set to nil. If we need to add some data to the array, many options are available (see Figure 1-15). One of them would be to place the same data in each element during the array creation process.

9781430262954_Fig01-15.jpg

Figure 1-15. Initializing an array with some value

We can also create an array by using the [] method of the Array class and specifying the elements one after another, as shown in Figure 1-16.

9781430262954_Fig01-16.jpg

Figure 1-16. Populating different value in each element of an array

We can access any element of a Ruby array by referencing the index of the element. For example, Figure 1-17 shows how to access the second element of the array created in Figure 1-16.

9781430262954_Fig01-17.jpg

Figure 1-17. Accessing an object in an array
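The array operations described in this section can be sketched as follows. Only days_of_month is a name taken from the text; the other variable names and the sample values are assumptions made for this example.

```ruby
days_of_month = Array.new                # empty array
five_slots    = Array.new(5)             # five elements, each nil
five_today    = Array.new(5, 'today')    # five elements, each "today"
days_of_week  = Array['Mon', 'Tue', 'Wed', 'Thu', 'Fri']
same_thing    = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri']   # the usual literal form

puts days_of_month.inspect       # []
puts five_slots.inspect          # [nil, nil, nil, nil, nil]
puts five_today.inspect
puts days_of_week[1]             # Tue (indexes start at 0)
puts days_of_week == same_thing  # true
```

Because indexes start at 0, the second element is at index 1.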

Operators

Ruby's operators can be classified into a number of groups.

  • Assignment operators
  • Math operators
  • Comparison operators
  • Bitwise operators

In Ruby, as in other languages, a number of arithmetic operators can be used to perform a number of functions. Table 1-1 provides a list of these operators.

Table 1-1. Arithmetic Operators

Operator   Function
+          Used to add the operands on both sides of the operator.
-          Used to subtract the right side operand from the left side operand.
*          Used to multiply the values on both sides of the operator.
/          Used to divide the left hand operand by the right hand operand.
%          Used to divide the left hand operand by the right hand operand and return the remainder.
**         Used to raise the left hand operand to the power of the right hand operand.

Figure 1-18 shows the use of the division operator; if we don’t want the result to be truncated then we need to express at least one of the operands as a float.

9781430262954_Fig01-18.jpg

Figure 1-18. Working with operators
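A short sketch of the arithmetic operators, including the difference between integer and float division (the operand values are our own):

```ruby
puts 7 + 2     # 9
puts 7 - 2     # 5
puts 7 * 2     # 14
puts 7 / 2     # 3   (both operands are integers, so the result is truncated)
puts 7 / 2.0   # 3.5 (one float operand gives a float result)
puts 7 % 2     # 1
puts 7**2      # 49
```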

If we need to compare two variables then we need to use comparison operators. Table 1-2 shows a list of comparison operators available in Ruby.

Table 1-2. Comparison Operators

Operator   Function
==         Used to check equality. The output would be true or false.
.eql?      Similar to the == operator, but the operands must also be of the same type; for example, 1 == 1.0 is true while 1.eql?(1.0) is false.
!=         Used to check for inequality. The output would be false in case of equality and true in case of inequality.
<          Used to compare two operands. The output will be true if the first operand is less than the second one and false otherwise.
>          Used to compare two operands. The output will be true if the first operand is greater than the second one and false otherwise.
>=         Used to compare two operands. The output will be true if the first operand is greater than or equal to the second one and false otherwise.
<=         Used to compare two operands. The output will be true if the first operand is less than or equal to the second one and false otherwise.

Figure 1-19 shows the use of comparison operators.

9781430262954_Fig01-19.jpg

Figure 1-19. Working with operators
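The comparison operators can be tried out with a short sketch like the following; the variables a and b and their values are chosen only for illustration.

```ruby
a = 10
b = 20

puts a == b        # false
puts a != b        # true
puts a < b         # true
puts a > b         # false
puts a <= 10       # true
puts b >= 30       # false

# == compares values across numeric types; eql? also requires the same type.
puts 1 == 1.0      # true
puts 1.eql?(1.0)   # false
puts 1.eql?(1)     # true
```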

Ruby bitwise operators allow operations to be performed on numbers at the bit level.
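As a brief sketch of the bitwise operators, using two sample values written in binary notation (the values are our own):

```ruby
a = 0b1100   # 12
b = 0b1010   # 10

puts a & b    # 8   (0b1000) bits set in both
puts a | b    # 14  (0b1110) bits set in either
puts a ^ b    # 6   (0b0110) bits set in exactly one
puts ~a       # -13 all bits inverted
puts a << 2   # 48  shifted left by two bits
puts a >> 2   # 3   shifted right by two bits
```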

Methods

Methods in Ruby are used to organize code into named groups that can be called whenever required. This promotes the reuse of code, so that we do not write the same code again and again.

The following piece of code shows a typical method:

def name( arg1, arg2, arg3, ... )
   .. ruby code ..
   return value
end
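A concrete method following this shape might look like the sketch below. The package_file helper and its arguments are made up for this example; note that the third argument has a default value and that Ruby returns the value of the last expression even without an explicit return.

```ruby
# Build the file name of a package from its name and version.
# `package_file` is a made-up helper for this example.
def package_file(name, version, extension = 'rpm')
  "#{name}-#{version}.#{extension}"
end

puts package_file('httpd', '2.4')            # httpd-2.4.rpm
puts package_file('apache2', '2.4', 'deb')   # apache2-2.4.deb
```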

Erlang

Overview

Erlang is a general-purpose concurrent programming language that is mainly used to build highly available and scalable real-time systems. Erlang is being widely used in many industries like telecom, e-commerce, and so on. Its runtime system provides built-in support for concurrency, fault tolerance, and distribution.

Erlang places a strong focus on high reliability and concurrency. Erlang can perform many tasks at a time. It uses an actor model to achieve this (i.e., each actor is treated as a separate process in a virtual machine). For example, consider yourself to be an actor in Erlang’s world: you would be a person sitting alone in a dark room waiting for a message, and as soon as you receive a message you provide a valid response.

With the help of this actor model, Erlang is able to perform many tasks concurrently, which improves overall throughput. We can treat this actor model as a world where everyone can perform a few distinct tasks and just waits to receive a proper message. It means everyone is dedicated to a specific task and is not concerned about what other people are up to. To achieve this, we write processes (actors) in Erlang, and these actors do not share any kind of information. Every communication that takes place is traceable, safe, and explicit. The ability of Erlang to scale, recover, and organize code adds to its appeal.

The main reason Erlang is able to scale so easily is that its processes are very lightweight, so a large number of them can exist at once. Although it is not required to use all of them at a time, you have them as a backup and can use them if required.
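The actor idea can be approximated in Ruby, the language used for the other examples in this chapter. This is only a rough analogy and not Erlang: Ruby threads are much heavier than Erlang processes, and the names spawn_doubler, mailbox, and reply_to are hypothetical. Each actor owns a mailbox, shares no variables with its caller, and communicates only through messages.

```ruby
# A rough analogy of the actor model in plain Ruby (not Erlang).
# Each "actor" is a thread with its own mailbox (a Queue); actors share
# no variables and talk only by sending messages.
# The names `spawn_doubler`, `mailbox` and `reply_to` are hypothetical.

def spawn_doubler
  mailbox = Queue.new
  thread = Thread.new do
    loop do
      message = mailbox.pop        # wait until a message arrives
      break if message == :stop

      number, reply_to = message
      reply_to << number * 2       # answer by sending a message back
    end
  end
  [mailbox, thread]
end

mailbox, actor = spawn_doubler
replies = Queue.new

mailbox << [21, replies]           # sending does not wait for the actor
puts replies.pop                   # 42

mailbox << :stop
actor.join
```

The caller never touches the actor's internal state; it only sends a message and waits for a reply message, like the person in the dark room described above.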

Evolution and History

In 1984, the Computer Science Laboratory (CSLab) at Ericsson conducted ongoing research on the languages and methodologies that were best suited for applications in the telephony domain. The techniques studied included rule-based programming, imperative programming, declarative programming, and object-oriented programming.

There are some properties that telephony domains demand, such as

  • Fine-grained concurrency: Typical telecommunication involves large equipment, complex real-time systems, and various activities which should occur concurrently and are handled by processes or threads.
  • Asynchronous message passing: This is a basic requirement of telephone systems. Asynchronous message passing gives ways to distribute processing.

The research done on a variety of languages finally confirmed that a scalable and distributed telephony application cannot be built using any single one of these languages or methodologies. Some parts of an application are best programmed in one methodology and other parts in another.

The primary aim of this research was to develop a style of programming which can lead to beautiful code, and which will also help programmers gain efficiency when writing bug-free code.

Erlang Creation

Joe Armstrong started another experiment with Prolog, and gave the name Erlang to this new experimental language after the Danish mathematician Agner Krarup Erlang, creator of the Erlang loss formula. Erlang can be defined as a concurrent functional programming language which mainly follows two traditions (see Figure 1-20).

9781430262954_Fig01-20.jpg

Figure 1-20. How Erlang evolved

  • Functional and logic programming languages: Erlang inherits lists, pattern matching, atoms, catch and throw, and so on, from these languages. Examples of these types of languages are Lisp, Miranda, Haskell, and ML.
  • Concurrent programming languages: Erlang uses features like process communication modules and processes from these types of languages. Examples of these types of languages include Modula, Chill, and Ada.

Erlang was created while keeping in mind various design goals that are ideal for telephony applications. It offers features like concurrency, OS independence, garbage collection, tail recursion, various data types and collections, a selective message receive statement, asynchronous message passing, and default error handling.

Erlang Features

Concurrency

Erlang implements concurrency independently of the operating system. Processes in Erlang have no shared memory. Different processes communicate with each other by sending and receiving messages asynchronously. These processes are very lightweight; hence hundreds of thousands of processes can run at a time, and their memory requirement varies dynamically. Erlang is useful for applications that require response times on the order of milliseconds.

Distributed

Erlang supports transparent distribution. An Erlang program can run on more than one machine, each of which may be running a different operating system. An Erlang process on one node communicates with a process on another node using asynchronous message passing.

Sequential Erlang

The syntax of Erlang is quite similar to that of ML. It has data types like numbers, lists, and tuples and it uses pattern matching to select between alternatives. Recursion is used to construct loops.

Robust

When an Erlang process crashes, only that process goes down, not the entire system. Erlang processes can monitor each other so that if there is an error in one process, others can receive the error message. This allows monitoring processes to take corrective actions, such as restarting transactions. In distributed systems, nodes can be configured to provide failover scenarios. Due to this feature of Erlang we are able to design soft-fail systems. For example, an error in a call in a telecommunication system will bring down that call only and not the entire system.

Software Upgrading in Running Systems

Software upgrades in Erlang can be performed without disturbing the current state of the system. We can directly change the code in the running system, which means we can upgrade a system without disturbing the currently running operations.

Newly spawned processes will use the new version of the module while ongoing processes will use the old one and remain undisturbed.

Portability

Erlang has been developed mainly in C, so it is available on most of the operating systems that support C.
