Basic concepts

In networking, load balancing is designed to distribute workloads across multiple network resources. pfSense supports two types of load balancing: gateway load balancing and server load balancing:

  • The purpose of gateway load balancing is to distribute internet-bound traffic over more than one WAN interface. Thus, configuration is separate from server load balancing, and is done by configuring gateway groups by navigating to System | Routing and clicking on the Gateway Groups tab.
  • The purpose of server load balancing is to distribute traffic among multiple internal servers. In some cases, we may also want redundant servers available for failover, and pfSense supports this as well. Server load balancing improves both reliability and availability, because only servers that are online respond to client requests. A good load balancer will also direct traffic away from servers that already have high levels of traffic and toward servers that have lower levels, so that no single server is taxed too heavily. A load balancer should also provide a means of adding and removing servers from the server pool as needed.

Gateway load balancing merits a separate discussion, and we will hold off on discussing it until we address various multi-WAN configurations in the next chapter. Instead, we will discuss load balancing as a means of distributing traffic among multiple servers. 

Load balancing can be either hardware-based or software-based. If it is hardware-based, then the load balancing functionality is provided via firmware. Updating the firmware, whether to provide bug fixes, new features, or security patches, can be difficult, and in most cases is harder than a simple software update. The advantage, however, is that hardware load balancers tend to have dedicated circuits that can optimize load balancing for specific cases, and they also tend to support a certain number of connections and amount of traffic right out of the box. Load balancing was once done exclusively by hardware devices; today it can also be done in software, and pfSense is an example of software-based load balancing. Software load balancers can be much less expensive and easier to update, but they can be sensitive to changes made in the underlying operating system. When using software-based load balancing, we have to make sure that OS changes do not adversely affect the load balancer.

One way we can differentiate between forms of server load balancing is by the type of algorithm they use. Some of the common algorithms used are as follows:

  • Random: Requests are directed to a randomly chosen server. Provided that there is a large enough number of requests, this should result in good load balancing.
  • Round-robin: Requests are distributed across the pool of servers sequentially. This can lead to an uneven load if some requests are more demanding of resources than others, but it is a simple algorithm that provides good average case outcomes.
  • Weighted round-robin: This is a variant of round-robin in which each server is assigned a weight. Servers with higher weights service more requests than those with lower weights.
  • Least connection: This algorithm takes into account the number of requests each server is currently servicing. The server that is servicing the fewest requests (and therefore has the fewest connections) is chosen. There is also a weighted variation of this algorithm in which a weight is assigned to each server. In the case of a tie between two or more servers with an equally low number of connections, the request is serviced by the server with the highest weight.
  • Least traffic: With this algorithm, the system monitors the bit-rate of outgoing traffic from each server. The request is serviced by the server with the least outgoing traffic. As with the other algorithms, there is a weighted version.
  • Least latency: In order to determine which server services a request, the system makes an additional request to servers in the pool. The first server to respond gets the request. There is a weighted version of this algorithm as well. 
  • IP hash: This algorithm uses the source and destination IP addresses to generate a hash that determines which server services the request. One of the advantages of this method is that if the connection is interrupted, when it is resumed, requests will be directed to the same server as long as the source and destination IP addresses remain the same. Thus this algorithm is ideal for scenarios that demand that requests from the same source IP address always reach the same server (for example, e-commerce sites, in which resumed connections should be directed to the server holding the customer's shopping cart).
  • URL hash: Similar to IP hash, but the hash is done on the URL requested. This is a good choice for a pool of proxy servers, because it guarantees that requests for a certain URL are always directed to a specific server, thus eliminating duplicate caching.
  • SDN adaptive: SDN stands for Software-Defined Networking; this algorithm takes information about the network from layers 2 and 3 and combines it with information about the data from layers 4 and 7. It uses this information to decide which server will handle the request. The decision is based on the status of the network, the status of applications currently running, the overall health of the network, and the health of the network's infrastructure.
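To make a few of these algorithms concrete, here is a minimal sketch of round-robin, weighted round-robin, and IP hash selection. The server addresses and weights are hypothetical, and a real load balancer would track live connection state; this only shows the selection logic itself.

```python
import hashlib
from itertools import cycle

servers = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]  # hypothetical backend pool

# Round-robin: hand out servers in a fixed, repeating order.
_rr = cycle(servers)
def round_robin():
    return next(_rr)

# Weighted round-robin: repeat each server in the rotation
# proportionally to its weight, so heavier servers appear more often.
weights = {"10.0.0.11": 3, "10.0.0.12": 1, "10.0.0.13": 1}
_wrr = cycle([s for s in servers for _ in range(weights[s])])
def weighted_round_robin():
    return next(_wrr)

# IP hash: the same (source, destination) pair always maps to the
# same server, which keeps resumed connections on one backend.
def ip_hash(src_ip, dst_ip):
    digest = hashlib.sha256(f"{src_ip}|{dst_ip}".encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]
```

Note how IP hash trades even distribution for stickiness: distribution is only as uniform as the hash of the client population, but a given client keeps hitting the same backend.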

Another way of differentiating between types of load balancing is to consider whether it is done on the client side or the server side. Client-side load balancing relies on the client randomly choosing an internal server from a list of IP addresses provided to it. This method relies on the load generated by each client being roughly equal. Although it might seem an unreliable method of load balancing, it is surprisingly effective. Over time, clients will connect to different servers, and the load will be evenly distributed across them. Moreover, this method is very simple to configure, since it doesn't require us to set up server pools.

Client-side load balancing relies on the law of large numbers, a probability theorem that states that the result from a large number of trials should be close to the expected value. In the same way that one side of a traditional die should come up roughly one-sixth of the time if we roll the die enough times, the load for each server in a server pool that generates significant traffic over a long enough period of time should be roughly equal if randomly chosen by the client.
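We can see the law of large numbers at work with a short simulation. The server names and request count below are arbitrary; the point is that independent random choices converge toward an even split.

```python
import random
from collections import Counter

random.seed(42)  # fixed seed so the sketch is reproducible
servers = ["A", "B", "C"]
requests = 60_000

# Each client independently picks a random server from the list it was given.
counts = Counter(random.choice(servers) for _ in range(requests))

# With enough requests, each server's share approaches one-third.
for server, n in sorted(counts.items()):
    print(server, round(n / requests, 3))
```

Run with a small number of requests instead, and the shares can diverge noticeably, which is why client-side balancing only works well at scale.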

Server-side load balancing uses software to listen on the port/IP address to which external clients connect. When a client connects, the software forwards the request to one or more backend servers. There are multiple advantages to this approach:

  • Since the software decides which backend server the client is forwarded to, server-side load balancing tends to be able to guarantee effective load balancing at all times.
  • The load balancing process is transparent to the client, who has no idea of our network configuration.
  • Since the client doesn't connect directly to the backend server, this approach tends to be more secure than client-side load balancing. For that matter, since the client doesn't know that we are utilizing backend servers, there is an added security-through-obscurity benefit.
  • We can present a special message to users when all backend servers are down.

The method used in pfSense is server-side load balancing, so all the benefits of this method are available to us. This requires us to set up load balancing pools (containing the backend servers), and one or more virtual servers (to which clients will connect). We will also be required to set up firewall rules to allow traffic to pass to the backend servers, as we shall see when we configure load balancing.

Ideally, our load balancing setup should provide dynamic configuration. It should be easy to add and subtract servers as needed, so that we only use the server capacity that we actually need. Some web hosting and cloud computing services are particularly good in this respect, allowing capacity to scale up in response to traffic spikes without interrupting current connections. pfSense provides the capability to add and remove systems from both a load balancing pool and a CARP failover group.

CARP has a single purpose: to allow multiple hosts to share the same IP address or group of IP addresses. To understand how this works, we have to revisit a concept originally introduced in Chapter 4, Using pfSense as a Firewall. As you may recall, virtual IPs allow multiple devices to share the same (virtual) IP address. We can configure two routers to share the same virtual IP address (each must also have a unique non-virtual IP address). One of these routers is designated as the master, and the other is designated as the slave. Under normal circumstances, the master router handles traffic and ARP requests for the shared virtual IP address. When the master goes down, the slave takes over. This ensures that the network is able to function normally even when there is a hardware failure.

CARP itself provides a means of ensuring that the slave router knows when the master is down. When the slave router stops receiving CARP advertisements from the master (or in some configurations, when the CARP advertisements from the master become less frequent than advertisements from the slave), then the slave router begins handling traffic for the CARP virtual IPs.

There are also two protocols to ensure synchronization between the two routers. pfsync is for state synchronization, and Extensible Markup Language – Remote Procedure Call (XML-RPC) is for configuration synchronization. The pfsync connection between the two routers is achieved via a crossover cable on a dedicated network interface.
