Chapter 7. Virtual Hosts

 

“Pilgrim, how you journey on the road you chose To find out where the winds die and where the stories go”

 
 --Pilgrim, A Day Without Rain, Enya

Fortunately, you don't need a separate Apache server for each Web site that you want to run. Virtual hosting is the term given to the capability to run multiple Web sites on the same computer and on the same Apache server process[1].

There are a number of different techniques for setting up virtual hosts. This chapter covers these various techniques and offers some examples for setting up common configurations.

In all the various methods for doing virtual hosting, the concept is basically the same. The user goes to a URL specifying a particular hostname and gets different content for each hostname. Generally, the user is not aware that they are loading content from the same physical computer system. Somehow, Apache determines which Web site you are requesting content from, and gives that to you, even though all the different sites are running on the same Apache daemon.

The two most common ways of accomplishing this are IP-based and name-based virtual hosting.

IP-Based Virtual Hosts

With IP-based virtual hosting, each hostname on the server is given its own IP address.

Setting Up Multiple IP Addresses

All modern operating systems allow you to have more than one IP address on one physical network card. Earlier operating systems actually required you to add an additional network card for each new IP address. That is no longer the case, however, in any operating system you are likely to encounter.

The specific details of how this is accomplished—the exact procedure for putting multiple IP addresses on your network interface—will vary from OS to OS and you need to consult your documentation.

On the other hand, if you do have multiple network cards in your machine, IP-based virtual hosting will work with that setup as well.

In addition to setting up the IP addresses on your machine, you will also need to set up the DNS records that will direct the hostnames to the IP addresses you have assigned to your server. That, also, is beyond the scope of this book. Contact your DNS administrator to add the hostnames to your DNS zone, or to register a new DNS domain. You can't just make up hostnames and have them magically work.

If you are not able to add records to DNS, or if you just want to test, you can access the IP-based virtual hosts by simply using the IP address in the URL. For example:

http://192.168.5.10/

Configuring the Virtual Host

After you have your IP addresses set up and have the DNS records pointing the correct names to the correct IP addresses, you can proceed with configuring your Apache server to answer to these names.

This is done in a <VirtualHost> section, as was mentioned in Chapter 4, “Configuration Directives.” All the configuration directives for a virtual host are contained in the <VirtualHost> section, with one section per virtual host. A <VirtualHost> section looks like the following:

<VirtualHost 192.168.1.2>
  ServerName vhost1.apacheadmin.com
  ServerAlias www.vhost1.apacheadmin.com
  DocumentRoot /usr/local/apache/vhosts/vhost1
  ErrorLog logs/vhost1.error
  TransferLog logs/vhost1.access
</VirtualHost>

The address in the <VirtualHost> directive can be specified as a hostname, rather than as an IP address, but it is highly recommended that you use the IP address instead. The reason for this is simple. If, when the server is rebooting, it cannot immediately contact a DNS server to determine the IP address of the VirtualHost, it will simply start up without that particular VirtualHost being loaded. Apache needs the IP address, not the name, to answer requests, so if it cannot determine the IP address from the name it is simply unable to load the configuration. Using the IP address avoids this lookup and ensures that the server will start correctly, with all host configurations loaded, even if the network is unavailable at the time the server is coming up.

The server uses the name of the virtual host, specified by the ServerName directive, when it constructs self-referential URLs, such as for a redirect.

It's a good idea to have a separate log file for each of your virtual hosts, although it is not required. By logging each host separately, you can much more easily determine problems on a per-host basis. If all your hosts log to the same log files it becomes very difficult to isolate problems when they occur because they become buried in with entries from all the other hosts. See Chapter 24, “Logging,” for more information.

You only need to put directives in a VirtualHost section when the values are different from those set in the main server configuration. All other values are inherited from the main server.

Name-Based Virtual Hosts

Name-based virtual hosts are the same as IP-based virtual hosts in almost every way, except you don't need more than one IP address. By having more than one name pointing to the same IP address, you can host arbitrarily many virtual hosts on the same IP address.

Note

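Additional hostnames that point to an IP address that already has a name are often set up in DNS as CNAME (canonical name) records, commonly called “cnames.” Several address (A) records pointing to the same IP address work just as well.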

The configuration is almost identical to that of IP-based virtual hosts, except that you need to tell Apache, with the NameVirtualHost directive, which IP addresses on your server will be used for name-based virtual hosts.

NameVirtualHost 192.168.1.3

<VirtualHost 192.168.1.3>
    ServerName vhost1.apacheadmin.com
    ServerAlias vhost1
    DocumentRoot /usr/local/apache/vhosts/rhiannon/htdocs
</VirtualHost>

<VirtualHost 192.168.1.3>
    ServerName vhost2.apacheadmin.com
    ServerAlias vhost2 www.vhost2.apacheadmin.com
    DocumentRoot /usr/local/apache/vhosts/demo/htdocs
</VirtualHost>

The name of the server, specified by the ServerName directive (or one of the names given with ServerAlias), is used to determine which virtual host serves the request. The browser supplies the name of the host that it is trying to connect to in the request headers, and Apache uses this information to map the request to the correct files or other resources.

Older browsers[2] were unable to use name-based virtual hosts because they did not supply this request header.

More specifically, clients or proxies that support only the HTTP 1.0 protocol might fail to get the right virtual host because the Host header is not part of the HTTP 1.0 protocol, and is required for name-based virtual hosting.

However, all currently available browsers support the HTTP 1.1 protocol, which requires the client to send the Host header with every request. And almost all HTTP 1.0 clients and proxies support the Host header as an extension to the 1.0 protocol.

The Apache documentation contains instructions for working around this limitation in older browsers, if you think that it is worth the effort. However, the solution is inelegant and might not be necessary for your site. You should watch your server logs to see if you are getting visits from browsers old enough to warrant this sort of work-around. You might want to consider using IP-based vhosts if you feel older browsers are a large enough portion of your visitors.

Note the use of the ServerAlias directive in the previous examples. This directive is useful when a particular site can be accessed by more than one name. Two specific examples, illustrated previously, come to mind. In the first example, I have a host that can be accessed from the inside, or from the outside, of my network. Inside the network, or from the machine itself, I don't need to type the entire name of the machine because it will check the local domain first, so the ServerAlias allows me to do this. In the second example, I have added a ServerAlias of www followed by the original hostname. It has been my experience in recent years that people are so trained to expect Web addresses to start with www that they are incapable of typing a URL without it. Simply adding that to the hostname saves a lot of time on the phone explaining to people that the www is not necessary.
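
The matching of the Host header against ServerName and ServerAlias described in this section can be illustrated with a minimal Python sketch. This is not Apache's implementation; the function select_vhost and the VHOSTS list are hypothetical names made up for this example, and the list simply mirrors the two <VirtualHost> sections shown earlier. The sketch also assumes that a request matching no name falls back to the first virtual host listed for the address.

```python
# Hypothetical data mirroring the two <VirtualHost 192.168.1.3> sections above.
VHOSTS = [
    {
        "ServerName": "vhost1.apacheadmin.com",
        "ServerAlias": ["vhost1"],
        "DocumentRoot": "/usr/local/apache/vhosts/rhiannon/htdocs",
    },
    {
        "ServerName": "vhost2.apacheadmin.com",
        "ServerAlias": ["vhost2", "www.vhost2.apacheadmin.com"],
        "DocumentRoot": "/usr/local/apache/vhosts/demo/htdocs",
    },
]


def select_vhost(vhosts, host_header):
    """Return the vhost entry whose ServerName or ServerAlias matches the Host header."""
    # Drop an optional ":port" suffix; hostnames compare case-insensitively.
    requested = host_header.partition(":")[0].lower()
    for vhost in vhosts:
        names = [vhost["ServerName"], *vhost["ServerAlias"]]
        if requested in (name.lower() for name in names):
            return vhost
    # Assumption of this sketch: no match falls back to the first vhost listed.
    return vhosts[0]


if __name__ == "__main__":
    print(select_vhost(VHOSTS, "www.vhost2.apacheadmin.com")["DocumentRoot"])
```

A request carrying the header Host: www.vhost2.apacheadmin.com matches the second entry through its ServerAlias, so the content comes from /usr/local/apache/vhosts/demo/htdocs. A client that sends no usable Host header cannot be matched this way, which is the limitation with very old browsers noted above.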

Port-Based Virtual Hosts

It's not a very common practice, but it is also possible to set up virtual hosts by varying the port number that the server is running on, rather than the host name or IP address. The configuration for such a setup would look like this:

<VirtualHost 192.168.1.103:75>
ServerName vhost.apacheadmin.com
ServerAlias vhost
DocumentRoot /usr/local/apache/vhosts/strange
</VirtualHost>

You must also add a Listen directive for each additional port on which you want your server to listen. The Listen directive would look like this:

Listen 75

If you choose a port below 1024, you will need to be root to start the server. Stated differently, you can run a Web server as an unprivileged user by choosing port 1024 or higher.

This host can be accessed via the URL http://vhost.apacheadmin.com:75

Note that SSL, which runs on a different port from unencrypted HTTP, is generally set up in the configuration file as a port-based virtual host. Browsers know that when a URL begins with https:// rather than http://, the connection is to be made on port 443 rather than 80, so users do not have to type the port number.

In Chapter 22, “SSL,” you'll learn about SSL, which is a technology that provides for secure encrypted connections on the Web. For reasons that will be made more apparent there, you cannot use name-based virtual hosting in conjunction with an SSL site.

The short form is that the negotiation of the connection encryption takes place before the client has a chance to tell the server which named host it wanted to connect to. Consequently, by the time it gets to that stage, it might have already negotiated a secure connection to the wrong site.

So, if you want to put up a secure Web site using SSL, you have to have a unique IP address for each SSL-enabled virtual host.

Bulk Virtual Hosting

Frequently, when you're running virtual hosts, you'll find that the number of hosts grows faster than your ability to sensibly manage them. A few techniques you might use to simplify the task of managing these hosts' configurations follow.

Per-vhost Configuration Files

As recommended in Chapter 4, when you are configuring your virtual hosts, you might consider putting each virtual host's configuration into its own individual file. Then you could place these files into a subdirectory of your conf directory, perhaps called vhosts. Then add the following directive to your main configuration file, httpd.conf:

Include conf/vhosts/

Note that the directory path given in the example is relative to the ServerRoot, and not an absolute path.

Apache will read all files in the specified directory and parse directives found in those files. Therefore, you cannot have any files in this directory that are not configuration files, such as temporary files, Readme files, and so on.

The more virtual hosts you have the longer it is going to take to parse all the vhost configurations[3], and, therefore, the longer it is going to take for your server to start up.

mod_vhost_alias

When you are running more than just a few virtual hosts, and start getting into the tens or even hundreds, you will notice that your Apache server takes a substantial amount of time to start. During this time, your server is not responding to HTTP requests. That is, while your server is starting, or restarting, it is effectively unavailable to the end users. When you are a service provider (which, as a server admin, you really are), this sort of downtime needs to be avoided whenever possible.

mod_vhost_alias is one of the modules available for making bulk virtual hosting more efficient. If each of your virtual hosts has essentially the same configuration, you can configure them all with one set of directives.

mod_vhost_alias provides just four directives—two for use with name-based virtual hosts, and two for use with IP-based virtual hosts.

If you are using name-based virtual hosts, the directives that you will be using are VirtualDocumentRoot and VirtualScriptAlias. These directives mean exactly what their names imply, but the syntax is a little unusual. The value given to the directives will contain one or more variables into which will be substituted all or part of the hostname being requested by the client. If you are familiar with C, or similar programming languages, these variables will remind you of arguments to the sprintf function. The following things can appear in the directive value.

Template   Meaning
%%         A literal % character.
%p         The port number of the virtual host being requested.
%N.M       All or part of the host name, depending on the values of N and M.

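The values N and M select, respectively, which portion of the dot-separated hostname is inserted and which characters of that portion are used. M is optional; if it is omitted, the whole portion is used. M is numbered the same way as N, counting characters instead of parts, so %-2.1 is the first character of the next-to-last part.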

The interpretation of the value of N is as follows:

0      The whole name
1      The first part
2      The second part
-1     The last part
-2     The next-to-last part
2+     The second and all following parts
-2+    The next-to-last part, and all preceding parts

1+ and -1+ would mean exactly the same thing as 0.

If the value given results in selecting more of the name than there actually is available to select, then a single underscore is interpolated in place of the given variable.

This will all be made much clearer by several examples.

The trivial example is to use the full hostname in the directive, as follows. In your configuration file, put a directive that looks like this:

VirtualDocumentRoot /usr/local/apache/vhosts/%0/htdocs

Then, any incoming request for a valid virtual host—that is, any hostname that DNS points to your server—will have files served out of a directory named by the hostname. For example, a request for the URL http://www.boxofclue.com/vhosts.html will get the file located at /usr/local/apache/vhosts/www.boxofclue.com/htdocs/vhosts.html.

This technique has one large problem: Most virtual hosts can be accessed by more than one hostname. For example, if the previous URL was requested instead as http://boxofclue.com/vhosts.html, which should give the same resource, Apache will attempt to serve the file /usr/local/apache/vhosts/boxofclue.com/htdocs/vhosts.html, which is not the same file path it tried in the other case. It might either be a different file or not exist at all.

This dilemma can be solved in a few different ways. The simplest way around it is to create symbolic links from all possible alternate file paths to the “correct” file path, and allow Apache to locate the files in that way. However, one of the major reasons for using this module in the first place is to reduce the number of administrative tasks required to create a new virtual host, so this is hardly ideal.

The better way to solve this is to use a different combination of variables provided by mod_vhost_alias to construct unique filepaths per virtual host.

The following example proposes one such configuration option. Put this directive in your configuration file:

VirtualDocumentRoot /usr/local/apache/vhosts/%-1/%-2/htdocs

The variable %-1 will evaluate as the last part of the hostname—usually com, net, org, or some other top level domain (TLD). So, your virtual hosts will be divided into subdirectories by their TLD.

The second variable, %-2, evaluates as the next-to-last part (or, as the documentation refers to it, the penultimate part) of the hostname. For example, for the hostnames www.boxofclue.com and boxofclue.com, %-2 will evaluate to the string boxofclue, and files will be served out of the directory /usr/local/apache/vhosts/com/boxofclue/htdocs, giving you a more manageable subdivision of your virtual host directories.

In the event that you have many hundreds of virtual hosts, as is the case for some large ISPs, you might want to subdivide your directories even further. For example, you might split hosts into subdirectories alphabetically, as follows:

VirtualDocumentRoot /usr/local/apache/vhosts/%-1/%-2.1/%-2/htdocs

In this configuration, files for the host www.boxofclue.com will be served out of the directory /usr/local/apache/vhosts/com/b/boxofclue/htdocs.

This subdivision can continue to any depth you like, as required by the number of virtual hosts you are serving. You could, for example, further subdivide with the following directive:

VirtualDocumentRoot /usr/local/apache/vhosts/%-1/%-2.1/%-2.1%-2.2/%-2/htdocs

In this case, files for the host www.boxofclue.com will be served out of the directory /usr/local/apache/vhosts/com/b/bo/boxofclue/htdocs. The variable combination %-2.1%-2.2 evaluates to the first letter, followed by the second letter, of the next-to-last part of the hostname; this is what gives the subdirectory bo.

Continue this subdivision until you have few enough hosts per directory to keep track of them.

Note that you can use this same technique to have each virtual host served out of the home directory of the particular user, if you choose usernames appropriately to map directly to the hostnames of their respective sites.
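
Before putting a template into production, it can help to expand it for a few hostnames and see which directories come out. What follows is a minimal Python sketch of the interpolation rules as this section describes them, not the module's own code. The names TOKEN, pick, and interpolate are made up for this example, and the default port of 80 is an assumption.

```python
import re

# Matches %%, %p, and %N or %N.M, where N and M may be negative and may end in "+".
TOKEN = re.compile(r"%(?:(%)|(p)|(-?\d+)(\+?)(?:\.(-?\d+)(\+?))?)")


def pick(items, index, plus):
    """Select from a sequence using the 1-based numbering in the table above.

    Works on a list of hostname parts or on a string of characters.
    Returns None when the index reaches past either end of the sequence.
    """
    if index == 0:
        return items
    if abs(index) > len(items):
        return None
    if index > 0:
        return items[index - 1:] if plus else items[index - 1:index]
    end = len(items) + index + 1  # one past the element counted from the end
    return items[:end] if plus else items[end - 1:end]


def interpolate(template, hostname, port=80):
    """Expand %%, %p, and %N.M in template for the requested hostname."""
    parts = hostname.split(".")

    def expand(match):
        percent, p, n, n_plus, m, m_plus = match.groups()
        if percent:
            return "%"
        if p:
            return str(port)
        chosen = pick(parts, int(n), bool(n_plus))
        if chosen is None:
            return "_"  # more parts requested than the name has
        text = ".".join(chosen)
        if m is None:
            return text
        chars = pick(text, int(m), bool(m_plus))
        return "_" if chars is None else chars

    return TOKEN.sub(expand, template)


if __name__ == "__main__":
    template = "/usr/local/apache/vhosts/%-1/%-2.1/%-2/htdocs"
    for host in ("www.boxofclue.com", "boxofclue.com"):
        print(host, "->", interpolate(template, host))
```

With this sketch, the template /usr/local/apache/vhosts/%-1/%-2/htdocs expands to /usr/local/apache/vhosts/com/boxofclue/htdocs for both www.boxofclue.com and boxofclue.com, which is the reason for preferring %-1 and %-2 over %0.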

Running Multiple Daemons

In very rare cases, you might want to run more than one Apache server process on the same machine to handle different virtual hosts. This might be done, for example, when you need a very different set of modules for different Web sites. You could run one Apache process to serve static HTML pages and images, and a separate Apache process running mod_perl to serve your dynamic content.

In these cases, all that is required is that you maintain separate configuration files, and start the Apache server with the -f flag to specify a configuration file located somewhere other than the location specified when the server was built.

/usr/local/apache/bin/httpd -f /usr/local/apache/conf/host_two.conf

Summary

Virtual hosts provide the best way to serve multiple Web sites from the same physical server machine and, therefore, make the best use of your available resources.



[1] Webster's dictionary defines the word “virtual” as follows: “being such in essence or effect though not formally recognized or admitted.” I'm not sure what is “virtual” about a virtual host. It's just as real as the main host is, but in the mid-90s everything was “virtual.”

[2] Really old versions that you are unlikely to see in any real-world setting.

[3] This will be the case whether the configurations are in external (Included) files or in your main configuration file.
