Oh I have slipped the surly bonds of earth
And danced the sky on laughter-silvered wings
--High Flight, John Gillespie Magee
Apache's behavior is configured via one or more text configuration files. These files are read on server startup, and contain directives that control everything about the server. In this chapter, we'll talk about how these configuration files work, and what you need to do to get your server acting exactly the way that you want it to.
Please note that this is not a comprehensive blow-by-blow, talking about every configuration directive that Apache supports. There are two main reasons for this.
One is that the Apache documentation itself, which is available free online, contains just such a comprehensive listing. There is a copy of this documentation on the CD that accompanies this book, but the version online is guaranteed to be the absolute latest information, including changes and corrections that are made on almost a daily basis by the Apache documentation team. This documentation can always be found at http://httpd.apache.org/.
The other reason is that it seems to make more sense to divide configuration directives into the various topics that they address. CGI directives, therefore, appear with Chapter 15, “CGI Programs,” virtual host directives appear with Chapter 7, “Virtual Hosts,” and so on.
This chapter discusses the configuration files themselves—their formats, their syntax, and techniques that can be used to simplify their maintenance.
Apache has one main configuration file in which all the parameters controlling the operation of the server are specified. This file, called httpd.conf, is located in the conf/ subdirectory of wherever you installed Apache. Ordinarily, this is /usr/local/apache/conf.
When you first install Apache, you'll find a number of different files living in that directory. Most of these files, however, are example files, put there as a sort of tutorial by demonstration.
The first files you will see in conf are highperformance-std.conf and highperformance.conf. These files are identical; the idea is that you keep the -std version around as a backup, in case you change the other one and want to get it back to the way it was originally. This configuration file is intended to give you a head start in configuring a Web server that is tuned to peak performance in your particular setting. See Chapter 13, “Performance Tuning,” for more information on tuning your server for maximum performance. The next pair of files, httpd-std.conf and httpd.conf, is the main server configuration file and its backup copy. We'll come back to these in a moment.
magic is the configuration file for mod_mime_magic, a module that determines the MIME type of a file by examining its contents.
mime.types is a configuration file relating file extensions with MIME types. See Chapter 8, “MIME and File Types,” for more details about this file.
httpd.conf is the main server configuration file, and the file that you will most often be working with. httpd-std.conf is there so that you can make changes to the configuration file with impunity, and not be concerned that you won't be able to remember how to get it back to its original state.
It is a very good idea to make backup copies of known-good configuration files, particularly when you are going to try some modifications. In particular, you should keep around the configuration file that is distributed with Apache, as a good example of a well-formed configuration file. This file is usually called httpd.conf-dist.
httpd.conf consists of just a few types of lines.
Directives actually make configuration changes to the server, so these are what you will become the most familiar with. Sections, although technically a form of directive, divide your server into different pieces on which different sets of directives are to act. Comments provide documentation so that you remember why you made particular configuration changes.
The basic currency of the Apache configuration file is the directive. A directive is a keyword that is followed by a value, or values, which dictates one particular aspect of the server's behavior.
Here are some examples of directives:
KeepAlive On
MaxThreadsPerChild 20
ServerAdmin [email protected]
Alias /icons/ "/usr/local/apache/icons"
IndexOptions FancyIndexing VersionSort
The directives that are available for you to set are determined by which modules you have installed. A number of directives directly affect the Apache core, whereas others are for configuring the behavior of individual modules.
To show a complete listing of all directives that are available to you on your particular server, you can use the -L flag with httpd, as shown here:
/usr/local/apache/bin/httpd -L
This will list every directive that you are able to set, along with the name of the module that provides the directive, the expected argument or arguments, the contexts[1] in which the directive can be used, and what override conditions, if any, must exist in order for the directive to be permitted in a .htaccess file. (See Chapter 6, “.htaccess Files—Per-Directory Configuration,” for more information on .htaccess files.)
A section[2] is a method for limiting the scope of one or more directives to a particular directory, a set of files, or a set of URLs. Sections look similar to HTML tags, and enclose one or more directives.
<Directory /usr/local/apache/htdocs/private>
    Deny from all
    Allow from 192.168.1.105
</Directory>
The directives enclosed in the section apply only to the limited subset of your server's space, which is specified by the section. In the previous example, the Deny and Allow directives will apply only to files located in the /usr/local/apache/htdocs/private directory, and any subdirectories thereof, unless overridden by a directive applied to a specific subdirectory.
There are a number of different types of sections, specifying a number of different ways to divide up the content served by your Web server.
A <Directory> section, as you would expect, specifies that the enclosed directives apply to the particular directory listed, and all subdirectories thereof, unless overridden by another directive applied to a deeper directory. The directory path specified is the full path.
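As a minimal sketch of that override behavior, a broad section can set a policy that a deeper section then changes. The private subdirectory is only an illustration, and the Options directive used here is covered later in this chapter.

```apache
# Applies to the document root and everything beneath it
<Directory /usr/local/apache/htdocs>
    Options Indexes
</Directory>

# The deeper directory overrides the setting inherited from above
<Directory /usr/local/apache/htdocs/private>
    Options None
</Directory>
```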
If a <Directory> section is used on Microsoft Windows, and if the directory being specified is on the same drive letter as the ServerRoot directory, then the drive letter doesn't need to be specified.
In the following example, the ServerRoot is on the D: drive, and so the <Directory> section is also assumed to be referring to the D: drive.
ServerRoot d:/apache
<Directory /apache/docs/private>
    AllowOverride None
</Directory>
Directory sections are easier to use when your site has been well planned out, and the content in a particular directory is of a particular type.
A Directory section is used to indicate that a particular part of your site contains resources that are to be treated somewhat differently from files in the rest of the site. These files are treated differently because they are either a different file type (such as being image files, or CGI programs), or because there are different restrictions on their use (such as a requirement of a certain authorization to get in).
A <DirectoryMatch> section behaves much the same way as a Directory section, except that instead of taking an exact directory path as its argument, the argument is a regular expression. The enclosed directives are then applied to any directory that matches the regular expression.
Stated simply, a regular expression is a way of describing a particular pattern of characters. The regular expression engine will compare the pattern to a given directory, and determine whether or not it matches. This allows you to specify more than one directory in a single section, if there are multiple directories that share common traits.
For more information on regular expressions as they are implemented in Apache, please see Appendix C, “Regular Expressions.”
In the example that follows, the <DirectoryMatch> section specifies that the enclosed directive (the AllowOverride directive) is to be applied to all directories that look like /users/, followed by a string beginning with either A or B (uppercase or lowercase):
<DirectoryMatch /users/[aAbB].*>
    AllowOverride All
</DirectoryMatch>
So, for example, the directory /users/Bowen would have the directive AllowOverride All applied to it, by virtue of matching the specified pattern.
This gives you a lot more control over which directories directives are applied to, and allows you to do in one directive what otherwise could potentially take a large number of individual directives.
<Files> sections indicate that the enclosed directives should be applied only to the specified files. Wildcard characters can be used. A ? will match a single character, and * will match any sequence of characters.
<Files *.gif>
    DefaultType image/gif
</Files>
As with <Directory>, <Files> has a sibling <FilesMatch>, which accepts extended regular expressions as arguments.
These directives are particularly useful for restricting access to some files in a particular directory, but not others, as shown in the following examples.
If you have several CGI programs, one of which is intended solely for site admins, you could use a Files section to restrict access to just that one file:
<Files admin.cgi>
    AuthName Admin
    AuthType Basic
    AuthUserFile /usr/local/apache/passwords/admin.passwd
    AuthGroupFile /usr/local/apache/passwords/admin.groups
    Require group siteadmins
</Files>
(See Chapter 21, “Authentication, Authorization, and Access Control,” for more details on authentication and authorization.)
Alternatively, if you have been a little less consistent in your naming scheme, but your admin files have somewhat similar names, you can still restrict access to them all with one directive.
<FilesMatch "admin\.(cgi|pl|exe)">
    AuthName Admin
    AuthType Basic
    AuthUserFile /usr/local/apache/passwords/admin.passwd
    AuthGroupFile /usr/local/apache/passwords/admin.groups
    Require group siteadmins
</FilesMatch>
Because the Files directive accepts wildcard characters, creative use of that directive is often simpler and more intuitive than resorting to the FilesMatch directive.
The <IfDefine> section will be applied only if a particular parameter is defined. A parameter can be defined when the server is started up, with the -D command-line option.
Thus, if your httpd.conf were to contain directives as follows:
<IfDefine ReferLog>
    LogFormat "%{Referer}i -> %U" referer
    CustomLog logs/referer referer
</IfDefine>
And, if you were then to start your Apache server with the command line:
/usr/local/apache/bin/httpd -D ReferLog
Then the directives enclosed in the <IfDefine> section will be applied. In this case, starting your server with the -D ReferLog flag would cause the server to maintain a log file listing of the URLs from which people followed links to your site.
An <IfDefine> section can also be defined to apply directives when a particular parameter is not set. This is done by prepending a ! to the specified parameter:
<IfDefine !ReferLog> ...
The directives enclosed in the previous section would be applied to the server if the ReferLog variable were not defined. That is, if you start your server without the -D ReferLog flag, these directives would be applied.
Placing directives for a particular module inside of a <IfModule> section ensures that they are only applied if that particular module is loaded. This is a convenient way to have a standard configuration file, and not have to make per-system configuration changes just because a particular module is not installed. This is also the way that the default Apache configuration file is distributed, so no matter which modules you choose to build into your server, the configuration file will still work.
In the following example, the directives are applied only if the threaded module is loaded.
<IfModule threaded.c>
    StartServers 3
    MaxClients 8
    MinSpareThreads 5
    MaxSpareThreads 10
    ThreadsPerChild 25
    MaxRequestsPerChild 0
</IfModule>
<Limit> and <LimitExcept> sections refer to request methods. The enclosed directives are applied only if the HTTP request was made with one of the specified methods.
A request method is the manner in which the document, or resource, was requested from the Web server. This will usually be GET, POST, or HEAD, but will occasionally be something else. Without going into too much detail, GET is usually used to get a document or resource. POST is usually used to send in the contents of a Web form. HEAD is a way to check the status of a document, typically to see if it has changed since the last time you looked at it, or if you can just reuse the copy you already have cached.
The <Limit> directive, then, allows you to restrict access to a particular document based on how that document (or resource) is being accessed.
<LimitExcept> is the opposite of <Limit>, limiting access for methods not listed.
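A sketch of how the two sections might look in practice follows. The uploads and readonly directories and the address range are assumptions chosen for illustration; the access control directives are the same Deny and Allow used earlier in this chapter.

```apache
# Restrict only the listed methods; all other methods are unaffected
<Directory /usr/local/apache/htdocs/uploads>
    <Limit POST PUT DELETE>
        Order deny,allow
        Deny from all
        Allow from 192.168.1
    </Limit>
</Directory>

# Restrict every method except the ones listed
<Directory /usr/local/apache/htdocs/readonly>
    <LimitExcept GET POST>
        Order deny,allow
        Deny from all
    </LimitExcept>
</Directory>
```

When the goal is to lock something down, <LimitExcept> is usually the safer of the two, because it also covers methods that you did not think to list.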
When Apache receives a request for a resource, there is a phase during which Apache maps the URL to either an actual file on the server, or to some resource. A <Location> section defines a mapping from a URL to some non-file resource. And, as with the other Match directives, <LocationMatch> maps from a URL pattern to a resource.
The resource can be just about anything, but generally it will be a handler of some variety. A handler is an action that is to be taken when particular files, types of file, or particular URLs, are called. Some handlers are part of the core server, and others are included in modules.
Handlers will be discussed in detail in Chapter 14, “Handlers and Filters.”
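For instance, the status handler provided by mod_status is typically attached to a URL with a <Location> section. This sketch assumes that mod_status is loaded; the /server-status URL and the client address are only examples.

```apache
# No file called server-status exists on disk; the URL is mapped
# straight to the handler
<Location /server-status>
    SetHandler server-status
    Order deny,allow
    Deny from all
    Allow from 192.168.1.105
</Location>
```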
When multiple Web sites, with different hostnames, are served from one Web server machine, they are referred to as virtual hosts. Chapter 7 is dedicated entirely to virtual hosts, so we'll just say that directives enclosed in a <VirtualHost> section apply only to documents served from that virtual host.
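Just to show the shape of such a section, here is a minimal sketch of two name-based virtual hosts, written in the style used by Apache 1.3 and 2.0. The hostnames and document directories are hypothetical.

```apache
# Hypothetical host names and paths, for illustration only
NameVirtualHost *

<VirtualHost *>
    ServerName www.example.com
    DocumentRoot /usr/local/apache/htdocs/example-com
</VirtualHost>

<VirtualHost *>
    ServerName www.example.org
    DocumentRoot /usr/local/apache/htdocs/example-org
    # Applies only to requests served by this virtual host
    ServerAdmin webmaster@example.org
</VirtualHost>
```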
Lines beginning with the hash character (#) are comments, and are completely ignored by Apache when it reads through the configuration file on server restart.
Note that the line must begin with the hash. You can't start a comment mid-line.
The default configuration file that comes with Apache is very heavily commented. Many beginning Apache users find that they can configure their server just by looking at the comments in that file and making changes based on the recommendations outlined there. All the basic directives are discussed, with examples and default settings, right in the comments.
As a result of this, the default configuration file is rather large, and nearly half of it is comments. That can be a little frustrating for an experienced user who already knows what he's looking for but has to scroll through dozens of lines of comments.
When you make configuration changes, it's a good idea to add comments explaining what the configuration is for, when it was made, and who made it. This will help you to remember, when you look back at the file several months later, why you were doing it.
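A comment along those lines might look like the following sketch. The date, the user name jsmith, and the reason given are all invented for the example.

```apache
# 2001-06-30 (jsmith): Turned on persistent connections to cut
# connection overhead on pages with many images.
KeepAlive On

# The following would NOT work, because the comment starts mid-line:
# KeepAlive On   # persistent connections
```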
It is also a very good idea to put your configuration file into some sort of revision management system, such as CVS, so that you can track changes, and undo them if they have undesirable effects.
There are a variety of different circumstances in which you would want to load your configuration from somewhere other than your main server configuration file. If, for example, you are testing a different configuration, but don't want to overwrite your existing configuration file, you might want to maintain multiple config files, and switch among them.
You can do this with the -f flag when you start Apache:
/usr/local/apache/bin/httpd -f /path/to/alternate/apache.conf
Note that the specified configuration file is loaded in place of your regular configuration file, and therefore must specify a complete configuration.
apachectl is not aware of what configuration file you have loaded, nor does it allow you to pick an alternate configuration file. So if you run apachectl restart when you are running with an alternate configuration file, Apache will be restarted with the default configuration file, rather than the one that was specified by -f.
Apache reads your configuration files only on server startup. This means that when you make changes to your configuration file, they do not take effect right away, but only when you restart the server. This gives you a chance to test your changes before putting them into production.
You can test your new configuration file using the configtest argument to the apachectl command. This is done simply by typing
apachectl configtest
apachectl will read through your configuration file and ensure that you have used correct syntax in the file. It is important to note that it does not verify that your configuration will work, but merely that it is correct syntax. If, for example, you refer to a DocumentRoot that does not exist, this error will not be caught at this stage, but only when you restart the server with the new file.
If the configuration file has bad syntax, apachectl will report this condition to you, and tell you which line the bad syntax appears on, as shown in the following example:
% apachectl configtest
Syntax error on line 26 of /usr/local/apache/conf/httpd.conf
Illegal option FollowSumLinks
In this example, I had misspelled the option FollowSymLinks as FollowSumLinks, and this was identified as an invalid option.
Having corrected this error, and running the command again, I get an indication that the problem has been resolved:
% apachectl configtest
Syntax OK
Note that if you have bad syntax in your configuration file when you try to start your server, it will not start. If you try to restart your server with a bad configuration file, it will ignore the restart and continue running so that it does not get stuck in a state of not running, and be unable to start up.
We highly recommend that you run apachectl configtest each time you make any modifications to your configuration file.
Apache gives you the option of including a file into your httpd.conf configuration file at the time that the server is restarted and the configuration files are loaded. This is accomplished with the Include directive, as shown here:
Include conf/modperl.conf
Include /etc/apache.otherconf
If the file path does not start with a leading slash (or, on Windows, with a drive designation) then the path is assumed to be relative to the ServerRoot.
Why would you want to do this?
As your server configuration becomes more and more complex (which it will, unless you have a very simple site, and are content to leave it that way forever), it becomes increasingly desirable to split it up into smaller, more manageable parts. Although this might, in some way, harken back to the day when there were three configuration files,[3] if handled carefully, this can greatly simplify your server administration. I'll give three examples where this might be a good thing to do, but I expect that your own situation will suggest other possibilities.
First, there's the situation when you have a very large, very complex module, which you have built into your server for some added functionality. mod_perl, mod_ssl, and mod_rewrite come to mind. These are very useful, very powerful modules, which can double the size of your configuration file.
In Apache 2.0, the default SSL configuration is in a separate file, which is loaded with an Include directive.
By separating the directives for that particular module out into a separate configuration file, you can make both parts of the configuration easier to read.
There is, of course, a slight performance hit on server start, because there is additional file I/O, but this only happens once, and is a very minor consideration.
Second, and perhaps most commonly, included configuration files are very useful when managing a large number of virtual hosts. By putting the configuration for each virtual host into its own file you can very swiftly locate the configuration for a particular host, and modify it without having to paw through dozens of other lines of configuration files looking for a particular host.
However, before you rush out and implement this plan, make sure you read the next section on including directories.
Note also that the performance hit on server start is going to go up as you add additional files. If you have hundreds of virtual hosts, and each one is in a separate file, not only will you have hundreds of include lines, but you'll have to open and read in those hundreds of configuration files. You might want to consider using mod_vhost_alias, which is discussed in Chapter 7. mod_vhost_alias allows you to configure large numbers of virtual hosts with just a few directives, rather than needing directives for each virtual host.
And third, there's the question of multiple people managing different parts of the Web site. By putting these different parts of the configuration into different files, and giving the necessary file permissions so the right people can edit them, you can delegate responsibility for the configuration file without letting everyone in the company have write access to the main server configuration file. This is particularly useful in the case of virtual hosts, where you are very likely to have a different person managing each virtual host.
Note, of course, that you will still have to have someone with root privileges restart the server in order for the configuration changes to take effect.
Of course, if you really buy into this notion of splitting configuration off into other files, you might end up with an inordinate number of Include directives in your configuration file. This is particularly the case if you are using this scheme for virtual hosts.
Fortunately, there's a really good solution for this. The Include directive also takes a directory, instead of a file, as the value of the argument. When given a directory, the Include directive reads every file in the specified directory and includes it into the configuration.
So, if you have dozens of virtual hosts, you can put all those configuration files in a single directory, and include them all with one directive:
Include conf/vhosts
When you start (or restart) your Apache server, you'll see something like the following in your error_log:
[Sat Jun 30 21:52:58 2001] [notice] SIGHUP received. Attempting to restart
Processing config directory: /usr/local/apache/conf/vhosts
Processing config file: /usr/local/apache/conf/vhosts/apache
Processing config file: /usr/local/apache/conf/vhosts/boxofclue.com
Processing config file: /usr/local/apache/conf/vhosts/buglet
Processing config file: /usr/local/apache/conf/vhosts/cpan
Processing config file: /usr/local/apache/conf/vhosts/cvs
Processing config file: /usr/local/apache/conf/vhosts/dates
Processing config file: /usr/local/apache/conf/vhosts/drbacchus.com
Processing config file: /usr/local/apache/conf/vhosts/gaddisphoto.com
Processing config file: /usr/local/apache/conf/vhosts/photos.tm3.org
Processing config file: /usr/local/apache/conf/vhosts/reefknot.org
Processing config file: /usr/local/apache/conf/vhosts/rt
Processing config file: /usr/local/apache/conf/vhosts/tm3
Processing config file: /usr/local/apache/conf/vhosts/zzz_last
[Sat Jun 30 21:52:59 2001] [notice] Apache/1.3.19 (Unix) mod_perl/1.25 configured -- resuming normal operations
Notice that the files are included in alphabetic order. More specifically, they are included in the order that they appear in a directory listing. In the previous example, you'll notice a zzz_last file on the end. This is the one containing the virtual host settings for the _default_ virtual host, and perhaps some other global server configuration directives.
It is also very important to note that every file in the directory will be included. You should therefore make sure that no stray files end up in the directory, because they can cause Apache to fail on startup. Temporary files generated by your editor are a frequent source of problems, for example.
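One way to guard against stray files, assuming Apache 2.0.41 or later (where Include accepts shell-style wildcards), is to include only files with a particular suffix. The .conf suffix here is a naming convention you would have to adopt yourself, not something Apache requires.

```apache
# Only files ending in .conf are read; editor leftovers such as
# example.com.conf~ or .example.com.conf.swp are skipped
Include conf/vhosts/*.conf
```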
The Options directive is one of the main tools for turning features on and off in various parts of the site. Judicious use of this directive will allow you to control very tightly what is allowed, and not allowed, in each content directory. It can be set in your main server configuration, in a VirtualHost section, in a Directory section, or in a .htaccess file. For the scope of the section in which you set it, the specified options will be turned on.
Options takes one or more of seven possible values, All of them, or None of them. Most of these possible values will also be discussed in other chapters because they turn on (or off) major functionality of the server.
The syntax of the Options directive is as follows:
Options [+|-]option [+|-]option
Prepending a + to a particular option adds that option to those that are turned on, whereas prepending a - turns off that particular option.
Options +ExecCGI +Includes -FollowSymLinks
In this example, the ExecCGI and Includes options are turned on, and the FollowSymLinks option is turned off.
Although the + and - are optional, omitting them means that you are turning off all other options that might have been set, and turning on only those that you have specified. The following directive, for example, turns on Indexes, but also turns off any other options that might have been turned on. It helps to remember that directives set in a particular directory also apply to any subdirectories of that directory.
Options Indexes
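The difference between the two forms is easiest to see with a parent directory and two subdirectories. In this sketch the docs and reports directories are hypothetical.

```apache
<Directory /usr/local/apache/htdocs>
    Options Indexes FollowSymLinks
</Directory>

# Relative form: the result is FollowSymLinks and Includes
# (Indexes is removed, Includes is added, FollowSymLinks is inherited)
<Directory /usr/local/apache/htdocs/docs>
    Options +Includes -Indexes
</Directory>

# Absolute form: the result is Includes only
# (Indexes and FollowSymLinks from the parent are both dropped)
<Directory /usr/local/apache/htdocs/reports>
    Options Includes
</Directory>
```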
Options +ExecCGI
The ExecCGI option turns on the capability to execute CGI programs inside the specified scope. This option will be discussed in greater detail in Chapter 15, “CGI Programs.”
Although this option allows you to execute CGI programs in a directory that is not marked with a ScriptAlias directive (that is, to execute CGI programs in a document directory), this is generally not a good idea for two reasons. First of all, it makes it very difficult to track down all the CGI programs on your site if and when you are having problems. Secondly, it is a potential security problem. Any CGI program is a potential security hole, and permitting them in directories where the file permissions are typically a little more lenient is asking for trouble.
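To make the contrast concrete, here is a sketch of both approaches. The tools directory and the .cgi extension are assumptions for the example, not recommendations.

```apache
# Preferred: all CGI programs live in one directory, outside the
# document tree, where they are easy to find and audit
ScriptAlias /cgi-bin/ "/usr/local/apache/cgi-bin/"

# Possible, but harder to keep track of: CGI programs mixed in
# with ordinary documents
<Directory /usr/local/apache/htdocs/tools>
    Options +ExecCGI
    AddHandler cgi-script .cgi
</Directory>
```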
Options +FollowSymLinks
By default, symbolic links are ignored when they appear in a directory served by Apache, which makes it impossible to escape from the document directory. If, for example, you had a link to /home in your document root, following that symbolic link would permit Web users to download arbitrary files from anyone's home directory, which would clearly be a security problem.
However, if you have Options +FollowSymLinks turned on, then Apache will follow these links.
Make very sure you are aware of the security implications in turning on this option. Make sure that you do not have any symbolic links to directories that might contain files that should not be available to the general public. Never permit this option for directories that are managed by potentially untrustworthy people. If you absolutely must have symbolic links from your content directories, see the SymLinksIfOwnerMatch option as a possible alternative.
Because Microsoft Windows does not permit symbolic links, this option does not apply to Apache on Windows.[4]
Options +SymLinksIfOwnerMatch
This is the same as the previous option, with one important difference. Apache will follow symbolic links only if the target file or directory is owned by the same user as the link itself. That means a user cannot link to a directory that they do not own, and thus get access to the contents of that directory.
As with FollowSymLinks, this option is not available on the Windows version of Apache.
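A typical place for this option is in directories where ordinary users publish their own content. The following sketch assumes that such content lives in a public_html directory under each user's home directory, which may not match your layout.

```apache
# Users may link to their own files, but not to anyone else's
<Directory /home/*/public_html>
    Options -FollowSymLinks +SymLinksIfOwnerMatch
</Directory>
```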
Options +Includes
Options Includes turns on the capability to have Server-Side Includes (SSI) in files. SSI gives you the ability to embed a variety of commands in HTML documents, and have them evaluated when the page is served to a client.
SSI is discussed in detail in Chapter 16, “Server-Side Includes.”
SSI has many of the same security concerns as CGI programs, in that it allows the execution of arbitrary commands on the server. To defang this beast, consider using IncludesNOEXEC instead.
Options +IncludesNOEXEC
This directive turns on permission to use SSI, but forbids the use of the #exec command, or using #include to load a CGI program. This removes most of the security risk associated with permitting Server-Side Includes.
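As a sketch, the option could be applied to a directory of user-maintained pages like this. The users directory and the footer.html file named in the comments are hypothetical.

```apache
<Directory /usr/local/apache/htdocs/users>
    # Still processed:  <!--#include virtual="/footer.html" -->
    # Refused:          <!--#exec cmd="ls" -->
    Options +IncludesNOEXEC
</Directory>
```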
Options +Indexes
This option enables the generation of automatic indexes in directories that do not have an index.html file (or whatever file you have indicated with the DirectoryIndex directive).
You can find out more information about automatic generation of indexes in Chapter 11, “Directory Indexing.”
Turning on this option means that files in directories—even if you don't have links to them from anywhere on your site—will be visible to anyone looking at your Web site. However, if you are planning the security of your Web site around the principle of “hoping nobody notices,” then you will have larger problems than this. That is to say, if you would not want random strangers to be in possession of certain files, you should never have them available on an unauthenticated Web site.
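A common arrangement, sketched here with hypothetical directory names, is to allow listings only where you actually want visitors to browse, and to switch them off elsewhere.

```apache
# Visitors may browse the file listing here
<Directory /usr/local/apache/htdocs/downloads>
    Options +Indexes
</Directory>

# No listing is generated here; a request for the directory itself
# is refused unless an index file exists
<Directory /usr/local/apache/htdocs/private>
    Options -Indexes
</Directory>
```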
Options +MultiViews
The MultiViews option turns on a very powerful aspect of content negotiation. MultiViews is a feature whereby Apache figures out which document is most likely to be acceptable to the client, and gives them that one. MultiViews and content negotiation in general, will be discussed in Chapter 10, “Content Negotiation.”
Options All
As you would expect, Options All turns on all the various Options. Well, almost all of them. MultiViews is not turned on with All, and must be explicitly asked for.
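So, if you really do want everything, the two have to be named together, as in this short sketch:

```apache
<Directory /usr/local/apache/htdocs>
    # All does not include MultiViews, so it is listed separately
    Options All MultiViews
</Directory>
```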
Please remember that there are serious security concerns when you start splitting up your configuration file and putting it all over the place; particularly if you start giving out permissions to edit those files.
Being able to edit those subfiles is no different from being able to edit your main server configuration file. Any directive can be put in those included files, and will have every bit as much weight as though it had appeared in the main server configuration file.
Make very sure of two things. First, ensure the files themselves are in secure directories. If a directory is world-writeable, then so, in effect, are all the files in it: even if someone can't edit a file itself, they could delete or rename it. For example, if /usr/local/apache is world-writeable[5] someone can rename the conf directory inside it, and replace it with his own directory, with anything he wants in it. So it's not enough that the individual files in that directory have the correct permissions on them.
Secondly, make sure that you trust the folks that you're giving file write permission to.[6] If they can edit these files, they can do whatever they want to your server configuration.
Apache configuration files contain three types of things. Sections, specified with HTML-like tags, delineate the scope, or range, of a particular set of directives. Directives, specified as a directive name, followed by a value, configure all the individual settings on the server. And comments, specified with a leading pound sign (#), are ignored by the server, and serve only to annotate the configuration file.
Apache ships with heavily commented default configuration files to get you started quickly.
[1] This concept will be covered in more detail in the next section on containers, and in the upcoming chapter on .htaccess files.
[2] Note that the documentation alternately refers to them as directives, sections, and containers, depending on the context, and the author of that particular part of the documentation.
[3] srm.conf, httpd.conf, and access.conf, presumably split the directives into sensible categories, but the distinctions were always rather nebulous.
[4] No, shortcuts are not the same as symbolic links, and Apache will not follow shortcuts.
[5] Yes, I know, that's a horrible thought, but I've seen it happen.
[6] And, of course, the first law of security is “Don't trust anybody.” See Chapter 19, “Apache Security,” on Security Considerations.