Tom Syroid
Brian P. Bilbrey



26 - Serving Web Pages with Apache

In This Chapter

So you want to be a Webmaster, do you? Well you've come to the right place. In this chapter we'll show you how to get Apache up and running, how to compile the product from source, examine several example web server configurations, and touch briefly on a number of advanced topics - namely, filesystem security, user authentication, virtual hosting, and logging.

Before we begin, however, it's time for a quick reality check. While bringing up a web server is not terribly difficult, there are a host (pardon the pun) of hidden responsibilities and prerequisites entailed in having a "web presence". Before embarking on any Grand Adventures with Apache, please ensure you consider the following:

In short, running your own web server can be an exciting and rewarding experience. It can also be frustrating, expensive, and a huge time-sink. Still with us? Good. Let's begin with the basics and a general introduction to the world of Apache.

Introducing Apache

Apache began life as a series of patches to the web server program developed by NCSA, httpd. When the lead programmer for httpd left NCSA, program development stagnated, and a group of webmasters/programmers decided to step in and create an informal coalition to revive the program and direct its future development. This diverse band of hackers dubbed the project Apache ("a patchy server" - nothing to do with the Native American Apache tribe). In April 1995, the first public version of Apache was released (0.6.5) and virtually overnight it became one of the most popular web servers among the then-fledgling Internet community.

According to a recent Netcraft survey (http://www.netcraft.com/survey/), Apache remains a powerful contender in the fickle world of web server software, with a purported 60% share of the market. Apache is a perfect example of how a well-planned and implemented Open Source product can compete head-to-head with expensive commercial alternatives, and (in the minds of many) win hands-down.

Unless you've done a custom Linux installation, Apache is probably already resident on your hard disk. To verify its existence, check for the presence of the /etc/httpd directory or run an RPM query like so:

[tom@janus tom] rpm -qa | grep apache
apache-1.3.11-1
apache-devel-1.3.11-1
apache-docs-1.3.11-1

If the package is not installed, use the RPM utility to install it from your distribution CD-ROM, or if you're feeling adventurous, download the source code and compile it (see the section "Building Apache from Source" later in this chapter for details).

Like many *NIX server programs, Apache consists of a single daemon (httpd), a handful of utilities, and a configuration file (/etc/httpd/conf/httpd.conf) that contains all the options necessary to bring a web server to life. Before you begin, we suggest making a hardcopy printout of httpd.conf. Read through the various sections it contains (options are liberally commented) and follow along as we walk through the key option statements you'll need to edit. Much of our commentary is based upon a clean compile and installation from source code, but all of it corresponds fairly closely to the pre-built binaries supplied by Caldera.

Note
Technically speaking, Apache has three configuration files: httpd.conf, srm.conf, and access.conf. Use of the latter two is optional. They are still supported for backward compatibility, but the Apache Group recommends putting all configuration parameters in httpd.conf.

httpd.conf is divided into three general sections: global environment, [main] server configuration, and virtual hosts configuration. Each section contains a series of directives followed by an option or parameter. In addition, some sections contain block directives that control the operations permitted on a directory, file, or virtual host. These blocks begin with the directive surrounded with angle brackets (for example, <directory>), followed by a list of options, and end with the directive - again in angle brackets - preceded by a forward slash (</directory>). Any line prefaced with a pound sign (#) is considered a comment.
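
Since the stock httpd.conf is mostly comments, it can help to view only the lines that are actually in effect. The following is a minimal sketch rather than anything shipped with Apache: the function name active_directives is our own invention, and it assumes a comment is any line whose first non-blank character is a pound sign.

```shell
# active_directives FILE
# Print FILE without comment lines (first non-blank character is #)
# and without empty lines, leaving only live directives and blocks.
# The function name is hypothetical; this is not an Apache utility.
active_directives() {
    grep -v '^[[:space:]]*#' "$1" | grep -v '^[[:space:]]*$'
}
```

For example, active_directives /etc/httpd/conf/httpd.conf | less pages through the effective configuration.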

Global environment

The Global Environment section of httpd.conf contains options to specify the location of Apache's binaries (ServerRoot), the number of concurrent server daemons that the main server spawns (StartServers), timeout parameters, which interface and port Apache is to bind to, and which DSO (Dynamic Shared Object) modules are loaded at run-time. The following global section was extracted from the httpd.conf.default file created by Apache (when compiled from scratch). We've removed the comments and cut most of the lines from the DSO list for brevity.

### Section 1: Global Environment

ServerType standalone
ServerRoot "/usr"
#LockFile /var/run/httpd.lock
PidFile /var/run/httpd.pid
ScoreBoardFile /var/run/httpd.scoreboard
#ResourceConfig conf/srm.conf
#AccessConfig conf/access.conf
Timeout 300
KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 15
MinSpareServers 5
MaxSpareServers 10
StartServers 5
MaxClients 150
MaxRequestsPerChild 0
#Listen 3000
#Listen 12.34.56.78:80
#BindAddress *

### DSO Module List ###
LoadModule vhost_alias_module lib/apache/mod_vhost_alias.so
LoadModule env_module         lib/apache/mod_env.so
LoadModule config_log_module  lib/apache/mod_log_config.so

... material removed ...

ClearModuleList
AddModule mod_vhost_alias.c
AddModule mod_env.c
AddModule mod_log_config.c

... material removed ...

#ExtendedStatus On

Generally speaking, you shouldn't have to change anything in this section. The ServerType directive controls how Apache handles its child processes. The options are standalone (the default; Apache spawns a number of child processes when the daemon is started) or inetd (each time a request comes in, a process is started to handle it, and that process exits when the request is complete). The locations of the Lock, PID, and ScoreBoard files are determined by the layout used at the time the program was compiled.

The KeepAlive and Timeout settings determine how long a connection is kept open when a client first accesses the server - don't mess with these unless you know what you're doing.

The various Servers and Clients directives control Apache's child processes - how many are started when the program is first initialized (StartServers), how many "idle" processes are allowed before Apache starts killing them (MaxSpareServers), and how low the pool of idle processes falls before new servers are spawned (MinSpareServers). MaxClients limits the total number of requests Apache answers simultaneously, and effectively limits the total number of servers that can run concurrently.
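
To make the interplay between MinSpareServers and MaxSpareServers concrete, here is a toy model of the decision described above. It is a sketch of the rule only, not Apache's actual code; the function name spare_action and its argument order are our own.

```shell
# spare_action IDLE MIN MAX
# Print what the parent process would do given the current number of idle
# children: "spawn", "kill", or "hold". Illustrative only, not Apache's code.
spare_action() {
    idle=$1
    min=$2
    max=$3
    if [ "$idle" -lt "$min" ]; then
        echo spawn      # pool fell below MinSpareServers
    elif [ "$idle" -gt "$max" ]; then
        echo kill       # more idle children than MaxSpareServers allows
    else
        echo hold       # pool is within bounds
    fi
}
```

With the defaults listed earlier (MinSpareServers 5, MaxSpareServers 10), spare_action 3 5 10 prints spawn and spare_action 12 5 10 prints kill.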

The MaxRequestsPerChild directive guards against accidental memory leaks in Apache. The number here determines how many requests each child server handles before it dies. As you can see, the default is zero, which means a child process continues to run until the program is stopped. At the time of this writing, there were no known memory leaks in Apache, but if you have any reservations about the revision you're using (especially if you're using pre-release code), it's probably a good idea to set this number to 30 or 40 as a safeguard.

Apache's DSO modules are covered at length later in this section; server status options (ExtendedStatus on or off) are discussed later in this chapter under the section "Logging and Monitoring Apache".

'Main' server configuration

The Server configuration section contains options to specify the user and group Apache runs under, the Document Root, which ports the server listens on, the email address of the Apache administrator, log location and format, rules for content negotiation, and a series of option blocks that determine directory access and module response (similar to if/then statements).

The server section of httpd.conf is long so we won't list all the directives here. It consists primarily of directive blocks that allow or deny actions within named directories. There are several important directives in this section, so we'll list selected segments and annotate them with our suggested changes.

### Section 2: 'Main' server configuration

Port 8080
User nobody
Group nobody
ServerAdmin [email protected]
ServerName www.syroidmanor.com
DocumentRoot "/home/httpd/html"

For some reason, the default httpd.conf file sets the port Apache listens on to 8080. While 8080 is an unprivileged port, which reduces security vulnerabilities, 80 is the usual port for most installations. Edit this to 80 unless for some reason you want the server to respond to another port.

Next, ensure the User and Group directives are set to a user and group with restricted access rights (see the section "A Word about File Permissions" later in this chapter for more details). The default nobody/nobody entries work for most installations.

The ServerAdmin and ServerName directives should be set to meaningful values that correspond to your installation. These directives are not used by Apache directly (or any other programs running on the server); they are used by some server-generated pages, such as error documents, that are returned to the client when circumstances dictate. If you do not have a valid DNS name for your host, do not just create one - instead, enter its IP address under ServerName.

DocumentRoot is the root directory that Apache serves web pages from. If you're just starting out, and simply want to test the server, leave this entry as is. If you have an existing web site already built, either move the structure to this location or edit the DocumentRoot entry to point to an alternate location.

<Directory "/home/httpd/html">
    Options Indexes FollowSymLinks MultiViews
    AllowOverride None
    Order allow,deny
    Allow from all
</Directory>

If you changed the DocumentRoot directive, you must also change the above entry to match.

<IfModule mod_dir.c>
    DirectoryIndex index.html
</IfModule>

The DirectoryIndex directive tells Apache what index file to look for when it enters a directory. Most HTML editors use the "html" extension by default, but there are some that use "htm" unless instructed otherwise (for example, some versions of Microsoft FrontPage). If you're in doubt as to which your editor uses, add index.htm to this line. That's also a good idea if you're providing virtual hosting services for a number of sites, each of which is likely to be run by a different person using other tools.

HostnameLookups Off

The HostnameLookups directive determines whether Apache logs the domain name of visiting clients, or just an IP address (for example, www.syroidmanor.com (on) or 142.165.206.61 (off)). The Apache Group recommends leaving this option off for both performance and "netiquette" reasons. If HostnameLookups is on, every client request to the server generates at least one lookup request to a neighboring nameserver. Not only does this impact on the web server's performance, but it also clogs the Internet with unnecessary traffic. We recommend leaving this option off, and if you want to know the hostnames of visitors to your site, use a log analysis program such as Webalizer (this program is part of an "All Packages" eDesktop installation, but see the website, http://www.webalizer.com, for the latest version) to generate these statistics during off-peak hours.

Virtual Hosts

Virtual Hosts provide the administrator with the means to serve numerous web sites with a single instance of Apache from one server. Each host is defined in an individual option block that specifies where the host's document root is, where logs are stored, the webmaster's email for that host, etc. Most options found in the main server configuration section can also be applied to an individual virtual host, where they override the global configuration option for that virtual host.

Configuring virtual hosts is an involved process and deserves a section all of its own. You'll find all the grisly details later in this chapter under the section "Virtual Hosting with Apache."

Testing httpd.conf and starting/stopping the server

Any time you work with Apache's configuration file, it's a good idea to check your work for syntax and/or spelling errors: type /usr/sbin/apachectl configtest. If all is well and good, you should see a message similar to the following:

[tom@janus conf] /usr/sbin/apachectl configtest
[Wed Oct 15 19:00:01 2000]
Syntax OK

If any errors were introduced by your edits, Apache notes the line number and suggests a possible cause for the error. Do not pass Go; do not collect $200 - go back and fix them before continuing.

Hint
If your httpd.conf file contains errors, Apache will usually start and then immediately die without warning or error message. Any time you start or restart Apache - especially after editing httpd.conf - always double check the daemons are running by typing ps -ef and confirming the running processes.

With a valid configuration file in place, it's now time to start Apache and test your handiwork. The command options for the Apache control script are:

/usr/sbin/apachectl  start | stop | restart | graceful | configtest | help

There are two additional options available for the apachectl command; both require that the lynx browser be installed on the system and the mod_status module compiled and enabled. fullstatus displays a status screen showing the server's current state, and status provides a short summary of the former. Monitoring the status of Apache is covered later in the chapter under the section "Logging and Monitoring Apache."

Hint
Caldera recommends using the init script (/etc/rc.d/init.d/httpd) to start and stop Apache. Using apachectl with the configtest option is fine, but the init script does some housekeeping and performs some other sanity checks that apachectl does not do.
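
The check-then-restart habit is easy to wrap in a few lines of shell. What follows is a sketch only: safe_restart is a hypothetical helper of our own, and it assumes the control script lives at /usr/sbin/apachectl and returns a non-zero exit status when configtest finds a problem.

```shell
# Path to the control script; override APACHECTL if yours lives elsewhere.
APACHECTL=${APACHECTL:-/usr/sbin/apachectl}

# safe_restart (hypothetical helper, not shipped with Apache)
# Run the syntax check first and restart only if it passes.
safe_restart() {
    if "$APACHECTL" configtest; then
        "$APACHECTL" restart
    else
        echo "httpd.conf has errors; server left untouched" >&2
        return 1
    fi
}
```

After safe_restart returns, confirm the daemons with ps -ef | grep httpd. If you follow Caldera's advice, swap the restart line for a call to the init script and keep the configtest step as is.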

Apache should now be running. To test it, start up a browser on the server and type http://localhost or http://127.0.0.1. You should see Apache's sample index page or, if you pointed the DocumentRoot to an existing web site, the index page contained in the root directory. If you have access to a remote system, enter the name of the domain or the IP of the machine and you should see the same. Ahhh, you have Apache pointed to a test site you created just for the occasion, and instead of an index page, you're getting a 404 error. Read on...

A word about file permissions

Two of the most common reasons a client is denied access to a web site are failure to correctly set the DocumentRoot option in httpd.conf, and incorrect file permissions on either a directory in the path or on one or more files within a web site. The document root problem is easily remedied by reviewing the settings outlined in the preceding section. The topic of file permissions is the focus of this little fireside chat.

Hint
Whenever you edit a web page and post it to a server, always check that it is accessible from a client browser - preferably by accessing the web server from the same IP a reader would. If you follow this advice religiously, you'll be surprised at just how often a page is returned 403 or 404 ("forbidden access" and "not found", respectively) because a permission bit is not set correctly.

When Apache starts, the server daemon needs to bind to port 80. This is the default port browsers are programmed to access a web server on (you can force a browser to connect to an alternate port by typing http://servername:portnumber). However, for security reasons only the root user can attach a process to a port lower than 1024. This introduces a major security dilemma. You don't want to expose a daemon running as root to the outside world, as this would give anyone accessing the server via that daemon superuser powers. The solution the Apache group came up with is both elegant and functional.

When the httpd daemon is invoked (by root, usually in the sysinit scripts), a "master" process takes care of binding the server to port 80 and is thereafter relegated to nothing more than spawning and managing a number of "child" processes (the number of "children" spawned is controlled by the StartServers option in httpd.conf). These child processes - which are responsible for answering all requests for web pages - are then started under a different user/group (OpenLinux defaults to the user and group "nobody", an account with almost no permissions, as a security feature), specified by the User and Group options in Apache's configuration file.

Warning
Never, ever, ever start Apache until you've double-checked the User and Group options in httpd.conf. And never start Apache with the User option set to root or the Group set to anything that has administrative permissions. The names used here are not significant (Nobody/Nobody or Webuser/Webgroup are common); the important points to note are that the entries used for User and Group point to a username and group that has little or no authority on the server.

So what does all this talk of spawning children have to do with permissions? Well, in a word, everything. When a client browser comes knocking at the door of your web server, provided everything else is kosher, they're granted access to the requested pages as user nobody. This means that any document served to the public must have its permission bits set such that either the file is readable by the group nobody or the file's other (or world) permission bit is readable. This is very important stuff, so let's review file permissions in general and then establish some clearly defined rules regarding public file access under Apache.

First, recall that a triad of permission bits control file access under UNIX. For example:

[Hydras:tom]/home/tom/webs/insights> ll
total 344
-rwxr-xr--   1 tom      usr         8958 Sep 04 13:18 dictionaries.htm
-rwx---r--   1 tom      usr        24577 Sep 04 13:18 favorites.htm
drwx--x--x   4 tom      usr          512 Aug 18 14:17 computing

The above listing shows that tom owns the files, and tom is a member of group usr. The first file is marked read/write/execute (rwx) for the owner, read/execute (r-x) for any member of the group usr, and read-only (r--) to all other users. The second file is rwx for the owner, no access for members of the group usr, and read-only for all other users. Given that the user nobody is a member of the set other, this means that if these two files were placed in a directory Apache was configured to serve from, they would both be readable by the public (because the public is permitted access to the directory as user nobody). If, however, we were to remove the read bit from the permission group other for the file dictionaries.htm (the permission bits would then read -rwxr-x---), any request for this file would return error 403, "Forbidden".
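
You can watch that bit change on a scratch file. This is a small sketch that works in a temporary directory created for the purpose (the variable name demo is ours); the file name simply mirrors the listing above.

```shell
# Work on a scratch copy so nothing real is touched.
demo=$(mktemp -d)
touch "$demo/dictionaries.htm"

chmod 754 "$demo/dictionaries.htm"      # 7=rwx owner, 5=r-x group, 4=r-- other
ls -l "$demo/dictionaries.htm" | cut -c1-10     # -rwxr-xr--

chmod o-r "$demo/dictionaries.htm"      # take read away from "other"
ls -l "$demo/dictionaries.htm" | cut -c1-10     # -rwxr-x---
```

Running chmod o+r on the file puts the read bit back and makes it servable again.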

That's gotcha number one - there's a second gotcha and it relates to directory permissions.

The user Apache runs under must have execute permission on every directory contained within a web document tree, right up to the root of the filesystem. (Bear in mind that execute permission means something slightly different for directories: it conveys the ability to enter the directory and reach the files inside it by name. Listing a directory's contents requires read permission.) For example, John Smith stores his web pages in /home/jsmith/webs (in Apache parlance, John's DocumentRoot). If Apache is configured to run under User Nobody, Group Nobody, this means that user nobody (the third permission set - other) must have execute permission on all directories within John's DocumentRoot, plus execute permission on /home/jsmith/webs, /home/jsmith, and /home. Break this rule anywhere, and instead of the client seeing a document, Apache returns that now familiar 403 error code.
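
Both gotchas can be checked in one pass. The function below is a rough sketch, not a standard tool: the name check_web_path is hypothetical, it expects an absolute path, and it assumes (as in the discussion above) that the server's user falls into the "other" permission set for every file and directory involved.

```shell
# check_web_path FILE [TOP]
# Hypothetical helper: FILE should be an absolute path. Reports the file if
# "other" cannot read it, and every directory from FILE's parent up to TOP
# (default /) that "other" cannot search. Returns non-zero on any problem.
check_web_path() {
    file=$1
    top=${2:-/}
    rc=0

    # Position 8 of the mode string is the "other" read bit.
    mode=$(ls -ld "$file" | cut -c1-10)
    case $(printf '%s' "$mode" | cut -c8) in
        r) ;;
        *) echo "not world-readable: $file ($mode)"; rc=1 ;;
    esac

    dir=$(dirname "$file")
    while :; do
        # Position 10 is the "other" execute bit (t = execute plus sticky).
        mode=$(ls -ld "$dir" | cut -c1-10)
        case $(printf '%s' "$mode" | cut -c10) in
            x|t) ;;
            *) echo "not world-searchable: $dir ($mode)"; rc=1 ;;
        esac
        case $dir in
            "$top"|/|.) break ;;
        esac
        dir=$(dirname "$dir")
    done
    return $rc
}
```

For John's site, check_web_path /home/jsmith/webs/index.html walks /home/jsmith/webs, /home/jsmith, /home, and / in turn, and prints one line for each item that would trigger a 403.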

To summarize:

Modules - The core of Apache's flexibility

Apache was designed from the ground up to be modular; that is, the original programmers assumed that it would be extended by other developers, who would write small pieces of code which could be integrated into Apache with ease. They did this by creating a modular API and a well-defined series of phases that every request went through. This approach allows developers to customize a particular aspect of Apache by simply stringing together a series of API methods and bundling these methods into a separate code element that is called during a particular phase of a request. These discrete code elements are called modules, and provide the flexibility and extensibility Apache has become famous for.

Developers were quick to respond, and to date there are hundreds of Apache modules available. Many of them are registered with the Apache project, and can be found at http://modules.apache.org. Chances are pretty good that if there's a feature or function you need, someone else has also stumbled across the same need and written a module to do the job. Apache is bundled with a number of useful modules, and depending on the configuration options chosen when the program was compiled, each one can be enabled or disabled - more on this in a moment. In Tables 26-1 through 26-4 we've compiled a list of the modules currently included with Apache. The tables are grouped according to the functionality a module adds to the core server: content generation, access control and user authentication, utility modules, and header generation and control. Names shown in bold are enabled by default.

Table 26-1
Content Modules

Module             Description
mod_autoindex      Generates automatic directory indexes when no index.html file is present.
mod_actions        Lets you run CGI scripts when a file of a certain type is requested. This makes it much easier to execute scripts that process files.
mod_dir            Provides for "trailing slash" redirects and serving directory index files.
mod_cgi            Provides for execution of CGI scripts.
mod_imap           Provides for .map files, replacing the functionality of the imagemap CGI program.
mod_include        Provides for server-parsed html documents.
mod_isapi          Provides support for ISAPI Extensions when running under Microsoft Windows.
mod_mime           Provides for determining the types of files from the filename.
mod_mime_magic     Attempts to determine the MIME type of a file by looking at a few bytes of its contents, the same way the Unix file(1) command works.
mod_mmap_static    Uses the mmap() function to provide a statically configured list of frequently requested, unchanged files.
mod_speling [sic]  Attempts to correct misspellings of URLs that users might have entered by ignoring capitalization and allowing up to one misspelled word.
mod_status         Allows a server administrator to remotely check the status of a server (for performance, load, connected users, etc.).
mod_userdir        Allows for user-specific directories in the form http://www.server.com/~user/.

Table 26-2
Access Control and Authentication Modules

Module             Description
mod_access         Provides for access control based on client hostname or IP address.
mod_auth           Provides for user authentication using plain text files.
mod_auth_anon      Allows "anonymous" user access to authenticated areas.
mod_auth_db        Provides for user authentication using Berkeley DB files.
mod_auth_dbm       Provides for user authentication using DBM files.
mod_digest         Provides for user authentication using MD5 Digest Authentication. It has been deprecated and is replaced by mod_auth_digest.
mod_auth_digest    Provides for user authentication using MD5 Digest Authentication similar to mod_digest, with many more options.

Table 26-3
Utility Modules

Module             Description
mod_alias          Provides for mapping different parts of the host filesystem into the document tree, and for URL redirection.
mod_info           Provides a comprehensive overview of the server configuration including all installed modules and directives in the configuration files.
mod_log_config     Provides for logging of the requests made to the server, using the Common Log Format or a user-specified format.
mod_proxy          Provides for an HTTP 1.0 caching proxy server.
mod_rewrite        Provides a rule-based rewriting engine to rewrite requested URLs on the fly.
mod_so             Provides for loading of executable code and modules into the server at start-up or restart time. On Unix, the loaded code typically comes from shared object files (usually with .so extension), whilst on Windows this module loads DLL files.
mod_vhost_alias    Provides support for dynamically configured mass virtual hosting.

Table 26-4
Header Modules

Module             Description
mod_asis           Provides for .asis files. .asis files have headers prepended to the content and are sent, well, as is.
mod_cern_meta      Provides for CERN httpd metafile semantics.
mod_env            Provides for passing environment variables to CGI/SSI scripts.
mod_expires        Provides for the generation of Expires headers according to user-specified criteria.
mod_headers        Allows for the customization of HTTP response headers. Headers can be merged, replaced or removed.
mod_negotiation    Provides for content negotiation based on, for example, language preference or browser type.
mod_setenvif       Provides for the ability to set environment variables based upon attributes of the request.
mod_unique_id      Provides a magic token for each request which is guaranteed to be unique across "all" requests under very specific conditions.
mod_usertrack      Generates a log of user activity on a site using cookies.

These modules provide Apache with flexibility, extensibility, and performance. You could, for example, compile Apache with just the modules you need and eliminate the rest. This would provide for a targeted server that's easy to maintain, in a small, tight high-performance code block. If, on the other hand, you plan to serve up a diverse range of content, provide user authentication, and enable CGI scripts to be run, simply add the necessary modules to meet these needs.

To further expand on this flexibility, Apache provides two ways to compile modules into its core code: static or shared. Statically compiled modules are linked to the core daemon code when the program is built. Shared modules link a small portion of the core daemon to the module, and a portion of the module is added to the core. In addition, an ancillary module is linked to the core httpd code (mod_so.c) that serves to "bootstrap" the modules at runtime.

Compiling modules as shared is the preferred approach when building an Apache installation for two reasons. First, once compiled, shared modules can be unloaded from the program's core executable code by commenting out the LoadModule and matching AddModule lines from httpd.conf. For example, if the image map module is compiled as a shared module, and not needed for a particular implementation, simply comment out the lines LoadModule imap_module and AddModule mod_imap.c, then restart the server. This provides the webmaster with the means to dynamically reconfigure functions within Apache without recompiling the program.
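
Commenting out a matching pair of lines is simple enough to script. The sketch below is our own illustration, not an Apache utility: disable_module is a hypothetical name, and the function prints the edited configuration to standard output rather than changing the file, so you can review the result first.

```shell
# disable_module CONF SYMBOL SOURCE
# Hypothetical helper: print CONF with the LoadModule line for SYMBOL and the
# AddModule line for SOURCE commented out. CONF itself is not modified.
disable_module() {
    sed -e "/^LoadModule[[:space:]][[:space:]]*${2}[[:space:]]/s/^/#/" \
        -e "/^AddModule[[:space:]][[:space:]]*${3}[[:space:]]*\$/s/^/#/" "$1"
}
```

A typical run would be disable_module httpd.conf imap_module mod_imap.c > httpd.conf.new, followed by a look through the new file, a configtest, and a restart once it replaces the original.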

The second reason is that shared modules (either a module provided by Apache or a custom third-party module) can be added to an installation after the server is built using a utility called apxs. The new module is compiled outside Apache's source tree and added to the server's libraries with apxs. Then add the requisite module definition lines (LoadModule and AddModule) to httpd.conf and restart the server. Adding shared modules in this manner is only possible when Apache is initially compiled with mod_so.c.

Needless to say, shared modules provide an enormous degree of flexibility for the developer or webmaster trying to administer a web server. The next section, "Building Apache from Source", details just how to make all this magic happen.

Building Apache from Source

Running Apache from pre-compiled binaries is fine for a relatively simple web site serving static pages, but if you're going to get serious about the business of running and maintaining a full-featured web server, knowing how to build Apache from source is almost mandatory. "Rolling your own code" provides several key advantages over vendor-supplied packages, the most important being:

If you want performance, extensibility, and you like to know exactly what's on your system (and how all the pieces work), nothing beats compiling a program from source - especially when it comes to a front-line server like Apache. The more you know about a product and how it works, the easier it is to guard against security threats. And nothing is more prone to security hacks than a web server. Fortunately, Apache is tight, out of the box, and can be made tighter with a little bit of insight into what goes where, why it's there, and what it does.

Earlier versions of Apache were not the easiest animal to compile. Making configuration tweaks and adjustments required dipping down into the .../src tree, and manually editing Makefile.tmpl. With the release of the 1.3 revision series, all this changed. Current Apache releases now use the APACI (APache AutoConf-style Interface) method for building and installing from source - which is just a fancy way of saying the program now uses the same commands people have come to expect when compiling a Linux program:

[tom@velocity src]$ gunzip -c apache_1.3.X.tar.gz | tar xvf -

[tom@velocity src]$ ./configure --prefix=PREFIX [option1, option2...]

[tom@velocity src]$ make

[tom@velocity src]$ make install

Note
Apache uses an autoconf-style interface, but it does not actually use the GNU autoconf package. Instead, scripts are employed to create a similar batch configuration and build process. When Apache 2.0 is released (likely sometime in early 2001), it will formally use GNU autoconf to generate dynamic configuration files.

The first step, of course, is to browse (or FTP) to the Apache site - http://httpd.apache.org/dist/ - and grab the latest stable source package. A notice at the top of the page tells you what version is latest and greatest. There are also links to mirror sites, and text files that list code changes and announcements.

In the examples provided throughout the balance of this chapter we'll be working with version 1.3.14, and our "workbench" directory (the directory we use for compiling and storing source code) is /usr/local/src. Here's the drill:

[tom@velocity tom] su -
password: yada-yada

[root@velocity root] cd /usr/local/src

[root@velocity src] ftp -i ftp.apache.org
...blah-blah-blah

[root@velocity src] gzip -dc apache_1.3.14.tar.gz | tar xf -

[root@velocity src] cd apache*

Once the package is downloaded and uncompressed, your first stop should be a perusal of Apache's configuration options.

[root@velocity apache_1.3.14] ./configure --help

The output displayed from this command is important - so much so, we recommend you pipe it to a file (by typing ./configure --help >~/apache-configure-options), print it out, and study it carefully. Our second piece of unsolicited advice is generic and applies any time you compile a program on your system - run the script program to log all of your interactions. Once you've reviewed and decided on an appropriate set of configuration options, type script ~/apache_compile_001.txt or some similar file spec. Compile the program, and when it completes, type Ctrl + d to stop the logging. A log file is priceless when it comes time to debug a problematic compilation. And finally, read the documentation provided with Apache! It will tell you what's new, what's different, and point out specific compile-time options and pitfalls for most common architectures.

We won't go through each and every configuration option, as your choices here will be based on your own unique requirements, but there are several important ones you should be aware of.

How Apache configures program-file locations

Apache provides a very slick mechanism for configuring where program files are installed. The --with-layout= option allows you to choose from a number of directory layouts used by common distributions (Apache [default], GNU, Mac OS X, RedHat, Mandrake, and many more). What makes this such a valuable feature? It removes the need to specify six or seven separate location options (binary directories, man page directory, common log directory, etc.) at the command line. Anyone who's ever struggled to update a program from a source package that's pre-configured to put everything in different locations than an existing installation will fully appreciate this enormously useful option.

Examining config.layout

There are two ways to determine which layout option best suits your existing installation. The first is to examine the config.layout file located in the topmost source directory (in our example, /home/usrname/src/apache_1.3.14). The following is a snippet of this file showing the default Apache directory layout.

#   Classical Apache path layout.
<Layout Apache>
    prefix:        /usr/local/apache
    exec_prefix:   $prefix
    bindir:        $exec_prefix/bin
    sbindir:       $exec_prefix/bin
    libexecdir:    $exec_prefix/libexec
    mandir:        $prefix/man
    sysconfdir:    $prefix/conf
    datadir:       $prefix
    iconsdir:      $datadir/icons
    htdocsdir:     $datadir/htdocs
    cgidir:        $datadir/cgi-bin
    includedir:    $prefix/include
    localstatedir: $prefix
    runtimedir:    $localstatedir/logs
    logfiledir:    $localstatedir/logs
    proxycachedir: $localstatedir/proxy
</Layout>

From Tom's Bag of Tricks
Can't find a setup in config.layout that matches your installation? Create your own. Copy the section that best matches your installation, and paste it into the file as a new section. Edit the title, and adjust any required paths. When you call the --with-layout option, simply specify the title you used for the new layout. For example, to use the newly added "<Layout Unique>" section, the command would be ./configure --with-layout=Unique. Note that section titles are case-sensitive.
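The copy-and-rename step in the tip above can also be scripted. The following is only a sketch: clone_layout is a hypothetical helper name of our own (it is not part of Apache), and it assumes the section tags sit at the start of a line exactly as in the snippet shown earlier. Try it on a backup copy of the layout file first.

```shell
# Hypothetical helper (not part of Apache): append a copy of an existing
# <Layout NAME> section to the layout file under a new, case-sensitive title.
# usage: clone_layout FILE SOURCE_NAME NEW_NAME
clone_layout() {
    file=$1 src=$2 new=$3
    # Collect the source section, swapping in the new title on its first line.
    section=$(awk -v src="$src" -v new="$new" '
        $0 == ("<Layout " src ">")   { copying = 1; print "<Layout " new ">"; next }
        copying                      { print }
        copying && $0 == "</Layout>" { copying = 0 }
    ' "$file")
    if [ -z "$section" ]; then
        echo "no <Layout $src> section found in $file" >&2
        return 1
    fi
    # Append the copy, preceded by a comment line in the file's own style.
    printf '\n#   Copy of the %s layout; adjust the paths below.\n%s\n' \
        "$src" "$section" >> "$file"
}

# Example (file name and titles are placeholders):
#   clone_layout my-layout-file RedHat Unique
```

After editing the paths in the new section, ./configure --with-layout=Unique --show-layout should confirm where everything will land.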

Using the --show-layout option

The second method to display a directory layout is to use the --show-layout option in conjunction with the named section. While this approach presupposes knowledge of the available layouts, it has an extremely useful role in ensuring all the right rocks are going in the right jars. The following listing shows the RedHat layout option:

[tom@velocity apache_1.3.14]$ ./configure --with-layout=RedHat --show-layout
Configuring for Apache, Version 1.3.14
 + using installation path layout: RedHat (config.layout)

Installation paths:
               prefix: /usr
          exec_prefix: /usr
               bindir: /usr/bin
              sbindir: /usr/sbin
           libexecdir: /usr/lib/apache
               mandir: /usr/man
           sysconfdir: /etc/httpd/conf
              datadir: /home/httpd
             iconsdir: /home/httpd/icons
            htdocsdir: /home/httpd/html
               cgidir: /home/httpd/cgi-bin
           includedir: /usr/include/apache
        localstatedir: /var
           runtimedir: /var/run
           logfiledir: /var/log/httpd
        proxycachedir: /var/cache/httpd

Compilation paths:
           HTTPD_ROOT: /usr
      SHARED_CORE_DIR: /usr/lib/apache
       DEFAULT_PIDLOG: /var/run/httpd.pid
   DEFAULT_SCOREBOARD: /var/run/httpd.scoreboard
     DEFAULT_LOCKFILE: /var/run/httpd.lock
      DEFAULT_XFERLOG: /var/log/httpd/access_log
     DEFAULT_ERRORLOG: /var/log/httpd/error_log
    TYPES_CONFIG_FILE: /etc/httpd/conf/mime.types
   SERVER_CONFIG_FILE: /etc/httpd/conf/httpd.conf
   ACCESS_CONFIG_FILE: /etc/httpd/conf/access.conf
 RESOURCE_CONFIG_FILE: /etc/httpd/conf/srm.conf

Building Apache with a shared module

To build Apache with a shared module, you must enable the module (if it is disabled by default) and tell the configure script to build the module in shared mode. For example, to add the mod_info module (not enabled by default) as a shared module the configuration command would be as follows:

[tom@velocity apache_1.3.14]$ ./configure --with-layout=RedHat \
	--enable-module=info \
	--enable-shared=info

The backslashes in the above listing indicate that a command line is continued onto the next line, as though no carriage return were involved. On another topic, why are we showing you the RedHat layout in an OpenLinux book? Ah, good question grasshopper. It just so happens the RedHat option uses the same directory locations as OpenLinux, with one exception - the libexecdir. Caldera's distributions use the directory /usr/libexec/apache instead of /usr/lib/apache.

We can correct this small faux pas, and compile Apache with the correct installation location for OpenLinux with two simple configuration options:

./configure --with-layout=RedHat --libexecdir=/usr/libexec/apache

Or, we can just leave well enough alone and simply use RedHat's layout. It's not critical where the library files are installed, just as long as Apache knows where to find them - which it does.

As discussed in the preceding section, "Modules - The Core of Apache's Flexibility," modules can be compiled statically or shared. For an onscreen reminder of the default status of a module (enabled or disabled), type ./configure --help and look for the section titled "--enable-module=NAME/--disable-module=NAME".

Compiling modules statically

Simply typing ./configure builds Apache with those modules enabled by default, statically linked. To enable a disabled module, use the --enable-module=NAME option; to disable an enabled module, use the --disable-module=NAME option. No rocket science involved here.

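To build an enabled module as a shared object instead of linking it statically, add the matching --enable-shared option:

    --enable-shared=info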

Note that the --enable-shared= option does not reference the full module name; just the name after the "mod_" prefix. Several points and tips regarding building with shared modules are in order:

In the following examples, we'll try to tie all the above options and explanations into concrete examples by showing you how to configure and compile four distinct Apache servers: a default build, a build for serving custom content, a build to serve static content, and a build that utilizes DSO modules to provide the "kitchen sink" - a very flexible kitchen sink.

Example One: Building a Web Server with Apache's Defaults

In this first example, we'll start with the basics and build Apache with configuration defaults. Begin in the directory containing the uncompressed source code (in our case, this is /usr/local/src) and issue the configure and make commands as a "regular" user. You must su to root for the final installation step.

[tom@velocity src]$ ./configure
*** much verbose output ***

[tom@velocity src]$ make
*** more verbose output ***

[tom@velocity src]$ su
Password: DontPAnic

[root@velocity src]# make install
*** more verbose output ***

Note that we have not specifically enabled or disabled any modules, nor have we explicitly told the compiler to dynamically link them during the build process. Therefore, the 17 default modules as detailed in Tables 26-1 through 26-4 are compiled statically. The following command displays the build options for the completed binary:

[root@velocity src]# /usr/local/apache/bin/httpd -l
Compiled-in modules:
  http_core.c
  mod_env.c
  mod_log_config.c
  mod_mime.c
  mod_negotiation.c
  mod_status.c
  mod_include.c
  mod_autoindex.c
  mod_dir.c
  mod_cgi.c
  mod_asis.c
  mod_imap.c
  mod_actions.c
  mod_userdir.c
  mod_alias.c
  mod_access.c
  mod_auth.c
  mod_setenvif.c

Example Two: Building a Web Server for Custom Content

Most computer projects follow a remarkably consistent pattern: 80 percent planning, 10 percent implementation, and 10 percent maintenance. If you skimp on the planning phase - trust us - you'll end up paying the price in one of the two remaining phases. Planning a web server installation consists primarily of answering one key question: What kind of content are you going to provide?

In this example, we are going to build a copy of Apache for a well-designed private or commercial website. In this context, well-designed means that whoever is charged with designing the web knows the layout, and has already specified the type of files it will host.

The Webmaster has decided the site will not contain imagemaps or asis files, so the mod_imap and mod_asis modules can be disabled. There are no user directories on the system (all contributors have pre-established user accounts under /home), and all directories will have index files, so mod_userdir and mod_autoindex can also be disabled. Finally, none of the pages require any sort of authentication, so mod_auth can be dispensed with as well (the other mod_auth_* modules are not compiled in by default). The choice is made to keep mod_access, however, to protect the server-status page.

mod_status is required to keep track of the status of the server, and mod_access to limit access to the status page to internal IPs only. mod_dir allows the Webmaster to specify that the default index for each DocumentRoot directory is index.shtml. (Using the AddHandler directive provided by mod_mime, we define files with a .shtml extension to be handled by mod_include, which means that the web server will parse them for special processing directives, which it will execute. We have also, through mod_dir, told Apache to serve a file called index.shtml whenever someone requests a directory, that is, a URL that ends with a /.) These are all enabled by default, and require no extra --enable-module options. Since the marketing department has seen fit to publish mixed-case URLs in already released advertisements, Apache needs mod_speling (see Table 26-1), which makes URLs case insensitive (--enable-module=speling).

Reminder
Don't forget that when specifying modules to enable or disable, you need to list the name of the module, without the "mod_" prefix.

The standard Apache layout fits the bill with one minor exception - the system administrator wants all log files written to a separate existing partition and stored in the directory /logs/httpd. This is accomplished by passing --logfiledir=/logs/httpd to the configure script.

Here is the required configuration command:

[root@velocity root]# cd /usr/local/src/apache_1.3.14

[root@velocity apache_1.3.14]# ./configure --with-layout=Apache \
  --logfiledir=/logs/httpd \
  --enable-module=speling \
  --disable-module=imap \
  --disable-module=asis \
  --disable-module=userdir \
  --disable-module=autoindex \
  --disable-module=auth \
  --verbose

Apache stores this configuration in a file called config.status located in the root of the source tree (where the configure script lives), so the build can be duplicated easily. After the configuration process finishes, you'll be returned to the command prompt; type make and watch the magic unfold. Once the compile is complete, make install will put the files into the directories specified by the layout chosen (again, this requires root access to the machine).

The finished binary looks like this:

[root@velocity src]# /usr/local/apache/bin/httpd -l
Compiled-in modules:
  http_core.c
  mod_env.c
  mod_log_config.c
  mod_mime.c
  mod_negotiation.c
  mod_status.c
  mod_include.c
  mod_dir.c
  mod_cgi.c
  mod_actions.c
  mod_speling.c
  mod_alias.c
  mod_access.c
  mod_setenvif.c

Slim, compact, and to the point.

Example Three: Building a Static Content Web Server

In this third example, we're going to take the "custom content" example shown above and "tighten the ship" even further. The following web server is built for one thing and one thing only - to deliver static pages. This is a common scenario for a large number of sites, and requires only a small handful of Apache's modules enabled. The server's only role is to fetch files from disk and deliver them to the requesting client. Thus we can disable all access, include, index, and CGI references. The mod_rewrite module is included to allow redirecting any requests for sales.website.com to www.website.com.

The configuration command for this server follows:

[tom@velocity src]$ ./configure --with-layout=RedHat \
  --disable-module=imap \
  --disable-module=asis \
  --disable-module=userdir \
  --disable-module=autoindex \
  --disable-module=auth \
  --disable-module=access \
  --disable-module=include \
  --disable-module=dir \
  --disable-module=cgi \
  --disable-module=env \
  --disable-module=setenvif \
  --disable-module=negotiation \
  --enable-module=rewrite \
  --verbose

Once the compile process is complete, the finished binary looks like this:

[root@velocity src]# /usr/sbin/httpd -l
Compiled-in modules:
  http_core.c
  mod_log_config.c
  mod_status.c
  mod_actions.c
  mod_alias.c
  mod_rewrite.c

Example Four: Building a Flexible, Loaded Web Server

OK, we admit it - the web is a wild and wooly place at times. What if you can't anticipate what kind of content or features you'll need down the road? Thanks to the power of Apache's module design, there is a solution that affords you both flexibility and performance. Here are the commands you'll need:

[root@velocity src]# ./configure --with-layout=GNU \
  --enable-module=most \
  --enable-shared=max

[root@velocity src]# make

[root@velocity src]# make install

In this example, Apache is built using the GNU layout option. The --enable-module=most option enables most modules, leaving out mod_auth_db, which needs third-party libraries not available on every platform, plus mod_log_agent and mod_log_referer, which are deprecated. The third option, --enable-shared=max, enables DSO support for all compiled modules. Using this approach, you decide which modules are loaded each time the server starts by commenting out (or restoring) the appropriate LoadModule and matching AddModule lines in httpd.conf.

Warning
Always remember, if you're disabling a shared module in httpd.conf, you must comment out both the LoadModule and matching AddModule lines as a pair. Failure to do so causes Apache great anguish - the daemons will initialize without protest, but they will immediately quit without warning or notification.
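One way to catch an unpaired line before restarting the server is to compare the two lists. The sketch below assumes the usual httpd.conf form for a DSO build, LoadModule info_module libexec/mod_info.so paired with AddModule mod_info.c, and matches the two by file name; check_module_pairs is a name we made up for illustration. Expect false alarms for modules whose shared object is named differently from the source file, and for statically compiled modules such as mod_so.c, which only ever have an AddModule line.

```shell
# Hypothetical helper: report LoadModule/AddModule lines that lack a partner.
# Commented-out lines are ignored because their first field is not the
# bare directive name.
# usage: check_module_pairs HTTPD_CONF
check_module_pairs() {
    awk '
        $1 == "LoadModule" {
            # third field is the path to the .so file; keep its base name
            n = split($3, parts, "/"); name = parts[n]
            sub(/\.so$/, "", name); loaded[name] = 1
        }
        $1 == "AddModule" {
            name = $2; sub(/\.c$/, "", name); added[name] = 1
        }
        END {
            for (m in loaded) if (!(m in added))  print "LoadModule without AddModule: " m
            for (m in added)  if (!(m in loaded)) print "AddModule without LoadModule: " m
        }
    ' "$1"
}

# Example: check_module_pairs /etc/httpd/conf/httpd.conf
```

No output means every active LoadModule line has a matching AddModule line by name.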

The compiled binary, in this case, is remarkably sparse:

[tom@velocity src]$ /usr/bin/httpd -l
Compiled-in modules:
  http_core.c
  mod_so.c

This is due to the fact that all modules are dynamically linked for this build example. mod_so.c is the "bootstrap" element Apache uses to load all modules specified in httpd.conf when the program is executed. Don't let the sparse output here fool you. If you've compiled all 17 modules as shared, and all 17 are active in the configuration file, all 17 will be loaded and enabled at runtime.

Building a customized Apache implementation from source is easy. As you can see, the hardest part is in the planning - deciding what kind of content your site will serve, understanding which modules do what, and deciding how to configure the required modules (statically or dynamically). By using only the modules you need, you can build a web server that is fast, streamlined, and simple to maintain.

Ill-Documented Feature
One frequent question that comes up on the mailing lists regarding the Apache Web server sub-system is the problem of publishing from browsers. Netscape's publish-to-web feature actually uses FTP to upload files, so FTP services must be enabled, and properly configured. The equivalent feature in Internet Explorer / MS FrontPage requires the Microsoft FrontPage Extensions module, a complex configuration process that we do not have the space to address. Correctly configuring a system for safe and secure remote upload of Web content requires its own book. Use the Apache and FTP resources that we reference in this book to explore these issues for yourself, if necessary.

Addressing Security and Authentication Issues

Any time you open a server to world access - which is how Apache is typically implemented - security and vulnerability become key issues for the administrator. Apache, "out of the box", is a relatively well-secured program. At this point in time we're going to assume that you already have your installation protected through the use of proper User and Group directives, and that you understand the relationship between file permissions and the aforementioned User/Group accounts. In this section we address another layer of filesystem protection: block directives and user authentication.

Protecting your filesystem

Apache supports a number of block directives that limit the actions allowed for a particular directory, file, or object. An object in this context refers to URLs, symbolic links, scripts, server-side includes, indexes, content views, and other such abstract entities. It's very important to understand how these blocks work and how access rules within them apply. Failure to understand the rules can result in a filesystem that you think is protected left wide open to anyone who comes knocking.

Controlling filesystem access

Filesystem access can be controlled on an individual user basis (which we'll cover later in this chapter), or by allowing/denying access by host using a specific IP, a range of IPs, a hostname, or a group of hostnames. The commands used for this are allow from and deny from.

The order in which allow and deny commands are applied is not set by the order in which they appear in the file. The default order is deny then allow; if a client is excluded by deny, it is excluded unless it matches allow. If neither is matched, the client is granted access. Read the above again - it is pivotal in correctly implementing block directives.

For example:

allow from 123.231
deny from all

This denies everyone except clients with an IP that starts with 123.231. Remember, the default order is deny then allow, with a single exception - if the order command is used. The order command controls the order in which the deny/allow directives are applied. Now, examine the following commands:

order allow,deny
allow from 123.456
deny from all

This version closes the site from all access because the order statement is used, and the deny from all directive is applied last, overriding the allow from which is parsed first. To turn the last example on its head, we could write,

order deny,allow
allow from all
deny from 123.456

Which is a functionally useless statement, as it lets everyone in without restriction. The next point to note is that when several allow (or several deny) directives appear in the same block, they are cumulative: a host that matches any one of them matches the list as a whole. In other words, directives in a block are evaluated in the default order (deny/allow) unless the order command overrules it, and the position of an individual allow or deny line within the block makes no difference.

order deny,allow
deny from all
allow from 123.456
allow from 123.456.789

The above block grants access to every host in 123.456, not just those in the 123.456.789 subset, as you might assume. The second allow line is redundant, because the first already covers the wider network; a narrower entry that appears later does not override a broader one.
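For readers who think better in code, here is a rough model of how the order setting decides between the allow and deny lists. It is a sketch only: access_decision and matches_any are hypothetical names, only plain IP prefixes and the keyword all are handled, and a list is treated as matching when any one of its entries matches. Apache's real matching (hostnames, netmasks, and so on) is richer than this.

```shell
# Return success if the client IP matches any entry in the list.
# Entries are IP prefixes such as 123.231, or the word "all".
matches_any() {
    ip=$1; shift
    for entry in "$@"; do
        case $entry in
            all) return 0 ;;
        esac
        case $ip in
            "$entry"|"$entry".*) return 0 ;;
        esac
    done
    return 1
}

# usage: access_decision ORDER CLIENT_IP "ALLOW LIST" "DENY LIST"
# ORDER is deny,allow (the default) or allow,deny.
access_decision() {
    order=$1 ip=$2
    if [ "$order" = "deny,allow" ]; then
        result=allowed                          # nothing matched: let the client in
        matches_any "$ip" $4 && result=denied   # deny list is checked first
        matches_any "$ip" $3 && result=allowed  # allow list has the last word
    else
        result=denied                           # nothing matched: keep the client out
        matches_any "$ip" $3 && result=allowed  # allow list is checked first
        matches_any "$ip" $4 && result=denied   # deny list has the last word
    fi
    echo "$result"
}

access_decision deny,allow 123.231.7.9 "123.231" "all"   # prints: allowed
access_decision deny,allow 10.1.2.3    "123.231" "all"   # prints: denied
access_decision allow,deny 123.231.7.9 "123.231" "all"   # prints: denied
```

The three calls mirror the first two examples in this section: the same allow and deny lines give opposite results once the order is switched to allow,deny.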

Hint
While it is possible to use a domain name (for instance, a hypothetical thebadguys.com) in the allow and deny directives, it's generally a bad idea. This is because someone who controls their own DNS server can create a DNS record which allows them to appear to be from another domain. Spoofing IP addresses is considerably more difficult, so IP addresses are the safer way to specify access control.

Now that you have the hammer and nails in hand, let's look at the "blocks" Apache uses to protect and control a filesystem.

Using block directives

Block directives come in a variety of flavors. Each one provides the means to apply one or more directives to a specific virtual host, directory, or file. In addition, two "if/then" blocks are provided that enable or disable directives based on the presence of a module or of a parameter passed as an argument to the server at startup.

Combining the above block options with a set of directives and deny/allow statements allows the webmaster to not only restrict access to certain areas of the filesystem, but also fine-tune what actions can be performed within a given web site. Here are some examples to get your creative juices flowing.

<Directory />
  Order deny,allow
  Deny from all
</Directory>

The preceding Directory block restricts anyone from accessing the filesystem (any requests to Apache, that is). Block directives are recursive, so this is a good starting point to securing your server. It also effectively blocks all requests to all webs, which pretty much renders Apache useless. So the next step is to open up the directories where user documents are located.

<Directory /home/httpd/html>
  Options Indexes FollowSymLinks
  AllowOverride none
  Order allow,deny
  Allow from all
</Directory>

Now requests to documents under /home/httpd/html can be accessed by "all", indexes will be generated "on-the-fly" for directories lacking an index.html file, and symbolic links can be followed to access files referenced in one directory but residing in another. Note that following a symlink does not change the pathname Apache matches against Directory blocks: a link under /home/httpd/html is served under this block's rules even if its target lives elsewhere on the filesystem, so take care over which links you create here. This also brings up a second point to keep in mind: block directives are read linearly from httpd.conf, so always start with the most restrictive permissions on '/' and work down and out both when you create blocks and in relationship to the server's filesystem.

Caution
The most common error administrators make when creating block directives is forgetting the closing "/" on the final statement. Apache's block directives work exactly like HTML tags. They begin with an opening directive (<Directory>) and must close with a matching directive (</Directory>).
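A quick script can catch this kind of slip before Apache does. The following sketch is our own illustration (check_blocks is a hypothetical name): it walks a configuration file, complains about an opening line with no closing ">", and reports blocks that are closed out of order or never closed. It compares tag names exactly as typed, and it is no substitute for Apache's own syntax check (httpd -t).

```shell
# Hypothetical helper: report unbalanced or malformed block directives.
# usage: check_blocks HTTPD_CONF   (exit status is non-zero if problems are found)
check_blocks() {
    awk '
        /^[ \t]*#/ { next }                       # skip comment lines
        /^[ \t]*<\// {                            # closing tag, e.g. </Directory>
            name = $0
            sub(/^[ \t]*<\//, "", name)
            sub(/>.*$/, "", name)
            if (depth == 0 || stack[depth] != name) {
                printf "line %d: unexpected </%s>\n", NR, name
                bad = 1
            } else {
                depth--
            }
            next
        }
        /^[ \t]*</ {                              # opening tag, e.g. <Directory /path>
            if ($0 !~ />[ \t]*$/) {
                printf "line %d: missing closing >\n", NR
                bad = 1
            }
            name = $1
            sub(/^</, "", name)
            sub(/>$/, "", name)
            stack[++depth] = name                 # remember which block is open
            next
        }
        END {
            while (depth > 0) {
                printf "unclosed <%s>\n", stack[depth--]
                bad = 1
            }
            exit bad + 0
        }
    ' "$1"
}

# Example: check_blocks /etc/httpd/conf/httpd.conf
```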

Block directives are one of the key building blocks of an Apache configuration. They're flexible, powerful, and dangerous - one wrong move and you can easily completely close off all access to your web server. The balance of this chapter contains numerous examples of block directives in action. We also refer you to Apache's default configuration file (httpd.conf.default) for additional usage hints and tricks.

User authentication

In the next two sections, we're going to examine the topic of user authentication. Third-party products and modules aside, Apache supports two methods of restricting user access to directories and/or files: basic authentication and digest authentication. The key difference between these two methods of authentication lies in how passwords are transmitted - basic sends clear text passwords; digest encrypts passwords using a hash function. Keep in mind that with either method, once a user is authenticated all subsequent transactions between the client and server are sent in clear text. If fully encrypted transactions are what you're after, SSL (Secure Sockets Layer) is the route to follow. SSL is beyond the scope of this chapter; full details on implementing SSL under Apache can be found at http://www.apache-ssl.org.

Basic Authentication

Basic authentication is relatively simple in both principle and implementation. The client requests access to a directory or file that requires authentication. Apache replies with a request for a username and password (Error 401). When this information is returned, it's checked against a file containing a list of users and (ironically) encrypted passwords. If the username supplied is on the list, and the password matches, the client is granted access. Apache also supports groups, so you can group a list of names and allow or deny access to a group as a whole.

Username/password combinations are valid for a given realm, which simply provides the administrator with a means to further granularize (or alternatively broaden) the scope of access.
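To see just how "clear" the clear text is, consider what the browser actually sends once the user has typed a name and password: the two are joined with a colon and Base64-encoded, which is an encoding and not encryption. The sketch below uses the tom/dontpanic pair from this chapter's examples; the function names are our own, and it assumes the GNU base64 utility is installed.

```shell
# Build the request header a browser sends for Basic authentication.
# usage: basic_auth_header USER PASSWORD
basic_auth_header() {
    printf 'Authorization: Basic %s\n' "$(printf '%s:%s' "$1" "$2" | base64)"
}

# Recover user:password from the Base64 part of that header.
# usage: decode_basic_auth BASE64_STRING
decode_basic_auth() {
    printf '%s' "$1" | base64 -d
    echo
}

basic_auth_header tom dontpanic          # prints: Authorization: Basic dG9tOmRvbnRwYW5pYw==
decode_basic_auth dG9tOmRvbnRwYW5pYw==   # prints: tom:dontpanic
```

Anyone who can read the request can run the second command, which is why basic authentication on its own offers no protection against eavesdropping.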

Apache requires two elements for basic authentication: a username/password file, and the appropriate directives in httpd.conf. We'll start by creating and populating the password file.

The first step is to create a directory where the password file will reside (preferably outside the document root where no one can mess with it). Next, 'su' to root and create the password file using the Apache utility htpasswd. For detailed instructions on the use of htpasswd, see the manual page; for a quick "reminder" of command line options simply type htpasswd without any arguments.

[bilbrey@velocity httpd]$ mkdir /home/httpd/users

[bilbrey@velocity httpd]$ cd users

[bilbrey@velocity users]$ htpasswd -c basic tom
New password: dontpanic
Re-type password: dontpanic
Adding password for user tom

[bilbrey@velocity users]$ htpasswd basic brian
New password: flutterblast
Re-type password: flutterblast
Adding password for user brian

This creates a password file called basic, and adds entries for two users, tom and brian. Note that when the file is first created, the '-c' (create) option is required. After this, simply specify the password file you want the user added to, and a username. Now that we have the password file in place, it's time to add some directives to httpd.conf.

User nobody
Group nobody
ServerName www.syroidmanor.com
ServerAdmin [email protected]
DocumentRoot /home/httpd/htdocs
ErrorLog /home/httpd/htdocs/logs/error_log
CustomLog /home/httpd/htdocs/logs/access_log custom

<VirtualHost 142.165.206.61>
ServerAdmin [email protected]
ServerName ols.syroidmanor.com
DocumentRoot /home/httpd/htdocs/ols
ErrorLog /home/httpd/htdocs/ols/logs/error_log
CustomLog /home/httpd/htdocs/ols/logs/access_log custom

<Directory /home/httpd/htdocs/ols>
AuthType Basic
AuthName darksecrets
AuthUserFile /home/httpd/users/basic
require valid-user
</Directory>

</VirtualHost>

Here we've added a directory block for authentication within the <VirtualHost> block. With this configuration, anytime a user tries to access the /home/httpd/htdocs/ols directory, they'll be prompted for a username and password which will be verified against the password file (/home/httpd/users/basic) that we just created.

The directives responsible for the authentication process are AuthType, AuthName, and AuthUserFile (the location of the user password file). Users requiring access to this directory will need to supply a username, a password, and depending on the browser, the AuthName or realm. Most browsers keep a local record of the realm after the first access.

Optionally, you can also add the AuthGroupFile directive. This directive supplies the name and location of a plain-text file containing a list of group names and the users who are members of each. For example:

AuthGroupFile /home/httpd/users/groups

The group file would contain entries of the form:

groupname: username username username...
groupname: username username
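Because the group file is maintained by hand, it is easy to end up with a group that names a user who was never added with htpasswd. A small cross-check is sketched here, assuming the password file holds one user:encrypted-password entry per line (the format htpasswd writes) and the group file uses the form shown above; missing_group_members is our own illustrative name.

```shell
# Hypothetical helper: list group members that have no password file entry.
# Output is one "groupname: username" line per missing user.
# usage: missing_group_members PASSWORD_FILE GROUP_FILE
missing_group_members() {
    awk -F: '
        NR == FNR { known[$1] = 1; next }     # first file: remember each user name
        {
            # second file: field 2 holds the space-separated member list
            n = split($2, members, " ")
            for (i = 1; i <= n; i++)
                if (!(members[i] in known)) print $1 ": " members[i]
        }
    ' "$1" "$2"
}

# Example: missing_group_members /home/httpd/users/basic /home/httpd/users/groups
```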

The require directive is the key that invokes the password checking process - leave it out, and the authentication process fails. The options are:

require user user1 user2 user3
require group group1 group2 group3
require valid-user

The last option, valid-user, accepts any user found in the AuthUserFile.

Caution
Do not mistype the valid-user option as valid_user. Doing so will produce a cryptic and misleading authentication error when the client tries to access the directory. This is because Apache interprets valid_user as a username, not an option.

Digest Authentication

Using basic user authentication via plain text passwords is not a terribly effective approach. After all, if someone really wants into a restricted access directory or file, all they have to do is use a network sniffer and capture the username/password combination as it's passed in clear-text across the network connection. There is an alternative to plain text passwords, however, and it's called digest authentication.

Digest authentication uses a cryptographic hash function known as MD5 to create a password hash that is sent in place of the plain text password. In addition to the hash, the client also sends the URI, the MD5 method used to create the hash (there are several), and a nonce. A nonce is simply a number that the server sends to the client and that is different each time. The client uses it to build the password hash, so the hash is also different each time, which protects against replay attacks.

Here's how a digest authentication transaction plays out:

  1. The client requests a URL from the server.
  2. The server checks the URL and sees that it's a protected file or directory. The server then sends the client an error "401" (Authentication Required) along with a nonce.
  3. The client combines the user's password and the nonce to create a hash, then returns this to the server along with the requested URL and the hash method.
  4. The server receives the hash (remember, the server generated the nonce and knows what the number is), looks up the user's entry in its authentication file, computes the same hash itself, and grants access only if the two values match.
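The hash in step 3 can be reproduced by hand. The sketch below follows the original digest scheme (RFC 2069, without the later qop extensions): a first hash of user:realm:password, a second of method:URI, and a final hash that ties both to the nonce. It assumes GNU md5sum is available; md5_hex and digest_response are illustrative names, and the nonce and URI in the example call are made up.

```shell
# MD5 of a string, as 32 lowercase hex digits (assumes GNU md5sum).
md5_hex() {
    printf '%s' "$1" | md5sum | cut -d' ' -f1
}

# usage: digest_response USER REALM PASSWORD METHOD URI NONCE
digest_response() {
    user=$1 realm=$2 password=$3 method=$4 uri=$5 nonce=$6
    ha1=$(md5_hex "$user:$realm:$password")   # depends only on the credentials
    ha2=$(md5_hex "$method:$uri")             # depends only on the request
    md5_hex "$ha1:$nonce:$ha2"                # changes whenever the nonce changes
}

# Made-up request values; the credentials are the ones used in this chapter.
digest_response brian darksecrets illnevertell GET /ols/ 1234abcd
```

Note that the first hash never changes for a given user and realm. It is the value stored in the digest file, which is why the server can check the response without keeping the plain text password around.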

Before digest authentication can be implemented under Apache, three elements must be in place: the mod_auth_digest module must be compiled either statically or dynamically into the program's code (see "Building Apache from Source," earlier in this chapter); a digest file containing user/realm/password must exist; and required directives must be added to httpd.conf.

Warning
Digest authentication must be supported by the client's browser, and not all vendors support (or support correctly) the MD5 method. Make sure you thoroughly test any digest authentication implementation, with a wide range of browsers, before putting it into effect.

The program used to create the digest password file is called htdigest, and is typically located in /usr/local/bin (this location can vary based on the installation layout used). The command syntax is as follows:

htdigest [-c] passwordfile realm user

The '-c' option is only required when the file is first created; it can be dropped thereafter. Realm is an arbitrary name for the authentication group you wish to create. Using different realms allows the administrator to add different users to different realms, and keep all authentication information in one file. Begin by creating a directory to contain the digest password file (for example, /home/httpd/digest):

[tom@janus httpd]$ mkdir digest

[tom@janus httpd]$ cd digest

[tom@janus digest]$ /usr/local/bin/htdigest -c authusers darksecrets brian
Adding password for user brian in realm darksecrets.
New password: illnevertell
Re-type password: illnevertell

[tom@janus digest]$ /usr/local/bin/htdigest authusers darksecrets dan
 * * *

The first htdigest command above creates (-c) a new file called authusers and adds the user brian to the realm darksecrets.

Note
Digest authentication can, in principle, use the realm or the username. Support for this is extremely spotty, however, so unless something miraculous happens to browsers between now and when this book goes to press, we do not recommend this approach.

Now let's create an httpd.conf entry that allows Brian to access the famed darksecrets realm.

User nobody
Group nobody
ServerName www.syroidmanor.com
ServerAdmin [email protected]
DocumentRoot /home/httpd/htdocs
ErrorLog /home/httpd/htdocs/logs/error_log
CustomLog /home/httpd/htdocs/logs/access_log custom

<VirtualHost 142.165.206.61>
ServerAdmin [email protected]
ServerName ols.syroidmanor.com
DocumentRoot /home/httpd/htdocs/ols
ErrorLog /home/httpd/htdocs/ols/logs/error_log
CustomLog /home/httpd/htdocs/ols/logs/access_log custom

<Directory /home/httpd/htdocs/ols>
AuthType Digest
AuthName darksecrets
AuthDigestFile /home/httpd/digest/authusers
require valid-user
</Directory>

</VirtualHost>

Note that we've used a directory block to encapsulate the authentication directives. The AuthType directive specifies the authentication type (Digest), AuthName names the realm, and AuthDigestFile points to the digest password file. For the require directive we've used valid-user. Alternately, you could specify a list of usernames.

Digest authentication provides a reasonably secure balance between authorizing users with clear text passwords, and a full-time encryption protocol like SSL. Unfortunately, digest authentication is dependent on browser support. Your decision to authenticate users using digests should be based on whether or not you can control the browsers your clients use.

The Role of .htaccess

There is a third way of enforcing user authentication that we'll touch on before wrapping up this section - the .htaccess file.

Note
Keep in mind as you read through the following material that the .htaccess file is not restricted to authentication directives (although this is how it's most commonly used). Almost all the directives allowed in directive blocks and the main server configuration section of httpd.conf, can also be placed in this file.

Any time changes are made to httpd.conf, the server must be restarted before those changes can take effect. This is because Apache only reads its configuration file at startup. As an alternative, changeable directives (including authentication statements) can be placed in .htaccess. When it exists, this file is read by the server at each access. The advantage of using an .htaccess file is flexibility - the webmaster (or web owner, if the file is placed in the web's document root) can edit entries here and have them take effect without requiring a restart. The disadvantage is a serious degradation in performance as the file must be parsed and the directives there analyzed before every request is served.

The name .htaccess is used by convention, but this file can be called anything you want. Just make sure you tell Apache what access control file to look for. The directive is AccessFileName filename ... (for example, AccessFileName .htaccess .myaccess1 .otheraccess2) and can be placed anywhere in the main server configuration section of httpd.conf. Adding this directive requires a restart, and in most cases, also requires the client browser to be restarted to clear password caching.

Why does the use of .htaccess files exact a performance penalty? In addition to the time required to read the configuration from this file each and every time a request is made, the server also must ensure there are no other overriding access files contained elsewhere on the filesystem. For example, let's say a client requests access to the file /home/httpd/testsite/htdocs/index.html and there is an .htaccess file present in this directory. Apache searches for the following:

/.htaccess
/home/.htaccess
/home/httpd/.htaccess
/home/httpd/testsite/.htaccess
/home/httpd/testsite/htdocs/.htaccess
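The lookup sequence above can be sketched in a few lines of Python. This illustrates the directory walk only and is not Apache's actual code; the function name htaccess_candidates is our own.

```python
from pathlib import PurePosixPath


def htaccess_candidates(requested_file, access_name=".htaccess"):
    """List the access files checked for a request, starting at the filesystem root."""
    directory = PurePosixPath(requested_file).parent
    # .parents runs from the nearest ancestor up to "/", so reverse it.
    ancestors = list(directory.parents)[::-1]
    return [str(d / access_name) for d in ancestors + [directory]]


for path in htaccess_candidates("/home/httpd/testsite/htdocs/index.html"):
    print(path)
```

Every extra directory level in the path adds one more file for the server to look for on each request.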

As you might imagine, between the initial file parse and this multiple search mechanism, this slows the server down rather dramatically. You can turn multiple searching off with the following directive:

<Directory />
AllowOverride None
</Directory>

In addition to performance issues, using the .htaccess configuration method carries with it some serious security issues. First, unless you explicitly prevent it, clients can see the .htaccess file. This loophole can be closed by adding the following lines to Apache's configuration file:

<files .htaccess>
order allow,deny
deny from all
</files>

The .htaccess in the listing above should be replaced with whichever filenames you use for the purpose.

The second security concern arises with the web owner. .htaccess files are typically used on servers supporting a large number of users, where each user has their own web site under /home/httpd/~username. Allowing .htaccess files in the user's document root lets the web owner tailor various configuration options within their own document tree. It also opens up the possibility of the web owner adding options to their .htaccess file that could possibly compromise server policy or security (for example, server-side includes or the execution of CGI scripts).

To run a tight ship, the webmaster should always ensure the following block is present in httpd.conf:

<Directory />
AllowOverride None
Options None
Order deny,allow
Deny from all
</Directory>

This effectively locks down the server against the aforementioned performance and security concerns. Then, depending on policy, the webmaster can "unlock" one or more specific options either through a global block statement (say, for the /home/httpd/~username directories) or by adding directives to individual <VirtualHost> blocks - which is the topic of the next section. For a list of options available for the Options directive, please see the Apache documentation.

Tip
Apache's online documentation can be found (depending on layout) in the DocumentRoot/manual directory in HTML format or on the web at http://httpd.apache.org/docs/.

Virtual Hosting with Apache

Apache supports running multiple virtual servers, commonly called virtual hosting. In English, this simply means that Apache can be configured to serve pages to more than one domain name from a single IP. This is achieved with the NameVirtualHost directive and one or more <VirtualHost> blocks specific to each domain. Virtual hosting has three general forms: Name-based, IP-based, and mixed Name/IP. Name-based is by far the most common, so we'll cover it first.

Name-based virtual hosting

Below is a segment from one author's actual virtual host section.

### Section 3: Virtual Hosts ###

NameVirtualHost 142.165.206.61

<VirtualHost 142.165.206.61>
	ServerAdmin [email protected]
	ServerName www.syroidmanor.com
	DocumentRoot /home/tom/webs/syroidmanor
	ErrorLog /home/tom/webs/syroidmanor/logs/error_log
	CustomLog /home/tom/webs/syroidmanor/logs/access_log common
</VirtualHost>

<VirtualHost 142.165.206.61>
	ServerAdmin [email protected]
	ServerName insights.syroidmanor.com
	DocumentRoot /home/tom/webs/insights
	ErrorLog /home/tom/webs/insights/logs/error_log
	CustomLog /home/tom/webs/insights/logs/access_log common
</VirtualHost>

<VirtualHost 142.165.206.61>
	ServerAdmin [email protected]
	ServerName www.daynotes.com
	DocumentRoot /home/tom/webs/daynotes
	ErrorLog /home/tom/webs/daynotes/logs/error_log
	CustomLog /home/tom/webs/daynotes/logs/access_log common
</VirtualHost>

<VirtualHost _default_:*>
	ServerAdmin [email protected]
	DocumentRoot /usr/local/apache/htdocs
</VirtualHost>

The key to the virtual hosts section is the NameVirtualHost statement. This tells Apache to subdivide requests to the IP 142.165.206.61 by domain name according to the information contained in the <VirtualHost> directive blocks.

DNS records for the IP 142.165.206.61 point to the domain syroidmanor.com, hosted by the server Hydras. This DNS record also contains several CNAME, or alias records, namely www.syroidmanor.com and insights.syroidmanor.com. A second DNS record exists for the domain daynotes.com, which is also assigned the IP 142.165.206.61.

When a client sends a request for the URL http://www.daynotes.com, the DNS record is resolved to 142.165.206.61 and the client is forwarded to the server at this location. Apache is listening on port 80, and when the request arrives the VirtualHost section is scanned for a ServerName match. If one is found, the directives in that section are executed, and the client is pointed to the document directory referenced by the DocumentRoot directive - in this case, /home/tom/webs/daynotes. Note that in this example, all the webs served from this machine have their own error and access logs. If the ErrorLog and CustomLog directives were not present for a given Virtual Host, then logging would still occur but the entries would be written to the default "common" log files (typically, in the directory /var/apache/logs).

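The last directive block, <VirtualHost _default_:*>, is present as a "fall-through" for requests arriving on an IP address or port that no other <VirtualHost> block claims. Such clients are directed to the web pages found at /usr/local/apache/htdocs, which in this case are simply Apache's default indexes supplied with the program. Be aware that a request to 142.165.206.61 whose host name matches none of the ServerName directives is not caught here; Apache serves it from the first <VirtualHost> block listed for that IP (www.syroidmanor.com in this example). This block could easily be adapted to point to a directory containing an error document specific to the site.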

IP-based hosting

IP-based hosting is very similar to name-based, except the server has more than one IP assigned to it. For example:

### Main Server Config ###

Port 80
ServerName server.domain.com
DocumentRoot /www/domainone/htdocs

### Virtual Hosts Section ###

# No NameVirtualHost lines are needed here: each IP address serves exactly one site

<VirtualHost 222.33.33.111>
ServerName www.domainone.com
DocumentRoot /www/domainone/htdocs
...
</VirtualHost>

<VirtualHost 222.33.33.222>
ServerName www.domaintwo.com
DocumentRoot /www/domaintwo/htdocs
...
</VirtualHost>

Requests to the IP 222.33.33.111 are directed to the /www/domainone/htdocs directory, and requests to 222.33.33.222 are directed to /www/domaintwo/htdocs.

Caution
Pay close attention to your entries in the Virtual Hosts section. Any directories specified here must exist before Apache is started, or the daemon will refuse to run and quit (see the error log file for a possible explanation). Also, pay attention to spelling. Typing /homes instead of /home will not just disable the Virtual Host block containing the error - it will prevent Apache from starting.

As an alternative, you can direct requests to the server by IP and port number. Such a configuration is typically used when a second IP is not available or for testing a new web site before making it available to the public. We do not recommend using port-based redirection on a long-term basis - users tend to have a strong aversion to adding port numbers to a URL.

### Main Server Config ###

Port 80
Listen 222.33.33.111:80
Listen 222.33.33.111:8080
ServerName server.domain.com

### Virtual Hosts Section ###

<VirtualHost 222.33.33.111:80>
ServerName www.domainone.com
DocumentRoot /www/domainone/htdocs
...
</VirtualHost>

<VirtualHost 222.33.33.111:8080>
ServerName www.domaintwo.com
DocumentRoot /www/domaintwo/htdocs
...
</VirtualHost>

Mixed name/IP hosting

The third method of redirecting requests using Virtual Hosting is by using a combination of Name and IP. The key to employing this method is to tell Apache which IP (or, in a more complex scenario, IPs) to subdivide. In the example shown, two IPs are assigned to the server: 192.168.1.2 and 192.168.1.3. The NameVirtualHost directive tells the daemon to subdivide requests for 192.168.1.2 only. In this example, both www.domain.com and www.domaintwo.com resolve to 192.168.1.2; www.domainthree.com is assigned to 192.168.1.3.

NameVirtualHost 192.168.1.2

<VirtualHost 192.168.1.2>
ServerAdmin [email protected]
DocumentRoot /www/domain/htdocs
...
</VirtualHost>

<VirtualHost 192.168.1.2>
ServerAdmin [email protected]
DocumentRoot /www/domaintwo/htdocs
...
</VirtualHost>

<VirtualHost 192.168.1.3>
ServerAdmin [email protected]
DocumentRoot /www/domainthree/htdocs
...
</VirtualHost>

When a request arrives on interface 192.168.1.2, Apache redirects the client according to the domain name provided. All requests to 192.168.1.3 are automatically assumed to be for the domain associated with that IP, and sent to /www/domainthree/htdocs.

Finally, keep in mind most directive options found in httpd.conf's main server section can also be used within a virtual host block.

<VirtualHost 222.33.33.111>
ServerName www.domain.com
DocumentRoot /www/domain/htdocs
  <Directory /www/domain/htdocs/secrets>
   Order Deny,Allow
   Deny from all
   Allow from 222.33.33
  </Directory>
</VirtualHost>

In the above example, only clients connecting from the 222.33.33.X address block are permitted access to the /www/domain/htdocs/secrets directory.

Hint
New syntax was introduced with version 1.3.14 that eliminates the need to specify a numeric IP for the NameVirtualHost directive. Please see the Apache Group's online documentation for specifics and more virtual host examples (http://www.apache.org/docs/vhosts).

Virtual hosting can get thorny in a hurry with complex configurations, especially for servers supporting a large number of sites. Our advice is to go slow, add and test one virtual host section at a time, and remain conscious of how fickle Apache is about spelling and punctuation.

Logging and Monitoring Apache

Logging is an important component of any server application, perhaps more so for a web server. Webmasters want to know not only the status of their server, but also who's visiting their site, how often, and which pages are being read. Apache provides three general types of logs: error, access, and custom.

Depending on how you've configured Apache, logs are written to one location (the default is /var/log/httpd) if the server is providing web services for a single site, or to multiple locations if the server is supporting virtual hosts and separate log directives have been specified for each host. In the case of the latter, error and access logs pertaining to that host are recorded in the location specified by the log directive; error and access logs that are server specific are written to the locations listed in the server configuration section of httpd.conf.

For example, the following snippet illustrates the configuration file for a web server hosting a single site:

User nobody
Group nobody
ServerAdmin [email protected]
ServerName www.orbdesigns.com
DocumentRoot /home/httpd/html
ErrorLog /var/log/httpd/error_log
LogLevel warn
CustomLog /var/log/httpd/access_log common

All web pages are served from the DocumentRoot /home/httpd/html, all errors are recorded to /var/log/httpd/error_log, and all client access details are written to /var/log/httpd/access_log.

Extra Info
The LogLevel directive controls the level of error logging Apache performs. Possible values include: debug, info, notice, warn, error, crit, alert, and emerg. Debug is the most verbose, and includes all the more severe levels (info through emerg). So setting the LogLevel to warn (as illustrated in the configuration above) logs messages at the levels warn[ing], error, crit[ical], alert, and emerg[ency].
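The threshold behavior of LogLevel can be modeled with a short Python sketch. It is a simplified illustration that assumes a message is recorded whenever it is at least as severe as the configured level; the list and function names are our own.

```python
# Severity levels from most to least severe.
LEVELS = ["emerg", "alert", "crit", "error", "warn", "notice", "info", "debug"]


def logged_levels(log_level):
    """Return the levels recorded when LogLevel is set to log_level."""
    return LEVELS[: LEVELS.index(log_level) + 1]


print(logged_levels("warn"))
```

Under the same assumption, logged_levels("debug") returns the whole list, which is why debug is the most verbose setting.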

When log locations are specified in a <VirtualHost> container, however, things work a little differently. Consider the following scenario:

User nobody
Group nobody
ServerAdmin [email protected]
ServerName www.syroidmanor.com
DocumentRoot /home/httpd/html
ErrorLog /var/log/httpd/error_log
LogLevel warn
CustomLog /var/log/httpd/access_log common

### Virtual Hosts Section ###

NameVirtualHost 142.165.206.61

<VirtualHost 142.165.206.61>
	ServerAdmin [email protected]
	ServerName www.daynotes.com
	DocumentRoot /home/tom/webs/daynotes
	ErrorLog /home/tom/webs/daynotes/logs/error_log
	CustomLog /home/tom/webs/daynotes/logs/access_log common
</VirtualHost>

<VirtualHost _default_:*>
	ServerAdmin [email protected]
	DocumentRoot /usr/local/apache/htdocs
</VirtualHost>

In the above configuration, server errors are written to /var/log/httpd/error_log, errors specific to requests made to the web site www.daynotes.com are written to /home/tom/webs/daynotes/logs/error_log, and client access information is written to /home/tom/webs/daynotes/logs/access_log. In addition, any requests arriving on an address or port not claimed by another <VirtualHost> block "fall through" to the directory /usr/local/apache/htdocs. As a log location is not specified for this block, error and access logging information is written to the default location, /var/log/httpd/error_log or access_log respectively.

Apache's error log

Error logs record any errors detected by Apache and take two general forms: access errors and server errors/notifications.

Access Errors

An access error occurs when a client cannot access a requested file or directory, a URL embedded on a page is invalid, or an action is attempted that is prohibited by a directory rule. Below are several examples of access errors:

[Tue Oct  3 21:09:53 2000] [error] [client 209.53.245.180] Directory index 
	forbidden by rule: /home/tom/webs/insights/2000/

[Wed Oct  4 20:42:06 2000] [error] [client 209.53.245.60] Invalid URL in 
	request GET /1999/../../syroidmanor/tom.htm HTTP/1.0

[Thu Oct  5 21:43:12 2000] [error] [client 216.34.121.67] File does not 
	exist: /home/tom/webs/insights/1999/Other/these_things.htm

[Mon Oct  9 07:09:27 2000] [error] [client 4.34.160.63] (13)Permission 
	denied: file permissions deny server access: /home/tom/webs/insights/2000/20001009.htm

Most access errors are relatively self-explanatory. The log format shows the date and time the error occurred, the type of error, the client that experienced the error, and the file or directory that generated it. The first error above was logged in response to a request for an index file in a directory where no such access is permitted. The second error points to an invalid URL within the requested document, and the third was generated by a request for a document that does not exist. The final example is common - the permission bits set on the file do not permit access to the user or group attempting access (in this case, nobody/nobody).

Server Errors/Notifications

In addition to request errors, Apache's error log also tracks what's happening at the server level. Depending on the LogLevel set, Apache can be configured to record emergencies (an error that stops the server completely), notices (for example, a restart to reload the configuration file), or debug mode (all server actions). See the Extra Info box above for the full range of LogLevel options.

Below are several examples of server errors:

[Fri Oct 20 00:49:21 2000] [notice] Apache/1.3.14 (Unix) 
	configured -- resuming normal operations

[Fri Oct 20 00:53:07 2000] [notice] SIGUSR1 received.  
	Doing graceful restart

[Thu Oct 19 12:53:30 2000] [crit] (67)Address already in 
	use: make_sock: could not bind to port 80

[Thu Oct 19 23:33:47 2000] [warn] pid file /var/apache/run/httpd.pid 
	overwritten -- Unclean shutdown of previous Apache run?

The error log should be your first source of information when troubleshooting a problem with Apache.

Apache's access log

Every time a client accesses your web server, it generates an entry in the access log. This file is typically used by the Webmaster to determine how many "hits" a site received in a given period, who was visiting and from where, what pages were read most frequently, and even what browser the client used.

Apache uses the common log file format, which is a standard supported by the W3C (the World Wide Web Consortium) and most analyzer programs. Entries are logged as follows:

remotehost identifier authuser date "request" status bytes

These entries are defined as follows:
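As a hedged summary of the usual meaning of each field: remotehost is the client's host name or IP address, identifier is the identity reported by the client's ident service (normally a dash), authuser is the authenticated user name (a dash when no authentication took place), date is the time of the request, request is the request line sent by the client, status is the HTTP status code returned, and bytes is the size of the response body. The Python sketch below splits one such entry into those fields. The sample line and the parse_clf function are hypothetical, written for illustration rather than taken from a real log.

```python
import re

# One named group per common log format field.
CLF_PATTERN = re.compile(
    r'(?P<remotehost>\S+) (?P<identifier>\S+) (?P<authuser>\S+) '
    r'\[(?P<date>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)


def parse_clf(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # A dash in the bytes field means no body was sent.
    entry["bytes"] = 0 if entry["bytes"] == "-" else int(entry["bytes"])
    return entry


sample = '209.53.245.180 - - [03/Oct/2000:21:09:53 -0600] "GET /index.html HTTP/1.0" 200 2326'
print(parse_clf(sample))
```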

We mentioned it earlier in the chapter, but a reminder is in order: turning on HostnameLookups is not a good idea. It slows down logging dramatically (as each visiting host's IP must be resolved), which results in a less responsive server. There are numerous logging tools available that can analyze logs after the fact, and resolve host names without loading down Apache. (We can recommend Webalizer. It's GPL licensed, and available from ftp://ftp.mrunix.net/pub/webalizer/).

Update Today!
If you're running an older version of Apache, you might find the TransferLog directive used instead of CustomLog. If so, we strongly recommend an upgrade to the latest Apache release. Not only will you gain extensive custom logging abilities, but numerous security updates as well.

A scan of your httpd.conf file reveals that the common log format is actually a layout option. The pertinent logging line should look similar to the following:

CustomLog /var/log/httpd/access_log common

Several lines above the CustomLog entry you'll find a number of LogFormat directives, each with its custom layout and an "alias" that can be appended to any CustomLog entry (combined, common, referer, or agent).

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%h %l %u %t \"%r\" %>s %b" common
LogFormat "%{Referer}i -> %U" referer
LogFormat "%{User-agent}i" agent

As you've probably already surmised, creating your own custom log format is as simple as entering a new LogFormat line, specifying what information you want included using variables, and tagging a unique alias at the end of the line. For a complete list of available variables and their meaning, consult the Apache documentation (or point your browser to http://httpd.apache.org/docs/mod/mod_log_config.html#formats).
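To see how the variables in a LogFormat line map onto a finished log entry, consider the following Python sketch. It is a toy substitution, not Apache's formatter: it handles only the simple variables used by the common layout (not the %{...}i forms), and the request values are invented for illustration.

```python
import re

# The common layout as Apache sees it once the httpd.conf quoting is removed.
COMMON = '%h %l %u %t "%r" %>s %b'


def render(log_format, values):
    """Replace each simple %-variable in log_format with its entry in values."""
    return re.sub(r"%>?[A-Za-z]", lambda m: values[m.group(0)], log_format)


# Hypothetical request data, keyed by the variable it fills.
request = {
    "%h": "209.53.245.180",                # remote host
    "%l": "-",                             # remote logname
    "%u": "-",                             # authenticated user
    "%t": "[03/Oct/2000:21:09:53 -0600]",  # time of the request
    "%r": "GET /index.html HTTP/1.0",      # first line of the request
    "%>s": "200",                          # final status code
    "%b": "2326",                          # bytes sent
}
print(render(COMMON, request))
```

Reordering or dropping variables in the format string changes the log line in exactly the same way, which is all a custom LogFormat alias really does.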

Log rotation

One of the first lessons the novice webmaster learns is that access logs can get BIG in a hurry, especially for high-traffic sites. Depending on the content of the web pages being served, and the amount of information the log is recording, it's not at all unusual to see sites that receive just 1000 hits a day generating access logs over 20MB in size in less than two weeks! Left unchecked, ballooning log files can quickly fill a filesystem to capacity. And in case you haven't heard yet, if you fill /var to capacity such that it can't write any more system errors to the appropriate log file, you risk bringing not just Apache to its knees, but the whole operating system as well.

Fortunately, the good folks at Caldera were aware of this threat to system stability and have supplied you with all the tools required to keep log files in check. The program is called logrotate, and if you installed Apache from the OpenLinux distribution CD you'll even find the activation script pre-configured to manage web server logs - providing these logs are stored in the default location. You'll understand this provision as soon as you see the script.

logrotate is a Linux program (no relationship to Apache), and was designed to automatically manage system log files. It incorporates mechanisms for log roll-over or rotation, compression, deletion, and will even email log files to an administrator on request.

To manually force a log rotation, use the command /usr/sbin/logrotate -f /etc/logrotate.conf (--force is the long form of -f). This "forces" a rotation to take place independent of the schedule set by cron.

Here's how all this works. Once a day (typically at 4 am; set in /etc/crontab), cron executes any scripts it finds in the /etc/cron.daily directory, one of which is called logrotate. The logrotate script executes the /usr/sbin/logrotate command, which in turn reads the contents of the /etc/logrotate.conf file. logrotate then goes to the /etc/logrotate.d directory and runs any scripts it finds there. One of the scripts in the /etc/logrotate.d directory is called apache.

Before we examine these files, it's important to note the following items. The order of the above execution is important. The logrotate.conf file sets the default parameters logrotate uses to execute the scripts found in /etc/logrotate.d. The actual names of the scripts found in /etc/cron.daily and /etc/logrotate.d have no bearing on anything - the apache script could be called fiddle-dee-dum and it would still execute as outlined above. The names of the scripts are a convenience; whatever is contained in either directory is executed.

So why is the order of execution so important? Because, as designed, the apache log rotation is configured to take place once a week. But why, you ask, is it only run once a week when the script is called daily by cron? The answer lies in logrotate.conf:

# /etc/logrotate.conf
# sample logrotate configuration file
#
# rotate log files weekly
weekly

# keep 4 weeks worth of backlogs
rotate 4

# send errors to root
errors root

# create new (empty) log files after rotating old ones
create

# uncomment this if you want your log files compressed
#compress

# RPM packages drop log rotation information into this directory
include /etc/logrotate.d

# no packages own lastlog or wtmp -- we'll rotate them here
/var/log/wtmp {
    monthly
    create 0664 root utmp
    rotate 1
}

# system-specific logs may be configured here

Note the first option (weekly), and recall that logrotate bases its default configuration on the contents of logrotate.conf. In other words, cron calls logrotate daily; logrotate sees that the last web log rotation was Sunday and, as it's only Tuesday, takes no further action.

The second option, rotate 4, determines how many weeks of back-logs the script saves. When a new log file is created, logrotate "pushes" the existing log onto a stack and renames it log_name.1, then creates a new empty file, log_name. At the end of four weeks, you're left with 4 files: log_name.1, log_name.2, log_name.3, and log_name.4. The fifth week, log_name.4 is discarded and the process continues. Increase this number to increase the number of back-logs.
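The "stack" behavior described above can be simulated in Python. This sketch models file names in a dictionary rather than touching the disk, and it is our own simplification of what logrotate does, not its implementation; the rotate function and the "week N" labels are illustrative.

```python
def rotate(files, name, keep):
    """Return the file set after one rotation, keeping `keep` back-logs."""
    files = dict(files)
    files.pop(f"{name}.{keep}", None)      # the oldest back-log is discarded
    for n in range(keep - 1, 0, -1):       # shift .3 -> .4, .2 -> .3, .1 -> .2
        old = f"{name}.{n}"
        if old in files:
            files[f"{name}.{n + 1}"] = files.pop(old)
    if name in files:
        files[f"{name}.1"] = files.pop(name)
    files[name] = ""                       # "create" starts a new, empty log
    return files


logs = {"log_name": "week 1"}
for week in range(2, 7):
    logs = rotate(logs, "log_name", keep=4)
    logs["log_name"] = f"week {week}"      # the new log fills up during the week

for filename in sorted(logs):
    print(filename, "->", logs[filename])
```

After the fifth rotation the entries labeled week 1 are gone and log_name.4 holds week 2, matching the discard step described in the text.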

The errors root option names the address to which error messages are mailed if the rotation fails for any reason. The create option is self-explanatory. Uncomment the compress line to compress old logs.

The file shown below, /etc/logrotate.d/apache, is the actual script logrotate uses when the time comes to rotate Apache's log files:

# /etc/logrotate.d/apache
# Sample apache logrotate file
#
# Edited 10/11/00 tms
# added /home/tom/webs/daynotes error/access logs

/var/log/httpd/access_log {
    missingok
    size=500K
    postrotate
        /usr/bin/killall -USR1 httpd
    endscript
}

/var/log/httpd/error_log {
    missingok
    mail [email protected]
    postrotate
        /usr/bin/killall -USR1 httpd
    endscript
}

/home/tom/webs/daynotes/logs/access_log {
    missingok
    postrotate
        /usr/bin/killall -USR1 httpd
    endscript
}

/home/tom/webs/daynotes/logs/error_log {
    monthly
    missingok
    mail [email protected]
    postrotate
        /usr/bin/killall -USR1 httpd
    endscript
}

We've made some changes to the script to illustrate several features available under logrotate. In the first stanza, the size=500K line has been added. When .../httpd/access_log exceeds 500K, the log is rotated; until then it is skipped. In the next stanza, we've added a line that mails the log file to root before it's rolled. As the majority of the logging on this server takes place under the /home/tom/webs tree, this reminds us to scan the daemon error log once a week for any problems. The last two stanzas have been added to rotate the logs defined in a Virtual Hosts section. For more details on allowable options, type man logrotate.

Monitoring Apache

Apache provides the means to get comprehensive diagnostic information on the state of the server, the configuration in use, what modules are loaded, and a partridge in a pear tree (if desired). Note that the commands discussed in this section require that Apache be compiled with the mod_status module (for status) and the mod_info module (for info); see the section "Building Apache from Source" earlier in this chapter for details.

Most administrators consider status reports confidential, so the very first thing you'll want to do is restrict who can issue a query to the server.

User nobody
Group nobody
ServerName www.syroidmanor.com
DocumentRoot /home/httpd/html

<Location /status>
  order deny,allow
  allow from 192.168.0.1
  deny from all
  SetHandler server-status
</Location>

<Location /info>
  order deny,allow
  allow from 192.168.0.1
  deny from all
  SetHandler server-info
</Location>

In the above configuration segment, we use two <Location> blocks and limit access to the status and info commands to the IP 192.168.0.1. Recall that when using the order directive, the last entry has the last word. So we set the access permissions to order deny,allow, deny from all, and allow (from our internal IP only). A handler is a piece of code built into Apache that performs a certain action. In this case, we're telling the server to invoke the server-status code when called by the location /status, and the server-info code when called by /info (a location takes a single handler, so only one SetHandler line belongs in each block). Taken as a package, these blocks allow only the specified IP to invoke the necessary code to access Apache's internal status commands.
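The "last entry has the last word" rule for order deny,allow can be sketched in Python. This is a simplified model under stated assumptions: it understands only the keyword all, full IP addresses, and partial addresses such as 222.33.33, whereas Apache accepts other forms as well. The function name is our own.

```python
def access_permitted(client_ip, deny_from, allow_from):
    """Model "order deny,allow": deny rules are applied first, allow rules last."""

    def matches(rules):
        return any(
            rule == "all"
            or client_ip == rule
            or client_ip.startswith(rule + ".")  # partial address such as 222.33.33
            for rule in rules
        )

    permitted = True                 # nothing matched: access is allowed
    if matches(deny_from):
        permitted = False
    if matches(allow_from):          # allow is evaluated last, so it wins
        permitted = True
    return permitted


print(access_permitted("192.168.0.1", deny_from=["all"], allow_from=["192.168.0.1"]))
print(access_permitted("10.0.0.5", deny_from=["all"], allow_from=["192.168.0.1"]))
```

The first call models the administrator's workstation and is permitted; the second models any other client and is refused.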

Now fire up a browser (ensure you're connecting from the IP set by the allow from directive), and type http://www.yourservername.com/status into the location bar. You should see output similar to the following:

Apache Server Status for www.syroidmanor.com
Server Version: Apache/1.3.14 (Unix)
Server Built: Oct 15 2000 20:10:03
Current Time: Thursday, 28-Oct-2000 22:03:10
Restart Time: Monday, 10-Oct-2000 08:03:17
Server Uptime: 
Total Accesses: 7 - Total Traffic 187 kB
CPU Usage: 1.00 (1.00) load average: 1.72, 1.94, 1.98
.145 requests/sec - 0 B/second - 0 B/request
3 requests currently being processed, 7 idle servers
---SNIP---

The report goes on for several lines and ends with an output of real-time requests, showing host IPs, the virtual hosts in use (if any), and the page being accessed by the client. All very useful and informative stuff.

In similar fashion, we can examine the actual configuration of the server - that is, the modules in use, the directives in use, etc. - by typing http://www.yourserver.com/info.

We won't even attempt to show you the output from this command; it's at least nine or ten pages in length, very detailed, and specific to the server. The info command is very useful if you find yourself responsible for a remote server and you need to see how it's configured, or what effects a changed configuration has had on running parameters.

Real-Time Monitoring - An Alternate Approach

If you don't want the overhead of adding the status and/or info modules to Apache, there is another way to monitor your server logs in real-time. Simply open a terminal window on the server, and type: tail -f /path/to/your/logs/access_log [or error_log]

This shows the last few lines of your log file, and as clients connect (or errors are generated), the corresponding entries are added to the output displayed. This is a great "quick and dirty" way to monitor server activity when you don't need all the detail provided by Apache's status commands.

Consulting Apache Resources

In OpenLinux, with Apache RPMS (or fully installed from source code), there is a single manual page for httpd. This directs our attention to the manual (in HTML format, of course), which is found at /usr/doc/apache-1.3.11/manual/ (for the eDesktop version). Recommended reading.

Aside from the documentation at the main site http://www.apache.org/, there are a variety of other Internet resources. A good starting point for online research is the recently updated (September, 2000) Apache Overview HOWTO, at http://www.linuxdoc.org/HOWTO/Apache-Overview-HOWTO.html. It contains documentation, tips, tricks, and links to several other useful sites.

Needless to say, there are many books in print about the world's most popular web server software. We've used Apache, The Definitive Guide by Ben Laurie and Peter Laurie (O'Reilly and Associates) and Apache Server Unleashed by Rich Bowen, et al (Sams Publishing). A complete listing of printed matter about Apache is found at http://www.apache.org/info/apache_books.html.

Summary

It's almost certain that we skipped over one of your favorite Apache features or capabilities. It's a daunting task that we set for ourselves - describe and configure the world's most popular web server, to which whole books have been devoted, in fewer than 50 pages. So it goes. We covered the following topics in this chapter:



Licenced under the Open Content License ver. 1

All Content Copyright © 2001 - Brian P. Bilbrey & Tom Syroid All Rights Reserved.