 Published on MacDevCenter (http://www.macdevcenter.com/)


Apache Web Serving with Jaguar, Part 2

by Kevin Hemenway, coauthor of Mac OS X Hacks
04/11/2003

Editor's note: In his previous article, rewritten for Jaguar, Kevin Hemenway showed you how to easily start serving web pages from your Mac OS X computer. In this tutorial, he explores the world of CGI access. To gain the most from what Kevin has to offer here, you'll need some familiarity with the Terminal application.

If you haven't explored that feature yet, I recommend that you first read our Terminal companion article, "Learning the Terminal in Jaguar." Once you're comfortable using the command line, you can return here and dig deeper into Apache.

Learning a Bit More About Apache

So, after the work we did in the last article, we've got a prettier URL, but still a rather boring site. We need features to impress the boss, and to turn them on, we're going to have to start fiddling with Mac OS X's Terminal. We're going to assume you know how to edit and save files via the command line, either through a native shell editor (like vi or emacs) or a GUI editor, such as BBEdit. We prefer BBEdit 7.0 and its shell utility (which you can install via Preferences->Tools->Install "bbedit" Tool).

Before we turn on features, we need to know where Apache's configuration file lives. To find out, we'll query the Apache web server itself, so open a Terminal and enter the following command line:

httpd -V

This will spit out a screen of information specific to your Apache installation (this command will work with any Apache install--for Mac, Linux, or Windows). On an OS X 10.2.4 machine, we find out that we're using Apache 1.3.27:

Server version: Apache/1.3.27 (Darwin)
Server built:   10/16/02 21:48:47

We also learn where the server configuration file and error log are located:

-D SERVER_CONFIG_FILE="/etc/httpd/httpd.conf"
-D DEFAULT_ERRORLOG="/var/log/httpd/error_log"

Along with the default error log (which is insanely useful for debugging), Apache also writes to /var/log/httpd/access_log, which keeps track of every request your Apache web server has received. Note: In previous versions of the Apple-supplied Apache, a DEFAULT_XFERLOG would also have been defined in the output of httpd -V, and it'd point to the access_log we just mentioned. This functionality has since been moved to a CustomLog statement within the Apache configuration file. If you have no clue what I'm talking about, that's just fine, too--this is only useful to nitpickers or long-time OS X web servers.

Learning About CGI Access

It's now time to fiddle with the most common feature available to a web server: CGI. Without getting overly esoteric, CGI allows us to install thousands of different scripts that can be accessed through a normal web browser. CGI scripts are often written in Perl (also installed by default under OS X), and can allow users to access databases, use interactive forms, chat on bulletin boards, and so on.

Apache comes with two simple scripts that can verify CGI is configured correctly. Before we test them, however, let's see what we can learn from the Apache configuration file. To start, open up your config file in your favorite text editor (this example assumes BBEdit):

bbedit /etc/httpd/httpd.conf

Be forewarned: the Apache configuration file is rather large, but also well-documented. Take its introductory warning to heart: "Do NOT simply read the instructions ... without understanding what they do. They're here only as hints or reminders. If you are unsure consult the online docs. You have been warned." The online docs are available at the Apache web site.

The quickest way to learn your way around the Apache configuration file is to search for the feature you want to enable. In our case, we'll start by looking for "CGI." The first entry we find is:

LoadModule cgi_module libexec/httpd/mod_cgi.so

Followed shortly by:

AddModule mod_cgi.c

You'll see a number of these lines within the Apache config file. If you've ever worked with a plugin-based program, you'll easily recognize their intent--these lines load different features into the Apache web server. Apache calls these "modules," and you'll see a lot of the module names prefixed with mod_, such as mod_perl and mod_php. Lines that are commented out (that is to say, lines that are prefaced with a # character) are inactive.
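To see at a glance which modules are actually being loaded, you can filter out the commented lines. This is a rough sketch only: active_modules is a hypothetical helper, and the small config file it reads is a stand-in we build on the spot (its commented-out line is made up for illustration).

```shell
#!/bin/sh
# active_modules FILE: print the module name from every LoadModule line
# that is not commented out.
# NOTE: active_modules is a hypothetical helper, not an Apache tool.
active_modules() {
    # A commented line starts with "#", so its first word is never "LoadModule".
    awk '$1 == "LoadModule" { print $2 }' "$1"
}

# A tiny stand-in for httpd.conf. The commented line is an invented example
# of an inactive module entry.
conf=$(mktemp)
cat > "$conf" <<'EOF'
LoadModule cgi_module libexec/httpd/mod_cgi.so
#LoadModule example_module libexec/httpd/mod_example.so
AddModule mod_cgi.c
EOF

active_modules "$conf"
rm -f "$conf"
```

Pointed at the real file (active_modules /etc/httpd/httpd.conf), it should list only the modules whose lines are live.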

Because our CGI lines are already active (they're not prefaced with #), we'll continue searching:

ScriptAlias /cgi-bin/ "/Library/WebServer/CGI-Executables/"

<Directory "/Library/WebServer/CGI-Executables">
    AllowOverride None
    Options None
    Order allow,deny
    Allow from all
</Directory>

The ScriptAlias directive allows us to map a URL to a location on our hard drive. In this case, the ScriptAlias line is mapping a URL of http://127.0.0.1/cgi-bin/ to the hard drive location of /Library/WebServer/CGI-Executables/. If you browse to that folder, you'll see the two CGI scripts I mentioned above: printenv and test-cgi. The <Directory> block isn't that important right now, so we'll move on to our next search result:

# AddHandler cgi-script .cgi

This is your first major decision concerning your Apache installation. When a certain directory has been ScriptAliased (as our CGI-Executables directory has, above), the files within that directory are always executed as CGI scripts. If the files were moved out of that directory (say, into our DocumentRoot), they'd be served as normal text files.
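To make the ScriptAlias mapping concrete, here is a small sketch of the prefix substitution it performs. It only mimics the idea in the shell; alias_to_path is a hypothetical name, and Apache's real handling involves more than string replacement.

```shell
#!/bin/sh
# alias_to_path URL_PREFIX DIRECTORY URL_PATH
# Print the file a ScriptAlias-style mapping would point at, or return 1
# if the URL path does not start with the prefix.
# NOTE: a hypothetical helper that only mimics the prefix substitution.
alias_to_path() {
    case "$3" in
        "$1"*) printf '%s%s\n' "$2" "${3#"$1"}" ;;  # swap the URL prefix for the directory
        *)     return 1 ;;                          # not covered by this alias
    esac
}

alias_to_path /cgi-bin/ /Library/WebServer/CGI-Executables/ /cgi-bin/test-cgi
```

A request for /cgi-bin/test-cgi therefore lands on /Library/WebServer/CGI-Executables/test-cgi, while a path such as /index.html falls outside the alias entirely.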

By uncommenting the AddHandler line, you're telling Apache to treat any file that ends in .cgi as a script to be executed. This applies in any directory where the ExecCGI option is enabled, no matter which user put the file there, and is often considered a security hazard (for example, if you had an anonymous FTP server that uploaded directly into your web space, a malicious user could upload a damaging .cgi script, then execute it through their browser).

In a default installation of Apache on Mac OS X, CGI scripts are only allowed within /Library/WebServer/CGI-Executables/. Uncommenting the above line (removing the # character) opens the door to running CGI scripts from other locations, such as a user directory like /Users/morbus/Sites. In our case, because we aren't using the User directories (because they create ugly URLs for GatesMcFarlaneCo's intranet), we're going to leave the line commented.

If CGI access is turned on already (as per the AddModule and LoadModule lines we found earlier), we should be able to reach one of the pre-installed scripts and see a happy result, right? Try http://127.0.0.1/cgi-bin/test-cgi (yes, test-cgi not test.cgi). You were probably greeted by a not-so-joyous response: "FORBIDDEN," Apache screams. "You don't have permission to access!"

Huh? Why didn't this work? Now is a perfect time to prove how useful the Apache web server logs can be. If you recall our discussion about httpd -V above, you'll remember that Apache's access log is located at /var/log/httpd/access_log. Let's look at the end of that file, easily reached with tail (which prints the last ten lines):

tail /var/log/httpd/access_log

You'll see that the last line looks similar to:

127.0.0.1 - - [19/Feb/2003:20:26:22 -0500]
  "GET /cgi-bin/test-cgi HTTP/1.1" 403 292
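For scripting purposes, an entry like this one can be split into fields with awk. The sketch below assumes the log format shown in the sample entry and nothing fancier; log_fields is a hypothetical name of our own.

```shell
#!/bin/sh
# log_fields: read access_log lines on stdin and print
# "client status bytes path" for each one.
# Assumes the whitespace-separated layout of the sample entry;
# log_fields is a hypothetical helper name.
log_fields() {
    # $1 = client, $7 = requested path, $9 = response code, $10 = response size
    awk '{ print $1, $9, $10, $7 }'
}

# The sample entry from the article, joined back onto one line.
line='127.0.0.1 - - [19/Feb/2003:20:26:22 -0500] "GET /cgi-bin/test-cgi HTTP/1.1" 403 292'
printf '%s\n' "$line" | log_fields
```

Against the live log, tail /var/log/httpd/access_log | log_fields gives the same condensed view of the most recent requests.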

Quickly, this line shows where this particular request came from (127.0.0.1), the time the file was requested, the protocol used (HTTP/1.1), the response code (403), and the size of the response (292 bytes). This is all fine and dandy, but doesn't tell us what went wrong. For this, we'll dip into our error log (pinpointed by the httpd -V command):

tail /var/log/httpd/error_log
.
.
[Wed Feb 19 20:26:22 2003] [error] [client 127.0.0.1]
  file permissions deny server execution:
  /Library/WebServer/CGI-Executables/test-cgi

Bingo! This tells us exactly what went wrong--the file permissions were incorrect. For Apache to run a CGI script, the script in question needs to have "execute" permissions. To give the test-cgi file the correct permissions, do the following in the Terminal:

cd /Library/WebServer/CGI-Executables
sudo chmod 755 test-cgi

After running the above (you'll be asked for an administrator's password), load the URL again, and you should be happily greeted with gobs of environment information. (To learn more about permissions and the chmod and sudo commands, consult your favorite search engine, friendly geek, or O'Reilly-stocked library.)
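If you install more scripts later, the same permissions problem is easy to spot before Apache complains. Here is a minimal sketch, assuming the web server runs as an account that is neither the owner nor in the group of the files (so the "other" execute bit is the one that matters). The helper name not_world_executable is hypothetical, and it only reads permissions.

```shell
#!/bin/sh
# not_world_executable DIR: list regular files under DIR that lack the
# "other" execute bit (octal 001), which chmod 755 would have set.
# NOTE: a hypothetical helper; it reads permissions and changes nothing.
not_world_executable() {
    find "$1" -type f ! -perm -001
}

# Demonstrate on a scratch directory instead of the real CGI-Executables.
demo=$(mktemp -d)
touch "$demo/test-cgi" "$demo/printenv"
chmod 644 "$demo/test-cgi"   # readable but not executable
chmod 755 "$demo/printenv"   # executable by everyone
not_world_executable "$demo"
rm -rf "$demo"
```

Only the 644 file is listed. Run against /Library/WebServer/CGI-Executables, an empty result would suggest every script there is at least executable by everyone.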

With the basics of CGI out of the way, you can now install CGI-based applications to complement your intranet. Need a content management system for the GatesMcFarlaneCo developers to keep everyone up to date on their coding progress? Try the ever popular Movable Type.

Turning On Server-Side Includes

Server-side includes, better known as SSIs, allow you to include content from other files or scripts into the page currently being served. This is done by Apache before the page is actually shown to the user--a visitor will never know what you've included or where.

Commonly, SSIs are used to include things such as headers, footers, and "what's new" features across an entire site. When you need to change your site logo's height and width, for instance, you can change the header include only, and the changes will be reflected immediately wherever you've included that file.

SSIs, by default, are turned off. To turn them on, we're going to use the same "search for the feature" trick we did above. Open your Apache configuration file, and search for "server side." Our only match is near the previously seen AddHandler for CGI scripts:

# To use server-parsed HTML files
#
# AddType text/html .shtml
# AddHandler server-parsed .shtml

Happily, this is exactly what we're looking for. Those simple Add lines tell us a lot. They establish a pattern based on what we already know about CGI. If you recall, we could have turned on the CGI feature for files ending in .cgi--in other words, any file you created with the .cgi extension (whether it was a CGI program or not), would be treated as an executable script.

Likewise, these lines are telling us that we can turn on the server-side include feature for files ending in .shtml. Whether we actually use the SSI feature in these files doesn't matter--they'd still be treated and processed as if they did.

This is important. You may be thinking "If SSIs are so great, why not enable them for .html filenames?" Ultimately, it's a matter of speed. If you have 3,000 .html files, and only 1,000 of them actually use SSI, Apache will still look for SSI instructions in the other 2,000. That's a waste of resources. Granted, processing SSI incurs very little overhead, but if you're being hit 50,000 times a second, it can certainly add up. This isn't too worrisome for our GatesMcFarlaneCo intranet, but is good to know for your future Apache projects.

For now, uncomment the AddType and AddHandler lines. This will turn on the SSI mojo power. But where? When we were learning about CGI, we saw a configuration setting that said our CGIs lived in /Library/WebServer/CGI-Executables/. We now have to tell Apache where to enable our server-side includes.

Because we're building an intranet that's going to live in /Library/WebServer/Documents (Apache's DocumentRoot), that's where we want our SSI capability to be active. Go to the top of your Apache configuration file and search for "/Library/WebServer/Documents". The second result looks like the following (we've removed the commented lines from this example):

<Directory "/Library/WebServer/Documents">
   Options Indexes FollowSymLinks MultiViews
   AllowOverride None
   Order allow,deny
   Allow from all
</Directory>

You'll notice that this looks similar to the <Directory> entry we saw when we were looking into CGI. As before, we're going to skip the bulk of it (we'll pay attention to our ignored lines a little bit later in our series). For now, add the word Includes to the Options line, like so:

Options Indexes FollowSymLinks MultiViews Includes

Options is an Apache directive that can turn on or off different features for the indicated <Directory> and all subdirectories beneath it. Subdirectories can override their parent configuration. In future articles of our series, we'll talk a bit more about the different configurations available to Options.
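After editing, it can be reassuring to confirm that the Options line really did change inside the right <Directory> block. The following is a sketch under simple assumptions (one directive per line, the opening tag written exactly as shown above); directory_options is a hypothetical helper, not a full parser for Apache's configuration syntax.

```shell
#!/bin/sh
# directory_options FILE DIR: print the Options line found inside the
# <Directory "DIR"> block of FILE.
# NOTE: a hypothetical helper doing simple line matching, not a real parser.
directory_options() {
    awk -v start="<Directory \"$2\">" '
        $0 == start               { inside = 1; next }
        $0 == "</Directory>"      { inside = 0 }
        inside && $1 == "Options" { sub(/^[ \t]+/, ""); print }
    ' "$1"
}

# A stand-in for the edited block in httpd.conf.
conf=$(mktemp)
cat > "$conf" <<'EOF'
<Directory "/Library/WebServer/Documents">
   Options Indexes FollowSymLinks MultiViews Includes
   AllowOverride None
</Directory>
EOF

directory_options "$conf" /Library/WebServer/Documents
rm -f "$conf"
```

If the word Includes shows up in the printed line, the edit landed where we intended.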

Because we've made changes to Apache's configuration file, we now need to restart Apache (so that it reloads /etc/httpd/httpd.conf). The easiest way to do this is via our Sharing preference panel. Much like we started the web server in Part 1, we now need to stop and then start it to enable our changes. Do this now. Chuckle once or twice, if you must.

Note: If, after you've clicked Start, the status continually shows "Web Sharing starting up..." for more than a few seconds, you may have an error in your Apache configuration file, caused by a mishap in your editing. The Sharing preference assumes a valid configuration file, and will stall until it receives a positive response from Apache (which it'll never get). To check your configuration for correctness, enter httpd -t in a Terminal--if there's an error, you'll be told the line number to investigate.
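If you prefer to stay in the Terminal, the syntax check and the restart can be chained so that a broken file never gets as far as a restart. This is a sketch, not the only way to do it: check_then_restart is a hypothetical function, and it assumes httpd is on your PATH and that apachectl (the control script that ships with Apache) needs sudo.

```shell
#!/bin/sh
# check_then_restart: run the syntax check first and restart only if it passes.
# HTTPD and APACHECTL can be overridden; the defaults are assumptions
# (httpd on the PATH, apachectl run through sudo).
check_then_restart() {
    if ${HTTPD:-httpd} -t; then
        ${APACHECTL:-sudo apachectl} restart
    else
        echo "configuration check failed; Apache was not restarted" >&2
        return 1
    fi
}

# Usage, once you have saved your changes to httpd.conf:
#   check_then_restart
```

The function is only defined here, not called, so nothing happens to your server until you run check_then_restart yourself.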

To test that SSIs are working properly, rename the index.html we created in Part 1 to index.shtml (since .shtml is the only extension we've enabled SSIs for), and edit it to match the snippet below:

<html>
<body>
 <h1>Gleefully Served By Mac OS X</h1>
 <pre>
  <!--#include virtual="/cgi-bin/test-cgi"-->
 </pre>
</body>
</html>

Here, we're including the test CGI script into the contents of our main index page. When you load http://127.0.0.1/index.shtml into your browser, you'll see our "Gleefully Served" message, as well as the output of the CGI script itself. We could have easily created a static file (navigation.html) and included that within index.shtml instead. Note: Changing the file extension to .shtml forces us to use a slightly longer URL--http://127.0.0.1/index.shtml and not just http://127.0.0.1/. We'll examine why (and how to fix it using DirectoryIndex) in a future installment of our series.
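The static-file variation mentioned above might look something like the following. Treat it as a sketch: write_ssi_demo is a hypothetical helper, the contents of navigation.html are invented for illustration, and the files are written to a scratch directory rather than the live DocumentRoot.

```shell
#!/bin/sh
# write_ssi_demo DIR: write a navigation.html fragment and an index.shtml
# that pulls it in with an SSI include.
# NOTE: a hypothetical helper; the navigation markup is illustrative only.
write_ssi_demo() {
    cat > "$1/navigation.html" <<'EOF'
<p><a href="/">Home</a></p>
EOF
    cat > "$1/index.shtml" <<'EOF'
<html>
<body>
 <h1>Gleefully Served By Mac OS X</h1>
 <!--#include virtual="/navigation.html"-->
</body>
</html>
EOF
}

# Write into a scratch directory so nothing on the real site is touched.
demo=$(mktemp -d)
write_ssi_demo "$demo"
ls "$demo"
rm -rf "$demo"
```

Since the include uses a virtual path starting with a slash, both files would need to sit at the top of the DocumentRoot for the example to work as written.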

SSI is configured and working, but what can you do with it? What if your marketing department wants to create an image gallery of the newest ads they've planned? For an advanced look at the possibilities of SSI, check out this SSI Image Gallery article, also written by yours truly.

What's Next?

In our next installment, I'll walk you through how to turn on PHP and test that it's working correctly. Until then, good luck working with CGI and server-side includes, and be sure to Talkback if you've got questions I've yet to answer.

Kevin Hemenway is the coauthor of Mac OS X Hacks, author of Spidering Hacks, and the alter ego of the pervasively strange Morbus Iff, creator of disobey.com, which bills itself as "content for the discontented."


Copyright © 2009 O'Reilly Media, Inc.