 Published on MacDevCenter (http://www.macdevcenter.com/)


Learning the Mac OS X Terminal, Part 5

by Chris Stone
07/02/2002

The series continues in Learning the Terminal in Mac OS X, Automating Mail from the Mac OS X Terminal, Configuring Email from the Mac OS X Terminal, and Customizing the Mac OS X Terminal.

Today we'll take another look at cron. You'll learn how to have your own crontab run a script regularly and email reports to you just like the system cron jobs do. Your new script will copy a directory of your choosing to another drive for backup. While this is not a complete substitute for backing up with a commercial solution like Retrospect, for example, it does provide a free and easy way to ensure that at least your most important data exists on two drives.

Readers of Derrick Story's article, Taming the Entourage Database will know how important it is to keep your Office X Identities directory regularly backed up, and I'll use that directory as an example for this article. If you don't use Entourage, you can choose instead to back up any other directory, including even your entire ~/Documents folder.

Choosing the Right Tool

The heart of this procedure, then, is the actual command line that does the copying, and that's where we'll start. As with most Unix procedures, there's more than one way to skin a potato (to coin a cat-friendly phrase), so Mac OS X comes with several command-line tools that can copy files. Of these, there are four that you might consider for this job, but really only one that will do just what we need.

For this task, the most important consideration in choosing a file copy utility is its ability to preserve the permissions, resource forks, and creator/type codes of the original files. If we can ensure that these file attributes will be maintained, we can then safely copy any directory and be confident that the copy will be as good as the original. This is how the four tools meet those criteria:

Utility   Preserves Permissions?   Preserves Resource Forks & Creator/Type?
cp        yes                      no
CpMac     no                       yes
ditto     yes                      yes
rsync     yes                      no


ditto

As you can see, only one of these utilities, ditto, fills the bill. Briefly, ditto is an Apple-developed tool for copying entire directories. Most important, ditto has a -rsrc option flag that ensures resource forks as well as creator/type codes are preserved in the copies. For example, this command will copy the Office X Identities directory from my home directory to the Backup directory on the external drive "Secondary":

ditto -V -rsrc ~/Documents/"Microsoft User Data/Office X Identities" "/Volumes/Secondary/Backups/Office X Identities"
The above command is all one line.

The additional -V flag turns on verbose copying, which instructs ditto to print a line for each directory and file it copies. These lines, then, will make their way into the cron job's emailed report looking something like this:

>>> Copying Documents/Microsoft User Data/Office X Identities/
copying file ./Main Identity/Database ... 340328640 bytes
copying file ./Main Identity/Database Cache ... 17432 bytes
copying file ./Main Identity/Mailing Lists ... 20784 bytes
copying file ./Main Identity/Rules ... 20784 bytes
copying file ./Main Identity/Signatures ... 12560 bytes
copying file ./Newsgroup Cache ... 8 bytes

Here are a few things you should note about the two pathnames in the command:
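
One point about these pathnames that can be checked directly is the quoting. Both contain spaces, so the part with spaces is wrapped in double quotes to keep the shell from splitting it into separate arguments, while the leading ~ is left outside the quotes so that the shell still expands it to your home directory. The sketch below illustrates this in the Bourne shell; count_args is a hypothetical helper written only for this demonstration.

```sh
#!/bin/sh
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() {
    echo $#
}

# Unquoted, the shell splits the pathname at each space into three arguments.
unquoted=$(count_args /Volumes/Secondary/Backups/Office X Identities)

# Quoted, the whole pathname arrives as a single argument.
quoted=$(count_args "/Volumes/Secondary/Backups/Office X Identities")

# A ~ inside double quotes is not expanded; it stays a literal character.
literal=$(echo "~/Documents")

echo "unquoted: $unquoted argument(s)"
echo "quoted:   $quoted argument(s)"
echo "quoted tilde stays as: $literal"
```

This is why the ditto command writes ~/Documents/ outside the quotes and only the remainder of the source pathname inside them.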

To learn more about ditto, consult its man page with the man ditto command.

So, for our purposes, using ditto as shown in this example will do just what we need. There is, however, another option you should be aware of, rsync.

rsync

rsync is a directory synchronization tool that's smart enough to copy only new or changed files, thus speeding up some backups significantly. If you plan to back up a large directory that only partially changes from day to day (your entire ~/Documents directory, for example), rsync would seem to be the best solution. Unfortunately, as you see in the table, rsync doesn't preserve resource forks.

However, the good news is that there is a resource-fork-aware version of rsync in development called RsyncX. It's available here. (The installation contains a GUI front-end as well.) If you decide to use RsyncX instead of ditto, the documentation at the RsyncX site will get you going. Test your RsyncX commands well, making sure that all data copies properly. To use RsyncX in place of ditto in the command line above, for example, you would use:

rsync -ave /usr/bin/ssh ~/Documents/"Microsoft User Data/Office X Identities" "/Volumes/Secondary/Backups/Office X Identities"
The above command is all one line.

Note that RsyncX still uses the rsync command. In fact, this command line would run with the original version of rsync installed with Mac OS X, but would not preserve resource forks or creator/type codes properly. Note also that, in my tests, when RsyncX does copy a file, it does so much more slowly than ditto. In the case of the Office X Identities directory, then, where most data does change between backups, ditto will in fact perform a faster backup, and therefore is the better choice for this task.

The Shell Script

Your next step should be to determine the proper ditto command line for your system and test it several times from the prompt, making sure that it works repeatedly, and that all copied data is in good shape.

Once you've worked out a good ditto command line, the next step is putting it into a shell script file. Just as the jobs in the system crontab refer to the actual daily, weekly, and monthly scripts that do the heavy work, so will your crontab refer to a "backup" script. Having this command in a separate script file will, for one thing, allow you to easily run it manually whenever you like.

The conventional directory for storing user scripts is ~/bin, which you should create if it doesn't already exist:

mkdir ~/bin

The ~/bin directory is a good place for your scripts since, by default, the shell will look for executables there whenever it receives a command. This directory path is one of several known collectively as your search path. This list allows the shell to find an executable quickly, so the shell does not have to search the entire filesystem and you do not have to type the full pathname to each executable.
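
To see whether ~/bin is really on your search path, you can test for it. The following is a minimal sketch for the Bourne shell, where the search path lives in the colon-separated PATH variable (tcsh mirrors it in its own path variable); in_path is a hypothetical helper name.

```sh
#!/bin/sh
# in_path is a hypothetical helper: it succeeds if directory $1 is an
# entry in the colon-separated PATH variable.
in_path() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

if in_path "$HOME/bin"; then
    echo "$HOME/bin is on the search path"
else
    echo "$HOME/bin is not on the search path"
fi
```

Wrapping PATH in extra colons lets the same pattern match an entry at the start, middle, or end of the list without matching partial directory names.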

We'll call the file backup.sh, using the convention of naming Bourne shell scripts such as this with the .sh extension. Create the file with pico using this command:

pico ~/bin/backup.sh

Once in pico, enter this first line, which tells the shell to use /bin/sh (the Bourne shell) to run the script:

#!/bin/sh

Next use echo to output what will become the first line of your cron report:

echo "Results of the daily backup:"

Then enter your version of this ditto command line:

ditto -V -rsrc ~/Documents/"Microsoft User Data/Office X Identities" "/Volumes/Secondary/Backups/Office X Identities"

Your pico session should then look something like this:

[Screenshot: pico session]

Remember that pico will not display wrapped lines; instead, it shows the $ symbol wherever a line runs beyond the edge of the window.

Finally, type control + O, return, and then control + X to save the file and exit pico as usual.
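
Put together, the script is only three lines. As a hedged extension, the sketch below wraps the same ditto command in a function that first checks that the source directory exists and that the destination's parent directory is present, so that a missing or unmounted drive shows up as a plain message in the report. The run_backup function and its messages are assumptions for illustration and are not part of the original script; the pathnames are the examples used in this article.

```sh
#!/bin/sh
# Sketch of backup.sh with sanity checks added (the checks are an assumption,
# not part of the original three-line script).

SRC="$HOME/Documents/Microsoft User Data/Office X Identities"
DEST="/Volumes/Secondary/Backups/Office X Identities"

# run_backup SOURCE DESTINATION (hypothetical helper)
# Copies SOURCE to DESTINATION with ditto, but only if SOURCE exists and
# the directory that will hold DESTINATION is already present.
run_backup() {
    src=$1
    dest=$2
    if [ ! -d "$src" ]; then
        echo "Source directory not found: $src"
        return 1
    fi
    if [ ! -d "$(dirname "$dest")" ]; then
        echo "Destination not available (is the drive mounted?): $dest"
        return 1
    fi
    ditto -V -rsrc "$src" "$dest"
}

echo "Results of the daily backup:"
run_backup "$SRC" "$DEST" || echo "Backup did not complete."
```

Because the messages go to standard output, they would land in the emailed report alongside ditto's own lines.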

The script will work fine as is when called from cron, but if you ever want to run it by name from the command line, you'll first need to make it executable, and then have the shell rebuild its list of executables found in its search path. To do this, use chmod to set the file's executable bit:

chmod +x ~/bin/backup.sh
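
What chmod +x changes can be seen with a throwaway script. This is a small sketch that works in a temporary directory; the file name hello.sh is hypothetical, and nothing in ~/bin is touched.

```sh
#!/bin/sh
# Create a throwaway script in a temporary directory.
demo_dir=$(mktemp -d)
demo_script="$demo_dir/hello.sh"
printf '#!/bin/sh\necho "hello from the script"\n' > "$demo_script"

# A newly created file normally has no executable bit set.
if [ -x "$demo_script" ]; then before=yes; else before=no; fi

# Set the executable bit, as done for backup.sh.
chmod +x "$demo_script"
if [ -x "$demo_script" ]; then after=yes; else after=no; fi

echo "executable before chmod: $before"
echo "executable after chmod:  $after"

# Now the script can be run by its pathname.
result=$("$demo_script")
echo "$result"

rm -r "$demo_dir"
```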

Then enter the rehash command to rebuild the list of executables:

[localhost:~] chris% rehash
[localhost:~] chris%

This, then, will allow you to execute the script by simply entering the name of the script:

[localhost:~] chris% backup.sh
Results of the daily backup:
>>> Copying /Users/chris/Documents/Microsoft User Data/Office X Identities

Once you're sure that the backup.sh script is working well, it is then time to create your crontab.

The User Crontab

In Part 1, you learned how to modify the system crontab by simply opening it in a text editor. For this procedure, however, you will be creating a user crontab. User crontabs (or "cron tables") run under regular user accounts instead of root, so their scripts can access only those directories and applications that the user can. This arrangement allows any user to create a personal schedule of automated commands without risk to essential system files.

User crontabs differ from the system crontab in that you cannot edit user crontabs directly. For one thing, user crontabs and the directory in which they reside are owned by root and are thus inaccessible to non-root users. The proper way to edit user crontabs is with the crontab utility (not to be confused with the crontab files it creates). Known as a "setuid root program," crontab has had its permissions set so that it will always run as root, which gives you access to the otherwise restricted locations. The crontab program also checks that your crontab is formatted correctly before installing it as /private/var/cron/tabs/username.

Before you begin with the crontab utility, however, you'll first need to formulate and test the command line you'll use in your cron job. Base your command line on this, which is what I would use on my machine:

30 18 * * * sh ~/bin/backup.sh 2>&1 | mail -s "Daily Backup Report" chris

This line should be easy to understand if you remember the system crontab in Part 1. What does differ, however, is the lack of the sixth "user" field in the user crontab. This field in the system crontab identifies which user account the job should run under. Since, of course, the user crontab will always run under that user's account, that field is unnecessary in the user crontab.
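
The layout of a user crontab line can be made concrete by splitting the example into its parts. This is only an illustration in the Bourne shell, with variable names chosen here for readability; cron itself does the real parsing.

```sh
#!/bin/sh
# Split the example line into the five scheduling fields plus the command.
# read assigns one word per variable and puts the remainder in the last one.
read -r minute hour day_of_month month day_of_week job_command <<'EOF'
30 18 * * * sh ~/bin/backup.sh 2>&1 | mail -s "Daily Backup Report" chris
EOF

echo "minute:       $minute"
echo "hour:         $hour"
echo "day of month: $day_of_month"
echo "month:        $month"
echo "day of week:  $day_of_week"
echo "command:      $job_command"
```

With minute 30 and hour 18, and asterisks in the three remaining fields, the job runs at 6:30 p.m. every day.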

The field after the five scheduling fields holds the actual command line to be run. To roughly paraphrase:

First, use the Bourne shell to run the script we've created as ~/bin/backup.sh:

sh ~/bin/backup.sh

Next, send all of the output from that script, including error messages, on to the next command using the pipe character ("|"):

2>&1 |

Finally, have the mail utility receive the input from the pipe, use it as the body of a new mail message with the subject "Daily Backup Report" and send it to user "chris":

mail -s "Daily Backup Report" chris
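
The role of 2>&1 can be demonstrated with a stand-in for the script. In this sketch, noisy is a hypothetical function that writes one line to standard output and one to standard error, and grep -c . simply counts the lines that reach it through the pipe.

```sh
#!/bin/sh
# noisy is a hypothetical stand-in for backup.sh: one normal line, one error line.
noisy() {
    echo "copying file ./Main Identity/Database"
    echo "example error message" >&2
}

# Without 2>&1, only standard output enters the pipe; the error line
# bypasses the pipe and goes straight to the terminal.
stdout_only=$(noisy | grep -c .)

# With 2>&1, standard error is merged into standard output first,
# so both lines enter the pipe.
merged=$(noisy 2>&1 | grep -c .)

echo "lines piped without 2>&1: $stdout_only"
echo "lines piped with 2>&1:    $merged"
```

Without the redirection, any error from the script would be missing from the report that mail receives through the pipe.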

The line you should enter will differ only in the scheduling fields and the account name used at the end. You probably want to schedule for a time when you're not usually busy on your Mac, but if you do happen to be working when the job starts, you won't notice much, if any, disruption from ditto running in the background (depending on your machine's performance, of course).

To test the command from the prompt, first change to the Bourne shell (just type sh and a return):

[localhost:~] chris% sh
localhost%

Then enter the command line (not the scheduling fields or the sh command):

localhost% ~/bin/backup.sh 2>&1 | mail -s "Daily Backup Report" chris
localhost%

If all went well, you should just receive a new prompt, and then in a few moments the mailed report should arrive looking something like this:

From root Tue Jun 25 21:17:54 2002
Date: Tue, 25 Jun 2002 21:17:54 -0700 (PDT)
From: Chris <chris>
To: chris
Subject: Daily Backup Report

Results of the daily backup:
>>> Copying /Users/chris/Documents/Microsoft User Data/Office X Identities
copying file ./.DS_Store ... 6148 bytes
copying file ./Main Identity/Database ... 10047232 bytes
copying file ./Main Identity/Database Cache ... 17092 bytes
copying file ./Main Identity/Mailing Lists ... 20784 bytes
copying file ./Main Identity/Rules ... 20784 bytes
copying file ./Main Identity/Signatures ... 12560 bytes
copying file ./Newsgroup Cache ... 8 bytes

To leave the Bourne shell and return to tcsh, just type in exit, and you'll get a new tcsh prompt:

localhost% exit
[localhost:~] chris%

Once you've checked that the files have been copied correctly, you're ready to use the crontab utility, which actually hands off much of the job to the text editor of your choice. By default, this is the vi editor. If you are already familiar with vi and would like to use it to edit your crontab, skip this next command. Otherwise, since by now you're probably most comfortable with pico, set it as your editor with this command:

setenv EDITOR pico

In this case, the setenv command is setting an environment variable called EDITOR to the value pico. This setting is only temporary, however, lasting only for the current shell session (that is, until you close that Terminal window). Therefore, you'll need to issue this command during each session in which you edit your crontab. It's not difficult to make this setting permanent, but I'll have to save that procedure for a future article.
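
Note that setenv is specific to tcsh. Inside a Bourne-type shell such as sh, the equivalent is to assign the variable and then export it, as in this brief sketch:

```sh
#!/bin/sh
# Bourne shell equivalent of tcsh's "setenv EDITOR pico".
EDITOR=pico
export EDITOR

# Exported variables are passed on to programs started from this shell,
# which is how crontab learns which editor to launch.
seen_by_child=$(sh -c 'echo "$EDITOR"')
echo "A child process sees EDITOR=$seen_by_child"
```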

Finally, enter this command to create and edit your user crontab:

crontab -e

(The other crontab options are -l, which displays your crontab, and -r, which removes it.)

Add your line to your crontab in pico as you would in any other file. My example cron job, of course, was set to run every day at 6:30 p.m., but you might first want to schedule yours to run just a few minutes from when you edit your crontab so you can soon know whether it works.

Be sure to follow that line with a new empty line, which cron requires. This is what my pico session looks like:

[Screenshot: pico session]

Finally, save the file with the temporary name given, and then close pico as usual. Once you do, you'll see a final line from crontab reporting that your new crontab was installed:

crontab: installing new crontab
[localhost:~] chris%
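
If you prefer to keep your crontab in an ordinary file and install it with crontab followed by the filename, rather than editing it in place, you can check beforehand that the last entry is followed by a line ending. A minimal sketch; ends_with_newline and the temporary file are hypothetical names used only for this demonstration.

```sh
#!/bin/sh
# ends_with_newline is a hypothetical helper: it succeeds if file $1 is
# non-empty and its last byte is a newline. Command substitution strips a
# trailing newline, so the result is empty when that byte is a newline.
ends_with_newline() {
    [ -s "$1" ] && [ -z "$(tail -c 1 "$1")" ]
}

demo_file=$(mktemp)
job='30 18 * * * sh ~/bin/backup.sh 2>&1 | mail -s "Daily Backup Report" chris'

# Written without a final newline: the check fails.
printf '%s' "$job" > "$demo_file"
if ends_with_newline "$demo_file"; then first=ok; else first=missing; fi

# Written with a final newline: the check passes.
printf '%s\n' "$job" > "$demo_file"
if ends_with_newline "$demo_file"; then second=ok; else second=missing; fi

echo "without final newline: $first"
echo "with final newline:    $second"
rm "$demo_file"
```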

You can confirm that it has run by just waiting for the email report to come, or you could watch it run using the command-line process watcher top. Much like the GUI Process Watcher application inside /Applications/Utilities, top lists the running processes one per line. If you run top with its -u flag, you'll see the list dynamically ordered, with the most active processes at the top:

[localhost:~] chris% top -u

Processes: 52 total, 2 running, 2 stuck, 48 sleeping... 137 threads 13:03:21
Load Avg: 0.91, 0.59, 0.41 CPU usage: 6.8% user, 29.9% sys, 63.2% idle
SharedLibs: num = 89, resident = 22.7M code, 1.52M data, 5.87M LinkEdit
MemRegions: num = 3185, resident = 91.2M + 8.32M private, 59.7M shared
PhysMem: 79.0M wired, 58.2M active, 340M inactive, 477M used, 419M free
VM: 2.08G + 44.6M 6656(0) pageins, 0(0) pageouts

PID COMMAND %CPU TIME #TH #PRTS #MREGS RPRVT RSHRD RSIZE VSIZE
434 ditto 11.9% 0:02.31 1 16 17 424K 296K 648K 1.79M
436 top 7.6% 0:00.38 1 14 17 268K 320K 524K 1.82M
383 Terminal 5.1% 0:13.65 8 117 243 2.89M 10.7M 8.84M 109M
375 Microsoft 3.4% 6:44.00 2 79 292 12.8M 18.0M 19.7M 119M
0 kernel_tas 2.5% 1:14.20 27 0 - - - 64.8M- 733M-
382 CPU Monito 2.5% 0:32.75 1 67 88 1.21M 6.29M 3.07M 100.0
356 Window Man 1.7% 0:59.54 3 156 152 2.01M 20.0M 21.8M 79.5M
376 Microsoft 0.8% 2:14.07 8 128 283 14.3M 31.0M 33.8M 142M
367 Finder 0.0% 0:35.43 2 93 365 21.0M 15.9M 26.1M 130M
377 TextEdit 0.0% 0:34.01 2 93 130 7.56M 9.22M 11.6M 109M

0 idle_threa 63.9% 24:26.99

There at the top of the list you can see ditto getting started. Once it has finished, that process will leave the list. To stop top and return to the prompt, just press q. As you can see, top tells you a lot more about your system, so start with its man page (man top), to learn all about it. Also, here's a good page from Apple that describes top.

Finally, once you're sure it's working, don't forget to go back and reset your crontab to the time you want it to run regularly.

This article introduced you to the basics of shell scripting and the user crontab, both very powerful features of Mac OS X's Unix. Stay tuned for future articles that show still more ways to put this power to use.

Chris Stone is a Senior Macintosh Systems Administrator for O'Reilly, coauthor of Mac OS X in a Nutshell and contributing author to Mac OS X: The Missing Manual, which provides over 40 pages about the Mac OS X Terminal.


Copyright © 2009 O'Reilly Media, Inc.