The scenario: Instead of just using one Mac, you regularly use two (a desktop and a laptop) and would like to keep up-to-date copies of your data on all of your machines. After all, when working at home, you want to take advantage of the large monitor and dual processors of a desktop PowerMac, and when you are on the road, you want all the portability of an iBook or a PowerBook.
Most homemade solutions to this problem are haphazard and error-prone. However, there is a tool that software developers use that can help you. It's called CVS. And with it, you can work with your data no matter where you are.
CVS is an open source tool that provides version control. Version control is the practice of maintaining information about a project's development by tracking changes and coordinating the development efforts of many developers. CVS uses a centralized repository (sometimes called an archive or a depot) to store all the information about each and every file, as well as every change to those files, contained in a project. These kinds of systems are used in the development of projects small and large, including the development of operating systems like Mac OS X.
When using version control, each and every developer of the project has a copy of these files on his or her own machine. As a developer makes changes, they are committed back into the central repository allowing the other developers on the project access to the latest code. This allows lots of people to cooperate on the same source files with a minimum of fuss. If two developers make changes to the same file at the same time, CVS will defer the commit of the second file until the second developer resolves the conflict. Usually, these conflicts are easily dealt with and development proceeds.
CVS supports all sorts of additional operations that are useful to large teams. However, for our purposes (which are much less demanding than software development), we can take the functionality that we've just described and use it to solve the problem of managing our own data on multiple machines. Even if you are the only person to use your data, CVS can help you easily maintain it on as many machines as your bank account can fund.
CVS comes as part of Mac OS X's development tools. In order to use it, you'll need to install the developer tools. You can find the tools on the Tools CD-ROM that came with your copy of Mac OS X or you can download them from the Apple Developer Connection. Of course, you could get the source for CVS and compile it yourself, but then you'd need GCC. Just save yourself the hassle and get the developer tools.
First, you need to identify a machine that can serve as the repository. If you have two machines, such as an iBook and a PowerMac, then you should choose the one that stays at home and powered on, in this case the PowerMac, as your repository. If you are lucky enough to have a third machine that you use as a server for other purposes, then you should probably use that machine to store your repository on. The important thing is that the machine you use as a server should be one that is likely to be powered up and available when you need it.
Once the repository is set up, you can access it from the machine you set it up on or from other machines. The first case, where both the repository and the working copy of your files are on the same machine, is an example of local usage. The second case, for example, when you check out your files onto your iBook, is called remote usage. In both situations, you use the same set of CVS commands, but you have to do a bit more setup work for the remote case.
Once you've decided on which machine to place the repository, you have
to pick where on that machine you want your repository to live. You want
to make sure that it is a location that you'll remember easily later. For
my setup, I use the
/Library/Depot directory. Once you've
decided where you want it, create the directory, then initialize your
repository with the following commands:
[Mercury ~] duncan% mkdir /Library/Depot
[Mercury ~] duncan% cvs -d /Library/Depot init
The -d argument lets CVS know where the repository is located, and the init argument tells CVS to initialize the directory as a new repository. This blesses the directory as a CVS repository and installs a copy of the files that will control how it works.
To make sure all is well, we are going to perform an initial checkout of the repository. Make an empty directory (on the same machine as the repository) and execute the following command in that directory:
[Mercury ~/tmp] duncan% cvs -d /Library/Depot checkout .
Once again, the -d argument lets CVS know what directory the repository is located in. The checkout . argument (don't forget the dot at the end!) tells CVS to check out a copy of everything in the repository. You should see the following output:
cvs checkout: Updating .
cvs checkout: Updating CVSROOT
U CVSROOT/checkoutlist
U CVSROOT/commitinfo
U CVSROOT/config
U CVSROOT/cvswrappers
U CVSROOT/editinfo
U CVSROOT/loginfo
U CVSROOT/modules
U CVSROOT/notify
U CVSROOT/rcsinfo
U CVSROOT/taginfo
U CVSROOT/verifymsg
The files that were checked out are the administration files. By editing, and then checking these files back in, we can change how CVS works. Mostly, we will want to leave these alone for our use, but there is one file that we will need to modify.
In addition to several quirks, CVS has one major flaw. By default it treats all files as text files and can't, by itself, tell the difference between text and binary. It wants to treat all files as text because then it can save space in the repository by only storing the difference between files. For HTML files, this is great. However, for binary files that we work with all the time, such as Microsoft Word files (.doc) or Excel files (.xls), this strategy falls on its face and will make a mess.
To fix this, edit the CVSROOT/cvswrappers file to look like the following (the lines you need to add are the wildcard entries after the comment block):
# This file affects handling of files based on their names.
#
# The -t/-f options allow one to treat directories of files
# as a single file, or to transform a file in other ways on
# its way in and out of CVS.
#
# The -m option specifies whether CVS attempts to merge files.
#
# The -k option specifies keyword expansion (e.g. -kb for binary).
#
# Format of wrapper file ($CVSROOT/CVSROOT/cvswrappers or .cvswrappers)
#
#  wildcard [option value][option value]...
#
#  where option is one of
#  -f    from cvs filter       value: path to filter
#  -t    to cvs filter         value: path to filter
#  -m    update methodology    value: MERGE or COPY
#  -k    expansion mode        value: b, o, kkv, &c
#
#  and value is a single-quote delimited value.
#
*.ai -k 'b'
*.doc -k 'b'
*.bmp -k 'b'
*.class -k 'b'
*.classes -k 'b'
*.dmg -k 'b'
*.eps -k 'b'
*.gif -k 'b'
*.gz -k 'b'
*.icns -k 'b'
*.jar -k 'b'
*.jpg -k 'b'
*.jpeg -k 'b'
*.nib -k 'b'
*.ofile -k 'b'
*.pdf -k 'b'
*.png -k 'b'
*.ppm -k 'b'
*.ppt -k 'b'
*.pqg -k 'b'
*.prj -k 'b'
*.ps -k 'b'
*.psd -k 'b'
*.tar -k 'b'
*.tif -k 'b'
*.tiff -k 'b'
*.ttf -k 'b'
*.xls -k 'b'
*.Z -k 'b'
*.zip -k 'b'
This is not an exhaustive list, but it serves as the day-to-day list that I use in my repository. Make sure that any binary files that you will put in your repository are on this list. Note that the wildcards are matched case-sensitively, so you may want uppercase versions (for example, *.JPG) as well.
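Typing every extension twice is tedious, so the uppercase entries can be generated. What follows is a minimal sketch, assuming a POSIX shell; the function name wrapper_lines and the two sample extensions are made up for illustration. It prints a wrapper line for each extension in its original spelling and, when that differs, in uppercase.

```shell
#!/bin/sh
# Print cvswrappers entries for each extension as given and in uppercase.
# wrapper_lines and the sample extensions are illustrative only.
wrapper_lines() {
    for ext in "$@"; do
        upper=$(printf '%s' "$ext" | tr '[:lower:]' '[:upper:]')
        printf "*.%s -k 'b'\n" "$ext"
        # Skip the duplicate when the extension has no lowercase letters.
        if [ "$upper" != "$ext" ]; then
            printf "*.%s -k 'b'\n" "$upper"
        fi
    done
}

wrapper_lines jpg doc
```

The printed lines can be pasted at the end of CVSROOT/cvswrappers before committing it.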
Once you have edited the file, we need to check it back in. To do this, issue the following command:
[Mercury ~/tmp] duncan% cvs commit -m "Sync"
This tells CVS to commit our changes back to the repository. The -m argument is the commit message that will be kept in the repository. When you execute this command, you should see the following output:
cvs commit: Examining .
cvs commit: Examining CVSROOT
Checking in CVSROOT/cvswrappers;
/Library/Depot/CVSROOT/cvswrappers,v <-- cvswrappers
new revision: 1.2; previous revision: 1.1
done
cvs commit: Rebuilding administrative file database
This output will tell you each and every action that is taken by CVS. In this case, it notices that we've modified one of the configuration files and rebuilds its administrative database.
You might notice that we didn't use the -d argument to CVS this time. We only need to tell CVS where the repository is if we haven't checked it out yet into the directory that we are working in. Once checked out, CVS leaves itself enough information to figure things out.
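The information CVS leaves behind lives in a CVS subdirectory inside each checked-out directory, where a file named Root records the repository location. As a small sketch, assuming a POSIX shell (show_root is a hypothetical helper name, not a CVS command), you could report which repository a working directory belongs to like this:

```shell
#!/bin/sh
# Print the repository location recorded in a CVS working copy.
# show_root is a hypothetical helper name, not a CVS command.
show_root() {
    dir=${1:-.}
    if [ -f "$dir/CVS/Root" ]; then
        cat "$dir/CVS/Root"
    else
        echo "$dir is not a CVS working copy" >&2
        return 1
    fi
}
```

Run in the ~/tmp directory used for the checkout, show_root should print /Library/Depot.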
To check out a repository on other machines, we are going to use the ability to run CVS over SSH. This requires two things:
The SSH server is up and running on the machine on which the repository is located.
The CVS_RSH environment variable is set to ssh on the client machine on which we are going to check out the repository.
To satisfy the first requirement, simply enable the Remote Login Service option in the Sharing preference panel, as shown in Figure 1.
This enables the SSH server on your machine, which will let you log in to your machine from any other machine that has an SSH client.
There are a few different ways you can satisfy the second requirement. You can set the environment variable on the command line with the setenv command. To do this, simply execute the following line:
[Titanium ~/tmp] duncan% setenv CVS_RSH ssh
Of course this will soon become annoying, as you'll always have to remember to execute this command. You could always set it in your ~/.tcshrc file, but there's an even better solution. You can set it in your ~/.MacOSX/environment.plist file. This will make sure that it is set for every application that runs, allowing programs that have built-in CVS integration, such as Project Builder, to seamlessly use your repository. All you need to do is create the ~/.MacOSX directory (if it doesn't exist), and save the following as your environment.plist file:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist SYSTEM "file://localhost/System/Library/DTDs/PropertyList.dtd">
<plist version="0.9">
<dict>
    <key>CVS_RSH</key>
    <string>/usr/bin/ssh</string>
</dict>
</plist>
This is by far the best solution, although you'll need to log out of your machine and back in for it to take effect. Once you've done this, you're ready to check out the repository. To do so, we're going to use a variant of the cvs checkout command that we used before that will tell CVS that our repository is located on a different machine. This command is of the form:
cvs -d :ext:[user]@[machine]:[repository directory] checkout
On my machine, I execute the following:
[Titanium ~/tmp] duncan% cvs -d :ext:duncan@Mercury.local:/Library/Depot checkout .
Once again, don't forget the dot at the end! If this is the first time that you've used SSH between your machines, you'll see some output asking if you are sure you want to connect. You will then be challenged for your password for the machine containing the repository. After that, the files will be checked out as before.
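If you use a Bourne-style shell rather than tcsh, the same pieces fit together as in the sketch below. It assumes a POSIX shell; remote_root is a hypothetical helper, and the user, machine, and directory are the ones from my setup. It only prints the checkout command so that you can inspect it before running it.

```shell
#!/bin/sh
# Assemble an :ext: repository string from its three parts.
# remote_root is a hypothetical helper, not part of CVS.
remote_root() {
    user=$1
    machine=$2
    repo_dir=$3
    printf ':ext:%s@%s:%s\n' "$user" "$machine" "$repo_dir"
}

# Bourne-shell counterpart of "setenv CVS_RSH ssh".
CVS_RSH=ssh
export CVS_RSH

# Print the checkout command rather than running it.
echo "cvs -d $(remote_root duncan Mercury.local /Library/Depot) checkout ."
```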
There is another way to remotely access a CVS repository (called pserver access), but it is more difficult to set up and not as secure for our purposes. If you really want to set up this kind of access, there is information on how to do it in many of the CVS books.
Now that we've successfully checked out the repository onto two machines, we're ready to start using CVS for our files. The rest of this section will give you the basic commands you need to work with your files.
Let's say that we want to keep some pictures in the repository. To do so, we'd create a Pictures subdirectory in our checked out copy of the repository, copy the images into it, and then add the files to CVS.
The following commands illustrate how we might do that:
[Mercury:~/tmp] duncan% mkdir Pictures
[Mercury:~/tmp] duncan% cp ~/Pictures/me.jpg Pictures/me.jpg
[Mercury:~/tmp] duncan% cvs add Pictures
Directory /Library/Depot/Pictures added to the repository
[Mercury:~/tmp] duncan% cvs add Pictures/me.jpg
cvs add: scheduling file `Pictures/me.jpg' for addition
cvs add: use 'cvs commit' to add this file permanently
[Mercury:~/tmp] duncan% cvs commit -m "Sync"
cvs commit: Examining .
cvs commit: Examining CVSROOT
cvs commit: Examining Pictures
RCS file: /Library/Depot/Pictures/me.jpg,v
done
Checking in Pictures/me.jpg;
/Library/Depot/Pictures/me.jpg,v <-- me.jpg
initial revision: 1.1
done
To check out the file onto the other machine, we would issue the cvs update command there as follows:
[Titanium:~/tmp] duncan% cvs update -d
The -d option to the update command tells CVS to check out any subdirectories that were added since the last time we performed an update. You should see the following output:
cvs update: Updating .
cvs update: Updating CVSROOT
cvs update: Updating Pictures
U Pictures/me.jpg
Voila! Your data is now mirrored and updated between multiple machines. Anything you add to one machine will appear on other machines. All you need to remember to do is to add files to the repository, commit your changes, and regularly run the cvs update -d command.
Occasionally you'll want to remove a file from the repository. To do so, simply remove the file, then issue a cvs delete command (a synonym for cvs remove, which is why the output below says "cvs remove"). Here's an example:
[Mercury:~/tmp] duncan% rm Pictures/me.jpg
[Mercury:~/tmp] duncan% cvs delete Pictures/me.jpg
cvs remove: scheduling `Pictures/me.jpg' for removal
cvs remove: use 'cvs commit' to remove this file permanently
[Mercury:~/tmp] duncan% cvs commit -m "Sync"
cvs commit: Examining .
cvs commit: Examining CVSROOT
cvs commit: Examining Pictures
Removing Pictures/me.jpg;
/Library/Depot/Pictures/me.jpg,v <-- me.jpg
new revision: delete; previous revision: 1.1
done
Moving files is a pain with CVS. There is no cvs move command, so you have to delete the file from where it was and add it to wherever else you want it to be.
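The delete-and-add dance can be wrapped in a small helper. This is only a sketch, assuming a POSIX shell and a destination directory that has already been added to CVS; cvs_move is a hypothetical name, and the commit message is arbitrary. Setting CVS=echo makes it print the CVS commands instead of running them, although the file is still moved on disk.

```shell
#!/bin/sh
# Move a file within a CVS working copy by removing it from the old
# location and adding it at the new one. History does not follow the file.
# cvs_move is a hypothetical helper; CVS itself has no move command.
# Set CVS=echo to print the cvs commands instead of running them.
CVS=${CVS:-cvs}

cvs_move() {
    old=$1
    new=$2
    # Move the file on disk first; stop if that fails.
    mv "$old" "$new" || return 1
    $CVS remove "$old" &&
    $CVS add "$new" &&
    $CVS commit -m "Move $old to $new" "$old" "$new"
}
```

For example, cvs_move Pictures/me.jpg Pictures/portrait.jpg would schedule the removal and the addition and commit both in one step.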
So now that we've learned how to use CVS, how should we use it? Well, the answer is "it depends". Everyone's sweet spot will be different, but after using CVS to maintain my data on multiple machines for several years, here's a set of guidelines:
Don't check your entire home directory into CVS. There's a lot of data there that you don't need to replicate. Instead, focus on just checking in the important things: your documents.
Don't check in applications. Again, it's the data that is important. You can install the same application on multiple machines easily. What CVS is best at is making sure that your data files for those applications are mirrored across all of your machines.
That said, CVS is the perfect place to stash your shell scripts and other goodies that you might have in your ~/bin directory.
Do make sure that you have the appropriate binary flag set in CVSROOT/cvswrappers before checking in a binary file for the first time. If you don't, you could have trouble later.
In general, I keep the contents of my ~/Documents folder in CVS, which lets me have all of my documents with me wherever I go. As well, I keep my ~/bin folder in CVS so that all of my shell scripts and command line tools stay with me. And, finally, I keep all of my code in a ~/Code folder. To keep these updated, I have a script in ~/bin that executes the following:
cd ~/Code
cvs update
cd ~/Documents
cvs update
cd ~/bin
cvs update
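If one of those directories is missing, the plain sequence above carries on and runs cvs update in whatever directory it was already in. A slightly more defensive variant might look like the sketch below. It assumes a POSIX shell; sync_all is a hypothetical name, and the -d flag is an addition to the commands above so that new subdirectories are picked up too.

```shell
#!/bin/sh
# Update several CVS working copies in turn, reporting any that fail.
# sync_all is a hypothetical helper. Set CVS=echo to print the cvs
# commands instead of running them.
CVS=${CVS:-cvs}

sync_all() {
    status=0
    for dir in "$@"; do
        # The subshell keeps each cd from affecting the next iteration.
        if ! (cd "$dir" && $CVS update -d); then
            echo "update failed in $dir" >&2
            status=1
        fi
    done
    return $status
}

# Example: sync_all "$HOME/Code" "$HOME/Documents" "$HOME/bin"
```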
You'll want to come up with whatever scheme makes the most sense for your usage patterns. Experiment a little bit. See what works. But by starting with these guidelines, you should find your sweet spot faster.
CVS is by no means the perfect tool for the job. People who use Source Code Management (SCM) tools (the fancy term for the task that CVS performs) will tell you all sorts of nits that they have with CVS. These nits usually include the fact that even when you check in three files together, CVS doesn't note that the versions of those three files are related. As well, moving files in CVS is problematic. You have to first delete the file from CVS and then add it in its new location. Not only is this burdensome, but you lose the history of the file.
Even with these faults, CVS is a very useful tool to use when maintaining your data across multiple machines. And looking to the future, there is a successor to CVS in development called Subversion which will ease many of the woes of CVS. You can find out more about Subversion at http://subversion.tigris.org/.
This article gets you started with using CVS to manage your data. However, at some point you'll probably want to dig deeper into what CVS can do. The following resources can be of help:
CVS Pocket Reference, by Gregor N. Purdy (published 2000 by O'Reilly and Associates, with a 2nd Edition due in August). This small and affordable little guide gives you the complete list of CVS commands and options to those commands.
The CVS site, located at http://www.cvshome.org/. This website contains the source code for CVS, FAQs, and the 184 page "official" user manual for CVS by Per Cederqvist, et al.
James Duncan Davidson is a freelance author, software developer, and consultant focusing on Mac OS X, Java, XML, and open source technologies. He currently resides in San Francisco, California.
Copyright © 2009 O'Reilly Media, Inc.