Creating Filesystem Backups with 'rsync'
Prerequisites
The heart of the synchro script is the rsync command. What synchro does is
automatically pass the right arguments to rsync for any of my servers,
so that I don't have to build an rsync command file for each server.
First, some terms. A partition is a slice of a hard drive and is
referred to by a device name. In Linux, the partition names for the
first IDE drive are usually /dev/hda1, /dev/hda2, and so on. For a
SCSI drive, the names are /dev/sda1, /dev/sda2, etc. A filesystem is
a formatted partition. The mount command is used to mount a
filesystem somewhere in the directory hierarchy and is referred to by
its "mount point." For example, the filesystem located in partition
/dev/hda7 could be mounted at /home and referred to as the /home filesystem.
I refer to a filesystem or partition containing original data as the source and the place to copy it as the destination.
synchro is written in Perl; any recent version (5.x or later) should work. It calls
some system commands, including mount and, optionally, fsck. You
will need the rsync command, which is often not installed by default.
If you use a popular Linux distribution, it is on your CD-ROM. You can
also obtain it from the primary FTP site.
The beauty of using rsync is that it only copies the files that have
changed. If a given filesystem does not change much over a day, then
rsync can be thousands of times faster than using a copy or tar command.
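To illustrate why this is fast: by default, rsync decides whether a file needs to be transferred with a "quick check" that compares size and modification time instead of reading every file. The Python sketch below mimics that decision. It is only an illustration of the idea, not how rsync is implemented, and the names file_meta and needs_copy are hypothetical.

```python
import os

def file_meta(path):
    """Return (size, mtime in whole seconds) for path, or None if it is missing."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return None
    return (st.st_size, int(st.st_mtime))

def needs_copy(src_meta, dst_meta):
    """Copy when the destination is missing or differs in size or mtime."""
    return dst_meta is None or src_meta != dst_meta
```

A call such as needs_copy(file_meta(src), file_meta(dst)) costs two stat lookups per file, which is why a mostly unchanged filesystem is processed so quickly.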
'synchro' knows about different filesystems; I have tested it with the
usual Linux
Configuration
As distributed, synchro assumes that both your hard drives will be
partitioned the same way. I put one drive on /dev/hda and the other
on the second controller at /dev/hdc. So, for example:
Source filesystem    Partition    Backed up in
/                    /dev/hda1    /dev/hdc1
/home                /dev/hdc7    /dev/hda7
This system makes it easy for me to find things when I need
to recover a file. If a file is removed from /home, I can use
mount to see that the /home filesystem lives in /dev/hdc7 and then say
mount /dev/hda7 /mnt/synchro to temporarily make the backup copy
available. Normally all backup filesystems are left unmounted.
I put the code that determines the destination into a subroutine
called get_dest. If you have different requirements (such as different
drives than "a" and "c"), you can change the code in lines [70-94] to
customize it.
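As a rough illustration of the mapping described above, here is a minimal Python sketch. It is not the Perl code from synchro; it only assumes what the article states, namely that the two drives are hda and hdc and that a partition is backed up to the same-numbered partition on the other drive. The DRIVE_PAIRS table is a hypothetical name for the part you would change for different drives.

```python
import re

# Assumed pairing of IDE drives: "a" backs up to "c" and vice versa.
DRIVE_PAIRS = {"a": "c", "c": "a"}

def get_dest(partition):
    """Map a source partition to the same-numbered partition on the other drive."""
    m = re.fullmatch(r"/dev/hd([a-z])(\d+)", partition)
    if m is None or m.group(1) not in DRIVE_PAIRS:
        raise ValueError(f"cannot determine destination for {partition!r}")
    return f"/dev/hd{DRIVE_PAIRS[m.group(1)]}{m.group(2)}"
```

With this sketch, get_dest("/dev/hda1") gives /dev/hdc1 and get_dest("/dev/hdc7") gives /dev/hda7, matching the table above.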
You can either explicitly pass the list of filesystems in on the
command line, or you can put them in a list in lines [45-52]. By default, I
look for /boot, /, /var, and /home. The command line overrides the built-in list.
synchro uses a built-in list called "extras" mostly to exclude things
that should not be copied, such as the /dev directory. The rsync
command does not handle the /dev directory gracefully! If you tell it
to copy /dev/hda1, for example, it tries to copy the entire unformatted
partition instead of just replicating the device file. When a
filesystem name matches an "extras" entry, the right-hand part (after the
=> symbol) is added to the rsync command.
The default extras in lines [55-58] work well for all my systems.
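The lookup can be pictured with the following Python sketch. The contents of the real extras list are not shown in this article, so the EXTRAS entry here is an assumption, as is the choice of the rsync options -a (archive), -x (stay on one filesystem), and --delete (remove files that no longer exist in the source). All of these are standard rsync options, but synchro may well use a different set.

```python
# Hypothetical "extras" table: filesystem name -> additional rsync arguments.
EXTRAS = {
    "/": ["--exclude=/dev/"],  # assumed entry: skip /dev when copying the root filesystem
}

def rsync_command(filesystem, mount_point="/mnt/synchro", verbose=False, no_copy=False):
    """Build the rsync argument list for one filesystem."""
    cmd = ["rsync", "-a", "-x", "--delete"]
    if verbose:
        cmd.append("-v")
    if no_copy:
        cmd.append("-n")
    # Add the right-hand side of a matching extras entry, if there is one.
    cmd.extend(EXTRAS.get(filesystem, []))
    # A trailing slash on the source copies its contents, not the directory itself.
    source = filesystem.rstrip("/") + "/"
    cmd.extend([source, mount_point + "/"])
    return cmd
```

Because rsync leaves excluded files alone on the destination unless --delete-excluded is given, an exclusion like this does not remove a /dev directory that was copied to the backup by other means.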
I use /mnt/synchro as a temporary mount point. The script creates this
directory if it does not exist. Change line [68] if you want to use a
different location.
Initial setup
If you run synchro with a -h for help you will get this output:
    This script synchronizes the partitions on two hard drives.
    Usage: synchro [options] [filesystem...]
      -d   dryrun - show commands that would be run
           without performing any actions
      -f   fsck - perform fsck commands on destination
           partitions
      -h   show this message and exit
      -n   pass -n option to rsync so that it will report
           without copying files
      -v   pass -v option to rsync so it will report
           while copying
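For readers who want to see how such an option set fits together, here is a small Python sketch using argparse. It is not synchro's own option handling (synchro is a Perl script); the function name parse_args is hypothetical, and the default list is the one mentioned earlier (/boot, /, /var, and /home).

```python
import argparse

DEFAULT_FILESYSTEMS = ["/boot", "/", "/var", "/home"]

def parse_args(argv):
    """Parse synchro-style options; filesystems on the command line override the defaults."""
    parser = argparse.ArgumentParser(
        prog="synchro",
        description="This script synchronizes the partitions on two hard drives.",
    )
    parser.add_argument("-d", dest="dryrun", action="store_true",
                        help="show commands that would be run without performing any actions")
    parser.add_argument("-f", dest="fsck", action="store_true",
                        help="perform fsck commands on destination partitions")
    parser.add_argument("-n", dest="no_copy", action="store_true",
                        help="pass -n to rsync so that it reports without copying files")
    parser.add_argument("-v", dest="verbose", action="store_true",
                        help="pass -v to rsync so that it reports while copying")
    parser.add_argument("filesystems", nargs="*", metavar="filesystem")
    args = parser.parse_args(argv)
    if not args.filesystems:
        args.filesystems = list(DEFAULT_FILESYSTEMS)
    return args
```

argparse supplies -h automatically, so it is not declared here.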
When I install synchro on a new system, I first run it with -d to see
what commands it will execute. If they look okay, then I run it once
manually to copy everything. Then I run it again with -v. This time, it
will report which files, if any, have changed.
Because synchro will never back up the /dev files, I use a tar command
pipeline during setup to copy the /dev files. Usually this is a
one-time thing because /dev files don't normally change unless you
change your hardware. Here is the command:
mount /dev/hdc1 /mnt/synchro
tar cvf - /dev | (cd /mnt/synchro && tar xpf -)
After I am satisfied that it's working correctly, I put an entry into
/etc/crontab to run it once a day. I use the -f option, so that the
destination filesystems are checked every time it's run. I made this a
command-line option so that you aren't forced to run it if you don't
want to.
If I am about to perform major changes, such as removing an account,
sometimes I will make a copy of /home using the command-line mode,
such as
synchro -v /home
The -v is passed on to rsync so that it will list out the files that
are changed.
Here is an outline of what synchro does. Line numbers are in brackets.
- Read command-line options [29] and filesystems [39-40], if any. If no filesystems are given, use the default internal list [45-52].
- Create a mount point if one does not exist. [98-100]
- Run the mount command to build lists of filesystem types and partition names. [105-113]
- Loop over the list of filesystems [121-156]. For each filesystem:
  - Get any extra options from the "extras" list. [124]
  - Determine the destination name using the information gathered from the mount command. [128-130]
  - Check the destination filesystem with the fsck command, if -f was given. [132-139]
  - Mount the destination filesystem. [141-144]
  - Perform the rsync to synchronize content. [146-150]
  - Unmount the destination filesystem. [152-155]
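The mount-parsing step and the per-filesystem sequence in the outline above can be sketched in Python as follows. This is an illustration under assumptions, not a translation of the Perl script: it assumes the usual Linux mount output format ("device on mountpoint type fstype (options)"), and the exact fsck, mount, and rsync arguments are guesses at something reasonable. The names parse_mount and plan_commands are hypothetical.

```python
import re

# Matches lines such as: /dev/hda1 on / type ext2 (rw)
MOUNT_LINE = re.compile(r"^(\S+) on (\S+) type (\S+)")

def parse_mount(output):
    """Return two dicts keyed by mount point: partition names and filesystem types."""
    partitions, fstypes = {}, {}
    for line in output.splitlines():
        m = MOUNT_LINE.match(line)
        if m and m.group(1).startswith("/dev/"):
            device, mount_point, fstype = m.groups()
            partitions[mount_point] = device
            fstypes[mount_point] = fstype
    return partitions, fstypes

def plan_commands(filesystem, dest, fstype, do_fsck=False, mount_point="/mnt/synchro"):
    """List the commands for one filesystem: optional fsck, mount, rsync, umount."""
    commands = []
    if do_fsck:
        commands.append(["fsck", "-t", fstype, "-a", dest])  # assumed fsck arguments
    commands.append(["mount", "-t", fstype, dest, mount_point])
    source = filesystem.rstrip("/") + "/"
    commands.append(["rsync", "-a", "-x", "--delete", source, mount_point + "/"])  # assumed rsync options
    commands.append(["umount", mount_point])
    return commands
```

Keeping the plan as a list of argument lists makes it easy to print the commands instead of running them, which is what a dry-run mode needs.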
That's it. Also of note in the script is the syscmd() subroutine in
lines [158-176]. All system commands are routed through here to make
it easy to run the script in "dryrun" mode. If -d is given as a
command-line argument, the command will be printed in syscmd, but not
executed.
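The same pattern is easy to reproduce in other languages. The Python sketch below shows one way to funnel every command through a single function that honors a dry-run flag. It is not the syscmd() from the script; in particular, stopping on the first failed command (check=True) is a choice made for this sketch, not something the article specifies.

```python
import shlex
import subprocess

def syscmd(cmd, dryrun=False):
    """Print the command; execute it only when dryrun is False."""
    print(shlex.join(cmd))
    if dryrun:
        return 0
    # check=True raises CalledProcessError if the command exits with a nonzero status.
    return subprocess.run(cmd, check=True).returncode
```

For example, syscmd(["mount", "/dev/hda7", "/mnt/synchro"], dryrun=True) prints the mount command and returns without touching the system.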
I will readily admit I'd love to use hardware-supported RAID-1 in
addition to this daily rsync scheme, but my tiny IT budget just
does not allow it. I've used various incarnations of this script for a
number of years now. I hope you find it useful, too.
Brian Wilson wrote most of this article while sitting in the Marin headlands overlooking the Golden Gate Bridge. He claims that bicycles and laptops and corporate downsizing definitely have their advantages.