Editor's note: Ethan has collected this series and other information into Managing RPM-Based Systems with Kickstart and Yum.
My two previous articles explained how to use Kickstart to automate OS installs and upgrades. This article demonstrates some techniques for the third piece of the system maintenance cycle: keeping your machines up to date. That includes how to:
Run your own yum repository, so that your machines don't have to update from the public servers. This saves you bandwidth, shortens update times, and gives you more control over what updates you install.
Fold those updates into your Kickstart install tree, so that new machines start life already up to date.
Let yum run unattended.
What's interesting is that the last two techniques are half technology, half architecture: a naming convention and a few symbolic links go a long way. Some custom code doesn't hurt, either.
I tested the steps outlined in this article under Fedora Core versions 2 and
3, but they should also work under Red Hat 9 and FC1. This article assumes you
have a modest familiarity with RPM, Kickstart, and yum. Refer to
the Resources section for links to documentation and other
articles on these topics.
Running Your Own yum Repository
Red Hat 9 and the Fedora Core series include yum (Yellow Dog
Updater, Modified) to simplify system updates. Point yum to a
collection of RPMs (a repository, or repo for short)
and it will find the latest packages to install on your system. Fedora's
default yum install includes public repo definitions, so you can
keep your system up to date by running yum's cron jobs.
Running your own internal repo saves bandwidth if you have multiple machines, because only one machine fetches new RPMs from the outside world. This makes internal updates faster, because few Internet hookups can match LAN speeds. You can also fold your own RPMs into the repo and manage all software updates from the same centralized resource.
Most of all, pointing machines to a private repo gives you control: you can
limit what yum sees and, in turn, what it installs.
yum downloads a repo's newest RPMs, which aren't always the best
for you. For example, a new version of a shared library may require you to
recompile some homegrown code. That makes for an ugly surprise.
A yum repo is just a collection of RPMs and some metadata
extracted from them. yum clients use the metadata to determine
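what RPMs are in the repo. Setting up a repo, then, requires:
a job that downloads the updated RPMs (wget, in this article)
a web server to share those RPMs with your clients
a tool that extracts the RPM metadata (yum-arch or createrepo)
To set up your wget job, first select a download site from the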
Fedora team's list of
update mirrors.
Next, wrap your wget command in a shell script, Perl tool, or
whatever else suits your fancy. I use the following wget
switches:
--progress=dot:mega: each dot represents 64K, instead of the default 1K. This lets you track download progress without flooding your screen or log file on large files.
--accept=rpm: download only files with a *.rpm extension. There's no need to grab miscellaneous HTML or text files.
--recursive: even though you're fetching only one level of files, this causes wget to descend into the specified directory to fetch the RPMs.
--no-parent: don't follow links that point above this directory, or you risk downloading the entire site.
--relative: follow only relative links. Absolute links usually point to off-site resources, or at least resources that aren't under the current directory.
--no-directories: don't create a directory structure that matches the remote site. You just want the files.
--exclude-directories='*/SRPMS/*': sometimes the source RPMs are in a directory beneath the binary RPMs. You probably don't want them for your private yum repo.
--no-clobber: don't overwrite existing files. You don't want to download the entire set of RPMs each time, just the updates.
--wait {n}: pause {n} seconds between downloads, so your job doesn't hammer the remote server. (This is more useful for HTTP download mirrors than for FTP mirrors.)
--directory-prefix={dir}: where to put the downloaded files.
You can call your wget script manually or via cron. Please show courtesy to your download site's maintainer and schedule jobs for off-hours, and set the --wait flag to 60 or 120 seconds (1 or 2 minutes) or more between downloads. If the job runs overnight, the extra download time won't make a difference.
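The switches above can be gathered into a small wrapper along these lines. This is a minimal sketch, not the script I use: the function name fetch_updates, the WGET override variable (handy for a dry run), and the placeholder mirror URL and target directory in the usage comment are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: fetch_updates and the WGET override are illustrative names,
# not part of wget itself.

# fetch_updates MIRROR_URL DEST_DIR [WAIT_SECONDS]
# Downloads any RPMs under MIRROR_URL that are not already in DEST_DIR.
fetch_updates() {
    mirror_url=$1
    dest_dir=$2
    wait_secs=${3:-60}    # default to a polite 60-second pause

    # Set WGET=echo to print the command instead of running it (dry run).
    ${WGET:-wget} \
        --progress=dot:mega \
        --accept=rpm \
        --recursive \
        --no-parent \
        --relative \
        --no-directories \
        --exclude-directories='*/SRPMS/*' \
        --no-clobber \
        --wait "$wait_secs" \
        --directory-prefix="$dest_dir" \
        "$mirror_url"
}

# Example call (placeholders in braces are yours to fill in):
# fetch_updates "http://{mirror}/{path to the FC3 i386 updates}/" \
#     "{document root}/FC3-i386/updates/Fedora/RPMS" 120
```

Keeping the wait time as an argument makes it easy to use a longer pause for the cron job than for a one-off manual run.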
Setting up the web server is even easier: point the document root to the directory where you downloaded the RPM updates. For flexibility and growth, you may want to standardize on a directory structure, such as that shown in Figure 1.

Figure 1. Sample directory structure for a yum repo, hosted on a Kickstart server
This directory structure accommodates several OS releases and architectures. In this example, the updated RPMs for Fedora Core 2, i386 architecture go under the web server's document root in FC2-i386/updates/Fedora/RPMS.
Run yum-arch to extract RPM metadata if the repo is for FC2 or
older. Using the directory structure from above:
$ yum-arch {document root}/FC2-i386/updates
This command scans the RPMs in the tree and dumps header information into
the FC2-i386/updates/headers directory. There is one
.hdr file for each RPM in the tree.
FC3 stores its header info in a different format, generated by the
createrepo command:
$ createrepo {document root}/FC3-ia64/updates
createrepo stores the RPM metadata in a set of XML files. In
the above example, these files exist under the web server document root in
FC3-ia64/updates/repodata.
yum-arch still exists under FC3, so you can create the older
header format for FC2 clients. It may be possible to run
createrepo under older Red Hat releases in order to serve FC3
clients. Because both tools are written in Python, they might work under other
operating systems. Admittedly, I haven't tried this.
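If one server hosts trees for several releases, a short loop can choose the right tool for each tree. The following is a sketch under stated assumptions: it relies on the FC{release}-{arch} directory naming from Figure 1, the function names and the RUN dry-run variable are hypothetical, and it knows only the releases covered in this article.

```shell
#!/bin/sh
# Sketch only: function names and the RUN variable are illustrative.

# Print the metadata tool for a tree named like FC2-i386 or FC3-ia64.
metadata_tool_for() {
    case "$1" in
        FC1-*|FC2-*) echo yum-arch ;;     # older header format
        FC3-*)       echo createrepo ;;   # XML repodata
        *)           return 1 ;;          # release not covered here
    esac
}

# refresh_metadata DOCROOT
# Regenerates metadata for every DOCROOT/FC*-*/updates directory.
refresh_metadata() {
    docroot=$1
    for dir in "$docroot"/FC*-*/updates; do
        [ -d "$dir" ] || continue
        tree=$(basename "$(dirname "$dir")")
        if tool=$(metadata_tool_for "$tree"); then
            # Set RUN=echo to print the commands instead of running them.
            ${RUN:-} "$tool" "$dir"
        else
            echo "skipping $tree: unknown release" >&2
        fi
    done
}

# Example: refresh_metadata "{document root}"
```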
Configuring yum Clients to Use the New Repo
Configuring a client is as simple as editing a few text files.
For FC2 and earlier, the repo definitions live in /etc/yum.conf.
You don't want the client machines downloading from the public repos anymore,
so comment out those preexisting definitions with #
characters.
Next, define an entry for your shiny new local repo:
[internal-updates]
name = internal update server
baseurl = http://{update-server}/FC$releasever-$basearch/updates
This repo definition breaks down as follows:
[internal-updates] marks the beginning of a new repo definition. This name should be unique within the file.
name is a descriptive name for the repo.
baseurl points to the update web server. The path portion of the URL is the directory containing the headers directory created by yum-arch.
yum expands $releasever and $basearch into the current host's OS revision and hardware architecture, respectively. A mix of these variables and a predictable repo directory layout allows you to maintain a single repo def for your entire shop.
FC3 separates repo definitions from the main yum.conf. To disable the existing repos, add:
enabled=0
to all of the .repo files in /etc/yum.repos.d. Create
your own internal-updates.repo file that contains just a stanza,
similar to the example FC2 entry. Next, test the repo configuration in a
nondestructive manner:
# yum check-update
This will contact the repo web server, fetch RPM header info (either from
headers or repodata, depending on the target
machine's OS version), and list the RPMs for which updates are available. If
you're satisfied with those results, tell yum to update the
machine based on the repo's contents:
# yum update
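For FC3 clients, the internal-updates.repo file described earlier can be written by a short script, which is handy in a Kickstart %post section or when you have many machines to convert. This is a minimal sketch: the function name write_internal_repo and its arguments are hypothetical, and enabled=1 is spelled out only for clarity.

```shell
#!/bin/sh
# Sketch only: write_internal_repo is an illustrative helper name.

# write_internal_repo REPO_FILE UPDATE_SERVER
# Writes a single-stanza repo definition for the internal update server.
write_internal_repo() {
    repo_file=$1
    update_server=$2
    # \$releasever and \$basearch are escaped so that the shell leaves
    # them alone and yum expands them on each client.
    cat > "$repo_file" <<EOF
[internal-updates]
name = internal update server
baseurl = http://$update_server/FC\$releasever-\$basearch/updates
enabled=1
EOF
}

# Example (run as root on an FC3 client; the host name is a placeholder):
# write_internal_repo /etc/yum.repos.d/internal-updates.repo "{update-server}"
```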
You certainly don't have to call yum by hand on all of your
machines every time you want to update them. Enable the yum daemon
to take advantage of automatic (cron'd) updates:
(set the daemon to start on every system boot)
# chkconfig --add yum
# chkconfig yum on
(start the daemon now, so you don't have to reboot)
# service yum start
There's a trade-off between the risk of unattended, automated updates and the cost of manual labor. Manual updates tend to win out in more formal shops. Later in the article, I'll demonstrate a method that provides a layer of change control while allowing machines to update themselves.
As long as you've downloaded the updated RPMs, you may as well fold them into the Kickstart process. In turn, newly Kickstarted machines will start their life with the updates already applied. To do this, you must put the latest RPMs under the Kickstart tree's Fedora/RPMS directory, copy the Fedora/base directory (from the original OS install media) to the Kickstart tree, and rebuild the hdlist files. As before, formalizing a directory structure and naming convention will help your repo scale.
The first step is the toughest, because you can't simply download the RPM
updates right into the Kickstart tree's os/Fedora/RPMS directory.
Whereas yum downloads the latest RPMs, Anaconda (ergo Kickstart)
doesn't gracefully handle situations in which there are multiple versions of a
package in the install tree.
You must therefore replace old package versions in the install tree with their newer counterparts. Doing this by hand is for neither the faint of heart nor the lazy. Being a proud member of the latter category, I prefer to let code do the heavy lifting. The key is to use the RPM API to extract package header info and compare versions. Doing this based on RPM filenames alone is asking for a headache. (I've written a tool to do just this -- Novi.)
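Whatever tool does the replacing, it is worth confirming afterward that no package appears twice in the tree. The sketch below is not Novi and does not compare versions. It only asks rpm for the name and architecture stored in each package header (not the filename) and reports any pair that occurs more than once. The function name find_duplicate_packages and the RPM override variable are assumptions for illustration.

```shell
#!/bin/sh
# Sketch only: find_duplicate_packages and the RPM override are illustrative.

# find_duplicate_packages RPM_DIR
# Prints each name.arch that occurs in more than one RPM file in RPM_DIR.
find_duplicate_packages() {
    rpm_dir=$1
    for f in "$rpm_dir"/*.rpm; do
        [ -f "$f" ] || continue
        # Read the name and architecture from the package header.
        ${RPM:-rpm} -qp --queryformat '%{NAME}.%{ARCH}\n' "$f"
    done | sort | uniq -d
}

# Example: find_duplicate_packages "{path to FC3-i386/dist}/Fedora/RPMS"
# No output means every package appears exactly once.
```

Querying one file at a time is slow on a full install tree, but it avoids overly long command lines and keeps the sketch simple.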
First, create a new directory to serve as the install tree. Using the directory structure outlined above, that is FC3-i386/dist under the document root. Put the latest packages under that directory, in Fedora/RPMS. (You can save space by hard-linking the RPMs from the install and update trees, if possible.) Copy the directory Fedora/base from the original OS media, too. You should have a directory structure similar to that of Figure 2.

Figure 2. Directory structure for a pre-updated Kickstart tree
Notice the dist directory has the same layout as the os directory, which holds the original install media. dist is essentially the os tree with newer RPMs.
Next, use genhdlist to rebuild the hdlist files. FC3
is a little pickier than FC2 and requires that you first generate a package
order file:
$ PYTHONPATH=/usr/lib/anaconda \
    /usr/lib/anaconda-runtime/pkgorder \
    {path to FC3-i386/dist} \
    i386 Fedora > order.txt
$ /usr/lib/anaconda-runtime/genhdlist \
    --withnumbers \
    --fileorder order.txt \
    --productpath Fedora \
    {path to FC3-i386/dist}
FC2 requires only the genhdlist command, without the
--withnumbers and --fileorder flags. Feel
free to ignore pkgorder's warnings that mention "ignore
package name relation(s)".
Point your Kickstart clients to the new install directory and add:
url --url http://{build-server}/FC2-i386/dist
to ks.cfg. You won't have to double back after the install to apply the updates.
Call yum-arch or createrepo on the Kickstart tree
to let it do double duty as a yum repo. This creates a
one-stop shop for your machines: whether you're installing the OS or just
updating it, your entire shop will run the same set of RPMs.

Figure 3. Pre-updated Kickstart install tree that doubles as a yum repo
Notice this is the same as the dist directory mentioned earlier, just with the headers (or repodata) directory for the RPM metadata.
This setup is far from perfect, though. In a large shop, you probably want
to test new RPMs on a few machines before you install them everywhere. Then
again, you want to take advantage of cron jobs to let yum do its
work unattended.
I've spent my career near or in software teams, and as a result I tend to think in terms of release versions: "What do we consider stable and production-quality?" versus "What are we still testing in the lab?" and so on. If you had a designated stable build, you'd have no problem letting your production machines update from it automatically. Furthermore, having clear, labeled builds simplifies systems management because you can tell which labeled build a machine is running.
I applied some software development practices and devised a mix of directories and symbolic links to solve this problem. The trick is to create a new, labeled directory each time you fetch new updates from the public repos. I prefer to use dates as my labels in YYYYmmdd format. Figure 4 is an example of such a directory tree.

Figure 4. Labeled (dated) directories for combination Kickstart/yum trees
Populate the dated directory's Fedora/RPMS subdirectory with the latest RPMs. Copy the base directory from the original OS install media. In the end, the dated directory should resemble the os and dist directories described previously.
Then promote each build through a test cycle:
Point designated scratch machines directly to the dated directories, such as:
http://{update-server}/FC2-i386/dist-20050105
The scratch machines' Kickstart and yum configurations will
change with every build, to point to the (new) dated directory.
When a label has proven somewhat stable, release it to a wider audience. Designated test machines are integration areas for homegrown and third-party software. Your internal developers and QA staff will primarily use these.
Create a symbolic link dist-testing that points to the dated build
directory (dist-20050105 in this example). In turn, the test machines'
Kickstart and yum configurations point to the symbolic link
abstraction:
http://{update-server}/FC2-i386/dist-testing
The test machines' Kickstart and yum configurations don't
change. They are free to update from this repo at leisure.
After the build label has proven itself on the test machines, release it
to the production hosts. Create a symbolic link, dist-stable, that
points to the build's dated directory. Production hosts point to this
designated "stable" URL:
http://{update-server}/FC2-i386/dist-stable
Promote each build through this cycle to ensure the RPM updates' stability and compatibility with your environment. Of course, you're free to add more steps to the promotion cycle, or split your production machines into A/B groups that update on staggered schedules.
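Promotion then amounts to repointing a symbolic link, which a small helper can do with a couple of sanity checks. This is a sketch under stated assumptions: the function name promote_build is hypothetical, the directory and link names follow the dist-{label}, dist-testing, and dist-stable convention used above, and the ln -n switch assumes GNU ln as shipped with Fedora.

```shell
#!/bin/sh
# Sketch only: promote_build is an illustrative helper name.

# promote_build TREE_DIR LABEL STAGE
#   TREE_DIR  directory holding the dated builds, e.g. {document root}/FC2-i386
#   LABEL     build date in YYYYmmdd form, e.g. 20050105
#   STAGE     testing or stable
promote_build() {
    tree_dir=$1
    label=$2
    stage=$3

    case "$stage" in
        testing|stable) ;;
        *) echo "unknown stage: $stage" >&2; return 1 ;;
    esac
    if [ ! -d "$tree_dir/dist-$label" ]; then
        echo "no such build: $tree_dir/dist-$label" >&2
        return 1
    fi

    # A relative target keeps the link valid if the whole tree is moved.
    # -n replaces an existing link instead of descending into it.
    ln -sfn "dist-$label" "$tree_dir/dist-$stage"
}

# Example:
# promote_build "{document root}/FC2-i386" 20050105 testing
# promote_build "{document root}/FC2-i386" 20050105 stable
```

Rolling back is the same call with an older label. If the web server is Apache, it must be allowed to follow symbolic links (the FollowSymLinks option) for the dist-testing and dist-stable URLs to resolve.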
Running your own yum repo saves time and bandwidth, and gives
you much more control than pointing your machines to the public repos. Folding
this into the Kickstart process brings you closer to a shop that runs
itself.
This article barely touched on yum client maintenance, such as
the occasional cache cleansing. See the Resources section for links to
the yum documentation.
Resources
yum's website includes documentation, FAQs, and more.
Q Ethan McCallum grew from curious child to curious adult, turning his passion for technology into a career.
Copyright © 2009 O'Reilly Media, Inc.