 Published on MacDevCenter (http://www.macdevcenter.com/)


Distributed Tiger: Xgrid Comes of Age

by Drew McCormack
08/23/2005

Waiter...there's a Distributed Computational Network in my Panthera Tigris!

Around a year ago, I waxed lyrical in these pages [1,2] over the then-preview release of Xgrid, Apple's product for distributed computation. I made various predictions, some on the money, and others less so.

When Mac OS X 10.4 Tiger was released, Xgrid lost its preview status, and became a real boy. In fact, it ships with every copy of Mac OS X, both client and server. Now, I would like to revisit Xgrid, bring you up to date on the changes, and show you how it could be useful in your application development, whether your interest is DNA or DVDs, Picasso or Poincaré.

In this first of two articles, I will show you how to set up a small Xgrid for testing purposes, submit simple jobs to the grid with the command line interface (CLI), and query their progress. The second article will be a Cocoa Tour de Force, involving new Tiger technologies like Automator and Core Image, in addition to Xgrid.

I'm going to take quite a bit for granted, and won't go into any detail related to the architecture of Xgrid. If you are not familiar with Xgrid at all, your best course of action would probably be to read the first of my original articles, before proceeding with this one. Apple has also provided a document on Xgrid Administration for Mac OS X Server, which gives a good overview of how Xgrid works.

What's Changed?

Although Xgrid 1.0, which shipped with Tiger, is fundamentally the same as the preview releases that I wrote about in the earlier articles, there are some important differences. Probably the most significant is that Apple has now steered away from the relatively inflexible approach of submitting jobs with a GUI tool, and instead opened up the client-side Cocoa API for public consumption. This was one of my original predictions that turned out to be correct (he said smugly).

This move has a number of advantages; anyone who used the preview releases of Xgrid would soon have realized that supplying a GUI, and requiring developers to write plugins, was not the best approach for grid computing. The problem is that distributed computation can take so many different forms; there are as many applications as there are developers to implement them, and each application requires a completely different user interface.

What's more, many developers would rather integrate Xgrid functionality directly into their applications, as this provides for a better user experience than requiring users to install plugins and work with a separate Xgrid client application. In fact, the second of my original pieces on Xgrid demonstrated how you could do just that, even without a Cocoa API, by invoking the xgrid command-line tool from within a Cocoa application.

Apple must have been listening to its developers, because Xgrid 1.0 does include a Cocoa API, which allows developers to directly integrate Xgrid into their applications without depending on command-line hacks. The original Xgrid client GUI has largely disappeared, but some of its plugins live on as examples in the /Developer/Examples/Xgrid directory, which appears after you install the Xcode developer tools. The GridSample project is particularly useful because it allows users to run arbitrary serial and parallel (MPI) jobs.

Another benefit of opening up the Xgrid API to developers is that it paves the way for third parties to build more powerful clients, in addition to integrating Xgrid directly into their applications. Good examples of this are PyXG and GridStuffer. PyXG is a Python interface to Xgrid developed by Brian Granger, and GridStuffer is a free Cocoa client developed by Charles Parnot at Stanford.

GridStuffer is similar to the GridSample application, but is considerably more advanced, offering many more options for job submission and monitoring. (Incidentally, Charles is grid master of one of the largest Xgrids in the world--maybe THE largest. He uses it for biochemical modeling, so if you have some spare CPU cycles, why not donate them to his research?)

Xgrid has not only evolved on the client side; the controller has also seen considerable change. My original prediction that Xgrid would facilitate peer-to-peer computation--where Macs would seek each other out via AirPort, and help each other perform expensive operations like iMovie renders--still seems some way off. Apple has instead opted for a more traditional model, with a central Mac OS X Server acting as controller. Apple does provide a basic controller on the client version of Tiger, but it is intended for testing purposes only, and lacks some of the security features (e.g., Kerberos Single Sign-On) of the controller supplied with the server version of the OS.

Configuring an Xgrid Agent

If you aren't concerned with security, setting up Xgrid is very straightforward. We will now go through the steps required to get a simple test grid up-and-running on your local machine. This grid will not use any form of authentication, so it should only be used on systems that are isolated from the big, bad world. Later, I will show you how you can add password authentication, which is more involved, but also offers more peace of mind.

First, we will configure the Xgrid agent. The agent is the guy that performs the calculations contained in a job. You configure the agent using the Sharing pane in System Preferences. If you open the pane, and select the Services tab, you should see Xgrid in the table view on the left. Select it, and click the Configure... button. Configure the agent to connect to the first available controller, always accept tasks, and not use any authentication. Figure 1 shows how it should look.

Figure 1. Configuring the Xgrid agent in System Preferences.

Not all Macs run Mac OS X 10.4, so you will be happy to hear that an agent for 10.3 (Panther) clients is available from Apple's website.

Setting Up a Controller

The next piece of the Xgrid puzzle is the controller. Because most of us do not run Mac OS X Server, I am going to take you through setting up an Xgrid controller on the client version of Mac OS X Tiger. Apple hasn't gone out of its way to make setting up this controller obvious, but it can be accomplished with a few well-chosen commands in Terminal. If the command line is completely alien to you, you might consider using XgridLite, which achieves the same end result with a more user-friendly graphical interface.

The command-line tool xgridctl can be used to stop and start the controller and/or the agent. If you are familiar with the command-line interface to the Apache Web Server (i.e., apachectl), xgridctl should seem familiar, because it operates in a very similar manner. Simply open the Terminal application, and enter the following command to start the Xgrid controller:


drew pbook: sudo xgridctl c start

Enter your password when requested. Starting the controller requires superuser privileges, which is why the command is prefixed with sudo, and sudo in turn requires that you be an administrator of your Mac. The c in the command indicates that you wish to send a command to the controller; to address the agent instead, use the letter a. The command itself comes last, in this case start. Other options include stop and restart. For a complete overview of xgridctl, see the man page by entering the following in Terminal:


drew pbook: man xgridctl
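
If you ever want to drive xgridctl from a script, the argument order described above (target letter first, command last) can be captured in a few lines of Python. This is only a sketch: the helper name xgridctl_command is my own invention, and it knows only the targets and commands mentioned in this article, not everything in the man page. It builds the command line without running it.

```python
# Hypothetical helper: builds, but does not run, an xgridctl command line.
# Only the targets and commands mentioned in the article are accepted.
TARGETS = {"controller": "c", "agent": "a"}
ACTIONS = ("start", "stop", "restart")


def xgridctl_command(target, action, use_sudo=True):
    """Return the argument list: target letter first, command last."""
    if target not in TARGETS:
        raise ValueError(f"unknown target: {target!r}")
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    argv = ["xgridctl", TARGETS[target], action]
    # xgridctl needs superuser privileges, hence the optional sudo prefix.
    return ["sudo"] + argv if use_sudo else argv


print(" ".join(xgridctl_command("controller", "start")))
```

The returned list can be handed to subprocess.run on a Mac that has Xgrid installed.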

The Xgrid Admin Tool

One of Apple's best kept secrets is that it gives away a piece of Mac OS X Server for free: some of the Server administration tools are available as a free download. The suite includes the Xgrid Admin tool, which can be used to log in to an Xgrid controller and monitor its activities. I suggest you download the tools, install them, and then take a peek in /Applications/Server. Start the Xgrid Admin tool, and when prompted for a controller, choose the one on your Mac that you started above.

The Xgrid Admin tool provides a summary of the activity of any controller or grid that you select in the table view on the left. The Overview tab gives you the current capacity of the controller or grid, how many jobs are running, how many are pending, and so forth. The Agents tab is used to find out which agents are connected and what they are doing, and the Jobs tab shows the status of all jobs that are pending, running, or have run on the controller or grid in question.

The Xgrid Admin tool.

It can help your understanding of the inner workings of Xgrid considerably if you monitor what happens in the Xgrid Admin tool as you submit jobs to the Xgrid controller. When you are learning to use the Xgrid command-line client in the next section, I suggest you occasionally take a peek at the Xgrid Admin tool, and observe its response to the various requests from the client.

The xgrid Command-Line Client

The last element of Xgrid is the client. The client submits jobs to the controller, which then arranges for them to be run on one of the available agents. Apple only supplies one client application--aside from the example projects mentioned earlier--the command-line tool xgrid. You use xgrid not only for job submission, but also for querying grid status. To make sure your controller is running, and find out which grids are available, you can issue this command:


drew pbook: xgrid -grid list
{gridList = (0); }

The output is in property list format. An array of grid identifiers is returned for the key gridList; in this case the array only contains a single number, 0, which is the identifier for the default grid on the controller.

To query the status of one of the grids in the list, you simply issue a command like the following:


drew pbook: xgrid -grid attributes -gid 0
{gridAttributes = {gridMegahertz = 0; isDefault = YES; name = Xgrid; }; }

The -grid option indicates that xgrid is to retrieve the attributes of one of the grids hosted by the controller, and -gid identifies the grid in question, in this case that with identifier 0. Various pieces of information are returned, including whether the grid is the controller's default or not (isDefault), the name of the grid (name), and the current workload (gridMegahertz).
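
Because xgrid prints old-style (non-XML) property lists, a script that wants to act on the output has to parse that text itself. The following Python is a rough sketch of such a parser, not anything supplied by Apple: the function name parse_plist_text is hypothetical, and it assumes well-formed input of the shape shown in this article (dictionaries, arrays, quoted strings without escaped quotes, and bare values). Every value comes back as a string.

```python
def parse_plist_text(text):
    """Parse the subset of old-style property list text that xgrid prints."""
    pos = 0

    def skip_ws():
        nonlocal pos
        while pos < len(text) and text[pos].isspace():
            pos += 1

    def parse_value(terminators):
        nonlocal pos
        skip_ws()
        ch = text[pos]
        if ch == "{":
            return parse_dict()
        if ch == "(":
            return parse_array()
        if ch == '"':
            # Quoted string; escaped quotes are not handled in this sketch.
            end = text.index('"', pos + 1)
            value = text[pos + 1:end]
            pos = end + 1
            return value
        # Bare value: read up to the next terminator and trim whitespace.
        start = pos
        while pos < len(text) and text[pos] not in terminators:
            pos += 1
        return text[start:pos].strip()

    def parse_dict():
        nonlocal pos
        pos += 1  # consume "{"
        result = {}
        while True:
            skip_ws()
            if text[pos] == "}":
                pos += 1
                return result
            key = parse_value("=")
            skip_ws()
            pos += 1  # consume "="
            result[key] = parse_value(";")
            skip_ws()
            pos += 1  # consume ";"

    def parse_array():
        nonlocal pos
        pos += 1  # consume "("
        items = []
        while True:
            skip_ws()
            if text[pos] == ")":
                pos += 1
                return items
            items.append(parse_value(",)"))
            skip_ws()
            if text[pos] == ",":
                pos += 1

    return parse_value("")


grids = parse_plist_text("{gridList = (0); }")
print(grids["gridList"])
```

Feeding it the output of the attributes query gives a nested dictionary, so the grid name is available as result["gridAttributes"]["name"].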

Submitting Jobs and Querying Job Status

To use xgrid to run a job, you use the -job option.


drew pbook: xgrid -job run /bin/hostname
myhost.name.com

Supplying the -job option with run causes the command, in this case /bin/hostname, to be run synchronously. That is, the xgrid command will wait until the job is finished before returning control.

Running jobs synchronously is fine for short tests, but anything more substantial is better handled asynchronously. The submit command is passed with the -job option to run jobs asynchronously.


drew pbook: xgrid -job submit /bin/hostname
{jobIdentifier = 7; }

What you will notice about this is that the output of the hostname command does not appear. Instead, a property list is printed, with a job identifier. The idea is to use this identifier in future xgrid commands in order to query the state of your job, or retrieve its results.
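
In a script, the first thing you will want to do with that output is fish out the identifier. Here is a minimal Python sketch, assuming the output always has the jobIdentifier = ...; form shown above; the function name parse_job_id is hypothetical.

```python
import re

# Matches the "jobIdentifier = 7;" entry in the output of xgrid -job submit.
JOB_ID_PATTERN = re.compile(r'jobIdentifier\s*=\s*"?([^";\s]+)"?\s*;')


def parse_job_id(submit_output):
    """Return the job identifier (as a string) from the submit output."""
    match = JOB_ID_PATTERN.search(submit_output)
    if match is None:
        raise ValueError("no jobIdentifier found in output")
    return match.group(1)


print(parse_job_id("{jobIdentifier = 7; }"))
```

The identifier is kept as a string, since it is only ever passed back to xgrid with the -id option.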

To find out how a job is proceeding, you supply the attributes keyword with the -job option, like so:


drew pbook: xgrid -job attributes -id 7
{
    jobAttributes = {
        activeCPUPower = 0; 
        applicationIdentifier = "com.apple.xgrid.cli"; 
        dateNow = 2005-07-31 13:42:29 +0200; 
        dateStarted = 2005-07-31 13:37:53 +0200; 
        dateStopped = 2005-07-31 13:37:54 +0200; 
        dateSubmitted = 2005-07-31 13:37:50 +0200; 
        jobStatus = Finished; 
        name = "/bin/hostname"; 
        percentDone = 100; 
        taskCount = 1; 
        undoneTaskCount = 0; 
    }; 
}

The job identifier returned when we submitted the job is supplied with the -id option when querying job status. As you can see, many job attributes are printed, including how much CPU power is currently assigned to it (activeCPUPower), the percentage of the job that is currently complete (percentDone), and the job status (jobStatus).

The output of this command indicates that the job is Finished, so we can now retrieve the results:


drew pbook: xgrid -job results -id 7
myhost.name.com

The results keyword is passed with the -job option when retrieving job output, and the job identifier also needs to be supplied, of course.
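
The submit-then-poll routine described above is easy to automate. The Python below is a sketch under a few assumptions: the function names are my own, the only status it recognizes is the Finished value shown in the output above, and the polling interval and timeout are arbitrary defaults. The fetch function is passed in as a parameter so that the waiting logic can be tried out without a running controller.

```python
import re
import subprocess
import time


def fetch_attributes(job_id):
    """Run xgrid -job attributes -id <job_id> and return its text output."""
    result = subprocess.run(
        ["xgrid", "-job", "attributes", "-id", str(job_id)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def job_status(attributes_text):
    """Extract the jobStatus value, or None if it is not present."""
    match = re.search(r'jobStatus\s*=\s*"?([^";]+?)"?\s*;', attributes_text)
    return match.group(1) if match else None


def wait_until_finished(job_id, fetch=fetch_attributes,
                        interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll until the job reports Finished; return False on timeout."""
    waited = 0.0
    while True:
        if job_status(fetch(job_id)) == "Finished":
            return True
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval
```

Once wait_until_finished returns True, the results can be collected with xgrid -job results and the same identifier.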

There are many other options for the xgrid command-line tool. You can get a list of them by issuing xgrid -h, and more detail is available via the man page (man xgrid). Particularly worthy of note is that you can indicate a particular controller with the -h option, and that there are many commands that give you control over running jobs, enabling you to suspend or stop them, for example.

Submitting Multitask Jobs

The xgrid command in Tiger is more powerful than the one in Technology Preview 2. For a start, you can submit jobs that include multiple tasks. When you submit a job with multiple tasks, the controller splits it up, and distributes the tasks over the available agents. When all tasks are complete, the job is finished.

To stipulate many tasks in a single command could get messy, so Apple has introduced a new batch mode of operation. When you are using the batch mode, you don't enter options after the xgrid command, but instead provide a property list file containing details of the various tasks. You can read about the structure of this property list in the Xgrid man page (man xgrid), but there is actually an easier way to get started.

Ernest ("Ernie") Prabhakar, Xgrid Product Manager at Apple, recommends that you run a standard nonbatch job, and then retrieve its specifications, using those as the basis for future batch runs. Here is an example that Ernie provided on the Xgrid mailing list: first you run a standard asynchronous Xgrid job, in this case the cal command.


drew pbook: xgrid -job submit /usr/bin/cal 3 2005
{jobIdentifier = 8; }

You monitor the job's attributes until its status is Finished.


drew pbook: xgrid -job attributes -id 8       
{
    jobAttributes = {
        activeCPUPower = 0; 
        applicationIdentifier = "com.apple.xgrid.cli"; 
        dateNow = 2005-08-07 11:33:16 +0200; 
        dateStarted = 2005-08-07 11:31:22 +0200; 
        dateStopped = 2005-08-07 11:31:25 +0200; 
        dateSubmitted = 2005-08-07 11:31:21 +0200; 
        jobStatus = Finished; 
        name = "/usr/bin/cal"; 
        percentDone = 100; 
        taskCount = 1; 
        undoneTaskCount = 0; 
    }; 
}

Now you can retrieve the job specification, which is the property list you need to submit batch jobs:


drew pbook: xgrid -job specification -id 8
{
    jobSpecification = {
        applicationIdentifier = "com.apple.xgrid.cli"; 
        inputFiles = {}; 
        name = "/usr/bin/cal"; 
        submissionIdentifier = abc; 
        taskSpecifications = {
            0 = {arguments = (3, 2005); 
                 command = "/usr/bin/cal"; }; 
        }; 
    }; 
}

The best approach is to redirect the output of this command to a file and edit it, either with your favorite text editor, or with Apple's Property List Editor (in /Developer/Applications/Utilities). The following command will dump the property list to the file batch.plist:


drew pbook: xgrid -job specification -id 8 > batch.plist

Once you have the property list, you can add new tasks to it. Here is an example with three separate tasks:


{
    jobSpecification = {
        applicationIdentifier = "com.apple.xgrid.cli";
        inputFiles = {};
        name = "Sleepy Calendar";
        submissionIdentifier = abc;
        taskSpecifications = {
            0 = {arguments = (3, 2005); command = "/usr/bin/cal";};
            1 = {arguments = (20); command = "/bin/sleep";};
            "hi" = {arguments = ("Hi Mom!"); command = "/bin/echo";};
            };
    };
}

The three tasks are:

  1. The cal command from earlier, which prints a calendar for March 2005.
  2. A sleep command that simply waits 20 seconds before exiting.
  3. An echo command that prints "Hi Mom!" and exits.

Each command is provided in the dictionary corresponding to the key taskSpecifications. Each task has an arbitrary identifier; you can use simple indexes like 0 and 1, or more meaningful keys like the string "hi"; it's up to you. The attributes of each task are given in a dictionary, which includes entries for the command itself, and the arguments that should be passed to it.

To run the batch job, you simply use the -job option with the batch keyword.


drew pbook: xgrid -job batch batch.plist
{jobIdentifier = 10; }

Now would be a good time to take a look at the Xgrid Admin tool; you should be able to see the CPU power meter in the Overview tab spring into action, and the new Sleepy Calendar job appear in the table view of the Jobs tab (no relation to everyone's favorite iCEO). A progress bar will tell you when all tasks are complete, which shouldn't take much longer than 20 seconds.

Retrieving the results of a batch job works just as it does for nonbatch jobs:


drew pbook: xgrid -job results -id 10
     March 2005
 S  M Tu  W Th  F  S
       1  2  3  4  5
 6  7  8  9 10 11 12
13 14 15 16 17 18 19
20 21 22 23 24 25 26
27 28 29 30 31

Hi Mom!

Results are returned for each task in order, beginning with the output of the cal command, and ending with echo's greeting.

A footnote on the format of the property list: you can also use XML property lists, but they tend to be more verbose. You can also easily convert a property list to XML format using the plutil command. To convert the batch.plist file, for example, you simply issue the following command:


drew pbook: plutil -convert xml1 batch.plist

This replaces the contents of the batch.plist file with the XML below, which contains exactly the same information.


<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" 
    "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
        <key>jobSpecification</key>
        <dict>
                <key>applicationIdentifier</key>
                <string>com.apple.xgrid.cli</string>
                <key>inputFiles</key>
                <dict/>
                <key>name</key>
                <string>Sleepy Calendar</string>
                <key>submissionIdentifier</key>
                <string>abc</string>
                <key>taskSpecifications</key>
                <dict>
                        <key>0</key>
                        <dict>
                                <key>arguments</key>
                                <array>
                                        <string>3</string>
                                        <string>2005</string>
                                </array>
                                <key>command</key>
                                <string>/usr/bin/cal</string>
                        </dict>
                        <key>1</key>
                        <dict>
                                <key>arguments</key>
                                <array>
                                        <string>20</string>
                                </array>
                                <key>command</key>
                                <string>/bin/sleep</string>
                        </dict>
                        <key>hi</key>
                        <dict>
                                <key>arguments</key>
                                <array>
                                        <string>Hi Mom!</string>
                                </array>
                                <key>command</key>
                                <string>/bin/echo</string>
                        </dict>
                </dict>
        </dict>
</dict>
</plist>
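
One advantage of the XML format is that it is easy to generate from a script. As a sketch, here is how the same three-task job could be assembled with Python's standard plistlib module; the make_task helper is hypothetical, and the keys are simply those from the job specification above. Note that plistlib sorts dictionary keys, and its DOCTYPE line may differ slightly from the one plutil writes.

```python
import plistlib


def make_task(command, *arguments):
    """Build one task dictionary; arguments are stored as strings."""
    return {"command": command, "arguments": [str(a) for a in arguments]}


job = {
    "jobSpecification": {
        "applicationIdentifier": "com.apple.xgrid.cli",
        "inputFiles": {},
        "name": "Sleepy Calendar",
        "submissionIdentifier": "abc",
        "taskSpecifications": {
            "0": make_task("/usr/bin/cal", 3, 2005),
            "1": make_task("/bin/sleep", 20),
            "hi": make_task("/bin/echo", "Hi Mom!"),
        },
    }
}

# Serialize to an XML property list held in memory.
xml_bytes = plistlib.dumps(job, fmt=plistlib.FMT_XML)
print(xml_bytes.decode("utf-8"))
```

Writing xml_bytes to a file opened in binary mode gives you something to pass to xgrid -job batch.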

Setting Up a Secure Xgrid

To this point we have largely ignored the issue of security. If this doesn't bother you, you can happily skip this section. But if it does worry you, it is possible to use password authentication with the Xgrid controller supplied in the client version of Tiger, although Apple hasn't gone out of its way to make it easy to configure. The problem is that there isn't any straightforward means of entering the controller's passwords.

A controller has two passwords: one for agents, and one for clients. These passwords need to be stored in a special format in the files /etc/xgrid/controller/agent-password and /etc/xgrid/controller/client-password, but you'll probably find these files are missing from your file system. Luckily, Brian Granger discovered a simple way to create these files, and posted his solution to the Xgrid mailing list, so what follows is thanks to Brian.

The controller password files may not exist on Mac OS X client, but the agent files do. You create them whenever you enter a password in the Xgrid configuration pane of System Preferences. So the trick to getting password files for the controller is simply to copy the ones created for the agent.

The first thing you have to do is set a password for your agent. You do this like so:

  1. Open the Sharing pane in System Preferences.
  2. Select the Services tab, and then the Xgrid row of the table view.
  3. Press the Configure... button.
  4. Select the Password authentication method from the pop-up button, and enter a password.
  5. Click the OK button.

There should be a file containing the agent password at the path /etc/xgrid/agent/controller-password. The file is called controller-password because it is the password used to authenticate the agent to a controller.

Now that you have a correctly formatted password file, you can simply copy it to the correct path for the controller passwords, like so:


sudo cp /etc/xgrid/agent/controller-password \
/etc/xgrid/controller/agent-password
sudo cp /etc/xgrid/agent/controller-password \
/etc/xgrid/controller/client-password

(Note that you have to be an administrator to do this; you will be prompted for your account password.) The controller now has one password file to authenticate agents, and one to authenticate clients. In this case, the passwords are all the same, because we simply copied the same file for each. If you want different passwords, you could change the agent password in System Preferences multiple times, copying the password file after each change, but for most purposes a single password will suffice.

At this point you have to edit the property list of the controller so that it uses password authentication. Open the file /Library/Preferences/com.apple.xgrid.controller.plist, and change the entries for the AgentAuthentication and ClientAuthentication keys to Password, rather than None. It should end up looking something like this:


<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" 
    "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
        <key>AgentAuthentication</key>
        <string>Password</string>
        <key>ClientAuthentication</key>
        <string>Password</string>
</dict>
</plist>
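
If you would rather not edit the file by hand, the same change can be made with a short script. The Python below is a sketch rather than an official procedure: the function name enable_password_auth is hypothetical, and it assumes the file is a property list that the standard plistlib module can read. It works on the file's contents in memory, leaving any other keys untouched.

```python
import plistlib


def enable_password_auth(plist_bytes):
    """Return the controller plist with both authentication keys set to Password."""
    settings = plistlib.loads(plist_bytes)
    settings["AgentAuthentication"] = "Password"
    settings["ClientAuthentication"] = "Password"
    return plistlib.dumps(settings, fmt=plistlib.FMT_XML)


# Example with an in-memory plist that starts out with no authentication.
original = plistlib.dumps(
    {"AgentAuthentication": "None", "ClientAuthentication": "None"}
)
print(enable_password_auth(original).decode("utf-8"))
```

To apply it for real, you would read /Library/Preferences/com.apple.xgrid.controller.plist in binary mode, pass the contents through the function, and write the result back with administrator privileges.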

With all the configuration complete, all you have to do is restart the agent and controller. For the agent, you can either open the System Preferences Sharing pane and stop and start the Xgrid agent, or you can enter the following command:


drew pbook: sudo xgridctl a restart

For the controller, you can enter this:


drew pbook: sudo xgridctl c restart

To test that everything went according to plan, try submitting one of the xgrid jobs above, using the -p option to provide the controller password.
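
As a rough illustration of what such a command line might look like, the Python sketch below assembles an xgrid invocation with the -h and -p options mentioned in this article. The helper name xgrid_command, the placement of the options before -job, and the localhost and secret values are all assumptions for the sake of the example; check man xgrid for the authoritative syntax. Bear in mind that a password given on the command line can end up in your shell history.

```python
def xgrid_command(job_args, hostname=None, password=None):
    """Build an xgrid argument list with optional controller and password."""
    argv = ["xgrid"]
    if hostname is not None:
        argv += ["-h", hostname]  # controller to contact
    if password is not None:
        argv += ["-p", password]  # controller's client password
    return argv + list(job_args)


print(" ".join(xgrid_command(["-job", "submit", "/bin/hostname"],
                             hostname="localhost", password="secret")))
```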

If you find all of this tiresome, you might like to try Ed Baskerville's XgridLite tool, which was mentioned earlier. It can do it all at the touch of a button.

Until Next Time...

That completes our introduction to setting up Xgrid, and using the command-line xgrid tool. Next time, we will leave Terminal and take a look at the Cocoa API to Xgrid. To spice things up a bit, we will draw on a couple of other Tiger technologies--Automator and Core Image--to build an Xgrid-enabled, image-processing tool. See you then.

Drew McCormack works at the Free University in Amsterdam, and develops the Cocoa shareware Trade Strategist.



Copyright © 2009 O'Reilly Media, Inc.