Published on MacDevCenter (http://www.macdevcenter.com/)


Integrating Xgrid into Cocoa Applications, Part 1

by Drew McCormack
05/11/2004

Editor's note: This is the first of two articles exploring Xgrid. Today, Drew McCormack provides you with a little background information, then moves to installation, and finishes off with a command-line script for distributing compilation using Xgrid.

"Hey Rocky, watch me pull a rabbit from this hat." Bullwinkle, pulling a lion from his hat.

Magic?

Joe was sitting in coach on a redeye out of New York, and was having trouble sleeping. He decided to work on a home movie he had been editing, pulled out his old iBook, and charged up iMovie.

"Hmm, I really need to spruce this up with a few transitions," he thought, and set things in motion.

Then he waited... and waited... and waited.

"Boy, these video renders are hard work for such an old laptop," he thought, and prepared for a long, boring slog.

Suddenly, as if to answer his prayers, iMovie sprang into life. The progress bars were moving at what seemed like the speed of light, and his renders were complete in no time.

What was going on? Had his iBook found a few more MHz lying around? Was he that tired? Or was it just magic?

Meanwhile...

Steve had just given a keynote at the newly reinstated New York Macworld Expo, and was heading home. The jet was in the shop, so he was riding first class with a commercial carrier.

The keynote had gone well. He had unveiled the long-awaited dual processor PowerBook G5, and brought the house down. To reward himself, he had decided to take the demo laptop with him, and nobody at the show was going to say otherwise, not Phil, not anybody.

"Better check my email," he thought, opening the G5 laptop, and connecting to the onboard satellite network. And with this one simple action, Steve inadvertently helped render the transitions in Joe's home movie...

The Democratization of Distributed Computing

At one time you would have been burned at the stake for such witchery, but these days any five-year-old could tell you there is no magic. It's all down to AirPort, Rendezvous, and the latest piece in the puzzle: Xgrid.

Xgrid was unveiled by Apple at the last Macworld to almost negligible fanfare. Admittedly it is still in an early stage of development, but the silence from Apple was almost deafening for a technology that may well prove to be one of the most significant in years. Yes, you read right, Xgrid may well be as big a revolution as the iMac was for home computing, and iMovie was for home movies.

Xgrid is software for distributing computation. A computer with Xgrid installed can send computational tasks to other Xgrid-enabled computers, and receive the results back upon completion.

Xgrid's roots are in a project at NeXT called Zilla. Zilla was the first desktop program to offer spontaneous clustering of idle computer resources. These days everyone knows about SETI@home and Folding@Home, but Zilla was the first. Apple acquired the technology from NeXT, and it has finally led to Xgrid.

The last few years have seen numerous applications gain a distributed mode of operation, from Maya to Shake, and now even Xcode. In this sense, Xgrid is not a new technology, but there is a difference: Each of these distributed applications must first be installed on each computer in the cluster. The computing nodes are static, only changing upon human intervention.

Xgrid also differs from applications like SETI@home, because, although these applications do form networks dynamically, they carry out a single task, and are not capable of arbitrary computation. The user of a computer running SETI@home cannot take advantage of the SETI@home network itself. The only application that runs on the SETI network is SETI@home.

Xgrid facilitates the formation of spontaneous clusters (that is, zero configuration) that can perform arbitrary computations. A user can submit any computational task they desire to Xgrid, and it will be run on any Xgrid computers configured to accept it.

As with so many things Apple develops, the idea is far from new. Other grid systems, such as Globus, Condor, and Sun Grid, have been around for some time, and are actually much more advanced than Xgrid. So why is Xgrid important?

It all comes down to the Apple touch. Take something that already exists, like an MP3 player, see how it could be useful to real people, and then produce an iPod. Take a good idea from some geeks in a research lab, like a Graphical User Interface, realize its potential for real people, and produce a Macintosh. Apple has always been the computer company with vision, and it is about to realize the potential of grid computing for real people.

Apple has taken something that is typically very complex — as you would know if you had ever tried to get any other grid software up and running — and made it excruciatingly simple to install and use. At the same time, they are beginning to blur the definition of what constitutes a single computer. In the future, computers will seek each other out, sharing the computational load, distributing work to where resources are idle. If an application can perform tasks on any networked Mac in the world, where does one computer end, and the next begin?

We are witnessing the beginning of the Democratization of Distributed Computing, and it didn't even warrant a mention in January's keynote. But don't worry, it will come. Apple knows what it has. If Mac OS X 10.4 doesn't have Xgrid built in by default, ready to blur the edges of personal computing, I will swim ten laps around Manhattan Island with only an Xserve cluster for buoyancy.

Yeah, Yeah, But What's This Article About?

OK, enough gushing. Let's get down to some technical stuff. This is the first of two articles about Xgrid for Cocoa developers. Apple has concentrated its efforts to date promoting Xgrid as a simple batch queuing system for scientists, but we know it is much more than that. Unless I am gravely mistaken, once Xgrid has stabilized somewhat, Apple will publish a Cocoa API, and start pushing the technology to non-scientific developers. In the meantime, we can already get a feel for how Xgrid could integrate into Cocoa apps by utilizing the Xgrid command line interface, made available in Preview 2.

In this, the first of our two articles, I will give you the lowdown on how Xgrid works. This introduction will be targeted at developers, not scientists, so I won't be beating around the bush. With formal introductions out of the way, I will quickly cover installation of Xgrid, and finish off with a command-line script for distributing compilation using Xgrid. Yes, we are going to build a distributed compiler, like the one in Xcode, with just Xgrid, and less than 100 lines of shell script.

In the second article, we will move on to utilizing Xgrid from within a Cocoa application, without requiring the Xgrid Client application to be running. To do this, we will wrap the command-line interface in Cocoa code. To demonstrate, I will build a simple, distributed Cocoa image-processing application, using the command-line tool ImageMagick.

Embarrassing

Here's a geek joke for you.

Q. Where does Xgrid fit into the spectrum of parallel computation?

A. At the embarrassing end.

Grid computing is most suited to calculations that are loosely coupled, or embarrassingly parallel. If you imagine Symmetric Multi-Processing (SMP) systems, like a dual processor PowerMac G5, being at one end — the tightly coupled one — grid computing is at the other. Whereas the processors in an SMP share the same memory and disk space, and can very simply communicate with one another at a very high rate, the processors in a grid don't share any memory, or disk space, and communication between them is very slow. Basically, the tasks that should be performed on a grid should be independent of one another.

If you are interested in reading more about multithreaded programming for SMP machines, check out my previous Mac DevCenter article here.

Quickstart Xgrid

So, how does Xgrid work? Tasks are divided between three different roles: Controller, Agent, and Client. The Controller is the traffic cop of Xgrid, managing when jobs are run, and where they are run, accepting new submissions, and ensuring output ends up where it should. Agents are responsible for actually performing the computations. The Controller sends work to one or more Agents as it is submitted and resources become available. The Client, as you have no doubt guessed by now, is the guy that submits the jobs, and receives the output upon completion.

These three roles can all be played by the same computer, if that is your wish. (In fact, that is a good way to test your Xgrid applications.) Typically though, you would have a single Controller on your LAN, with all other desktop machines playing the part of Agents and Clients.

The flow of events required to complete a job is something like the following: A Client computer submits a job, along with any necessary resources, such as input files and executables, to the Controller. This is usually achieved with the Xgrid Client application, a GUI interface, but since Preview 2, it has also been possible to use the command-line Client command, xgrid.

The Controller, having received the submission, examines the available Agents, and sends the job to one of them. The Agent runs the job in the directory /tmp, as user nobody; nobody is only allowed to write in /tmp, reducing the security risk. When the job is completed, or fails, the output is returned to the Controller, along with any files requested by the Client when the job was submitted. The Controller then returns the output to the Client, or at least informs it that the job is complete and how the output can be reached.

All of this generally happens transparently. The Agents go looking for the Controller via Rendezvous, so there is no configuration needed. Controllers are also located by the Client application via Rendezvous, so, as a user, it is simply a question of picking one of the available Controllers and firing away.

This has been perhaps the fastest introduction to Xgrid in its short history. If you want or need more, there are several resources you can seek out. The first is the Xgrid Guide, which can be found in /Library/Xgrid/Documentation after you have downloaded and installed Xgrid. Another is the online article Xgrid: High-Performance Computing for the Rest of Us. This provides a high-level introduction to Xgrid, not getting into too many of the practicalities. A good hands-on series of articles, written by Daniel Cote, is available here.

Installing Xgrid

Unlike most grid computing architectures, Xgrid is a breeze to install. You can download Xgrid Preview 2 here. Once you have it on your desktop, mount the disk image, and double click the Xgrid.pkg package.

This package installs a whole range of goodies. The first is a System Preferences pane. Open the System Preferences, and click on the new Xgrid pane. Here you will find configuration options for two of the three different roles discussed above: Controller and Agent. Click the Agent tab, and press the Start button. Your computer can now act as an Agent, connecting to a Controller, and receiving work whenever it is idle.

The Xgrid System Preferences pane

While we are here, click the Always radio button under the section "Choose when the agent may accept tasks." By default, the Xgrid Agent will only accept tasks when it has been idle for 15 minutes, or the screensaver is running. By selecting to always accept tasks, the Agent can run tasks at any time. Note that Xgrid tasks are always run at the lowest priority, so even if you are doing something else on the computer while the Agent is running a job, you probably won't notice much of a slowdown.

Now click the Controller tab, and press the Start button. This starts a Controller running on your computer, which any Agents and Clients on your LAN may attempt to connect to.

There are a number of ways of restricting access to the various roles played by your computer in Xgrid. If you want to prevent uninvited Clients from accessing your Controller, you can set a password under the Controller Security tab. Any Client application will then need to provide this password to make use of your Controller.

You can also restrict access to your Xgrid Agent by requiring that any Controller supply a password to it. You set this password under the Agent Security tab. To enter the password used by your Controller to connect to Agents, go to the Controller tab.

Note that although a Controller can be required to authenticate to an Agent, it is not yet possible to require that an Agent authenticate before accessing a Controller. In theory, any Agent can contact your Controller, and ask for jobs to run. This is a potential security risk, because the Agent computer, which could belong to anyone, will receive a copy of any files and software needed to run a job.

There are lots more goodies included in the Xgrid install, including documentation for Xgrid users (/Library/Xgrid/Documentation), and details on how you write plugins for the Xgrid Client application (/Library/Xgrid/Developer). This is all interesting stuff, but we haven't room to cover it here. We are more interested in the less conventional uses of Xgrid.

The Xgrid Application Client

The last part of the puzzle is the Xgrid Clients. There are now two: a Cocoa app unusually christened Xgrid, and the command-line Client, going by the imaginative title of xgrid. We will spend most of our time working with the command-line tool, but let's take a quick look at the GUI to test our installation, and see how things work in practice.

Go to the /Applications directory, and open the Xgrid app. You should be presented with a choice of available Controllers. Probably only your local Controller will be visible. You can either choose to connect to the local Controller, or you can click the button Start Local Service, which can be used for testing purposes.

Having connected to a Controller, you should see the New Job window. Here you can see the available clusters. Double click the cluster called "Rendezvous." You should see which Agents are connected to the Controller. You will probably see only your own computer here, and hopefully it has the status 'Available'.

Returning to the New Job window, you can see a number of job types represented in the column on the right. Each of these is an Xgrid plugin, which can be made to submit different types of jobs. You can write and install these plugins yourself by following the documentation described in /Library/Xgrid/Developer.

We are going to quickly run the Mandelbrot plugin. Double click on it. If the calculation doesn't start, click the Start button. This plugin calculates the Mandelbrot set, which you should see begin to appear. You should also see a Tachometer window, which indicates how many MHz are being applied to the job across the whole Xgrid. If you are just using one computer, the reading should be about the same as your CPU speed. If you install Xgrid on another computer, and connect both Agents to the Controller, the tachometer should give a value equal to the sum of the available CPU speeds.

The Mandelbrot Xgrid plugin, with Tachometer

The Command-Line Client

We are now going to take a detailed look at the command-line Client, because it will teach us more about how Xgrid works, and also because it is currently the best way to make use of Xgrid from a standalone Cocoa application.

The command-line Client is accessed through the xgrid command. The best place to read about it is in the man page. Simply open a terminal window, and type

man xgrid

In submitting any job to Xgrid, the Client has to provide a number of things to the Controller. The first is the executable to be run. More often than not, this will be a shell script, but it could also be a Cocoa tool, for example. Along with the executable, the Client has the option of supplying a standard input file and/or a directory of files. This directory will be copied by the Controller to the Agent running the job, and becomes the working directory on the Agent machine (that is, the directory where the executable will run). You can supply anything in the working directory, from input files to libraries and executables.

After a job completes, the standard output and error streams are returned to the Controller. The Client can choose to have these streams piped to files, or they can simply be directed to the output/error streams of the shell that submitted the job. Any other files created in the working directory on the Agent can also be retrieved, if requested by the Client. These are returned to an output directory indicated when the job is submitted.
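The life cycle just described can be imitated on a single machine without Xgrid at all. The following is only a rough sketch of the idea, not how Xgrid itself is implemented: every directory and file name in it is made up for illustration, and plain cp stands in for the transfers between Client, Controller, and Agent.

```shell
#!/bin/sh
# Local stand-in for the job life cycle; no Xgrid involved.
# All directory and file names here are made up for illustration.
base=$(mktemp -d "${TMPDIR:-/tmp}/xgrid_sim_XXXXXX")
inputdir=$base/input      # what the Client supplies
workdir=$base/agent_work  # stands in for the Agent's working directory
outputdir=$base/output    # where the Client wants results returned
mkdir -p "$inputdir" "$outputdir"

# The Client prepares an executable and an input file.
cat > "$inputdir/run" <<'eor'
#!/bin/sh
wc -l < data.txt > count.txt
echo "job finished"
eor
chmod +x "$inputdir/run"
printf 'a\nb\nc\n' > "$inputdir/data.txt"

# "Controller to Agent": the input directory becomes the working directory.
cp -R "$inputdir" "$workdir"

# The Agent runs the executable there; stdout and stderr are captured.
( cd "$workdir" && ./run ) > "$outputdir/stdout.txt" 2> "$outputdir/stderr.txt"

# Files created in the working directory are returned to the Client.
cp "$workdir/count.txt" "$outputdir/"

job_stdout=$(cat "$outputdir/stdout.txt")
line_count=$(tr -d ' ' < "$outputdir/count.txt")
echo "stdout: $job_stdout, lines counted: $line_count"
rm -rf "$base"
```

The point to take away is that the executable only ever sees its working directory, so anything it needs has to travel in the input directory.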

Example: Distributed Builds with Xgrid

Let's see a concrete example. We are going to write a simple script that will perform distributed builds, like Xcode. A nice thing about using Xgrid for this is that it is not necessary to install our application — in this case the gcc compiler — on each computer in the cluster. Instead, if we choose, we can send our executables along with each job. This is an important distinction between grid applications, and other distributed applications like Xcode: the latter need to be preinstalled on every system.

You can download the script described here, along with some Objective-C code to practice on, by clicking here. You can also practice with your own source code if you like, of course. The source code provided here is taken from an open source Cocoa plotting framework I developed called 'Narrative'. I use this framework in the Cocoa application Trade Strategist, which is Technical Analysis software for the stock market. You can download the full source of Narrative here.

The first lines of the script look like this:


#!/bin/sh

#-------------
# Set filenames variable to be everything on the
# command line.
#-------------
filenames=$*

The shebang indicates that this script will be run in a Bourne shell (which is actually bash on Mac OS X). The variable filenames is set on the next executable line. In a Bourne shell, $* expands to all of the arguments that appear on the command line after the script name. The script thus expects that the names of the files to be compiled will be given on the command line.
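One caveat is worth knowing about: a list built from $* and later expanded without quotes is split at every space, so a file name that contains a space turns into two words. The sketch below shows the effect next to the quoted "$@" form. The helper count_args and the file names are hypothetical, chosen only for this illustration.

```shell
#!/bin/sh
# count_args is a hypothetical helper: it prints how many arguments it got.
count_args() {
  echo $#
}

# Simulate a command line with two files, one with a space in its name.
set -- "Plot View.m" "Axis.m"

filenames=$*
split_count=$(count_args $filenames)   # unquoted: word splitting applies
exact_count=$(count_args "$@")         # quoted "$@": one word per argument

echo "via \$*: $split_count words"
echo "via \"\$@\": $exact_count words"
```

For source files named in the usual Cocoa style, without spaces, the simpler $* form used in the script works fine.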

The next lines set the locations of a number of temporary directories, which will be used to prepare our jobs and retrieve output.


#-------------
# Set path to output directory, and the submit directory.
# Submit directory is the current directory.
# Output directory is in /tmp. Create it here.
#-------------
outputdir=/tmp/xgridcc_outdir_$$
mkdir -p $outputdir
submitdir=`pwd`

The variable outputdir holds the path to the directory where we want xgrid to return the output files. We put this in /tmp so as to minimize the risk of damaging useful data. Another Bourne shell variable is used in forming the directory name: $$ expands to the process ID (PID) of the running shell. Using this helps avoid naming conflicts with other processes. After the output directory path has been set, the directory is created with mkdir. The next line sets the submitdir variable to the path of the current directory. The pwd command writes the path of the current directory, and the back quotes indicate that the output of the command should be inserted in place of the command itself.
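A PID-based name is unique among the processes running at a given moment, which is all this script needs. As an aside, and not something the script itself does, the mktemp command can pick an unused name for you. A small sketch comparing the two, with ${TMPDIR:-/tmp} assumed as the scratch location:

```shell
#!/bin/sh
scratch=${TMPDIR:-/tmp}

# $$ expands to the PID of the running shell.
pid_dir=$scratch/xgridcc_outdir_$$
mkdir -p "$pid_dir"

# Alternative (not in the original script): let mktemp choose an unused name.
tmp_dir=$(mktemp -d "$scratch/xgridcc_outdir_XXXXXX")

echo "PID-based: $pid_dir"
echo "mktemp:    $tmp_dir"

# Record that both directories exist, then remove them again.
pid_ok=no; [ -d "$pid_dir" ] && pid_ok=yes
tmp_ok=no; [ -d "$tmp_dir" ] && tmp_ok=yes
rm -rf "$pid_dir" "$tmp_dir"
```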

Now we begin a loop over the files to be compiled:


#-------------
# Loop over files. Create one job per file.
#-------------
for filename in $filenames
do

  #-------------
  # Change to submission directory
  #-------------
  cd "$submitdir"

  #-------------
  # Setup variables for input directory, and
  # job file.
  #-------------
  inputdir=/tmp/xgridcc_inpdir_$$_$filename
  xgridscript=$inputdir/run
  mkdir -p $inputdir

At the beginning of the loop, we make sure that we're in the directory where the job was originally submitted. Then we set some more variables: one for the input directory, which becomes the working directory on the Agent, and one for the path to the script that will be passed to the xgrid command. Each iteration of the loop will submit a single job to the Xgrid, with a script to compile a single file. Each submission needs its own input directory, so that is why it's created here, rather than outside the loop.

Now that we have an input directory, we need to populate it with the files needed to perform a compilation.


  #-------------
  # Copy gcc to input directory.
  #-------------
  cp /usr/bin/gcc $inputdir/gcc

  #-------------
  # Copy source file to the input directory
  # Copy all headers to input directory
  #-------------
  cp $filename $inputdir
  cp *.h $inputdir

We could install gcc on every computer, and make use of the installed copy in our job, but to demonstrate a more typical use, we supply our own version of gcc in the input directory. The file to be compiled ($filename), and all header files, are also copied to the input directory. In a more sophisticated build system, we would copy only the headers needed to compile the file, but we will sacrifice efficiency for simplicity here.
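For the curious, here is one way the "only the headers needed" idea might be approached. This is a sketch under simplifying assumptions, not part of xgridcc: the helper local_headers is hypothetical, it looks only at the #import "..." and #include "..." lines of the file itself, and it does not follow headers that import other headers. The file names are made up.

```shell
#!/bin/sh
# local_headers is a hypothetical helper: it lists the headers a source file
# pulls in with #import "..." or #include "...". It looks one level deep only
# and ignores <...> system headers.
local_headers() {
  sed -n \
    -e 's/^[[:space:]]*#[[:space:]]*import[[:space:]]*"\([^"]*\)".*/\1/p' \
    -e 's/^[[:space:]]*#[[:space:]]*include[[:space:]]*"\([^"]*\)".*/\1/p' \
    "$1" | sort -u
}

# A made-up source file to try it on.
demo=$(mktemp -d "${TMPDIR:-/tmp}/hdrdemo_XXXXXX")
cat > "$demo/Plot.m" <<'eor'
#import <Cocoa/Cocoa.h>
#import "Plot.h"
#import "Axis.h"
#include "Axis.h"
eor

needed=$(local_headers "$demo/Plot.m")
header_count=$(printf '%s\n' "$needed" | wc -l | tr -d ' ')
echo "headers to copy ($header_count):" $needed
rm -rf "$demo"
```

Inside the loop, the resulting list could replace the *.h in the cp command, at the cost of missing any header that is only reached indirectly.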

You will recall that above we set a variable for the path to a script file that will be the executable run by Xgrid on the Agent machine. We have a path, but still need to create the file itself. Here is how:


  #-------------
  # Create the script that will be submitted for this file
  # to xgrid.
  # Make it executable.
  #-------------
  cat <<eor > $xgridscript
#!/bin/sh
./gcc -c -O5 $filename
eor

  chmod +x $xgridscript

We use cat to copy two lines to its output stream, which is redirected to the path we created earlier ($xgridscript). The special operator << indicates that the input stream should be read from the lines that follow. Using <<eor, the input stream is read from the script file up until the next line that contains only eor. (Note that there is nothing special about the string 'eor'; you can use any string you like.) Because the delimiter is not quoted, the shell substitutes the value of $filename as the lines are written, so the generated script contains the actual file name.

The script that we create is simplicity incarnate. It just compiles our file with the gcc compiler in the working directory. Remember that this script will be run on the Agent machine, in the working directory. The working directory contains the version of gcc that we copied earlier, and we want to use this copy, not the copy in /usr/bin. The -O5 flag requests high optimization (gcc defines no levels beyond -O3, so in practice it should behave like -O3), and is just designed in this case to make the compilation last a bit longer. After the script has been created, it is made executable using the chmod command.
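Whether variables inside a here-document are expanded depends on how the delimiter is written. The sketch below writes the same two-line body twice, once with the unquoted delimiter used in xgridcc and once with a quoted one, so the difference is visible. The file name Plot.m and the scratch directory are made up for the illustration.

```shell
#!/bin/sh
filename=Plot.m   # made-up file name
dir=$(mktemp -d "${TMPDIR:-/tmp}/heredoc_XXXXXX")

# Unquoted delimiter: $filename is expanded while the script is written.
cat <<eor > "$dir/run_expanded"
./gcc -c -O5 $filename
eor

# Quoted delimiter: the text is copied literally, so $filename would only be
# looked up when the generated script runs, where it is not set.
cat <<'eor' > "$dir/run_literal"
./gcc -c -O5 $filename
eor

expanded=$(cat "$dir/run_expanded")
literal=$(cat "$dir/run_literal")
echo "unquoted: $expanded"
echo "quoted:   $literal"
rm -rf "$dir"
```

The unquoted form is the right choice here, because the Agent has no way of knowing what $filename was on the Client.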

After all of that, we are finally ready to submit the job.


  #-------------
  # Submit job to xgrid
  #-------------
  cd "$inputdir"
  xgrid -h localhost -job run -in "$inputdir" -out "$outputdir" run &

done

We first change to the input directory, where the 'run' script is located, and then use the xgrid command to run the job. The -h option indicates the Controller machine we are submitting to. In this case, it is simply localhost. The next option is -job run, which indicates we are running a job synchronously, that is, the xgrid command will not return until the job has completed. The -in option gives the path of the input directory, and -out, the output directory. Lastly, the name of the executable script is supplied, and the whole command is run in the background using the &. If we didn't do this, the script would wait until one job was complete before submitting another. Our intention is to submit all jobs simultaneously.

The end is in sight.


#-------------
# Wait until all subprocesses are finished.
#-------------
wait

#-------------
# Move output files back to submit directory.
#-------------
mv $outputdir/*/*.o "$submitdir"

#-------------
# Remove temporary directories
#-------------
for filename in $filenames
do
  inputdir=/tmp/xgridcc_inpdir_$$_$filename
  rm -rf "$inputdir"
done
rm -rf "$outputdir"

The wait command causes the script to block until all subprocesses have completed. After all jobs have finished, we move all object files from the output directory, back to the submit directory. Lastly, another loop over file names is used to delete the input directories, and the output directory is also deleted.
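The combination of & inside the loop and a single wait after it can be tried out without an Xgrid installation. In this sketch the function fake_job is a hypothetical stand-in for the blocking xgrid call: it sleeps for a second and then leaves an empty object file behind. The file names are made up.

```shell
#!/bin/sh
outdir=$(mktemp -d "${TMPDIR:-/tmp}/bgdemo_XXXXXX")

# fake_job is a hypothetical stand-in for the blocking xgrid call.
fake_job() {
  sleep 1
  : > "$outdir/${1%.m}.o"
}

for f in Plot.m Axis.m Legend.m
do
  fake_job "$f" &     # & runs the call in the background; the loop continues
done

# Most likely still 0 here, because the jobs have only just started.
before_wait=$(ls "$outdir" | wc -l | tr -d ' ')
wait                  # block until every background job has finished
after_wait=$(ls "$outdir" | wc -l | tr -d ' ')

echo "object files before wait: $before_wait, after wait: $after_wait"
rm -rf "$outdir"
```

Because the three jobs overlap, the whole run takes roughly as long as one job rather than three, which is exactly the effect xgridcc is after.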

The only thing left to do is try it. In a directory full of source files, and a copy of the script xgridcc, issue the following command:


time xgridcc *.m

This will submit the jobs, and provide timings to boot.

My tests with xgridcc show that grid computing is more difficult than falling off a log, at least if you want to garner some advantage from it. If I compile Narrative on my 600MHz iBook, without using Xgrid, it takes 1 minute, 43 seconds. Using Xgrid with a single Agent it takes about 1.7 times as long: 2 minutes, 57 seconds. So, the overhead of Xgrid, which includes communication, copying files on the Client, and transferring files between Client, Controller, and Agent, is significant. When I added a 400MHz iMac to my iBook, it took 2 minutes, 2 seconds. Even with two computers, xgridcc was still slower than gcc.
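To put those three timings on a common footing, the arithmetic can be done in a few lines of shell. The helper names to_seconds and ratio are made up for this sketch; the numbers are the ones quoted above.

```shell
#!/bin/sh
# to_seconds is a hypothetical helper: minutes and seconds to seconds.
to_seconds() {
  echo $(( $1 * 60 + $2 ))
}

# ratio prints a/b to two decimals; awk does the floating-point division.
ratio() {
  LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f", a / b }'
}

local_gcc=$(to_seconds 1 43)    # plain gcc on the iBook
one_agent=$(to_seconds 2 57)    # Xgrid, iBook as the only Agent
two_agents=$(to_seconds 2 2)    # Xgrid, iBook plus iMac

slow_one=$(ratio "$one_agent" "$local_gcc")
slow_two=$(ratio "$two_agents" "$local_gcc")
echo "one Agent:  ${one_agent}s, $slow_one times the local build"
echo "two Agents: ${two_agents}s, $slow_two times the local build"
```

Adding the second machine cut the Xgrid time from 177 to 122 seconds, so the extra Agent does help; it just doesn't yet make up for the fixed cost of shipping files around.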

The results are not that impressive, but don't lose heart. We have not made much effort to optimize xgridcc. We could improve it in any number of ways. For one, we could submit multiple files in each job. This would raise the ratio of computation to communication, and reduce the copying and transfer of data. Another improvement would simply be to use something faster, or more plentiful, than a single 400MHz iMac. At my place of work, I have a 466MHz G4, but a colleague has just received a dual 1.8GHz G5. I can use Xgrid to submit jobs to the G5, without having to log on. For me, the performance gain of Xgrid far outweighs the overhead.
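As a starting point for the "multiple files in each job" idea, the file list can be cut into groups before any job is submitted. The sketch below covers only that grouping step; the helper make_batches and the file names are hypothetical, and turning each group into an input directory and a run script is left as in the loop shown earlier.

```shell
#!/bin/sh
# make_batches is a hypothetical helper: it prints the given file names in
# groups of at most $1 per line; each line would become one Xgrid job.
make_batches() {
  size=$1
  shift
  batch=
  count=0
  for f in "$@"
  do
    batch="$batch${batch:+ }$f"
    count=$((count + 1))
    if [ "$count" -eq "$size" ]
    then
      echo "$batch"
      batch=
      count=0
    fi
  done
  # Print the final, partly filled batch if there is one.
  if [ -n "$batch" ]; then echo "$batch"; fi
}

batches=$(make_batches 2 A.m B.m C.m D.m E.m)
echo "$batches"
```

With a batch size of two, five files become three jobs instead of five, so gcc and the headers are copied three times rather than five.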

Next Time...

That's it for now. In part two we are going to start integrating Xgrid into a standalone Cocoa application. We will try something a little more exciting than compilation: batch image processing. You'll be able to apply effects to your whole iPhoto library in one hit, rather than laboriously going through them one at a time. Until then, think distributed.

Drew McCormack works at the Free University in Amsterdam, and develops the Cocoa shareware Trade Strategist.



Copyright © 2009 O'Reilly Media, Inc.