Integrating Xgrid into Cocoa Applications, Part 1
by Drew McCormack
Editor's note: This is the first of two articles exploring Xgrid. Today, Drew McCormack provides you with a little background information, then moves to installation, and finishes off with a command-line script for distributing compilation using Xgrid.
"Hey Rocky, watch me pull a rabbit from this hat." Bullwinkle, pulling a lion from his hat.
Joe was sitting in coach on a redeye out of New York, and was having trouble sleeping. He decided to work on a home movie he had been editing, pulled out his old iBook, and charged up iMovie.
"Hmm, I really need to spruce this up with a few transitions," he thought, and set things in motion.
Then he waited... and waited... and waited.
"Boy, these video renders are hard work for such an old laptop," he thought, and prepared for a long, boring slog.
Suddenly, as if to answer his prayers, iMovie sprang to life. The progress bars were moving at what seemed like the speed of light, and his renders were complete in no time.
What was going on? Had his iBook found a few more MHz lying around? Was he that tired? Or was it just magic?
Steve had just given a keynote at the newly reinstated New York Macworld Expo, and was heading home. The jet was in the shop, so he was riding first class with a commercial carrier.
The keynote had gone well. He had unveiled the long-awaited dual processor PowerBook G5, and brought the house down. To reward himself, he had decided to take the demo laptop with him, and nobody at the show was going to say otherwise, not Phil, not anybody.
"Better check my email," he thought, opening the G5 laptop, and connecting to the onboard satellite network. And with this one simple action, Steve inadvertently helped render the transitions in Joe's home movie...
The Democratization of Distributed Computing
At one time you would have been burned at the stake for such witchery, but these days any five-year-old could tell you there is no magic. It's all down to AirPort, Rendezvous, and the latest piece in the puzzle: Xgrid.
Xgrid was unveiled by Apple at the last Macworld to almost negligible fanfare. Admittedly it is still in an early stage of development, but the silence from Apple was almost deafening for a technology that may well prove to be one of the most significant in years. Yes, you read right, Xgrid may well be as big a revolution as the iMac was for home computing, and iMovie was for home movies.
Xgrid is software for distributing computation. A computer with Xgrid installed can send computational tasks to other Xgrid-enabled computers, and receive the results back upon completion.
Xgrid's roots are in a project at NeXT called Zilla. Zilla was the first desktop program to offer spontaneous clustering of idle computer resources. These days everyone knows about SETI@home and Folding@Home, but Zilla was the first. Apple acquired the technology from NeXT, and it has finally led to Xgrid.
The last few years have seen numerous applications gain a distributed mode of operation, from Maya to Shake, and now even Xcode. In this sense, Xgrid is not a new technology, but there is a difference: Each of these distributed applications must first be installed on each computer in the cluster. The computing nodes are static, only changing upon human intervention.
Xgrid also differs from applications like SETI@home, because, although these applications do form networks dynamically, they carry out a single task, and are not capable of arbitrary computation. The user of a computer running SETI@home cannot take advantage of the SETI@home network itself. The only application that runs on the SETI network is SETI@home.
Xgrid facilitates the formation of spontaneous clusters (that is, zero configuration) that can perform arbitrary computations. A user can submit any computational task they desire to Xgrid, and it will be run on any Xgrid computers configured to accept it.
As with so many things Apple develops, the idea is far from new. Other grid systems, such as Globus, Condor, and Sun Grid, have been around for some time, and are actually much more advanced than Xgrid. So why is Xgrid important?
It all comes down to the Apple touch. Take something that already exists, like an MP3 player, see how it could be useful to real people, and then produce an iPod. Take a good idea from some geeks in a research lab, like a Graphical User Interface, realize its potential for real people, and produce a Macintosh. Apple has always been the computer company with vision, and it is about to realize the potential of grid computing for real people.
Apple has taken something that is typically very complex — as you would know if you had ever tried to get any other grid software up and running — and made it excruciatingly simple to install and use. At the same time, they are beginning to blur the definition of what constitutes a single computer. In the future, computers will seek each other out, sharing the computational load, distributing work to where resources are idle. If an application can perform tasks on any networked Mac in the world, where does one computer end, and the next begin?
We are witnessing the beginning of the Democratization of Distributed Computing, and it didn't even warrant a mention in January's keynote. But don't worry, it will come. Apple knows what it has. If Mac OS X 10.4 doesn't have Xgrid built in by default, ready to blur the edges of personal computing, I will swim ten laps around Manhattan Island with only an Xserve cluster for buoyancy.
Yeah, Yeah, But What's This Article About?
OK, enough gushing. Let's get down to some technical stuff. This is the first of two articles about Xgrid for Cocoa developers. Apple has concentrated its efforts to date promoting Xgrid as a simple batch queuing system for scientists, but we know it is much more than that. Unless I am gravely mistaken, once Xgrid has stabilized somewhat, Apple will publish a Cocoa API, and start pushing the technology to non-scientific developers. In the meantime, we can already get a feel for how Xgrid could integrate into Cocoa apps by utilizing the Xgrid command line interface, made available in Preview 2.
In this, the first of our two articles, I will give you the lowdown on how Xgrid works. This introduction will be targeted at developers, not scientists, so I won't be beating around the bush. With formal introductions out of the way, I will quickly cover installation of Xgrid, and finish off with a command-line script for distributing compilation using Xgrid. Yes, we are going to build a distributed compiler, like the one in Xcode, with just Xgrid, and fewer than 100 lines of shell script.
In the second article, we will move on to utilizing Xgrid from within a Cocoa application, without requiring the Xgrid Client application to be running. To do this, we will wrap the command-line interface in Cocoa code. To demonstrate, I will build a simple, distributed Cocoa image-processing application, using the command-line tool ImageMagick.
Here's a geek joke for you.
Q. Where does Xgrid fit into the spectrum of parallel computation?
A. At the embarrassing end.
Grid computing is most suited to calculations that are loosely coupled, or embarrassingly parallel. If you imagine Symmetric Multi-Processing (SMP) systems, like a dual processor PowerMac G5, being at one end — the tightly coupled one — grid computing is at the other. Whereas the processors in an SMP share the same memory and disk space, and can very simply communicate with one another at a very high rate, the processors in a grid don't share any memory, or disk space, and communication between them is very slow. Basically, the tasks that should be performed on a grid should be independent of one another.
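To make "independent of one another" concrete, here is a minimal Python sketch. It is not Xgrid code: it simply splits a job into chunks that share no state and never talk to each other, which is the property that makes a calculation a good fit for a grid. The names `split_into_tasks`, `run_task`, and `sum_of_squares` are hypothetical, and a local thread pool merely stands in for a set of remote Agents.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_tasks(numbers, chunk_size):
    """Cut the input into chunks that can be processed in any order."""
    return [numbers[i:i + chunk_size] for i in range(0, len(numbers), chunk_size)]


def run_task(chunk):
    """One independent task: it reads only its own chunk and shares no state."""
    return sum(n * n for n in chunk)


def sum_of_squares(numbers, chunk_size=4, workers=3):
    tasks = split_into_tasks(numbers, chunk_size)
    # The thread pool is a stand-in for a set of Agents (an assumption made
    # for illustration); no task needs anything from any other task.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(run_task, tasks))
    # Combining the results is the only step that sees more than one chunk.
    return sum(partial_sums)
```

Because the tasks never communicate, it makes no difference which worker handles which chunk, or in what order they finish. A calculation in which each step needed the previous step's result could not be cut up this way.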
If you are interested in reading more about multithreaded programming for SMP machines, check out my previous Mac DevCenter article here.
So, how does Xgrid work? Tasks are divided between three different roles: Controller, Agent, and Client. The Controller is the traffic cop of Xgrid, managing when jobs are run, and where they are run, accepting new submissions, and ensuring output ends up where it should. Agents are responsible for actually performing the computations. The Controller sends work to one or more Agents as it is submitted and resources become available. The Client, as you have no doubt guessed by now, is the guy that submits the jobs, and receives the output upon completion.
These three roles can all be played by the same computer, if that is your wish. (In fact, that is a good way to test your Xgrid applications.) Typically though, you would have a single Controller on your LAN, with all other desktop machines playing the part of Agents and Clients.
The flow of events required to complete a job is something like the following: A Client computer submits a job, along with any necessary resources, such as input files and executables, to the Controller. This is usually achieved with the Xgrid Client application, which provides a GUI, but since Preview 2 it has also been possible to use the command-line interface.
The Controller, having received the submission, examines the available Agents, and sends the job to one of them. The Agent runs the job in /tmp, as a user that is only allowed to write in /tmp, reducing the security risk. When the job is completed, or fails, the output is returned to the Controller, along with any files requested by the Client when the job was submitted. The Controller then returns the output to the Client, or at least informs it that the job is complete and how the output can be retrieved.
All of this generally happens transparently. The Agents go looking for the Controller via Rendezvous, so there is no configuration needed. Controllers are also located by the Client application via Rendezvous, so, as a user, it is simply a question of picking one of the available Controllers and firing away.
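The three roles and the submission flow described above can be sketched as a toy, in-process model. This is only an illustration of who talks to whom, under loud assumptions: the classes `Job`, `Agent`, `Controller`, and `Client` and all of their methods are hypothetical and are not part of any Xgrid API, a plain Python function stands in for the executable and its input files, and `register` stands in for discovery via Rendezvous.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Job:
    name: str
    work: Callable[[], str]  # stands in for the executable and its inputs


@dataclass
class Agent:
    name: str
    busy: bool = False

    def run(self, job: Job) -> str:
        # A real Agent runs the job in /tmp; this toy just calls a function.
        return job.work()


@dataclass
class Controller:
    agents: List[Agent] = field(default_factory=list)
    results: Dict[str, str] = field(default_factory=dict)

    def register(self, agent: Agent) -> None:
        # Hypothetical stand-in for an Agent finding the Controller.
        self.agents.append(agent)

    def submit(self, job: Job) -> Optional[str]:
        """Send the job to the first available Agent and keep its output."""
        for agent in self.agents:
            if not agent.busy:
                agent.busy = True
                try:
                    self.results[job.name] = agent.run(job)
                finally:
                    agent.busy = False
                return agent.name
        return None  # no Agent is free; this toy simply gives up

    def fetch(self, job_name: str) -> Optional[str]:
        return self.results.get(job_name)


class Client:
    def __init__(self, controller: Controller) -> None:
        self.controller = controller

    def run_job(self, job: Job) -> Optional[str]:
        # Submit to the Controller, then ask it for the output.
        self.controller.submit(job)
        return self.controller.fetch(job.name)
```

Note that the Client never touches an Agent directly: everything goes through the Controller, which is what allows Agents to come and go without the Client needing to know about them.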
This has been perhaps the fastest introduction to Xgrid in its short history. If you want or need more, there are several resources you can seek out. The first is the Xgrid Guide, which can be found in /Library/Xgrid/Documentation after you have downloaded and installed Xgrid. Another is the online article Xgrid: High-Performance Computing for the Rest of Us. This provides a high-level introduction to Xgrid, not getting into too many of the practicalities. A good hands-on series of articles, written by Daniel Cote, is available here.
Unlike most grid computing architectures, Xgrid is a breeze to install. You can download Xgrid Preview 2 here. Once you have it on your desktop, mount the disk image, and double-click the installer package.
This package installs a whole range of goodies. The first is a System Preferences pane. Open the System Preferences, and click on the new Xgrid pane. Here you will find configuration options for two of the three different roles discussed above: Controller and Agent. Click the Agent tab, and press the Start button. Your computer can now act as an Agent, connecting to a Controller, and receiving work whenever it is idle.
While we are here, click the Always radio button under the section "Choose when the agent may accept tasks." By default, the Xgrid Agent will only accept tasks when it has been idle for 15 minutes, or the screensaver is running. By choosing to always accept tasks, you allow the Agent to run tasks at any time. Note that Xgrid tasks are always run at the lowest priority, so even if you are doing something else on the computer while the Agent is running a job, you probably won't notice much of a slowdown.
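The acceptance rule just described is simple enough to write down as a small function. This is only a sketch of the logic as stated above, not of how the Agent is actually implemented: the function name `agent_may_accept`, its parameters, and the constant are all hypothetical.

```python
IDLE_THRESHOLD_SECONDS = 15 * 60  # the default idle period mentioned above


def agent_may_accept(always, idle_seconds, screensaver_running):
    """Return True if an Agent with these settings would take on a task.

    always:              True if the Always radio button is selected
    idle_seconds:        how long the machine has gone without user activity
    screensaver_running: True if the screensaver is currently active
    """
    if always:
        return True
    # Default behaviour: wait until the machine looks unused.
    return idle_seconds >= IDLE_THRESHOLD_SECONDS or screensaver_running
```

With the default setting, a machine that was used a minute ago and has no screensaver running would turn work away; selecting Always makes the other two inputs irrelevant.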
Now click the Controller tab, and press the Start button. This starts a Controller running on your computer, which any Agents and Clients on your LAN may attempt to connect to.
There are a number of ways of restricting access to the various roles played by your computer in Xgrid. If you want to prevent uninvited Clients from accessing your Controller, you can set a password under the Controller Security tab. Any Client application will then need to provide this password to make use of your Controller.
You can also restrict access to your Xgrid Agent by requiring that any Controller supply a password to it. You set this password under the Agent Security tab. To enter the password used by your Controller to connect to Agents, go to the Controller tab.
Note that although a Controller can be required to authenticate to an Agent, it is not yet possible to require that an Agent authenticate before accessing a Controller. In theory, any Agent can contact your Controller, and ask for jobs to run. This is a potential security risk, because the Agent computer, which could belong to anyone, will receive a copy of any files and software needed to run a job.
There are lots more goodies included in the Xgrid install, including documentation for Xgrid users, and details on how you write plugins for the Xgrid Client application (/Library/Xgrid/Developer). This is all interesting stuff, but we haven't room to cover it here. We are more interested in the less conventional uses of Xgrid.