MacDevCenter    
 Published on MacDevCenter (http://www.macdevcenter.com/)


Bioinformatics Meets Mac OS X

by Bruce Stewart
12/14/2001

Many of the important bioinformatics applications that previously existed only for Unix platforms are now being brought over to the Macintosh, thanks to Mac OS X and its Unix underpinnings.

Traditionally, scientific research has been performed on Unix workstations. This is partly because of the Unix operating system's long, stable history, and partly because Unix systems allow for the easy sharing of information. Now that Apple's Mac OS X operating system is Unix-based, scientists will more easily be able to run their favorite number-crunching or sequence-searching applications on their desktop Macs. Apple has compiled a list of bioinformatics applications that have been ported to Mac OS X on its Science and Technology site.

Many bioinformatics applications have been written with portability in mind. Dr. William Van Etten, lead bioinformaticist at the Blackstone Technology Group, has been involved in bringing many such applications over to the Mac OS X platform. Van Etten has ported Primer3, ClustalW, HMMER 2, and the EMBOSS suite (which includes over 150 open source sequence-analysis programs), and in most of these cases, he says, "port" is really too strong a word. "If they're using GNU autoconf tools, it's pretty much a snap. Compiling on Mac OS X is the same as on any other Unix variant." This is partly because the GNU autoconf tools are fully supported on Mac OS X.

O'Reilly Bioinformatics Technology Conference

Don't miss Unix-based Mac OS X as a Full-Featured Bioinformatics Development and Analysis Platform, Tuesday, January 29, 2002, at the O'Reilly Bioinformatics Technology Conference.

The GNU autoconf tools are a set of programs that help make code configurable and portable to various versions of Unix. The autoconf programs produce shell scripts to automatically configure software packages, as with the bioinformatics applications Van Etten is making Mac-savvy.
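As a rough illustration of why such ports can be "a snap," the conventional build sequence for an autoconf-based package is the same on any Unix variant: run the generated configure script, then make. The Python sketch below only assembles and runs that sequence. The function names, the source directory, and the install prefix are hypothetical choices for illustration and do not come from Van Etten's actual ports.

```python
import subprocess


def build_commands(prefix="/usr/local"):
    """Return the conventional command sequence for an autoconf-based package.

    The prefix is an assumed install location, not one named in the article.
    """
    return [
        ["./configure", f"--prefix={prefix}"],  # probe the system, write Makefiles
        ["make"],                               # compile using those Makefiles
        ["make", "install"],                    # copy the results under the prefix
    ]


def run_build(source_dir, prefix="/usr/local"):
    """Run each step inside source_dir, stopping at the first failure."""
    for command in build_commands(prefix):
        # check=True raises CalledProcessError if a step exits non-zero
        subprocess.run(command, cwd=source_dir, check=True)
```

Calling run_build on an unpacked source tree (for example, a hypothetical directory named "some-package-1.0") would carry out the three steps in order. Nothing in the sequence is specific to Mac OS X, which is the point Van Etten makes.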

Of course, not every port is that easy. NCBI's BLAST, or Basic Local Alignment Search Tool, is a widely used sequence-comparison tool in bioinformatics, and according to Van Etten it took a bit more work to port to Mac OS X. After a couple of weeks of working on the BLAST port on the train to and from work, Van Etten had BLAST working on his Mac, and he points out that the effort was no different than it would have been for making BLAST work on any other specific Unix platform.

TurboBLAST is a distributed, parallel execution framework built on top of NCBI BLAST that has been ported to the Mac by TurboGenomics. Again, the effort involved seems to have been minimal, this time because of Mac OS X's Java compatibility. "Porting TurboBLAST over from Linux required almost no work. Most of our code is in Java, and Mac OS X's built-in Java support meant that the components simply worked when we ran them on the Mac," says Nathan Willard, TurboBLAST product manager.

Dr. Gavin Sherlock, of the Center for Clinical Sciences Research at Stanford, ported XCluster, an application for analyzing DNA microarrays, to Mac OS X. Sherlock agrees the process was straightforward and trouble-free: "It was easy to port XCluster--it required no changes to the code, and only a couple of changes to the Makefile. It took about ten minutes!"

While they are not directly bioinformatics applications, Van Etten has also ported two load management systems over to Mac OS X. These programs can be thought of as "bioinformatics enabling." Because much of bioinformatics research requires processing power greater than a single machine can provide, batch-queueing systems like the Sun Grid Engine and PBS, or Portable Batch System, are necessary to harness the power of clusters of computers. Porting these applications to Mac OS X will help Macs enter into the clustering arena. Van Etten has already successfully implemented a cluster of Mac OS X machines, and he expects demand for this to grow.
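The idea behind a batch-queueing system can be sketched in a few lines: jobs wait in a queue, and each one is handed to whichever machine in the cluster currently has the least work. The Python sketch below is a toy illustration of that idea only. It is not how the Sun Grid Engine or PBS is implemented, and the job names, cost estimates, and node names are invented for the example.

```python
import heapq


def schedule(jobs, nodes):
    """Assign each (name, cost) job to the node with the least accumulated work.

    jobs  -- list of (job_name, estimated_cost) tuples, taken in queue order
    nodes -- list of node names (hypothetical cluster machines)
    """
    # Min-heap of (accumulated_load, node); the least-loaded node is popped first
    heap = [(0, node) for node in nodes]
    heapq.heapify(heap)
    assignment = {node: [] for node in nodes}
    for name, cost in jobs:
        load, node = heapq.heappop(heap)
        assignment[node].append(name)
        heapq.heappush(heap, (load + cost, node))
    return assignment
```

Given a long job followed by several short ones, the long job occupies one node while the short jobs pile onto the other, which is the kind of load balancing that lets a cluster of desktop machines stand in for a single larger computer.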

Scientists are porting bioinformatics tools to the Macintosh platform because they are often already Macintosh users, and they want the convenience of being able to perform their research on their primary desktop computers. Traditionally, scientific researchers have needed a desktop computer for all of their productivity applications, and a separate platform for the compute engine to support their research. "The tremendous benefit of Mac OS X is it gives you both," says Van Etten. "The only thing that comes close is Linux, but for most bioinformaticists, the Linux desktop user experience is a little sophisticated."

Now that the Macintosh has a Unix-based operating system, bioinformaticists who are looking for the most efficient, cost-effective means of conducting their research will increasingly be able to use their Macs. The benefits of using the same platform for daily tasks and research are most obvious in research areas where a single machine is robust enough to run the applications. TurboGenomics' Willard points out, "In a tightening economy, a machine that runs your favorite bioinformatics programs and Office makes a lot of sense."

Bruce Stewart is a freelance technology writer and editor.



Copyright © 2009 O'Reilly Media, Inc.