Published on MacDevCenter (http://www.macdevcenter.com/)


Programming With Cocoa Bitmap Image Filters

by Michael Beam, speaker at the O'Reilly Mac OS X Conference
08/06/2002

Today we reenter the realm of Cocoa graphics, where we'll learn how to work with bitmap images on a pixel basis. Cocoa represents bitmap images using the NSImageRep subclass called NSBitmapImageRep. This subclass can work with data formats for many file types: TIFF, GIF, JPEG, PNG, BMP, and even raw, untagged data.

To demonstrate how we can work with bitmap images on a low level--byte by byte--we will implement a filter in ImageApp that will take a color image and convert it to a gray-scale image. Our filter will take an NSImage, extract the raw bitmap data, operate on that data, and then return a new, modified NSImage object.

Preparing ImageApp for the Task

If you need the project file to start with, download it here. It's the same file we created in the last column.

Before we discuss the complexities of NSBitmapImageRep and our gray-scale filter, it's a good idea to first set up the infrastructure in ImageApp to support this behavior. This takes the form of a new menu, Filter, containing a single menu item, Make Grayscale, which in turn invokes a method in MyDocument to run the image through the filter. Make Grayscale will communicate with MyDocument through First Responder in MainMenu.nib. (The discussion we had in the last column about File's Owner also applies to First Responder: there is no First Responder instance contained in the nib; like File's Owner, it is a stand-in object.)

To set this up, open MainMenu.nib in Interface Builder. Double-click on First Responder to bring up the class info panel. From here add an action to First Responder called makeGrayscale:. This is the action to which we will connect our new menu item.

To create a new menu in the main menu, drag a Submenu object from the Cocoa-Menus palette and drop it between the Edit and Window menus. Rename the menu Filter, and change the name of the default menu item from Item to Make Grayscale. Then connect the menu item to the makeGrayscale: action of First Responder. After you save your work, we'll close up shop in Interface Builder and return to Project Builder.

Back in Project Builder

To prepare ImageApp for the addition of a filter object, we need to implement the makeGrayscale: action in MyDocument. The filter object used to change the image will be an instance of the class IAGrayscaleFilter. This class defines only one method, -filterImage:, which takes an NSImage as its argument and returns a new NSImage. The returned NSImage will be autoreleased, so we need to assert MyDocument's ownership over it by sending it a retain message. Finally, the returned image will be set as the activeImage of the document, and we will then update the display to show the new image. To update the display, we need to add a method to IAWindowController that lets us tell the image view what image to draw. This method will be called -setImageToDraw:. Let's take a look at -makeGrayscale:; this goes in MyDocument.m:

- (void)makeGrayscale:(id)sender
{
    IAGrayscaleFilter *filter = [[IAGrayscaleFilter alloc] init];
    [activeImage autorelease];
    activeImage = [[filter filterImage:activeImage] retain];
    [windowController setImageToDraw:activeImage];
    [filter release];
}

The first thing we did was instantiate and initialize an IAGrayscaleFilter and assign the new object to the filter variable. For MyDocument to recognize IAGrayscaleFilter, we need to import its header file; make sure you add that. Next we send an autorelease message to the object currently assigned to activeImage. We do this because in the next line of code we reassign activeImage to a different NSImage object, and we need to relinquish ownership of the old image before we lose our last reference to it. Next we execute the filter operation, tell windowController to draw the new image, and finally release filter before exiting the method.
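If you'd like to see those supporting pieces spelled out, here is a minimal sketch. The file layout and the activeImage and windowController instance variables are assumed from the earlier columns in this series, and declaring the action in the header is a style choice rather than a requirement:

// MyDocument.h -- declare the new action (sketch)
- (void)makeGrayscale:(id)sender;

// MyDocument.m -- add the import alongside the existing imports
#import "IAGrayscaleFilter.h"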

In IAWindowController we need to implement -setImageToDraw:. Here's what this looks like:

- (void)setImageToDraw:(NSImage *)image
{
    [view setImage:image];
}

All we did here was forward the image along to the image view using setImage:.
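The corresponding declaration goes in IAWindowController.h; here, view is assumed to be the image view outlet we connected in the previous column:

// IAWindowController.h -- declare the new method (sketch)
- (void)setImageToDraw:(NSImage *)image;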

Now we're ready to start building our filter class. Start by creating a new class from the File menu and naming it IAGrayscaleFilter. Then add the method declaration to the header:

- (NSImage *)filterImage:(NSImage *)srcImage;

and change the import statement to read:

#import <AppKit/AppKit.h>

to import the AppKit headers instead of Foundation. This is necessary because we use AppKit classes, such as NSImage, in this class.
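For reference, here is what the complete header might look like, assuming the new class was created as a plain NSObject subclass (the default for Project Builder's Objective-C class template):

// IAGrayscaleFilter.h -- minimal sketch
#import <AppKit/AppKit.h>

@interface IAGrayscaleFilter : NSObject
{
}
- (NSImage *)filterImage:(NSImage *)srcImage;
@end

Now let's begin implementing our image filter.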

The Filter

The operational method in this whole business is -filterImage:, which takes a source NSImage as an argument and returns a new, gray-scale NSImage. Accomplishing this requires the ability to access the raw data of the source image, run an algorithm on that data, and then store the new pixel values in the returned NSImage. This is oversimplified, since we don't actually obtain data from NSImage objects, but rather from instances of NSBitmapImageRep.

Recall from the first ImageApp column our discussion about the relationship between NSImage and NSImageRep. NSImage provides an abstraction layer that insulates clients of the class from the details of the image data format and implementation. To maintain the highest level of flexibility, however, Application Kit provides the class NSImageRep, which is a bridge between the insulating abstraction of NSImage and the data-dependent image representations defined in the NSImageRep subclasses.

Accessing the bitmap image representation of the source image is the first thing we need to do, so let's fill in that part of the puzzle, as shown below:

- (NSImage *)filterImage:(NSImage *)srcImage
{
    NSBitmapImageRep *srcImageRep = [NSBitmapImageRep imageRepWithData:[srcImage TIFFRepresentation]];
    .
    .
    .
}

The first thing we do in this method is retrieve the bitmap image representation of the source image. NSBitmapImageRep provides the convenience constructor imageRepWithData:, which will take as a parameter an NSData object that is formatted as TIFF data. The TIFF data used to create the instance is obtained using NSImage's TIFFRepresentation method; this returns exactly the NSData object we need.

The next step is to set up the new NSImage that will be the destination of the filter operation. This involves not only creating an instance of NSImage to return, but also the bitmap image representation used to assemble the image data. Combining these two pieces is a simple procedure that will be done before returning the NSImage.

In the next piece of code, we'll do what we have just described. Creating an NSBitmapImageRep from scratch is a tedious procedure, so don't be shocked by the size of the code you're about to see. The initialization method we will use is over 125 characters long, making it one of the longest method names in all of Cocoa! We supplement filterImage: in the following way:

- (NSImage *)filterImage:(NSImage *)srcImage
{
    NSBitmapImageRep *srcImageRep = [NSBitmapImageRep
                    imageRepWithData:[srcImage TIFFRepresentation]];

    int w = [srcImageRep pixelsWide];
    int h = [srcImageRep pixelsHigh];
    int x, y;

    NSImage *destImage = [[NSImage alloc] initWithSize:NSMakeSize(w,h)];
       
    NSBitmapImageRep *destImageRep = [[[NSBitmapImageRep alloc] 
                    initWithBitmapDataPlanes:NULL
                    pixelsWide:w 
                    pixelsHigh:h 
                    bitsPerSample:8 
                    samplesPerPixel:1
                    hasAlpha:NO
                    isPlanar:NO
                    colorSpaceName:NSCalibratedWhiteColorSpace
                    bytesPerRow:0 
                    bitsPerPixel:0] autorelease];

    .
    .
    .
    
    [destImage addRepresentation:destImageRep];
    return destImage;
}

Now, let's pick this apart and see what we did. First, we declared some variables that will prove useful in the method. We have the height and width of the source image as well as variables that we will use to specify a position in the image.

Next, we created an NSImage object called destImage. Initializing this object involves specifying only the size of the image, which is the same as the size of the source image. NSImage instances allocated and initialized in this manner contain no image representations, so before returning we add destImageRep to destImage's list of representations; from that point on, manipulating destImageRep is effectively the same as manipulating destImage. Image representations are added to an image by invoking addRepresentation: on the NSImage object, with the image representation we wish to add as the argument.

Now it's time to allocate and initialize a new NSBitmapImageRep object. An image's data is simple, but the variety of data formats and organizations requires a flexible initializer. Let's walk through it argument by argument.

The first argument, initWithBitmapDataPlanes:, allows us to provide the object with memory that is set up to store planar bitmap data. Raw image data is nothing more than a sequence of byte-sized elements that store the value of each sample of each pixel. A sample is a color component of a pixel, such as red, green, blue, alpha, cyan, and so on. When we create an NSBitmapImageRep, we have the option of specifying one of two organizational schemes for the data: planar or interleaved.

When an image's data is planar, there is a separate array in memory, or plane, for each color sample. That is, in an RGB image, there is one array of all the red samples, another of the green, and another of the blue. We can create these arrays ahead of time and pass an array of pointers as this first argument. That array has as many elements as there are color components, and each element points to the head of one of the data arrays; its type is unsigned char **.

Rather than specifying these data arrays, we can pass NULL, which means that the data will be interleaved. In this scheme there is one array that contains all of the data for the image, arranged so that the color samples for the same pixel are adjacent in memory. For example, the first three elements of the array would be the values of the red, green, and blue samples of the first pixel, the next three bytes would be the samples of the second pixel, and so on. We will be working with interleaved data.
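To make the two layouts concrete, here is a small sketch of how you would index the samples of the pixel at column x, row y of an 8-bit RGB image under each scheme. It assumes three samples per pixel, no alpha, and rows packed with no padding; redPlane, greenPlane, and bluePlane are hypothetical names standing in for the three buffers of a planar image rep.

// Interleaved: one buffer, the samples of a pixel are adjacent in memory.
unsigned char *data = [srcImageRep bitmapData];
unsigned char r = data[3 * (y * w + x) + 0];
unsigned char g = data[3 * (y * w + x) + 1];
unsigned char b = data[3 * (y * w + x) + 2];

// Planar: a separate buffer per component, each indexed the same way.
unsigned char pr = redPlane[y * w + x];
unsigned char pg = greenPlane[y * w + x];
unsigned char pb = bluePlane[y * w + x];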

The next two arguments, pixelsWide: and pixelsHigh:, let us specify the size of the image. Next, in bitsPerSample:, we specify how large each sample is, in bits. This discussion has assumed that each sample is one byte, or 8 bits, but samples can also be 12 or 16 bits; if you had some highly specialized application, you are not precluded from choosing a more esoteric value for bitsPerSample:. Next, we specify samplesPerPixel:. For an RGB image we would pass 3 for this parameter; since we are creating a gray-scale image here, we pass 1.


The combination of these last four parameters allows the class to determine how much memory to allocate to store all of the raw data in the image representation. The total number of pixels in the image is the width multiplied by the height, and the memory needed for each pixel is bitsPerSample multiplied by samplesPerPixel.
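As a quick sanity check, you can do that arithmetic yourself. The sketch below assumes the default, tightly packed layout with no padding at the end of each row:

int bytesPerPixel = 8 * 1 / 8;         // bitsPerSample * samplesPerPixel / 8 = 1
int bytesPerRow   = w * bytesPerPixel; // one byte per pixel in our gray-scale rep
int totalBytes    = h * bytesPerRow;   // the size of the buffer the class allocates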

Next, we indicate whether or not one of the samples specified in samplesPerPixel: is an alpha channel. To keep things simple, we will not concern ourselves with an alpha channel, so we pass NO. Then we specify whether or not the image data is planar. As we know, it will not be, so we pass NO here as well.

Almost there. Next we specify the colorspace for the image. A colorspace tells the graphics system how to interpret the image data. One familiar colorspace is the RGB colorspace, in which a color is represented by three components: red, green, and blue. This is our source image's colorspace (we are assuming so, for simplicity's sake). Another is the white colorspace, in which a color is a shade of gray and each pixel has only one sample. RGB colorspaces can also represent gray, as equal mixtures of the three component colors.

Because we are building a filter that takes a color image and converts it to gray-scale we will set the colorspace for our image representation to white colorspace, designated in Cocoa by the constant NSCalibratedWhiteColorSpace.

Finally, the last two arguments, to which we pass 0, give us another way to specify how much memory to allocate for the image data; passing 0 lets the class compute these values for us. The argument bytesPerRow: is equivalent to the value of:

pixelsWide * bitsPerSample * samplesPerPixel / 8

The other argument, bitsPerPixel:, is another way of saying:

bitsPerSample * samplesPerPixel

This is how we make a new NSBitmapImageRep, fully configured for our use. As always, study the documentation for this class as it provides a host of details not covered here. Now that we have that out of the way, let's move on to talk about how we actually convert a color image into a gray-scale image.

The Way

Repeatedly, I've said that the filter will take a color image and convert it to gray-scale. This operation is a simple one; in its most basic form, it is no more complicated than finding a pixel, determining the values of its red, green, and blue samples, averaging those values, and then setting that average as the value of the white sample for the same pixel in the gray-scale image representation.

This tiny bit of math must be done for every pixel in the image. To code this behavior we will be getting a heavy dose of C pointers and arrays. Since pointers and arrays are so closely related in C, I will use this discussion as an opportunity to show you how they are related. The first thing we need is a pair of nested loops that scan through the entire image data buffer to access each pixel. This is where our x and y variables come in, as pixel coordinates.

for ( y = 0; y < h; y++ ) {
    for ( x = 0; x < w; x++ ) {
        // Do the magic
    }
}

The action that goes on inside the loop reads the values of the red, green, and blue components of the current pixel, averages them, and then sets the corresponding destination pixel to that averaged value. To accomplish this we need pointers to the beginning of the source and destination image data buffers. These pointers are conveniently obtained by sending bitmapData messages to srcImageRep and destImageRep. The bitmapData method returns a pointer of type unsigned char *, which is convenient, as unsigned char is the same size as our samples: 8 bits. Putting these last couple of pieces together with our existing code gives us the following:

- (NSImage *)filterImage:(NSImage *)srcImage
{
    NSBitmapImageRep *srcImageRep = [NSBitmapImageRep
                    imageRepWithData:[srcImage TIFFRepresentation]];

    int w = [srcImageRep pixelsWide];
    int h = [srcImageRep pixelsHigh];
    int x, y;

    NSImage *destImage = [[NSImage alloc] initWithSize:NSMakeSize(w,h)];

    NSBitmapImageRep *destImageRep = [[[NSBitmapImageRep alloc] 
                    initWithBitmapDataPlanes:NULL
                    pixelsWide:w 
                    pixelsHigh:h 
                    bitsPerSample:8 
                    samplesPerPixel:1
                    hasAlpha:NO
                    isPlanar:NO
                    colorSpaceName:NSCalibratedWhiteColorSpace
                    bytesPerRow:0 
                    bitsPerPixel:0] autorelease];

    unsigned char *srcData = [srcImageRep bitmapData];
    unsigned char *destData = [destImageRep bitmapData];
    unsigned char *p1, *p2;

    for ( y = 0; y < h; y++ ) {
        for ( x = 0; x < w; x++ ) {
            // Do the magic
        }
    }
    
    [destImage addRepresentation:destImageRep];
    return destImage;
}

In addition to declaring the pointers srcData and destData, which point to the first element, or head, of the two data buffers, we declare the pointers p1 and p2 as working pointers. Think of them as moving cursors that point to the location in the buffer of the pixel we're currently working on in the loop.

To understand how we work with the data, we have to understand a bit more about how it is organized. In a 24-bit image, where each color sample is 8 bits and there is no alpha component, the data is arranged sequentially: the first byte of the data buffer is the red component of the first pixel, the second byte is the green component of the first pixel, and the third byte is the blue component. This sequence continues for each pixel across the first row of the image (constant y, increasing x), and then, like a carriage return on a typewriter, it moves to the next row down (y is incremented, x is reset, and we walk across that row again). This accounts for why the x for-loop is nested within the y for-loop.

A Bit About C Pointer Arithmetic

In C, a pointer is a variable that points to a place in memory where a meaningful value is stored. In our case, these meaningful values are the color samples of all the pixels in our image. In our code, srcData is a pointer to the red sample of the pixel in row 1, column 1. If we want to know the value at that memory location, we use the dereferencing operator; thus, *srcData is the value of that sample.

Now, what if we want to know the value stored in the memory location adjacent to the one pointed to by srcData? What about the value stored 2 or 3 or 300 slots beyond srcData? Simple: we just add the number of slots to the pointer, and we get a pointer to that memory location. So the memory slot adjacent to srcData is srcData+1, two slots away is srcData+2, and so on. Again, to access the value at these locations we use the dereferencing operator, so *(srcData+1) is the adjacent value, that is, the value of the green sample of the first pixel. Notice how we used parentheses: writing *srcData + 1 gives us the value of the first pixel's red sample, plus one.
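If pointer arithmetic is new to you, a tiny stand-alone C program makes the idea easy to verify (the sample values here are made up):

#include <stdio.h>

int main(void)
{
    unsigned char samples[3] = { 200, 100, 50 };  /* pretend red, green, blue */
    unsigned char *p = samples;                   /* p points to the first sample */

    printf("%d %d %d\n", *p, *(p + 1), *(p + 2)); /* prints: 200 100 50 */
    printf("%d\n", *p + 1);                       /* prints: 201, the value plus one */
    return 0;
}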

Now, let's see how we use this in the code below:

unsigned char *srcData = [srcImageRep bitmapData];
unsigned char *destData = [destImageRep bitmapData];
unsigned char *p1, *p2;
int n = [srcImageRep bitsPerPixel] / 8;

for ( y = 0; y < h; y++ ) {
    for ( x = 0; x < w; x++ ) {
        p1 = srcData + n * (y * w + x);
        p2 = destData + y * w + x;

        *p2 = (unsigned char)rint((*p1 + *(p1 + 1) + *(p1 + 2)) / 3.0);
    }
}

Let's take our time and pick this thing apart. The first line in the for-loop says that p1, a pointer, will point to the location in the source image data buffer that corresponds to the xth pixel in the yth row. We get to that pixel by figuring out how many pixels into the data buffer we are, which is the y value times the width of the image plus the x value, and we multiply that by the number of bytes in each pixel, n. In our code, we expect n to be 3. The resulting number is then added to srcData, the address of the start of the buffer.

Next we repeat this calculation for p2 in the destination data buffer. However, since the destination data is in the NSCalibratedWhiteColorSpace colorspace, each pixel occupies only one byte of memory (because that is all that's needed to represent a gray-scale value). So the calculation is the same as above, except that n equals 1 and can be omitted.

In the third line we access each of the red, green, and blue bytes of the pixel located by p1. The red byte is pointed to by p1 itself. Because the green and blue bytes follow it sequentially in memory, we can add 1 to p1 to get the green byte's location, and 2 to get the blue byte's location. To retrieve the actual values stored at those locations, we dereference the pointers: the * operator, applied to a pointer, returns the value stored at the location it points to. So p1, p1+1, and p1+2 are pointers to the red, green, and blue components, and *p1, *(p1+1), and *(p1+2) are the values of those components.

We sum those three values, average them (dividing by 3.0 rather than 3, so the average isn't truncated to a whole number before we round it), round the result off with rint, cast it back to an unsigned char (since we're limited to 8 bits of storage per component), and set the value of the memory pointed to by p2, which is *p2, to the result.

So, there you have it: C pointer arithmetic in a nutshell.

The above could also have been written using C's array notation. Converting our pointer arithmetic to array syntax is simple: previously, *(p1+n) referred to the value at the memory location n slots past p1; in array syntax that would be written p1[n]. With this syntactic change, the previous code can be rewritten as the following:


unsigned char *srcData = [srcImageRep bitmapData];
unsigned char *destData = [destImageRep bitmapData];
unsigned char *p1, *p2;
int n = [srcImageRep bitsPerPixel] / 8;

for ( y = 0; y < h; y++ ) {
    for ( x = 0; x < w; x++ ) {
        p1 = srcData + n * (y * w + x);
        p2 = destData + y * w + x;

        p2[0] = (unsigned char)rint((p1[0] + p1[1] + p1[2]) / 3.0);
    }
}

Another way we can write the for-loop is to replace the n * y * w term with y times the number of bytes per row, which each image rep can report directly. This changes the address arithmetic to the following:

unsigned char *srcData = [srcImageRep bitmapData];
unsigned char *destData = [destImageRep bitmapData];
unsigned char *p1, *p2;

int n = [srcImageRep bitsPerPixel] / 8;
int srcBPR = [srcImageRep bytesPerRow];
int destBPR = [destImageRep bytesPerRow];

for ( y = 0; y < h; y++ ) {
    for ( x = 0; x < w; x++ ) {
        p1 = srcData + y * srcBPR + n * x;
        p2 = destData + y * destBPR + x;

        p2[0] = (unsigned char)rint((p1[0] + p1[1] + p1[2]) / 3.0);
    }
}

So, this is just another way of locating the data. When we put all the pieces together our final -filterImage: method looks like the following:

- (NSImage *)filterImage:(NSImage *)srcImage
{
    NSBitmapImageRep *srcImageRep = [NSBitmapImageRep
                    imageRepWithData:[srcImage TIFFRepresentation]];

    int w = [srcImageRep pixelsWide];
    int h = [srcImageRep pixelsHigh];
    int x, y;

    NSImage *destImage = [[NSImage alloc] initWithSize:NSMakeSize(w,h)];

    NSBitmapImageRep *destImageRep = [[[NSBitmapImageRep alloc] 
                    initWithBitmapDataPlanes:NULL
                    pixelsWide:w 
                    pixelsHigh:h 
                    bitsPerSample:8 
                    samplesPerPixel:1
                    hasAlpha:NO
                    isPlanar:NO
                    colorSpaceName:NSCalibratedWhiteColorSpace
                    bytesPerRow:0 
                    bitsPerPixel:0] autorelease];

    unsigned char *srcData = [srcImageRep bitmapData];
    unsigned char *destData = [destImageRep bitmapData];
    unsigned char *p1, *p2;
    int n = [srcImageRep bitsPerPixel] / 8;

    for ( y = 0; y < h; y++ ) {
        for ( x = 0; x < w; x++ ) {
            p1 = srcData + n * (y * w + x);
            p2 = destData + y * w + x;

            p2[0] = (unsigned char)rint((p1[0] + p1[1] + p1[2]) / 3.0);
        }
    }
    
    [destImage addRepresentation:destImageRep];
    return destImage;
}

With that we're ready to compile the code, and try it out.

Summary

As always, there are a number of enhancements you can make to this filter method. The implementation we created today is limited to 24-bit RGB images. One obvious first enhancement would be to add support for images with an alpha channel. In the project you can download here, you will find my implementation; it provides support for alpha.
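If you'd like to attempt the alpha enhancement yourself before peeking at that project, here is one possible sketch of the inner loop. It assumes the source is interleaved 8-bit RGBA data with the alpha byte last, uses the bytes-per-row form of the address arithmetic, and simply ignores the alpha value; a fuller treatment might also create the destination rep with hasAlpha:YES and copy the alpha byte across.

int n = [srcImageRep bitsPerPixel] / 8;   // 4 for RGBA, 3 for plain RGB
int srcBPR = [srcImageRep bytesPerRow];
int destBPR = [destImageRep bytesPerRow];

for ( y = 0; y < h; y++ ) {
    for ( x = 0; x < w; x++ ) {
        p1 = srcData + y * srcBPR + n * x;
        p2 = destData + y * destBPR + x;

        // Average only the first three samples; p1[3] would be alpha.
        p2[0] = (unsigned char)rint((p1[0] + p1[1] + p1[2]) / 3.0);
    }
}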

You could also use this simple piece of code as a starting point for AltiVec enhancements to the loop. I haven't yet taken the time to dive into the AltiVec libraries (I've only had a G4 for about six weeks now), but an image filter that uses them is where I would start. It has all of the characteristics that make it a prime candidate for vectorization, and I imagine there'd be a noticeable speed improvement. I leave that as an exercise for you folks.

So, we've seen how to make one image filter. There are hundreds of different operations you can perform on images that would fit well within the framework we've established here. That framework consists of the black-box method -filterImage:, where we pass an image in and get a new one back; the only thing that changes is what happens inside the for-loop. I'm not going to take the time to develop more filter operations, but in the next column I will show you how to define and implement an interface to a plug-in architecture for the application. This plug-in architecture will shift the burden of filter development from you to the end users of the application. See you then.

Michael Beam is a software engineer in the energy industry specializing in seismic application development on Linux with C++ and Qt. He lives in Houston, Texas with his wife and son.



Copyright © 2009 O'Reilly Media, Inc.