MacDevCenter    
 Published on MacDevCenter (http://www.macdevcenter.com/)


Build an eDoc Reader for your iPod, Part 2

by Matthew Russell
12/17/2004

This is the second part of a trilogy of articles that teaches you how to make reading electronic documents (such as books and PDF documents) on your iPod easy and enjoyable. This installment delves into the engine of our application and adds some user interface conveniences through NSUserDefaults. Next time, we finish by incorporating the ability to handle text extraction from PDF documents by exploiting the Cocoa-Java bridge.

Last Time

Let's briefly review the openFileDialog: method from last time and address memory management as it applies to this project. The first thing you should notice about openFileDialog: is that it receives a pointer to the object that triggered it, the sender. The condition [sender isEqualTo:sourceButton] only has to compare pointer values for equality, so working out which button was clicked is not a difficult task.

If Source File was pressed, we configure the NSOpenPanel to disable the selection of directories via [openPanel setCanChooseDirectories:NO] because we want to guarantee that a file is chosen. We also limit the selection to files ending in "txt" or "pdf" because they're the formats we'll work with in this series.

If the Destination button was clicked, we guarantee that a directory is chosen, because we need a location to copy the source file's pieces to after it's parsed. The other messages we pass to the NSOpenPanel should be pretty clear just from reading the code, which is one of the many nice things about Objective-C syntax. You can always use Xcode's help menu if you need it.

There are a few interesting string methods in openFileDialog:. On Unix-based systems such as Mac OS X, the tilde (~) is a shortcut for the current user's home directory. Type echo ~ in Terminal to check it out for yourself. The Cocoa method stringByExpandingTildeInPath does just what it sounds like. This is nice because we often like to start navigation in the user's home directory as a convenience to them, and this is a really simple way of making that happen. As another convenience, we start navigation for the destination directory in /Volumes, because the iPod is normally mounted there.
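If you'd like to see tilde expansion on its own, here is a minimal sketch. PRStartDirectory is a hypothetical helper made up for this illustration; it is not part of PodReader, and it only mirrors the two starting locations described above.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not in PodReader): picks the directory an open
// panel could start in, following the two cases described in the text.
NSString *PRStartDirectory(BOOL choosingSource) {
    if (choosingSource) {
        // "~" expands to the current user's home directory,
        // an absolute path such as /Users/yourname
        return [@"~" stringByExpandingTildeInPath];
    }
    // The iPod is normally mounted under /Volumes
    return @"/Volumes";
}
```

Passing YES gives you an absolute path to your home directory, and passing NO gives you /Volumes.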

The line of code if (NSOKButton == [openPanel runModalForTypes:typesArray]) opens the file dialog window, and if the user has clicked on OK (as opposed to Cancel), we're guaranteed to have a valid selection because we configured our NSOpenPanels accordingly. Because NSOpenPanel can handle multiple selections, the result is contained in an NSArray. We retrieve the selection with [selection lastObject] because it is the only object in the array and use this value to set the corresponding NSTextField. Finally, we enable the Copy It button if both text fields have valid values.

A final note to take from last time is the reason why we passed the retain message to the NSArray in the init method. In general, you should assume that objects created with convenience constructors like arrayWithObjects: are autoreleased, which means they have been added to the current autorelease pool. Thus, if you don't explicitly retain them, they're gone with the wind once that pool is emptied, which normally happens at the end of the current pass through the event loop. Objects that you create with alloc or new, however, belong to you and must be explicitly released with a release message, usually in dealloc. If you're craving more information on memory or have no idea what that just meant, check out the reference I pointed you to last time, "Introduction to Memory Management."

On to the Model

With that recapitulation, we can now proceed to build our engine. In your Xcode project, choose File -> New File, select Objective-C class, and click on Next. Name the class "TextChunker.m", ensure that "Also create TextChunker.h" is checked, and click on Finish to create and add the files to our PodReader target. Drag the TextChunker files into the Classes folder of your project. Once that's done, replace the TextChunker files that Xcode generated with these files: TextChunker.h and TextChunker.m. I recommend you do this through Finder, rather than through Xcode.

The header file, TextChunker.h, tells you all you need to know if you simply want to use the TextChunker class and are not interested in how it works. The method chunkIt:toDir:withSeparator: takes a source file, a destination directory, and a value like CHAPTER or SCENE that should be used to segment the text into chunks.

The file TextChunker.m is where all of the work happens. At the top of the method chunkIt:, there are three constants that designate the guidelines for chunking the text. We use the constant PR_CHUNK as a guideline to keep the chunk sizes smaller than 4K. Technically, 4K is 4096 bytes, but we're using some magic to link "pages" of our book together using the HTML anchor tag (this will be discussed in a moment), so we leave an ample margin to account for it.

The constant PR_INITIAL_CUT_POINT provides a starting location to segment a chunk if it is larger than PR_CHUNK, and PR_CUT_DELTA is used to incrementally decrease PR_INITIAL_CUT_POINT if cutting at PR_INITIAL_CUT_POINT still doesn't decrease the chunk size. These values will make more sense as we move on. Specifying these values one time at the top of the method allows for easy adjustment. For example, if you should want 2K chunks instead of 4K chunks, simply change PR_CHUNK and you're all done.

The next few lines of code and the first while loop create and load an NSMutableArray with chunks of text that are partitioned according to a separator value. Notice how simple it is to load an entire text file into a string value with [NSMutableString stringWithContentsOfFile:fileName]. The NSString method rangeOfString: returns an NSRange, so if you're unfamiliar with NSRange, look it up. You'll notice that it's a struct with two parts, a location and a length. We're interested in using the location to determine where a particular section of text in fileContents starts.

We start the substring search at index 1 instead of 0 because we want to skip an occurrence of the separator at the very beginning of the string, which is quite likely to be there at some point. For example, if we segmented the following excerpt from Shakespeare's Macbeth on the word "ACT" and started the search at index 0, our code would spin in an infinite loop because it could never shorten the string containing the text. It would repeatedly identify the first word of the excerpt as the separator and never move on.

ACT I. SCENE I.

A desert place. Thunder and lightning.

Enter three Witches.

  FIRST WITCH. When shall we three meet again?
    In thunder, lightning, or in rain?
  SECOND WITCH. When the hurlyburly's done,
    When the battle's lost and won.
  THIRD WITCH. That will be ere the set of sun.
  FIRST WITCH. Where the place?
  SECOND WITCH. Upon the heath.
  THIRD WITCH. There to meet with Macbeth.
  FIRST WITCH. I come, Graymalkin.
  ALL. Paddock calls. Anon!
    Fair is foul, and foul is fair.
    Hover through the fog and filthy air.                Exeunt.
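To make the "start at 1" trick concrete, here is a minimal sketch of a separator loop. It is not the loop from TextChunker.m; PRSplitOnSeparator is a hypothetical function written only for this illustration, and it assumes the separator should stay at the front of the chunk it introduces.

```objc
#import <Foundation/Foundation.h>

// Hypothetical sketch, not the code in TextChunker.m: splits text into
// chunks that each begin at an occurrence of separator.
NSArray *PRSplitOnSeparator(NSString *text, NSString *separator) {
    NSMutableArray *chunks = [NSMutableArray array];
    NSMutableString *remaining = [NSMutableString stringWithString:text];

    while ([remaining length] > 1 && [separator length] > 0) {
        // Search from index 1 so that a separator sitting at index 0
        // is skipped; otherwise the string would never get shorter.
        NSRange searchRange = NSMakeRange(1, [remaining length] - 1);
        NSRange hit = [remaining rangeOfString:separator
                                       options:0
                                         range:searchRange];
        if (hit.location == NSNotFound) {
            break;
        }
        // Everything before the hit is one chunk; remove it and go on.
        [chunks addObject:[remaining substringToIndex:hit.location]];
        [remaining deleteCharactersInRange:NSMakeRange(0, hit.location)];
    }
    if ([remaining length] > 0) {
        // Whatever is left is the final chunk.
        [chunks addObject:[NSString stringWithString:remaining]];
    }
    return chunks;
}
```

Because every hit is at index 1 or later, each pass removes at least one character, so the loop always finishes. Search from 0 instead and the excerpt above would match its own first word forever.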

Another detail of interest is that we use NSMutableArray and NSMutableString as opposed to NSArray and NSString. Cocoa distinguishes between mutable and immutable classes. In general, there's a performance overhead for handling mutable classes for reasons that deal with memory allocation and management. Immutable objects, on the other hand, do not require this overhead. Use immutable objects if possible.

An interesting implementation detail that you might like to know is that most mutable classes inherit from immutable parents. Looking up NSMutableArray in Xcode's help shows you that it inherits from NSArray. We use the mutable versions of these classes because the next block of code is very likely to alter both the string values in the array and the array itself.

A final note to take from this first block of code regards some of the many NSString methods: length, paragraphRangeForRange, lineRangeForRange, and stringByTrimmingCharactersInSet. They're all easily found in Xcode's help, which you should be a big fan of by this point, so check them out. There's no better way to learn what tools you have available to you than to dig through the API.

The next block of code resizes the chunks of text as needed. The high-level algorithm is like so:

Walk through the array:

The logic is designed to produce neatly organized sections of text, but it also attempts to maximize the reading time on each "page" of your iPod by filling up as much of the PR_CHUNK size as possible.
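As a rough illustration of how a cut point and a delta can work together, here is a sketch under stated assumptions. It is not the algorithm in TextChunker.m. PRResizeChunk is a hypothetical function whose limit, initialCut, and delta parameters stand in for PR_CHUNK, PR_INITIAL_CUT_POINT, and PR_CUT_DELTA, and the choice to cut at paragraph ends is my assumption. It only splits oversized chunks and makes no attempt to merge small ones.

```objc
#import <Foundation/Foundation.h>

// Hypothetical sketch, not the code in TextChunker.m. Splits chunk into
// pieces of at most limit characters, preferring paragraph boundaries.
NSArray *PRResizeChunk(NSString *chunk, NSUInteger limit,
                       NSUInteger initialCut, NSUInteger delta) {
    NSMutableArray *pieces = [NSMutableArray array];
    NSString *rest = chunk;

    if (limit == 0) {
        // Nothing sensible to do; hand the text back untouched.
        return [NSArray arrayWithObject:chunk];
    }

    while ([rest length] > limit) {
        NSUInteger cut = (initialCut < limit) ? initialCut : limit;
        NSUInteger end = 0;
        for (;;) {
            // Find the paragraph containing the cut point and try to
            // end the piece just after that paragraph.
            NSRange para = [rest paragraphRangeForRange:NSMakeRange(cut, 0)];
            end = NSMaxRange(para);
            if (end <= limit) {
                break;          // the piece fits
            }
            if (delta == 0 || cut < delta) {
                end = limit;    // one huge paragraph: cut it mid-text
                break;
            }
            cut -= delta;       // back off and try an earlier paragraph
        }
        [pieces addObject:[rest substringToIndex:end]];
        rest = [rest substringFromIndex:end];
    }
    if ([rest length] > 0) {
        [pieces addObject:rest];
    }
    return pieces;
}
```

Joining the pieces back together always reproduces the original text, which makes for a handy sanity check while you experiment with different values.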

The final block of code writes each of the chunks to its own file. Files in the iPod's Notes directory will appear in alphanumeric order, so naming the files isn't as simple as just appending a number to the end of their file names. For example, if you parse Oscar Wilde's The Picture of Dorian Gray from Project Gutenberg, you'll end up with 122 different files. Naming the files by simply appending a number to the end of the original file name dgray results in these files appearing in this order on your iPod: dgray1, dgray10, dgray11, ... dgray19, dgray2, dgray20, ... and so on. Clearly, this doesn't make navigation convenient. Therefore, we must pad the numeric value on the end of dgray with the correct number of leading zeros so that "pages" of text appear in sequential order.
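Here is one way the padding could be done, as a minimal sketch. PRPaddedFileName is a hypothetical helper, not the code in TextChunker.m; it pads each number to the width of the largest one.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not in TextChunker.m): appends index to baseName,
// left-padded with zeros to the number of digits in total.
NSString *PRPaddedFileName(NSString *baseName, NSUInteger index,
                           NSUInteger total) {
    // The width is the digit count of the largest number we will use
    NSUInteger width =
        [[NSString stringWithFormat:@"%lu", (unsigned long)total] length];
    NSMutableString *number =
        [NSMutableString stringWithFormat:@"%lu", (unsigned long)index];
    while ([number length] < width) {
        [number insertString:@"0" atIndex:0];
    }
    return [baseName stringByAppendingString:number];
}
```

With 122 files, PRPaddedFileName(@"dgray", 2, 122) gives dgray002 and PRPaddedFileName(@"dgray", 19, 122) gives dgray019, so an alphanumeric sort now matches reading order.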

The last interesting thing that happens is to append a hyperlink to the next "page" using a simple HTML anchor tag. The hyperlink appears in dark underlined letters. Once you're at the bottom of a "page," just press the button in the center of your scroll wheel to proceed to the next "page." Press the Menu button to go back. Question: could reading on the iPod get any easier?
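The link itself is just text tacked onto the end of a chunk before it is written out. The sketch below shows the idea. PRChunkWithNextLink is a hypothetical helper, and the exact markup and the "Next page" label are my assumptions rather than what TextChunker.m emits.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not in TextChunker.m): returns chunkText with an
// anchor tag pointing at the next note file appended to it.
NSString *PRChunkWithNextLink(NSString *chunkText, NSString *nextFileName) {
    if (nextFileName == nil) {
        // The last "page" has nothing to link to
        return chunkText;
    }
    return [chunkText stringByAppendingFormat:
               @"\n<a href=\"%@\">Next page</a>", nextFileName];
}
```

The tag takes up part of each note's 4K, which is why the chunk size leaves a margin below 4096 bytes.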

Icing on the Cake

At this point, you could allocate a TextChunker to do your bidding when the Copy It button is clicked, and be done with it. But we'll quickly throw in a few more niceties. One such thing is to remember the locations of the last used file location and directory location. Users will typically store their electronic books in the same location, and the iPod's location doesn't change. Small conveniences like these are icing on the cake, but significantly enhance usability.

We'll set the values of the NSTextFields in our application by storing their previous values with the NSUserDefaults class: simply record these values each time the user changes them, and load in their previous values each time the application starts up. If you'd like additional information on NSUserDefaults, you can check out "Mac OS X's Preferences System (and More!)" here on Mac DevCenter, or Xcode's help. One thing you'll want to do in your project is to open the Info Window on PodReader, which is located under Targets, and input an appropriate identifier value and version, which you'll find in the Properties tab. The identifier is formed like a Java package structure, so you can enter something like "com.yourname.PodReader" and it will work fine. Once you've run the application, you can go to ~/Library/Preferences if you'd like to investigate the contents of this file, which is simply a special kind of XML file called a plist.


The Info Window for the PodReader target. Set the Identifier and Version fields here.

In order to set these values when the application starts, we'll need to implement the awakeFromNib method, which belongs to the NSNibAwaking informal protocol; you can get more info by looking up NSNibAwaking in Xcode's help. Essentially, all objects in the nib, like the controls in our view, are guaranteed to have been initialized by the time this method is called, so we can safely set their values in awakeFromNib.

Wow. With all of that behind us, replace your AppController files with these, which reflect our discussion: AppController.h and AppController.m. Then build and run your project. Try out a book from Project Gutenberg. Their books have a rather long preamble of "small print," and I find that removing that portion results in better overall "chunking" into logical sections. For homework, you might try to use some of NSString's methods to automate that process.

Use the Notes folder of your iPod as the destination directory, but ensure that your iPod is configured for disk storage by using these instructions if you haven't already, or else you won't be able to get to the Notes folder. As a sanity check, you should be able to type cd /Volumes/your iPod's name/Notes into Terminal and be in your Notes folder without any problems. Once you've chunked a few books, come back and we'll review the interesting parts of the code. Be aware that your Notes folder can hold, at a maximum, 1,000 note files. I recommend creating subfolders for each document to better organize your readings.

Shots of your PodReader in action. Simply click on the link to go to the next "page."


/* AppController.h */

#import <Cocoa/Cocoa.h>
#import "TextChunker.h"

@interface AppController : NSObject {
    IBOutlet NSTextField*		destDir;
    IBOutlet NSTextField*		sourceFile;
	IBOutlet NSButton*		sourceButton;
	IBOutlet NSButton*		destButton;
	IBOutlet NSButton*		copyButton;
	IBOutlet NSTextField*		separatorValue;
	IBOutlet NSProgressIndicator*	progressIndicator;
	
	TextChunker* chunker;
	NSArray* typesArray;
	NSUserDefaults* userDefaults;
	NSFileManager* fileManager;
}

- (id)init;
- (void)awakeFromNib;
- (IBAction)copyIt:(id)sender;
- (IBAction)openFileDialog:(id)sender;
- (void)dealloc;

@end

Here in the AppController header file, we #import the TextChunker header file and declare a TextChunker instance variable, along with instance variables for the NSUserDefaults and NSFileManager classes. We also declare the awakeFromNib method and the copyIt: action. In AppController.m, we define these methods in the same order for consistency and ease of looking things up.


/* AppController.m */
#import "AppController.h"

NSString* LAST_DEST_DIR = @"LastDestDir";
NSString* LAST_SOURCE_FILE = @"LastSourceFile";
NSString* LAST_SEPARATOR = @"LastSeparator";
@implementation AppController

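- (id)init {
        self = [super init];
        if (self) {
                typesArray = [[NSArray arrayWithObjects: @"txt", @"pdf", nil] retain];
                chunker = [[TextChunker alloc] init];
                fileManager =  [[NSFileManager alloc] init];
                //standardUserDefaults returns a shared object we don't own,
                //so retain it to balance the release in dealloc
                userDefaults = [[NSUserDefaults standardUserDefaults] retain];
        }
        return self;
}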

- (void)awakeFromNib {
        //the text fields are set up to be blank by default in the nib
        if ((nil != [userDefaults objectForKey:LAST_DEST_DIR]) &&
                 [fileManager fileExistsAtPath:
                       [userDefaults objectForKey:LAST_DEST_DIR]]) 
		 {
                 [destDir setStringValue:
                     [userDefaults stringForKey:LAST_DEST_DIR]];
        }
                
        if ((nil != [userDefaults objectForKey:LAST_SOURCE_FILE]) &&
                [fileManager fileExistsAtPath:
                    [userDefaults objectForKey:LAST_SOURCE_FILE]]) 
		 {
                [sourceFile setStringValue:
                    [userDefaults stringForKey:LAST_SOURCE_FILE]];
        }
        
        if (nil != [userDefaults objectForKey:LAST_SEPARATOR]) 
		 {
                [separatorValue setStringValue:
                    [userDefaults stringForKey:LAST_SEPARATOR]];
        }

        //the copy button is set to be disabled by default in the
        //nib and the text fields are set to be blank by default
        if  (! 
                 (([[sourceFile stringValue] isEqualToString:@""]) || 
                  ([[destDir stringValue] isEqualToString:@""]))
        ) 
		 {
                [copyButton setEnabled:YES];
        }       
        
        [progressIndicator setUsesThreadedAnimation:YES];
}

The init method allocates the objects we need, just as we've discussed. We don't retain anything explicitly created using alloc, but we do have to remember to release those objects in dealloc. In awakeFromNib, we use the user defaults to determine whether a value was previously stored for each key. If it was, we set the value in the corresponding NSTextField; for the two path values, we do so only if the path still exists. Some reasons the path might not exist are that the iPod isn't mounted, or that you rearranged your files between uses of the application. If both the source and destination fields end up with values, we enable the Copy It button, since a "legal" copy can now occur. Finally, we configure the progress indicator to use threaded animation. It needs to use threaded animation because the main thread will be busy doing work while we want it to spin, so another thread needs to handle the rendering. Of course, this is all encapsulated and you don't have to handle any of the details.


- (IBAction)copyIt:(id)sender {
        NSString* fileName = [[sourceFile stringValue] 
            stringByExpandingTildeInPath];
        
        [progressIndicator startAnimation:nil];
        if ([[fileName pathExtension] isEqualToString:@"pdf"]) 
        {
                NSLog(@"Trying to open a PDF File...next time");
        }
        else if ([[fileName pathExtension] isEqualToString:@"txt"]) 
		 {
                [chunker chunkIt:fileName
                           toDir:[[destDir stringValue] 
                               stringByExpandingTildeInPath] 
                   withSeparator:[separatorValue stringValue]];
        }
        else   {   //this should never happen if users are forced
                   //to select files with the open panel and all values
                   //in typesArray are handled in this if/else chain
                NSLog(@"Unrecognized file type");
        }
        [progressIndicator stopAnimation:nil];
        
        [userDefaults setObject:[separatorValue stringValue] 
                           forKey:LAST_SEPARATOR];
}

The method copyIt: is fairly straightforward. It passes the source file, destination directory, and separator value specified in the view to chunker, which performs the grunt work, and then records the separator in the user defaults. Around the work that takes place, we start and stop the animation of the NSProgressIndicator. If you are parsing relatively small files, you won't see it spin, but for very large files, you should feel like it's time to go and get a haircut.


/* A snippet from the revised version of openFileDialog: */

if ((nil != [userDefaults objectForKey:LAST_SOURCE_FILE]) &&
     [fileManager fileExistsAtPath:
           [userDefaults objectForKey:LAST_SOURCE_FILE]]) 
{
     [openPanel setDirectory:[[userDefaults stringForKey:LAST_SOURCE_FILE]
                                   stringByDeletingLastPathComponent]];
}
else {
        [openPanel setDirectory:[@"~" stringByExpandingTildeInPath]]; 
}
/* More Code Here*/
[userDefaults setObject:[destDir stringValue] 
                 forKey:LAST_DEST_DIR];
[userDefaults setObject:[sourceFile stringValue] 
                 forKey:LAST_SOURCE_FILE];
				 

The method openFileDialog: is almost the same. The differences reflect some sanity checks that are necessary, since we're trying to be helpful to the user and keep track of his or her last actions. You should realize by now that Objective-C code is incredibly readable, and you can figure out what's going on without any explanation. In short, if the last file the user copied still exists, we know that the directory it resides in exists. Therefore, the directory containing that file can be safely opened. Otherwise, we start navigation in the user's home directory as a reasonable alternative. As the method ends, it updates the user default values.


- (void)dealloc {
	[userDefaults synchronize];
	
	[chunker release];
	[typesArray release];
	[userDefaults release];
	[fileManager release];
	[super dealloc];
}

The method dealloc is a straight-shooter, as well. The only interesting thing here is the [userDefaults synchronize] line. Periodically the user defaults do synchronize, but we want to ensure this happens in case the application closes in between synchronizations.

Final Thoughts

Come back next time ready to cross the Cocoa-Java bridge and add in PDF support. We'll also discuss the many options available to you if you'd like to further enhance this application. Some enhancements that come to mind include reading files from FTP servers or Internet locations, specifying regular expressions as separator values for books that aren't broken into chapters with keywords, and unmounting the iPod once a copy is complete. If you're not already, you'll soon be a very well-read person.

Matthew Russell is a computer scientist from middle Tennessee and serves Digital Reasoning Systems as the Director of Advanced Technology. Hacking and writing are two activities essential to his renaissance man regimen.


Copyright © 2009 O'Reilly Media, Inc.