This is the second part of a trilogy of articles that teaches you how to make reading
electronic documents (such as books and PDF documents) on your iPod easy and
enjoyable. This installment delves into the engine of our application and adds
some user interface conveniences through NSUserDefaults. Next time, we finish by incorporating the ability to handle text extraction from PDF documents
by exploiting the Cocoa-Java bridge.
Let's briefly review the openFileDialog: method from last time and address
memory as it applies to this project. The first thing you should notice about openFileDialog: is
that it receives a pointer to the object that triggered it, the sender.
The condition [sender isEqualTo:sourceButton] effectively compares pointer
values for equality, so figuring out which button was clicked is not a difficult task.
If Source File was pressed, we configure the NSOpenPanel to disable the selection
of directories via [openPanel setCanChooseDirectories:NO] because
we want to guarantee that a file is chosen. We also limit the selection to
files ending in "txt" or "pdf" because they're the formats we'll work with
in this series.
If the Destination button was clicked, we guarantee that a directory is
chosen, because we need a location to copy the source file's pieces to after it's
parsed. The other messages we pass to the NSOpenPanel should be pretty clear
just from reading the code, which is one of the many nice things about Objective-C
syntax. You can always use Xcode's help menu if you need it.
There are a few interesting string methods in openFileDialog:. On Unix-based
systems such as Mac OS X, the tilde (~) is a shortcut for the current user's home
directory. Type echo ~ in Terminal to check it out for yourself. The Cocoa
method stringByExpandingTildeInPath does just what it sounds like. This is
nice because we often like to start navigation in the user's home directory
as a convenience to them, and this is a really simple way of making that happen.
As another convenience, we start navigation for the destination directory in /Volumes,
because the iPod is normally mounted there.
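To make the two starting points concrete, here is a minimal sketch. The helper name PRStartingDirectory is hypothetical and does not appear in the project; it only illustrates what stringByExpandingTildeInPath gives you.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of the PodReader sources): returns the
// directory where an open panel should begin navigating.
static NSString* PRStartingDirectory(BOOL forSourceFile) {
    if (forSourceFile) {
        // "~" expands to the current user's home directory.
        return [@"~" stringByExpandingTildeInPath];
    }
    // The iPod is normally mounted under /Volumes.
    return @"/Volumes";
}
```

Calling PRStartingDirectory(YES) should hand back an absolute path with no tilde left in it, which is what the open panel needs.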
The line of code if (NSOKButton == [openPanel runModalForTypes:typesArray]) opens
the file dialog window, and if the user has clicked on OK (as opposed to Cancel),
we're guaranteed to have a valid selection because we configured our NSOpenPanels
accordingly. Because NSOpenPanel can handle multiple selections, the result
is contained in an NSArray. We retrieve the selection with [selection
lastObject] because it is the only object in the array and use this
value to set the corresponding NSTextField. Finally, we enable the Copy It button
if both text fields have valid values.
A final note to take from last time is the reason why we passed the retain message
to the NSArray in the init method. In general, you should assume that objects
created with convenience constructors like arrayWithObjects: are autoreleased,
which means they sit in the current autorelease pool. Thus, if you don't explicitly retain them, they're
gone with the wind once that pool is drained, typically at the end of the current pass through the event loop. Objects you create
with alloc or new, however, belong to you and must be explicitly released with a release message,
usually in dealloc. If you're craving more information on memory or have
no idea what that just meant, check out the reference I pointed you to last
time, "Introduction
to Memory Management."
With that recapitulation, we can now proceed to build our engine. In your
Xcode project, choose File -> New File, select Objective-C class, and
click on Next. Name the class "TextChunker.m", ensure that "Also create TextChunker.h" is
checked, and click on Finish to create and add the files to our PodReader
target. Drag the TextChunker files into the Classes folder of your project.
Once that's done, replace the TextChunker files that Xcode generated with these files: TextChunker.h and TextChunker.m.
I recommend you do this through Finder, rather than through Xcode.
The header file, TextChunker.h, tells you all you need to know if you simply
want to use the TextChunker class and are not interested in how it works. The
method signature for chunkIt: specifies a source file, a destination
directory, and a value like CHAPTER or SCENE that should be used to segment
the chunks of text.
The file TextChunker.m is where all of the work happens. At the top of the
method chunkIt:, there are three constants that designate the guidelines
for chunking the text. We use the constant PR_CHUNK as a guideline to keep
the chunk sizes smaller than 4K. Technically, 4K is 4096 bytes, but we're using
some magic to link "pages" of our book together using the HTML anchor tag (this will be discussed
in a moment), so we leave an ample margin to account for it.
The constant PR_INITIAL_CUT_POINT provides a starting location to segment
a chunk if it is larger than PR_CHUNK, and PR_CUT_DELTA is used to incrementally
decrease PR_INITIAL_CUT_POINT if cutting at PR_INITIAL_CUT_POINT still
doesn't decrease the chunk size. These values will make more sense as we move on.
Specifying these values one time at the top of the method allows for easy adjustment.
For example, if you should want 2K chunks instead of 4K chunks, simply change PR_CHUNK and
you're all done.
The next few lines of code and the first while loop create and load an
NSMutableArray with chunks of text that are partitioned according to a separator
value. Notice how simple it is to load an entire text file into a string value
with [NSMutableString stringWithContentsOfFile:fileName]. The
NSString method rangeOfString: returns an NSRange, so if you're unfamiliar
with NSRange, look it up. You'll notice that it's a struct with two parts,
a location and a length. We're interested in using the location to determine
where a particular section of text in fileContents starts.
We start the substring matching from 1 instead of 0 because we want to
skip a separator that sits at the very beginning of the string, which is quite
likely to happen at some point. For example, if we started matching at 0 and segmented the following excerpt
from Shakespeare's Macbeth on the word "ACT," our code would spin in an infinite
loop because it could never shorten the string containing the text. It would repeatedly
identify the first word of the excerpt as the separator and never move on.
ACT I. SCENE I.
A desert place. Thunder and lightning.
Enter three Witches.
FIRST WITCH. When shall we three meet again?
In thunder, lightning, or in rain?
SECOND WITCH. When the hurlyburly's done,
When the battle's lost and won.
THIRD WITCH. That will be ere the set of sun.
FIRST WITCH. Where the place?
SECOND WITCH. Upon the heath.
THIRD WITCH. There to meet with Macbeth.
FIRST WITCH. I come, Graymalkin.
ALL. Paddock calls. Anon!
Fair is foul, and foul is fair.
Hover through the fog and filthy air. Exeunt.
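Here is a minimal sketch of that splitting loop. It is not the code from TextChunker.m; the function name PRSplitOnSeparator is made up for illustration, and it only shows why the search range starts at index 1.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper, not the article's chunkIt: code. Splits text into
// sections, each of which (after the first) begins with the separator.
static NSArray* PRSplitOnSeparator(NSString* text, NSString* separator) {
    NSMutableArray* sections = [NSMutableArray array];
    NSMutableString* remaining = [NSMutableString stringWithString:text];

    while ([remaining length] > 1 && [separator length] > 0) {
        // Search from index 1, not 0, so a separator that opens the string
        // is skipped. Starting at 0 would match at location 0 every time
        // and the string would never get any shorter.
        NSRange searchRange = NSMakeRange(1, [remaining length] - 1);
        NSRange found = [remaining rangeOfString:separator
                                         options:0
                                           range:searchRange];
        if (found.location == NSNotFound) {
            break; // no more separators; the rest is the final section
        }
        // Everything before the match becomes one section.
        [sections addObject:[remaining substringToIndex:found.location]];
        [remaining deleteCharactersInRange:NSMakeRange(0, found.location)];
    }
    if ([remaining length] > 0) {
        [sections addObject:[NSString stringWithString:remaining]];
    }
    return sections;
}
```

Feeding it a string that opens with the separator, as the Macbeth excerpt does with "ACT," leaves that first occurrence at the head of the first section instead of cutting an empty piece forever.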
Another detail of interest is that we use NSMutableArray and NSMutableString
as opposed to NSArray and NSString. Cocoa distinguishes
between mutable and non-mutable classes. In general, there's a performance
overhead for handling mutable objects for reasons that deal with memory allocation
and management. Non-mutable objects, on the other hand, do not require this
overhead, so use them whenever possible.
An interesting implementation detail that you might like to know is that most
mutable objects inherit from non-mutable parents. Looking up NSMutableArray
in Xcode's help shows you that it inherits from NSArray. We use the mutable
version of these objects, because the next block of code is very likely to
alter both the string values in the array as well as the array itself.
A final note to take from this first block of code regards some of the many NSString
methods: length, paragraphRangeForRange, lineRangeForRange, and stringByTrimmingCharactersInSet. They're
all easily found in Xcode's help, which you should be a big fan of by this
point, so check them out. There's no better way to learn what tools you have
available to you than to dig through the API.
The next block of code resizes the chunks of text as needed. The high-level algorithm is like so:
Walk through the array. If a chunk is larger than PR_CHUNK, create two chunks, keeping the first one as close to PR_CHUNK as possible.
Ideally, the logic is designed to produce neatly organized sections of text,
but it also attempts to maximize the reading time on each "page" of your iPod by
filling up as much of the PR_CHUNK size as possible.
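The splitting half of that walk can be sketched as follows. This is an illustration under assumptions, not the code in TextChunker.m: the function name is hypothetical, the three limits are passed in as parameters because the actual constant values aren't given here, and sizes are counted in characters rather than bytes.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper, not the article's code. Any chunk longer than
// chunkLimit characters is cut at the end of a paragraph. We probe at
// initialCutPoint and back off by cutDelta until the first piece fits.
static NSArray* PRResizeChunks(NSArray* chunks, NSUInteger chunkLimit,
                               NSUInteger initialCutPoint, NSUInteger cutDelta) {
    NSMutableArray* pending = [NSMutableArray arrayWithArray:chunks];
    NSMutableArray* result = [NSMutableArray array];
    if (chunkLimit == 0) {
        return pending; // a zero limit can't be satisfied; leave input alone
    }
    while ([pending count] > 0) {
        NSString* chunk = [pending objectAtIndex:0];
        [pending removeObjectAtIndex:0];
        if ([chunk length] <= chunkLimit) {
            [result addObject:chunk]; // already small enough
            continue;
        }
        NSUInteger probe = MIN(initialCutPoint, [chunk length] - 1);
        // Fallback: a hard cut at the limit if no paragraph end fits.
        // (A hard cut may land mid-word; this sketch accepts that.)
        NSUInteger cut = chunkLimit;
        for (;;) {
            NSRange paragraph =
                [chunk paragraphRangeForRange:NSMakeRange(probe, 0)];
            if (NSMaxRange(paragraph) <= chunkLimit) {
                cut = NSMaxRange(paragraph); // cut right after this paragraph
                break;
            }
            if (cutDelta == 0 || probe < cutDelta) {
                break; // can't back off any further
            }
            probe -= cutDelta;
        }
        [result addObject:[chunk substringToIndex:cut]];
        // The remainder goes back to the front so it is checked again.
        [pending insertObject:[chunk substringFromIndex:cut] atIndex:0];
    }
    return result;
}
```

Because the remainder is pushed back onto the front of the list, a very long chapter keeps getting cut until every piece fits. Merging short neighbors to fill up a "page" is left out of this sketch.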
The final block of code writes each of the chunks to its own file. Files in the iPod's Notes directory will appear in alphanumeric order, so naming the files isn't as simple as just appending a number to the end of their file names. For example, if you parse Oscar Wilde's The Picture of Dorian Gray from Project Gutenberg, you'll end up with 122 different files. Naming the files by simply appending a number to the end of the original file name dgray results in these files appearing in this order on your iPod: dgray1, dgray10, dgray11, ... dgray19, dgray2, dgray20, ... and so on. Clearly, this doesn't make navigation convenient. Therefore, we must pad the numeric value on the end of dgray with the correct number of leading zeros so that "pages" of text appear in sequential order.
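A minimal sketch of the padding idea follows. The function name PRPaddedFileName is hypothetical, and the real TextChunker.m may build its names differently.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper, not the article's code. Pads index with leading
// zeros so that it is as wide as the largest index (total).
static NSString* PRPaddedFileName(NSString* baseName, NSUInteger index,
                                  NSUInteger total) {
    NSString* digits =
        [NSString stringWithFormat:@"%lu", (unsigned long)index];
    NSUInteger width =
        [[NSString stringWithFormat:@"%lu", (unsigned long)total] length];

    NSMutableString* name = [NSMutableString stringWithString:baseName];
    NSUInteger i;
    for (i = [digits length]; i < width; i++) {
        [name appendString:@"0"]; // one zero per missing digit
    }
    [name appendString:digits];
    return name;
}
```

With 122 files, the first one comes out as dgray001 and the tenth as dgray010, so a plain alphanumeric sort keeps them in reading order.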
The last interesting thing that happens is to append a hyperlink to the next "page" using a simple HTML anchor tag. The hyperlink appears in dark underlined letters. Once you're at the bottom of a "page," just press the button in the center of your scroll wheel to proceed to the next "page." Press the Menu button to go back. Question: could reading on the iPod get any easier?
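Appending the link might look something like the sketch below. The function name, the "Next page" link text, and the exact markup are assumptions for illustration; check TextChunker.m for what the project actually writes.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper, not the article's code. Appends an anchor tag that
// points at the next note file. Pass nil for the last page.
static NSString* PRPageWithNextLink(NSString* pageText,
                                    NSString* nextFileName) {
    if (nextFileName == nil) {
        return pageText; // nothing follows the last page
    }
    // The link text "Next page" is an assumption, not the article's wording.
    return [NSString stringWithFormat:@"%@\n<a href=\"%@\">Next page</a>",
                                      pageText, nextFileName];
}
```

The extra characters in the link are the reason PR_CHUNK has to leave some margin below the 4K limit.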
At this point, you could allocate a TextChunker to do your bidding when the Copy
It button is clicked, and be done with it. But we'll quickly throw in a few
more niceties. One such thing is to remember the last used
source file and destination directory. Users will typically store their electronic
books in the same location, and the iPod's location doesn't change. Small conveniences
like these are icing on the cake, but significantly enhance usability.
We'll set the values of the NSTextFields in our application by storing their
previous values with the NSUserDefaults class: simply record these values each
time the user changes them, and load in their previous values each time the
application starts up. If you'd like additional information on NSUserDefaults,
you can check out "Mac
OS X's Preferences System (and More!)" here on Mac DevCenter, or Xcode's
help. One thing you'll want to do in your project is to open the Info Window
on PodReader, which is located under Targets, and input an appropriate identifier
value and version, which you'll find in the Properties tab. The identifier
is formed like a Java package structure, so you can enter something like "com.yourname.PodReader" and
it will work fine. Once you've run the application, you can go to ~/Library/Preferences if
you'd like to investigate the contents of this file, which is simply a special
kind of XML file called a plist.
The Info Window for the PodReader target. Set the Identifier and Version
fields here.
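Stripped of the user interface, the record-and-restore idea looks roughly like this. The key name and both function names are hypothetical; the project's own keys appear in AppController.m.

```objc
#import <Foundation/Foundation.h>

// Hypothetical key, used only by this sketch.
static NSString* const PRExampleLastSourceKey = @"ExampleLastSourceFile";

// Record the path each time the user picks a new source file.
static void PRRememberSourceFile(NSString* path) {
    NSUserDefaults* defaults = [NSUserDefaults standardUserDefaults];
    [defaults setObject:path forKey:PRExampleLastSourceKey];
}

// At startup, hand back the stored path only if it still exists on disk.
static NSString* PRRecallSourceFile(void) {
    NSUserDefaults* defaults = [NSUserDefaults standardUserDefaults];
    NSString* path = [defaults stringForKey:PRExampleLastSourceKey];
    if (path != nil &&
        [[NSFileManager defaultManager] fileExistsAtPath:path]) {
        return path;
    }
    return nil; // never stored, or the file has since moved
}
```

The existence check matters because a remembered path can go stale, for instance when the iPod isn't mounted.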
In order to set these values when the application starts, we'll need to implement
the awakeFromNib method of the NSNibAwaking informal protocol, which you can get more info on by looking up NSNibAwaking in
Xcode's help. Essentially, all objects like our controls in the view are guaranteed
to have been initialized by the time this method is called, so we can safely
set their values in awakeFromNib.
With all of that behind us, replace your AppController files
with these: AppController.h and AppController.m. Then build and run your project. Try out a book from Project
Gutenberg. Its books have a rather long preamble of "small print," and I find
that removing that portion results in better overall "chunking" into logical
sections. For homework, you might try to use some of NSString's methods to
automate that process.
Use the Notes folder of your iPod as the destination directory, but ensure
that your iPod is configured for disk storage by using these
instructions if you haven't already, or else you won't be able to get to
the Notes folder. As a sanity check, you should be able to type cd /Volumes/your
iPod's name/Notes into Terminal and be in your Notes folder without any
problems. Once you've chunked a few books, come back and we'll review the interesting
parts of the code. Be aware that your Notes folder can hold, at a maximum,
1,000 note files. I recommend creating subfolders for each document to better
organize your readings.
Shots of your PodReader in action. Simply click on the link to go to the
next "page."
/* AppController.h */
#import <Cocoa/Cocoa.h>
#import "TextChunker.h"

@interface AppController : NSObject {
    IBOutlet NSTextField* destDir;
    IBOutlet NSTextField* sourceFile;
    IBOutlet NSButton* sourceButton;
    IBOutlet NSButton* destButton;
    IBOutlet NSButton* copyButton;
    IBOutlet NSTextField* separatorValue;
    IBOutlet NSProgressIndicator* progressIndicator;
    TextChunker* chunker;
    NSArray* typesArray;
    NSUserDefaults* userDefaults;
    NSFileManager* fileManager;
}
- (id)init;
- (void)awakeFromNib;
- (IBAction)copyIt:(id)sender;
- (IBAction)openFileDialog:(id)sender;
- (void)dealloc;
@end
Here in the AppController header file, we #import the TextChunker header file
and declare that we'll be using it, along with the NSUserDefaults class and
the NSFileManager class. We also indicate that we'll implement the awakeFromNib method
and the copyIt: action. In AppController.m, we define these methods in the
same order for consistency and ease of looking things up.
/* AppController.m */
#import "AppController.h"

NSString* LAST_DEST_DIR = @"LastDestDir";
NSString* LAST_SOURCE_FILE = @"LastSourceFile";
NSString* LAST_SEPARATOR = @"LastSeparator";

@implementation AppController
- (id)init {
    self = [super init];
    if (self) {
        typesArray = [[NSArray arrayWithObjects: @"txt", @"pdf", nil] retain];
        chunker = [[TextChunker alloc] init];
        fileManager = [[NSFileManager alloc] init];
        userDefaults = [NSUserDefaults standardUserDefaults];
    }
    return self;
}

- (void)awakeFromNib {
    //the text fields are set up to be blank by default in the nib
    if ((nil != [userDefaults objectForKey:LAST_DEST_DIR]) &&
        [fileManager fileExistsAtPath:
            [userDefaults objectForKey:LAST_DEST_DIR]])
    {
        [destDir setStringValue:
            [userDefaults stringForKey:LAST_DEST_DIR]];
    }
    if ((nil != [userDefaults objectForKey:LAST_SOURCE_FILE]) &&
        [fileManager fileExistsAtPath:
            [userDefaults objectForKey:LAST_SOURCE_FILE]])
    {
        [sourceFile setStringValue:
            [userDefaults stringForKey:LAST_SOURCE_FILE]];
    }
    if (nil != [userDefaults objectForKey:LAST_SEPARATOR])
    {
        [separatorValue setStringValue:
            [userDefaults stringForKey:LAST_SEPARATOR]];
    }
    //the copy button is set to be disabled by default in the
    //nib and the text fields are set to be blank by default,
    //so enable the button only when both fields are filled in
    if (![[sourceFile stringValue] isEqualToString:@""] &&
        ![[destDir stringValue] isEqualToString:@""])
    {
        [copyButton setEnabled:YES];
    }
    [progressIndicator setUsesThreadedAnimation:YES];
}
The init method allocates objects just as we've discussed. We
don't retain anything explicitly created using alloc, but we do have to remember
to release them in dealloc. In awakeFromNib, we use the user defaults
to determine if a value was previously stored for a key. If it was, we set
the value in the corresponding NSTextField if the value is a path and still
exists. Some reasons the path might not exist are if the iPod isn't mounted, or
if you did some rearranging of your files in between times you used the application.
If sufficient conditions are met, we enable the Copy It button, since a "legal" copy
can now occur. Finally, we configure the progress indicator to use threaded
animation. It needs to use threaded animation because the main thread will
be busy doing work while we want it to spin, so another thread needs to handle
the rendering. Of course, this is all encapsulated and you don't have to handle
any of the details.
- (IBAction)copyIt:(id)sender {
    NSString* fileName = [[sourceFile stringValue]
        stringByExpandingTildeInPath];
    [progressIndicator startAnimation:nil];
    if ([[fileName pathExtension] isEqualToString:@"pdf"])
    {
        NSLog(@"Trying to open a PDF File...next time");
    }
    else if ([[fileName pathExtension] isEqualToString:@"txt"])
    {
        [chunker chunkIt:fileName
                   toDir:[[destDir stringValue]
                             stringByExpandingTildeInPath]
           withSeparator:[separatorValue stringValue]];
    }
    else { //this should never happen if users are forced
           //to select files with the open panel and all values
           //in typesArray are handled in this if/else chain
        NSLog(@"Unrecognized file type");
    }
    [progressIndicator stopAnimation:nil];
    [userDefaults setObject:[separatorValue stringValue]
                     forKey:LAST_SEPARATOR];
}
The method copyIt: is fairly straightforward. It passes the source file, destination
directory, and separator value specified in the view to chunker, which performs
the grunt work. Around the work that takes place, we start and stop the animation
of the NSProgressIndicator. If you are parsing relatively small files, you
won't see it spin, but for very large files, you should feel like it's time
to go and get a haircut.
/* A snippet from the revised version of openFileDialog: */
if ((nil != [userDefaults objectForKey:LAST_SOURCE_FILE]) &&
    [fileManager fileExistsAtPath:
        [userDefaults objectForKey:LAST_SOURCE_FILE]])
{
    [openPanel setDirectory:[[userDefaults stringForKey:LAST_SOURCE_FILE]
        stringByDeletingLastPathComponent]];
}
else {
    [openPanel setDirectory:[@"~" stringByExpandingTildeInPath]];
}
/* More Code Here */
[userDefaults setObject:[destDir stringValue]
                 forKey:LAST_DEST_DIR];
[userDefaults setObject:[sourceFile stringValue]
                 forKey:LAST_SOURCE_FILE];
The method openFileDialog: is almost the same. The differences reflect some
sanity checks that are necessary, since we're trying to be helpful to the user
and keep track of his or her last actions. You should realize by now that Objective-C
code is incredibly readable, and you can figure out what's going on without
any explanation. In short, if the last file the user copied still exists, we
know that the directory it resides in exists. Therefore, the directory containing
that file can be safely opened. Otherwise, we start navigation in the user's home
directory as a reasonable alternative. As the method ends, it updates the user
default values.
- (void)dealloc {
    [userDefaults synchronize];
    [chunker release];
    [typesArray release];
    //userDefaults came from standardUserDefaults and was never
    //retained, so we don't own it and must not release it
    [fileManager release];
    [super dealloc];
}
The method dealloc is a straight-shooter, as well. The only interesting
thing here is the [userDefaults synchronize] line. Periodically
the user defaults do synchronize, but we want to ensure this happens in case
the application closes in between synchronizations.
Come back next time ready to cross the Cocoa-Java bridge and add in PDF support. We'll also discuss the many options available to you if you'd like to further enhance this application. Some enhancements that come to mind include reading files from FTP servers or Internet locations, specifying regular expressions as separator values for books that aren't broken into chapters with keywords, and unmounting the iPod once a copy is complete. If you're not already, you'll soon be a very well-read person.
Matthew Russell is a computer scientist from middle Tennessee and serves Digital Reasoning Systems as the Director of Advanced Technology. Hacking and writing are two activities essential to his renaissance man regimen.
Copyright © 2009 O'Reilly Media, Inc.