MacDevCenter    
 Published on MacDevCenter (http://www.macdevcenter.com/)


Programming With Cocoa

Strings in Cocoa: Part I

06/29/2001

In the next two columns we will delve into the fundamentals of the classes NSString and NSMutableString; these two classes make up the majority of Cocoa's string-handling ability.

We'll start today by dealing with the various ways to create strings, and the basic manipulation methods that allow us to extract substrings, and search and compare strings too. However, before I get into that, I want to start with a mini-lesson about another concept of object-oriented programming: class clusters.

Another lesson in OOP

The NSString and NSMutableString classes are prime examples of the use of class clusters in Cocoa. A class cluster is a way of grouping a number of closely related private classes behind a small number of public abstract classes. In this scheme, the complicated mess of interfaces for the various concrete classes is kept private, while the abstract classes' interface is kept public. Here, NSString and NSMutableString are the abstract classes with which you program. Cocoa handles the rest beneath the surface.

When you create an instance of NSString, you don't actually have an object that is exactly an NSString. Rather, you have an object that is an instance of a highly specialized type of string. You may have many of these specialized strings whose implementations vary based on specific features and behaviors.

For example, you might have a special string class that implements only C-type strings, and another string class that implements strings with UTF-8 encoding. The reason you create separate, specialized, string class implementations is in the interest of efficiency. It is much more resource-savvy to have low-overhead classes with very specific functionality than one aggregate class that can do everything.

Now you can imagine that the number of classes could quickly grow if you continue along this course, leaving you with a complicated set of classes to worry about and organize. So, what a class cluster accomplishes is to hide all those specialized classes from you, the developer, and simply provide a general, abstract class, thereby simplifying the interface without sacrificing functionality or leanness of implementation. You get the best of both worlds with class clusters.

In this case, NSString and its subclass NSMutableString are the abstract public classes of their particular class cluster. This works because all strings, however they are implemented, share the same behaviors, and those behaviors are exactly the methods that the abstract classes declare. So when you create a new string instance, there is some work going on behind the curtain to choose the class that would best suit your needs.
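To make the idea concrete, here is a minimal sketch that asks a string what it really is at run time. The function names behavesAsString and concreteClassName are my own, invented for illustration; the calls they make (isKindOfClass: and NSStringFromClass) are standard Foundation.

```objc
#import <Foundation/Foundation.h>

// Returns YES when the object answers to the public NSString interface,
// whatever private concrete class actually backs it.
BOOL behavesAsString(id object) {
    return [object isKindOfClass:[NSString class]];
}

// Returns the name of the concrete class behind a string. The name is an
// implementation detail of Foundation and may differ between systems
// and releases, so it is for inspection only.
NSString *concreteClassName(NSString *string) {
    return NSStringFromClass([string class]);
}
```

If you log concreteClassName for a constant string and for a mutable string, you will most likely see two different private class names, yet behavesAsString returns "yes" for both. That is the class cluster doing its job: one public face, several hidden implementations.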

Other examples of class clusters in the Foundation Framework are the classes NSArray and NSNumber. When we get to NSArray, I will talk more about class clusters, in particular some subtleties that come into play when you subclass the abstract classes that are the public faces of class clusters.

Now let's move on to NSString, and see what we can make of it.

The fundamentals of strings

NSString provides you with a plethora of methods for creating and initializing new strings. One technique that you will likely employ quite frequently is the creation of static, or constant, strings. These are NSString instances that are compiled into the executable, which means they are always with us and are never removed from memory. Objective-C provides us with a great syntax for creating these constant strings, which goes as follows:

NSString *aString = @"Hello, again.";

Strings of the @"..." construct are truly NSString objects. Thus, any time we have a method that takes an NSString argument, we can use @"..." strings as we would any object variable.

I have included with this column a small application that is simply a button and a text field, which serves as a string tester for your use. The way I use it is to put any of the string operations we talk about in this column and the next into the button's action method. I set up an example in the code; feel free to muck around with it all you want.

String objects as arrays

Formally, NSString is a class that supports the creation and management of immutable arrays of Unicode characters. Immutable means that once we make a string, it cannot be changed. The NSString subclass NSMutableString provides an interface for the creation and manipulation of mutable strings, those whose contents we can change after they have been created. Because of this, we will find that working with strings is much like working with arrays.

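Think back to the way strings are handled in C; they too are nothing more than character arrays. This is evident in the syntax, where C strings are commonly declared in one of the following two ways. The first declares a character array initialized from the literal; the second declares a pointer to the first character of the literal, whose contents should be treated as read-only. Either one can be passed wherever a C string is expected: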

char aCString[] = "This is a C string, as well as an array";
char *anotherCString = "And here's another one.";

So you see, fundamentally, strings in Cocoa are very closely related to strings in C. The difference is in the way we interact with them, which is a direct consequence of Cocoa being object oriented. NSString provides us with many higher level methods of interaction and behaviors, whereas in C we are stuck with the primitive elements of the language that allow us to access characters at some particular index, or to find the length of the array.

In fact, NSString has two primitive methods upon which the higher-level methods are based. They are -length and -characterAtIndex:, which we easily associate with our knowledge about C strings and arrays. (As an aside -- and I'll talk about this in more detail in a later column -- when you subclass an abstract class, which is the interface to a class cluster, the primitive methods of that class are the only ones you must override for the subclass to function normally.)
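To show how much can be built on those two primitives alone, here is a small sketch of a higher-level operation written using nothing but -length and -characterAtIndex:. The function name countOccurrences is hypothetical; it is not part of NSString.

```objc
#import <Foundation/Foundation.h>

// Counts how many times `target` occurs in `string`, using only the two
// primitive methods -length and -characterAtIndex:.
NSUInteger countOccurrences(NSString *string, unichar target) {
    NSUInteger count = 0;
    NSUInteger length = [string length];
    for (NSUInteger i = 0; i < length; i++) {
        if ([string characterAtIndex:i] == target) {
            count++;
        }
    }
    return count;
}
```

Notice that this is exactly the loop you would write over a C array. The difference is that it works on any member of the string class cluster, because every concrete class is obliged to supply the two primitives.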

I digress; back to some more practical matters.

A cadre of methods

As I mentioned before, the easiest and most convenient way to create a string -- if you know ahead of time what the contents will be -- is the @"..." construct. In the more traditional fashion of Objective-C object creation, you can create a new string using a number of init... methods. Of course, you can create a blank string simply by doing:

NSString *aBlankString = [[NSString alloc] init];

But why waste time? A newly allocated string is easily given some meat with the initWithString: method, which essentially copies an existing string (the argument, which is often of the @"..." nature) into a newly allocated instance of NSString (the receiver). This works in the following manner:

NSString *aString = @"Hello, again.";
NSString *anotherString = [[NSString alloc] initWithString:aString];

Equivalently, we could have shortened this:

NSString *anotherString = [[NSString alloc] initWithString:@"Hello, again."];

Remember from the third column that creating new objects requires two steps: allocating the necessary memory by calling the + alloc class method, and then initializing it to some new value.

We can also create new string instances from standard C strings (which, as we just learned, are arrays of byte-sized chars) with the - initWithCString: method, shown here:

char aCString[] = "Hello again.";
NSString *aCocoaString = [[NSString alloc] initWithCString:aCString];

Remember, we could have also defined our C string using the pointer syntax of arrays:

char *aCString = "Hello again.";

I stress this duality only because the bracket-array syntax is more familiar to most, yet the pointer-array syntax is the standard way of referring to C strings in the NSString class reference.
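One thing a plain C string does not carry with it is its encoding. As a sketch of how to be explicit about it, the following uses stringWithCString:encoding: and UTF8String, which are NSString methods from SDKs released after this column was written (recent SDKs mark the encoding-less initWithCString: as deprecated). The helper names stringFromUTF8CString and roundTripsThroughNSString are my own, and the choice of UTF-8 is an assumption made for the example.

```objc
#import <Foundation/Foundation.h>
#include <string.h>

// Builds an NSString from a C string, naming the encoding explicitly.
// Returns nil if the bytes are not valid UTF-8.
NSString *stringFromUTF8CString(const char *cString) {
    return [NSString stringWithCString:cString encoding:NSUTF8StringEncoding];
}

// Returns YES if converting the NSString back to UTF-8 bytes reproduces
// the original C string exactly.
BOOL roundTripsThroughNSString(const char *cString) {
    NSString *cocoaString = stringFromUTF8CString(cString);
    if (cocoaString == nil) {
        return NO;
    }
    return strcmp([cocoaString UTF8String], cString) == 0;
}
```

The parameter is declared as const char *, the pointer form discussed above, so both the bracket-array and the pointer declarations can be passed to it unchanged.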

It is also possible to create formatted strings. The formatting I refer to here is the standard C formatting that is used in the C printf statement. Formatted strings allow you to interject variable values into a string by using special placeholders. These placeholders, indicated by a percent sign (%) and a character, specify the C data type (int, double, float, char) of the variable whose value is to be displayed in that place. The format string is then followed by a series of arguments that are the variables to put in the respective placeholders. For example, we might have the following formatted string that creates a string with two variables:

int n = 3;
int m = 4;
NSString *aFormattedString = [[NSString alloc] initWithFormat:@"Hello, again, %i times over. The next one will be %i.", n, m];

If we were to display this in a text field, it would look like this:

Hello, again, 3 times over. The next one will be 4.

All the rules for creating formatted C strings in the printf statement apply here. For more information on the formatting options, check out the man page for printf by typing man printf in a terminal window.
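Here is a slightly larger sketch that mixes several placeholders in one format. The function describeOrder and its parameters are made up for the example. Besides the printf placeholders, it uses %@, a placeholder specific to Cocoa format strings that inserts the description of an object, here another NSString.

```objc
#import <Foundation/Foundation.h>

// Formats an item count and a price the way printf would, plus the
// Cocoa-specific %@ placeholder, which inserts an object's description.
NSString *describeOrder(NSString *item, int quantity, double unitPrice) {
    return [NSString stringWithFormat:@"%d x %@ at %.2f each = %.2f",
            quantity, item, unitPrice, quantity * unitPrice];
}
```

Calling describeOrder(@"widget", 3, 1.5) gives the string "3 x widget at 1.50 each = 4.50". As with printf, it is up to us to make sure each argument matches the type its placeholder expects.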

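Here we saw our first example of a method that takes a variable number of arguments. We learned previously that each fixed argument in a method name is introduced by a ":". The additional arguments, by contrast, simply follow the last fixed argument as a comma-separated list, one for each placeholder in the format string.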

I also want to tell you about creating strings from a text file. The NSString method that accomplishes this is initWithContentsOfFile:. The argument of this method is an NSString object indicating the absolute path of the file. (In the next column, I'll tell you about some NSString methods that help you manipulate path strings.) So, if we wanted to create a string that contains the contents of a file in my home directory, we would do the following:

NSString *path = @"/Users/mike/textFile.txt";
NSString *contentsOfFile = [[NSString alloc] initWithContentsOfFile:path];

This method interprets the contents of the file using the default C string encoding. If the file cannot be read, the method returns nil.

Finally, I need to say something about the class methods of NSString that allow us to create temporary strings. When I speak of temporary strings, I mean strings that are removed from memory shortly after they are created. We'll talk about this more when we get to memory management. In short, if you want the string object to stick around indefinitely, use the alloc/init... two-step object creation procedure. Otherwise, if you just temporarily need to create a string to copy somewhere else, one of the class methods is what you need.

It is often useful or necessary to create a string that will be used immediately and then discarded, for example, a string that will immediately be displayed in a text field. (This will all make more sense when we talk about memory management.) In this case you can use any of the +string... class methods of NSString. The majority of Cocoa classes follow the convention that temporary objects of that class are created using class methods that begin with the type of object being made (e.g. + stringWith...). Most of the init... methods have a matching class method to create temporary strings. So, if we wanted to create a temporary string from a C string, we would use the + stringWithCString: method:

[textField setStringValue:[NSString stringWithCString:"Hello again, and again"]];

Writing strings to a file

We saw above how we could make a new string from a plain text file, and we'll see now that the reverse is also possible; we can take a string and write it to a file using the - writeToFile:atomically: method. The first argument of this method is a string that is the path to the file we wish to write our string to (we'll see in the next column how we can work with paths as strings).

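The second argument, atomically:, is a boolean value, so it is either "yes" or "no". If the atomically flag is set to "yes", the string is first written to a temporary file, and that file is renamed to path only after the write has completed. This guarantees that an existing file at path is either replaced by the complete new contents or left untouched; it is never left half-written. If the flag is set to "no", the string is written directly to path. The method returns "yes" if the write succeeded and "no" otherwise. We can use this method in the following way: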

NSString *path = @"/Users/mike/textFile.txt";
NSString *contentsOfFile = @"This string will end up on disk in the next line of code.";
[contentsOfFile writeToFile:path atomically:YES];

What we did was first create a string that is the path to the file we wish to write to, and then use that when we send the writeToFile:atomically: message to the string we wish to put on disk.
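Since both directions can fail (a missing directory, a lack of permissions), it is worth checking the results. The sketch below writes a string and reads it back. It uses the variants writeToFile:atomically:encoding:error: and stringWithContentsOfFile:encoding:error:, which come from SDKs released after this column was written and take an explicit encoding and an NSError. The function name writeAndReadBack is hypothetical, UTF-8 is an assumed choice of encoding, and the final check uses isEqualToString:, which the next section covers.

```objc
#import <Foundation/Foundation.h>

// Writes `contents` to `path` atomically, reads the file back, and returns
// YES only if both steps succeed and the text survives unchanged.
BOOL writeAndReadBack(NSString *contents, NSString *path) {
    NSError *error = nil;
    BOOL written = [contents writeToFile:path
                              atomically:YES
                                encoding:NSUTF8StringEncoding
                                   error:&error];
    if (!written) {
        NSLog(@"Write failed: %@", error);
        return NO;
    }
    NSString *readBack = [NSString stringWithContentsOfFile:path
                                                   encoding:NSUTF8StringEncoding
                                                      error:&error];
    if (readBack == nil) {
        NSLog(@"Read failed: %@", error);
        return NO;
    }
    return [readBack isEqualToString:contents];
}
```

Passing a path inside a directory that does not exist makes the write step fail, and the function reports "no" instead of silently carrying on.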

Comparing strings

I imagine it won't be long in your programming before you want to compare some strings for equality. In C you learned that you can do so by using the function strcmp(string1, string2):

Editor's Note -- The example below was updated and corrected on 7/05/01. Thanks to our readers for helping us present as accurate information as possible.
char string1[] = "Yo";
char string2[] = "Yo";
if ( strcmp(string1, string2) == 0) {
    // do the following code
}

And the conditional would evaluate to "true", executing the code within the braces of the if statement. In Cocoa, the situation is similar. Remember, whenever we declare a string

NSString *aString;

the variable aString does not actually contain the string object -- it is a pointer to some string object in memory. Another name we had for this type of variable was an object identifier, because it identifies an object in memory rather than holding the object itself. This technical detail has some very important implications, as a pointer is an address of a location in memory. Consider the following situation where we try to compare two string objects using ==:

NSString *string1 = @"A String";
NSString *string2 = @"A String";

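Here we have created two string variables with the same value. Whether they refer to two separate objects is not something the language promises: compilers commonly merge identical string constants into a single object, while strings built at run time (read from a file, typed by the user, assembled with a format) are separate objects even when their contents match. Now, if we used the C equality operator on them,

BOOL result = string1 == string2;

the statement would not compare the contents of the strings at all -- it would compare the values of their memory addresses. For two distinct objects holding the same characters, it evaluates to "no" (Objective-C's "false"). That's right, "no". Yes, the strings look equal, but distinct objects exist in unique memory locations, so one address differs from the other. With the two identical literals above, the statement may well evaluate to "yes", but only because the compiler happened to store one shared constant, not because the strings were compared. Either way, == answers the wrong question.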

Now, if we had done the following,

NSString *string1 = @"A String.";
NSString *string2 = string1;
BOOL result = string1 == string2;

the equality operator would indeed return "yes", because both string1 and string2 point to the same object in memory -- they have the same address. The line NSString *string2 = string1; accomplished this by taking the address of the object that string1 points to and assigning it to string2 as well. So the addresses are now equal, as was revealed in the equality.

Now, let me make the point clear that if the data type of a variable is int, double, char, or float, the equality operator will work as expected, because these variable data types are not pointers, they actually contain the value of the data. Only object variables (and all pointer variables) fall prey to this phenomenon.

So, I hope you're convinced now that the ways of old just don't work with object-oriented programming. I want to now show you how we do comparisons and make equality judgments.

Whenever you want to check the equality of objects, you must invoke special comparison methods in the respective classes. In NSString, the most straightforward of these is the method - isEqualToString:, which takes an NSString object as its argument and returns a boolean value indicating whether the receiver string is equivalent to the argument string. Now, we can truly test the equivalency of our strings:

NSString *string1 = @"A String.";
NSString *string2 = @"A String.";
BOOL result = [string1 isEqualToString:string2];

The statement really will evaluate to "yes", because the values of the objects that string1 and string2 point to are equal. A more general method, - compare:, allows you to determine if a string is equal to the receiver of this method, or whether the string would come before or after the receiver string (as in the ordering of a dictionary, lexical ordering). The return type of compare: is a custom Cocoa data type called NSComparisonResult, which has three possible values: NSOrderedAscending, NSOrderedSame, and NSOrderedDescending (these are just constants defined in the Foundation Framework equal to the integers -1, 0, and 1 respectively). So, we could use compare: in the following fashion:

NSString *string1 = @"aardvark";
NSString *string2 = @"tarsier";

BOOL result = [string1 compare:string2] == NSOrderedAscending;

Because string2 comes after string1 in the alphabet, the message to string1 will return the value NSOrderedAscending, which we compare with NSOrderedAscending using the equality operator, and get "yes" as the value of result. This is equivalent to saying string2 is greater than string1.

By substituting NSOrderedSame or NSOrderedDescending in place of NSOrderedAscending, we can check to see whether the receiver (string1) is the same as the argument (string2), or whether string2 appears sooner in the alphabet than string1.

In this system, uppercase letters are "less" than lowercase letters. So the following would evaluate to "yes":

NSString *string1 = @"Aardvark";
NSString *string2 = @"aardvark";

BOOL result = [string1 compare:string2] == NSOrderedAscending;

This is because compare: with no options orders letters essentially by their character codes, and the uppercase Latin letters have lower codes than the lowercase ones, so "Aardvark" sorts before "aardvark". If you want to compare strings without regard to case, then -caseInsensitiveCompare: is the method for you. In use, it would look like this (using the same strings as the previous example):

BOOL result = [string1 caseInsensitiveCompare:string2] == NSOrderedSame;

And result would be given the value "yes" because the statement evaluates as true.

These are some of the basic methods available to you for string comparison. If your needs are more demanding than what we covered here, take a closer look at the class documentation, which details many more string comparison methods that give you more flexibility and options.
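To tie the section together, here is a sketch with two small helpers; both names, describeComparison and compareIdentityAndContents, are invented for this example. The first turns an NSComparisonResult into a word. The second builds a second string object at run time with the same characters as the first (by way of NSMutableString's stringWithString:, which always produces a new object) and reports separately whether the two are the same object and whether they have the same contents.

```objc
#import <Foundation/Foundation.h>

// Translates an NSComparisonResult into a readable word.
NSString *describeComparison(NSComparisonResult result) {
    switch (result) {
        case NSOrderedAscending:  return @"ascending";
        case NSOrderedSame:       return @"same";
        case NSOrderedDescending: return @"descending";
    }
    return @"unknown";
}

// Builds a distinct string object with the same characters as `original`,
// then reports pointer identity (==) and content equality separately.
void compareIdentityAndContents(NSString *original,
                                BOOL *sameObject,
                                BOOL *sameContents) {
    NSString *copy = [NSMutableString stringWithString:original];
    *sameObject = (original == copy);
    *sameContents = [original isEqualToString:copy];
}
```

For any input, sameObject comes back as "no" and sameContents as "yes", which is the whole argument of this section in two lines. Likewise, describeComparison([@"aardvark" compare:@"tarsier"]) gives "ascending".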

Finding strings within strings

NSString provides some methods that allow us to search strings for substrings. All of the string search methods return a special data type defined in the Foundation Framework known as NSRange. NSRange is just a C struct with two components, a starting index and a length.

The way ranges work is like this: If we had a string with 100 characters (elements), then the range {50, 50} specifies a substring whose first element is the character at index 50 of the parent string, and which is 50 elements long, counting that first element -- that is, the last half of the parent string (remember, strings are arrays in their most fundamental form, and counting always starts from 0, so the 100 characters sit at indices 0 through 99).
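The arithmetic is easy to get wrong by one, so here is a small sketch that computes the range of the second half of any string. The function name rangeOfSecondHalf is hypothetical; NSMakeRange is the Foundation function that builds an NSRange from a location and a length.

```objc
#import <Foundation/Foundation.h>

// Returns the range covering the second half of a string. For a string of
// even length n this is {n/2, n/2}; for odd lengths the extra character
// falls in the second half.
NSRange rangeOfSecondHalf(NSString *string) {
    NSUInteger length = [string length];
    NSUInteger start = length / 2;
    return NSMakeRange(start, length - start);
}
```

For a 100-character string this returns {50, 50}. A handy check is that the location plus the length (which Foundation's NSMaxRange function computes) equals the length of the string, meaning the range ends exactly at the last character.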

In the next few examples, we will be using the following string:

NSString *theString = @"Okay, enough about ranges.";

Suppose we want to find where in the parent string the substring "about" can be found. The method we invoke is -rangeOfString:, and here is a snippet of code we could use to illustrate the way it works:

NSString *theString = @"Okay, enough about ranges";
NSString *substring = @"about";
NSRange range = [theString rangeOfString:substring];
int location = range.location;
int length = range.length;
NSString *displayString = [[NSString alloc] initWithFormat:@"Location: %i, length: %i",
location, length];
[textField setStringValue:displayString];

Note that NSRange is not a class; it is a C structure, so we do not declare the variable range with the pointer star (*) as we do for objects; its type is simply NSRange. In the previous example, the range returned by the search method tells us where in the parent string, theString, the substring can be found: location is 13, and length is just the length of the substring, 5. If the substring cannot be found in the parent string, the returned range has a location of NSNotFound and a length of zero, indicating failure.

Additionally, note how we access the elements of a C structure. NSRange is defined as the following structure:

typedef struct _NSRange {
    unsigned int location;
    unsigned int length;
} NSRange;

Recall from C that components of a struct variable are accessed using the variableName.component construct. So, in our example above, we access the location and length components of range in the same way: range.location, and range.length.
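Because a failed search still hands back a range, it pays to test for failure before using the result. Here is a minimal sketch; the function name describeLocation is my own. It compares the location against NSNotFound, the Foundation constant a search returns as the location when nothing was found.

```objc
#import <Foundation/Foundation.h>

// Describes where `substring` occurs in `parent`, or says that it does not.
NSString *describeLocation(NSString *substring, NSString *parent) {
    NSRange range = [parent rangeOfString:substring];
    if (range.location == NSNotFound) {
        return @"not found";
    }
    // The casts keep the format correct whatever integer type
    // the SDK uses for the components of NSRange.
    return [NSString stringWithFormat:@"Location: %lu, length: %lu",
            (unsigned long)range.location, (unsigned long)range.length];
}
```

With our example string, describeLocation(@"about", @"Okay, enough about ranges.") gives "Location: 13, length: 5", while searching for a word that is not there gives "not found".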

Extracting substrings from strings

Three methods that allow us to extract substrings from a parent string are:

-substringToIndex:
-substringFromIndex:
-substringWithRange:

The first method, -substringToIndex:, returns a new string which is composed of the characters from the beginning of the receiver string up to, but not including, the character at the specified index. This might be used in the following way:

NSString *aString = @"Running out of ideas for strings.";
NSString *substring = [aString substringToIndex:7];

The result of this operation would be that substring now points to the string object @"Running". The method -substringFromIndex: works in the same way, except now the substring starts at the specified index of the receiver (including the character at the index), and includes all the characters to the end of the receiver. So if we wanted to get the substring "strings." (closing period included, since it is the last character of the receiver) out of aString, we would do the following:

NSString *substring = [aString substringFromIndex:25];

Finally, we have the method which lets us arbitrarily extract a substring from anywhere within the parent string: -substringWithRange:. The argument to this method is -- as conveniently indicated by the method name (I love that about Objective-C) -- an NSRange. So, we could get the string "ideas" out of the parent string, aString, this way:

NSString *substring = [aString substringWithRange:NSMakeRange(15, 5)];

Here the range starts with the character at index 15, "i", and extends to include the next four characters, giving us a length of 5, "ideas".
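Counting characters by hand gets old quickly, so here is a sketch that lets -rangeOfString: do the counting and then uses all three extraction methods. The function name splitAroundWord is hypothetical. NSMaxRange is the Foundation function that returns the index just past the end of a range, and the result comes back in an NSArray, a class we have not covered yet; think of it for now as an ordered list of objects.

```objc
#import <Foundation/Foundation.h>

// Splits `parent` around the first occurrence of `word`, returning the text
// before it, the match itself, and the text after it. Returns nil when
// `word` does not occur in `parent`.
NSArray *splitAroundWord(NSString *parent, NSString *word) {
    NSRange range = [parent rangeOfString:word];
    if (range.location == NSNotFound) {
        return nil;
    }
    NSString *before = [parent substringToIndex:range.location];
    NSString *match  = [parent substringWithRange:range];
    NSString *after  = [parent substringFromIndex:NSMaxRange(range)];
    return [NSArray arrayWithObjects:before, match, after, nil];
}
```

Called with @"Running out of ideas for strings." and @"ideas", it returns the three pieces "Running out of ", "ideas", and " for strings.", with the spaces on either side of the word staying with their neighbors.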

Farewell

We've seen in this column just the fundamentals of working with string objects in Cocoa. Hopefully there is enough here to keep you busy, in addition to equipping you with the confidence to go and explore the more advanced methods of NSString. In the next column I will continue our discussion of strings by talking about how we work with paths, and I will also cover mutable strings and the NSMutableString class. Happy programming to you all! See you next time!

Michael Beam is a software engineer in the energy industry specializing in seismic application development on Linux with C++ and Qt. He lives in Houston, Texas with his wife and son.


Copyright © 2009 O'Reilly Media, Inc.