macdevcenter.com

MacFUSE: New Frontiers in File Systems

by Scott Knaster
03/06/2007

If you're a typical Mac programmer or user, chances are your main concern with files is reading from and writing to them, opening and saving them. You probably haven't thought much about adding support for other file systems or actually implementing your own file system from scratch—why on earth would you want to do that?

Well, no matter what kind of Mac user you are, there's a new development in the somewhat arcane world of file systems that's bound to be interesting to you: MacFUSE. Simply put, MacFUSE takes something that was incredibly hard—adding new file systems to Mac OS X—and makes it much, much easier. In this article, I'll explain why file system support is generally hard, how MacFUSE makes it easier, and why you should care.

So what's all this fuss about file systems?

A traditional file system provides a way of organizing information on your disks so that programs can access it. The standard Mac OS X file system is called HFS+, although OS X supports several other file systems as well. There are a few basic reasons why you might care about which file systems are available on your Mac:

  • Compatibility. You might want to be able to read and write information stored on disks that use a different file system, such as the NTFS file system used by Microsoft Windows. Apple provides an implementation of NTFS in OS X, but you can use it only to read from disks, not write to them.
  • Cool features. Various file systems offer all sorts of different features, including many tricks that HFS+ can't do. For example, there's a lot of buzz about a relatively new file system called ZFS, which provides high reliability and scalability, among other features.
  • Innovation. Files and folders are very familiar and time-worn abstractions that users find easy to work with. There are many things you might want to represent as file systems, with their simple, familiar user interface, if only it were easier to write new file systems (as it is now with MacFUSE). For example, think about accessing your online photo albums and the pictures within them as if they were in a plain old local volume that you could manipulate using the Finder or any application of your choice.

Note that MacFUSE is not just for programmers. If you want to find out a little more about what MacFUSE can do, but you're not interested in creating your own file system right now, skip to the "MacFUSE belongs to you" section near the end of this article.

Now that you have some idea why you might want to use other file systems, we can talk about how.

Writing Your Own File System

Let's say you're ready to add support for a new file system to Mac OS X. To create a conventional file system in Mac OS X, or in most other modern operating systems for that matter, you have to write a kernel extension (kext) that implements the behavior of the file system. Writing a file system is an exceptionally tricky task. Your code is responsible for a long list of low-level details: organizing directory information, determining just what to do when a file is opened or closed, performing I/O, maintaining all kinds of reference counts, deciding when and how to compact files and reclaim space, and so on. In addition, because your code is running in the kernel, there are other fun restrictions and difficulties you'll run into when writing and debugging your code.

Of course, bugs in file systems can have devastating consequences. Your code shares memory with the kernel, so if you do something wrong, you can easily cause a kernel panic, which means user inconvenience and loss of productivity, and in many cases even worse: permanent data loss.

Have I scared you off yet?

There Must Be a Better Way

Don't go too far: MacFUSE is the answer. MacFUSE is an implementation of FUSE on Mac OS X. FUSE is an open source project that was designed to make it easier to add support for new file systems to Linux. FUSE's kernel component runs in kernel space, like any kernel extension. But instead of implementing a specific file system, it implements a generic file system that does all the low-level gunk any file system must do. The code that's specific to a given file system lives outside the kernel, in a regular user program. A library then provides an API that's essentially a much simpler, higher-level abstraction suitable for creating your own custom file system. Since your file system gets to run in user space, you're not troubled by kernel space hassles. (In fact, FUSE stands for Filesystem in USErspace, a pretty good summary of what it does.)

MacFUSE was written for Mac OS X by Amit Singh, who wrote it as a 20% time project at Google. The kernel part of MacFUSE is specific to Mac OS X, since kernel details differ enough across operating systems that you can't just port things as you often can with regular software. The FUSE user library is a reasonably straight port from Linux to Mac OS X. MacFUSE brings all the goodness of FUSE to the Mac. Amit has gotten down and dirty in kernel programming, writing all the low-level stuff that FUSE hides, just so we won't have to. Thanks, Amit.

MacFUSE makes writing a file system much, much easier. In practical terms, this means that you don't have to handle the lowest and goriest details of a file system. Instead, you just write code for a few relatively high-level calls (such as open(), read(), write(), and rename()). From the kernel's standpoint, MacFUSE is the file system. From MacFUSE's standpoint, your user-space program is the file system. When the kernel wants to do file I/O to a mounted volume, it calls MacFUSE, which is in charge of the volume. MacFUSE receives the call and repackages it as a higher-level call to one of your file system's operations, then calls your code. If you are a programmer, you will appreciate how drastically this simplifies things. If you're not a programmer, you still get to take advantage of file systems that others have written using the platform provided by MacFUSE, and even existing FUSE file systems from the Linux world.

Let's take a look at how to create your own file system, something only a select few Mac OS X programmers in the world could have done before MacFUSE arrived on the scene.

Say Hello to a File System

We'll create an easy-to-understand example, the HelloWorld file system, or HelloWorldFS, to illustrate how to write a FUSE-compatible file system. A HelloWorldFS volume has the following properties:

  • It's read-only.
  • It contains exactly one file, named hello.txt.
  • The file hello.txt contains the text "Hello World!"

To implement HelloWorldFS, we'll need to write code for just a few operations that we might be called upon to perform:

  • open(), which gets us ready to do I/O to or from a file
  • readdir(), which enumerates the contents of a directory (that is, the files and subdirectories within the directory)
  • read(), the call that actually gets bytes from the file
  • getattr(), which returns information about the given file

The FUSE API actually defines more than 30 operations, but most of them are optional. FUSE even provides sane defaults when appropriate. In fact, you can create a functional file system using only a few calls. In the case of HelloWorldFS, we can create a working (albeit not terribly useful) example with very little code.
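To make the idea of optional operations concrete, here is a minimal sketch of a table of function pointers in which unimplemented entries are simply left empty. The names struct mini_ops, toy_open, dispatch_open, and dispatch_unlink are hypothetical and are not part of FUSE; the real table is struct fuse_operations, and the library's actual defaults differ from one operation to another. Falling back to -ENOSYS is an assumption made here for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an operations table.
 * Real FUSE uses struct fuse_operations, which has many more entries. */
struct mini_ops {
    int (*open)(const char *path);
    int (*unlink)(const char *path);
};

/* The one operation this toy file system chooses to implement. */
static int toy_open(const char *path)
{
    return strcmp(path, "/hello.txt") == 0 ? 0 : -ENOENT;
}

/* Members not named in the initializer are NULL, meaning "not implemented". */
static const struct mini_ops toy_ops = {
    .open = toy_open,
};

/* Hypothetical dispatchers: call the handler if one was supplied,
 * otherwise fall back to a default (assumed here to be -ENOSYS). */
static int dispatch_open(const struct mini_ops *ops, const char *path)
{
    assert(ops != NULL && path != NULL);
    return ops->open ? ops->open(path) : -ENOSYS;
}

static int dispatch_unlink(const struct mini_ops *ops, const char *path)
{
    assert(ops != NULL && path != NULL);
    return ops->unlink ? ops->unlink(path) : -ENOSYS;
}
```

With this table, dispatch_open(&toy_ops, "/hello.txt") reaches toy_open, while dispatch_unlink(&toy_ops, "/hello.txt") never reaches any file system code and reports the default error instead.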

Globals

Let's start with a few globals defined for convenience at the top of our file system source file:

static const char  *file_path      = "/hello.txt";
static const char   file_content[] = "Hello World!\n";
static const size_t file_size      = sizeof(file_content)/sizeof(char) - 1;

(Note: we're using C for our example, but you don't have to. FUSE bindings are available for many languages, including Objective-C, Java, Python, and Perl.)

These variables define the path, /hello.txt, of our lone file; the contents of that file, the string Hello World! followed by a newline; and the size of the file, which is the number of bytes in the string, not counting the terminating null character that C adds to every string literal.
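The size calculation is easy to get wrong by one, so here is a small sketch that checks it. The helper name content_size is hypothetical. Since sizeof(char) is 1 by definition in C, the division in the original expression does not change the result and is omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static const char file_content[] = "Hello World!\n";

/* sizeof counts the terminating null character of the array, so subtract
 * one to get the number of bytes a reader of the file should see. */
static size_t content_size(void)
{
    size_t n = sizeof(file_content) - 1;

    /* Holds because the string has no embedded null characters. */
    assert(n == strlen(file_content));
    return n;
}
```

For this string the array occupies 14 bytes, and content_size() returns 13: twelve visible characters plus the newline.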

Now we'll move on to the file operations. Each of the four operations (open(), readdir(), read(), and getattr()) is implemented by a function. We'll take a look at each of those functions now.

open

static int
hello_open(const char *path, struct fuse_file_info *fi)

The open call takes two parameters. First is path, a pointer to a char string that contains the path of the given file. Most FUSE operations take a path as their first parameter. The second parameter is a pointer to a fuse_file_info structure, which houses a variety of information about the file or directory in question, including the flags the caller passed when opening it.

Now we'll explore the code of open:

{
    if (strcmp(path, file_path) != 0) { /* We only recognize one file. */
        return -ENOENT;
    }

We first check to see which file is to be opened. In the case of HelloWorldFS, recall that we have only one file, hello.txt. So we do some error checking by making sure that that's the file named here. If it's not, we return the error ENOENT, which amounts to saying "No such file or directory." (Note that by convention, FUSE file systems return negated error numbers, so we actually return -ENOENT here. Error numbers are defined in the standard errno.h header file.)

If the filename is good, we continue:

    if ((fi->flags & O_ACCMODE) != O_RDONLY) { /* Only reading allowed. */
        return -EACCES;
    }

    return 0;
}

Here we check to make sure that we've been asked to open the file in read-only mode. If that's not the case, again we return with an error (this time it's EACCES, which means "Permission denied").

If the filename is right, and the open is asking for read-only mode, we happily return 0 to indicate no error.
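For experimenting with this logic without installing the FUSE headers, the two checks can be assembled into one function, as in the sketch below. The struct file_info_stub type and the name hello_open_sketch are hypothetical stand-ins; the real function takes a struct fuse_file_info, of which only the flags field is modeled here.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>

/* Hypothetical stand-in for struct fuse_file_info, so that this compiles
 * without the FUSE headers. Only the flags field is modeled. */
struct file_info_stub {
    int flags;
};

static const char *file_path = "/hello.txt";

static int
hello_open_sketch(const char *path, const struct file_info_stub *fi)
{
    assert(path != NULL && fi != NULL);

    if (strcmp(path, file_path) != 0) {
        return -ENOENT;  /* We only recognize one file. */
    }

    /* O_ACCMODE masks off everything except the access-mode bits, so
     * extra flags such as O_NONBLOCK do not affect the comparison. */
    if ((fi->flags & O_ACCMODE) != O_RDONLY) {
        return -EACCES;  /* Only reading allowed. */
    }

    return 0;
}
```

Note the order of the checks: a request to write to a file that doesn't exist reports -ENOENT, not -EACCES, because the path is tested first.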

readdir

static int
hello_readdir(const char *path, void *buf, fuse_fill_dir_t filler,
              off_t offset, struct fuse_file_info *fi)

When the user wants to see a directory listing, our file system will have its readdir function called. This happens if the user types ls on the command line or displays the volume's contents in the Finder.

We need to check the path that comes with the readdir call to make sure it's kosher:

{
    if (strcmp(path, "/") != 0) { /* We only recognize the root directory. */
        return -ENOENT;
    }

The first parameter, path, is the name of the directory that we're supposed to list. For HelloWorldFS, we only allow one possible directory: the file system's root directory, represented by /. If the path is anything but a single slash, we return an error. Otherwise, we continue:

    filler(buf, ".", NULL, 0);           /* Current directory (.)  */
    filler(buf, "..", NULL, 0);          /* Parent directory (..)  */
    filler(buf, file_path + 1, NULL, 0); /* The only file we have. */

    return 0;
}

The directory we'll return is fixed: there's the single period ("dot") and two periods ("dot-dot"), representing the current and parent directories, respectively, and finally the hello.txt file itself. Because file_path begins with a slash, we pass file_path + 1 so that the entry is named hello.txt rather than /hello.txt. To pack the directory entries into the provided buffer, we call filler(), a callback that the FUSE library hands to us as a parameter of readdir.
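To see what the callback receives, the sketch below drives the same logic with a mock filler that just records each name. Everything here other than the readdir logic itself is a hypothetical stand-in: fill_dir_fn mirrors the shape of fuse_fill_dir_t, while struct name_list, collect_name, and hello_readdir_sketch are invented for illustration. The offset and fuse_file_info parameters are dropped because this file system ignores them.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

#define MAX_NAMES 8

/* Hypothetical stand-in with the same shape as FUSE's fuse_fill_dir_t. */
typedef int (*fill_dir_fn)(void *buf, const char *name,
                           const struct stat *stbuf, off_t off);

/* Hypothetical buffer: a fixed-size list of collected entry names. */
struct name_list {
    const char *names[MAX_NAMES];
    int count;
};

/* Mock filler: records each name and returns 1 once the list is full. */
static int collect_name(void *buf, const char *name,
                        const struct stat *stbuf, off_t off)
{
    struct name_list *list = buf;

    (void)stbuf;  /* No attributes supplied in this example. */
    (void)off;    /* Offsets are not used in this example.   */

    if (list->count >= MAX_NAMES) {
        return 1;
    }
    list->names[list->count++] = name;
    return 0;
}

static const char *file_path = "/hello.txt";

static int
hello_readdir_sketch(const char *path, void *buf, fill_dir_fn filler)
{
    assert(path != NULL && buf != NULL && filler != NULL);

    if (strcmp(path, "/") != 0) {
        return -ENOENT;  /* We only recognize the root directory. */
    }

    filler(buf, ".", NULL, 0);
    filler(buf, "..", NULL, 0);
    filler(buf, file_path + 1, NULL, 0);  /* Skip the leading '/'. */

    return 0;
}
```

Calling hello_readdir_sketch("/", &list, collect_name) on an empty list leaves three entries in it, in the order they were added.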
