 Published on MacDevCenter (http://www.macdevcenter.com/)


MacFUSE: New Frontiers in File Systems

by Scott Knaster
03/06/2007

If you're a typical Mac programmer or user, chances are your main concern with files is reading from and writing to them, opening and saving them. You probably haven't thought much about adding support for other file systems or actually implementing your own file system from scratch—why on earth would you want to do that?

Well, no matter what kind of Mac user you are, there's a new development in the somewhat arcane world of file systems that's bound to be interesting to you: MacFUSE. Simply put, MacFUSE takes something that was incredibly hard—adding new file systems to Mac OS X—and makes it much, much easier. In this article, I'll explain why file system support is generally hard, how MacFUSE makes it easier, and why you should care.

So what's all this fuss about file systems?

A traditional file system provides a way of organizing information on your disks so that programs can have access to it. The standard Mac OS X file system is called HFS+, although OS X supports several other file systems as well. There are a few basic reasons why you might care about which file systems are available on your Mac:

Note that MacFUSE is not just for programmers. If you want to find out a little more about what MacFUSE can do, but you're not interested in creating your own file system right now, skip to the "MacFUSE belongs to you" section near the end of this article.

Now that you have some idea why you might want to use other file systems, we can talk about how.

Writing Your Own File System

Let's say you're ready to add support for a new file system to Mac OS X. To create a conventional file system in Mac OS X, or most other modern operating systems for that matter, you have to write a kernel extension (kext) that implements the behavior of the file system. Writing a file system is an exceptionally tricky task. Your code is responsible for a long list of low-level details, such as organizing directory information, determining just what to do when a file is opened or closed, performing I/O, maintaining all kinds of reference counts, deciding when and how to compact files and reclaim space, and so on. In addition, because your code is running in the kernel, there are other fun restrictions and difficulties you'll run into when writing and debugging your code.

Of course, bugs in file systems can have devastating consequences. You're sharing memory with the kernel. If you do something wrong, you will almost certainly cause a kernel panic, which means user inconvenience and loss of productivity, and in many cases even worse: permanent data loss.

Have I scared you off yet?

There Must Be a Better Way

Don't go too far: MacFUSE is the answer. MacFUSE is an implementation of FUSE on Mac OS X. FUSE is an open source project that was designed to make it easier to add support for new file systems to Linux. FUSE runs in kernel space, like any kernel extension. But instead of implementing a specific file system, FUSE implements a generic file system. It does all the low-level gunk that any file system must do. However, it allows code that's specific to a given file system to live outside the kernel, in a regular user program. A library then provides an API that's essentially a much simpler, higher-level abstraction suitable for creating your own custom file system. Since your file system gets to run in user space, you're not troubled by kernel space hassles. (In fact, FUSE stands for Filesystem in USErspace, a pretty good summary of what it does.)

MacFUSE was written for Mac OS X by Amit Singh, who wrote it as a 20% time project at Google. The kernel part of MacFUSE is specific to Mac OS X, since kernel details differ enough across operating systems that you can't just port things as you often can with regular software. The FUSE user library is a reasonably straight port from Linux to Mac OS X. MacFUSE brings all the goodness of FUSE to the Mac. Amit has gotten down and dirty in kernel programming, writing all the low-level stuff that FUSE hides, just so we won't have to. Thanks, Amit.

MacFUSE makes writing a file system much, much easier. In practical terms, this means that you don't have to handle the lowest and goriest details of a file system. Instead, you just write code for a few relatively high-level calls (such as open(), read(), write(), rename(), and so on). From the kernel's standpoint, MacFUSE is the file system. From MacFUSE's standpoint, your user-space program is the file system. When the kernel wants to do file I/O to a mounted volume, it calls MacFUSE, which is in charge of the volume. MacFUSE receives the call and repackages it as a higher-level call to one of your file system's operations, then calls your code. If you are a programmer, you will appreciate how this simplifies things drastically. If you're not a programmer, you still get to take advantage of file systems that others have written using the platform provided by MacFUSE, and even existing FUSE file systems from the Linux world.

Let's take a look at how to create your own file system, something only a select few Mac OS X programmers in the world could have done before MacFUSE arrived on the scene.

Say Hello to a File System

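We'll create an easy-to-understand example, the HelloWorld file system, or HelloWorldFS, to illustrate how to write a FUSE-compatible file system. A HelloWorldFS volume has the following properties: its root directory contains exactly one file, hello.txt; that file contains the text "Hello World!"; and the whole volume is read-only.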

To implement HelloWorldFS, we'll need to write code for just a few operations that we might be called upon to perform: open, readdir, read, and getattr.

The FUSE API actually defines more than 30 operations, but most of them are optional. FUSE even provides sane defaults when appropriate. In fact, you can create a functional file system using only a few calls. In the case of HelloWorldFS, we can create a working (albeit not terribly useful) example with very little code.
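To make the idea of optional operations concrete, here is a minimal sketch of a table of function pointers in which an unset entry falls back to a default. Everything in it (mini_ops, demo_open, demo_ops, dispatch_open, dispatch_unlink) is a hypothetical stand-in rather than the real FUSE API, and using -ENOSYS as the fallback is an assumption made for illustration; the real library chooses its defaults per operation.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical miniature of an operations table; NOT the real fuse_operations. */
struct mini_ops {
    int (*open)(const char *path);
    int (*unlink)(const char *path);
};

/* A toy file system implements open and nothing else. */
static int demo_open(const char *path)
{
    (void)path;   /* the toy handler accepts every path */
    return 0;
}

static const struct mini_ops demo_ops = {
    .open = demo_open,
    /* .unlink is left NULL, meaning "not implemented". */
};

/* Call the handler if one was supplied, otherwise use the assumed default. */
static int dispatch_open(const struct mini_ops *ops, const char *path)
{
    return ops->open ? ops->open(path) : -ENOSYS;
}

static int dispatch_unlink(const struct mini_ops *ops, const char *path)
{
    return ops->unlink ? ops->unlink(path) : -ENOSYS;
}
```

With this table, dispatch_open(&demo_ops, "/hello.txt") reaches demo_open and returns 0, while dispatch_unlink(&demo_ops, "/hello.txt") returns -ENOSYS because no handler was supplied.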

Globals

Let's start with a few globals defined for convenience at the top of our file system source file:

static const char  *file_path      = "/hello.txt";
static const char   file_content[] = "Hello World!\n";
static const size_t file_size      = sizeof(file_content)/sizeof(char) - 1;

(Note: we're using C for our example, but you don't have to. FUSE bindings are available for many languages, including Objective-C, Java, Python, and Perl.)

These variables define the name, hello.txt, of our lone file; the contents of that file, the string Hello World!; and the size of the file, which is the number of characters in the string.
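The size calculation deserves a second look. The sketch below repeats the two content globals and adds a hypothetical helper, size_matches_strlen, to show why the expression subtracts 1: sizeof applied to a char array counts the terminating NUL byte, which is not part of the file's content.

```c
#include <stddef.h>
#include <string.h>

static const char   file_content[] = "Hello World!\n";
static const size_t file_size      = sizeof(file_content) / sizeof(char) - 1;

/* Returns 1 when the compile-time size equals the run-time string length. */
static int size_matches_strlen(void)
{
    return file_size == strlen(file_content);
}
```

This only works because file_content is declared as an array. Applying sizeof to file_path, which is a pointer, would give the size of the pointer rather than the length of the string.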

Now we'll move on to the file operations. Each of the four operations (open(), readdir(), read(), and getattr()) is implemented by a function. We'll take a look at each of those functions now.

open

static int
hello_open(const char *path, struct fuse_file_info *fi)

The open call takes two parameters. First is path, a pointer to a char string that contains the path of the given file. Every FUSE operation takes this as the first parameter. The second parameter is a pointer to a fuse_file_info structure, which houses a variety of information about the file or directory in question.

Now we'll explore the code of open:

{
    if (strcmp(path, file_path) != 0) { /* We only recognize one file. */
        return -ENOENT;
    }

We first check to see which file is to be opened. In the case of HelloWorldFS, recall that we have only one file, hello.txt. So we do some error checking by making sure that that's the file named here. If it's not, we return the error ENOENT, which amounts to saying "No such file or directory." (Note that by convention, FUSE file systems return negated error numbers, so we actually return -ENOENT here. Error numbers are defined in the standard errno.h header file.)

If the filename is good, we continue:

    if ((fi->flags & O_ACCMODE) != O_RDONLY) { /* Only reading allowed. */
        return -EACCES;
    }

    return 0;
}

Here we check to make sure that we've been asked to open the file in read-only mode. If that's not the case, again we return with an error (this time it's EACCES, which means "Permission denied").

If the filename is right, and the open is asking for read-only mode, we happily return 0 to indicate no error.
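The flag test is easy to try outside FUSE. The sketch below repeats the two checks from hello_open in a hypothetical helper, check_open, that takes the flags word directly instead of a fuse_file_info pointer. Masking with O_ACCMODE isolates the access mode, so unrelated flags such as O_NONBLOCK don't affect the result.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>

static const char *file_path = "/hello.txt";

/* Hypothetical helper: the same two checks as hello_open, on a plain int. */
static int check_open(const char *path, int flags)
{
    if (strcmp(path, file_path) != 0) {
        return -ENOENT;                    /* We only recognize one file. */
    }
    if ((flags & O_ACCMODE) != O_RDONLY) {
        return -EACCES;                    /* Only reading allowed. */
    }
    return 0;
}
```

Here check_open("/hello.txt", O_RDONLY) returns 0, check_open("/hello.txt", O_RDWR) returns -EACCES, and any other path returns -ENOENT regardless of the flags.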

readdir

static int
hello_readdir(const char *path, void *buf, fuse_fill_dir_t filler,
              off_t offset, struct fuse_file_info *fi)

When the user wants to see a directory listing, our file system will have its readdir function called. This happens if the user types ls on the command line or displays the volume's contents in the Finder.

We need to check the path that comes with the readdir call to make sure it's kosher:

{
    if (strcmp(path, "/") != 0) { /* We only recognize the root directory. */
        return -ENOENT;
    }

The first parameter, path, is the name of the directory that we're supposed to list. For HelloWorldFS, we only allow one possible directory: the file system's root directory, represented by /. If the path is anything but a single slash, we return an error. Otherwise, we continue:

    filler(buf, ".", NULL, 0);           /* Current directory (.)  */
    filler(buf, "..", NULL, 0);          /* Parent directory (..)  */
    filler(buf, file_path + 1, NULL, 0); /* The only file we have. */

    return 0;
}

The directory we'll return is fixed: there's the single period ("dot") and two periods ("dot-dot"), representing the current and parent directories, respectively, and finally the hello.txt file itself. We use filler(), a callback of type fuse_fill_dir_t that the FUSE library passes to us, to pack the directory entries into the provided buffer. Note that we pass file_path + 1 so that the entry is named hello.txt, without the leading slash.
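If the callback pattern is unfamiliar, the following sketch imitates it without FUSE. The names dir_listing, fill_fn, collect_name, and list_root are all hypothetical, and the callback here takes only two arguments instead of the four that the real filler takes. It is meant only to show the shape of the exchange: the caller supplies a buffer and a function, and the file system calls that function once per entry.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 8

/* Hypothetical stand-in for the opaque buffer the library hands to readdir. */
struct dir_listing {
    const char *names[MAX_ENTRIES];
    size_t      count;
};

/* Hypothetical, simplified callback type (the real filler takes more arguments). */
typedef int (*fill_fn)(void *buf, const char *name);

/* Append one name to the listing; return 1 if the listing is already full. */
static int collect_name(void *buf, const char *name)
{
    struct dir_listing *list = buf;

    if (list->count == MAX_ENTRIES) {
        return 1;
    }
    list->names[list->count++] = name;
    return 0;
}

static const char *file_path = "/hello.txt";

/* Same shape as hello_readdir: validate the path, then emit three entries. */
static int list_root(const char *path, void *buf, fill_fn filler)
{
    if (strcmp(path, "/") != 0) {
        return -ENOENT;             /* We only recognize the root directory. */
    }
    filler(buf, ".");
    filler(buf, "..");
    filler(buf, file_path + 1);     /* Skip the leading '/'. */
    return 0;
}
```

Calling list_root("/", &listing, collect_name) on a zero-initialized dir_listing leaves three names in it: ".", "..", and "hello.txt".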

read

Next, we'll implement the read function so that users can read from our file system.

static int
hello_read(const char *path, char *buf, size_t size, off_t offset,
           struct fuse_file_info *fi)

When we get a read call, we need to do what the parameters tell us to do. The path parameter gives the name of the file, buf is a buffer to put bytes into, size gives the number of bytes to return, and offset lets the read begin at a specific byte offset within the file.

Here's the first part of the read code:

{
    if (strcmp(path, file_path) != 0) {
        return -ENOENT;
    }

We first ensure that we're reading from our one and only file, else we return an error.

Next, we check the offset at which the read is supposed to start:

    if (offset >= file_size) { /* Trying to read past the end of file. */
        return 0;
    }

Here we test whether we're trying to read at an offset that's at or beyond the end of the file. If so, there are no bytes to return, but also no error, so we return 0 to indicate that no bytes were read.

The next test makes sure we're not going to read more bytes than are available:

    if (offset + size > file_size) { /* Trim the read to the file size. */
        size = file_size - offset;
    }

This code checks to see whether reading the given number of bytes from the desired offset will take us past the end of the file. If so, we modify the number of bytes to be read so that we read to the end of file and no further.

Now we're ready to read the file:

    memcpy(buf, file_content + offset, size); /* Provide the content. */

Finally, we read the requested bytes. We use memcpy() to move bytes into the buffer from the file, starting at the given offset location in the file.

    return size;
}

And we're done, returning the number of bytes read.
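The trimming logic can be exercised on its own. Below is a sketch of a hypothetical helper, read_at, that follows the same steps as hello_read without the FUSE parameters. It makes two small changes that are my own rather than part of the original code: a negative offset is treated as "nothing to read," and the trim test is written as size > file_size - offset so that offset + size can't overflow.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

static const char   file_content[] = "Hello World!\n";
static const size_t file_size      = sizeof(file_content) - 1;

/* Copy up to `size` bytes starting at `offset`; return the number copied. */
static int read_at(char *buf, size_t size, off_t offset)
{
    if (offset < 0 || (size_t)offset >= file_size) {
        return 0;                            /* At or past the end of file. */
    }
    if (size > file_size - (size_t)offset) {
        size = file_size - (size_t)offset;   /* Trim the read to what remains. */
    }
    memcpy(buf, file_content + offset, size);
    return (int)size;
}
```

For example, asking for 5 bytes at offset 6 copies "World" and returns 5, while asking for 100 bytes at offset 10 copies only the last 3 bytes of the file and returns 3.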

getattr

The last operation we need to implement is getattr, which simply returns the attributes of a file or directory:

static int
hello_getattr(const char *path, struct stat *stbuf)

We start by zeroing out the buffer we've been given that will hold the attributes structure:

{
    memset(stbuf, 0, sizeof(struct stat));

This prevents uninitialized memory from ruining our day.

Next, we fill in the attributes structure, pretty much by brute force. There are only two possible entities we can be asked about: the root directory and the hello.txt file. We'll deal with the root directory first:

    if (strcmp(path, "/") == 0) { /* The root directory of our file system. */
        stbuf->st_mode = S_IFDIR | 0755;
        stbuf->st_nlink = 3;

After learning that the first parameter is the root directory (it's just a slash), we set certain fields of the incoming stat structure. Specifically, we set the st_mode field to S_IFDIR | 0755, which specifies that this is a directory with its Unix permissions being 0755 (or "rwxr-xr-x"). We set the st_nlink field to 3. For a directory, the st_nlink field should be set to the total number of entries within the directory. Here, we say 3 because besides our hello.txt file, which accounts for one entry, we technically also have the "dot" and "dot-dot" entries, for a total of 3. If you think that's strange, well, that's just Unix convention.

The only other legal case is hello.txt. Let's handle that one now.

    } else if (strcmp(path, file_path) == 0) { /* The only file we have. */
        stbuf->st_mode = S_IFREG | 0444;
        stbuf->st_nlink = 1;
        stbuf->st_size = file_size;

First, we make sure the path parameter is hello.txt. If it is, we return S_IFREG | 0444 for the st_mode field (regular file, permissions being 0444 or "r--r--r--"), 1 for the st_nlink field, and our file_size constant as the size. Note that in the case of a regular file, the st_nlink field should contain the number of hard links to the file. Here, we don't even implement hard links in our file system, so we have to say 1 as the only possible sane value.

Because those are the only files we have, we need to return an error for any other filenames we might somehow be asked about:

    } else { /* We reject everything else. */
        return -ENOENT;
    }

Otherwise, all is well, so we return 0, and we're done:

    return 0;
}

And the Rest

Here's the final bit of code for our HelloWorld file system. First is a structure that points to our file system's operations (note that we only enumerate the operations we implement):

static struct fuse_operations hello_filesystem_operations = {
    .getattr = hello_getattr, /* To provide size, permissions, etc. */
    .open    = hello_open,    /* To enforce read-only access.       */
    .read    = hello_read,    /* To provide file content.           */
    .readdir = hello_readdir, /* To provide directory listing.      */
};

We kick things off with our main function, which takes the operations structure as a parameter:

int
main(int argc, char **argv)
{
    return fuse_main(argc, argv, &hello_filesystem_operations, NULL);
}

That's it. We've implemented an actual file system in very little code. Although our file system is just a demo that doesn't do anything useful, it's real and it works, and its simplicity has been known to make kernel extension programmers weep with joy.

Trying It Out

OK, enough looking at source: it's time to actually use this stuff. The first step is to get and install MacFUSE. This part is very easy: there's even a true Mac installer available on the MacFUSE project site's download section. You should download the latest version of the "MacFUSE Core Installer Package" from the aforementioned page, and run the installer to get MacFUSE up and running.

Adding our HelloWorldFS requires a little more work, because we have to build it. First, grab the hellofs.c source by going to http://code.google.com/p/macfuse/wiki/HELLOWORLDFS and copying the source, then pasting it into a text file named hellofs.c. Then, assuming you have Apple's Developer Tools (Xcode) installed (version 2.4.x), compile it like this from the Terminal command line:

$ gcc -o hellofs hellofs.c -lfuse

That's it! Now we're ready to create a volume and mount it with HelloWorldFS:

$ mkdir /tmp/hello
$ ./hellofs /tmp/hello -oping_diskarb,volname=Hello

Note that we're specifying a couple of options when we mount the volume. The ping_diskarb option causes the volume to appear in the Finder, and the volname option gives the name to use for the volume.

After executing these commands, you should see the volume Hello in the Finder. If you look at the volume, it contains one file: hello.txt. You can double-click hello.txt to open it in TextEdit and see that it contains the text "Hello World!". You can even edit the text, but if you try to save the document, TextEdit complains, because HelloWorldFS is a read-only file system. (Of course, you can save it to a regular volume.)

When you're done having fun with HelloWorldFS, you can unmount the volume in any of the usual ways from the Finder, or like this from Terminal:

$ umount /tmp/hello

MacFUSE Belongs to You

Now that MacFUSE has been released (although it's not quite final quality yet), you can start using it on your own Mac. You can begin with the MacFUSE Quicker Start Guide and the Downloads page. There are over a hundred FUSE file systems out there for Linux, and as long as an existing FUSE file system doesn't use Linux-specific functionality, it should work with MacFUSE. Several popular file systems have in fact been tested with MacFUSE. Two of the most popular are ntfs-3g, a read/write version of NTFS, and sshfs, which lets you access a remote computer's storage as a file system, assuming you have an SSH connection to that computer.

Because MacFUSE makes file system coding much easier, you can think very creatively about what kinds of data might be represented as useful file systems. For example, Greg Miller created SpotlightFS, a file system that represents Spotlight queries as folders in a file system, with the query results appearing as files in folders. For example, you can create a folder named "Brian Wilson," and the folder contents will be all files that Spotlight finds when you send it the same "Brian Wilson" query. In fact, you can even use complex Spotlight queries, such as this:

   kMDItemKind == "*PDF*" && kMDItemAuthors == "Scott"w && kMDItemNumberOfPages > 1

And because SpotlightFS is a real file system, the folders and files on a SpotlightFS volume are real as well, and can be manipulated with file-handling applications (such as the Finder) and by file I/O code that you write. SpotlightFS is just an example of what you can do with MacFUSE. You can use your own creativity to think up other clever and useful file systems that can be brought to life with FUSE. Now go out and create something great!

More Information

To keep up with the pulse of MacFUSE, or to help out on the project, see the project home page. For fun and inspiration, be sure to check out this video that shows several MacFUSE tech demos in action.

Late Breaking News

The MacFUSE project now includes an Objective-C library that supplies a template MacFUSE file system. To use it, all you have to do is implement the particular methods you're interested in, such as open, read, write, and so on. This library handles details like volume icons and resource forks as well. You can find the Objective-C library and an example here: http://macfuse.googlecode.com/svn/trunk/filesystems-objc/.

Scott Knaster is a technical writer on the Mac team at Google.



Copyright © 2009 O'Reilly Media, Inc.