MacDevCenter    
 Published on MacDevCenter (http://www.macdevcenter.com/)


Unify and Synchronize Your iTunes Libraries

by Matthew Russell
08/22/2006

Editor's note: Last June, David Miller published an article that explained how to synchronize playlists on iTunes. Today, Matthew Russell extends this idea by investigating and presenting three different ways to synchronize the actual contents of your iTunes music libraries when they are scattered across multiple machines. Then, in the second half of the article, Matthew lays the foundation for a custom Python script that you can extend across multiple platforms and in various other ways.

Although iTunes makes it easy to share media between multiple machines on a home network, you might want to keep more than one physical copy of your music library synchronized. Perhaps the most common example involves a family's home network consisting of a desktop and a laptop -- each containing subsets of what is really a single music library. When the laptop is at home, all of the music can be shared fairly transparently between the two machines; when the laptop is not at home, however, whatever media is held on the laptop is out of reach until it comes home. Add in another machine and/or a network drive used for backups and you quickly run into synchronization headaches. Let's investigate some of the ways that you can unify your fragmented iTunes libraries.

Two Problems, Three Solutions

What seems like one big problem can really be formulated as two smaller, interrelated problems. The first problem is the primary one: unifying a distributed music library to a single location so that the unified library represents the complete library. The second problem is much less of a burden: unifying each of the remaining libraries by distributing fragments of the now-complete library.

For example, suppose you have three iTunes libraries where each contains a set of songs: lib1 contains {A,B,C}, lib2 contains {A,C,D}, and lib3 contains {A,D,E}. Here's one methodical way we could solve the problem:

Now, all three libraries have been unified and are in sync. Generalizing this approach to more libraries is straightforward.
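The unify-then-distribute idea can be sketched with Python's built-in set type, using the three example libraries above. This is only an illustration: collecting everything into the first library is an arbitrary choice, and `unify_and_distribute` is a name made up for this sketch.

```python
# Illustrative sketch: each library is modeled as a set of song names.
def unify_and_distribute(libraries):
    """Return the songs each library is missing once all are unified.

    `libraries` maps a library name to its set of songs. The first entry
    is (arbitrarily) used as the collection point.
    """
    names = list(libraries)
    master = set(libraries[names[0]])

    # Phase 1: pull whatever the other libraries have into the master copy.
    for name in names[1:]:
        master |= libraries[name] - master

    # Phase 2: work out what must be copied back out to each library.
    return {name: master - libraries[name] for name in names}


libs = {
    'lib1': {'A', 'B', 'C'},
    'lib2': {'A', 'C', 'D'},
    'lib3': {'A', 'D', 'E'},
}
missing = unify_and_distribute(libs)
for name in sorted(missing):
    print(name, 'needs', sorted(missing[name]))
```

With these inputs, lib1 needs D and E, lib2 needs B and E, and lib3 needs B and C; once those copies are made, every library holds {A,B,C,D,E}.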

Although these simple steps certainly solve the first part of the problem, they do little to address the monotony of the task itself. Furthermore, maintaining this zen is still a difficult chore if all of your machines readily purchase music from the iTunes Music Store, download podcasts, import CDs, etc. Fortunately, at least three good solutions are readily available: use a master/slave approach to managing your library, periodically run syncOtunes, or script your own customized solution. Let's look at each of these approaches in detail.

The Master/Slave Approach

This approach involves first manually unifying and synchronizing your libraries and then designating one machine (the master) as the sole authority to import new media into the library; the remaining machines (the slaves) only receive updates. Keeping track of the updates on the single master machine should be fairly simple to do, and armed with that knowledge, passing down periodic updates to slaves should be a trivial undertaking. Although this approach isn't necessarily the most flexible, its primary advantage is that it involves very little overhead once you've gotten your libraries in order.

For the first manual sync, you have a few options. Comparing the contents of your ~/Music/iTunes/iTunes Music Library.xml files or the results of a recursive directory listing (ls -R) are two good options, although the next two techniques we'll be taking a look at are also fine alternatives. If you don't yet have another machine, but might be getting one, now would be the ideal time to consider the advantages that the simplicity of the master/slave approach might offer to your situation. It involves little more than good housekeeping.
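If you would rather not compare two ls -R listings by eye, the same comparison can be sketched in a few lines of Python. This is a minimal sketch that assumes both music folders are reachable from one machine (for example, over a mounted network share); the function names are made up for illustration.

```python
import os

def relative_files(root):
    """Return the set of file paths under `root`, relative to `root`."""
    found = set()
    for dirpath, dirnames, filenames in os.walk(root):
        for filename in filenames:
            full = os.path.join(dirpath, filename)
            found.add(os.path.relpath(full, root))
    return found

def compare_trees(root_a, root_b):
    """Return (only_in_a, only_in_b) as sorted lists of relative paths."""
    a = relative_files(root_a)
    b = relative_files(root_b)
    return sorted(a - b), sorted(b - a)
```

Calling compare_trees with the two "iTunes Music" folders returns the files that exist on one side but not the other, which is the list you need for the first manual sync.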

Syncing with syncOtunes

syncOtunes is an application designed explicitly to solve the task at hand. It's a native Cocoa application and it's donationware. syncOtunes provides you with a fairly intuitive interface and abstracts away the grueling work of manually computing set differences and building the directories of files that you'll need to import into iTunes. Great, right? Well, maybe. If you can live with its quirks and aren't concerned that it's an independent, closed source project for which support may or may not exist at any given time, then it might be just what you need. In any event, it's definitely worth giving a shot before putting in the effort that rolling your own solution takes (see Figure 1).

Figure 1. syncOtunes readily compares iTunes music libraries, computes the differences between them, and prepares folders of music that you can use to remedy your synchronization problems.

One additional factor to consider with syncOtunes is the number of libraries that you're interested in maintaining. syncOtunes is explicitly designed to work with two libraries, but you could iteratively apply it to multiple libraries easily enough. Since syncOtunes is specifically designed for iTunes and depends on the iTunes Music Library.xml file that iTunes builds, however, it's probably not your best approach if you're slowly trying to migrate away from iTunes or trying to keep an iTunes library in sync with a non-iTunes library. But that's where some simple scripting and ingenuity come in handy.

Rolling Your Own

If you need something more customized than the manual bookkeeping of the master/slave approach, and syncOtunes just won't suffice, then rolling your own solution with some custom scripting is sure to get the job done. If you haven't done much scripting before, don't worry. This is a good place to cut your teeth and get some experience. We'll work up a Python script with some core functionality to bear the brunt of what syncOtunes accomplishes behind its interface, and you can further adapt this script to meet your particular needs.

As you probably already know, your ~/Music/iTunes/iTunes Music Library.xml file is in place to give other applications access to the contents of your music library. One quirk about this file, however, is that iTunes will not acknowledge any changes you make to it. Rather, whenever you make a change to your music library, iTunes updates this file itself. This circumstance explains why an application like syncOtunes requires you to manually drop the folders of music that it produces onto the iTunes interface; iTunes must explicitly bless the music for you. Apparently, the only way to gain full control of your music library outside the grip of iTunes would be to reverse engineer the file format of the ~/Music/iTunes/iTunes Library binary file and modify it--but we all know that every end-user license agreement known to humanity prohibits doing such things.

Moving along, let's work with what Apple has made available to us and automate as much of the monotony as we can. Our first task is to compute the differences between two iTunes libraries using the XML files available to us. If your circumstances call for it, you may want to genuinely parse the file with PyObjC (see here) or an XML utility, but we'll get by using a simple regex to build a list. If you need a quick introduction to regexes in Python, take a look at this document.
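For readers who do want to genuinely parse the file, here is a minimal sketch that uses the standard library's plistlib module rather than PyObjC or a regex. It assumes Python 3 and assumes the usual layout of the XML file: a top-level "Tracks" dictionary whose entries carry a "Location" URL. The function name track_paths is made up for illustration. Unlike a regex over the raw text, this version decodes URL escapes such as %20.

```python
import plistlib
from urllib.parse import unquote, urlparse

def track_paths(library_xml_path):
    """Return the set of 'artist/album/filename' strings in a library file."""
    with open(library_xml_path, 'rb') as f:
        library = plistlib.load(f)

    paths = set()
    for track in library.get('Tracks', {}).values():
        location = track.get('Location')
        if not location:
            continue  # track has no file location recorded
        parsed = urlparse(location)
        if parsed.scheme != 'file':
            continue  # skip anything that is not a local file
        # Decode %20 and friends, then keep the last three path components.
        path = unquote(parsed.path)
        paths.add('/'.join(path.split('/')[-3:]))
    return paths
```

Two sets built this way can be compared with ordinary set difference, for example track_paths(a) - track_paths(b).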

A Simple Script to Get You Going

The following Python script computes the differences between two iTunes Music Library.xml files. Copy it into a file named "diffMusicLibs.py" or download it here; invoke it from a terminal by typing python diffMusicLibs.py <file1> <file2>.

import re
import sys

#the pattern to match in 'iTunes Music Library.xml'.
#group 1 is the user name; group 2 is the path below 'iTunes Music/'.
PATTERN = re.compile(
    r'<key>Location</key><string>file://localhost/Users/(.*)'
    r'/Music/iTunes/iTunes%20Music/(.*)</string>')

#removes the part of the string up until the actual artist name.
#returns string of form 'artist/album/filename'
def parseOutSong(s):
    return '/'.join(s.split('/')[-3:])

#returns the set of songs found in the iTunes library file at path
def readLibrary(path):
    with open(path, 'r') as f:
        contents = f.read()
    return set(parseOutSong(m.group(2)) for m in PATTERN.finditer(contents))

#prints a header followed by one song per line, sorted for easy comparison
def report(header, songs):
    print(header)
    print('-' * len(header))
    for song in sorted(songs):
        print('  %s' % song)

def main():
    if len(sys.argv) != 3:
        sys.exit('usage: python diffMusicLibs.py <file1> <file2>')

    #build a set of songs for each iTunes library given on the command line
    set0 = readLibrary(sys.argv[1])
    set1 = readLibrary(sys.argv[2])

    #compute the differences between the sets and display results
    report('In %s but not in %s' % (sys.argv[1], sys.argv[2]),
           set0.difference(set1))
    print('')
    report('In %s but not in %s' % (sys.argv[2], sys.argv[1]),
           set1.difference(set0))

    #when you're ready, you might also choose to build the directories that you'll need
    #to transfer in this script. one way is to wrap a normal copy command inside a
    #system call like so:
    #import os
    #os.system('cp source destination')

if __name__ == '__main__':
    main()

Hopefully, the inline commenting makes this script easy to follow. It simply searches out lines that contain the files in your library, creates sets out of them, and then computes the differences between the sets. Computing the differences between libraries is really more than half the battle, because the rest of the work involved is rather boring: packaging up copies of the songs that need to be imported into another library, performing the copies, and then importing the songs.
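The packaging step can be scripted as well. The following is a minimal sketch, not part of the script above: it assumes Python 3, assumes the songs are listed in the 'artist/album/filename' form the script prints (still URL-encoded, with %20 for spaces), and uses the made-up name stage_missing_songs. It copies with shutil instead of shelling out to cp.

```python
import os
import shutil
from urllib.parse import unquote

def stage_missing_songs(missing, music_root, staging_dir):
    """Copy each missing song into staging_dir, keeping artist/album folders.

    `missing` holds 'artist/album/filename' strings; `music_root` is the
    'iTunes Music' folder that has the songs. Returns the entries that
    could not be found under music_root.
    """
    not_found = []
    for song in sorted(missing):
        # Decode URL escapes and build a path for the local platform.
        relative = os.path.join(*unquote(song).split('/'))
        source = os.path.join(music_root, relative)
        if not os.path.isfile(source):
            not_found.append(song)
            continue
        destination = os.path.join(staging_dir, relative)
        os.makedirs(os.path.dirname(destination), exist_ok=True)
        shutil.copy2(source, destination)
    return not_found
```

The staging folder that results is what you would later hand to the other machine and drop onto iTunes; anything returned in the not-found list is worth checking by hand.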

Manual Clean Up

Our script relies on the same hierarchical structure that iTunes uses for storing tracks (Artist/Album/Tracks), so the results it produces are only as good as how iTunes categorized your music. Clearly, if tracks are named differently or categorized differently between iTunes installations (for whatever reason) you'll want to do some preprocessing and work those details out before the script above can be of much help (see Figure 2).

For example, the first time that I ran it, I noticed that several songs were tagged as being out of sync because they were labeled as belonging to a compilation in one library, while in another library they were labeled according to their original album. In other cases, there were tracks that were categorized into an "Unknown" folder in one library but not in another. Fortunately, the iTunes interface provides a pretty friendly way to work out these sorts of kinks.

Figure 2. Use iTunes and scripts to do any initial clean up for tracks that may have been categorized incorrectly.

One way to systematically work through these issues is to run the Python script, use its output to correct a couple of albums, and then repeat the process until you're finished. Storing the output of the script as two separate files and viewing it through vimdiff may turn out to be useful as you work through this process (see Figure 3). Although not guaranteed to buy you anything, the way it highlights differences between lines is likely to save you some eye strain and increase your efficiency as you work through any manual clean up. At least you only have to do it once.

Figure 3. Tools like vimdiff may help you to expedite some of the initial clean up.

Depending on your circumstances, there are likely a variety of scripts out there that can be of further help with any preprocessing that you need to do. Doug's AppleScripts for iTunes is always a good place to start. In particular, the File Renamer script there can save you a lot of time if you need to standardize any track names for a particular album. (Here is a slightly modified version of the "File Renamer" script that fixed a minor annoyance where iTunes was not updating track names in its display.) Another option is to consider using any ID3 data in tracks (if present) to help you with the initial standardization.

Clearly, there's a lot of further customization you might choose to implement once you're ready to actually start the synchronization. For example, maybe you don't want to duplicate copies of particular albums or television shows on all of your machines for some reason. That's not a problem; simply filter through the results and handle these types of situations with a dose of conditional logic. A little bit goes a long way.
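As one possible shape for that conditional logic, the sketch below drops any song whose path begins with a prefix you want to skip. The function name and the sample paths are made up for illustration, and the paths are assumed to be in the 'artist/album/filename' form that the script prints.

```python
def filter_songs(songs, skip_prefixes):
    """Drop songs whose 'artist/album/filename' path starts with a skipped prefix."""
    prefixes = tuple(skip_prefixes)
    return [s for s in songs if not s.startswith(prefixes)]

# Hypothetical example: keep the album, skip everything under one show.
to_copy = filter_songs(
    ['Artist%20A/Album%201/01%20Track.mp3',
     'Some%20Show/Season%201/01%20Pilot.m4v'],
    ['Some%20Show/'])
print(to_copy)
```

Only the first path survives the filter here, since the second one falls under the skipped 'Some%20Show/' prefix.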

Packaging, Copying, etc.

Assuming you have manually worked out any strange anomalies between your two libraries with the preprocessing tips above, you're now ready to wrap things up. In particular, the final obstacles include: 1) getting all of your iTunes Music Library.xml files to a single location for comparison with the Python script and 2) distributing the "missing" files back out to wherever they need to go. Try using scp to accomplish these details, although another simple option would be to mount network shares containing the music you need and do a simple copy.

You can read all about scp by typing man scp in a terminal, but for our purposes, a command of the form scp user@hostname:'~/Music/iTunes/iTunes\ Music\ Library.xml' /tmp/hostname_tunes should do the job for obtaining local copies of files. When you're ready to distribute songs back out, something like scp -r directoryOfSongs user@hostname:/tmp/ should work fine, although you might also choose to tar them up and perform the copy as a single file. Once you've performed the copy and gotten all of the goods into place, all that is left to do is to open iTunes and drag the folder of "missing" songs onto its main interface.
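If you go the tar route and would like to keep everything in Python, the standard library's tarfile module can build the archive. This is a minimal sketch assuming Python 3; tar_up is a made-up name, and the archive still has to be copied over with scp or a network share afterward.

```python
import os
import tarfile

def tar_up(directory, archive_path):
    """Pack `directory` into a gzip-compressed tar file at `archive_path`."""
    with tarfile.open(archive_path, 'w:gz') as archive:
        # arcname keeps only the folder's own name inside the archive,
        # rather than its full path on this machine.
        name = os.path.basename(os.path.normpath(directory))
        archive.add(directory, arcname=name)
    return archive_path
```

Unpacking the archive on the destination machine recreates the single folder of "missing" songs, ready to be dragged onto iTunes.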

Wrapping Up

The three methods we identified for keeping your iTunes music libraries synchronized vary in complexity, and hopefully one of them will work well for your particular situation. The master/slave approach offers a simple methodology for keeping things under control when there's not much complexity involved and is especially applicable for folks who are the sole users of a couple of machines. For slightly more complex situations, syncOtunes can get the job done by calculating differences between libraries and performing copies of the songs for you. Finally, we laid the foundation with some custom scripting for those of you who prefer to work up a more tailored approach. Extending the functionality of this base script to do more complex tasks should not be too difficult.

Matthew Russell is a computer scientist from middle Tennessee and serves Digital Reasoning Systems as the Director of Advanced Technology. Hacking and writing are two activities essential to his renaissance man regimen.



Copyright © 2009 O'Reilly Media, Inc.