


Cleaning iTunes

by brian d foy

Over the holidays, I was away from any sort of network, so I started to clean up things on my laptop. During this process, I decided to take a look at the 20GB of MP3s residing on my hard disk to see how badly the Gracenote CD Database (CDDB) had classified them. From there, I could decide how much effort I'd have to expend to clean things up.

What Is the CDDB, and How Does It Work?

The CDDB works well for "popular" music -- the stuff that gets filed under "Rock," "Alternative," and so on. It works less well for compilations, like "Soundtracks," since the phrase "Various Artists" ends up in the artist portion of the ID. It works poorly for show tunes: people seem to care a great deal about the performers, so they get so specific with the artist portion of the ID that every song on the CD has a different artist string even though the cast is the same. People almost always butcher Western art music ("Classical" to most people), so that any information may show up anywhere: track name in the artist field, performer in the CD name, different languages in the descriptions of different tracks. Then there are the entries so bad that I just type in the data myself.

Part of the reason for this is the history of MP3 ID3 tags. The first version, ID3v1, had a limited number of fields that had short character limits. A later version, ID3v2, had more fields and did not have the length restrictions of ID3v1. For instance, ID3v1 does not have a "Composer" field, and ID3v2 does. Complete details, as well as non-technical overviews, on both versions and future directions are on the ID3 Web site. iTunes can use either version, and can convert ID3v1 tags to ID3v2 tags.
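To make the ID3v1 limits concrete, here is a minimal Python sketch that decodes an ID3v1 tag, which occupies the last 128 bytes of an MP3 file. The function names `parse_id3v1` and `read_id3v1` are made up for this illustration, and the sketch ignores the later ID3v1.1 convention of storing a track number in the comment field. Note that the fixed layout has no slot for a composer.

```python
import os
import struct

ID3V1_SIZE = 128  # an ID3v1 tag is a fixed 128-byte block at the end of the file

def parse_id3v1(block):
    """Decode a 128-byte ID3v1 block into a dict, or return None if it is not a tag."""
    if len(block) != ID3V1_SIZE or not block.startswith(b"TAG"):
        return None
    # Layout: "TAG", title (30), artist (30), album (30), year (4), comment (30), genre (1)
    _, title, artist, album, year, comment, genre = struct.unpack(
        "<3s30s30s30s4s30sB", block
    )

    def text(raw):
        # Fields are fixed width and padded with NUL bytes or spaces.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(title),
        "artist": text(artist),
        "album": text(album),
        "year": text(year),
        "comment": text(comment),
        "genre": genre,  # a numeric index into a fixed genre list
    }

def read_id3v1(path):
    """Read the trailing 128 bytes of a file and decode them, if they hold a tag."""
    with open(path, "rb") as handle:
        handle.seek(0, os.SEEK_END)
        if handle.tell() < ID3V1_SIZE:
            return None
        handle.seek(-ID3V1_SIZE, os.SEEK_END)
        return parse_id3v1(handle.read(ID3V1_SIZE))
```

Anything longer than 30 characters simply does not fit in a title, artist, or album field, which is one reason long classical titles end up truncated or spread across fields.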

The CDDB was originally a community project where the public submitted the CD data and the CDDB made it available to everyone else. They incorporated as CDDB, Inc. in 1995, and eventually renamed themselves Gracenote as they diversified their business. Their submission guidelines hint at the many different forms in which people have submitted data. Some of the things I want to clean up are bad data, and some are matters of Gracenote policy I disagree with, like inserting "(Disk N)" at the end of album names for multi-CD sets. In my iTunes Music Library, I get to have things my way.
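As an illustration of the "(Disk N)" cleanup, the following Python sketch strips such a suffix from an album name. It is not taken from the article: the function name `strip_disk_suffix` is hypothetical, and accepting the "Disc" spelling as well as "Disk" is an assumption about how the data might vary.

```python
import re

# Matches a trailing "(Disk N)" or "(Disc N)", with any spaces before it.
DISK_SUFFIX = re.compile(r"\s*\(dis[ck]\s+\d+\)\s*$", re.IGNORECASE)

def strip_disk_suffix(album):
    """Return the album name without a trailing "(Disk N)" marker."""
    return DISK_SUFFIX.sub("", album)

print(strip_disk_suffix("English Suites (Disk 1)"))  # English Suites
```

Names without the suffix pass through unchanged, so the function is safe to apply to a whole library.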

Tidy Up Those Descriptions in iTunes

Some of the CD descriptions are easy to fix, and I can use some scripts from Doug's AppleScripts. The "Remove n Characters From Front" script removes the same number of characters from the names of all of the tracks in the current selection, so I can make a name like "01 First Track" into "First Track" (and the "Number Tracks of Playlist" script goes the other way). If the artist and track names are reversed, the "Swap Track & Artist" script works nicely.
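The operations those scripts perform are simple string manipulations. The Python sketch below mimics them on plain lists and dictionaries; it is not Doug's code (his scripts are AppleScripts that drive iTunes directly), and the function names, the dictionary keys, and the two-digit numbering format are assumptions made for illustration.

```python
def remove_from_front(names, count):
    """Drop the first `count` characters from every track name."""
    return [name[count:] for name in names]

def number_tracks(names):
    """Prefix each name with a two-digit track number (assumed format)."""
    return [f"{index:02d} {name}" for index, name in enumerate(names, start=1)]

def swap_track_and_artist(track):
    """Return a copy of the track with its name and artist exchanged."""
    swapped = dict(track)
    swapped["name"], swapped["artist"] = track["artist"], track["name"]
    return swapped

names = ["01 First Track", "02 Second Track"]
print(remove_from_front(names, 3))  # ['First Track', 'Second Track']
```

Because each function returns new data rather than modifying its input, a mistaken count or an unwanted swap is easy to back out of.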

I started to work on a little Perl script to do this stuff for me, but since I use a Mac, I get to use iTunes. As I tried a few things in iTunes, I discovered that I could get the job done faster and easier with very little programming. Most of the changes relate to special cases anyway, so automating the task is not that helpful. Sometimes jumping into programming is the wrong way to go at first.

If I want to change the information for one track, I use the "Get Info" command from the "File" menu, which displays a dialog with three tabs. Under the "Tags" tab I update the appropriate information.

Screen shot: Get Info dialog "Tags" tab

If I change the tag in iTunes, it updates the iTunes database and playlists, updates the ID tag in the actual MP3 file, and moves the file to a new artist directory if my music is in the iTunes Music Library. This is all very cool. I wanted to reorder the directory structure by artist anyway, and now I need no extra work to do that once I fix things. I also use the MP3-Info Contextual Menu Item plug-in to check that the changes I make in iTunes show up in the MP3 file, and sometimes to perform simple edits.

Screen shot: MP3-Info Contextual Menu items

I can also select several tracks from a playlist and change them all at once. I make a contiguous selection by holding down the Shift key, selecting the first item, then selecting the last item, or a non-contiguous selection by holding down the Apple key and selecting only the items I want. With multiple items selected, I choose the "Get Info" menu item again and get a similar dialog, but this time the dialog does not have tabs and only shows the information that all of the tracks have in common, such as the same artist or album name. Any changes I make affect all of the tracks. Again, very cool if the change is simple.

Screen shot: Get Info dialog for multiple items

And Those Changes That Aren't Simple ...

Often the change is not simple. While trying to fix the ID tags for Glenn Gould's English Suites, I needed to take part of the old title string and combine it with part of the old artist string. The title had the name of the major work, while the artist had the name of the movement in that piece (instead of, say, "Glenn Gould" or "Johann Sebastian Bach"). Each track had a unique artist string. Some tracks had the name of the major work, while others were left blank (ending up with "Track 02" and so on), meaning that they belonged to the same work as the previous track.

Screen shot: English Suites with CDDB default values

This looks like a tough programming problem, but it really is not that bad if I break it into little problems. Too often I see programmers (oh, that includes me too) try to solve several problems at once, which makes the solutions very complex, and complex solutions make for complicated maintenance. The Unix paradigm uses small tools in pipelines to accomplish tasks, and often I start making small tools to get the job done.

The first problem is getting the track names in order. If I can make the "Track N" strings match the major work name, I have solved one problem, and the next problem, combining that name with the movement name, is much easier.
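This first small problem amounts to a "fill forward" over the track list. The Python sketch below shows the idea under stated assumptions: it is not the author's script, the function name `fill_forward` is hypothetical, placeholders are assumed to look exactly like "Track" followed by a number, and the sample titles are invented for the example.

```python
import re

# A placeholder title is assumed to be exactly "Track" plus a number.
PLACEHOLDER = re.compile(r"Track \d+")

def fill_forward(titles):
    """Replace each "Track N" placeholder with the last real title before it."""
    filled = []
    previous = None  # most recent non-placeholder title seen so far
    for title in titles:
        if PLACEHOLDER.fullmatch(title):
            # Keep the placeholder if no real title has appeared yet.
            filled.append(previous if previous is not None else title)
        else:
            filled.append(title)
            previous = title
    return filled

titles = ["English Suite No. 1", "Track 02", "Track 03", "English Suite No. 2", "Track 05"]
for title in fill_forward(titles):
    print(title)
```

Once every track carries the name of its major work, combining that name with the movement name becomes a per-track string operation with no dependence on neighboring tracks.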
