The Sounds of Science

Last Friday, hsarik pointed out an interesting web site: Echo Nest.  They provide a web service that lets you analyze and remix music.  The API can also provide information (metadata) about music, artists, songs, etc., and it has Python bindings.  If you’ve seen the “More Cowbell” website, where you can upload an mp3 and have more cowbell (and more Christopher Walken) added to it, well, that site uses Echo Nest.  If you download the Python bindings for their API, you can see the script that adds the sounds.  Personally, I’m fond of “Ob-la-di, Ob-la-da” with 80% cowbell and 20% Christopher Walken.

I started playing with the API and, as a first cut, thought it would be neat to use the “get_similar” function, which gives you the top N similar artists for any given artist.  Now where can I get a list of artists I like?  Well, I could type ’em in, but that sucks.  So I wrote a small program which:

  1. Opens the database on my iPod (or a directory of mp3 files)
  2. Finds each artist by either reading the iPod db or looking at the id3 tags in all of the files
  3. For each artist, adds a node to a graph where the area of the node is proportional to the number of songs that artist has on the iPod (or in the music folder)
  4. For each artist, finds the top 50 similar artists
  5. For each similar artist that is also in my collection of artists, adds a graph edge between the two nodes
  6. Plots the graph
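
The steps above can be sketched in a few lines of plain Python.  This is a rough sketch and not the program I actually ran: `get_similar` here is a hypothetical callable you pass in (a stand-in for the web-service lookup), the track list is toy data rather than a real iPod database, and `build_artist_graph` and `node_radius` are names made up for illustration.

```python
import math
from collections import Counter


def build_artist_graph(track_artists, get_similar, top_n=50):
    """Return (song_counts, edges) for a collection of tracks.

    track_artists: iterable with one artist name per song.
    get_similar:   hypothetical callable, artist -> list of similar artist
                   names (an assumption, standing in for the web service).
    """
    song_counts = Counter(track_artists)  # steps 1-3: one node per artist
    edges = set()
    for artist in song_counts:  # step 4: look up similar artists
        for other in get_similar(artist)[:top_n]:
            # step 5: keep an edge only if the other end is in the collection
            if other in song_counts and other != artist:
                edges.add(frozenset((artist, other)))
    return song_counts, edges


def node_radius(song_count, scale=1.0):
    """Radius that makes the node's area proportional to song_count."""
    return scale * math.sqrt(song_count / math.pi)


# Toy data standing in for the iPod database and the similarity service.
tracks = ["Louis Armstrong"] * 4 + ["Louis Armstrong and Duke Ellington"] * 2
tracks += ["Robert Johnson"]
similar = {
    "Louis Armstrong": ["Louis Armstrong and Duke Ellington", "Not On My iPod"],
    "Louis Armstrong and Duke Ellington": ["Louis Armstrong"],
}
counts, edges = build_artist_graph(tracks, lambda a: similar.get(a, []))
```

Storing each edge as a `frozenset` means that A-similar-to-B and B-similar-to-A collapse into a single undirected edge.  The resulting counts and edges can then be handed to whatever graphing tool you like for step 6.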

What can I say, I’ve been working on a fair amount of graph theory at work recently.  So after processing my iPod, I came up with the following graph of my current music:

Okay, that’s pretty cool.  Almost completely illegible, but cool.  FWIW, the graph has 15 connected components.  Unfortunately, 13 of them are “singles” (not connected to anything), and one is a pair (Louis Armstrong paired with Louis Armstrong and Duke Ellington).  Fortunately, the graphing tool I use (igraph) has built-in tools for doing community analysis (using the leading eigenvector method), i.e., we can automatically find tightly coupled subgraphs.  A few examples from the 25 or so communities:
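
For the curious, counting connected components and “singles” doesn’t need anything fancy.  Here is a minimal sketch using a breadth-first search in plain Python; it is not the igraph code I used, the `connected_components` helper is made up for illustration, and the node names and edges are toy data.

```python
from collections import defaultdict, deque


def connected_components(nodes, edges):
    """Group nodes into connected components with a breadth-first search."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


# Toy graph: one chain of three artists, one pair, and two singles.
nodes = ["A", "B", "C", "D", "E", "F", "G"]
edges = [("A", "B"), ("B", "C"), ("D", "E")]
components = connected_components(nodes, edges)
singles = [c for c in components if len(c) == 1]
pairs = [c for c in components if len(c) == 2]
```

The community analysis is a different (and harder) problem, since it splits a single connected component into tightly coupled groups.  In python-igraph, as far as I know, the leading eigenvector method is exposed as `Graph.community_leading_eigenvector()`.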

which arguably correspond to “Indie,”  “Classic Rock,”  “Jam Bands,”  “Guitar Gods,” and “Alternative.”  If I processed my complete music database, I suspect we would wind up with several other communities, e.g., Blues.  But since Robert Johnson is the only blues I’ve got on there right now… he’s in a class by himself.

I suppose it goes w/o saying that my musical tastes aren’t everyone’s, and that if you don’t like them, you can keep it to yourself or go DIAF 🙂

So, what’s next?  I was talking with M from my office and we’ve come up with another interesting project for the Echo Nest API.  This one a) uses the audio analysis functions, and b) if we do it right, might cause someone to send us a cease and desist.  So, win all the way around.
