- Michael Jackson, Original Solo Recordings, 1972-1997
Last week I was at Music Hackday, messing around with music data at The Guardian offices in King's Cross, London. The outcome is this Michael Jackson doughnut, which spans the whole of his solo recording career and shows which of his tracks have been the most popular and loved by Last.fm listeners, as well as which releases have been the most influential and loved. The graphic was produced programmatically, so the same could be done for other artists quite easily.
Some notes on how to read this information graphic: The size of each slice is proportional to that track's or release's share of the total number of plays of Michael Jackson's material on Last.fm. The darkness of a slice indicates how loved that track or album is by the listener base. Tracks are organised by album, and albums by date, starting with his first solo record, Ben, from 1972.
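To make the slice geometry concrete, here is a minimal Python sketch of how play counts could be turned into doughnut slices. It is not the code behind the graphic (that was PHP and ActionScript); the function names `slice_angles` and `darkness`, the linear darkness scale, and the example numbers are all assumptions made for illustration.

```python
def slice_angles(play_counts, total_plays):
    """Map each item to (start_angle, sweep_angle) in degrees.

    total_plays is the total for ALL of the artist's material, so if the
    items shown account for only part of it, the slices will not close
    the circle and a gap is left over.
    """
    angles = {}
    start = 0.0
    for name, plays in play_counts.items():
        sweep = 360.0 * plays / total_plays  # share of the full circle
        angles[name] = (start, sweep)
        start += sweep
    return angles


def darkness(loves, max_loves):
    """Assumed linear scale: 0.0 is lightest, 1.0 is darkest."""
    return loves / max_loves if max_loves else 0.0


# Hypothetical numbers, not taken from the Last.fm dataset.
example = slice_angles({"Track A": 50, "Track B": 25}, total_plays=100)
print(example)  # Track A sweeps 180 degrees, Track B sweeps 90 degrees
```

In this made-up example the two tracks cover 270 degrees between them, so a quarter of the circle stays empty.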
Some notes on the metrics: The graphic takes into account six months' worth of Last.fm listening data: 3,606,823 individual plays of his tracks by 1,432,458 listeners worldwide. The love data spans a shorter period, a little over two months, which includes time both before and after his death. The love data for releases is normalised by track count, so the darkness of the blue hue expresses the average number of loves per track on that release.
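The normalisation by track count can be sketched in a couple of lines of Python. The function name `loves_per_track` and the release names and figures below are hypothetical; only the idea of dividing a release's loves by its number of tracks comes from the description above.

```python
def loves_per_track(release_loves, track_counts):
    """Average loves per track, so long releases are not favoured over short ones."""
    return {
        release: release_loves[release] / track_counts[release]
        for release in release_loves
    }


# Hypothetical figures: Release B has more loves in total,
# but Release A is more loved per track.
averages = loves_per_track(
    {"Release A": 90, "Release B": 100},
    {"Release A": 9, "Release B": 20},
)
print(averages)
```

Without this step, a release with twenty tracks would look darker than a nine-track release simply because it has more tracks to love.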
In the case of the (countless) albums that re-release original works, such as Number Ones, I have merged their play counts into a single play count for each original recording (e.g. Billie Jean). This is why you don't see releases like HiStory (Book 1) or The Essential Michael Jackson in the graphic. The plays of original tracks featured on those records do, however, count towards the tracks shown in the graphic, as I'm trying to get a feel for which of his original recordings are the most influential overall.
Some notes on the tech: I spent much of the 4-8 hours it took to produce the software coming up with ways of cleaning the data. The gap in the doughnut represents the proportion of plays which relate to live versions, endless remixes (official & unofficial) and collaborations (duets etc). I chose to keep these in the dataset in order to show proportionally how much attention goes towards that kind of material. I also stripped a lot of seemingly badly tagged or dubiously attributed tracks, using a bunch of filters informed by my own observation of the dataset. I used simple Levenshtein distances, helped by a few simple transforms, to merge track names together. The merging operation was crucial because some tracks have been officially re-released over 30 times and the metadata can vary a little each time. The graphic was produced using PHP & ActionScript, with JSON as the transport mechanism between the two.
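For anyone wanting to try the same kind of merge, here is a rough Python sketch of Levenshtein-based track name merging. It is not the original PHP: the `normalise` transform, the `max_distance=2` threshold, the idea of matching against a list of canonical titles, and the example play counts are all assumptions for illustration.

```python
import re


def normalise(title):
    """Assumed transform: lowercase, strip punctuation, collapse whitespace."""
    title = re.sub(r"[^a-z0-9 ]", "", title.lower())
    return " ".join(title.split())


def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def merge_play_counts(raw_counts, canonical_titles, max_distance=2):
    """Add each raw title's plays to the closest canonical title, if close enough."""
    merged = {title: 0 for title in canonical_titles}
    unmatched = {}
    for raw_title, plays in raw_counts.items():
        key = normalise(raw_title)
        best = min(canonical_titles,
                   key=lambda t: levenshtein(key, normalise(t)))
        if levenshtein(key, normalise(best)) <= max_distance:
            merged[best] += plays
        else:
            unmatched[raw_title] = plays  # e.g. live versions and remixes
    return merged, unmatched


# Hypothetical play counts, not taken from the Last.fm dataset.
raw = {"Billie Jean": 100, "Billy Jean": 10, "billie jean.": 5,
       "Billie Jean (Live)": 7, "Beat It": 50}
merged, unmatched = merge_play_counts(raw, ["Billie Jean", "Beat It"])
print(merged)     # {'Billie Jean': 115, 'Beat It': 50}
print(unmatched)  # {'Billie Jean (Live)': 7}
```

A fixed threshold is risky for short titles: "Bad" and "Ben" are only two edits apart, so in practice the allowed distance would probably need to scale with title length or be backed up by extra filters.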
Popularity and love data are from Last.fm, release dates are from MusicBrainz, and Discogs was used for authoritative discographic data.
Lastly, just a note on the motivation: MJ was primarily a childhood experience for me (I was 6 when Bad was released), and the hours I've spent with the dataset have given me the time to reflect on what a grip his older material has had on my generation's collective identity. As the notion of a mainstream, backed as it is by broadcast media, gives way to thousands of niches in disparate networks, it's hard not to see MJ as a relic of an age of mass cultural experience that is slowly receding into the distance. R.I.P. MJ.
