I recently packaged up a version of the Webjay.org database for a
researcher interested in collaborative filtering of music. I'm
posting here to offer a copy to any interested researcher or
programmer. [...]
What could you do with this information? You can track the spread of
songs from one playlist to another, since successive additions will be
ordered by date; I imagine there are characteristic patterns. You
might develop methods to infer how appealing a song is, because a
more appealing song should not only be more widespread but should also
spread faster. You could study what proportion of songs are hits. You
could study which users have the most influence in creating hits, and
ask what characteristics influencers have. Because the data is available
to everyone, researchers could use it as a common benchmark for comparing
their algorithms, much as is done with the NIST face image dataset.
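
To make the first couple of ideas concrete, here is a minimal sketch of how one might measure how widely and how quickly a song spreads. It assumes the data can be reduced to (song URL, playlist ID, date added) records; that layout, the sample rows, and the name spread_stats are hypothetical and are not taken from the actual export.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records: (song_url, playlist_id, date_added).
# The real export's layout may differ; these rows are made up.
ADDITIONS = [
    ("http://example.com/a.mp3", "p1", date(2005, 1, 1)),
    ("http://example.com/a.mp3", "p2", date(2005, 1, 3)),
    ("http://example.com/a.mp3", "p3", date(2005, 1, 11)),
    ("http://example.com/b.mp3", "p1", date(2005, 1, 2)),
    ("http://example.com/b.mp3", "p4", date(2005, 3, 3)),
]


def spread_stats(additions):
    """Return {song: (playlist_count, span_in_days, new_playlists_per_day)}."""
    # song -> {playlist: earliest date the song was added to that playlist}
    first_seen = defaultdict(dict)
    for song, playlist, added in additions:
        prev = first_seen[song].get(playlist)
        if prev is None or added < prev:
            first_seen[song][playlist] = added

    stats = {}
    for song, by_playlist in first_seen.items():
        dates = sorted(by_playlist.values())
        count = len(dates)
        span = (dates[-1] - dates[0]).days
        # Playlists gained per day after the first appearance.
        # A span of zero days is reported as 0.0 rather than dividing by zero.
        rate = (count - 1) / span if span > 0 else 0.0
        stats[song] = (count, span, rate)
    return stats


if __name__ == "__main__":
    for song, (count, span, rate) in spread_stats(ADDITIONS).items():
        print(song, count, span, round(rate, 3))
```

The playlist count stands in for how widespread a song is and the per-day rate for how fast it spreads. Both are crude measures, and a real analysis would want to account for how long each song has been in the database.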
The data is licensed under the Creative Commons Attribution-ShareAlike
License. More data may be available on request. It may also be
possible to gain direct access to the un-anonymized live data for the
purpose of developing real world services.