Making Connections

14 February 2003

Clay Shirky's article on weblogs and power laws continues to generate a lot of conversation. Steve Johnson agrees that there is a power law distribution and then asks:

"So the question that I'm wrestling with is this: let's say we decided that the existing power-law distribution isn't quite fair enough, or that there's some other justification for encouraging a more egalitarian spread (equality of results, and not just opportunity.) If we decided that this was our goal, how would we go about doing it? What architectural changes would fight against the power law trend, without doing it in a command-and-control kind of way?"

One of the possible answers is David Sifry's interesting newcomers list on Technorati. He describes the thinking behind the new feature in his weblog:

"What I wanted to do was to break that power law, and give more exposure to the lesser known, but still interesting bloggers, especially on days when they stand out and do something interesting..... Basically, the idea is that for a relatively obscure blogger who has, say, 40 people currently linking to his blog, getting 4 or 5 new blogs linking to him can have the same effect as a a-list blogger getting 40 or 50 new links..... This is interesting research for me, but the most satisfying thing about it is that I've found a way to identify interesting new writers and add them to my blogroll - people who I would have never had found out about otherwise."

I'm following this with interest as it parallels discussions within a couple of my current projects. We're seeing what are clearly power law distributions in the content that people are reading. I was surprised when I first noticed how skewed these distributions were. I had expected some articles to be more popular than others, of course, but I didn't expect the extent of the bias towards the top few articles. Learning how prominent power law distributions are across various domains was enlightening - what we're seeing with content bias is not unusual.

The question is: what, if anything, should we do about it?

The comparison with linking between weblogs and web sites is relevant but not an exact model. Firstly, the projects I'm considering are internal with no external linking. The articles are accessed through a content management system which shows all new content on the home page and also classifies it for access via hierarchical navigation. When newer content arrives, the older content drops off the home page but remains available through navigation or search. Unlike the links between weblogs and web sites, there is no preferential linking to certain content, although there may be some discussion within the community that points others to the content. If so, this is via email or conversation and outside the channels that I can directly monitor.

Secondly, unlike a weblog or web site, we don't necessarily have any reason to get excited about the popularity of one article rather than another. As a blogger, it's in my interest to have more people read my weblog, hence a skewed distribution that reduces the chances of people reading my weblog is a concern. Something like the Technorati interesting newcomers list is a clear benefit if it increases my chances of being read (assuming for the moment that I value a broader audience, which is probably a fair assumption for a lot of weblogs and certainly for most web sites, at least within the commercial sector).

The benefits of highlighting unread articles are not so clear within the context of my content management projects. There may well be a case for bringing under-utilised content to people's attention - and with some articles that is almost certainly the case - but it's not necessarily a general rule across all users and all content. Some content may quite validly be of minority interest and promoting it may be counter-productive. Yes, readers can ignore links that don't sound interesting, but we have an editorial role that explicitly privileges the information we point people to. If they follow too many links that are of minority interest, they will begin to lose confidence in the editorial stance in general.

In fact, there's a good argument for taking the opposite approach, promoting popular articles by home page positioning, newsletter links, commissioning and linking related content and so forth. This is very much the mass media approach: this is what the audience wants, so let's give them more of it.

The above notwithstanding, the projects have a knowledge sharing role as well as a publication role. In this context, there is greater benefit in highlighting under-utilised content. It's the serendipity effect - broadening your knowledge beyond what you know and what you think you need to know.

This is where I'm interested in the Technorati approach or something similar. I had considered a list of hot topics based on number of accesses weighted to more recent accesses - whatever is being read at the moment will move up the list, even if it is not popular in the overall standings.
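
To make the hot topics idea concrete, here is a minimal sketch in Python. It assumes each article has a list of access timestamps and that an access loses half its weight every seven days. The half-life, the function names and the data layout are all my own assumptions for illustration, not something taken from an existing system.

```python
from datetime import datetime, timedelta


def hot_score(access_times, now, half_life_days=7.0):
    """Sum one decayed weight per access; recent accesses count for more.

    half_life_days is an assumed tuning value: an access that is
    half_life_days old contributes 0.5, one twice as old 0.25, and so on.
    """
    score = 0.0
    for accessed_at in access_times:
        age_days = (now - accessed_at).total_seconds() / 86400.0
        if age_days < 0:
            continue  # ignore timestamps that lie in the future
        score += 0.5 ** (age_days / half_life_days)
    return score


def rank_hot(accesses_by_article, now, half_life_days=7.0):
    """Return article names ordered from hottest to coldest."""
    scores = {
        name: hot_score(times, now, half_life_days)
        for name, times in accesses_by_article.items()
    }
    return sorted(scores, key=scores.get, reverse=True)
```

With this weighting, an article read twice today would outrank one read ten times ten weeks ago, which is the behaviour I had in mind. A shorter half-life makes the list more volatile, a longer one makes it look more like the overall standings.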

I hadn't considered David Sifry's approach which weights those items which have recently increased their popularity. If I've understood what he's doing, it can be summarised as weighting based on the number of new readers in proportion to the number of existing readers, with a threshold to prevent disproportionate bias towards items with one or a few readers.
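
My reading of that approach can be sketched as follows. This is only my interpretation, not Technorati's actual implementation: the threshold of ten existing readers, the function names and the sample figures are all hypothetical.

```python
def newcomer_score(existing_readers, new_readers, min_existing=10):
    """New readers in proportion to existing readers.

    Items below the assumed min_existing threshold score zero, so that an
    article going from one reader to two does not dominate the list.
    """
    if existing_readers < min_existing:
        return 0.0
    return new_readers / existing_readers


def rank_newcomers(stats, min_existing=10):
    """stats maps item name -> (existing_readers, new_readers).

    Returns names with a positive score, highest proportional growth first
    (ties broken alphabetically).
    """
    scored = [
        (newcomer_score(existing, new, min_existing), name)
        for name, (existing, new) in stats.items()
    ]
    scored = [(score, name) for score, name in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored]
```

Under these assumptions, an item with 40 existing readers and 5 new ones scores 0.125, well ahead of an item with 4,000 readers and 50 new ones at 0.0125, while an item with a single existing reader is left out altogether.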

I'm not sure which measure, if either, will be more useful for our audience. I'm also not sure whether introducing another new way to access content will be useful - there is a point at which too many navigation routes simply confuse everyone. But being aware of the wider perspective raised by the power law discussion - and of options that might counteract the a-list effect - will certainly give us more to think about and other models to consider.

My final questions concern whether or not introducing counter-measures to the a-list bias will actually work. Will it change the distribution of access to articles? Will it change it in a useful way (ie beneficial to our readers)? Will it change which articles receive the bulk of the audience focus but not change the actual distribution? If so, is that useful (ie will better articles get to the top, for some value of 'better')? How does it interact with the community benefits of most people having read the same content and having a common basis for discussion?