Genesis of PageRank.
I've been playing with Grokker Preview Release 2 this evening. It's a big improvement over PR1 in many ways, although I still wouldn't recommend that anyone other than a search-tool nut buy it at this point. It did, however, lead me to an interesting paper by Sergey Brin and Lawrence Page about Google, written back in 1998, i.e. well before Google became the monster it is today.
It also has the best description of the PageRank algorithm and how it is calculated that I have seen so far. I'm guessing the algorithm is a good deal more sophisticated these days, but this might be of interest to others like myself who wonder about the inner workings.
To quote from that paper:
2.1.1 Description of PageRank Calculation
Academic citation literature has been applied to the web, largely by counting citations or backlinks to a given page. This gives some approximation of a page's importance or quality. PageRank extends this idea by not counting links from all pages equally, and by normalizing by the number of links on a page. PageRank is defined as follows:
We assume page A has pages T1...Tn which point to it (i.e., are citations). The parameter d is a damping factor which can be set between 0 and 1. We usually set d to 0.85. There are more details about d in the next section. Also C(A) is defined as the number of links going out of page A. The PageRank of a page A is given as follows:
PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))
Note that the PageRanks form a probability distribution over web pages, so the sum of all web pages' PageRanks will be one.
PageRank or PR(A) can be calculated using a simple iterative algorithm, and corresponds to the principal eigenvector of the normalized link matrix of the web. Also, a PageRank for 26 million web pages can be computed in a few hours on a medium size workstation. There are many other details which are beyond the scope of this paper. [Curiouser and curiouser!]
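For anyone who wants to see that recursion in action, here is a minimal sketch of the simple iterative calculation the paper mentions: just the quoted formula applied repeatedly over a toy three-page web. The graph, page names, starting values and iteration count are my own illustrative assumptions, not anything from the paper.

def pagerank(links, d=0.85, iterations=50):
    """links maps each page to the list of pages it links out to."""
    pages = list(links)
    pr = {p: 1.0 for p in pages}  # start every page with an equal rank
    for _ in range(iterations):
        new_pr = {}
        for page in pages:
            # sum PR(T)/C(T) over every page T that links to this page
            incoming = sum(pr[t] / len(links[t]) for t in pages if page in links[t])
            new_pr[page] = (1 - d) + d * incoming
        pr = new_pr
    return pr

# Toy three-page web: A and B link to each other, C links only to A.
toy_web = {"A": ["B"], "B": ["A"], "C": ["A"]}
print(pagerank(toy_web))

One small wrinkle worth noting: with the formula written exactly as quoted, the values converge to numbers that sum to the number of pages rather than to one; the "probability distribution" property described above holds for the variant in which the (1-d) term is divided by the total number of pages.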
11:35:54 AM