Updated: 2/1/2003; 12:26:34 PM.
Blogging Alone
Stephen Dulaney's Radio Weblog
        

Wednesday, January 01, 2003

Genesis of PageRank.

I've been playing with Grokker Preview Release 2 this evening.  It's a big improvement over Preview Release 1 in many ways, although I still wouldn't recommend that anyone other than a search-tool nut buy it at this point.  However, it did lead me to an interesting paper about Google by Sergey Brin and Lawrence Page.  It was written, I guess, in 1997/98, i.e. well before Google became the monster it is today.

It has the best description of the PageRank algorithm and how it is calculated that I have seen so far.  I'm guessing the algorithm is a good deal more sophisticated these days, but this might interest others who, like me, wonder about the inner workings.

To quote from that paper:

2.1.1 Description of PageRank Calculation

Academic citation literature has been applied to the web, largely by counting citations or backlinks to a given page. This gives some approximation of a page's importance or quality. PageRank extends this idea by not counting links from all pages equally, and by normalizing by the number of links on a page. PageRank is defined as follows:

We assume page A has pages T1...Tn which point to it (i.e., are citations). The parameter d is a damping factor which can be set between 0 and 1. We usually set d to 0.85. There are more details about d in the next section. Also C(A) is defined as the number of links going out of page A. The PageRank of a page A is given as follows:

PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))

Note that the PageRanks form a probability distribution over web pages, so the sum of all web pages' PageRanks will be one.

PageRank or PR(A) can be calculated using a simple iterative algorithm, and corresponds to the principal eigenvector of the normalized link matrix of the web. Also, a PageRank for 26 million web pages can be computed in a few hours on a medium size workstation. There are many other details which are beyond the scope of this paper.
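
To make the "simple iterative algorithm" concrete, here is a minimal sketch in Python of how the formula above might be iterated. It is my own illustration, not code from the paper: the function name pagerank, the dict-of-lists link format, the toy three-page web, and the tolerance and iteration cap are all assumptions, and it assumes every page has at least one outgoing link and only links to pages in the dict.

```python
def pagerank(links, d=0.85, tol=1e-10, max_iter=1000):
    """Iterate PR(A) = (1-d) + d * (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)).

    `links` maps each page to the list of pages it links to (a
    hypothetical toy representation). Assumes every page has at least
    one outgoing link and every link target is itself a key.
    """
    pages = list(links)
    pr = {p: 1.0 for p in pages}  # arbitrary starting guess
    for _ in range(max_iter):
        # Every page starts each round with the (1-d) term.
        new = {p: 1.0 - d for p in pages}
        for src, targets in links.items():
            # A page splits its current rank evenly over its C(src) links.
            share = pr[src] / len(targets)
            for t in targets:
                new[t] += d * share
        # Largest change of any single page in this round.
        delta = max(abs(new[p] - pr[p]) for p in pages)
        pr = new
        if delta < tol:  # values have settled
            break
    return pr


# Toy web: A links to B and C, B links to C, C links back to A.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(toy_web)

# Rescale so the values sum to one.
total = sum(ranks.values())
normalized = {p: r / total for p, r in ranks.items()}
```

One thing the sketch shows: with the formula exactly as quoted, the values on this toy web settle to a total equal to the number of pages (three here), not one. Dividing each value by that total, as in the last two lines, gives numbers that sum to one, which seems to be the sense in which the quote calls PageRank a probability distribution.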

[Curiouser and curiouser!]
11:35:54 AM

© Copyright 2003 Stephen Dulaney.
 