Book Reviews


Thursday, January 2, 2003

What to do with life?
The Scobleizer Weblog answers Aaron Swartz, who is asking what he should do with his life: "Now, to really understand the question, you need to understand that Aaron is young. He's still of the age that most people would be attending to high school. I was very fortunate to meet Aaron a couple times this past year. To say he's bright would be an understatement. [...] So, in that spirit, here's my answer: Aaron, soak up as much knowledge as you can in the next 12 years. Don't see that you need to give anything back until you're 30. I think you'll do that naturally anyway. Instead, become a master at something that society will need in 2015. Software might be it. I doubt we'll need more 802.11 experts in 2015, though. [...] I personally wouldn't worry too much about it. Life happens. The people I respect are those who build things that other people value. People like an actor on a broadway play, or the guy who designs the signs in Times Square (or, more likely, writes the software that runs them) or a guy who builds a cool scooter for us all to use. [...] If I were in your position, I'd learn to master something really hard. Something most people aren't willing (or are unable) to take on. You have 12 years to prepare. Go to it!"


Cold fit only for people with dogs
Today is a cold day. I wrote earlier about Finns at different temperatures. The cold is no longer a light matter in Finland: a week of temperatures near -30°C (-22°F) is making outdoor life difficult. People with dogs venture outside, but others tend to stay indoors. I like being out in the open in winter, but with small children, going outside in this weather is not something to do lightly.

A friend with a dog commented on the cold weather: "When was the last time you saw a dog with frostbite?" Even now his dog refuses to go into her little cabin. She prefers the snow-covered garden, where it is cooler...


Google PageRank algorithm
As many have noted, the PageRank algorithm amounts to solving a large eigenvalue problem. In the description of the method written in 1997/1998 (see below), the dimension of the square matrix was 26 million. Today the dimension is of the order of 10^9, so the problem is computationally demanding. It must be one of the biggest eigenvalue computations done routinely. Of course, the matrix is sparse, and the solution does not need to be exact to full precision. It would be interesting to know what kind of solution methods are used, and how the solution is parallelized.

Curiouser and curiouser! (via Blogging Alone) writes: "... it did lead me to an interesting paper by Sergey Brin about Google.  This was written in, I guess, 97/98 i.e. well before Google became the monster it is today. [...] it does have the best description of the page rank algorithm and how it is calculated that I have seen so far.  I'm guessing it's a good deal more sophisticated these days but this might be of interest for others like myself who wonder about the inner workings. [...] To quote from that paper:

2.1.1 Description of PageRank Calculation

Academic citation literature has been applied to the web, largely by counting citations or backlinks to a given page. This gives some approximation of a page's importance or quality. PageRank extends this idea by not counting links from all pages equally, and by normalizing by the number of links on a page. PageRank is defined as follows:

We assume page A has pages T1...Tn which point to it (i.e., are citations). The parameter d is a damping factor which can be set between 0 and 1. We usually set d to 0.85. There are more details about d in the next section. Also C(A) is defined as the number of links going out of page A. The PageRank of a page A is given as follows:

PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))

Note that the PageRanks form a probability distribution over web pages, so the sum of all web pages' PageRanks will be one.

PageRank or PR(A) can be calculated using a simple iterative algorithm, and corresponds to the principal eigenvector of the normalized link matrix of the web. Also, a PageRank for 26 million web pages can be computed in a few hours on a medium size workstation. There are many other details which are beyond the scope of this paper."
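
To make the "simple iterative algorithm" from the quote concrete, here is a minimal Python sketch of a power-style iteration on a toy link graph. It is only an illustration under stated assumptions, not a description of how Google computes PageRank. The function name `pagerank`, the three-page example graph, the tolerance, and the iteration limit are all made up for this sketch. Two further assumptions: the constant term is (1-d)/N rather than the quoted (1-d), which is the normalization under which the scores actually sum to one, and the rank of pages with no outgoing links is spread evenly over all pages.

```python
def pagerank(links, d=0.85, tol=1e-10, max_iter=200):
    """Iterate PR(A) = (1-d)/N + d * sum(PR(T)/C(T)) over the pages T linking to A.

    `links` maps each page to the list of pages it links to (hypothetical input format).
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform distribution

    for _ in range(max_iter):
        # Assumption: rank held by pages without outgoing links is shared by all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new = {p: (1.0 - d) / n + d * dangling / n for p in pages}

        # Each page T passes d * PR(T) / C(T) to every page it links to.
        for src, targets in links.items():
            if targets:
                share = d * rank[src] / len(targets)
                for t in targets:
                    new[t] += share

        change = sum(abs(new[p] - rank[p]) for p in pages)
        rank = new
        if change < tol:  # stop once the ranks no longer move
            break
    return rank


# Toy graph: A links to B and C, B links to C, C links back to A.
ranks = pagerank({"A": ["B", "C"], "B": ["C"], "C": ["A"]})
for page, score in sorted(ranks.items()):
    print(page, round(score, 4))
```

Each pass only touches the links that exist, so the cost per iteration grows with the number of links rather than with the square of the number of pages. That is why the sparsity of the matrix matters so much for a problem of this size.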