Monday, February 17, 2003

Is Bloogle going to be a working version of the Semantic Web?

My brain almost hurts thinking this through...

From a talk by Larry Page (co-founder of Google):
It wasn't that we intended to build a search engine. We built a ranking system to deal with annotations. We wanted to annotate the web - build a system so that after you'd viewed a page you could click and see what smart comments other people had about it. But how do you decide who gets to annotate Yahoo? We needed to figure out how to choose which annotations people should look at, which meant that we needed to figure out which other sites contained comments we should classify as authoritative.
(via Stochastic Aleatory Ontological Expostulations) [Boing Boing]


5:16:35 PM    
The cost of document storage, real vs virtual

Hard to argue with the value of virtual document storage... but does it take into consideration the cost of converting a real document into a virtual one?

An amazing technology breakthrough - Dan Gillmor reports on a major milestone: storage costs for electronic files have dropped to $1 per gigabyte. Dan's article reflects on the significance of low-cost storage, and uses a lot of great examples.

For lawyers, who deal with reams of paper, here is another metric: 1 GB of data is equivalent to about 6 bankers boxes of scanned paper.

Here in our law firm it costs us $13.52 to send a box of records to storage, and then it's $0.23 per month for each box. So if we sent 6 boxes to storage, it would cost us $79.50 for the first month, and then $1.38 per month for the next 11 months. So the total yearly cost of storing 6 boxes would be $94.68.

Since it only costs $1 to store that data in electronic form, as compared to about $100 to store it in paper form, that means that the cost of storing paper is 100 times the cost of storing electronic documents.

Oh, and the cost of retrieving paper is not cheap. For our firm it would be $21.96 to retrieve a box from storage and return it. Or $13.39 to retrieve it permanently. Or $6.76 to have the storage company destroy the box. Obviously, there is no cost for retrieval or destruction of electronic files.

[Ernie the Attorney]
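
The storage arithmetic in Ernie's post is easy to check with a few lines of Python. This is only a rough sketch: the rates are the ones quoted above, while the constant and function names are my own, and I'm assuming the monthly fee is charged for all 12 months of the first year.

```python
from decimal import Decimal

# Rates as quoted in the post (US dollars). The names are illustrative only.
SEND_TO_STORAGE_PER_BOX = Decimal("13.52")   # one-time fee per box
MONTHLY_FEE_PER_BOX = Decimal("0.23")        # recurring fee per box
BOXES_PER_GIGABYTE = 6                       # "about 6 bankers boxes" per GB
ELECTRONIC_COST_PER_GIGABYTE = Decimal("1")  # the $1 per gigabyte figure


def yearly_paper_cost(boxes: int, months: int = 12) -> Decimal:
    """One-time transfer fee plus the monthly fee for every month stored."""
    one_time = SEND_TO_STORAGE_PER_BOX * boxes
    recurring = MONTHLY_FEE_PER_BOX * boxes * months
    return one_time + recurring


paper = yearly_paper_cost(BOXES_PER_GIGABYTE)
ratio = paper / ELECTRONIC_COST_PER_GIGABYTE
print(f"Paper, first year: ${paper}")
print(f"Paper vs. electronic: about {ratio:.0f}x")
```

With the per-box rates exactly as quoted, the first-year total comes out at $97.68, about $3 more than the $94.68 in the post, so one of the quoted figures may be slightly off. Either way the ratio stays at roughly 100 to 1, and that's before counting retrieval or destruction fees.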


10:50:02 AM    
Erik Noble notices that folks are moving off of Radio

Erik Noble notices that folks are moving off of Radio UserLand. I'm watching that too. I'll try to get an official answer to the future of Radio. Clearly UserLand doesn't have many resources, though. It's hard to innovate when you don't know how you're gonna pay your rent. [The Scobleizer Weblog]
I'm interested in moving off Radio myself.

Here's what I like about Radio Userland:

  • async upload to the server, meaning I can blog when I want to, and the server is updated when it can be

Here's what I don't like about Radio Userland:

  • the server logic is really on my machine, so no site searching
  • the server app runs as a regular app, which means it takes up Dock space and has a window
  • the server app is buggy, and can be a CPU and network hog at times

I'd like to move to Roller Weblogger, but it only has support for the Blogger API. I like being able to create weblog posts in NetNewsWire with titles and URLs, and the Blogger API doesn't support those. Now I'm just waiting until I can get my Java dev environment back online so I can look into adding the MetaWeblog API to Roller.
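
To make the title and URL problem concrete, here's a small sketch of the two XML-RPC requests, built with Python's standard xmlrpc.client so nothing actually goes over the network. The app key, blog ID, username, password, and post text are all made-up placeholders, and this says nothing about how Roller itself handles either call.

```python
import xmlrpc.client

# Placeholder values for illustration only; none of these come from a real blog.
APP_KEY = "0123456789ABCDEF"
BLOG_ID = "1"
USERNAME = "example-user"
PASSWORD = "example-password"

# Blogger API: the post body is a single string, so there is no slot
# for a title or a link.
blogger_request = xmlrpc.client.dumps(
    (APP_KEY, BLOG_ID, USERNAME, PASSWORD, "Body text only.", True),
    methodname="blogger.newPost",
)

# MetaWeblog API: the post is a struct, which leaves room for RSS-style
# fields such as title and link alongside the body (description).
post = {
    "title": "Example title",
    "link": "http://example.com/",
    "description": "Body text.",
}
metaweblog_request = xmlrpc.client.dumps(
    (BLOG_ID, USERNAME, PASSWORD, post, True),
    methodname="metaWeblog.newPost",
)

print(blogger_request)
print(metaweblog_request)
```

The second request carries the post as a struct with named members, which is where the title and link that NetNewsWire sends would live. The first has nowhere to put them.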


    9:50:41 AM    
    The Shifted Librarian comments on Google & Blogger

    Her posts tend to run long, and I can't always read every one of them, but each one is well crafted. The point by Scobleizer is interesting, if true. I've been moaning about the lack of search capabilities for a while now, but I never thought that it should be server-side.

    Wow. You leave your news aggregator for a couple of days and Google goes out and buys Pyra, the company that created Blogger. Pretty big news, but I have to admit that my first thought is that this is a mistake. As much as I love Blogger, I don't think Google needed to do this.

    Of course, the advantage to blogging this story late is that I can read and comment on others' opinions.

    Anil Dash: "More to the point, Google's consistent marketing message so far has been, 'We do search, and we don't want to be a portal'. ...the reality is that it puts Google into a far different role than they've had so far."
    -- I agree with Anil, and I'm worried that this is a sign that Google is branching out into an area that isn't integral to that mission.

    Nick Denton: "Expect to see, first of all, that Blogger-powered sites show up in Google search results minutes after the posts are published, rather than days."
    -- Actually, this wouldn't impress me at all. If Google can't evolve to do this for sites outside of its domain, then it will lose its edge. We're getting to the point where we already expect this.

    Scobleizer: "So, Google has a HUGE vested interest in making sure that the weblog communities survive. Let's say that Pyra went out of business. Google would lose much of its competitive advantage (and Microsoft probably would be able to move in and improve its search offerings and maybe even offer its own weblog tool -- anyone remember that Microsoft already offers free Websites over at http://groups.msn.com ?)"
    -- I disagree that this is Google's motivation. Plenty of great companies went bankrupt during the dot-com era, and Google can't go around saving them as part of a business plan. They either have something specific in mind, or it's an experiment (one that could fail, a definite possibility since I wouldn't consider Froogle or Google Answers to be successes so far and they don't take up the company's resources that Blogger will).

    No Time to Think: "The concept of the 'next big thing' has been building and taking shape. It's the theory of the 'Semantic Web' meets the power of 'Google' meets the value of 'Reputation'. Call it the 'Global Clique' (although one will exist for each subject) - everyone knows everyone (either directly or indirectly), someone knows everything and lots of people know where to find it or who to ask, there is no specific or consistent relationship between the participants (they're loosely coupled), and the thought leaders and the influencers - both in general and on specific subjects - are clear. It just needed a push. Today it got a huge one."

    That last one sounds better to me, and I hope that's where Google is headed. However, the great thing about Google - and the benefit that made it integral to our everyday lives - is that it searches "everything" in a distributed fashion and uses the PageRank algorithm to rank the results, all in less than one second. Adding blog link trails to Google News is a great idea (Jim McGee has a good summary), but they should have been able to do this without purchasing Pyra. Their advantage has always been the ability to index and rank content in the outside world, not on their servers. Even if they plan to add this type of functionality into the Google Search Appliance that they sell for big bucks, I would have thought it would be more impressive if they did it in a distributed fashion, without purchasing a blogging company.

    My hope is that they'll build a better search engine for individual blog entries. For example, right now I'm trying to find a site that I blogged about last year. It had an RSS sidebar integrated into the main page. When you click on one of the sites listed in the blogroll, that site's headlines opened seamlessly in the sidebar. Earlier tonight, I was trying to remember if it was Gateway or Dell that is making integrated 802.11b standard on its laptops. I had a heck of a time finding it in Google or Daypop (if you're interested, I finally found it in Dell's press releases). What we really need is a more granular search engine for finding content that is unique (the thoughts of bloggers) but not unique (general concepts that are blogged by one, thirty, or a hundred people).

    I'll be interested to see how this plays out, but I still think Google is missing the boat by not working closely with librarians. If they truly want to become THE place to go for information, they should continue their work on the semantic web, but there will always be information that they can't provide. The way to fill in those holes is to create a librarian-based PageRank and integrate 24/7 library virtual reference projects into their offerings. A good librarian will thrash Google Answers any day of the week.

    Just imagine searching Google for something, and not finding what you need, being unsure of a site's authenticity, or tiring of paging through thousands of results. What if you could type your postal code into a box on the search results page and be connected to your local library's virtual reference service? Suddenly you have an expert at your fingertips, as well as access to subscription-only databases. Tell me that wouldn't rock!

    [The Shifted Librarian]


    9:42:01 AM    
    What is the Animatrix?

    For one, it is beautiful!


    9:40:27 AM    

