Wednesday, April 17, 2002

Community Documentation, Scripts and Tools. It's time. As a famous blogger wrote not too long ago, RTFM Won't Work: Documentation As Narrative.

(If you're doing stuff and you're not in there yet, you probably will be. Pester me and you shall be rewarded.)

[Russ Lipton Documents Radio]
12:43:04 PM
Scalable directories. The idea of combining search, directories, and OPML is powerful (note: the file you are reading was generated as an OPML file and translated into HTML). It is particularly useful once the amount of information to be organized scales beyond a couple thousand files or documents. The same techniques that I am advocating for use on the Web work equally well within an Intranet. Radio is a great way to generate and publish OPML-driven directories. If this takes off, it will be a major upgrade to the Web and to corporate knowledge management. Here is some more detail:
  • End-user generated directories. This is a bottom-up approach to the creation of topical directories that can scale to millions or billions of pages. Directories can cross-link through transclusion (a method of interlinking directory OPML files). This means that end-users can seamlessly browse large directories without an interruption of experience. This method scales. It's democratic. The best directories float to the top. Wow!
    • Decentralized development of directories of information. This starts with an end-user building a directory of resources (links to web pages, documents and pictures) in an outliner, which creates an OPML file. They publish the result to the Web (Radio makes this easy) or an Intranet.
    • Search services index these directories. A search engine like Google crawls the Web or the Intranet, finds the published OPML file, and indexes its results. End-users can then search on a keyword and get directories that are relevant to that term.
    • The most relevant directories are listed at the top. The directories are ranked by Google's quality measurement system (PageRank). Note: There isn't any uber directory that attempts to account for all info (the reason that Yahoo and DMOZ don't scale). It is completely keyword based.
  • Webpage relational metadata directories (this is data about information like a webpage). Every webpage Google crawls has a Google relational metadata structure. Unfortunately, almost all of this richness is lost in the current interface. This can be assembled as an OPML file. Here is how it would work.
    • Each webpage indexed has an auto-generated OPML assigned to it. This would be part of each return generated by a search engine.
    • The page's OPML file would contain relational metadata. Here is a list of what it could contain:
      • Sites (URLs) that link to it
      • Sites (URLs) that it links to
        • Internal to the site
        • External to the site
      • Keywords where it scores within the top ten returns
      • Documents it links to
      • Pictures it links to
      • Sites that are similar
      • A cached version of the page
        • Times when it was updated (with links to a cached copy of each previous update)
      • Publisher information (author, organization, etc.)
    • A link to a webpage included in the relational content would be a link to its OPML file. This way, I could move from rich description to rich description.
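To make the per-page idea a little more concrete, here is a minimal sketch in Python of what assembling one of these files might look like. It covers only a few items from the list above. Everything specific in it is an assumption for illustration: the function names (`opml_url`, `add_links`, `build_page_opml`), the `search.example.com` address, and the use of `type="link"` with a `url` attribute on outline nodes are not something any search engine actually provides.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote


def opml_url(page_url):
    # Hypothetical convention: the search service serves each page's
    # metadata OPML at this address. The host is a placeholder.
    return "http://search.example.com/opml?url=" + quote(page_url, safe="")


def add_links(parent, label, urls):
    # One labeled outline node with one child per URL. Each child points
    # at that page's OPML file rather than at the page itself, so a reader
    # can move from rich description to rich description.
    group = ET.SubElement(parent, "outline", text=label)
    for url in urls:
        ET.SubElement(group, "outline", text=url, type="link", url=opml_url(url))
    return group


def build_page_opml(page_url, linked_from, internal, external, keywords):
    # Assemble the outline for a single page and return it as an XML string.
    root = ET.Element("opml", version="1.0")
    head = ET.SubElement(root, "head")
    ET.SubElement(head, "title").text = "Relational metadata for " + page_url
    body = ET.SubElement(root, "body")

    add_links(body, "Sites that link to it", linked_from)

    links_to = ET.SubElement(body, "outline", text="Sites that it links to")
    add_links(links_to, "Internal to the site", internal)
    add_links(links_to, "External to the site", external)

    ranked = ET.SubElement(
        body, "outline", text="Keywords where it scores within the top ten returns"
    )
    for keyword in keywords:
        ET.SubElement(ranked, "outline", text=keyword)

    return ET.tostring(root, encoding="unicode")
```

Calling `build_page_opml("http://example.com/a", ["http://example.org/"], ["http://example.com/b"], ["http://example.net/"], ["opml"])` returns a small OPML document that an outliner could open. The remaining items (cached versions, similar sites, publisher information) would be more outline nodes built the same way.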
[John Robb's Radio Weblog]
12:24:37 PM