Thursday, January 30, 2003

The Invisible Web is subtitled "Uncovering Information Sources Search Engines Can't See" (Information Today, Inc., 2001, ISBN 0-910965-51-X). The authors are Chris Sherman, President of the Searchwise consulting firm and Associate Editor of SearchEngineWatch.com, and Gary Price, a library and information research consultant.

The book's focus is on what the authors call the "invisible web," defined as "the parts of the Web that search engines can't see" (p. 2). "The first challenge for the Web searcher is to understand that the Invisible Web exists in the first place.... In a nutshell, the Invisible Web consists of material that general-purpose search engines either cannot or, perhaps more importantly, will not include in their collections of Web pages (called indexes or indices). The Invisible Web contains vast amounts of authoritative and current information that's accessible to you, using your Web browser or add-on utility software--but you have to know where to find it ahead of time, since you simply cannot locate it using a search engine like HotBot or Lycos" (p. xxii).

The authors provide a useful brief history of search practices and tools on the Internet and the Web. They also offer a meaningful analysis of the differences between browsing and searching, and between inverted index structures (the product of search engines) and hierarchical graph structures (the product of directories). Many graph structures are not detectable by search engine crawlers.
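
As a rough illustration of that last distinction (my own sketch, not an example from the book), the two structures might look something like this in Python. The page URLs, page text, category names, and the helper names `build_inverted_index` and `browse` are all invented for the example; real search engines and directories are far more elaborate.

```python
from collections import defaultdict

# Hypothetical toy pages; the URLs and text are invented for illustration.
PAGES = {
    "example.org/a": "library search tools",
    "example.org/b": "search engine history",
    "example.org/c": "library catalog",
}


def build_inverted_index(pages):
    """Map each word to the set of pages containing it (what a search engine builds)."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


# A directory is a hand-built hierarchy: categories hold subcategories or page lists.
# The category names here are assumptions made up for this sketch.
DIRECTORY = {
    "Reference": {
        "Libraries": ["example.org/a", "example.org/c"],
    },
    "Computing": {
        "Search Engines": ["example.org/b"],
    },
}


def browse(directory, path):
    """Follow a list of category names down the hierarchy (what a browsing user does)."""
    node = directory
    for name in path:
        node = node[name]
    return node


INDEX = build_inverted_index(PAGES)

# Searching: look a word up in the index.
print(sorted(INDEX["search"]))
# Browsing: walk down from a top-level category.
print(browse(DIRECTORY, ["Reference", "Libraries"]))
```

Searching starts from a word and jumps straight to matching pages, while browsing starts from a category and walks down. A page that lives only in the hierarchy and was never crawled into the index cannot be found by the word lookup at all, which is the situation the authors describe.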

The authors also usefully identify a set of search engine "myths," such as the "All Search Engines Are Alike Myth," the "Search Engines Are Current Myth," the "Search Engines Overlap in Coverage Myth," the "If You Found It Once You'll Find It Again Myth," and the "Search Indexes Are Comprehensive Myth." All of these myths are explicated with examples and case studies.

The second half of the book is devoted to chapters on subject areas, and on topics within those subject areas, that describe directory sites; these sites may not be detectable with search engines but are accessible using ordinary web browsers (if you know to go to them). The authors also list sites that help researchers stay abreast of resources on the invisible web, such as the Scout Report (http://scout.cs.wisc.edu/scout/report/current), the Librarians' Index to the Internet (http://www.lii.org), Research Buzz (http://www.researchbuzz.com), Free Pint (http://www.freepint.co.uk), and the Internet Resources Newsletter (http://www.hw.ac.uk/libWWW/irn/irn.html). I use the Scout Report and the Scout Archives regularly to search for discipline-specific instructional resources, but the other sites were new to me.

The authors also maintain their own site related to the book (http://www.invisible-web.net), which updates the subject areas and topic areas from the book. However, I found the other update sites that the authors recommend (such as Research Buzz, Free Pint, and the IRN) to be much more developed than the book-related site (the "drill-downs" at the Invisible Web site are not very deep). I plan to add all of the recommended sites to my regular search forays when looking for instructional resources; I'll also use them to keep abreast of search engine and directory developments. If the authors were writing their book today, they would need to point to weblogs as sources of rapid updates; in my own work I often get valuable tips about locating Web resources and advances in search technology from sites such as The Shifted Librarian, Ariadne, D-Lib Magazine, and the LibTech Weblog.
