Updated: 9/21/2003; 1:40:10 PM.
Commentary on software, management, web services, and security

Thursday, January 30, 2003

Thoughts about Search Engines and Communication. When we do a search on the Internet we are seeking an answer to a question, looking for information. We want to make a query and get an accurate response, free of irrelevant and distracting information. The art of using search engines effectively is the art of framing a question. To frame a question for a search engine we must keep in mind that we are asking a machine for an answer, a machine that does not know the associative meanings that we take for granted when speaking with people. Look at the gif diagram (http://www2.eou.edu/~jhart/netcomdiagram.gif); it represents a simple view of Internet Communication, with just two people and two machines. P to P is person to person communication; M to M is machine to machine communication; P to M is person to machine communication; M to P is machine to person communication. Clearly the ways a machine communicates with another machine will be different from the ways a person communicates with another person. Similarly, a person to machine communication is different from a machine to person communication. The trick in all this is to recognize that what works with one relational direction of communication will not necessarily work with another direction. Obviously, if machines communicate with people the way that they communicate with one another, then people will not get the message. If people communicate with machines the way that they communicate with each other, then machines will not get the message. Many of the challenges of the modern age of Internet communication are summarized in the directional lines of this one simple communications rectangle (except that the complexities are multiplied many times over when we add millions of people and millions of machines).
As I understand it, the great effort now underway to construct a semantic web is an effort to enable M to M communications to utilize associative meanings; such semantic enhancements could yield emergent communications that are not now available between machines; those semantic communications could then facilitate M to P and P to M communications beyond the levels that are now available. (See http://www.w3.org/2001/sw/; "The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation." -- Tim Berners-Lee, James Hendler, Ora Lassila, The Semantic Web, Scientific American, May 2001.) One of the best ways to reveal the limits of current search engines is to start with the end result, i.e., begin at the end of the search rather than the usual start of the search. Begin with the answer rather than the question. E.g., if you know that you want to find the online instructional resource site MERLOT, you can see how effectively a search engine takes you to your sought-after target. In a typical search you can't know fully in advance where or what your target is, so you can't evaluate how effective or ineffective the search engine is in helping you find what you are looking for. A simple Google search for "MERLOT" works very well, bringing up contacts with the sought-after target in the first five listings before diverging into sites for Merlot wines. However, a KartOO search does not succeed as well; it does bring up the desired taste.merlot.org repository address but most of the other listings are for wine sites. If the spelling of the search term is changed from "MERLOT" to "Merlot", something a person unfamiliar with the repository listing might do, then the KartOO search engine is even less successful at identifying what is sought.
If a more advanced search technique is applied, using "MERLOT Educational Resources" as the search phrase, then the results are much more targeted, with all wine listings removed. But notice that I had already adjusted my search language to talk to a machine from the first search. If I do a Google search with the kind of query that I'd use with a person ("Find the MERLOT instructional resources site") then Google finds nothing. KartOO does do much better because it has been set up to accept plain language queries ("Find the MERLOT instructional resources site"?), but it takes 7 pages of listings to finally turn up the taste.merlot.org address; before that it brings up many sites that refer to MERLOT. The AskJeeves search engine also provides plain language search capabilities but found very little when given the "Find the MERLOT instructional resources site" instruction and did not locate the desired MERLOT address. Also notice that even in the plain language search example, I kept my language very simple so that I was still doing "computer speak" rather than "people speak." If I were speaking with a librarian or instructional support person I'd use a much more complicated set of instructions such as "Please help me find the MERLOT instructional resources site and all other repositories of online instructional resources. I want to restrict the search to higher education resources and resources in English. I'd like to exclude all discipline-specific sites." Just for fun I tried this complete set of instructions on Google and did get a number of relevant sites, even though Google indicated that it "limits queries to 10 words." When I tried the same complex set of instructions on KartOO nothing was located. It's impressive to me that the Google algorithms were able to reductively cope with the complex set of instructions and yield items of use--even though Google was unable to fulfill the complete set of instructions.
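To make the idea of "reductively coping" with a people-speak query concrete, here is a toy sketch in Python. It is not Google's algorithm, which I have no knowledge of; the function name reduce_query, the stop-word list, and the way the cap is applied are all my own assumptions for illustration. The only detail taken from the experiment above is the 10-word limit.

```python
# Toy illustration only: NOT how Google or any real search engine works.
# The stop-word list is a hypothetical, hand-picked set for this example.
STOP_WORDS = {
    "a", "all", "and", "find", "help", "i", "i'd", "in", "like",
    "me", "of", "other", "please", "the", "to", "want",
}
MAX_TERMS = 10  # mirrors the "limits queries to 10 words" notice quoted above


def reduce_query(plain_query, max_terms=MAX_TERMS):
    """Reduce a plain-language request to at most max_terms keywords."""
    terms = []
    seen = set()
    for raw in plain_query.split():
        word = raw.strip('.,;:!?"()')  # drop surrounding punctuation
        key = word.lower()
        if not word or key in STOP_WORDS or key in seen:
            continue  # skip filler words and repeats
        seen.add(key)
        terms.append(word)
        if len(terms) == max_terms:
            break  # everything after the cap is simply ignored
    return terms


if __name__ == "__main__":
    print(reduce_query("Find the MERLOT instructional resources site"))
```

Under these assumptions the short request shrinks to the four words MERLOT, instructional, resources, site, and the long librarian-style request is cut off after ten keywords, so the later restrictions (English only, no discipline-specific sites) never reach the machine at all.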
It's also impressive that the advanced search tools in Google and KartOO and other search engines permit a searcher to construct a complex sequence of searches once the person learns the search protocol language of the software. I'm not sure what to conclude from this brief analysis, except to say that the field of knowledge management and the study of human/machine communication are very much works in progress. What is encouraging is that the fields exist at all and that progress is underway. However, we are a long way from the kinds of conversational communications with machines that are so popularly depicted in science fiction books and movies. [EduResources--Higher Education Resources Online]
9:01:39 PM

Graph-based Data Models and Languages. (thanks Murray) [Semantic Web Blog, featuring RDF]
8:33:11 PM

© Copyright 2003 Erick Herring.
