Programming Projects

You asked for it! (Ok, maybe you didn't)

Thursday, January 2, 2003

The Interest Engine, some background

This is just some text I've been playing with. Lemme know what you think:

The internet as it stands is a vast collection of interrelated documents and media. Thousands of freely available engines attempt to aggregate and classify this information, both through automation and with human intervention, usually with a high degree of accuracy. The semantic nets created by this process effectively categorize these bodies of text and other forms of media at many levels of granularity. By peering over the shoulder of a user and watching her habits of information retrieval, we can effectively reverse-engineer the topics and subject areas in which that user is interested. This allows us to extrapolate, and potentially even anticipate, the "type of resource" a user will seek, and to be proactive about suggesting it.
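To make the "peering over the shoulder" idea a little more concrete, here's a minimal sketch in Python. It assumes some existing engine has already labeled each page with topics; the `page_topics` mapping, the URLs, and both function names are made up for illustration and aren't part of any real system. All it does is count how often each topic turns up in a user's history and turn that into a rough profile.

```python
from collections import Counter


def interest_profile(visits, page_topics):
    """Return each topic's share of the topic labels seen in a visit history.

    visits      -- list of visited URLs, repeats allowed (hypothetical data)
    page_topics -- dict mapping URL -> list of topic labels, assumed to come
                   from an existing classification engine
    """
    counts = Counter()
    for url in visits:
        # Pages the classifier knows nothing about contribute nothing.
        for topic in page_topics.get(url, ()):
            counts[topic] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {topic: count / total for topic, count in counts.items()}


def top_interests(profile, n=3):
    """Return the n strongest topics, ties broken alphabetically."""
    return sorted(profile, key=lambda t: (-profile[t], t))[:n]


# Hypothetical example data.
page_topics = {
    "http://example.com/py-tutorial": ["python"],
    "http://example.com/kitchen-scripts": ["python", "cooking"],
}
visits = [
    "http://example.com/py-tutorial",
    "http://example.com/kitchen-scripts",
    "http://example.com/py-tutorial",
    "http://example.com/unclassified",
]
profile = interest_profile(visits, page_topics)
```

A real profile would probably want to weigh recency and time spent on a page, but even a plain tally like this gives the engine something to extrapolate from.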

Also, in this latest emerging generation of internet information storage, analysis and retrieval, there is a form of information classification which, while not particularly new, has become more prevalent with advances in computing power, storage space, and network bandwidth. That form is "similarity by association": the notion that if one user possesses 100 documents and another user possesses 85 of the same documents, chances are very high that the other 15 documents are of interest to the second user. Technologically, this information is trivial to mine and aggregate once a system to do so is in place. Systems like this are deployed in advanced search engines and, most notably, in commercial sites such as Amazon. As with search-engine mining, such engines and sites can be created where appropriate, or mined where they already exist, to produce a series of further interests and possibilities.
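Here's a rough sketch of the 100-versus-85 example in Python. It is only one way to do it: I'm assuming each user's collection is a plain set of document names, and I'm using Jaccard overlap (shared documents divided by all documents either user has) as the similarity measure, which is my own pick and not something any particular site is known to use. The function name, the `min_overlap` threshold, and the sample data are all hypothetical.

```python
def suggest_documents(target, libraries, min_overlap=0.5):
    """Suggest documents the target user lacks, based on similar users.

    libraries   -- dict mapping user name -> set of document names
    min_overlap -- assumed cutoff; users less similar than this are ignored
    """
    mine = libraries[target]
    scores = {}
    for user, docs in libraries.items():
        if user == target:
            continue
        union = mine | docs
        if not union:
            continue
        # Jaccard overlap: shared documents / all documents either user has.
        overlap = len(mine & docs) / len(union)
        if overlap < min_overlap:
            continue
        # Every document the other user has that we lack gets credit
        # in proportion to how similar that user is to us.
        for doc in docs - mine:
            scores[doc] = scores.get(doc, 0.0) + overlap
    # Highest score first, ties broken by name.
    return sorted(scores, key=lambda d: (-scores[d], d))


# The example from the text: one user has 100 documents,
# a second user has 85 of the same ones.
libraries = {
    "first": {f"doc{i}" for i in range(100)},
    "second": {f"doc{i}" for i in range(85)},
}
suggestions = suggest_documents("second", libraries)
```

With this data the overlap is 85/100, so the second user is offered exactly the 15 documents she doesn't have yet, and the first user is offered nothing. With more users, a document held by several similar users collects more credit and floats to the top.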

Together, technologies like these can be harnessed in a desktop system to remarkably enhance the user experience, both on the internet and on something as simple as a local network.

Such a system requires an extremely open and flexible framework to accommodate a highly modular design for constant updates.

Additionally, it's likely (though as yet untested) that, beyond these isolated methods of creating profiles and pictures, a massive amount of cross-pollination between them would unlock a level of correlation and correspondence currently unrevealed. This is the eventual goal of the Interest Engine: to cross these and other boundaries and bring the user something much, much closer to the full, currently unrealized power of the Internet.

6:37:28 PM
