Craig Cline's Blog

Monday, April 18, 2005

O’Reilly’s Emerging Technologies 2005 Conference: Nerdstock!

O’Reilly’s annual Emerging Technologies Conference (or Etech, as the technorati like to call it) has become the gathering place for geeks, hackers, venture capitalists, analysts, and kooks from all over the world. It has a proven reputation for looking at the future in a unique way: not only from the 50,000-foot level (which you find at most such conferences) but also from the trenches. Many of the speakers and attendees are the people who are actually pushing the edge of the proverbial envelope in practice, which gives the conference an authority that most others lack.

Besides, where else can you find Feral Robot Dogs hacked to sniff out toxic waste gases at Superfund sites side by side with Amazon’s Jeff Bezos personally demoing OpenSearch, a syndicated search tool just announced by Amazon’s A9 search subsidiary (www.a9.com)? Did I mention that this is my favorite conference?

Well, yes, Etech is a Nerdstock. Where else but Etech will you hear Bezos crack the following “joke”: “I asked Danny Hillis (founder of Thinking Machines and now with Applied Minds) ‘What is a global consciousness?’ Danny replied ‘That’s easy – that’s what decided de-caf coffee pots should have an orange handle.’”

Each Etech has a theme, and this year’s was “Remix: your hardware, your software, your media, your world.” “Remix” is a natural evolution of the theme of last year’s event and O’Reilly’s Web 2.0, which both started with the premise that the web has changed everything, and that wonderful things are beginning to emerge from developers and entrepreneurs who are combining web services, social software (such as Wikis) and web site APIs to create “next generation” applications. As Joshua Schachter, creator of www.del.icio.us (a web site that allows people to post their bookmarks and share them with others) said, “It used to be that when you wanted something, you went and made it. Then we turned into a bunch of consumers.”

Etech 2005 spun this process of creating new products and services by “remixing” existing web services, applications, content, music, APIs – and even DNA – as the stuff that the future of computing will be made of. Etech tackled this theme head on, fearlessly addressing not only the technological, but also the political and legal ramifications of remixing other people’s intellectual property. Indeed, a pall hung over the conference in the guise of the anti-piracy bills that Orrin Hatch and others are trying to get passed that could, according to the EFF and others, make even the iPod illegal.

One of the best aspects of this event is also the thing that makes it so difficult to cover – it has to be one of the most heavily documented conferences in the world, with most of that coverage happening in real time at the event. There’s the conference site, with links to transcripts, MP3s, blogs, wikis and the like (http://www.oreillynet.com/et2005). There’s the conference Wiki (http://wiki.oreillynet.com/etech05/index.cgi) where anyone attending the event could contribute their two cents on sessions, conversations, and even lost items. And of course there is the blog coverage, which in the past year has blossomed into a de rigueur part of covering any conference in real time. In short, if you want to find out what happened at Etech – including, I’m certain, who dated whom – just Google “Etech 2005” (http://www.google.com/search?sourceid=navclient&q=Etech+2005) and you will find over 45 pages of links to both mainstream and blog coverage.

O’Reilly’s Radar

Tim O’Reilly and Conference Chair Rael Dornfest kick off every Etech with a free-wheeling tour of the latest technology and trends that, not coincidentally, comprises the framework for the rest of the conference. Quoting from George Bernard Shaw – “The reasonable man adapts himself to the world. The unreasonable man adapts the world to himself. Therefore progress depends on the unreasonable man.” – Dornfest rattled off a litany of remixes in the lab, companies, home, and “life” and pointed out how “hacks become frameworks become foundations” for future operating systems or applications.

O’Reilly discussed how “design patterns apply to web applications,” borrowing the pattern language pioneered by Christopher Alexander [A Pattern Language: Towns, Buildings, Construction (Center for Environmental Structure Series)]. Quoting Alexander, “Each pattern is a three part rule that expresses a relation between a certain context, a problem, and a solution,” O’Reilly applied it to the world of hacking and application development. For example: people want to re-use images from the web in other contexts; therefore you must provide high-resolution images for online materials that you expect others to use. The context in this example is anticipating every possible future use for images posted online, the problem is how to provide multiple resolutions for each image, and the solution is a service that provides whichever resolution your application requires. (Publishers will recognize that this is what OPI has provided for years in the print world – shouldn’t it work online too?)

O’Reilly went on to point out that many of the most popular web sites are in “perpetual beta,” with open APIs that enable third parties to enhance the main application or adapt parts of the service for their own web applications. Sites that are “always a work in progress” include Google, Flickr, Safari, del.icio.us, and Y!Q (Yahoo Search). At these sites “users add value to shared data. The key to competitive advantage in networked applications is the extent to which your users can augment your data with their own.”

O’Reilly argued that the PC is no longer the only access point for networked applications: people around the world are increasingly using mobile smartphones such as the Treo 650, Blackberry 7280, or Nokia 7610, rather than desktop computers, as a portal to web services such as Flickr or Amazon. Design for participation, O’Reilly went on: “A successful open source project consists of ‘small pieces loosely joined’ (to quote David Weinberger). Therefore your software or service should be architected in such a way as to be modular, to be used easily as a component of larger systems.”

“You no longer have to build all components on the web yourself,” O’Reilly concluded, and pointed to isbn.nu as a site that adds value by aggregating search results of dozens of other sites. www.isbn.nu offers a quick way to compare the prices of any in-print and many out-of-print books at 14 online bookstores.

Last year’s hot topic, social networking, was declared to be “badly broken” in its initial implementations as destination sites like Friendster. However, O’Reilly noted, as a by-product of applications such as email, instant messaging, photo sharing and even book buying, it makes sense. Amazon’s use of its knowledge of user preferences to provide recommendations to other users is perhaps the best known use of social networking within a commercial site. But there is no reason anyone couldn’t engineer social networking into their applications, and in fact doing so could give you a competitive edge.

To underscore its commitment to “Small pieces loosely joined – and remixed,” O’Reilly announced a new magazine and web site devoted to “hacks and How-tos for your gear” called Make. The first issue contains stories on backyard monorails, XM Radio hacks, iPod tricks, aerial photography with kites, feral robot dogs that sniff out toxic waste, and how to make a magnetic stripe card reader. The magazine launch was accompanied by a Make: Fair, where a dozen of the projects highlighted in the first issue were demonstrated at a wine and cheese reception.

Flickr

Each Etech has its Favorite Company, and this year it was Flickr. The popular photo sharing web site is built entirely on open source and is, as O’Reilly pointed out, in perpetual beta. Indeed, go to the site (www.flickr.com) and “beta” is part of the site’s logo. It has gained immense popularity with the technorati because its open API makes it easy to extend Flickr’s functionality, or to add call-outs to your own Flickr image portfolio directly from within your blog or your web site. Being in perpetual beta, with lots of developers, has its downside. As Flickr’s Stewart Butterfield related in his talk, “Web Services as a Strategy for Startups,” Flickr lost control over a lot of things that were happening, and had to contend with other people’s bugs being inserted into the code stream, not just their own. Still, the community that has grown up around Flickr has made it a web platform. “You can build a variety of different applications using the Flickr API to provide photo aggregation in a variety of different ways at a variety of different web sites.”

Search

Search has been hot these last few years, not least because finding things on the web is one of its major raisons d’être. Amazon, Yahoo and Google each had announcements concerning their search products. Jeff Bezos announced a new feature for A9.com that lets other search engines syndicate their search services to the site. Called OpenSearch, the feature was added so users can eventually select among thousands of vertical search options and manage them through the search columns that appear on the right side of A9.com's interface. A9 developed an extension to RSS 2.0 to provide this functionality, and is making it available to other developers as an open source API. Bezos commented that “We wanted to do for search what RSS has done for content.” And as an RSS extension, OpenSearch promises to provide another set of tools for the enterprising developer.
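For developers wondering what "search results as an RSS extension" might look like in practice, here is a minimal Python sketch. The sample feed is made up for illustration (the example.com URLs and titles are placeholders), and the namespace and element names follow the published OpenSearch 1.0 RSS extension as best I understand it, so treat the details as assumptions rather than as A9's reference implementation.

```python
import xml.etree.ElementTree as ET

# Namespace used by the OpenSearch 1.0 RSS extension (assumed here).
OPENSEARCH_NS = "http://a9.com/-/spec/opensearchrss/1.0/"

# A made-up search response: ordinary RSS 2.0 plus three OpenSearch elements.
SAMPLE_FEED = """<rss version="2.0"
     xmlns:openSearch="http://a9.com/-/spec/opensearchrss/1.0/">
  <channel>
    <title>Example vertical search: etech</title>
    <link>http://search.example.com/?q=etech</link>
    <description>Made-up results for illustration</description>
    <openSearch:totalResults>2</openSearch:totalResults>
    <openSearch:startIndex>1</openSearch:startIndex>
    <openSearch:itemsPerPage>10</openSearch:itemsPerPage>
    <item>
      <title>First result</title>
      <link>http://www.example.com/one</link>
    </item>
    <item>
      <title>Second result</title>
      <link>http://www.example.com/two</link>
    </item>
  </channel>
</rss>"""


def parse_search_feed(xml_text):
    """Return (total_results, [(title, link), ...]) from an RSS search feed."""
    channel = ET.fromstring(xml_text).find("channel")
    ns = {"os": OPENSEARCH_NS}
    # The paging metadata lives in the extension namespace.
    total = int(channel.findtext("os:totalResults", default="0", namespaces=ns))
    # The results themselves are plain RSS items.
    items = [(item.findtext("title"), item.findtext("link"))
             for item in channel.findall("item")]
    return total, items
```

Because the results are ordinary RSS items, any existing feed reader can display them; only the paging metadata needs the extension.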

Yahoo announced that the contextual search technology behind Y!Q is available as a Web Service. Fundamentally, this means developers can provide some text as context in addition to their explicit query. This can be very useful in resolving ambiguous queries. For example, if you use Yahoo’s Web Search service to provide web search on your scuba diving web site you might provide a few words or sentences about scuba diving in addition to the user's query. If a user searches for "equipment" with this context as background, they'll find more relevant results than a plain web search for "equipment". Y!Q also works well on content heavy sites such as news or blogs. Article titles or lead paragraphs often work well as context for focusing queries beyond the explicit keywords.
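The scuba example can be sketched in a few lines of Python. This is not Yahoo's Web Service or its ranking algorithm; `rank_with_context`, the stopword list, and the weighting are hypothetical, and the "search" is a toy word-count over a handful of strings. It only shows how extra context terms can reorder results for an ambiguous query.

```python
import re
from collections import Counter

# Tiny stopword list so filler words in the context do not affect ranking.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower())
            if w not in STOPWORDS]


def rank_with_context(query, context, documents, context_weight=0.5):
    """Rank documents matching the query, boosted by context-term overlap."""
    query_terms = set(tokenize(query))
    context_terms = set(tokenize(context)) - query_terms
    scored = []
    for doc in documents:
        counts = Counter(tokenize(doc))
        query_score = sum(counts[t] for t in query_terms)
        if query_score == 0:
            continue  # the explicit query must still match
        context_score = sum(counts[t] for t in context_terms)
        scored.append((query_score + context_weight * context_score, doc))
    # Highest score first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]
```

With an empty context the function falls back to plain keyword matching, which is the "plain web search" case described above.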

Yahoo also announced a Tech Buzz Game for developers. The Tech Buzz Game is a fantasy prediction market for high-tech products, concepts, and trends. The players’ goal is to predict how popular various technologies will be in the future. Popularity or buzz is measured by Yahoo! Search frequency over time. Predictions are made by buying virtual stock in the products or technologies a developer believes will succeed, and selling stock in the technologies they think will flop. In other words, “you put your play money where your mouth is.” This “Game” has a serious purpose underlying it: it’s the theory of Yahoo researchers that the “price” of each “stock” represents the aggregated opinion on which alternative future will dominate searches. By inference, the results will reflect when people believe interest will shift from one product to the next. O’Reilly worked with Yahoo to help formulate the questions asked in the “game” in such a way that answering them could be predictive of where a given technology is going. The resultant “trends” therefore will be very real data that can be used by pundits and analysts to determine when adoption of, say, OS X Tiger, will switch from Panther.
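To see how stock "prices" could be read as aggregated opinion, consider this small Python sketch. It does not reproduce the Tech Buzz Game's actual pricing mechanism; the function names and the price figures are invented for illustration. It simply normalizes the prices of competing stocks into implied shares of future buzz, and finds the first week in which a challenger's price overtakes an incumbent's.

```python
def implied_shares(prices):
    """Convert competing stock prices into implied shares of future buzz."""
    total = sum(prices.values())
    if total <= 0:
        raise ValueError("prices must sum to a positive number")
    return {name: price / total for name, price in prices.items()}


def first_crossover(history, challenger, incumbent):
    """Return the index of the first period where challenger > incumbent.

    `history` is a list of {stock_name: price} dicts, one per period.
    Returns None if the challenger never overtakes the incumbent.
    """
    for period, prices in enumerate(history):
        if prices[challenger] > prices[incumbent]:
            return period
    return None
```

For example, with made-up prices of 30 for Tiger and 10 for Panther, `implied_shares` says the market expects Tiger to take three quarters of the searches between the two.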

Google demonstrated several new features. Google Suggest uses predictive text to autocomplete searches. The company uses DHTML to display the possible search iterations along with the number of hits each generates – no small programming feat. Google Personalization gives the user a slider which enables her to skew the results based on the degree to which she wants to map the results to the information contained in her own profile. So if you are searching on Baltimore Orioles and your profile indicates you are a rabid baseball fan, the results generated will be skewed toward the baseball team rather than the bird. The last feature, Google Sets, analyzes the results from a number of different users conducting similar searches to create a search set that can be used to refine future search results.
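The autocomplete idea, stripped of the browser side, comes down to prefix matching against known queries and their hit counts. The Python sketch below is only that core lookup, under the assumption of a small in-memory table; the `suggest` function and the sample numbers are hypothetical and say nothing about how Google actually implements the feature.

```python
def suggest(prefix, query_hits, limit=5):
    """Return up to `limit` (query, hits) pairs that start with `prefix`.

    `query_hits` maps known queries to their hit counts. Results are
    ordered by hit count (highest first), then alphabetically.
    """
    prefix = prefix.lower()
    matches = [(query, hits) for query, hits in query_hits.items()
               if query.lower().startswith(prefix)]
    matches.sort(key=lambda pair: (-pair[1], pair[0]))
    return matches[:limit]
```

A real service would need an index that can answer this on every keystroke, which is where the "no small programming feat" comes in.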

Wikipedia

The Wikipedia is an immense project, based on the wiki concept of a web site where people can share their ideas and writings freely, a labor of love, and a demonstration of what the open source software movement is all about. It’s a freely licensed encyclopedia written by thousands of volunteers in many languages, with nearly 1.5 million entries across 200 languages. It has 499,388 entries in English, 209,000 in German, and 106,000 in Japanese. It has 350,000+ categories with a hierarchical structure, and its entries are peer reviewed. It has been around since 2001 and can be found at www.wikipedia.org.

So, you ask, how can an encyclopedia written by thousands of amateurs around the world with no central editing authority possibly be accurate or useful? Well, not only can anyone write an entry, anyone can edit an entry already written. The Darwinian nature of this process ensures that eventually only the most accurate information is posted. Still skeptical? Wikipedia is more popular (based on page hits) than USA Today and PayPal, and will soon surpass the New York Times.

Wikipedia has spawned similar projects in other domains. For example, Wikicities.com extends the Wikipedia social model to communities. Over 170 Wikicity communities have been formed in the past 3 months. Organization in the community social model of Wikipedia is “kinda hard to explain,” according to Jimmy Wales, Wikipedia founder: “The free-form nature of wiki software lets the community determine how it wants to interact. The software doesn’t enforce the rules of social cooperation – that would be too rigid. An override is built-in.”

Wales believes that social innovation will spread to other areas of the web: “Software which enables collaboration is the future of the net.”

Clay Shirky: Voodoo Categorization

In recent years it has become fashionable among the XML set to talk about ontology, metadata, and categorization. Clay Shirky is one of the deeper thinkers at Etech, and hails from NYU’s Interactive Telecommunications Program. He delivered a talk at Etech that effectively says, “Regarding categorization - not so fast.”

“A lot of what we think we know about categorization is wrong, and by trying to adopt what we think we know to the web we are threatening to break things,” Shirky announced at the start of his talk. He gave an example: when the Library of Congress cataloged their collection, they optimized for the number of books they had on the shelf in each category, rather than defining a flexible system with room for future growth and new variations. “They confused the container for the things contained,” he noted, pointing to how they had a section on the “Soviet Union.” When that entity broke up they changed the categorization for books on Russia and all the newly independent states to “Former Soviet Union” because they didn’t have the staff to re-shelve all the books.

Shirky argued that this “kinda works” in the confined world of a library, but it totally breaks down when you try to apply it to the world of the web. “The constraint of having to re-shelve books doesn’t exist in the online world,” he noted. Early web sites like Yahoo attempted to replicate the ontology of the Library of Congress for on-line information, creating categories and then pigeon holing information and links into each. Instead, he argues, on the web it makes much more sense to start with links and build categories around them. “Great minds don’t think alike,” he observed.

Categorization needs to be dealt with in the context of time, recognizing that the meanings of words and phrases morph and change over time. “The difference between the temporal and a category is a smear,” he noted. “At any given time a given user did or did not center around a set of tags with a high degree of certainty.”

Shirky argued for an “organic” categorization methodology based on market logic – individual motivation and group value. This approach should start with URLs rather than categories, which creates overlap, not synchronicity, and should rest on probabilities rather than binary “this or that” category choices. In organic categorization, both the user and time are core attributes – you need to know when a given user created a given category to understand its true meaning and context. One-off categories – used once by a given user at a given time – should be ignored by the categorization system, rather than discouraged or deflected. “The semantics are in the users, not the system,” Shirky concluded.
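A rough Python sketch may help make “organic” categorization concrete. It is one reading of the idea, not something Shirky presented: the `Tagging` record, the `tag_distribution` function, and the `min_uses` threshold are all assumptions. Each record keeps the user and the time alongside the URL and tag, one-off tags are silently ignored, and the result for a URL is a probability distribution over tags rather than a single category.

```python
from collections import Counter
from typing import NamedTuple


class Tagging(NamedTuple):
    """One act of tagging: who applied which tag to which URL, and when."""
    user: str
    url: str
    tag: str
    year: int


def tag_distribution(taggings, url, min_uses=2):
    """Return {tag: probability} for `url`, ignoring one-off tags.

    A tag counts as one-off if it appears fewer than `min_uses` times
    across all taggings; such tags are skipped, not rejected.
    """
    overall = Counter(t.tag for t in taggings)
    counts = Counter(t.tag for t in taggings
                     if t.url == url and overall[t.tag] >= min_uses)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tag: n / total for tag, n in counts.items()}
```

Because every record carries its user and year, the same function could be run on a slice of the data (one user, or one time window) to see how the meaning of a tag drifts.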

Shirky admitted that discrete formal categorization projects make sense – for example, in a given organization installing a digital asset management system – but that these will become “islands of categorization that won’t connect up with each other.”

Larry Lessig

Larry Lessig, from Stanford Law School and famous around the world for his books and arguments in support of free speech and against the erosion of our “fair use” rights, began his talk with a parable from H.G. Wells. “A mountain climber stumbled into a village in the mountains where everyone was blind, and fell in love with the chieftain’s daughter. The villagers welcomed the relationship between the climber and the girl, but were concerned about his disabilities. They consulted the village doctor, who said a solution was simple, namely to remove the irritant body, namely his eyes.”

Lessig argues that in life and culture there is seldom anything that is truly new and not created by borrowing from many different sources, “cause this is always and forever how cultures have been made, knowledge, politics and corporations are created by remixing.”

Until digital networks came along, all remixing was free. “It needs to be free if we are to avoid infantilizing culture – ‘ordinary ways’ of remixing intellectual property need to be free,” Lessig noted.

In the past, text was able to be remixed. “It took hundreds of years to defend the right to write freely. For our culture, writing is allowed.” Literacy is the process of teaching how to remix other people’s text. What happens when the “ordinary ways” with which people express and re-express their culture change? Does their right to remix change as well?

With the advent of digital creativity, the tools have changed. We have bottom-up democracy, blog democracy, peer-to-peer sharing of intellectual property. “When the tools change, do the freedoms change as well?” Lessig asked. “Is the remix for our kids using their ‘ordinary ways’ using digital tools like Garageband, bittorrent, and the iPod to be as free for them as it was for us with typewriters, word processors, and text?”

Will “writing, remixing” be allowed for them? “The answer is No.”

Now you need permission. New forms of expression are illegal. “We can’t teach them how to use these now ‘ordinary ways’ because that would be teaching piracy,” Lessig noted. Existing law conflicts with these new technologies – “We have to reform either the law or the technology, and unfortunately, Congress and the Courts appear to prefer changing the technology rather than the law.”

“In other words, they want us to ‘remove the irritant body,’ namely these machines. We are being told we must conform to 18th-century laws of property, thereby eliminating 150 years of progress we’ve made in establishing the right of fair use of intellectual property,” Lessig exclaimed.

Instead, we have to get them to reform the law to make remix free again. We need to be free to remix, to tinker, but not free to take and distribute other people’s intellectual property without permission. “If you create a world where people need to assert freedom, then freedom doesn’t exist unless it is asserted,” Lessig concluded.

This is but a small sampling of the great talks and innovative technologies presented at this year’s Etech conference. I encourage you to follow the links and check out the blogs to explore the conversations in more depth. I’m already looking forward to next year’s event.

-----------------------------------------------------------------------------------------------

Lessig presented a six-point plan for creating an environment where the law could be reformed rather than the technology:

Find a way to connect to the other side. We need to call piracy “piracy.” Piracy, not remixing, is wrong. We need to defend the right to remix digital content as we have with text.

We need to teach the other side how extraordinarily powerful these technologies are for our kids.

We should demand changing the laws, such as the DMCA etc. to allow digital re-mixing.

We need to change how intellectual property is treated. We need to update fair use to allow remixing with the new technology.

Finally we need to punish them for trying to take away fair use. Form a PAC (an iPac, if you will). We need to establish clear messages about what freedom is.

Support Creative Commons. Creative Commons offers a flexible range of options and protections for authors and artists using an open source model for licensing. www.creativecommons.org

2:25:17 PM