TBL gave a good presentation on the Semantic Web. A conversation follows.
Who controls vocabulary? (Esther)
TBL: there will be a lot of issues. that makes it more likely to be adopted in the enterprise. but it also should affect personal data (TBL says he does his taxes in an RDF-compatible format). public data is going to be data that is already public. lots of social questions on how we use/buy/trade public data. when it comes to licensing data, first it will be important that we provide 'pre-canned' meta-data for people. we need new tools, for example, tools to track when and what is published, and prompt use of meta-data. for example, adobe wraps file formats with RDF meta-data. what's going to really stop it? patents will try and get in its way --- people are trying to say they own these kinds of models.
ED: the ram-verse lawsuit?? this is a patent lawsuit. are you aware of this?
TBL: this is one of the few cases where if you bend a standard your way in a sneaky fashion it will be your own undoing.
ED: we really need to solve this. this could set a bad precedent.
TBL: the danger...it is tricky...one of the things that's necessary is to create some rules and understanding that this is a royalty free spec. so now the W3C is creating a patent policy. how can we get people to put their cards out, formal disclosure, to participate in the standards process? we're trying to make it much more clear. for patents that will help, there's an expectation that you will declare them. and if you change your mind, there's a process to handle this. it's been very difficult to work through this.
ED: another area where the early semantic web has lots of traction is web services. lots of action is here. it's in small lumps, concentrations of effective attempts at semantic web.
TBL: basic stuff will be common and royalty free. but vocabularies will be very industry specific. and you're concerned that these formats might not be royalty free.
ED: in a situation where some vendor owns an industry, there's concern they could dictate a standard and charge for it.
TBL: a classic thing will be to see compatibility data. for example data on an airplane. classic industrial components. when you launch a search it will automatically find the right documents based on these semantics.
ED: and there will be a system where you say a certain format is not trusted.
TBL: yes, we'll do this. trust is right at the top of the pyramid. you can build some fun things. you will get trust from a google search, for example. semantic data is necessary for ensuring trust in negotiating contracts via computers. that sort of thing could change the way we do business. some people resist putting their data and specs on the web because they don't want to disclose data.
ED: people don't like transparency but don't want invisibility.
QUESTION: I read an interesting piece, a guy suggested that the role of intermediaries will be reduced. Google will overtake eBay because they can overtake search and data on any website, blow away any intermediary. how does this notion of the semantic web fit into the question of where are there intermediaries?
TBL: all kinds of things people will do... some will go away, some will appear. some lists are made by hand, for example. that won't go away. that editorial value will be sellable. human created meta-data will be valuable and shared. (gives an example from sun microsystems). there will be mechanical things that will be difficult. imagine you can write transformation/filtering rules. suppose 100,000s of these rules, manipulating all this semantic data. put all those rules on your laptop. people will assemble questions into collections of rules. that sort of index takes time to build. when we get semantic web, tons of new algorithms will emerge. concept of an intermediary and an agent will persist.
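The transformation/filtering rules TBL describes can be pictured with a minimal sketch. Everything here is an assumption made for illustration: the notes do not say how rules would be written, and the triple layout, the predicate names ("price", "listPrice"), and the function names are all hypothetical.

```python
from typing import Callable, Iterable, List, Tuple

# A statement is a (subject, predicate, object) triple. Hypothetical layout.
Triple = Tuple[str, str, str]
# A rule maps one triple to zero or more output triples.
Rule = Callable[[Triple], Iterable[Triple]]


def price_filter(triple: Triple) -> List[Triple]:
    """Filtering rule: keep only statements whose predicate is 'price'."""
    _, predicate, _ = triple
    return [triple] if predicate == "price" else []


def rename_price(triple: Triple) -> List[Triple]:
    """Transformation rule: rewrite 'price' into the 'listPrice' vocabulary."""
    subject, predicate, obj = triple
    if predicate == "price":
        return [(subject, "listPrice", obj)]
    return [triple]


def run_rules(triples: Iterable[Triple], rules: Iterable[Rule]) -> List[Triple]:
    """Apply each rule in order to every triple produced by the previous rule."""
    current = list(triples)
    for rule in rules:
        current = [out for t in current for out in rule(t)]
    return current


# Made-up sample data.
DATA = [
    ("widget-1", "price", "10"),
    ("widget-1", "color", "red"),
    ("widget-2", "price", "12"),
]

RESULT = run_rules(DATA, [price_filter, rename_price])
```

Chaining small rules like this is one way a collection of rules could be assembled into a larger question; a real rule engine would also need to match across several statements at once, which this sketch does not attempt.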
ED: if everything becomes efficient, will everything become centralized? will diversity reign?
TBL: there are people on both sides. people worried about the web being owned by one content channel. the world needs little cults. and it needs some global understanding. we need everything in between. the world is a big fractal mass. people are their own fractal mass. people want small and big worlds, to participate in both. there will be all sorts of data systems, some big vocabularies and global formats, and some middle scale formats but still large in scope, and lots of small scale things. and the web will support that. we need some global standards. but if it becomes all global, that would be awful.
QUESTION: big supporter of the semantic web. but skeptical about RDF. remember the first apps created with it. my question is, who is going to create all the RDF? how many of us have created an RDF file? is the rareness of RDF in the english vocabulary a sheer coincidence? is there an RDF light?
TBL: i don't expect you to write RDF. it uses XML, a tree structure, to write a graph. an attempt was made to make the syntax work like XML. the result is a spec that is hard to read, and RDF is verbose. the RDF that i play with, it's clean...applications create RDF. i don't expect people to write it. lots of legacy format conversion to RDF. yes, i created a lightweight RDF once. you can use it if you like.
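To make the "graph written in a tree syntax" point concrete, here is a minimal sketch of the underlying model: a graph is just a set of (subject, predicate, object) statements, and an application can emit them one per line. The notes do not name the lightweight syntax TBL mentions; the line format below resembles N-Triples and is used only for illustration. The example.org URIs and the function names are placeholders, not anything from the talk.

```python
from collections import defaultdict

# Hypothetical statements. Objects wrapped in double quotes are literals,
# everything else is treated as a URI.
TRIPLES = [
    ("http://example.org/tim", "http://example.org/knows", "http://example.org/esther"),
    ("http://example.org/tim", "http://example.org/name", '"Tim"'),
]


def to_lines(triples):
    """Serialize each statement as one '<s> <p> o .' line."""
    lines = []
    for subject, predicate, obj in triples:
        rendered = obj if obj.startswith('"') else f"<{obj}>"
        lines.append(f"<{subject}> <{predicate}> {rendered} .")
    return lines


def to_graph(triples):
    """Group statements by subject: node -> list of (edge label, target)."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return dict(graph)


for line in to_lines(TRIPLES):
    print(line)
```

The point of the sketch is that the program, not a person, produces the statements; the serialization is a detail the application handles.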
QUESTION (same as above): it's hard to ask an average end-user to put so much meta-data around their content up-front?
TBL: wait a minute. the reason RDF was created was meta-data. who actually creates meta-data? no one does it really. remember keywords in microsoft word. the semantic web doesn't rely on it. it's the actual data that matters. your addresses, your calendar. there will be interesting applications...<pause>...we'll move from apps in a box to apps that intermingle...people won't write RDF. semantic web ISN'T about meta-data.
ED: the apps will create the RDF.
TBL: business model for semantic web is the biz model of the web. it's how apps interoperate, it's how apps talk. short answer: dramatically reduce cost of enterprise app integration.
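A minimal sketch of what "apps that intermingle" could look like, using the address and calendar data mentioned above. This is an assumption about the idea, not something shown in the talk: two apps export statements in a shared shape, and a question is answered across both once they are merged. All identifiers, predicates, and function names are hypothetical.

```python
# Hypothetical exports from two separate apps, as (subject, predicate, object).
ADDRESS_BOOK = {
    ("person:esther", "name", "Esther"),
    ("person:esther", "email", "esther@example.org"),
}
CALENDAR = {
    ("event:42", "title", "Semantic Web panel"),
    ("event:42", "attendee", "person:esther"),
}


def merge(*sources):
    """Merging is a set union because both apps share one statement shape."""
    merged = set()
    for source in sources:
        merged |= source
    return merged


def attendee_emails(triples, event):
    """Answer a question neither app could answer alone."""
    attendees = {o for s, p, o in triples if s == event and p == "attendee"}
    return sorted(o for s, p, o in triples if s in attendees and p == "email")


MERGED = merge(ADDRESS_BOOK, CALENDAR)
EMAILS = attendee_emails(MERGED, "event:42")
```

The integration cost here is only agreeing on identifiers such as `person:esther`; no app-to-app adapter is written, which is one reading of the cost reduction TBL claims.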
(My side conversation with Adam Bosworth, BEA chief architect and ex-Microsoft; Adam helped shape many of the XML standards. We both agree that this RDF thing is a big joke and TBL is on another planet. Adam helped drive the creation of XML Schema and XML Namespaces, as well as Web Services standards that use these, and these are the things that are actually driving the semantic web. Virtually no one uses RDF, but nearly everyone is moving to these other standards).
QUESTION: you've had this great vision for the web. the reality is so different. internet explorer is the only thing, and now people are customizing to a proprietary web. are you concerned? (my note: Flash is also another quasi-standard that a lot of people are building to).
TBL: I'm an optimist. the web is in constant tension with monopoly. there have been various times when people were worried about one or another company. when it comes to the semantic web --- it's a paradigm shift --- there will be ways people can try and make it proprietary, especially in other layers above it. people will try and they will fail. open source is always looking after it, and trying to keep things fair.
ED: I'm worried about people controlling my choices, and enabling people to switch platforms.
QUESTION: 20 years ago researchers were working on similar technical concepts. what has changed in 20 years that enables this? i have three ideas: 1) existence of the WWW as a platform; 2) existence of W3C to support this; 3) willingness in corporate world to annotate and share data.
TBL: those are all important. it's the web philosophy and architecture (distributed) that enables this. you could've said the same about hypertext systems 10 years ago. they were interesting but never took off. semantic web is more economical because ontologies can be shared. seen as part of the web it's extremely useful.
QUESTION: i appreciate the value of tagging data. concern is the messiness and problems of scale. when you try and scale to aggregate a lot of messy data. can this scale?
TBL: ambiguity is not a semantic web thing. isn't it really difficult to get everyone to agree on the same meta-data? yes, it is. the solution has to be more flexible and malleable. we need extensibility. that's what enables it to scale...