Friday, September 17, 2004
Some interviews over at codegeneration.net show that there's a new buzzword that's gaining momentum - language oriented programming.
Here's my take on it: programming languages, up to now, have largely been thought of as general purpose. People use these languages to create libraries and macros, and then they have to write lots of documentation on how to use those libraries.
What's really going on is that the libraries are themselves extensions of the underlying languages.
A bunch of people are realizing that if there were a formal way to define a language (using some language, of course), then it would be possible for people to define their language extensions using this same language - rather than using a different technique like Javadoc. In addition, if there were such a thing as a formal definition of the language, then one could auto-generate editors for creating documents in those languages.
There are actually a ton of people working on this. Here's my quick list:
1) Microsoft - Jack Greenfield and Keith Short. They call them DSLs, or domain specific languages. You can read about it at this link.
2) JetBrains - Sergey Dmitriev is building something called a "meta-programming system", which is a system that lets you define a new language and run a program that generates various things like editors.
3) Intentional Software - Charles Simonyi. Same kind of thing, with a focus on transformation of documents from one language representation to another using a "reduction compiler".
4) Xactium - Andy Evans. Provides a tool that lets you define a language as an extension/combination of other languages.
5) IBM - really the Eclipse project on EMF - driven by Ed Merks and crew. EMF ECore lets you define a "language" of sorts, and then the Edit and Editor plugins will generate an editor and custom command stack.
So - that's a lot.
I've thought about the work on language oriented programming, and our parametric modeling technology in The Factory. The two are complementary. Parametric modeling assumes that you start with a well defined "domain" or target, and then you build a set of builders for that domain. These builders then enable users to compose "fabrication engines" - we call them models - that produce artifacts in these domain languages. If you change the set of input parameters to a model and regenerate, you get a different instance of the domain specific output object. This produces a kind of runtime automation that enables one to automatically morph the structure of a domain specific object without human intervention. And this enables the automatic customization of code for syndicating apps out to partner sites - which is what the Factory is used for.
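Just to make the builder/model idea concrete, here's a tiny sketch - this is not the actual Factory API; every name in it is invented - of builders being composed into a model that regenerates a different domain-specific artifact when you change the input parameters:

```python
# Hypothetical sketch (not the actual Factory API): builders compose a model,
# and regenerating with different parameters yields a different output artifact.

class Builder:
    """A builder contributes one piece of domain-specific output."""
    def __init__(self, build_fn):
        self.build_fn = build_fn

    def build(self, params, artifact):
        return self.build_fn(params, artifact)

class Model:
    """A 'fabrication engine': an ordered list of builders applied to parameters."""
    def __init__(self, builders):
        self.builders = builders

    def regenerate(self, params):
        artifact = []                       # the domain-specific output under construction
        for builder in self.builders:
            artifact = builder.build(params, artifact)
        return "\n".join(artifact)

# Two toy builders targeting an imaginary "portal page" domain
header = Builder(lambda p, a: a + [f"<h1>{p['partner']} Travel Portal</h1>"])
search = Builder(lambda p, a: a + [f"<form action='{p['search_url']}'>...</form>"])

model = Model([header, search])

# Change the input parameters and regenerate: a different, customized instance comes out.
print(model.regenerate({"partner": "Acme", "search_url": "/acme/flights"}))
print(model.regenerate({"partner": "Globex", "search_url": "/globex/hotels"}))
```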
An interesting exercise would be to ultimately provide a tool that would auto-generate collections of builders - based on reading one or more language definition objects.
2:04:53 PM
Thursday, September 09, 2004
It used to be that people would visit their favorite web sites using a browser, and they would hop from site to site to do everything from research to e-commerce to news gathering. Now, however, there’s a new usage pattern emerging: people aren’t visiting sites as much any more. Instead, they’re letting a new breed of services and applications bring the content of sites to them. It’s this inversion, or “turning inside out” of the web that I’m talking about.
This new usage pattern on the web is signaling a transition from the old model of “I'll come to you” to “you come to me”. This is why I use the term “inside out” - it's as if people are using the web in a way that nobody intended a few years ago – like wearing your sweatshirt inside out. Didn't Madonna do that already?
Visiting a bunch of web sites has become too labor intensive for many web users. So, an innovative new kind of site has started to crop up that does the “legwork” of aggregating info from groups of disparate destination sites – like hotels. Take hotels.com as an example: here's a site that eliminates the need to visit dozens of hotel destination sites and enter the same information over and over.
“Middle man” sites like hotels.com have introduced a huge challenge to the original destination site providers of hotels, airlines, and product manufacturers, because those providers now have to suddenly find a new way to “export” or syndicate portions of their own sites into the “middle man” sites, or risk being cut off at the head. This phenomenon is called “disintermediation”, and you see it when you go to a site like cheaptickets.com and buy an airfare based totally on price and other mundane factors, but not on the brand name of the airline, or the snazziness of the airline's site.
Middle man sites have started a trend of bringing services to the user, rather than forcing the user to hunt around for a set of comparable services, and this trend is just the beginning of a much bigger trend that’s literally turning the web inside out.
Now consider the Web log, or “blog”. Blogs are basically online diaries, updated frequently by individuals, and read by groups of friends and colleagues. Blogs, too, have taken on a life of their own in the last two years, and have become micro destination sites for individuals interested in particular topics. Now, more than ever, people are visiting blogs as a way of getting aggregated information on specific subjects they're intimately interested in. What's more, companies have been creating blogs for their key executives as a way to extend their corporate reach. There may even be more people reading Jonathan Schwartz's (Sun) blog than people visiting Sun's web site itself (who knows, if not now then maybe a year from now)!
Given the explosive growth of blogs, imagine if blog owners started putting ads for services in their blogs. Blogs would then become a new level of intermediate web site that would totally eclipse, or disintermediate, the destination sites of today. The wild thing is that there are a staggering number of blogs – and that number is growing fast.
But another emerging technology has come along too - the aggregator. Aggregators have started to perform the tasks of finding, filtering, sorting, and formatting the information from a whole bunch of blogs. Aggregators are like personal portals – ultra personal portals, that is. Think of everyone having an aggregator as being like everyone having a desktop with a browser on it. There are going to be potentially billions of aggregator apps out there – a concept that gets Microsoft’s attention.
In addition, there are aggregator services popping up on the web. Some examples are del.icio.us, feedster, and technorati. According to del.icio.us’ own description, “del.icio.us is a social bookmarks manager. It allows you to easily add sites you like to your personal collection of links, to categorize those sites with keywords, and to share your collection not only between your own browsers and machines, but also with others”.
So, what does this all mean? One way to look at it is that the web started out being a powerful “thing” (think noun) to make information available in a connected worldwide way. Now, however, the web is becoming a powerful “service” (think verb) that gets the right information to the right people at the right time. People don’t look for information on the web. Rather, they let the web find the right information for them. Just look at the explosive rise of the Google service as an example.
Now, let your imagination do a fast forward. Imagine a totally “inside-out web” where there are no longer “user visited”, “UI type” web sites – but instead, billions of aggregators, and billions of information feeds. In this world, every user has his or her own powerful aggregator that serves as a gateway, or portal, or presentation layer to the right information gathered from the web at the right time. Aggregators can also aggregate from other aggregators – creating a fabric of n-tier information transforms. The aggregator does everything to provide each user with his or her own personal, filtered, prioritized, and organized view of the world - a view that's updated in real time. Think of it as the portal concept taken to the absolute extreme. What's different in this world is that the number of aggregators or users (i.e. “sinks”) far dominates the number of originating service providers (i.e. “sources”) that we know of today - by potentially orders of magnitude.
In this world, if you’re a service provider, you have the huge new challenge of finding a way to get your service plugged into and visible in the billions of aggregators/personal blog/portal sites that will be out there. But how do you do this? You need to think about ways of making your currently “stationary” web-based services become “portable”, and then sending them out into the world to get syndicated into millions of other aggregation layers.
Validation of this phenomenon is demonstrated by what Google is doing right now. Google's search service is arguably the “mother of all” aggregators today – a 100,000-server “HAL” in the sky. Imagine the world where Google adds to this “virtual aggregator” a desktop aggregator portion that becomes the portal and presentation layer for whatever a user wants to see in the whole world – tied into the Google server farm in the sky. Move over Microsoft… As Google sees it, the value of the web is not rooted in the connectivity that the web provides, but rather, it's rooted in the service of prioritizing potential connections that can be made on the web, based on supply and demand. Google is the ultimate matchmaker, willing to introduce user X to service Y based on the fact that one or both of these parties is willing to pay a fee to get the best match.
It’s no wonder that Google has moved into the email and blogger space, and has launched an initiative to bring on veteran “desktop software developers” like Adam Bosworth, to build out the “consumer portion” of this mother of all aggregators. That’s my opinion.
In summary, this phenomenon of “turning the web inside out” is forcing service providers to find new ways of syndicating and embedding their services in the form of micro-apps into the millions of other intermediate aggregation layers that are cropping up on the web. Add to this the fact that mass syndication leads to mass customization, and it’s like fast food. If you provide a service to millions of people, you had better make it “Have it Your Way” to quote Burger King’s slogan, and that’s where Bowstreet comes into the picture.
Bowstreet provides the product (The Factory) that enables service providers to mass customize and syndicate their web-based services into thousands to, who knows, millions of partner sites. If the total inversion of the web comes true, then we may be talking about billions.
3:05:06 PM
Thursday, July 08, 2004
A lot of people are talking about RSS as being the next big thing on the web. Why? On the surface, the answer seems to be that RSS lets users build their own views of the web, using aggregators, rather than having to visit many sites manually, and doing the aggregation manually. I think there's more to it than this.
RSS aggregators are basically just filtering and sorting programs. Google is an example of a filtering and sorting program too. Most RSS aggregators do filtering and sorting based upon the content of the RSS feeds to which they subscribe. Some go further, however - like feedster.com - and use another form of data, or metadata, which captures changes to RSS feeds. Feedster uses weblogs.com to find out when RSS feeds change, and this helps it index more efficiently, as well as provide another dimension of filtering based upon change.
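To make the "filtering and sorting program" point concrete, here's a toy sketch of an aggregator's core loop - the feed URLs and the change data are made up - that filters by keyword, sorts by date, and can use change metadata (the weblogs.com-style ping idea) to skip feeds that haven't been updated:

```python
# A minimal sketch of an aggregator's core loop: filter items by keyword, sort by date,
# and optionally skip feeds that a change-notification service says haven't changed.
from datetime import datetime

feeds = {
    "http://example.com/feed-a.xml": [
        {"title": "Delta Web, part 1", "date": datetime(2004, 3, 25)},
        {"title": "Vacation photos",   "date": datetime(2004, 3, 20)},
    ],
    "http://example.com/feed-b.xml": [
        {"title": "Delta Web schema ideas", "date": datetime(2004, 3, 26)},
    ],
}

# Pretend this came from a ping service like weblogs.com: which feeds changed recently.
changed_since_yesterday = {"http://example.com/feed-b.xml"}

def aggregate(keyword, only_changed=False):
    items = []
    for url, entries in feeds.items():
        if only_changed and url not in changed_since_yesterday:
            continue                      # change metadata lets us skip unchanged feeds
        items += [e for e in entries if keyword.lower() in e["title"].lower()]
    return sorted(items, key=lambda e: e["date"], reverse=True)

print(aggregate("delta web"))                     # filter + sort over everything
print(aggregate("delta web", only_changed=True))  # use change metadata to narrow the work
```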
I think that weblogs.com is a primitive form of what I've been calling the delta web - the web that describes ongoing changes to the web. This is what's really beyond RSS as we know it today. The RSS of tomorrow will include a formalization of how we characterize changes to things.
I've noticed some cool things going on in the world of "change related" software. Bram Cohen - who wrote BitTorrent - is now working on Codeville, which is all about channeling and integrating change information among P2P nodes.
10:33:14 AM
Monday, March 29, 2004
Interesting - I did only two blog posts on the concept of the "Delta Web" last week, and then went to Feedster and Google today to do a search on the words "Delta Web". Just curious to see what would happen. I was surprised. On Feedster, my blog entries came up first and second!!! On Google, a whole bunch of stuff on Delta airlines popped up.
This tells me a couple of things: 1) Feedster is fast, and 2) Feedster is relevant. Of course, Feedster is limited to RSS feeds.
Here's the Feedster link that shows the search results of "Delta Web".
12:05:49 PM
Friday, March 26, 2004
I was looking at Jeremy Allaire's proposal for RSS-Data last night, and it made me think of a few things about RSS-Data, and the Delta Web idea. First, I'd like to think that if people want to put foreign payloads into RSS feeds, then XML namespaces should be enough. You don't need to force people to use a specific markup - just because there are tools out there that can read this markup. The fact is, at some point an aggregator has to understand the domain specific meaning of a block of data, so the data may as well be encoded in the domain specific format associated with its domain namespace. Plus, isn't the whole idea behind a namespace to enable a set of XML tags to have a single, universal meaning (rather than a meaning relative to a document) - by way of the uniqueness of the namespace? I thought so... So, if someone wants to add data to an RSS feed that's outside the scope of RSS, then just use a namespace and be done.
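Here's a rough sketch of what I mean - the calendar namespace and element names are made up for illustration - showing a foreign payload riding inside an ordinary RSS item under its own namespace:

```python
# A sketch of the namespace point: a foreign, domain-specific payload rides inside an
# ordinary RSS item under its own namespace, and a consumer that understands that
# namespace pulls it out directly. The "example.org/calendar" namespace is invented.
import xml.etree.ElementTree as ET

item_xml = """
<item xmlns:cal="http://example.org/calendar">
  <title>Team meeting added</title>
  <cal:event>
    <cal:start>2004-04-01T10:00:00</cal:start>
    <cal:location>Portsmouth</cal:location>
  </cal:event>
</item>
"""

item = ET.fromstring(item_xml)
ns = {"cal": "http://example.org/calendar"}

# An aggregator that knows this namespace can read the payload in its native format;
# one that doesn't simply ignores the cal:* elements.
event = item.find("cal:event", ns)
if event is not None:
    print(event.findtext("cal:start", namespaces=ns))
    print(event.findtext("cal:location", namespaces=ns))
```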
But this got me to thinking about the Delta Web idea. In the Delta Web, I proposed using two references to identify a change. One was the reference to the context of the change (i.e. an XML document - for example), and the other one was a reference to the changed entity (i.e. an XPath reference to the changed stuff in the doc - for example). This is all well and good, except that if these are references, and there's no "copy" of the changed content, then it might be hard to find out what the content is if the original content has since changed. That's a mouthful I know.
Here's what I mean. If I add an <event> element to my <calendar> XML document, and I create a Delta Web doc that identifies both the contextual ref and the relative ref - that's great for right now. This is because an interested party (like my aggregator) could go to the original doc and find the reference to what changed, and read it - if it wanted to. But, now let's say that two minutes after I add the <event>, I decide to rip it out and replace it with a different <event>. And I'm a good boy and I update my Delta Web doc with two new entries - a delete and an add - and they exist right after the first one. Note - these three entries will all have different dates - so they're unique in the delta web sense.
However, if my aggregator reads the delta web doc and goes to the web doc trying to see "what" changed, it won't be able to find the first "added" entity. This is because the second change obliterated it. Oops. Well, wait a minute. Maybe the job of the Delta Web is not to capture the content of the change, but just the metadata about it. That's the way I'm leaning. Maybe an extension of the Delta Web schema could have tags that capture copies of the data that changes at each point. But why load up the Delta Web with all that change data if it's not even the job of the delta web to preserve the content of the change?
Of course, it would be easy enough to have an optional element in the Delta Web schema that would reference the changed content - or actually be a locally embedded copy of it. That's easy. I would just want to make it optional. Also, I would want to be able to have my delta web doc define what changed - rather than necessarily be a complete re-buildable representation of how it changed.
This kind of ties back into my thinking of the Delta Web as being the first derivative of the web with respect to some parameter(s) like time. The derivative function is not supposed to be something that enables rebuilding of the original function. To do that in math you need to integrate the derivative function with respect to some boundary conditions. Maybe the concept of "boundary conditions" is analogous to the concept of the data snapshot copies that could be optionally included or referenced by a Delta Web entry. I don't know.
Next, I think we need a trial schema for the delta web.
2:07:42 PM
Thursday, March 25, 2004
The web is a giant distributed state machine. Sometimes, however, I think of the web in kind of a weird way, as just another program running on my pc. Following this simplistic thought process, I can use the web by entering a string, hitting Enter, and receiving back a page of data. In other words – it’s just a simple “function” that performs a task similar to that of a clerk at the post office. Of course, this way of looking at the web makes it seem trivial, but at the same time, you have to admit, it’s kind of accurate – because that’s the way the web actually looks from your desktop.
So, if the web is just a big “function”, then what’s so interesting about it? Well, one thing jumps to mind. Each time you issue a parameterized request, you have the ability to cause the state of the web function to change internally. For example, 6 months ago, I issued a request to the web that caused it to change its state, indicating that I now had 4 tickets on Delta airlines. Of course, I was buying plane tickets via the web, and I had just submitted the final form that contained the “OK” information to commit the transaction. But looking at this from a super simplistic point of view, I entered a string of characters into the web “function” and hit Enter. Back came a pile of data, and of course, the change in state that indicated that I now owned 4 tickets.
The reason I’m making the web out to look like a simple “function” is because it helps me think about things we could do to make this function more useful.
Enter the idea of the “delta web”. Think of the delta web as being a kind of “function” as well. The delta web’s function is to quantify state changes in the web function. What does this mean? Well, let’s set the stage as follows, and bear with me. Let’s call the web function F. F takes a parameter called URL, which we’ll call u, and it returns data called d. So, when I issue a request to the web, via my browser, what I’m really doing is issuing a call to F(u), and F(u) returns d, so we have d = F(u). That might not sound so good, but what the heck.
Next, because we said that the web has this interesting characteristic where the state can change over time through regular use, it means that the time parameter “t” has to somehow figure into the function F as well. In other words, F is not just a function of u, which would make F stateless. Since F(u) can be called, and return data d, and possibly change its state internally, and then since F(u) can be called again in the future and return a different d – namely d’, because the state changed, we have to add t to the function F. This is how we do it. The web function becomes: d = F(u, t). This way, d = F(u, t) can produce one result, whereas d’ = F(u, t + i) can produce a different result, simply by varying time. The t + i part just means that t is a different time from t + i.
So, the delta web ends up being a function that represents the first derivative of the web function with respect to one of the variable input parameters: u or t. Written in pseudo math terms, the delta web is essentially dF/dt or dF/du, where F(u, t) is the function representing the web. For now, I’m interested in the rate of change of the web with respect to the parameter time – and hence dF/dt.
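Restated in slightly more conventional notation - nothing new here, just the same relationships written out:

```latex
d = F(u, t), \qquad d' = F(u,\ t + i), \qquad
\text{delta web} \;\approx\; \frac{\partial F}{\partial t} \quad\text{or}\quad \frac{\partial F}{\partial u}
```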
OK – where’s all this going? Imagine a function that could track change to the web as a function of time, and put this to use in a positive way. This would be useful. Especially if you could look forward in time and examine units of change in the web, relative to units of change in time. This would be like looking at the elevation course map for a bike race and realizing that you’re about to come up onto a monster hill in two miles. Using this map, you could anticipate the change in course pitch, and prepare by taking a deep breath and concentrating.
Now, I'm going to really change gears and put this whole conversation into a business context. Let's say that an airline has a flight booking service on the web that lets people buy tickets online. Let's say that in two weeks, the airline plans to cut the price of its flights from Boston to LA from $1200 to $200. The web function in this case is “find a flight”, and it is characterized by a unique URL that feeds the web function. In this case, the URL contains information defining the origination city, destination city, and departure date. The return of the function is a set of flight numbers, prices, and availability.
The delta web function for this case would be characterized as “changes to find a flight”. The delta web function is driven by the same kind of URL parameter as in the “find a flight” web function above, in addition to a time parameter that defines the point in time at which you'd like to measure the change to the web function. This means that if you gave the delta web function a URL, along with a time t – past, present, or future – then the delta web function would tell you about changes to the web function at any of those points in time. Using the delta web function, a business that utilizes the web function could adjust the way it uses the web function for points in the future. For example, a syndication partner that provides embedded access to the “find a flight” service from its own site might decide to change the way it handles the return data from the URL, so that it either passes on the price drop to the partner's end users, or pads the price back up in order to keep a cut of the price drop for itself.
The delta web function could also be used to track the simpler kinds of changes that RSS is used to track in weblogs. The delta web function would have a general purpose syntax for defining the context of a change, the characteristic of the change, and the point in time of the change. For example, if you added an entry to your weblog today, and wanted to update the delta web at the same time, you'd create a unit of data that combined the context of the change (i.e. the URL to the blog), the characteristic of the change to the blog (i.e. the relative URL to the new entry, and the type of operation - namely “add”), and the point in time of the change. With this new piece of data, you'd update the delta web so that one could issue a query to the delta web in the form of a URL and date, and get back the change data describing the change relative to that URL and point in time.
Since the web function F is a giant function that lets you specify many billions of sub-functions in the form of URLs, the delta web could simply be implemented as just a set of sub-functions within the giant, already existing function F. In other words, people could create their own delta web sites that would be hosted by the web. You don't need a separate web for dF/dt that takes URL and time t. Instead, you could have web sites that would allow input parameters in the form of URL and time t, and they'd produce delta web formatted markup documents.
The next thing to consider is the markup for delta web data. A simple approach would be to create an XML schema for delta web data. This could then ride inside payloads such as SOAP, RSS, whatever.
Moving beyond the issue of syntax for delta web data, what’s really cool is imagining new kinds of software that would consume and produce delta web data. Let’s take the case of web based calendars. Say that I decide to post a set of URLs on the web, whereby each one returns an XML-based event schedule for a specific month. Let’s also say that I provide information that tracks changes to my scheduled events, by providing a delta web document that identifies changes at different points in time.
Given this delta web data, one could write an aggregator that would track changes to calendars, and use this “delta” information to either update someone else’s calendar, or produce a new “higher level” delta web document. Here’s where it gets interesting. An aggregator of the delta web would be able to do things like provide its own summarization delta web documents for other delta web aggregators. An example would be a delta web aggregator that would look at all the delta web docs for a set of employee calendars, update a group level calendar, and build a coarser grained delta web document representing changes to the group’s calendar. If all the employees became busy at various times next week, the group level delta web aggregator could update “blackout periods” on the group level calendar, and update the delta web doc for the group calendar.
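To make that group-calendar scenario a bit more concrete, here's a toy sketch - every URL, XPath, and time slot in it is invented - of fine-grained employee delta entries getting rolled up into a single, coarser-grained group-level delta entry:

```python
# A toy sketch of the group-calendar idea: read (made-up) delta web entries for several
# employee calendars, mark the group calendar busy at those times, and emit a coarser
# delta web entry describing the change to the group calendar.
from datetime import date

employee_deltas = [
    {"context": "http://example.com/alice/calendar.xml", "boundary": "/calendar/event[7]",
     "op": "add", "when": date(2004, 3, 29), "slot": "Mon 10:00-11:00"},
    {"context": "http://example.com/bob/calendar.xml",   "boundary": "/calendar/event[3]",
     "op": "add", "when": date(2004, 3, 29), "slot": "Mon 10:00-11:00"},
]

group_calendar_blackouts = set()
group_deltas = []

# Roll the fine-grained employee changes up into blackout periods on the group calendar.
for delta in employee_deltas:
    if delta["op"] == "add":
        group_calendar_blackouts.add(delta["slot"])

# One coarser-grained delta entry for the group calendar, built from the fine-grained ones.
group_deltas.append({
    "context": "http://example.com/group/calendar.xml",
    "boundary": "/calendar/blackout[last()]",
    "op": "add",
    "when": date(2004, 3, 29),
})

print(group_calendar_blackouts)   # {'Mon 10:00-11:00'}
print(group_deltas)
```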
Another freaky thing you could imagine is delta web docs that track changes to other delta web docs. These higher order delta web docs would be representative of “second derivatives” of the web function. You could also imagine delta webs that measure perturbations – or changes to the web function – as a function of the web’s URL parameter – rather than time. A delta web could measure the change in the web’s response function when a URL’s parameters were changed.
One of the next steps is to define a simple markup for the delta web. As I see it, there are 4 key elements that define a change: 1) the pattern that defines the context of the change, 2) the pattern that defines the boundary of the change relative to the context, 3) the nature of the change (add or remove), and 4) the point in time of the change. Following is a brief discourse on each.
1) Pattern defining context of change – this is a URL. It’s a pattern, used by the web function to define a precise sub function. In simple web terms, it could be a URL to an XML document.
2) Pattern that defines the boundary of the change – this is a relative URL. In the simple XML document case, it could be an XPath reference to an element in the XML document. The XPath ref defines the beginning and ending of the change – namely the start and end element marker locations.
3) The nature of the change – this is simple – add and remove. In the case of add, the meaning is that we added the data matching the pattern defined in 2) above. In the case of remove, it means that we removed the data matching the pattern defined in 2) above.
4) Point of time of the change – this is the date when the change takes place.
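And to make the four elements concrete, here's a hedged sketch of what a single delta web entry might look like - this isn't a proposed standard; the element names and namespace URI are just placeholders:

```python
# A hedged sketch of one delta web entry built from the four elements above.
# Element names and the namespace URI are invented, purely for illustration.
import xml.etree.ElementTree as ET

delta_xml = """
<delta xmlns="http://example.org/deltaweb">
  <context>http://example.com/andy/calendar.xml</context>       <!-- 1) context of change -->
  <boundary>/calendar/month[@name='April']/event[4]</boundary>  <!-- 2) boundary (XPath) -->
  <operation>add</operation>                                    <!-- 3) nature of change -->
  <when>2004-03-25T17:06:21</when>                              <!-- 4) point in time -->
</delta>
"""

delta = ET.fromstring(delta_xml)
ns = {"d": "http://example.org/deltaweb"}

# A delta web aggregator would read entries like this, then decide for itself whether
# to fetch the referenced content - the entry itself is just metadata about the change.
print(delta.findtext("d:context", namespaces=ns))
print(delta.findtext("d:operation", namespaces=ns))
```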
More to come soon!!!
PS – I was a physics major in college – which might provide a bizarre explanation for the really whacko way I conceptualize the web.
5:06:21 PM
Tuesday, March 23, 2004
There's a new extension to RSS for eventing called ESF. We could use this eventing mechanism to propagate information on changes to web services. Check Cesar's blog for his angle on it...
9:20:01 AM
© Copyright 2004 Andy Roberts.