Jon's Radio

Jon's Radio : Jon Udell's Radio Blog

Updated: 8/6/2002; 12:30:33 AM.

Note: Jon's Radio has moved to InfoWorld

Thursday, April 18, 2002

Sam points out that awk is a complicated beast. Of course, that's true.

Ever typed "man awk" lately? In particular, look at the sheer number of command line options. [Sam Ruby]

Awk is not intrinsically simple, nor is the species of whitespace-delimited text that it processes. But neither is XML. In both cases, it's tempting to slide down the slippery slope of defining kitchen-sink formats. Quite often less is more, is all I'm saying.

The state-space explosion that Sean McGrath refers to is, of course, a risk in any kind of patterned data. I think he's right to suggest, though, that when XML explodes in this way, there's a tendency to presume, because of the XMLness of the data, a false simplicity.

By the way, does "XML dogfood" count as a googlewhack? Or are the quotes cheating?
9:50:35 PM

Eating the XML dogfood

Sean McGrath speaks to the dark side of XML tagging in this cogent article. He's right. When the people who are making the dogfood don't have to eat it, there's bound to be trouble.

Every time a developer writes an XPath expression, a SAX handler, or weaves a DOM NodeList, he or she is contributing to the XML tags' cost of ownership. Every time a developer backs off from cutting code because of the sheer complexity of the XML structure being manipulated, you are accumulating costs.

In XML land, not only are the equivalent of "global variables" created with wild abandon, but their creators often see fit to invoice based on the number they create for you. An unfortunate schism exists in XML software development between the team that develops the schema and the team processing the XML that conforms to the schema. Too often, these are not the same teams.

I remember an example of lateral thinking in a book by Edward De Bono. A company suspected of water pollution applied for permission to draw fresh water from the same stream it was pumping effluent into. Permission was granted on condition that the water intake occur downstream from the effluent discharge.

Stretching the analogy to its breaking point, those who wish to create schemas must work at a software development level with the XML they themselves have modeled. That will teach them the real cost of XML tagging. The real cost of XML tags (ITworld.com) [IBM DeveloperWorks: XML News]

Last night, I had dinner with Dale Dougherty, who has been around the track a few times when it comes to processing structured text. (Dale wrote the 1990 classic, sed and awk.) What bugs him lately? Overly complex XML schemas. Less is more, he has concluded.
8:23:48 PM

Jeffrey P Shell, Zope's lizard brain, and loosely-coupled messaging

I had a nice response to a recent column on Zope from Jeffrey P Shell, a longtime Zopista and former Digital Creations guy who has decamped to the skiing life in Utah. Jeffrey wrote one of the first BYTE columns on Python, longer ago than the web seems to remember. Anyway, in my column I talked about scripting Zope from the outside, using the RESTian approach of reverse-engineering its HTML management forms and calling them as URLs. I knew that XML-RPC was another way to do this. Jeffrey pointed out a third, little-known approach that taps into Zope's "lizard brain."

Did you know that Zope has always had a simple RPC mechanism, predating XML-RPC, and even predating Zope itself? There was a little piece of Bobo, which is now ZPublisher, called 'bci' for 'Bobo Call Interface'. I'm almost ashamed that more wasn't done to promote BCI, or turn it into an actual RPC mechanism (it doesn't marshal return data), because XML-RPC, while simple, is just a little too simple (no concept of None/NIL? No concept of authentication except as part of the API?). And SOAP.. *sigh*.

Anyways, ZPublisher.Client (which can be used without any other Zope modules, so you could install a copy of it into a common place) is another way to do that Perl script that you wrote, while maintaining a cleaner syntax than writing a long URL. It basically generates the same URL (with all of the correct Zope marshalling, although I don't know if it knows of the more recent marshalling options) and does the same job.
from ZPublisher.Client import Object

# Wrap the remote catalog; method calls on it become HTTP requests.
myCatalog = Object("http://host:port/repository/myCatalog")
myCatalog.manage_catalogFoundItems(
    obj_metatype=['Image', 'File'],
    obj_permission="Access contents information",
    search_sub=1, btn_submit="Find and Catalog")

You can see the similarities to XML-RPC, which you might even be able to use in this situation, but there are some niceties about BCI. When constructing a ZPublisher.Client.Object or ZPublisher.Client.Function method, you can specify a username and password and you will be authenticated over Basic Auth. You can specify which HTTP method to use (GET/POST/PUT). You can upload files just by passing a Python file object (basically anything with a 'read()' method). You can also catch remote exceptions. While I recognize that XML-RPC has the concept of 'fault', for more intimate Zope scripting, sometimes more knowledge of the cause of the fault is required. This is the only real marshalled data that ZPublisher.Client (BCI) sends back.
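Two of Jeffrey's points can be sketched with nothing but Python's standard library. This is an illustration under stated assumptions, not ZPublisher.Client's actual code. The helper `form_url` is hypothetical: it approximates the kind of long URL the call above stands in for, assuming Zope's `:list` and `:int` field suffixes are how lists and integers get marshalled, and it keeps `host:port` as the same placeholder. The second function checks the "no concept of None" complaint against the standard `xmlrpc.client` marshaller.

```python
from urllib.parse import urlencode
import xmlrpc.client


def form_url(base, method, **params):
    """Approximate the URL a management-form call boils down to (hypothetical helper)."""
    pairs = []
    for name, value in params.items():
        if isinstance(value, (list, tuple)):
            # Assumption: lists are sent as repeated fields with a ":list" suffix.
            pairs.extend((name + ":list", item) for item in value)
        elif isinstance(value, int):
            # Assumption: an ":int" suffix asks the server to convert the value.
            pairs.append((name + ":int", value))
        else:
            pairs.append((name, value))
    # Keep ":" unescaped so the type suffixes stay readable in the URL.
    return base + "/" + method + "?" + urlencode(pairs, safe=":")


def xmlrpc_accepts_none():
    """Return True if the default XML-RPC marshaller can encode None."""
    try:
        xmlrpc.client.dumps((None,))
    except TypeError:
        return False
    return True


url = form_url(
    "http://host:port/repository/myCatalog",  # placeholder from the example above
    "manage_catalogFoundItems",
    obj_metatype=["Image", "File"],
    obj_permission="Access contents information",
    search_sub=1,
    btn_submit="Find and Catalog",
)
print(url)
print(xmlrpc_accepts_none())
```

The first line printed is the long URL you would otherwise write by hand. The second shows that the stock marshaller refuses None; Python's `xmlrpc.client` can encode it only when `allow_none=True` is passed, which is an extension to the XML-RPC format rather than part of it.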

Cool! By the way, Jeffrey is trying out a blog. I hope he sticks with it. He's a wonderfully thoughtful and articulate writer, and a really clever guy. I bet he'll have OmniOutliner hooked in before long.

Looking into blogspace with fresh eyes, Jeffrey wondered "Why write when there's so much else to do? Who's reading?"

These are great questions. I of course get paid for my writing, though not for what I write here. But what about most folks? Why write? Who will read? Lots of people have lots of different reasons. For me, it's mainly about optimizing information flow and managing attention. In a recent column I explored the idea of storytelling as a tool for project coordination. That's closely related to what Dave Winer means by "narrating work" (and is demonstrating in his outline). We do this narration all the time in interpersonal email. Something interesting happens when we instead write messages addressed to spaces.

Defining why it's interesting is hard to do. But I'm closing in on it. Today I realized the following analogy may hold: loosely-coupled message-driven architecture, the mantra of the web services movement, is precisely what blogspace is becoming for the realm of human communication. When we adopt this style of communication, we give up some of the benefits of tight coupling: message acknowledgement, tight feedback loops. But we gain (maybe) the ability to scale beyond what is possible when tightly-coupled messaging (email, discussion groups) is the only available mode. This doesn't mean there's no benefit to tightly-coupled interpersonal messaging. It only suggests that the loosely-coupled mode is also important.
3:39:10 PM