Saturday, February 08, 2003

Mark Baker and Simon St. Laurent had a little back and forth over civility in technical discussions, specifically Elliotte Rusty Harold's bashing of SOAP.  Simon's pretty dismissive of the argument for civility, pointing at a post by Erik Naggum to comp.lang.lisp bashing XML.  It turns out that comp.lang.lisp has fairly well rehashed the arguments about civility in other threads, so I won't.  It seems to me, though, that the world remembers the peacemakers.  We name streets after Martin Luther King, but not so many after Malcolm X, in spite of Malcolm X's softening of his views later in his life.  Being right and being correct are two entirely different things.

That said, once I parsed the real content from his message, I found Erik Naggum's views enlightening.  Some of the highlights:

SGML is a good idea when the markup overhead is less than 2%.  Even attributes [are] a good idea when the textual element contents is the "real meat" of the document and attributes only aid processing, so that the printed version of a fully marked-up document has the same characters as the document sans tags.  Explicit end-tags is a good idea when the distance between start- and end-tag is more than the 20-line terminal the document is typed on.  Minimization is a good idea in an already sparsely tagged document, both because tags are hard to keep track of and because clusters of tags are so intrusive.  Character entities is a good idea when your entire character set is EBCDIC or ASCII.  Validating the input prior to processing is a good idea when processing would take minutes, if not hours, and consume costly resources, only to abend...When the markup overhead exceeds 200%, when attributes values and element contents compete for the information, when the distance between 99% of the "tags" is /zero/, when the character set is Unicode, and when validation takes more time than processing, then SGML has gone from good kid, via bad teenager, to malfunctioning, evil adult as XML...

The reasons for /not/ going binary when SGML competed with ODA have been reversed: When information should survive changes in the software, it was an important decision to make the data format verbose enough that it was easy to implement a processor for it and that processors could liberally accept what other processors conservatively produced, but now that the data formats that employ XML are so easily changed that the software can no longer keep up with it, we need to slam on the breaks and tell the redefiners to curb their enthusiasm, get it right before they share their experiments with the world, and show some respect for their users.

The Web provided me with a much needed realization that information cannot be /fully/ separated from its presentation, and showed me something I knew without verbalizing explicitly, that the presentation form we choose communicates real information.  Encoding all of it via markup would require a very fine level of detail, not to mention /awareness/ of issues so widely dispersed in the population that only a handful of people per million grasp them.  Therefore, to be successful, there must be an upper limit to the complexity of the language defined with SGML...
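
To put rough numbers on the "markup overhead" thresholds in the first excerpt, here is a minimal sketch of how one might measure it.  The definition (tag characters as a percentage of text-content characters) is my own assumption, since the post doesn't spell out a formula, and the function name markup_overhead and the two sample documents are made up for illustration.  The regex is a crude approximation of a tag, not a real SGML or XML parser.

```python
import re

# Rough approximation of a tag: anything between '<' and '>'.
# Assumption: comments, CDATA sections and entity references are ignored.
TAG = re.compile(r"<[^>]*>")


def markup_overhead(document: str) -> float:
    """Return tag characters as a percentage of text-content characters."""
    markup_chars = sum(len(tag) for tag in TAG.findall(document))
    content_chars = len(TAG.sub("", document))
    if content_chars == 0:
        raise ValueError("document has no text content")
    return 100.0 * markup_chars / content_chars


# Hypothetical samples: a long paragraph with one pair of tags,
# and a tiny data record where the tags dwarf the values.
prose = "<p>" + "word " * 200 + "</p>"
record = "<point><x>1</x><y>2</y></point>"

print(markup_overhead(prose))   # well under 2%
print(markup_overhead(record))  # well over 200%
```

The sparsely tagged paragraph lands on the "good idea" side of the line, and the data record lands far past the 200% mark, which is the contrast the excerpt is drawing between document markup and XML used as a data format.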

9:39:26 AM


Contact
jabber: weakliem
YM: gweakliem
MSN: gweakliem@pcisys.net
Subscribe to "Gordon Weakliem's Weblog" in Radio UserLand.