Wednesday, January 22, 2003

I've been exploring RELAX NG and Schematron, either as a replacement for, or an adjunct to, W3C XML Schema (WXS).  Schematron is pretty interesting because it seems to fill the holes in XML Schema nicely.  For example, I work with documents that have constructs XML Schema can't fully describe.

One construct you'll see in our data is something like this.  It's taken from the response to a query on rental cars available at a particular location.  Our mainframe will return only so much data; if the client wants to see more choices, and there are more to be had, they ask for them.  What's being described here is whether the recipient can request additional data beyond the current response:

<MoreAvailInd>Y</MoreAvailInd>
<MoreAvailQual>
  <IDMSControl>1</IDMSControl>
  <!-- more elements ... -->
</MoreAvailQual>

The idea is that the MoreAvailQual element's presence is contingent on MoreAvailInd being "Y". AFAIK, you can't define this constraint in XML Schema: there's no way to enforce a relationship where the presence of one element depends on the value of another, so the best you could do is to say that MoreAvailQual is optional. OTOH, you can annotate a schema with a Schematron rule to add the assertion that if MoreAvailQual exists, MoreAvailInd must be "Y". Here's a partial declaration:

<xs:element name="MoreAvailInd"/>
<xs:element name="MoreAvailQual" minOccurs="0">
  <xs:annotation>
    <xs:appinfo>
      <schema xmlns="http://www.ascc.net/xml/schematron">
        <pattern name="More Indicator">
          <rule context="MoreAvailQual">
            <assert test="preceding::MoreAvailInd/text()='Y'">MoreAvailQual may not be present unless MoreAvailInd is 'Y'</assert>
          </rule>
        </pattern>
      </schema>
    </xs:appinfo>
  </xs:annotation>
  <!-- define child elements... -->
</xs:element>
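The rule doesn't have to live inside xs:appinfo, either. As a rough sketch, assuming the two elements are always siblings as in the sample above, a standalone Schematron 1.5 schema might look something like this. The second rule is an assumption on my part rather than anything from our real schemas: it checks the converse, that a "Y" indicator is actually accompanied by a MoreAvailQual.

```xml
<schema xmlns="http://www.ascc.net/xml/schematron">
  <pattern name="More Indicator">
    <!-- A qualifier is only allowed when a sibling indicator says 'Y'. -->
    <rule context="MoreAvailQual">
      <assert test="preceding-sibling::MoreAvailInd = 'Y'">MoreAvailQual may not be present unless MoreAvailInd is 'Y'</assert>
    </rule>
    <!-- Assumed converse: a 'Y' indicator must have a sibling qualifier after it. -->
    <rule context="MoreAvailInd[. = 'Y']">
      <assert test="following-sibling::MoreAvailQual">MoreAvailQual must be present when MoreAvailInd is 'Y'</assert>
    </rule>
  </pattern>
</schema>
```

Using preceding-sibling and following-sibling here, instead of the broader preceding axis, keeps the check from being satisfied by a MoreAvailInd elsewhere in the document, but it only works if the sibling assumption holds.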

I can't think of a way to describe this constraint using pure WXS.  My employer has a lot of XML data that's described this way, so Schematron should be pretty useful in capturing those constraints.  On the other hand, Schematron seems pretty weak at describing simple things like <xs:sequence>, where the order of elements matters.  Not that you couldn't describe that with XPath, but it'd be pretty awkward.  Maybe Schematron has a more compact syntax for that, but I haven't found it yet.  So I doubt that Schematron would be my primary schema language, because the order of elements always matters in our documents.
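To make the awkwardness concrete, here's a minimal sketch of what order checking might look like with plain XPath assertions. The pattern name and the specific ordering rules are hypothetical, based only on the sample fragment above (MoreAvailInd directly before MoreAvailQual, and IDMSControl as the first child), not on our actual document definitions.

```xml
<schema xmlns="http://www.ascc.net/xml/schematron">
  <pattern name="Element Order">
    <rule context="MoreAvailQual">
      <!-- The nearest preceding sibling element must be MoreAvailInd. -->
      <assert test="preceding-sibling::*[1][self::MoreAvailInd]">MoreAvailQual must come immediately after MoreAvailInd</assert>
    </rule>
    <rule context="MoreAvailQual/IDMSControl">
      <!-- Nothing may come before IDMSControl inside its parent. -->
      <assert test="not(preceding-sibling::*)">IDMSControl must be the first child of MoreAvailQual</assert>
    </rule>
  </pattern>
</schema>
```

That's one rule per adjacent pair of elements, where <xs:sequence> gets the whole ordering from a single list of declarations.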

I sent the text above out as an email to a few co-workers.  The one response that I got was that Schematron isn't a standard, and that XML, and WXS in particular, is one place where every vendor has agreed on a standard.  My response was that Schematron is in the process of becoming an ISO standard, and that the fact that WXS is a standard doesn't solve the problem that it can't describe our documents completely.  Now, I realize that this is a bit of a straw man argument; why not just modify the document structure?  The <MoreAvailInd> element is superfluous; the presence or absence of the <MoreAvailQual> element is self-describing.  My answer is that this document already exists, and it'd be easier to find a way to describe it than to break all the existing clients.

This got me thinking, though.  It's pretty well accepted that one's language limits one's expressiveness; you can't think it if you can't say it.  There are certain concepts that simply don't translate well between natural languages for this reason.  I wonder if what will happen is that we (as in XML developers in general) will start constructing only documents that can be fully described by WXS, because that's what WXS enables us to express.  What will we lose if that happens?

1:46:22 PM


Subscribe to "Gordon Weakliem's Weblog" in Radio UserLand.