Wednesday, October 5, 2005

ZPT and XML

A while back I wondered if ElementTree or lxml could be used with ZPT. Answer: yes. Here's how.

First, let's discuss why. Let's say you want to deal with XML coming from somewhere else. Or, let's say you actually like XML because it is fast and the data model is obvious. Mostly, let's say you like XPath.

Great, but how do you render HTML? Most ZPT'ers look at XSLT and cough up a hairball. (I disagree, but that's a side point.) Wouldn't it be nice if you could use ZPT to render XML into HTML?

To start, review Shane's excellent, brief explanation of using ZPT outside Zope, from the command line. Here's our starting point:

from zope.pagetemplate import pagetemplate
from lxml import etree
from StringIO import StringIO

# Sample XML document; the XML declaration must come first in the string.
content = """<?xml version="1.0" encoding="UTF-8"?>
<items>
  <item id="1">
    <title>First Item</title>
  </item>
  <item id="2" paid="1">
    <title>Second Item</title>
  </item>
  <item id="3" paid="1">
    <title>Last Item</title>
  </item>
</items>
"""

f = StringIO(content)
doc = etree.parse(f)
root = doc.getroot()

# The page template; keyword arguments passed to it show up under "options".
text = """
<html>
  <head>
    <title tal:content="options/title">Title goes here</title>
  </head>
  <body>
    <h1 tal:content="options/title">Title goes here</h1>
    <ul>
      <li tal:repeat="i options/root"
          tal:content="i/attrib/id">Item ID</li>
    </ul>
  </body>
</html>
"""

def main():
    pt = pagetemplate.PageTemplate()
    pt.pt_edit(text, 'text/html')
    print pt(title='Success!', root=root)

if __name__ == '__main__':
    main()

This simple module creates a little XML document ("doc") from an inline string. It also creates a ZPT. When run from the command line, it produces HTML output.

How does it work? We pass an ElementTree node into the ZPT as a Python object. This is the magic. Because ElementTree gives a Pythonic interface to XML data, and because the ZPT semantics for traversal match up reasonably well with ElementTree, things just work. We can iterate over the children (item elements) of the root element (items) and show their @id attribute.
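If it helps to see the traversal without the template, here is a rough plain-Python equivalent of what the tal:repeat loop does. This is only a sketch: it uses the standard library's xml.etree.ElementTree (which shares this part of the API with lxml) so that it runs without extra packages, and the names ITEMS_XML and item_ids are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Illustrative copy of the sample document from the module above.
ITEMS_XML = """<items>
  <item id="1"><title>First Item</title></item>
  <item id="2" paid="1"><title>Second Item</title></item>
  <item id="3" paid="1"><title>Last Item</title></item>
</items>"""


def item_ids(root):
    """Mimic tal:repeat="i options/root" with tal:content="i/attrib/id"."""
    # Iterating over an element yields its child elements, which is what
    # tal:repeat relies on; the path i/attrib/id becomes i.attrib["id"].
    return [i.attrib["id"] for i in root]


root = ET.fromstring(ITEMS_XML)
print(item_ids(root))
```

The path expression and the Python code line up step for step, which is why the path version reads so naturally in the template.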

What would it look like if we wanted to show the value of each title? Alas, this is where things break down just a bit:

<li tal:repeat="i options/root" 
    tal:content="python: i.find('title').text">Item ID</li>

Here we use the 'find' method of the ElementTree API to grab the title subelement and show its text value. You can see that, since we had to shift to a Python expression, it isn't as seamless as the previous version. In this case, even though we are using lxml, we are sticking to the API it shares with ElementTree, which doesn't have a full XPath implementation.
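Outside the template, the same lookup can be tried directly in Python. The sketch below again assumes the standard library's xml.etree.ElementTree rather than lxml, and item_titles is a hypothetical helper name. It uses findtext instead of find(...).text so that an item without a title gives an empty string rather than an AttributeError, which is a choice of mine and not something the template above does.

```python
import xml.etree.ElementTree as ET

# Illustrative copy of the sample document.
ITEMS_XML = """<items>
  <item id="1"><title>First Item</title></item>
  <item id="2" paid="1"><title>Second Item</title></item>
  <item id="3" paid="1"><title>Last Item</title></item>
</items>"""


def item_titles(root):
    """Return the title text of each child element of root."""
    titles = []
    for i in root:
        # findtext() returns the text of the first matching child,
        # or the given default when there is no such child.
        titles.append(i.findtext("title", default=""))
    return titles


print(item_titles(ET.fromstring(ITEMS_XML)))
```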

However, what if we wanted to leverage the power of XPath? Here's one way the above would look:

<li tal:repeat="i python: options['root'].xpath('item/title/text()')" 
    tal:content="i">Item ID</li>

In this version, our tal:repeat expression uses lxml's xpath method. We ask the root element (items) for all of its "item" children, and for those, get the "title" element's text content. The tal:content then becomes quite simple.

Finally, the real payoff: we get to use the power of XPath to dig into deeply nested structures and pick things out. For example, only show the items that are paid:

<li tal:repeat="i python: options['root'].xpath('item[@paid]/title/text()')" 
    tal:content="i">Item ID</li>
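To experiment with these queries outside a template, something like the following should work. Note the assumption: this sketch uses the limited path syntax of the standard library's xml.etree.ElementTree, not lxml's xpath method. That subset has no text() step, so the code selects the title elements and reads .text from each, but it does understand the [@paid] predicate.

```python
import xml.etree.ElementTree as ET

# Illustrative copy of the sample document.
ITEMS_XML = """<items>
  <item id="1"><title>First Item</title></item>
  <item id="2" paid="1"><title>Second Item</title></item>
  <item id="3" paid="1"><title>Last Item</title></item>
</items>"""

root = ET.fromstring(ITEMS_XML)

# Rough counterpart of xpath('item/title/text()'): every item's title.
all_titles = [t.text for t in root.findall("item/title")]

# Rough counterpart of xpath('item[@paid]/title/text()'):
# only items that carry a paid attribute.
paid_titles = [t.text for t in root.findall("item[@paid]/title")]

print(all_titles)
print(paid_titles)
```

With lxml, the text() step does this last bit of work for you, so the template can repeat over plain strings.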

As an example of a place where this could really help, imagine the navtree portlet in a CMS. Keep a cached lxml DOM in memory and let XPath munge it at, errm, high-speed. [wink] (Of course, this kind of recursive formatting is what XSLT is also good at.)
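Here is one way the navtree idea might be sketched. Everything in it is hypothetical: the NAV_XML structure, the module-level cache, and the get_nav_tree and pages_under helpers are invented for illustration, and it uses the standard library's xml.etree.ElementTree path subset in place of lxml's full XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical site structure; a real CMS would generate this from its content.
NAV_XML = """<nav>
  <folder id="docs" title="Docs">
    <page id="intro" title="Introduction"/>
    <folder id="api" title="API">
      <page id="zpt" title="ZPT"/>
    </folder>
  </folder>
  <page id="about" title="About"/>
</nav>"""

_NAV_CACHE = None


def get_nav_tree():
    """Parse the navigation XML once and reuse the tree afterwards."""
    global _NAV_CACHE
    if _NAV_CACHE is None:
        _NAV_CACHE = ET.fromstring(NAV_XML)
    return _NAV_CACHE


def pages_under(folder_id):
    """Return the titles of all pages at any depth below a folder."""
    nav = get_nav_tree()
    # folder_id is assumed to be a trusted, simple identifier here.
    folder = nav.find(".//folder[@id='%s']" % folder_id)
    if folder is None:
        return []
    return [p.get("title") for p in folder.findall(".//page")]


print(pages_under("docs"))
```

A template could then repeat over pages_under(...) through a Python expression, while the parsing cost is paid only once.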

There's lots more that XPath can do: subtotals, going up and down nodes, etc. In fact, lxml currently sports an XPath extension facility. With this, we could bind our own Python methods to XPath evaluations inside ZPT.

In summary, XML provides a series of benefits for data abstraction. In particular, XPath is a powerful, well-documented, well-implemented spec for walking around in data. By using this from ZPT, Zope developers might have an easier way to gain those benefits.


11:29:54 AM