Tuesday, April 18, 2006
lxml gets last things I need

A huge thanks to Stefan Behnel for his ongoing work on lxml, the advanced Python bindings for libxml2 and libxslt.
I've been waiting on two last features to let me do what I'm trying to achieve: HTML parsing and external functions. Both of these are available in the lxml Subversion repository. If you want the two together, you currently need to use the branch at … Documentation for the HTML support is in api.txt and extension docs are in extensions.txt.

First some caveats. You need libxml2 2.6.23. More specifically, don't use the version that ships with OS X. It won't work. Also, be aware that (at least for me) it installs as a Python egg. You probably won't notice (and it might not do that if you don't have setuptools installed), but don't be surprised if this happens.

So what does this give you? I'm working on themes as a way to globally manage page look-and-feel. The HTMLParser means I can use an existing page as a theme, even if it is tag soup and not well-formed XML. Also, I can apply a theme to content that might not be well-formed. I can then generate well-formed output.
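To make the tag-soup point concrete, here is a minimal sketch using lxml's `etree.HTMLParser`. The sample markup and variable names are made up for illustration; they are not taken from the theme work described above.

```python
from lxml import etree

# Made-up tag soup: unclosed <p>, <b> and <br>, and no closing </html>.
TAG_SOUP = "<html><body><p>First paragraph<p>Second <b>bold<br>text</body>"

# The HTML parser recovers a tree from markup that is not well-formed XML.
parser = etree.HTMLParser()
root = etree.fromstring(TAG_SOUP, parser)

# Serializing with the default XML method closes every element.
well_formed = etree.tostring(root, encoding="unicode")

# The serialized result can now be read back by the strict XML parser.
reparsed = etree.fromstring(well_formed)
```

The round trip through the strict XML parser at the end is only there to show that the output is well-formed; a theme pipeline would hand `root` straight to the next step.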
Next, the extension functions mean my XSLTs can do two things I need. First, I want to probe the theme document and insert its DOCTYPE and encoding in the XSLT for the dynamic pages. Since those aren't part of the XML Infoset, I can't get that information from the ElementTree representation. No problem: I write a little Python function that parses the theme file string, then make it callable in the XSLT. Something like
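A minimal sketch of that idea follows. The namespace URI, the `doctype` function name, the sample theme string, and the regex-based lookup are all assumptions made for illustration, not the code from the original post. It passes the Python function to `etree.XSLT` through its `extensions` mapping.

```python
import re
from lxml import etree

# Hypothetical namespace URI for the extension functions.
THEME_NS = "http://example.com/theme-functions"

# Made-up theme source; in practice this would be the theme file's text.
THEME_SOURCE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
    "<html><head><title>Theme</title></head><body/></html>"
)

def theme_doctype(context):
    """Return the DOCTYPE declaration of the theme, or '' if none is found.

    A simple regex is assumed to be enough here; it does not handle
    DOCTYPEs that carry an internal subset.
    """
    match = re.search(r"<!DOCTYPE[^>]*>", THEME_SOURCE, re.IGNORECASE)
    return match.group(0) if match else ""

STYLESHEET = etree.XML("""\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:theme="http://example.com/theme-functions"
    exclude-result-prefixes="theme">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:value-of select="theme:doctype()"/>
  </xsl:template>
</xsl:stylesheet>""")

# Map (namespace, name) to the Python callable so the XSLT can call it.
transform = etree.XSLT(STYLESHEET, extensions={(THEME_NS, "doctype"): theme_doctype})
result = str(transform(etree.XML("<page/>")))
```

The stylesheet here only echoes the value as text; a real theme stylesheet would presumably use it to drive its output settings.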
Also, I'd like to grab Zope data and make it available in the XSLT. Some of the data I can prepare in advance (like request data, the content of the object I'm rendering, etc.) and make it part of the input document. Other data I might want to pull in on a case-by-case basis. So I could do …

For those of you that detest XSLT syntax, I plan to take a look at XSLTAL (blog). Also, note that this theme stuff can still be a part of ZPT processing. The output of a ZPT can be the input of a theme.

What's left that I could possibly want in lxml? Well, perhaps I'd want a standard way to access DOCTYPE and encoding information. But that's not part of ElementTree, XML Infoset, or (probably) libxml2. So I can't complain too much. I might still want to play around with resolvers, as a way to let the same templates operate on static mockup data in non-Python tools (oXygen). But I'm far from settled on that.
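One possible shape for the resolver idea, sketched under assumptions: the `mockup:` URI scheme, the lookup table, and the class name are hypothetical. A custom `etree.Resolver` answers the stylesheet's `document()` calls from Python, so the template itself only ever refers to a URI.

```python
from lxml import etree

# Hypothetical data table keyed by made-up "mockup:" URIs. In a live setup
# the resolver could build these documents from application data instead.
DOCUMENTS = {
    "mockup:request": "<request><user>mockup-user</user></request>",
}

class TableResolver(etree.Resolver):
    """Answer document() lookups from the in-memory table."""

    def resolve(self, url, pubid, context):
        if url in DOCUMENTS:
            return self.resolve_string(DOCUMENTS[url], context)
        return None  # let lxml fall back to its default resolution

# Resolvers are attached to a parser; documents parsed with it carry them.
parser = etree.XMLParser()
parser.resolvers.add(TableResolver())

stylesheet = etree.XML("""\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <user><xsl:value-of select="document('mockup:request')/request/user"/></user>
  </xsl:template>
</xsl:stylesheet>""", parser)

transform = etree.XSLT(stylesheet)
result = transform(etree.XML("<page/>", parser))
user = result.getroot().text
```

Both the stylesheet and the input document are parsed with the same parser here, so the resolver is available whichever of the two lxml consults during the transformation.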
Ahh, lxml. What a successful and useful project.