GIGO: words unreadable aloud
Mishrogo Weedapeval
 

 

  Saturday 24 April 2004
Python Parsing

I've been thinking about writing a parser for a toy programming language syntax design that's been kicking around in my head for a few years. Inspired by a recent thread on comp.lang.python, I thought I'd have a look at what's out there for parsers implemented in Python. A Google search for Python parsing yielded 134,000 hits today. A GooJa search in comp.lang.python only, SINCE FEB 12 ONLY, yielded 216 hits! Yow! (Tangent: another "Python Productivity Gain" thread from this past February.)

I finally re-located the original thread (tricky because the original poster's subject line was misspelled and did not include the word "parser": "Reommended compiler toolkit"), and here's a summary of what I found in that thread and another recent one (from February 2004, subject line "Parsing library for Python"). For what it's worth, I came into this with a slight bias towards SPARK (because it was the only one I had already heard of), and a strong bias towards pure Python parsers rather than ones that require C extensions.

The best links mentioned in the two threads were this one from the official Python pages: http://www.python.org/topics/parsing.html (recently moved to this URL from a "parser-sig" URL), and the link to Martin von Löwis' paper http://www.python.org/sigs/parser-sig/towards-standard.html from the 10th International Python Conference.

I have a slightly different take on things, as well as a few other links to follow up on. Later. Meanwhile, here are the bits that are relevant to me. The text fragments here are pasted directly from the GooJa search, except I've replaced "I" with "(TP)" for "this poster".

  • SPARK http://pages.cpsc.ucalgary.ca/~aycock/spark/ (TP) personally use spark, because of its declarative nature and the Earley parsing method - but that's just a matter of taste.
  • PLY ("Python Lex-Yacc") http://systems.cs.uchicago.edu/ply/ (TP) used PLY for a project a while ago. It felt comfortable. Another poster said that PLY was the one (AP) liked most.
  • Yappy seems promising, but (TP) couldn't get it to work. It doesn't even compile the main example in its documentation.
  • YAPPS : http://theory.stanford.edu/~amitp/Yapps/ Yapps (Yet Another Python Parser System) Produces human-readable recursive descent parsers; can produce context-sensitive scanners; rules can pass arguments down to subrules, like attribute grammars. The YAPPS page definitely has the best favicon of any of these :-).
  • You might also try http://pyparsing.sourceforge.net (TP) use Paul McGuire's pyparsing. It's pretty powerful; it uses Python to build the grammar, instead of a BNF (or BNF-like) lexer.
  • Also take a look at 'Toy Parser Generator' http://christophe.delord.free.fr/en/tpg/ ... the specification language is very nice.
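
To make "uses Python to build the grammar" a bit more concrete, here is a minimal sketch of the general idea: the grammar is assembled from ordinary Python objects rather than written in a separate BNF file. The helper names (token, seq, many, parse_numbers) are hypothetical and made up for this illustration; this is not pyparsing's actual API, just a toy in the same spirit using only the standard library.

```python
import re

def token(pattern):
    """Build a parser that matches a regex, skipping leading whitespace."""
    regex = re.compile(r"\s*(" + pattern + r")")
    def parse(text, pos):
        m = regex.match(text, pos)
        # On success return (matched text, new position); on failure, None.
        return (m.group(1), m.end()) if m else None
    return parse

def seq(*parsers):
    """Build a parser that runs each sub-parser in order; all must match."""
    def parse(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse

def many(parser):
    """Build a parser that applies a sub-parser zero or more times."""
    def parse(text, pos):
        values = []
        while True:
            result = parser(text, pos)
            if result is None:
                return values, pos
            value, pos = result
            values.append(value)
    return parse

# The "grammar" is plain Python: number_list := number ("," number)*
number = token(r"\d+")
comma = token(r",")
number_list = seq(number, many(seq(comma, number)))

def parse_numbers(text):
    """Parse a comma-separated list of integers into a Python list."""
    result = number_list(text, 0)
    if result is None:
        raise ValueError("not a number list")
    (first, rest), pos = result
    if text[pos:].strip():
        raise ValueError("unexpected trailing text at %d" % pos)
    return [int(first)] + [int(num) for _comma, num in rest]
```

Calling parse_numbers("1, 22, 333") walks the grammar objects directly, with no generation step in between. The real packages above add a great deal on top of this (error reporting, parse actions, operator precedence and so on).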

Not pure Python

  • SimpleParse which converts BNF to the tag tables used by mxTextTools. http://simpleparse.sourceforge.net/
  • mxTextTools http://www.egenix.com/files/python/mxTextTools.html (TP) thinks mxTextTools is way complicated. (TP)'d like something that (TP) can give a BNF grammar to handle. [DL: mxTextTools is one of the most mature packages here. If I were going for speed, that's probably what I would start with.]
  • PyLR seems promising but is for Python 1.5. It has a C extension, and looks like it hasn't been updated since 1997.
  • dparser http://dparser.sourceforge.net/ supports BNF (EBNF?) notation which I(TP)HO is nicer than the PLY grammar specs. (TP) hasn't tried dparser yet, so let us know how it is (compared to PLY) if you decide to use it. It includes a Python interface to the C-based parser (via SWIG). OTOH, it does seem pretty powerful.

Not really just parsers, or just lexers, or ones I just haven't followed up on very much yet:

  • Plex This is just a lexer. You'll have to write your own parser. Pyrex uses it, along with a hand-written recursive-descent parser. It shouldn't be too hard, though, to plug the lexer from this into one of the other packages to do parsing.
  • FlexModule uses flex to make a lexer, and also works with BisonModule (which uses bison) to make the parser.
  • TRAP http://www.first.gmd.de/smile/trap/ Rather more comprehensive TRAnslator generator. "TRAP is a generic prototyping system for translators and compilers, especially suited for medium-complexity special purpose languages."
  • kwParsing
  • BisonGen
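
The "separate lexer plus hand-written recursive-descent parser" arrangement mentioned under Plex can be sketched roughly as follows. This is only an illustration under stated assumptions: the lexer here is built on the standard re module rather than Plex, and the tiny arithmetic language, the token kinds (NUM, OP, END) and the names tokenize, Parser and evaluate are all hypothetical.

```python
import re

# Lexer stage: one alternative per token kind. WS is skipped; BAD catches
# any character that no other rule accepts.
TOKEN_RE = re.compile(r"(?P<NUM>\d+)|(?P<OP>[-+*()])|(?P<WS>\s+)|(?P<BAD>.)")

def tokenize(text):
    """Turn a string into a list of (kind, text) tokens, ending with END."""
    tokens = []
    for m in TOKEN_RE.finditer(text):
        kind = m.lastgroup
        if kind == "WS":
            continue
        if kind == "BAD":
            raise SyntaxError("unexpected character %r" % m.group())
        tokens.append((kind, m.group()))
    tokens.append(("END", ""))
    return tokens

class Parser:
    """Parser stage: recursive descent over the token list.

    expr   := term (("+" | "-") term)*
    term   := factor ("*" factor)*
    factor := NUM | "(" expr ")"
    """
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def advance(self):
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def expr(self):
        value = self.term()
        while self.peek() in (("OP", "+"), ("OP", "-")):
            _, op = self.advance()
            right = self.term()
            value = value + right if op == "+" else value - right
        return value

    def term(self):
        value = self.factor()
        while self.peek() == ("OP", "*"):
            self.advance()
            value = value * self.factor()
        return value

    def factor(self):
        kind, text = self.advance()
        if kind == "NUM":
            return int(text)
        if (kind, text) == ("OP", "("):
            value = self.expr()
            if self.advance() != ("OP", ")"):
                raise SyntaxError("expected ')'")
            return value
        raise SyntaxError("unexpected token %r" % text)

def evaluate(text):
    """Lex, parse and evaluate an arithmetic expression in one go."""
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() != ("END", ""):
        raise SyntaxError("unexpected trailing input")
    return value
```

Because the parser only ever sees (kind, text) pairs, the lexer stage could in principle be swapped for a different one that produces the same pairs, which is the kind of plugging-together the Plex note has in mind.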

12:36:21 PM


© Copyright 2007 Doug Landauer.
Last update: 07/2/6; 12:38:25.
