Python Parsing
I've been thinking about writing a parser for a toy programming
language whose syntax design has been kicking around in my head for a
few years. Inspired by a recent thread on comp.lang.python, I
thought I'd have a look at what's out there for parsers implemented
in Python. A Google search for "Python parsing" yielded 134,000
hits today. A GooJa search in comp.lang.python only, SINCE FEB 12
ONLY, yielded 216 hits! Yow! (Tangent: another "Python
Productivity Gain" thread from this past February.)
I finally re-located the original thread (tricky, because the original
poster used a misspelled subject line, "Reommended compiler toolkit",
that did not include the word "parser"), and here's a summary of
what I found in that thread and another recent one (from February 2004,
subject line "Parsing library for Python").
For what it's worth, I came into this with a slight
bias towards SPARK (because it was the only one I had already heard
of), and a strong bias towards pure Python parsers rather than ones
that require C extensions.
The best links mentioned in the two threads were this one
from the official Python pages: http://www.python.org/topics/parsing.html
(recently moved to this URL from a "parser-sig" URL),
and the link to Martin von Löwis' paper http://www.python.org/sigs/parser-sig/towards-standard.html
from the 10th International Python Conference (IPC10).
I have a slightly different take on things, as well as a
few other links to follow up on. Later. Meanwhile, here are
the bits that are relevant to me.
The text fragments here are pasted directly from the
GooJa search, except I've replaced "I" with "(TP)" for
"this poster".
- SPARK http://pages.cpsc.ucalgary.ca/~aycock/spark/
(TP) personally use SPARK, because of its declarative
nature and the Earley parsing method, but that's just a matter of taste.
- PLY ("Python Lex-Yacc") http://systems.cs.uchicago.edu/ply/
(TP) used PLY for a project a while ago.
It felt comfortable.
Another poster said that PLY was the one (AP) liked most.
- Yappy seems promising, but (TP) couldn't get it to work. It doesn't even
compile the main example in its documentation.
- YAPPS: http://theory.stanford.edu/~amitp/Yapps/
Yapps (Yet Another Python Parser System)
Produces human-readable recursive descent parsers;
can produce context-sensitive scanners;
rules can pass arguments down to subrules, like
attribute grammars.
The YAPPS page definitely has the best favicon of any of these :-).
- You might also try http://pyparsing.sourceforge.net
(TP) use Paul McGuire's pyparsing.
It's pretty powerful; it uses Python to build the grammar,
instead of a BNF (or BNF-like) lexer.
- Also take a look at 'Toy Parser Generator'
http://christophe.delord.free.fr/en/tpg/
... the specification language is very nice.
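To make the YAPPS description above a bit more concrete, here is a minimal hand-written sketch of what a "human-readable recursive descent parser" with "arguments passed down to subrules" can look like. This is not Yapps output and does not use any of the packages listed; the grammar (integers, +, -, * and parentheses) and every name in it (tokenize, Parser, expr_tail, evaluate) are made up for illustration.

```python
import re

# Hypothetical token pattern for this sketch: integers, + - * and parentheses.
TOKEN_RE = re.compile(r"\s*(\d+|[-+*()])")


def tokenize(text):
    """Split an arithmetic expression into a list of token strings."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError("unexpected character at position %d" % pos)
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class Parser:
    """One method per grammar rule; each rule returns the computed value."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    # expr : term expr_tail(term)
    def expr(self):
        return self.expr_tail(self.term())

    # expr_tail(left) : ('+' | '-') term expr_tail(left op term) | empty
    # The value parsed so far is passed DOWN to the subrule as an argument.
    def expr_tail(self, left):
        if self.peek() in ("+", "-"):
            op = self.advance()
            right = self.term()
            return self.expr_tail(left + right if op == "+" else left - right)
        return left

    # term : factor term_tail(factor)
    def term(self):
        return self.term_tail(self.factor())

    # term_tail(left) : '*' factor term_tail(left * factor) | empty
    def term_tail(self, left):
        if self.peek() == "*":
            self.advance()
            return self.term_tail(left * self.factor())
        return left

    # factor : NUMBER | '(' expr ')'
    def factor(self):
        tok = self.advance()
        if tok == "(":
            value = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if tok is not None and tok.isdigit():
            return int(tok)
        raise SyntaxError("unexpected token %r" % (tok,))


def evaluate(text):
    """Parse and evaluate a whole expression, rejecting trailing tokens."""
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError("unexpected token %r" % (parser.peek(),))
    return value
```

Passing the left operand down into expr_tail and term_tail is what keeps subtraction left-associative without a left-recursive rule, which a recursive descent parser cannot handle directly. A generator like Yapps would presumably write code of this general shape for you from a grammar file.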
Not pure Python:
- SimpleParse which converts BNF to the tag tables used by
mxTextTools. http://simpleparse.sourceforge.net/
- mxTextTools
http://www.egenix.com/files/python/mxTextTools.html
(TP) thinks mxTextTools is way complicated.
(TP)'d like something that (TP) can give a BNF
grammar to handle.
[DL: mxTextTools is one of the most mature
packages here. If I were going for speed, that's probably
what I would start with.]
- PyLR - PyLR seems promising but is for Python 1.5
PyLR has a C extension, and looks like
it hasn't been updated since 1997.
- dparser http://dparser.sourceforge.net/ supports BNF (EBNF?) notation which I(TP)HO is nicer than the PLY grammar
specs. (TP) hasn't tried dparser yet, so let us know how it is (compared to PLY) if you decide to use it.
It includes a Python interface to the
C-based parser (via SWIG). OTOH, it does seem pretty powerful.
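Several posters above want "something that (TP) can give a BNF grammar to handle". As a rough illustration of that idea only, here is a tiny pure-Python recognizer driven by a BNF-like table. It has nothing to do with the actual input formats of SimpleParse, mxTextTools, or dparser; the GRAMMAR dict and the function names are hypothetical, and the naive backtracking approach would loop forever on a left-recursive rule.

```python
# Hypothetical grammar for this sketch, written as a BNF-like table:
#   sum    ::= number "+" sum | number
#   number ::= "0" | "1" | "2"
# Keys are nonterminals; any symbol that is not a key is a terminal.
GRAMMAR = {
    "sum": [["number", "+", "sum"], ["number"]],
    "number": [["0"], ["1"], ["2"]],
}


def match(grammar, symbol, tokens, pos):
    """Yield every position where `symbol` can end when started at `pos`."""
    if symbol not in grammar:
        # Terminal: it must equal the next token exactly.
        if pos < len(tokens) and tokens[pos] == symbol:
            yield pos + 1
        return
    # Nonterminal: try each alternative in turn (backtracking).
    for alternative in grammar[symbol]:
        yield from match_sequence(grammar, alternative, tokens, pos)


def match_sequence(grammar, symbols, tokens, pos):
    """Yield every end position for matching `symbols` in order from `pos`."""
    if not symbols:
        yield pos
        return
    for mid in match(grammar, symbols[0], tokens, pos):
        yield from match_sequence(grammar, symbols[1:], tokens, mid)


def recognizes(grammar, start, tokens):
    """True if the whole token list can be derived from `start`."""
    return any(end == len(tokens) for end in match(grammar, start, tokens, 0))
```

For example, recognizes(GRAMMAR, "sum", ["1", "+", "2"]) accepts the input, while ["1", "+"] is rejected. This only answers yes or no; the real packages also build parse trees or run actions, and they use far faster algorithms.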
Not really just parsers, or just lexers, or things I
haven't followed up on very much yet:
- Plex: This is just a lexer. You'll have to write your own parser.
Pyrex uses it, along with a hand-written recursive-descent parser.
It shouldn't be too hard, though, to plug the lexer from this into
one of the other packages to do parsing.
- FlexModule uses flex to make a lexer, and also works with
BisonModule (which uses bison) to make the parser.
- TRAP
http://www.first.gmd.de/smile/trap/
Rather more comprehensive TRAnslator generator.
"TRAP is a generic prototyping system for translators and compilers,
especially suited for medium-complexity special purpose languages."
- kwParsing
- BisonGen
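For the lexer-only entries in this list (Plex, FlexModule), it may help to see how little a lexer has to do before a separate parser takes over. The sketch below uses only the standard re module, not Plex's API; the token names and patterns in TOKEN_SPEC are assumptions chosen for illustration.

```python
import re

# Hypothetical token definitions for this sketch; order matters because the
# first alternative that matches at a given position wins.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("NAME", r"[A-Za-z_]\w*"),
    ("OP", r"[-+*/=()]"),
    ("SKIP", r"\s+"),
    ("ERROR", r"."),
]
MASTER_RE = re.compile("|".join("(?P<%s>%s)" % pair for pair in TOKEN_SPEC))


def lex(text):
    """Yield (kind, text) pairs; whitespace is dropped, junk raises an error."""
    for m in MASTER_RE.finditer(text):
        kind = m.lastgroup
        if kind == "SKIP":
            continue
        if kind == "ERROR":
            raise SyntaxError(
                "unexpected character %r at position %d" % (m.group(), m.start())
            )
        yield kind, m.group()
```

Calling list(lex("x = 42 + y1")) gives a flat stream of (kind, text) pairs, which is the sort of thing a separate parser could consume as its input.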