Parsing -- Python parser generator module
The Parsing module is a pure-Python module that implements an LR(1) parser generator, as well as CFSM and GLR parser drivers. From an algorithmic perspective, this is one of the most advanced parser generators in existence, for the following reasons:
- The Parsing module uses a scalable algorithm for LR(1) parser generation, rather than the more limited LALR(1) or SLR(1) algorithms that are more typically used. To my knowledge, Menhir is the only other LR(1) parser generator that implements the algorithms that David Pager published 30 years ago. All other available LR(1) parser generators require exponential time.
- The Parsing module implements the standard “characteristic finite state machine” (CFSM) parser driver that many other parser generators such as yacc, bison, and lemon use. In addition though, it implements a GLR parser driver that is very similar to that of Elkhound, which has some important memory usage advantages over implementations like that used by bison.
- The Parsing module provides more robust conflict resolution mechanisms than any other parser generator I am aware of. Back before LR parsing was developed, precedence parsing was the norm. It appears that precedence parsing was subsumed by LR parser generators with little thought to the impedance mismatch between precedence parsing and LR parsing. Rather than limiting the developer to a linear precedence ranking, the Parsing module allows the developer to specify a directed acyclic graph of precedences. The primary advantage is that it is possible to resolve individual conflicts without silently hiding other conflicts that crop up during later development.
- The Parsing module does not generate source code like most parser generators. Instead it caches the results of parser generation in a pickle, then on subsequent runs verifies that the pickle is still usable for parsing. This means that there is no separate parser generation step, which is a real advantage when using Python.
- Although this is not unique among parser generators, it is worth mentioning that the Parsing module implements extensive error checking and diagnostics. There is no substitute for the hard work that it takes to become proficient at parser development, but the development aids provided by the Parsing module are superb.
The best way to get started with the Parsing module is to download/install it via PyPI (or download an older local version) and download the example, read and experiment with the example, then read the docstring-based documentation that is embedded in the Parsing module source. If you want to use the GLR parser driver, you may find some of the regression tests to be useful examples.
Copyright © 2013 Jason Evans <firstname.lastname@example.org>.
Last updated Wednesday 2013/01/06