Parsing and Modifying Code

Parsing code is a well-understood problem. Grammar rules match the input text and, when they match, produce values built from the matched text. These values are assembled into an "abstract syntax tree" (AST), which stores the information you need in a structured representation.
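As a rough illustration of rules producing values that are assembled into a tree, here is a minimal hand-written parser for a toy grammar. The grammar, the class SumParser, and the node types Num and Sum are all hypothetical and chosen only for this sketch; this is not the Parselets API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy grammar (not Parselets):
//   sum    := number ('+' number)*
//   number := [0-9]+
class SumParser {
    // Each grammar rule produces one kind of AST value.
    static final class Num {
        final int value;
        Num(int value) { this.value = value; }
    }

    static final class Sum {
        final List<Num> terms = new ArrayList<>();
    }

    private final String text;
    private int pos = 0;

    SumParser(String text) { this.text = text; }

    // Rule "sum": matches the whole input and assembles the Num values into a Sum.
    Sum parseSum() {
        Sum sum = new Sum();
        sum.terms.add(parseNumber());
        skipSpaces();
        while (pos < text.length() && text.charAt(pos) == '+') {
            pos++; // consume '+'
            sum.terms.add(parseNumber());
            skipSpaces();
        }
        if (pos != text.length()) {
            throw new IllegalArgumentException("Unexpected input at offset " + pos);
        }
        return sum;
    }

    // Rule "number": matches a run of digits and produces a Num from the matched text.
    private Num parseNumber() {
        skipSpaces();
        int start = pos;
        while (pos < text.length() && Character.isDigit(text.charAt(pos))) pos++;
        if (start == pos) {
            throw new IllegalArgumentException("Expected a number at offset " + pos);
        }
        return new Num(Integer.parseInt(text.substring(start, pos)));
    }

    // Whitespace is skipped and never reaches the AST.
    private void skipSpaces() {
        while (pos < text.length() && Character.isWhitespace(text.charAt(pos))) pos++;
    }
}
```

Calling new SumParser("1 + 22 +3").parseSum() yields a Sum with three Num terms. Note that the spacing of the input is already gone at this point.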

But when it comes to turning the AST back into code, most parsers leave you to start from scratch and write new code that expresses the inverse of those rules. That process produces a new file that works the same but looks different, because it omits comments and whitespace, information that is not preserved in the AST.
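The loss is easy to reproduce with a conventional parse-then-regenerate pair. The sketch below assumes a made-up "key = value" file format with '#' comments; the class LossyRoundTrip and its methods are hypothetical and only illustrate the problem, not any particular parser.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical example: a conventional parser plus a hand-written generator.
class LossyRoundTrip {
    // Parse "key = value" lines into an ordered map (standing in for the AST).
    // Blank lines, comments, and the original spacing are discarded here.
    static Map<String, String> parse(String source) {
        Map<String, String> ast = new LinkedHashMap<>();
        for (String line : source.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) continue; // not kept in the AST
            int eq = trimmed.indexOf('=');
            if (eq < 0) throw new IllegalArgumentException("Expected key = value: " + line);
            ast.put(trimmed.substring(0, eq).trim(), trimmed.substring(eq + 1).trim());
        }
        return ast;
    }

    // The inverse of parse, written separately: it can only emit one canonical layout.
    static String generate(Map<String, String> ast) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> e : ast.entrySet()) {
            out.append(e.getKey()).append(" = ").append(e.getValue()).append("\n");
        }
        return out.toString();
    }
}
```

Given the input "# server settings\nhost   =  example.org\n\nport=8080\n", generate(parse(input)) parses back to the same map, but the comment, the blank line, and the original spacing are gone from the regenerated text.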

A Parselets grammar contains a mapping between the grammar and the model objects, so it can produce the AST directly from the code. Additionally, it listens for changes made to the AST and updates the code incrementally. Since Parselets was written, the paper on Object Grammars has been published; it uses a similar approach to generate the AST directly from the grammar. Object Grammars does not appear to handle formatting incrementally.
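One way to picture incremental updating is an AST node that remembers which span of the source it came from, so that changing the node rewrites only that span and leaves everything else, comments and spacing included, untouched. The sketch below is a simplified illustration under that assumption. TrackedDocument and NumberNode are hypothetical names, the "grammar" is just "every run of digits is a number", and none of this reflects how Parselets is actually implemented.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: nodes keep their source offsets, and a setter splices the text.
class TrackedDocument {
    final StringBuilder text;
    final List<NumberNode> numbers = new ArrayList<>();

    // A numeric literal that knows where its text lives: [start, end).
    final class NumberNode {
        int start;
        int end;

        NumberNode(int start, int end) {
            this.start = start;
            this.end = end;
        }

        int getValue() {
            return Integer.parseInt(text.substring(start, end));
        }

        // Changing the node updates only its own span of the document.
        void setValue(int newValue) {
            String replacement = Integer.toString(newValue);
            int delta = replacement.length() - (end - start);
            text.replace(start, end, replacement);
            end += delta;
            // Nodes that start later in the file shift by the change in length.
            for (NumberNode other : numbers) {
                if (other.start > start) {
                    other.start += delta;
                    other.end += delta;
                }
            }
        }
    }

    // Simplification: treat every run of digits in the source as a number node.
    TrackedDocument(String source) {
        text = new StringBuilder(source);
        int i = 0;
        while (i < source.length()) {
            if (Character.isDigit(source.charAt(i))) {
                int start = i;
                while (i < source.length() && Character.isDigit(source.charAt(i))) i++;
                numbers.add(new NumberNode(start, i));
            } else {
                i++;
            }
        }
    }
}
```

For the source "width = 10   /* pixels */\nheight = 5\n", calling numbers.get(0).setValue(1280) changes only the literal: the comment and the spacing around it survive, and the second node still reads 5 from its shifted position.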

Parselets also supports error recovery that is robust enough for use in an IDE, where it enables partial file parsing and completion with reasonable error highlighting.
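To show what error recovery means in general, the sketch below uses statement-level resynchronization: when one statement fails to parse, the error is recorded with its offset and parsing resumes after the next ';'. This is a common textbook technique swapped in for illustration. The class RecoveringParser and its toy "name = digits;" statement format are hypothetical, and the source does not say which recovery strategy Parselets uses.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of statement-level recovery (not the Parselets algorithm).
class RecoveringParser {
    final List<String> statements = new ArrayList<>(); // statements that parsed cleanly
    final List<String> errors = new ArrayList<>();     // one message per broken statement

    void parse(String source) {
        int offset = 0; // offset of the current statement's text within the source
        for (String part : source.split(";", -1)) {
            String stmt = part.trim();
            if (!stmt.isEmpty()) {
                if (stmt.matches("[a-z]+\\s*=\\s*[0-9]+")) {
                    statements.add(stmt);
                } else {
                    // Record the problem and keep going instead of aborting the parse.
                    errors.add("offset " + offset + ": cannot parse '" + stmt + "'");
                }
            }
            offset += part.length() + 1; // skip past the ';' to resynchronize
        }
    }
}
```

Parsing "a = 1; b = ; c = 3;" with this sketch keeps the statements "a = 1" and "c = 3" and reports a single error at offset 6, which is the kind of result an editor needs in order to highlight one broken statement while still understanding the rest of the file.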

It can also diff one version of a file against the next and patch the AST with only the changes, which is needed for fast editing of large files.
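The first step of such a patch is finding the part of the file that actually changed. A simple way to do that, assumed here purely for illustration, is to strip the common prefix and suffix of the two versions; only nodes overlapping the remaining span would then need to be reparsed. The class ChangedRegion is hypothetical and is not taken from Parselets.

```java
// Hypothetical sketch: locate the changed span between two versions of a file.
class ChangedRegion {
    final int start;  // first offset at which the versions differ
    final int oldEnd; // end (exclusive) of the changed span in the old text
    final int newEnd; // end (exclusive) of the changed span in the new text

    ChangedRegion(int start, int oldEnd, int newEnd) {
        this.start = start;
        this.oldEnd = oldEnd;
        this.newEnd = newEnd;
    }

    static ChangedRegion between(String oldText, String newText) {
        int max = Math.min(oldText.length(), newText.length());

        // Length of the common prefix.
        int prefix = 0;
        while (prefix < max && oldText.charAt(prefix) == newText.charAt(prefix)) {
            prefix++;
        }

        // Length of the common suffix, not allowed to overlap the prefix.
        int suffix = 0;
        while (suffix < max - prefix
                && oldText.charAt(oldText.length() - 1 - suffix)
                   == newText.charAt(newText.length() - 1 - suffix)) {
            suffix++;
        }

        return new ChangedRegion(prefix, oldText.length() - suffix, newText.length() - suffix);
    }
}
```

For an old version "a = 1;\nb = 2;\nc = 3;\n" and a new version in which the second statement reads "b = 25;", the region is an insertion of the single character "5" at offset 12, so the statements for a and c can be left alone.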

The features behind Parselets power a good deal of the IntelliJ plugin. Any language written in Parselets can use those features, which makes it much easier to build IDEs for new languages. Because these features exist in StrataCode, they can be ported to other Java-based IDE frameworks such as Eclipse or NetBeans in the future.