Tree transformations via XPath and S-expressions

I’ve finally have the right tools to now implement transforms over an Antlr parse tree.

The first part of a transform is identifying what nodes in the tree that are going to be replaced. It turns out that the best tool to do that is an XPath engine, which I’ve rewritten in C# from Java over the last month.

The other part of a transform–and far simpler to implement–is a way to express a tree that is created and spliced into an existing parse tree. Here, the work I did in Piggy for S-expressions comes in handy. For example, the expression “( ruleAltList ( labeledAlt ( alternative { child } )))” identifies the Antlr parse tree to create, and splices in a “child” node into a created “alternative” node. Note, this is a slightly different usage for “S-expressions” that you may be used to, in which a node is an unnamed pair, but conveys the same purpose.

With XPath and S-expressions, I can now rewrite all the transforms that I hardcoded in C# for Antlrvsix. The code implementing both parts is here, but I will be forking this code and placing it under in the Antlrvsix source until I see a need to place this code in a separate Nuget package.

At this moment, I’m not exactly sure what language and control structures to add on top of XPath and S-expressions. For now, these two tools should suffice along with C# to glue the pieces together in order to modify an Antlr parse tree. The other consideration is having intermediate results of an XPath expression. For example, I may want to get all ruleAltList’s but continue down the tree for a particular child. I’ve fixed a bug in the Eclipse XPath library to allow an intermediate result to be used as context for another XPath expression. But I might consider extending XPath to bind the results of an intermediate result into a C# variable.

One other note–Why am I not looking at term rewriting systems? I have. The problem is that they are not practical for two reasons: (1) integrating it with Antlr parse trees would not be easy; (2) most do not express manipulations directly on a tree. XSLT is one example. Here, the language isn’t specifying tree rewrites, but the construction of an entirely new tree. I also looked at TXL. Here, the language isn’t about trees, but term rewrites in the target language. I would need to convert the Antlr grammar into TXL grammar syntax. In all these systems, I would need to fit in Antlr parse trees into the framework. Again, all I want is to manipulate trees.

What of Piggy? Unfortunately, Piggy is not a tree editing library. While it recognizes tree nodes, the problem is that it then executes user code that performs an output on a DFS traversal of the tree. Again, all I want is to manipulate an existing tree, then do something later with that tree.

–Ken

Posted in Tip