A command-line approach to transforms of Antlr grammars

This is just a note to myself regarding some ideas on a command-line tool for Antlr grammars. It should be clear to anyone: grammars, especially Antlr grammars, are first-class objects that can be manipulated and changed to improve readability or performance. Antlrvsix now incorporates almost two dozen transformations, ranging from unfold, fold, sort rules, remove useless rules, reformat rules, split grammars, merge grammars, etc.

The question is how best to structure these transformations for the extension, and more importantly, in a completely automated manner whereby I can read the grammar from a web page containing a spec of a particular language like Java, C#, or what have you.

The main problem in the tool is how to identify the parts of the grammar that I want to change with a transform. Here, it looks like there are two possibilities: (1) a line/column number range; or (2) a handy XPath expression(s) to identify a point(s) or range(s) in the grammar.

So, to make changes to the grammar, I could envision something like this:

cat Grammar.g4 | trash "//ruleSpec[/RULE_SPEC = 'e'] => unfold" | trash "=> split-grammar" 1> GrammarParser.g4 2> GrammarLexer.g4

Afterwards, I can write an online version of the Antlr transformation tool.

Also to note to myself this article by Figueira et al. is the only one that I found that describes the denotational semantics of XPath in an unambiguous manner. Can fold be implemented using XPath, where one entire sub-tree (implemented I suppose as a node-set) can be compared to another sub-tree?


Posted in Tip