Series on program transformation systems: ROSE (2000)

This is the next entry in the series on program transformation systems. This article describes ROSE, a compiler system for C, C++, Java, Python, and PHP source code. ROSE was developed at Lawrence Livermore National Laboratory (LLNL). It has been in development since 2000 and is still actively developed.

ROSE can be divided into several parts:

  1. an AST representation, which is defined by SAGE III. Node types in the AST are predefined SAGE III classes, such as SgGlobal, SgIfStmt, SgWhileStmt, and SgVariableDeclaration (see
    http://www.extreme.indiana.edu/sage/Doc-sage2/Sage_Classes.html). The AST generated by a language-specific frontend is converted into this generalized, universal AST representation. Users can create attributes for a node, but generally, one works with these hardwired types.
  2. an AST traversal API. The traversal API provides several types of standard AST traversals: pre-order, post-order, and a “classic object-oriented visitor” which visits all nodes of an entire tree in no particular order.
  3. an API for querying AST nodes. The main routine for querying tree nodes is NodeQuery::querySubTree(). The query hands back generic AST nodes, each of which you must then cast to its specific node type in order to use the accessor functions for that node’s attributes and children.
  4. an API to rewrite the AST, e.g., LowLevelRewrite::insert().
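
To make the four parts above concrete, here is a minimal, self-contained C++ sketch of the same ideas: hard-wired node classes, pre-order and post-order traversals, a find-by-kind query whose results must be downcast, and a low-level insertion. It does not use ROSE. Every name in it (Node, Global, IfStmt, WhileStmt, VarDecl, findByKind, insertChild, and so on) is a hypothetical stand-in that I made up for illustration, and the real SAGE III classes and ROSE routines have different signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for hard-wired AST node classes (NOT ROSE's real types).
struct Node {
    std::vector<std::unique_ptr<Node>> children;
    virtual ~Node() = default;
    virtual std::string kind() const = 0;
};
struct Global : Node { std::string kind() const override { return "Global"; } };
struct IfStmt : Node { std::string kind() const override { return "IfStmt"; } };
struct WhileStmt : Node { std::string kind() const override { return "WhileStmt"; } };
struct VarDecl : Node {
    std::string name;  // type-specific data, reachable only through a VarDecl*
    explicit VarDecl(std::string n) : name(std::move(n)) {}
    std::string kind() const override { return "VarDecl"; }
};

// Pre-order traversal: record a node before visiting its children.
void preOrder(const Node& n, std::vector<std::string>& out) {
    out.push_back(n.kind());
    for (const auto& c : n.children) preOrder(*c, out);
}

// Post-order traversal: record a node after visiting its children.
void postOrder(const Node& n, std::vector<std::string>& out) {
    for (const auto& c : n.children) postOrder(*c, out);
    out.push_back(n.kind());
}

// Query: collect every node of a given kind; the results are generic Node pointers.
void findByKind(Node& root, const std::string& kind, std::vector<Node*>& hits) {
    if (root.kind() == kind) hits.push_back(&root);
    for (auto& c : root.children) findByKind(*c, kind, hits);
}

// The caller has to downcast each generic hit before it can read VarDecl::name.
std::vector<std::string> declaredNames(Node& root) {
    std::vector<Node*> hits;
    findByKind(root, "VarDecl", hits);
    std::vector<std::string> names;
    for (Node* n : hits) {
        if (auto* decl = dynamic_cast<VarDecl*>(n)) names.push_back(decl->name);
    }
    return names;
}

// Rewrite: insert a new child into a parent at the given position.
void insertChild(Node& parent, std::size_t index, std::unique_ptr<Node> newNode) {
    assert(index <= parent.children.size());
    parent.children.insert(
        parent.children.begin() + static_cast<std::ptrdiff_t>(index),
        std::move(newNode));
}

// Sample tree: Global -> [VarDecl x, IfStmt -> [WhileStmt -> [VarDecl y]]]
std::unique_ptr<Node> buildSample() {
    std::unique_ptr<Node> loop(new WhileStmt());
    loop->children.push_back(std::unique_ptr<Node>(new VarDecl("y")));
    std::unique_ptr<Node> branch(new IfStmt());
    branch->children.push_back(std::move(loop));
    std::unique_ptr<Node> root(new Global());
    root->children.push_back(std::unique_ptr<Node>(new VarDecl("x")));
    root->children.push_back(std::move(branch));
    return root;
}
```

Calling declaredNames() on the sample tree visits both declarations, but only because the helper knows in advance that a "VarDecl" hit can be cast to VarDecl. That is the pattern item 3 describes: the query is generic, and the useful data sits behind a type-specific cast.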

ROSE can be downloaded and run in a number of ways. The easiest way to run ROSE is to use the virtual machine image that LLNL provides. Oddly, LLNL attaches a disclaimer suggesting that people within LLNL may not be able to get the virtual machine (“WARNING: LLNL users may not be able to download …”)! Note: decompress the .gz file first, then untar the result. You can use VirtualBox to run the virtual machine by selecting Ubuntu-ROSE-Demo-V3.vmdk. The virtual machine disk is over 20GB, so make sure you have plenty of free disk space.

Observations and notes

  • ROSE is a low-level interface for performing program transformations on an AST. Each node in the AST is represented with a hard-wired C++ type, which makes the representation inflexible.
  • The “query language” that the authors mention is a C++ API that appears to offer only a simple “find nodes of a given type” operation. It isn’t clear how one would describe a complex AST structure to match, which Piggy supports.
  • Each node must be cast to its specific SAGE III AST node type in order to access the attributes for that node. I don’t think one can use a string name to access an attribute on a generic SAGE III AST node. (Piggy does.)
  • There is no generalized tree query language as in Coccinelle or Piggy.
  • ROSE is written in more than 2 million lines of C++ code.
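
To illustrate the point about queries in the notes above, here is a small C++ sketch of what a structural query looks like when the only primitive available is “find nodes of a given kind”. This is my own assumption-laden illustration, not ROSE, Coccinelle, or Piggy code: the generic Node struct, findByKind, and ifsContainingWhile are all hypothetical names. The query “every if statement that contains a while loop somewhere beneath it” has to be assembled by hand from nested find-by-kind calls.

```cpp
#include <string>
#include <vector>

// Hypothetical generic node, used only to keep the sketch short.
struct Node {
    std::string kind;
    std::vector<Node> children;
};

// The single primitive assumed here: collect nodes of one kind in a subtree.
void findByKind(const Node& root, const std::string& kind,
                std::vector<const Node*>& hits) {
    if (root.kind == kind) hits.push_back(&root);
    for (const Node& c : root.children) findByKind(c, kind, hits);
}

// Structural query written by hand: if statements with a while loop below them.
std::vector<const Node*> ifsContainingWhile(const Node& root) {
    std::vector<const Node*> ifs;
    findByKind(root, "IfStmt", ifs);

    std::vector<const Node*> result;
    for (const Node* candidate : ifs) {
        // Search strictly below the candidate, so start from its children.
        std::vector<const Node*> loops;
        for (const Node& c : candidate->children) findByKind(c, "WhileStmt", loops);
        if (!loops.empty()) result.push_back(candidate);
    }
    return result;
}
```

Every additional structural condition (a particular child position, a sibling relationship, a named variable) means more hand-written loops of this kind, which is the gap a dedicated tree query language is meant to close.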

References

Quinlan, D., 2000. ROSE: Compiler support for object-oriented frameworks. Parallel Processing Letters, 10(02n03), pp. 215-226.
