Parsing with Augmented Transition Networks and computing lookahead

The next step in the development of my LSP server for Antlr involves support for code completion. There is an API written by Mike Lischke that computes the lookahead sets from the parser tables and input, but it turns out to have a bug in some situations. So, I’ll now describe the algorithms I wrote to compute the lookahead set for Antlr parsers. I’ll start first with a few definitions, then go over the three algorithms that do the computation. An implementation of this is available in C# in the Antlrvsix extension for code completion and another is provided by default in the Antlr Net Core templates I wrote for syntax error messages.

Continue reading

Posted in Tip

New — Antlr4BuildTasks.Templates

For those interested in creating an Antlr4 program using C#, I wrote a dotnet package and uploaded it to Nuget. There is similar functionality in the VS2019 extension AntlrVSIX, but I am starting to move towards a Language Server Protocol client/server implementation for Antlr. This package capitalizes on the work I did with Antlr4BuildTasks supporting MSBUILD builds using the Java Antlr tool v4.7.2.

Continue reading

Posted in Tip

Getting VS2019 working with the Eclipse Java LSP server

Continuing with my work in making Antlrvsix a Language Server Protocol server implementation, I created an extension for Visual Studio 2019 that uses the client API Microsoft.VisualStudio.LanguageServer.Client with the Eclipse Java Language Server Protocol implementation. This extension follows the steps outlined in an old article on creating LSP clients in Visual Studio. The extension source code is here. The code has been updated so that the only requirement is that you have the Java runtime downloaded and installed, and JAVA_HOME set to the top-level directory of the Java runtime. The code will prompt you for the path and warn you that it isn’t set properly.

Even as noted in the old MS documentation page, many of the client features are enabled, e.g., go to def, find all refs, reformat, hover, and typing completions. What is missing is building and debugging. But it is very usable.

Posted in Tip

A comparison of Antlr grammars for parsing Java

This is an article about my ongoing research regarding the state of Antlr grammars for parsing Java. Some of the tests are taking weeks of computing time, so the results are preliminary.

Antlr is a popular LL(*) parser generator for recognizers of C#, Java, and many other programming languages. For Java, there are three grammars available on the Antlr grammar website: Java, Java8, and Java9. If you are a developer who hasn’t followed the maintenance history, it is unclear which grammar one should choose. Some of the changes that have been made to one grammar have not been applied to the other grammars. The basis for all grammars, however, is The Java Language Specification. Unfortunately, the latest available now is version 13 making all of them out of date.

Continue reading

Posted in Tip

Notes on Language Server Protocol

Surprise! I’ve been implementing something like the Language Server Protocol (LSP) with AntlrVSIX. The LSP is an abstraction for an editor (ed) using JSON as the lingua franca between the editor and a GUI client. The editor server: offers the persistence of code files; organizes files in a “workspace”; provides an API for edits, tagging, go to def, find all refs, reformat, code completion, defs for tooltips, etc. LSP is a step in the right direction because it separates an editor server backend from a GUI frontend, which is what AntlrVSIX is about.

Continue reading

Posted in Tip

Adding “workspaces” to AntlrVSIX

After a lot of hemming and hawing, I am now adding the concept of “workspaces” to AntlrVSIX. By this I mean an equivalent of workspaces that is defined in Roslyn. In AntlrVSIX, a Document will be a source file; a Project will be a collection of Documents; a program will be a Workspace, a collection of Projects. Properties on a Project or Document are copied to the equivalent AntlrVSIX object as a property list.

Continue reading
Posted in Tip

Updates to AntlrVSIX

I’ve been busy extending AntlrVSIX to include new features and correct bugs. Some of the things added with the 40 or so changes since the beginning of September are:

  • Persistent option settings;
  • Tagging of Java symbols using a symbol table;
  • Improved reformatter for languages;
  • Improved symbol click-on highlighter;
  • Improved co-existence with other extensions.

Continue reading

Posted in Tip

Adding in a symbol table into AntlrVSIX

I’m now working on adding in a full symbol table implementation into AntlrVSIX. This will help make the extension much more powerful with tasks such as tagging and navigating to defs.

For better or worse, I’m starting with the implementation in Antlr Symtab. It looks like it’ll work out pretty well, but it is missing certain classes, and needs a little polishing. For example, VariableSymbol is the base class for FieldSymbol and ParameterSymbol, but there isn’t a class for variables defined and referenced in a method body. This makes it somewhat difficult to distinguish between a field and a local variable in a block.

Also, for better or worse, I’ve added Mono.Cecil into the symbol table code. I’m not sure whether I’ll need this, but it’s there just in case. Mono contains a reader of PE files, which might be useful. But, what I’m really thinking of is the equivalent over in Java and other languages.

Posted in Tip