Towards a Visual Studio IDE extension supporting any programming language

As I pointed out before, it now seems possible to write a VS extension that supports any programming language. I’ve been updating AntlrVSIX to take a description of the syntax of a programming language as input and tag a file of that language in the editor. I now have this working for Antlr and Java, and plan to try a dozen or so more languages.

At issue is identifying the commonalities among programming languages, such as defining and applied occurrences, and deciding how the editor should use them. Unfortunately, I lost an old textbook on programming languages I used years ago.

In addition, some other extensions tag the same programming language that my extension is trying to tag, which results in multiple competing and conflicting classifiers and formatters. I solved this problem by changing the IContentType associated with a buffer to one that I define for the extension.
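The content-type switch described above could look roughly like the following. This is a minimal sketch, not the AntlrVSIX code: the class name, the method name, and the content type name "MyExtensionContentType" are hypothetical, and it assumes the caller already holds an `IContentTypeRegistryService` (for example, obtained through a MEF import).

```csharp
using Microsoft.VisualStudio.Text;
using Microsoft.VisualStudio.Utilities;

internal static class ContentTypeSwitcher
{
    // Hypothetical name for the extension-owned content type.
    private const string OwnContentTypeName = "MyExtensionContentType";

    // Re-tags the buffer with the extension's own content type, so that
    // taggers exported for the buffer's previous content type no longer apply.
    public static void ClaimBuffer(ITextBuffer buffer, IContentTypeRegistryService registry)
    {
        // Look up the content type; register it at run time if it is missing.
        // Deriving from "code" is an assumption made for this sketch.
        IContentType own = registry.GetContentType(OwnContentTypeName)
            ?? registry.AddContentType(OwnContentTypeName, new[] { "code" });

        if (!buffer.ContentType.IsOfType(OwnContentTypeName))
        {
            // The second argument is an optional edit tag; null is accepted.
            buffer.ChangeContentType(own, null);
        }
    }
}
```

The sketch needs the Visual Studio SDK editor assemblies, so it only compiles inside an extension project.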

It may not look like much, but here is how Java is formatted by Visual Studio by default, and how my extension formats it (first attempt to get something working).

Current VS 2019 “code++.java” formatting. Variable “token” labeled erroneously in comments.
Formatting from the Antlr extension. Squiggles indicate no defining occurrence in the file, but should be found via imported libraries.
Posted in Tip

Managed Extensibility Framework — a Static Type System

Now that I have my AntlrVSIX extension working for Visual Studio IDE, I have set my sights on a “meta-language” editor. This picks up the idea of an add-in that supports Antlr, and brings it forward to an add-in that supports any language. For this to work, the grammar for the language would be specified in Antlr. I would also need to add information on which parts of the parse tree correspond to which classification types for tagging in the editor. A corpus would be needed for Codebuff to work on reformatting a file containing code in the language. All seems doable. Or is it?

This idea, however, isn’t quite possible with the current framework of Visual Studio Extensions, at least not dynamically. I was hoping to have the extension read this information and act on it at runtime.

As I pointed out in a message on Gitter (and never received a reply), the problem with this idea is that I use MEF (Managed Extensibility Framework), and virtually all the samples that I can find are based on MEF. Unfortunately, MEF works on a statically defined type system, i.e., the types are created up front and compiled into a .vsix file. You must write C# attributes to describe the language the extension implements. E.g., this tagger provider and the file suffix associated with the language are hardwired at compile time:

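    using System.ComponentModel.Composition;     // Export
    using Microsoft.VisualStudio.Text.Tagging;   // ITaggerProvider, ClassificationTag, TagType
    using Microsoft.VisualStudio.Utilities;      // ContentType, Name, BaseDefinition, FileExtension

    [Export(typeof(ITaggerProvider))]
    [ContentType(AntlrVSIX.Constants.ContentType)]
    [TagType(typeof(ClassificationTag))]
    internal sealed class AntlrClassifierProvider : ITaggerProvider
    {
        // Declares the content type; the name is fixed at compile time.
        [Export]
        [Name(/* AntlrVSIX.Constants.LanguageName */ "Antlr")]
        [BaseDefinition("code")]
        internal static ContentTypeDefinition AntlrContentType = null;

        // Maps the file extensions to the content type, also at compile time.
        [Export]
        [FileExtension(/* AntlrVSIX.Constants.FileExtension */ ".g;*.g4")]
        [ContentType(AntlrVSIX.Constants.ContentType)]
        internal static FileExtensionToContentTypeDefinition AntlrFileType = null;
        …
    }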

I can, of course, generate a per-language extension via a template, then build the extension. But I was hoping for something that would be more dynamic. Is there something more modern than MEF for implementing VS extensions, something that isn’t based on a static type system framework? Essentially, no. MEF is integrated very tightly with Visual Studio IDE.

I don’t understand why ITaggerProvider couldn’t simply require methods Name(string), BaseDefinition(string), and FileExtension(string), which could all be called during initialization to determine what “contract” the ITaggerProvider supports.
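To make the idea concrete, here is a rough sketch of what a contract queried at run time might look like. Everything in it is hypothetical: `ILanguageContract`, `TableDrivenLanguage`, and `LanguageRegistry` are names invented for illustration and are not part of MEF or the Visual Studio SDK.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical contract: the host asks for these values during
// initialization instead of reading compile-time attributes.
public interface ILanguageContract
{
    string Name { get; }
    string BaseDefinition { get; }
    IReadOnlyList<string> FileExtensions { get; }
}

// A language described purely by data, e.g. loaded from a table at startup.
public sealed class TableDrivenLanguage : ILanguageContract
{
    public string Name { get; }
    public string BaseDefinition { get; }
    public IReadOnlyList<string> FileExtensions { get; }

    public TableDrivenLanguage(string name, string baseDefinition, IReadOnlyList<string> fileExtensions)
    {
        Name = name;
        BaseDefinition = baseDefinition;
        FileExtensions = fileExtensions;
    }
}

// Hypothetical host side: collects contracts and resolves a file to a language.
public sealed class LanguageRegistry
{
    private readonly Dictionary<string, ILanguageContract> _byExtension =
        new Dictionary<string, ILanguageContract>(StringComparer.OrdinalIgnoreCase);

    public void Register(ILanguageContract language)
    {
        foreach (string extension in language.FileExtensions)
        {
            _byExtension[extension] = language;
        }
    }

    // Returns the language registered for the file's extension, or null.
    public ILanguageContract Lookup(string fileName)
    {
        string extension = Path.GetExtension(fileName) ?? string.Empty;
        ILanguageContract language;
        return _byExtension.TryGetValue(extension, out language) ? language : null;
    }
}
```

With something like this, a new language would only need a new `TableDrivenLanguage` instance at startup, with no recompilation of the extension.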

Update, Aug 19 2019: As it turns out, it is possible to create the type system describing a language dynamically, without using MEF. Also, it seems one can create a tagger with a provider that doesn’t require the FileExtension to be specified. I’ve decided to try to enhance AntlrVSIX to test whether I can isolate the description of Antlr itself into a grammar and tables. If it works, I’ll write an extension that will support any language.


AntlrVSIX v2.0

Due to my work on Piggy and Campy, I’m extending my Antlr4 extension for Visual Studio in a number of ways. The plug-in hasn’t been updated for two years, so it is due for an update. In fact, there isn’t a single extension for Antlr that works with Visual Studio 2019, and Antlr.org removed Visual Studio from its list of developer tools for Antlr.

Changes

Targeting Visual Studio 2019 and 2017

I’m not sure where I read it, but “good practice” says I should only support the current version and one version previous.

Improved Tagging

I’ve added new tags for channels and modes, and the whole classification and tagging routines now check if the source has changed before reparsing, which improves performance.
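The "check before reparsing" idea can be sketched as a small cache in front of the parser. This is an illustration under stated assumptions, not the AntlrVSIX implementation: `ParseCache` is a hypothetical name, and it compares the full source text, whereas inside the editor one could probably compare snapshot version numbers instead.

```csharp
using System;

// Caches the last parse result and reparses only when the source text differs.
public sealed class ParseCache<TTree>
{
    private readonly Func<string, TTree> _parse;
    private string _lastText;
    private TTree _lastTree;
    private bool _hasTree;

    // Number of times the underlying parser actually ran.
    public int ParseCount { get; private set; }

    public ParseCache(Func<string, TTree> parse)
    {
        if (parse == null) throw new ArgumentNullException(nameof(parse));
        _parse = parse;
    }

    public TTree GetTree(string text)
    {
        // Reparse only on the first call or when the text has changed.
        if (!_hasTree || !string.Equals(_lastText, text, StringComparison.Ordinal))
        {
            _lastTree = _parse(text);
            _lastText = text;
            _hasTree = true;
            ParseCount++;
        }
        return _lastTree;
    }
}
```

Classifiers and taggers that are called repeatedly for the same unchanged buffer would then share one parse tree instead of each triggering a new parse.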

A menu for AntlrVSIX in Extensions

In Visual Studio 2019, the menus for extensions are where you should look: under “Extensions”. I have to admit that, previously, it was hard to figure out the UI for AntlrVSIX, which was hidden under various menus or available through a right-click context menu. That’s basically bad UI design.

Although the right-click context menu for AntlrVSIX is still available, it mirrors exactly what you see in the main menu under “Extensions -> AntlrVSIX”.

Navigation to next and previous rules in a grammar file

I’m a big fan of the old-style Emacs and even older Vi! Back then, UI was keyboard-oriented. It was much more efficient than moving around this stupid mouse, pointing to something I can hardly see on the screen.

I’ve added a few new ways to navigate around a grammar file with some shortcuts. “Next rule” jumps to the next rule in a grammar; “Previous rule” jumps to the previous rule in the grammar; “Go to Visitor” navigates to a visitor method for a tree node corresponding to a non-terminal.
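As an illustration of how “Next rule” and “Previous rule” could be computed, here is a minimal sketch. It assumes the parser has already produced the character offsets at which each rule starts, in ascending order; `RuleNavigator` and its methods are hypothetical names, not the extension’s actual code.

```csharp
using System.Collections.Generic;

public static class RuleNavigator
{
    // Returns the start offset of the first rule beginning after the caret,
    // or -1 if the caret is in or past the last rule.
    public static int NextRule(IReadOnlyList<int> ruleStarts, int caret)
    {
        int lo = 0, hi = ruleStarts.Count;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (ruleStarts[mid] <= caret) lo = mid + 1; else hi = mid;
        }
        return lo < ruleStarts.Count ? ruleStarts[lo] : -1;
    }

    // Returns the start offset of the last rule beginning before the caret,
    // or -1 if there is none. From the middle of a rule this is the start
    // of that same rule; from a rule's start it is the preceding rule.
    public static int PreviousRule(IReadOnlyList<int> ruleStarts, int caret)
    {
        int lo = 0, hi = ruleStarts.Count;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (ruleStarts[mid] < caret) lo = mid + 1; else hi = mid;
        }
        return lo > 0 ? ruleStarts[lo - 1] : -1;
    }
}
```

Binary search keeps each jump cheap even in large grammars, at the cost of having to keep the offset list sorted and up to date after edits.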

Generation of Visitors and Listeners

Go to visitor/listener jumps to the C# code for a grammar symbol. If the method doesn’t exist, then the Visitor or Listener classes and methods are generated.

Options for the extension

An options dialog box for extension-specific parameters is now included. It is needed for things like where to find the “corpus” of examples of formatted grammars that Codebuff will use to format your grammar.

Performance

While AntlrVSIX is the only Antlr extension for VS 2019, it’s rarely used compared to several older (and, in my opinion, less useful) extensions. AntlrVSIX has had only one review, which says the performance sucks. (Thanks, dude, for not saying *what* specifically you were trying to do that was slow. I’d never have you on my QA team: not a helpful description.)

Performance has taken more of a front seat with this release. Before parsing, I check whether the code buffer has changed.

Significant cleanup and bug fixes in the source code

I spent a lot of time cleaning up the source code. When I first wrote the extension, I had no clue how to modify the various features of the UI, because the documentation for Visual Studio extensions is absolutely terrible. I now have a better command of what is going on and fixed a lot of the code.

Migration from packages.config to PackageReference builds

It’s absolutely appalling that the examples Microsoft provides for extensions are so poor and out of date. Many still use the “packages.config” file to list the NuGet dependencies of a C# project. I’ve updated AntlrVSIX to use the latest format, PackageReference, which required the tool to be migrated to .NET Framework 4.7.2.

Integrating builds with Antlr4BuildTasks

You can’t really have an extension for a language like Antlr if you don’t include some way of actually *building* the project that uses the language. After first writing AntlrVSIX, I modified Antlr4cs, a wrapper for a C# version of the Antlr tool, to be just a wrapper for the standard Antlr4 Java tool. So, it now generates the parsers/lexers in the IDE.

In addition, Antlr4BuildTasks had a bug where one couldn’t change the build mode of a .g4 file to none. This is required for multi-file grammars that use the “import” statement in Antlr.

Fixing Intellisense

I fixed the Intellisense for the extension. The tooltip now works when hovering over a single-character grammar symbol. Command completion now offers a list of grammar symbols in the grammar.

*****

The work on the extension has been going on for two weeks and should be finished in the next week.


Porting extensions to Visual Studio 2019

I was recently trying to use my Visual Studio extension for Antlr in Visual Studio 2019, when I found that it just wasn’t working anymore. I couldn’t even install the extension because it wouldn’t show up in the search for the plug-in. In fact, there weren’t any extensions available for “Antlr” for Visual Studio 2019! I guess I hadn’t ported the add-in to VS 2019, so I decided to work on that.


New directions for Piggy patterns

Up to now, I have assumed for the Piggy transformation system that regular expressions and NFA recognizers can be used to match arbitrary ASTs. After all, a lot of research has been published on tree recognizers, including XPath, TreeRegex, and Tregex, which are NFA based. With all this, I lost sight of the problem of trying to match subtrees within an AST. Unfortunately, pattern matching of ASTs is not regular, and so cannot be recognized by an NFA. This blog entry goes over why NFAs cannot be used, and how Piggy patterns should work.


Rewriting the pattern matching engine — part 2

Patterns in Piggy are regular expressions of parenthesized expressions that represent trees. The conversion of the regular expressions to a finite state automaton is easy (via Thompson’s Construction), but the resulting automaton for the pattern may not work for an important reason: patterns that specify attributes and children of an AST node have an implicit “…” between each attribute or child pattern. While it’s possible to introduce additional states to recognize “…” between each attribute or child node in the pattern, the resulting automaton is ambiguous. For every input AST node, any attribute or child AST node can mismatch a “sub-pattern”, but can still match the complete pattern.
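To illustrate what the implicit “…” between child patterns means, here is a deliberately simplified sketch. It is not Piggy’s engine: it ignores attributes and regular-expression operators, reuses one hypothetical `Node` type for both patterns and trees, and assumes that the child patterns only have to match some children of the node in the same left-to-right order, with any number of unmatched children in between.

```csharp
using System.Collections.Generic;

// Hypothetical tree node, used here for both patterns and ASTs.
public sealed class Node
{
    public string Kind { get; }
    public IReadOnlyList<Node> Children { get; }

    public Node(string kind, params Node[] children)
    {
        Kind = kind;
        Children = children;
    }
}

public static class GapMatcher
{
    // True if the pattern matches the tree, allowing unmatched children
    // before, between, and after the children that the pattern names.
    public static bool Matches(Node pattern, Node tree)
    {
        if (pattern.Kind != tree.Kind) return false;

        int next = 0; // index of the next tree child still available
        foreach (Node subPattern in pattern.Children)
        {
            // Skip children that mismatch this sub-pattern (the implicit gap).
            while (next < tree.Children.Count && !Matches(subPattern, tree.Children[next]))
            {
                next++;
            }
            if (next == tree.Children.Count) return false;
            next++; // consume the child that matched
        }
        return true;
    }
}
```

A greedy left-to-right scan is enough for a yes/no answer in this simplified setting. Note that a child skipped for one sub-pattern does not make the whole match fail, which is the behavior that is awkward to express in an automaton.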


Speeding up this website

A few months ago, I started to migrate some of my websites to DigitalOcean because the cost of a virtual server is $5/mo. So, I moved CodingGorilla.com to the new host. (Note, a long story, but the name came from an old boss, who saw I have the patience of a saint and attention to minute details, the traits of any good programmer.)

Unfortunately, the website has been painfully slow because I followed the advice that you should keep your database and web server on separate hardware. This may be fine for large corporations which have their servers on a fast LAN, but it was the wrong advice for a blog. I moved the MySQL database to the web server, and now the website works ~100x faster. Adage for the day: Believe half of what you see and nothing of what you hear.

Incidentally, the tool which I used to find this problem is Query Monitor by John Blackbourn (plugin page, website). It flags the slow queries and places the runtime for each query in a table, which you can then copy and paste into Excel to compute the total time required for the queries.
