Monday, January 8, 2018

Syntax highlighting Nix expressions in mcedit

The year 2017 has passed and 2018 has now started. For quite a few people, this is a good moment for reflection (as I have done in my previous blog post) and to think about new year's resolutions. New year's resolutions are typically about adopting good new habits and rejecting old bad ones.

Orthodox file managers



One of my unconventional habits is that I like orthodox file managers and that I extensively use them. Orthodox file managers have a number of interesting properties:

  • They typically display textual lists of files, as opposed to icons or thumbnails.
  • They typically have two panels for displaying files: one source and one destination panel.
  • They may also have third panel (typically placed underneath the source and destination panels) that serves as a command-line prompt.

The first orthodox file manager I ever used was DirectoryOpus on the Commodore Amiga. For nearly all operating systems and desktop environments that I touched ever since, I have been using some kind of a orthodox file manager, such as:


Over the years, I have received many questions from various kinds of people -- they typically ask me what is so appealing about using such a "weird program" and why I have never considered switching to a more "traditional way" of working, because "that would be more efficient".

Aside from the fact that it may probably be mostly inertia, my motivating factors are the following:

  • Lists of files allow me to see more relevant and interesting details. In many traditional file managers, much of the screen space is wasted by icons and the spacing between them. Furthermore, traditional file managers may typically hide properties of files that I also typically want to know about, such as a file's size or modification timestamp.
  • Some file operations involve a source and destination, such as copying or moving files. In an orthodox file manager, these operations can be executed much more intuitively IMO because there is always a source and destination panel present. When I am using a traditional file manager, I typically have to interrupt my workflow to open a second destination window, and use it to browse to my target location.
  • All the orthodox file managers I have mentioned, implement virtual file system support allowing me to browse compressed archives and remote network locations as if they were directories.

    Nowadays, VFS support is not exclusive to orthodox file managers anymore, but they existed in orthodox file managers much longer.

    Moreover, I consider the VFS properties of orthodox file managers to be much more powerful. For example, the Windows file explorer can browse Zip archives, but Total Commander also has first class support for many more kinds of archives, such as RAR, ACE, LhA, 7-zip and tarballs, and can be easily extended to support many other kinds of file systems by an add-on system.
  • They have very powerful search properties. For example, searching for a collection of files having certain kinds of text patterns can be done quite conveniently.

    As with VFS support, this feature is not exclusive to orthodox file managers, but I have noticed that their search functions are still considerably more powerful than most traditional file managers.

From all the orthodox file managers listed above, Midnight Commander is the one I have been using the longest -- it was one of the first programs I used when I started using Linux (in 1999) and I have been using it ever since.

Midnight Commander also includes a text editor named: mcedit that integrates nicely with the search function. Although I have experience with half a dozen editors (such as vim and various IDEs, such as Eclipse and Netbeans), I have been using mcedit, mostly for editing configuration files, shell scripts and simple programs.

Syntax highlighting in mcedit


Earlier in the introduction I mentioned: "new year's resolutions", which may probably suggest that I intend to quit using orthodox file managers and an unconventional editor, such as mcedit. Actually, this is not something I am planning to :-).

In addition to Midnight Commander and mcedit, I have also been using another unconventional program for quite some time, namely: the Nix package manager since late 2007.

What I noticed is that, despite being primitive, mcedit has reasonable syntax highlighting support for a variety of programming languages. Unfortunately, what I still miss is support for the Nix expression language -- the DSL that is used to specify package builds and system configurations.

For quite some time, editing Nix expressions was a primitive process for me. To improve my unconventional way of working a bit, I have decided to address this shortcoming in my Christmas break by creating a Nix syntax configuration file for mcedit.

Implementing a syntax configuration for the Nix expression language


mcedit provides syntax highlighting (the format is described in the manual page) for a number of programming languages. The syntax highlighting configurations seem to follow similar conventions, probably because of the fact that programming languages influence each other a lot.

As with many programming languages, the Nix expression language has its own influences as well, such as Haskell, C, bash, JavaScript (more specifically: the JSON subset) and Perl.

I have decided to adopt similar syntax highlighting conventions in the Nix expression syntax configuration. I started by examining Nix's lexer module (src/libexpr/lexer.l):

  • First, I took the keywords and operators, and configured the syntax highlighter to color them yellow. Yellow keywords is a convention that other syntax highlighting configurations also seem to follow.
  • Then I implemented support for single line and multi-line comments. The context directive turned out to be very helpful -- it makes it possible to color all characters between a start and stop token. Comments in mcedit are typically brown.
  • The next step were the numbers. Unfortunately, the syntax highlighter does not have full support for regular expressions. For example, you cannot specify character ranges, such as [0-9]+. Instead you must enumerate all characters one by one:

    keyword whole \[0123456789\]
    

    Floating point numbers were a bit trickier to support, but fortunately I could steal them from the JavaScript syntax highlighter, since the formatting Nix uses is exactly the same.
  • Strings were also relatively simple to implement (with the exception of anti-quotations) by using the context directive. I have configured the syntax highlighter to color them green, similar to other programming languages.
  • The Nix expression language also supports objects of the URL or path type. Since there is no other language that I am aware of that has a similar property, I have decided to color them white, with the exception of system paths -- system paths look very similar to the C preprocessor's #include path arguments, so I have decided to color them red, similar to the C syntax highlighter.

    To properly support paths, I implemented an approximation of the regular expression used in Nix's lexer. Without full regular expression support, it is extremely difficult to make a direct translation, but for all my use cases it seems to work fine.

After configuring the above properties, I noticed that there were still some bits missing. The next step was opening the parser configuration (src/libexpr/parser.y) and look for any missing characters.

I discovered that there were still separators that I needed to add (e.g. parenthesis, brackets, semi-colons etc.). I have configured the syntax highlighter to color them bright cyan, with the exception of semi-colons -- I colored them purple, similar to the C and JavaScript syntax highlighter.

I also added syntax highlighting for the builtin functions (e.g. derivation, map and toString) so that they appear in cyan. This convention is similar to bash' syntax highlighting.

The implementation process of the Nix syntax configuration was generally straight forward, except for one thing -- anti-quotations. Because we only have a primitive lexer and no parser, it is impossible to have a configuration that covers all possibilities. For example, anti-quotations in strings that embed strings cannot be properly supported. I ended up with an implementation that only works for simple cases (e.g. a reference to an identifier or a file).

Results


The syntax highlighter works quite well for the majority of expressions in the Nix packages collection. For example, the expression for the Disnix package looks as follows:


The top-level expression that contains the package compositions looks as follows:


Also, most Hydra release.nix configurations seem to work well, such as the one used for node2nix:


Availability


The Nix syntax configuration can be obtained from my GitHub page. It can be used by installing it in a user's personal configuration directory, or by deploying a patched version of Midnight Commander. More details can be found in the README.

No comments:

Post a Comment