Friday, November 3, 2017

Creating custom object transformations with NiJS and PNDP

In a number earlier blog posts, I have described two kinds of internal DSLs for Nix -- NiJS is a JavaScript-based internal DSL and PNDP is a PHP-based internal DSL.

These internal DSLs have a variety of application areas. Most of them are simply just experiments, but the most serious application area is code generation.

Using an internal DSL for generation has a number of advantages over string generation that is more commonly used. For example, when composing strings containing Nix expressions, we must make sure that any variable in the host language that we append to a generated expression is properly escaped to prevent code injection attacks.

Furthermore, we also have to take care of the indentation if we want to output Nix expression code that should be readable. Finally, string manipulation itself is not a very intuitive activity as it makes it very hard to read what the generated code would look like.

Translating host language objects to the Nix expression language


A very important feature of both internal DSLs is that they can literally translate some language constructs from the host language (JavaScript or PHP) to the Nix expression because they have (nearly) an identical meaning. For example, the following JavaScript code fragment:

var nijs = require('nijs');

var expr = {
  hello: "Hello",
  name: {
    firstName: "Sander",
    lastName: "van der Burg"
  },
  numbers: [ 1, 2, 3, 4, 5 ]
};

var output = nijs.jsToNix(expr, true);
console.log(output);

will output the following Nix expression:

{
  hello = "Hello",
  name = {
    firstName = "Sander";
    lastName = "van der Burg";
  };
  numbers = [
    1
    2
    3
    4
    5
  ];
}

In the above example, strings will be translated to strings (and quotes will be escaped if necessary), objects to attribute sets, and the array of numbers to a list of numbers. Furthermore, the generated code is also pretty printed so that attribute set and list members have 2 spaces of indentation.

Similarly, in PHP we can compose the following code fragment to get an identical Nix output:

use PNDP\NixGenerator;

$expr = array(
  "hello" => "Hello",
  "name" => array(
    "firstName" => "Sander",
    "lastName => "van der Burg"
  ),
  "numbers" => array(1, 2, 3, 4, 5)
);

$output = NixGenerator::phpToNix($expr, true);
echo($output);

The PHP generator uses a number of clever tricks to determine whether an array is associative or sequential -- the former gets translated into a Nix attribute set while the latter gets translated into a list.

There are objects in the Nix expression language for which no equivalent exists in the host language. For example, Nix also allows you to define objects of a 'URL' and 'file' type. Neither JavaScript nor PHP have a direct equivalent. Moreover, it may be desired to generate other kinds of language constructs, such as function declarations and function invocations.

To still generate these kinds of objects, you must compose an abstract syntax tree from objects that inherit from the NixObject prototype or class. For example, we can define a function invocation to fetchurl {} in Nixpkgs as follows in JavaScript:

var expr = new nijs.NixFunInvocation({
    funExpr: new nijs.NixExpression("fetchurl"),
    paramExpr: {
        url: new nijs.NixURL("mirror://gnu/hello/hello-2.10.tar.gz"),
        sha256: "0ssi1wpaf7plaswqqjwigppsg5fyh99vdlb9kzl7c9lng89ndq1i"
    }
});

and in PHP as follows:

use PNDP\AST\NixExpression;
use PNDP\AST\NixFunInvocation;
use PNDP\AST\NixURL;

$expr = new NixFunInvocation(new NixExpression("fetchurl"), array(
    "url" => new NixURL("mirror://gnu/hello/hello-2.10.tar.gz"),
    "sha256" => "0ssi1wpaf7plaswqqjwigppsg5fyh99vdlb9kzl7c9lng89ndq1i"
));

Both of the objects in the above code fragments translate to the following Nix expression:

fetchurl {
  url = mirror://gnu/hello/hello-2.10.tar.gz;
  sha256 = "0ssi1wpaf7plaswqqjwigppsg5fyh99vdlb9kzl7c9lng89ndq1i";
}

Transforming custom object structures into Nix expressions


The earlier described use cases are basically one-on-one translations from the host language (JavaScript or PHP) to the guest language (Nix). In some cases, literal translations do not make sense -- for example, it may be possible that we already have an application with an existing data model from which we want to derive deployments that should be carried out with Nix.

In the latest versions of NiJS and PNDP, it is also possible to specify how to transform custom object structures into a Nix expression. This can be done by inheriting from the NixASTNode class or prototype and overriding the toNixAST() method.

For example, we may have a system already providing a representation of a file that should be downloaded from an external source:

function HelloSourceModel() {
    this.src = "mirror://gnu/hello/hello-2.10.tar.gz";
    this.sha256 = "0ssi1wpaf7plaswqqjwigppsg5fyh99vdlb9kzl7c9lng89ndq1i";
}

The above module defines a constructor function composing an object that refers to the GNU Hello package provided by a GNU mirror site.

A direct translation of an object constructed by the above function to the Nix expression language does not provide anything meaningful -- it can, for example, not be used to let Nix fetch the package from the mirror site.

We can inherit from NixASTNode and implement our own custom toNixAST() function to provide a more meaningful Nix translation:

var nijs = require('nijs');
var inherit = require('nijs/lib/ast/util/inherit.js').inherit;

/* HelloSourceModel inherits from NixASTNode */
inherit(nijs.NixASTNode, HelloSourceModel);

/**
 * @see NixASTNode#toNixAST
 */
HelloSourceModel.prototype.toNixAST = function() {
    return this.args.fetchurl()({
        url: new nijs.NixURL(this.src),
        sha256: this.sha256
    });
};

The toNixAST() function shown above composes an abstract syntax tree (AST) for a function invocation to fetchurl {} in the Nix expression language with the url and sha256 properties a parameters.

An object that inherits from the NixASTNode prototype also indirectly inherits from NixObject. This means that we can directly attach such an object to any other AST object. The generator uses the underlying toNixAST() function to automatically convert it to its AST representation:

var helloSource = new HelloSourceModel();
var output = nijs.jsToNix(helloSource, true);
console.log(output);

In the above code fragment, we directly pass the construct HelloSourceModel object instance to the generator. The output will be the following Nix expression:

fetchurl {
  url = mirror://gnu/hello/hello-2.10.tar.gz;
  sha256 = "0ssi1wpaf7plaswqqjwigppsg5fyh99vdlb9kzl7c9lng89ndq1i";
}

In some cases, it may not be possible to inherit from NixASTNode, for example, when the object already inherits from another prototype or class that is beyond the user's control.

It is also possible to use the NixASTNode constructor function as an adapter. For example, we can take any object with a toNixAST() function:

var helloSourceWrapper = {
    toNixAST: function() {
        return new nijs.NixFunInvocation({
            funExpr: new nijs.NixExpression("fetchurl"),
            paramExpr: {
                url: new nijs.NixURL(this.src),
                sha256: this.sha256
            }
        });
    }
};

By wrapping the helloSourceWrapper object in the NixASTNode constructor, we can convert it to an object that is an instance of NixASTNode:

new nijs.NixASTNode(helloSourceWrapper)

In PHP, we can change any class into a NixASTNode by implementing the NixASTConvertable interface:

use PNDP\AST\NixASTConvertable;
use PNDP\AST\NixURL;

class HelloSourceModel implements NixASTConvertable
{
    /**
     * @see NixASTConvertable::toNixAST()
     */
    public function toNixAST()
    {
        return $this->args->fetchurl(array(
            "url" => new NixURL($this->src),
            "sha256" => $this->sha256
        ));
    }
}

By passing an object that implements the NixASTConvertable interface to the NixASTNode constructor, it can be converted:

new NixASTNode(new HelloSourceModel())

Motivating use case: the node2nix and composer2nix generators


My main motivation to use custom transformations is to improve the quality of the node2nix and composer2nix generators -- the former converts NPM package configurations to Nix expressions and the latter converts PHP composer package configurations to Nix expressions.

Although NiJS and PNDP provide a number of powerful properties to improve the code generation steps of these tools, e.g. I no longer have to think much about escaping strings or pretty printing, there are still many organizational coding issues left. For example, the code that parses the configurations, fetches the external sources, and generates the code are mixed. As a consequence, the code is very hard to read, update, maintain and to ensure its correctness.

The new transformation facilities allow me to separate concerns much better. For example, both generators now have a data model that reflects the NPM and composer problem domain. For example, I could compose the following (simplified) class diagram for node2nix's problem domain:


A crucial part of node2nix's generator is the package class shown on the top left on the diagram. A package requires zero or more packages as dependencies and may provide zero or more packages in the node_modules/ folder residing in the package's base directory.

For readers not familiar with NPM's dependency management: every package can install its dependencies privately in a node_modules/ folder residing in the same base directory. The CommonJS module ensures that every file is considered to be a unique module that should not interfere with other modules. Sharing is accomplished by putting a dependency in a node_modules/ folder of an enclosing parent package.

NPM 2.x always installs a package dependency privately unless a parent package exists that can provide a conforming version. NPM 3.x (and later) will also move a package into the node_modules/ folder hierarchy as high as possible to prevent too many layers of nested node_modules/ folders (this is particularly a problem on Windows). The class structure in the above diagram reflects this kind of dependency organisation.

In addition to a package dependency graph, we also need to obtain package metadata and compute their output hashes. NPM packages originate from various kinds of sources, such as the NPM registry, Git repositories, HTTP sites and local directories on the filesystem.

To optimize the process and support sharing of common sources among packages, we can use a source cache that memorizes all unique source referencess.

The Package::resolveDependencies() method sets the generation process in motion -- it will construct the dependency graph replicating NPM's dependency resolution algorithm as faithfully as possible, and resolves all the dependencies' (and transitive dependencies) metadata.

After resolving all dependencies and their metadata, we must generate the output Nix expressions. One Nix expression is copied (the build infrastructure) and two are generated -- a composition expression and a package or collection expression.

We can also compose a class diagram for the generation infrastructure:


In the above class diagram, every generated expression is represented a class inheriting from NixASTNode. We can also reuse some classes from the domain model as constituents for the generated expressions, by also inheriting from NixASTNode and overriding the toNixAST() method:

  • The source objects can be translated into sub expressions that invoke fetchurl {} and fetchgit {}.
  • The sources cache can be translated into an attribute set exposing all sources that are used as dependencies for packages.
  • A package instance can be converted into a function invocation to nodeenv.buildNodePackage {} that, in addition to configuring build properties, binds the required dependencies to the sources in the sources cache attribute set.

By decomposing the expression into objects and combining the objects' AST representations, we can nicely modularize the generation process.

For composer2nix, we can also compose a class diagram for its domain -- the generation process:


The above class diagram has many similarities, but also some major differences compared to node2nix. composer provides so-called lock files that pinpoint the exact versions of all dependencies and transitive dependencies. As a result, we do not need to replicate composer's dependency resolution algorithm.

Instead, the generation process is driven by the ComposerConfig class that encapsulates the properties of the composer.json and composer.lock files of a package. From a composer configuration, the generator constructs a package object that refers to the package we intend to deploy and populates a source cache with source objects that come from various sources, such as Git, Mercurial and Subversion repositories, Zip files, and directories residing on the local filesystem.

For the generation process, we can adopt a similar strategy that exposes the generated Nix expressions as classes and uses some classes of the domain model as constituents for the generation process:


Discussion


In this blog post, I have described a new feature for the NiJS and PNDP frameworks, making it possible to implement custom transformations. Some of its benefits are that it allows an existing object model to be reused and concerns in an application can be separated much more conveniently.

These facilities are not only useful for the improvement of the architecture of the node2nix and composer2nix generators -- at the company I work for (Conference Compass), we developed our own domain-specific configuration management tool.

Despite the fact that it uses several tools from the Nix project to carry out deployments, it uses a domain model that is not Nix-specific at all. Instead, it uses terminology and an organization that reflects company processes and systems.

For example, we use a backend-for-frontend organization that provides a backend for each mobile application that we ship. We call these backends configurators. Optionally, every configurator can import data from various external sources that we call channels. The tool's data model reflects this kind of organization, and generates Nix expressions that contain all relevant implementation details, if necessary.

Finally, the fact that I modified node2nix to have a much cleaner architecture has another reason beyond quality improvement. Currently, NPM version 5.x (that comes with Node.js 8.x) is still unsupported. To make it work with Nix, we require a slightly different generation process and a completely different builder environment. The new architecture allows me to reuse the common parts much more conveniently. More details about the new NPM support will follow (hopefully) soon.

Availability


I have released new versions of NiJS and PNDP that have the custom transformation facilities included.

Furthermore, I have decided to release new versions for node2nix and composer2nix that use the new generation facilities in their architecture. The improved architecture revealed a very uncommon but nasty bug with bundled dependencies in node2nix, that is now solved.