Tuesday, October 30, 2018

Auto patching prebuilt binary software packages for deployment with the Nix package manager

As explained in many previous blog posts, most of the quality properties of the Nix package manager (such as reliable deployment) stem from the fact that all packages are stored in a so-called Nix store, in which every package resides in its own isolated folder with a hash prefix that is derived from all build inputs (such as: /nix/store/gf00m2nz8079di7ihc6fj75v5jbh8p8v-zlib-1.2.11).

This unorthodox naming convention makes it possible to safely store multiple versions and variants of the same package next to each other.

Although isolating packages in the Nix store provides all kinds of benefits, it also has a big drawback -- common components, such as shared libraries, can no longer be found in their "usual locations", such as /lib.

For packages that are built from source with the Nix package manager this is typically not a problem:

  • The Nix expression language computes the Nix store paths for the required packages. By simply referring to the variable that contains the build result, you can obtain the Nix store path of the package without having to remember it yourself.
  • Nix statically binds shared libraries to ELF binaries by modifying the binary's RPATH field. As a result, binaries no longer rely on the presence of their library dependencies in global locations (such as /lib), but use the libraries stored in isolation in the Nix store.
  • The GNU linker (the ld command) has been wrapped to transparently add the paths of all the library packages to the RPATH field of the ELF binary, whenever a dynamic library is provided.

As a result, you can build most packages from source code by simply executing their standardized build procedures in a Nix builder environment, such as: ./configure --prefix=$out; make; make install.

When you want to deploy prebuilt binary packages with Nix, you will probably run into various kinds of challenges:

  • ELF executables require the presence of an ELF interpreter in /lib/ld-linux.so.2 (on x86) and /lib64/ld-linux-x86-64.so.2 (on x86-64), which is impure and does not exist in NixOS.
  • ELF binaries produced by conventional means typically have no RPATH configured. As a result, they expect libraries to be present in global namespaces, such as /lib. Since these directories do not exist in NixOS, an executable will typically fail to work.

To make prebuilt binaries work in NixOS, there are basically two solutions. The first is to compose so-called FHS user environments from a set of Nix packages, in which shared components can be found in their "usual locations". The drawback is that it requires special privileges and additional work to compose such environments.

The preferred solution is to patch prebuilt ELF binaries with patchelf (e.g. appending the library dependencies to the RPATH of the executable) so that their dependencies are loaded from the Nix store. I wrote a guide that demonstrates how to do this for a number of relatively simple packages.

Although it is possible to patch prebuilt ELF binaries to make them work from the Nix store, such a process is typically tedious and time consuming -- you must dissect a package, search for all relevant ELF binaries, figure out which libraries a binary requires, find the corresponding packages that provide them and then update the deployment instructions to patch the ELF binaries.

For small projects, a manual binary patching process is still somewhat manageable, but for a complex project such as the Android SDK, which provides a large collection of plugins containing a mix of many 32-bit and 64-bit executables, manual patching is quite laborious, in particular when you want to keep all plugins up to date -- plugin packages are updated quite frequently, forcing the packager to re-examine all binaries over and over again.

To make the Android SDK patching process easier, I wrote a small tool that can mostly automate it. The tool can also be used for other kinds of binary packages.

Automatic searching for library locations


In order to make ELF binaries work, they must be patched in such a way that they use an ELF interpreter from the Nix store and their RPATH fields should contain all paths to the libraries that they require.

We can gather a list of required libraries for an executable, by running:

$ patchelf --print-needed ./zipmix
libm.so.6
libc.so.6

Instead of manually patching the executable with this provided information, we can also create a function that searches for the corresponding libraries in a list of search paths. The tool could take the first path that provides the required libraries.

For example, by setting the following colon-separated search environment variable:

$ export libs=/nix/store/7y10kn6791h88vmykdrddb178pjid5bv-glibc-2.27/lib:/nix/store/xh42vn6irgl1cwhyzyq1a0jyd9aiwqnf-zlib-1.2.11/lib

The tool can automatically discover that the path: /nix/store/7y10kn6791h88vmykdrddb178pjid5bv-glibc-2.27/lib provides both libm.so.6 and libc.so.6.

We can also run into situations in which we cannot find any valid path to a required library -- in such cases, we can throw an error and notify the user.
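The search strategy described above can be sketched as a small shell function. This is only an illustration and not the actual autopatchelf implementation: the function name find_library_dir is hypothetical, and it only checks whether a file with the requested name exists in each directory of the colon-separated search path.

```shell
# Hypothetical helper: print the first directory in a colon-separated
# search path that contains a file with the given library name.
# Returns a non-zero exit status (and prints an error) if no directory matches.
find_library_dir() {
    lib_name=$1
    search_path=$2
    old_ifs=$IFS
    IFS=:
    for dir in $search_path; do
        # Skip empty components, e.g. caused by a leading or trailing colon
        [ -n "$dir" ] || continue
        if [ -e "$dir/$lib_name" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir"
            return 0
        fi
    done
    IFS=$old_ifs
    echo "cannot find a path providing: $lib_name" >&2
    return 1
}
```

With the libs variable shown earlier, running find_library_dir libm.so.6 "$libs" would print the glibc lib directory, provided that the store path exists on the machine.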

It is also possible to extend the searching approach to the ELF interpreter. The following command provides the path to the required ELF interpreter:

$ patchelf --print-interpreter ./zipmix
/lib64/ld-linux-x86-64.so.2

We can search in the list of library packages for the ELF interpreter as well so that we no longer have to explicitly specify it.

Dealing with multiple architectures


Another problem with the Android SDK is that plugin packages may provide both x86 and x86-64 binaries. You cannot link libraries compiled for x86 against an x86-64 executable and vice versa. This restriction could introduce a new kind of risk in the automatic patching process.

Fortunately, it is also possible to figure out for what kind of architecture a binary was compiled:

$ readelf -h ./zipmix
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Advanced Micro Devices X86-64

The above command-line instruction shows that we have a 64-bit binary (Class: ELF64) compiled for the x86-64 architecture (Machine: Advanced Micro Devices X86-64).

I have also added a check that ensures that the tool will only add a library path to the RPATH if the architecture of the library is compatible with the binary. As a result, it is not possible to accidentally link a library with an incompatible architecture to a binary.
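As a rough illustration of such a check (not the tool's actual code), the ELF class can be read straight from the file header: the first four bytes are the ELF magic number and the byte at offset 4 (EI_CLASS) is 1 for 32-bit and 2 for 64-bit objects. The function names below are hypothetical, and the sketch is a simplification that compares only the class, not the Machine field.

```shell
# Print "32" or "64" depending on the EI_CLASS byte (offset 4) of an ELF file.
# Fails if the file does not start with the ELF magic number.
elf_class() {
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    [ "$magic" = "7f454c46" ] || return 1
    case $(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n') in
        1) echo 32 ;;
        2) echo 64 ;;
        *) return 1 ;;
    esac
}

# Succeed only if the binary ($1) and the library ($2) have the same ELF class.
same_elf_class() {
    bin_class=$(elf_class "$1") || return 1
    lib_class=$(elf_class "$2") || return 1
    [ "$bin_class" = "$lib_class" ]
}
```

A library directory would then only be considered for the RPATH when same_elf_class succeeds for the binary and the candidate library.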

Patching collections of binaries


Another inconvenience is the fact that Android SDK plugins typically provide more than one binary that needs to be patched. We can also recursively search an entire directory for ELF binaries:

$ autopatchelf ./bin

The above command-line instruction recursively searches for binaries in the bin/ sub directory and automatically patches them.

Sometimes recursively patching executables in a directory hierarchy could have undesired side effects. For example, the Android SDK also provides emulators having their own set of ELF binaries that need to run in the emulator. Patching these binaries typically breaks the software running in the emulator. We can also disable recursion if this is desired:

$ autopatchelf --no-recurse ./bin

or revert to patching individual executables:

$ autopatchelf ./zipmix
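To give an impression of how the ELF binaries in a directory could be discovered, the following sketch lists all regular files that start with the ELF magic number. The function name list_elf_files and its no-recurse argument are hypothetical and only mimic the behaviour described above. It relies on the -maxdepth option, which is supported by GNU and BSD find but is not part of POSIX.

```shell
# Print every regular file under a directory ($1) that starts with the
# ELF magic number. Pass "no-recurse" as the second argument to only
# inspect files directly inside the directory.
list_elf_files() {
    if [ "${2:-}" = "no-recurse" ]; then
        set -- "$1" -maxdepth 1
    else
        set -- "$1"
    fi
    find "$@" -type f | while IFS= read -r file; do
        magic=$(od -An -tx1 -N4 "$file" | tr -d ' \n')
        if [ "$magic" = "7f454c46" ]; then
            printf '%s\n' "$file"
        fi
    done
}
```

Each file that is printed would be a candidate for patching. Scripts and data files are skipped because they do not carry the magic number.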

The result


Automating most aspects of the binary patching process results in a substantial reduction in code size for the Nix expressions that need to deploy prebuilt packages.

In my previous blog post, I have shown two example cases for which I manually derived the patchelf instructions that I need to run. By using the autopatchelf tool I can significantly decrease the size of the corresponding Nix expressions.

For example, the following expression deploys kzipmix:

{stdenv, fetchurl, autopatchelf, glibc}:

stdenv.mkDerivation {
  name = "kzipmix-20150319";
  src = fetchurl {
    url = http://static.jonof.id.au/dl/kenutils/kzipmix-20150319-linux.tar.gz;
    sha256 = "0fv3zxhmwc3p34larp2d6rwmf4cxxwi71nif4qm96firawzzsf94";
  };
  buildInputs = [ autopatchelf ];
  libs = stdenv.lib.makeLibraryPath [ glibc ];
  installPhase = ''
    ${if stdenv.system == "i686-linux" then "cd i686"
    else if stdenv.system == "x86_64-linux" then "cd x86_64"
    else throw "Unsupported system architecture: ${stdenv.system}"}
    mkdir -p $out/bin
    cp zipmix kzip $out/bin
    autopatchelf $out/bin
  '';
}

In the expression shown above, it suffices to simply copy the executables to $out/bin and run autopatchelf.

I have also shown a more complicated example demonstrating how to patch the Quake 4 demo. I can significantly reduce the amount of code by replacing all the patchelf instructions with a single autopatchelf invocation:

{stdenv, fetchurl, autopatchelf, glibc, SDL, xlibs}:

stdenv.mkDerivation {
  name = "quake4-demo-1.0";
  src = fetchurl {
    url = ftp://ftp.idsoftware.com/idstuff/quake4/demo/quake4-linux-1.0-demo.x86.run;
    sha256 = "0wxw2iw84x92qxjbl2kp5rn52p6k8kr67p4qrimlkl9dna69xrk9";
  };
  buildInputs = [ autopatchelf ];
  libs = stdenv.lib.makeLibraryPath [ glibc SDL xlibs.libX11 xlibs.libXext ];

  buildCommand = ''
    # Extract files from the installer
    cp $src quake4-linux-1.0-demo.x86.run
    bash ./quake4-linux-1.0-demo.x86.run --noexec --keep
    # Move extracted files into the Nix store
    mkdir -p $out/libexec
    mv quake4-linux-1.0-demo $out/libexec
    cd $out/libexec/quake4-linux-1.0-demo
    # Remove obsolete setup files
    rm -rf setup.data
    # Patch ELF binaries
    autopatchelf .
    # Remove libgcc_s.so.1 that conflicts with Mesa3D's libGL.so
    rm ./bin/Linux/x86/libgcc_s.so.1
    # Create wrappers for the executables and ensure that they are executable
    for i in q4ded quake4
    do
        mkdir -p $out/bin
        cat > $out/bin/$i <<EOF
    #! ${stdenv.shell} -e
    cd $out/libexec/quake4-linux-1.0-demo
    ./bin/Linux/x86/$i.x86 "\$@"
    EOF
        chmod +x $out/libexec/quake4-linux-1.0-demo/bin/Linux/x86/$i.x86
        chmod +x $out/bin/$i
    done
  '';
}

For the Android SDK, the reduction in code size is even more substantial. The following Nix expression is used to patch the Android build-tools plugin package:

{deployAndroidPackage, lib, package, os, autopatchelf, makeWrapper, pkgs, pkgs_i686}:

deployAndroidPackage {
  inherit package os;
  buildInputs = [ autopatchelf makeWrapper ];

  libs_x86_64 = lib.optionalString (os == "linux")
    (lib.makeLibraryPath [ pkgs.glibc pkgs.zlib pkgs.ncurses5 ]);
  libs_i386 = lib.optionalString (os == "linux")
    (lib.makeLibraryPath [ pkgs_i686.glibc pkgs_i686.zlib pkgs_i686.ncurses5 ]);

  patchInstructions = ''
    ${lib.optionalString (os == "linux") ''
      export libs_i386=$packageBaseDir/lib:$libs_i386
      export libs_x86_64=$packageBaseDir/lib64:$libs_x86_64
      autopatchelf $packageBaseDir/lib64 libs --no-recurse
      autopatchelf $packageBaseDir libs --no-recurse
    ''}

    wrapProgram $PWD/mainDexClasses \
      --prefix PATH : ${pkgs.jdk8}/bin
  '';
  noAuditTmpdir = true;
}

The above expression specifies the search libraries per architecture for x86 (i386) and x86_64 and automatically patches the binaries in the lib64/ sub folder and base directories. The autopatchelf tool ensures that no library of an incompatible architecture gets linked to a binary.

Discussion


The automated patching approach described in this blog post is not entirely a new idea -- in Nixpkgs, Aszlig Neusepoff created an autopatchelf hook that is integrated into the fixup phase of the stdenv.mkDerivation {} function. It shares a number of similar features -- it accepts a list of library packages (the runtimeDependencies environment variable) and automatically adds the provided runtime dependencies to the RPATH of all binaries in all the output folders.

There are also a number of differences -- my approach provides an autopatchelf command-line tool that can be invoked in any stage of a build process and provides full control over the patching process. It can also be used outside a Nix builder environment, which is useful for experimentation purposes. This increased level of flexibility is required for more complex prebuilt binary packages, such as the Android SDK and its plugins -- for some plugins, you cannot generalize the patching process and you typically require more control.

It also offers better support to cope with repositories providing binaries of multiple architectures -- while the Nixpkgs version has a check that prevents incompatible libraries from being linked, it does not allow you to have fine grained control over library paths to consider for each architecture.

There is also a use case that the autopatchelf command-line tool does not support -- the autopatchelf hook can also be used for source compiled projects whose executables may need to dynamically load dependencies via the dlopen() function call.

Dynamically loaded libraries are not known at link time (because they are not provided to the Nix-wrapped ld command), and as a result, they are not added to the RPATH of an executable. The Nixpkgs autopatchelf hook allows you to easily supplement the library paths of these dynamically loaded libraries after the build process completes.

Availability


The autopatchelf command-line tool can be found in the nix-patchtools repository. The goal of this repository is to provide a collection of tools that help make the patching processes of complex prebuilt packages more convenient. In the future, I may identify more patterns and provide additional tooling to automate them.

autopatchelf is prominently used in my refactored version of the Android SDK to automatically patch all ELF binaries. I have the intention to integrate this new Android SDK implementation into Nixpkgs soon.

Saturday, September 22, 2018

Creating Nix build function abstractions for pluggable SDKs

Two months ago, I decomposed the stdenv.mkDerivation {} function abstraction in the Nix packages collection that is basically the de-facto way in the Nix expression language to build software packages from source.

I identified some of its major concerns and developed my own implementation that is composed of layers in which each layer gradually adds a responsibility until it has most of the features that the upstream version also has.

In addition to providing a better separation of concerns, I also identified a pattern that I repeatedly use to create these abstraction layers:

{stdenv, foo, bar}:
{name, buildInputs ? [], ...}@args:

let
  extraArgs = removeAttrs args [ "name" "buildInputs" ];
in
stdenv.someBuildFunction ({
  name = "mypackage-"+name;
  buildInputs = [ foo bar ] ++ buildInputs;
} // extraArgs)

Build function abstractions that follow this pattern (as outlined in the code fragment shown above) have the following properties:

  • The outer function header (first line) specifies all common build-time dependencies required to build a project. For example, if we want to build a function abstraction for Python projects, then python is such a common build-time dependency.
  • The inner function header specifies all relevant build parameters and accepts an arbitrary number of arguments. Some arguments have a specific purpose for the kind of software project that we want to build (e.g. name and buildInputs) while other arguments can be passed verbatim to the build function abstraction that we use as a basis.
  • In the body, we invoke a function abstraction (quite frequently stdenv.mkDerivation {}) that builds the project. We use the build parameters that have a specific meaning to configure specialized build properties and we pass all remaining build parameters that are not conflicting verbatim to the build function that we use as a basis.

    A subset of these arguments have no specific meaning and are simply exposed as environment variables in the builder environment.

    Because some parameters are already being used for a specific purpose and others may be incompatible with the build function that we invoke in the body, we compose a variable named: extraArgs in which we remove the conflicting arguments.

Aside from having a function that is tailored towards the needs of building a specific software project (such as a Python project), using this pattern provides the following additional benefits:

  • A build procedure is extendable/tweakable -- we can adjust the build procedure by adding or changing the build phases, and tweak them by providing build hooks (that execute arbitrary command-line instructions before or after the execution of a phase). This is particularly useful to build additional abstractions around it for more specialized deployment procedures.
  • Because an arbitrary number of arguments can be propagated (that can be exposed as environment variables in the build environment), we have more configuration flexibility.

The original objective of using this pattern is to create an abstraction function for GNU Make/GNU Autotools projects. However, this pattern can also be useful to create custom abstractions for other kinds of software projects, such as Python, Perl, Node.js etc. projects, that also have (mostly) standardized build procedures.

After completing the blog post about layered build function abstractions, I have been improving the Nix packages/projects that I maintain. In the process, I also identified a new kind of packaging scenario that is not yet covered by the pattern shown above.

Deploying SDKs


In the Nix packages collection, most build-time dependencies are fully functional software packages. Notable exceptions are so-called SDKs, such as the Android SDK -- the Android SDK "package" is only a minimal set of utilities (such as a plugin manager, AVD manager and monitor).

In order to build Android projects from source code and manage Android app installations, you need to install a variety of plugins, such as build-tools, platform-tools, platform SDKs and emulators.

Installing all plugins is typically a much too costly operation -- it requires you to download many gigabytes of data. In most cases, you only want to install a very small subset of them.

I have developed a function abstraction that makes it possible to deploy the Android SDK with a desired set of plugins, such as:

with import <nixpkgs> {};

let
  androidComposition = androidenv.composeAndroidPackages {
    toolsVersion = "25.2.5";
    platformToolsVersion = "27.0.1";
    buildToolsVersions = [ "27.0.3" ];
    includeEmulator = true;
    emulatorVersion = "27.2.0";
  };
in
androidComposition.androidsdk

When building the above expression (default.nix) with the following command-line instruction:

$ nix-build
/nix/store/zvailnl4f1261cn87s9n29lhj9i7y7iy-androidsdk

We get an Android SDK installation, with tools plugin version 25.2.5, platform-tools version 27.0.1, one instance of the build-tools (version 27.0.3) and an emulator of version 27.2.0. The Nix package manager will download the required plugins automatically.

Writing build function abstractions for SDKs


If you want to create function abstractions for software projects that depend on an SDK, you not only have to execute a build procedure, but you must also compose the SDK in such a way that all plugins are installed that a project requires. If any of the mandatory plugins are missing, the build will most likely fail.

As a result, the function interface must also provide parameters that allow you to configure the plugins in addition to the build parameters.

A very straightforward approach is to write a function whose interface contains both the plugin and build parameters, and propagates each of the required parameters to the SDK composition function. However, manually writing this mapping has a number of drawbacks -- it duplicates functionality of the SDK composition function, it is tedious to write, and it makes it very difficult to stay consistent when the SDK's functionality changes.

As a solution, I have extended the previously shown pattern with support for SDK deployments:

{composeMySDK, stdenv}:
{foo, bar, ...}@args:

let
  mySDKFormalArgs = builtins.functionArgs composeMySDK;
  mySDKArgs = builtins.intersectAttrs mySDKFormalArgs args;
  mySDK = composeMySDK mySDKArgs;
  extraArgs = removeAttrs args ([ "foo" "bar" ]
    ++ builtins.attrNames mySDKFormalArgs);
in
stdenv.mkDerivation ({
  buildInputs = [ mySDK ];
  buildPhase = ''
    ${mySDK}/bin/build
  '';
} // extraArgs)

In the above code fragment, we have added the following steps:

  • First, we dynamically extract the formal arguments of the function that composes the SDK (mySDKFormalArgs).
  • Then, we compute the intersection of the formal arguments of the composition function and the actual arguments from the build function arguments set (args). The resulting attribute set (mySDKArgs) contains the actual arguments we need to propagate to the SDK composition function.
  • The next step is to deploy the SDK with all its plugins by propagating the SDK arguments set as function parameters to the SDK composition function (mySDK).
  • Finally, we remove the arguments that we have passed to the SDK composition function from the extra arguments set (extraArgs), because these parameters have no specific meaning for the build procedure.

With this pattern, the build abstraction function evolves automatically with the SDK composition function without requiring me to make any additional changes.

To build an Android project from source code, I can write an expression such as:

{androidenv}:

androidenv.buildApp {
  # Build parameters
  name = "MyFirstApp";
  src = ../../src/myfirstapp;
  antFlags = "-Dtarget=android-16";

  # SDK composition parameters
  platformVersions = [ 16 ];
  toolsVersion = "25.2.5";
  platformToolsVersion = "27.0.1";
  buildToolsVersions = [ "27.0.3" ];
}

The expression shown above has the following properties:

  • The above function invocation propagates three build parameters: name referring to the name of the Nix package, src referring to a filesystem location that contains the source code of an Android project, and antFlags that contains command-line arguments that are passed to the Apache Ant build tool.
  • It propagates four SDK composition parameters: platformVersions referring to the platform SDKs that must be installed, toolsVersion to the version of the tools package, platformToolsVersion to the platform-tools package and buildToolsVersions to the build-tools packages.

By evaluating the above function invocation, the Android SDK with the plugins will be composed, and the corresponding SDK will be passed as a build input to the builder environment.

In the build environment, Apache Ant gets invoked to build the project from source code. The androidenv.buildApp implementation will dynamically propagate the SDK composition parameters to the androidenv.composeAndroidPackages function.

Availability


The extended build function abstraction pattern described in this blog post is among the structural improvements I have been implementing in the mobile app building infrastructure in Nixpkgs. Currently, it is used in standalone test versions of the Nix Android build environment, iOS build environment and Titanium build environment.

The build function abstraction for the Titanium SDK (a JavaScript-based cross-platform development framework that can produce Android, iOS, and several other kinds of applications from the same codebase) automatically composes both Xcode wrappers and Android SDKs to make the builds work.

The test repositories can be found on my GitHub page and the changes live in the nextgen branches. At some point, they will be reintegrated into the upstream Nixpkgs repository.

Besides mobile app development SDKs, this pattern is generic enough to be applied to other kinds of projects as well.

Thursday, August 2, 2018

Automating Mendix application deployments with Nix

As explained in a previous blog post, Mendix is a low-code development platform -- the general idea behind low-code application development is that instead of writing (textual) code, you model an application, such as the data structures and the corresponding views. One of the benefits of Mendix is that it makes you more productive as a developer, for certain classes of applications.

Although low-code development is conceptually different from a development perspective compared to more "traditional" development approaches (that require you to write code), there is one particular aspect a Mendix application lifecycle has in common with them. Eventually, you will have to deploy your app to an environment that makes your application available to end users.

For users of the Mendix cloud portal, deploying an application is quite convenient: with just a few simple mouse clicks your application gets deployed to a test, acceptance or production environment.

However, managing on-premise application deployments or actually managing applications in the cloud is anything but a simple job. There are all kinds of challenges you need to cope with, such as:

  • Making sure that all dependencies of an app are present, such as a database for storage.
  • Executing all relevant deployment activities to make an app available for use.
  • Upgrading is risky and difficult -- it may break the application and introduce downtime.

There are a variety of deployment solutions available to manage deployment processes. However, every tool has its strengths and weaknesses, and no tool is a perfect fit. As a result, we still have to develop custom solutions that automate missing parts in a deployment process and we have many kinds of additional complexities that we need to cope with.

Recently, I investigated whether it would be possible to deploy Mendix applications, with my favourite class of deployment utilities from the Nix project, and I gave an introduction to the Nix project to the R&D department at Mendix.

Using tools from the Nix project


For readers not familiar with Nix: the tools in the Nix project solve many configuration management problems in their own unique way. The basis of all the tools is the Nix package manager that borrows concepts from purely functional programming languages, such as Haskell.

To summarize Nix in just a few sentences: deploying a package with Nix is the same thing as invoking a pure function that constructs a package from source code and its build-time dependencies (that are provided as function parameters). To accomplish purity, Nix composes so-called "pure build environments", in which various restrictions are imposed on the build script to ensure that the outcome will be (almost) identical if a package is built with the same build inputs.

The purely functional deployment model has all kinds of benefits -- for example, it provides very strong dependency completeness and reproducibility guarantees, and all kinds of optimizations (e.g. a package that has been deployed before does not have to be built again, packages that have no dependency on each other can be built in parallel, builds can be downloaded from a remote location or delegated to another machine).

Another important property that all tools in the Nix project have in common is declarative deployment -- instead of describing the deployment activities that need to be carried out, you describe the structure of the system that you want to deploy, e.g. the packages, a system configuration, or a network of machines/services. The deployment tools infer the activities that need to be carried out to get the system deployed.

Automating Mendix application deployments with Nix


As an experiment, I investigated how Mendix application deployments could fit in Nix's vision of declarative deployment -- the objective is to take a Mendix project created by the modeler (essentially the "source code" form of an application), write a declarative deployment specification for it, and use the tools from the Nix project to get a machine running with all required components to make the app run.

To bring a Mendix application in a running state, we require the following ingredients:

  • We must obtain the Mendix runtime that interprets the Mendix models. Packaging the Mendix runtime in Nix is fairly straightforward -- unzipping the distribution, moving the package contents into the Nix store, and adding a wrapper script that launches the runtime suffices.
  • We must produce a Mendix Deployment Archive (MDA file), a Zip container with all artifacts that the runtime requires to run a Mendix app. An MDA file can be produced from a Mendix project by invoking the MxBuild tool. Since MxBuild is required for this, I had to package it as well. Packaging MxBuild is a bit trickier, as it requires Mono and Node.js.

Building an MDA file with Nix


The most interesting part is writing a new function abstraction for building MDA files with Nix -- in a Nix builder environment, (almost) any build tool can be used albeit with restrictions that are imposed on them to make builds more pure.

We can also create a function abstraction that invokes mxbuild in a Nix builder environment:

{stdenv, mxbuild, jdk, nodejs}:
{name, mendixVersion, looseVersionCheck ? false, buildInputs ? [], ...}@args:

let
  mxbuildPkg = mxbuild."${mendixVersion}";
  extraArgs = removeAttrs args [ "buildInputs" ];
in
stdenv.mkDerivation ({
  buildInputs = [ mxbuildPkg nodejs ] ++ buildInputs;
  installPhase = ''
    mkdir -p $out
    mxbuild --target=package \
      --output=$out/${name}.mda \
      --java-home ${jdk} \
      --java-exe-path ${jdk}/bin/java \
      ${stdenv.lib.optionalString looseVersionCheck "--loose-version-check"} \
      "$(echo *.mpr)"
    mkdir -p $out/nix-support
    echo "file binary-dist \"$(echo $out/*.mda)\"" > $out/nix-support/hydra-build-products
  '';
} // extraArgs)

The above expression is a function that composes another function that takes common Mendix parameters -- the application name, the version of MxBuild that we want, and whether we want to use a strict or loose version check (it is possible to compile a project developed for a different version of Mendix, if desired).

In the body, we create an output directory in the Nix store, we invoke mxbuild to compile the app into an MDA file and put it in the Nix store, and we generate a configuration file that makes it possible to expose the MDA file as a build product when Hydra, the Nix-based continuous integration service, is being used.

With the build function shown in the code fragment above, we can write a Nix expression for a Mendix project:

{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
}:

let
  mendixPkgs = import ./nixpkgs-mendix/top-level/all-packages.nix {
    inherit pkgs system;
  };
in
mendixPkgs.packageMendixApp {
  name = "conferenceschedule";
  src = /home/sander/SharedWindowsFolder/ConferenceSchedule-main;
  mendixVersion = "7.13.1";
}

The above expression (conferenceschedule.nix) can be used to build an MDA file for a project named: conferenceschedule, residing in the /home/sander/SharedWindowsFolder/ConferenceSchedule-main directory using Mendix version 7.13.1.

By running the following command-line instruction, we can use Nix to build our MDA:

$ nix-build conferenceschedule.nix 
/nix/store/nbaa7fnzi0xw9nkf27mixyr9awnbj16i-conferenceschedule
$ ls /nix/store/nbaa7fnzi0xw9nkf27mixyr9awnbj16i-conferenceschedule
conferenceschedule.mda  nix-support

In addition to building an MDA, Nix will also download the dependencies: the Mendix runtime and MxBuild, if they have not been installed yet.

Running a Mendix application


Producing an MDA file is an important ingredient in the deployment lifecycle of a Mendix application, but it is not entirely what we want -- what we really want is a running system. To get a running system, additional steps are required beyond producing an MDA:

  • We must unzip the MDA file into a directory with write permissions.
  • We must create writable state subdirectories, e.g. data/tmp and data/files.
  • After starting the runtime, we must configure the admin interface, to send instructions to the runtime to initialize the database and start the app:
    $ export M2EE_ADMIN_PORT=9000
    $ export M2EE_ADMIN_PASS=secret
    
  • Finally, we must communicate over the admin interface to configure the runtime, initialize the database, and start the app:
    m2ee() {
        # Send one JSON request ($1) to the admin interface of the runtime
        curl -X POST "http://localhost:$M2EE_ADMIN_PORT" \
          -H 'Content-Type: application/json' \
          -H "X-M2EE-Authentication: $(echo -n "$M2EE_ADMIN_PASS" | base64)" \
          -H 'Connection: close' \
          -d "$1"
    }
    m2ee '{ "action": "update_appcontainer_configuration", "params": { "runtime_port": 8080 } }'
    m2ee '{ "action": "update_configuration", "params": { "DatabaseType": "HSQLDB", "DatabaseName": "myappdb", "DTAPMode": "D" } }'
    m2ee '{ "action": "execute_ddl_commands" }'
    m2ee '{ "action": "start" }'
    

These deployment steps cannot be executed by Nix, because Nix's purpose is to manage packages, but not the state of a running process. To automate these remaining parts, we generate scripts that execute the above listed steps.
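To give an impression of what such a generated script could look like, the following is a minimal sketch of the first two steps. The function names unpack_mda and create_state_dirs are hypothetical (they are not taken from the actual scripts), and the sketch assumes that the unzip command is available:

```shell
# Hypothetical sketch of the first two activation steps; the function names
# are invented for illustration and are not part of the actual scripts.

unpack_mda() {
    # $1: path to the MDA file (a ZIP archive), $2: writable target directory
    mkdir -p "$2"
    unzip -o -q "$1" -d "$2"
}

create_state_dirs() {
    # $1: writable target directory in which the runtime keeps its state
    mkdir -p "$1/data/tmp" "$1/data/files"
}
```

With a writable directory stored in a variable such as stateDir, we could run unpack_mda "$mda" "$stateDir" followed by create_state_dirs "$stateDir" before starting the runtime.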

NixOS integration


NixOS is a Linux distribution that extends Nix's deployment facilities to complete systems. Aside from using the Nix package manager to deploy all packages, including the Linux kernel, NixOS' main objective is to deploy an entire system from a single declarative specification that captures its structure.

NixOS uses systemd for managing system services. The systemd configuration files are generated by the Nix package manager. We can integrate our Mendix activation scripts with a generated systemd job to fully automate the deployment of a Mendix application.

{pkgs, ...}:

{
  ...

  systemd.services.mendixappcontainer =
    let
      runScripts = ...
      appContainerConfigJSON = ...
      configJSON = ...
    in {
      enable = true;
      description = "My Mendix App";
      wantedBy = [ "multi-user.target" ];
      environment = {
        M2EE_ADMIN_PASS = "secret";
        M2EE_ADMIN_PORT = "9000";
        MENDIX_STATE_DIR = "/home/mendix";
      };
      serviceConfig = {
        ExecStartPre = "${runScripts}/bin/undeploy-app";
        ExecStart = "${runScripts}/bin/start-appcontainer";
        ExecStartPost = "${runScripts}/bin/configure-appcontainer ${appContainerConfigJSON} ${configJSON}";
      };
    };
}

The partial NixOS configuration shown above defines a systemd job that runs three scripts (as shown in the last three lines):

  • The undeploy-app script removes all non-state artefacts from the working directory.
  • The start-appcontainer script starts the Mendix runtime.
  • The configure-appcontainer script configures the runtime, such as the embedded Jetty server and the database, and starts the application.
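
The internals of these scripts are not shown here. As a rough, hypothetical sketch, a script such as configure-appcontainer could wrap the contents of a JSON configuration file in a request body of the same shape as the admin interface requests shown earlier (build_request is an invented name, not part of the actual scripts):

```shell
# Hypothetical helper: wrap the contents of a JSON parameter file in an
# admin interface request body.
build_request() {
    action=$1
    params_file=$2
    printf '{ "action": "%s", "params": %s }\n' "$action" "$(cat "$params_file")"
}
```

The output of build_request update_configuration config.json could then be passed as the request body (the -d parameter) of a curl invocation.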

Writing a systemd job (as shown above) is a bit cumbersome. To make it more convenient to use, I captured all Mendix runtime functionality in a NixOS module, with an interface exposing all relevant configuration properties.

By importing the Mendix NixOS module into a NixOS configuration, we can conveniently define a machine configuration that runs our Mendix application:

{pkgs, ...}:

{
  require = [ ../nixpkgs-mendix/nixos/modules/mendixappcontainer.nix ];

  services = {
    openssh.enable = true;

    mendixAppContainer = {
      enable = true;
      adminPassword = "secret";
      databaseType = "HSQLDB";
      databaseName = "myappdb";
      DTAPMode = "D";
      app = import ../../conferenceschedule.nix {
        inherit pkgs;
        inherit (pkgs.stdenv) system;
      };
    };
  };

  networking.firewall.allowedTCPPorts = [ 8080 ];
}

In the above configuration, the mendixAppContainer attribute set captures all the properties of the Mendix application that we want to run:

  • The password for communicating over the admin interface.
  • The type of database we want to use (in this particular case an in-memory HSQLDB instance) and the name of the database.
  • Whether we want to use the application in development (D), test (T), acceptance (A) or production (P) mode.
  • A reference to the MDA that we want to deploy (deployed by a Nix expression that invokes the Mendix build function abstraction shown earlier).

We can deploy the machine by storing the NixOS configuration in /etc/nixos/configuration.nix and running the following command-line instruction:

$ nixos-rebuild switch

The result is a complete system, deployed with the Nix package manager, that runs our Mendix application.

For production use, HSQLDB and directly exposing the embedded Jetty HTTP server are not recommended -- instead, a more sophisticated database, such as PostgreSQL, should be used. For serving HTTP requests, it is recommended to use nginx as a reverse proxy that also serves static data and provides caching.

It is also possible to extend the above configuration with a PostgreSQL and nginx system service. The NixOS module system can be used to retrieve the properties from the Mendix app container to make the configuration process more convenient.

Conclusion


In this blog post, I have investigated how Mendix applications can be deployed by using tools from the Nix project. This resulted in the following deployment functionality:

  • A Nix function that can be used to compile an MDA file from a Mendix project.
  • Generated scripts that configure and launch the runtime and the application.
  • A NixOS module that can be used to deploy a running Mendix app as part of a NixOS machine configuration.

Future work


Currently, only single machine deployments are possible. It may also be desirable to connect a Mendix application to a database that is stored on a remote machine. Furthermore, we may also want to deploy multiple Mendix applications to multiple machines in a network. With Disnix, it is possible to automate such scenarios.

Availability


The Nix function abstractions and NixOS module can be obtained from the Mendix GitHub page and used under the terms and conditions of the Apache Software License version 2.0.

Acknowledgements


The work described in this blog post is the result of the so-called "crafting days", in which Mendix allows its employees to experiment freely for two full days a month.

Furthermore, I have given a presentation about the functionality described in this blog post and an introduction to the Nix project, and I have also written an introduction-oriented article about it on the Mendix blog.

Monday, July 30, 2018

Automating Mendix application deployments with Nix (introduction-oriented blog post)


Mendix is a low-code application development platform. Low-code application development offers all kinds of benefits over traditional development approaches involving code, such as a boost in productivity. For some applications, it is possible to develop up to ten times faster compared to traditional coding approaches and frameworks.

However, despite being different from a development perspective, there is one particular activity that all application development approaches have in common – at some point, you need to deploy your application to an environment (e.g. test, acceptance, or production) to make it available to end users.

For users of the Mendix cloud portal, deployment is automated in a convenient way – with just a few simple mouse clicks, you can make your application available to all potential users in the world.

However, managing on-premise deployments or the cloud infrastructure itself is anything but a trivial job – for example, there are many complex dependencies that need to be deployed to run a Mendix application, upgrading may introduce unnecessary downtime and break a system, and the infrastructure needs to be scalable so that it can manage thousands of applications.

Fortunately, there are many automated deployment solutions that come to our aid, such as Kubernetes. Although many of them are useful, none of these solutions are perfect -- they all have their strengths and weaknesses. As a result, there are still complexities we need to solve ourselves and incidents that require fixing.

At Mendix R&D, everybody is encouraged to freely experiment two days a month (the so-called “crafting days”). One of my crafting day projects is to experiment with deployment tools from a different and unorthodox solution spectrum: The Nix project. The goal is to fully automate the deployment of a Mendix application from source – the input is a Mendix project created with the modeler and the end-result is a system running the application.

The Nix Project


The Nix project provides a family of tools that solve configuration management problems in a unique way. Some tools that are part of the Nix project are:

  • The Nix package manager
  • The NixOS Linux distribution
  • NixOps: A NixOS-based cloud deployment tool
  • Hydra: The Nix-based continuous integration service
  • Disnix: A Nix-based service deployment tool

The basis of all tools in the Nix project is the Nix package manager. Nix is quite different from almost any conventional package manager (such as RPM, APT, or Homebrew) because it borrows concepts from purely functional programming languages, such as Haskell.

The Nix Package Manager


The Nix package manager implements a purely functional deployment model. In Nix, deploying a package reliably is the same thing as invoking a pure function, without any side effects. To make this possible, Nix provides a purely functional domain-specific language called the Nix expression language.

{ stdenv, fetchurl, acl }:

stdenv.mkDerivation {
  name = "gnutar-1.30";
  src = fetchurl {
    url = http://ftp.gnu.org/tar/tar-1.30.tar.xz;
    sha256 = "1lyjyk8z8hdddsxw0ikchrsfg3i0x3fsh7l63a8jgaz1n7dr5gzi";
  };
  buildCommand = ''
    tar xfv $src
    cd tar-1.30
    ./configure --prefix=$out --with-acl=${acl}
    make
    make install
  '';
}

The above code fragment is an example of a Nix expression that describes how to build GNU tar from source code and its build-time dependencies:

  • The entire expression is a function definition. The first line corresponds to a function header in which every argument is a build-time dependency:
    • stdenv is an environment providing standard UNIX utilities, such as cat, ls and make.
    • fetchurl is a function that is used to download files from an external location.
    • acl is a library dependency of GNU tar that provides access control list support.
  • In the body of the function, we invoke the mkDerivation {} function that composes so-called “pure build environments” in which arbitrary build commands can be executed.
  • As function arguments to mkDerivation, we specify the name of the package (name), how the source can be obtained (src) and the shell commands (buildCommand) that need to be executed to build the package.

The above expression is a function definition describing how to build something from sources, but the expression does not specify which versions or variants of the build-time dependencies are supposed to be used. Function definitions alone are not useful. Instead, functions must be invoked with all the required function arguments. In Nix, these arguments need to correspond to the versions or variants of the build-time dependencies that we want to use.

Packages are composed in a second Nix expression that has the following structure:

rec {
  stdenv = import ...
 
  fetchurl = import ...
 
  acl = import ../pkgs/os-specific/linux/acl {
    inherit stdenv fetchurl …;
  };
 
  gnutar = import ../pkgs/tools/archivers/gnutar {
    inherit stdenv fetchurl acl;
  };
 
  ...
}

The above partial Nix expression is an attribute set (a language construct conceptually similar to objects in JSON) in which every key represents a package name and every value refers to a function invocation that builds the package from source code. The GNU tar expression (shown in the previous code fragment) is imported in this expression and invoked with function arguments referring to the keys in the same attribute set, such as stdenv, fetchurl, and acl.

In addition to GNU tar, all build-time dependencies are composed in the same Nix expression. These dependencies are also constructed by following the same convention – invoking a function that builds the package from source code and its build-time dependencies.

In a Nix build environment, you can execute (almost) any build tool. In the GNU tar example, we run a standard GNU Autotools build procedure, but it is also possible to run Apache Ant (for Java software), Python setup tools, Perl’s MakeMaker or CMake and many other tools.

The only catch is that Nix imposes restrictions on what the tools are permitted to do to provide better guarantees that builds are pure, such as:

  • Every package is stored in an isolated directory, not in global directories, such as /lib, /bin or C:\Windows\System32
  • Files are made read-only after build completion
  • Timestamps are reset to 1 second after the epoch
  • Search environment variables are cleared and configured explicitly, e.g. PATH
  • Private temp folders and designated output directories
  • Network access is restricted (except when an output hash is given)
  • Running builds as unprivileged users
  • Chroot environments, namespaces, bind-mounting dependency packages

The most important restriction is the first – in Nix, all packages are stored in a so-called Nix store, in which every package is prefixed by a cryptographic hash code derived from all build inputs, such as: /nix/store/fjh974kzdcab7yp0ibmwwymmgbi6cg59-gnutar-1.30. Because hash prefixes are unique, no package shares the same name and as a result, we can safely store multiple versions and variants of the same package alongside each other in the store.
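
The naming idea can be illustrated outside Nix with a few lines of shell. The following is only a sketch: Nix's real hashing scheme is different (among other things, it covers all build inputs recursively and uses a base-32 encoding), the store_name function and the acl version numbers are made up for the example, and it assumes that sha256sum from GNU Coreutils is available:

```shell
# Illustration only: derive a store-like path from a hash of the build inputs.
# This is NOT the algorithm that Nix uses.
store_name() {
    name=$1
    shift
    # Hash all remaining arguments (the "build inputs"), one per line,
    # and keep the first 32 hexadecimal characters as a prefix
    hash=$(printf '%s\n' "$@" | sha256sum | cut -c1-32)
    echo "/nix/store/$hash-$name"
}

# Two variants of the same package that only differ in one (made up) input
store_name gnutar-1.30 "tar-1.30.tar.xz" "acl-2.2.52"
store_name gnutar-1.30 "tar-1.30.tar.xz" "acl-2.2.53"
```

Both invocations print a path ending in gnutar-1.30, but with a different prefix, so both variants could be stored next to each other. Running the function again with identical inputs yields the same path.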

The result of complementing build tools with these restrictions is that when you build a package with Nix with certain build-time dependencies and you perform the build with the same inputs on another machine, the result will be the exact same (nearly bit-identical) build.

Purity offers many kinds of benefits, such as:

  • Strong dependency completeness guarantees
  • Strong reproducibility guarantees
  • Build only the packages and dependencies that you need
  • Packages that don’t depend on each other can be safely built in parallel
  • Ability to download substitutes from a remote machine (e.g. build server) if the hash prefix is identical
  • Ability to delegate builds to remote machines and be sure that the result is identical to what a local build would have produced

By taking the composition expression (shown earlier) and running nix-build, we can build GNU tar, including all of its build-time dependencies:

$ nix-build all-packages.nix -A gnutar
/nix/store/fjh974kzdcab7yp0ibmwwymmgbi6cg59-gnutar-1.30

The result of the nix-build instruction is a Nix store path that contains a hash code that has been derived from all build inputs.

Building Mendix Deployment Archives (MDAs) with Nix


As explained earlier, in Nix build environments any kind of build tool can be used albeit with purity restrictions.

For Mendix applications, there is also an important artifact that needs to be produced in the deployment lifecycle – the Mendix Deployment Archive (MDA) that captures all relevant files that an application needs to run in production.

MDA files can be produced by running the MxBuild tool. We can also package MxBuild and the Mendix runtime as Nix packages and write our own Nix function abstraction that builds MDA files from Mendix projects:

{stdenv, mxbuild, jdk, nodejs}:
{name, mendixVersion, looseVersionCheck ? false, ...}@args:
 
let mxbuildPkg = mxbuild."${mendixVersion}";
in
stdenv.mkDerivation ({
  buildInputs = [ mxbuildPkg nodejs ];
  installPhase = ''
    mkdir -p $out
    mxbuild --target=package --output=$out/${name}.mda \
      --java-home ${jdk} --java-exe-path ${jdk}/bin/java \
      ${stdenv.lib.optionalString looseVersionCheck "--loose-version-check"} \
      "$(echo *.mpr)"
  '';
} // args)

The above function returns another function taking Mendix-specific parameters (e.g. the name of the project, Mendix version), invokes MxBuild, and stores the resulting MDA file in the Nix store.

By using the function abstraction and a Mendix project created by the modeler, we can build the Mendix project by writing the following Nix expression:

{packageMendixApp}:
 
packageMendixApp {
  name = "conferenceschedule";
  src = /home/sander/ConferenceSchedule-main;
  mendixVersion = "7.13.1";
}

The above expression specifies that we want to build a project named: conferenceschedule, we want to use the Mendix project that is stored in the directory: /home/sander/ConferenceSchedule-main and we want to use Mendix version 7.13.1.

Using NixOS: A Nix-Based Linux Distribution


One objective that all tools in the Nix project have in common is declarative deployment, meaning that you can express the structure of your system, and the tools infer all the activities that need to be carried out to deploy it.

As a Mendix developer, generating an MDA archive is not entirely what we want – what we really want is a system running a Mendix application. To accomplish this, additional deployment activities need to be carried out beyond producing an MDA file.

NixOS is a Linux distribution that extends Nix’s deployment features to complete systems. In addition to the fact that the Nix package manager is being used to deploy all packages (including the Linux kernel) and configuration files, it also deploys entire machine configurations from a single declarative specification:

{pkgs, ...}:
 
{
  boot.loader.grub.device = "/dev/sda";
  fileSystems."/".device = "/dev/sda1";
 
  services = {
    openssh.enable = true;
 
    xserver = {
      enable = true;
      displayManager.sddm.enable = true;
      desktopManager.plasma5.enable = true;
    };
  };
 
  environment.systemPackages = [
    pkgs.firefox
  ];
}

The above code fragment is an example of a NixOS configuration file that captures the following properties:

  • The GRUB bootloader should be installed on the Master Boot Record of the first hard drive (/dev/sda)
  • The first partition of the first hard drive (/dev/sda1) should be mounted as a root partition
  • We want to run OpenSSH and the X Window System as system services
  • We configure the X Window Server to use SDDM as a login manager and the KDE Plasma Desktop as desktop manager.
  • We want to install Mozilla Firefox as an end-user package.

By running a single command-line instruction, we can deploy an entire system configuration with the Nix package manager:

$ nixos-rebuild switch

The result is a running system implementing the configuration described above.

Creating a NixOS Module for Mendix App Containers


To automate the remaining Mendix deployment activities (that need to be carried out after composing an MDA file), we can create a job for systemd (the service manager that NixOS uses) that unpacks the MDA file into a writable directory, creates additional state directories for storing temp files, and configures the runtime by communicating over the admin interface to start the embedded Jetty HTTP service, configure the database and start the app.

Composing a systemd job can be done by adding a systemd configuration setting to a NixOS configuration. The following partial Nix expression shows the overall structure of a systemd job for a Mendix app container:

{pkgs, ...}:
 
{
  systemd.services.mendixappcontainer =
    let
      mendixPkgs = import ../nixpkgs-mendix/top-level/all-packages.nix { inherit pkgs; };
      appContainerConfigJSON = pkgs.writeTextFile { ... };
      configJSON = pkgs.writeTextFile {
        name = "config.json";
        text = builtins.toJSON {
          DatabaseType = "HSQLDB";
          DatabaseName = "myappdb";
          DTAPMode = "D";
        };
      };
      runScripts = mendixPkgs.runMendixApp {
        app = import ../conferenceschedule.nix { inherit (mendixPkgs) packageMendixApp; };
      };
    in {
      enable = true;
      description = "My Mendix App";
      wantedBy = [ "multi-user.target" ];
      environment = {
        M2EE_ADMIN_PASS = "secret";
        M2EE_ADMIN_PORT = "9000";
        MENDIX_STATE_DIR = "/home/mendix";
      };
      serviceConfig = {
        ExecStartPre = "${runScripts}/bin/undeploy-app";
        ExecStart = "${runScripts}/bin/start-appcontainer";
        ExecStartPost = "${runScripts}/bin/configure-appcontainer ${appContainerConfigJSON} ${configJSON}";
      };
    };
}

The above systemd job declaration does the following:

  • It generates JSON configuration files with app container and database settings
  • It composes an environment with environment variables configuring the admin interface
  • It launches scripts: one script before startup that cleans the old state, a start script that starts the app container and a script that runs after startup that configures the app container settings, such as the database

Writing a systemd job as a Nix expression is quite cumbersome and a bit impractical when it is desired to compose NixOS configurations that should run Mendix applications. Fortunately, we can hide all these implementation details behind a more convenient interface by wrapping all Mendix app container properties in a NixOS module.

By importing this NixOS module in a NixOS configuration, we can more concisely express the properties of a system running a Mendix app container:

{pkgs, ...}:
 
{
  require = [ ../nixpkgs-mendix/nixos/modules/mendixappcontainer.nix ];
 
  services = {
    openssh.enable = true;
 
    mendixAppContainer = {
      enable = true;
      adminPassword = "secret";
      databaseType = "HSQLDB";
      databaseName = "myappdb";
      DTAPMode = "D";
      app = import ../../conferenceschedule.nix {
        inherit pkgs;
        inherit (pkgs.stdenv) system;
      };
    };
  };
 
  networking.firewall.allowedTCPPorts = [ 8080 ];
}

The above code fragment is another NixOS configuration that imports the Mendix app container NixOS module. It defines a Mendix app container system service that connects to an in-memory HSQL database, runs the app in development mode, and deploys the MDA file that is the result of building one of our test projects, by invoking the Nix build abstraction function that builds MDAs.

By running a single command-line instruction, we can deploy a machine configuration running our Mendix application:

$ nixos-rebuild switch

After the deployment has succeeded, we should be able to open a web browser and test our app.

In production scenarios, only deploying an app container is not enough to make an application reliably available to end users. We must also deploy a more robust database service, such as PostgreSQL, and use a reverse proxy, such as nginx, to more efficiently serve static files and cache common requests to improve the performance of the application.

It is also possible to extend the NixOS configuration with a PostgreSQL and nginx system service and use the NixOS module system to refer to the relevant properties of a Mendix app container.

Conclusion


This blog post covers tools from the Nix project implementing deployment concepts inspired by purely functional programming languages and declarative programming. These tools offer a number of unique advantages over more traditional deployment tools. Furthermore, we have demonstrated that Mendix application deployments could fit into such a deployment model.

Availability


The Nix build abstraction function for Mendix projects and the NixOS module for running app containers can be obtained from the nixpkgs-mendix repository on GitHub. The functionality should be considered experimental – it is not yet recommended for production usage.

The Nix package manager and NixOS Linux distribution can be obtained from the NixOS website.

This blog post originally appeared on: https://www.mendix.com/blog/automating-mendix-application-deployments-with-nix/

Thursday, July 26, 2018

Layered build function abstractions for building Nix packages

I have shown quite a few Nix expression examples on my blog. When it is desired to write a Nix expression for a package, it is a common habit to invoke the stdenv.mkDerivation {} function, or functions that are abstractions built around it.

For example, if we want to build a package, such as the trivial GNU Hello package, we can write the following expression:

with import <nixpkgs> {};

stdenv.mkDerivation {
  name = "hello-2.10";

  src = fetchurl {
    url = mirror://gnu/hello/hello-2.10.tar.gz;
    sha256 = "0ssi1wpaf7plaswqqjwigppsg5fyh99vdlb9kzl7c9lng89ndq1i";
  };

  meta = {
    description = "A program that produces a familiar, friendly greeting";
    longDescription = ''
      GNU Hello is a program that prints "Hello, world!" when you run it.
      It is fully customizable.
    '';
    homepage = http://www.gnu.org/software/hello/manual/;
    license = "GPLv3+";
  };
}

and build it with the Nix package manager as follows:

$ nix-build
/nix/store/188avy0j39h7iiw3y7fazgh7wk43diz1-hello-2.10

The above code fragment probably does not look too complicated and is quite easy to repeat for building other kinds of GNU Autotools/GNU Make-based packages. However, stdenv.mkDerivation {} is a big/complex function abstraction that has many responsibilities.

Its most important responsibility is to compose so-called pure build environments, in which various restrictions are imposed on the build scripts to provide better guarantees that builds are pure (meaning that they always produce the same, nearly bit-identical, result if the dependencies are the same), such as:

  • Build scripts can only write to designated output directories and temp directories. They are restricted from writing to any other file system location.
  • All environment variables are cleared and some of them are set to default or dummy values, such as search path environment variables (e.g. PATH).
  • All build results are made immutable by removing the write permission bits and their timestamps are reset to one second after the epoch.
  • Builds run as unprivileged users.
  • Optionally, builds run in a chroot environment and use namespaces to restrict access to the host filesystem and the network as much as possible.
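
The third restriction (immutable results with reset timestamps) can be approximated with ordinary shell commands. The following is a rough sketch, not how Nix implements it internally: the seal_output function is a hypothetical name, and the sketch relies on the GNU versions of touch and find:

```shell
# Rough approximation of sealing a build result; relies on GNU touch/find.
seal_output() {
    # Reset all modification times to one second after the epoch...
    find "$1" -exec touch -d @1 {} +
    # ...and remove the write permission bits from everything
    chmod -R a-w "$1"
}
```

After running seal_output on a directory, every file in it reports a modification time of 1 (seconds since the epoch) and can no longer be modified by ordinary users.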

In addition to purity, the stdenv.mkDerivation {} function has many additional responsibilities. For example, it also implements a generic builder that is clever enough to build a GNU Autotools/GNU Make project without specifying any build instructions.

For example, the above Nix expression for GNU Hello does not specify any build instructions. The generic builder automatically unpacks the tarball, opens the resulting directory and invokes ./configure --prefix=$out; make; make install with the appropriate parameters.
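
To make this more concrete, a heavily simplified generic builder could look like the sketch below. It is an illustration under my own assumptions, not the implementation in Nixpkgs: the generic_build function is hypothetical, it expects a tarball with a single top-level directory, and it simply skips the steps for which no configure script or Makefile exists:

```shell
# Heavily simplified, hypothetical generic builder.
# $1: source tarball, $2: output directory (the role that $out plays in Nix)
generic_build() (
    set -e
    src=$1
    out=$2
    workdir=$(mktemp -d)

    # Unpack the tarball and enter its single top-level directory
    tar xf "$src" -C "$workdir"
    cd "$workdir"/*/

    # Run the standard GNU Autotools/GNU Make procedure where applicable
    if [ -x ./configure ]; then
        ./configure --prefix="$out"
    fi
    if [ -f Makefile ]; then
        make
        make install
    fi
)
```

Because the function body runs in a subshell, changing the working directory does not affect the caller.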

Because stdenv.mkDerivation {} has many responsibilities and nearly all packages in Nixpkgs depend on it, its implementation is very complex (e.g. thousands of lines of code) and hard to change.

As a personal exercise, I have developed a function abstraction with similar functionality from scratch. My implementation can be decomposed into layers in which every abstraction layer gradually adds additional responsibilities.

Writing "raw" derivations


stdenv.mkDerivation is a function abstraction, not a feature of the Nix expression language. To compose "pure" build environments, stdenv.mkDerivation invokes a Nix expression language construct -- the derivation {} builtin.

(As a sidenote: derivation is strictly speaking not a builtin, but an abstraction built around the derivationStrict builtin, but this is something internal to the Nix package manager. It does not matter for the scope of this blog post).

Despite the fact that this low level function is not commonly used, it is also possible to directly invoke it and compose low-level "raw" derivations to build packages. For example, we can write the following Nix expression (default.nix):

derivation {
  name = "test";
  builder = ./test.sh;
  system = "x86_64-linux";
  person = "Sander";
}

The above expression invokes the derivation builtin function that composes a "pure" build environment:

  • The name attribute specifies the name of the package, that should appear in the resulting Nix store path.
  • The builder attribute specifies that the test.sh executable should be run inside the pure build environment.
  • The system attribute is used to tell Nix that this build should be carried out for x86-64 Linux systems. When Nix is unable to build the package for the requested system architecture, it can also delegate a build to a remote machine that is capable.
  • All attributes (including the attributes described earlier) are converted to environment variables (e.g. strings, numbers and URLs are converted to strings and the boolean value: 'true' is converted to '1') and can be used by the builder process for a variety of reasons.

We can implement the builder process (the test.sh build script) as follows:

#!/bin/sh -e

echo "Hello $person" > $out

The above script generates a greeting message for the provided person (exposed as an environment variable by Nix) and writes it to the Nix store (the output path is provided by the out environment variable).

We can evaluate the Nix expression (and generate the output file with the Hello greeting) by running:

$ nix-build
/nix/store/7j4y5d8rx1vah5v64bpqd5dskhwx5105-test
$ cat result
Hello Sander
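
The way in which attributes reach the builder can be mimicked without Nix. The following sketch starts the same build instruction in an emptied environment (env -i) and passes the attributes as environment variables. The scratch directory is only a stand-in for the Nix store, and none of Nix's other restrictions apply:

```shell
#!/bin/sh -e
# Rough emulation of how the builder receives its parameters: start from an
# empty environment and expose every attribute as an environment variable.
# The scratch directory stands in for the Nix store; it is NOT a store path.
scratch=$(mktemp -d)
out="$scratch/test"

env -i person="Sander" out="$out" /bin/sh -e -c 'echo "Hello $person" > "$out"'

cat "$out"
```

Any variable that is not passed explicitly (including PATH) is absent in the child process, which is why the builder in this sketch only uses the shell's built-in echo command.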

The return value of the derivation {} function is a bit confusing. At first sight, it appears to be a string corresponding to the output path in the Nix store. However, some investigation with the nix repl tool reveals that it is much more than that:

$ nix repl
Welcome to Nix version 2.0.4. Type :? for help.

when importing the derivation:

nix-repl> test = import ./default.nix

and describing the result:

nix-repl> :t test
a set

we will see that the result is actually an attribute set, not a string. By requesting the attribute names, we will see the following attributes:

nix-repl> builtins.attrNames test
[ "all" "builder" "drvAttrs" "drvPath" "name" "out" "outPath" "outputName" "person" "system" "type" ]

It appears that the resulting attribute set has the same attributes as the parameters that we passed to derivation, augmented by the following additional attributes:

  • The type attribute that refers to the string: "derivation".
  • The drvAttrs attribute refers to an attribute set containing the original parameters passed to derivation {}.
  • drvPath and outPath refer to the Nix store paths of the store derivation file and output of the build. A side effect of requesting these members is that the expression gets evaluated or built.
  • The out attribute is a reference to the derivation producing the out result, all is a list of derivations of all outputs produced (Nix derivations can also produce multiple output paths in the Nix store).
  • In case there are multiple outputs, the outputName determines the name of the output path that is the default.

Providing basic dependencies


Although we can use the low-level derivation {} function to produce a very simple output file in the Nix store, it is not very useful on its own.

One important limitation is that we only have a (Bourne-compatible) shell (/bin/sh), but no other packages in the "pure" build environment. Nix prevents unspecified dependencies from being found to make builds more pure.

Since a pure build environment is almost entirely empty (with the exception of the shell), the number of things we can do in an environment created by derivation {} is very limited. Most of the commands that build scripts run are provided by executables belonging to external packages, e.g. cat and ls (GNU Coreutils), grep (GNU Grep) or make (GNU Make), and these packages should be added to the PATH search environment variable in the build environment.

We may also want to configure additional environment variables to make builds more pure -- for example, on Linux systems, we want to set the TZ (timezone) environment variable to UTC to prevent error messages, such as: "Local time zone must be set--see zic manual page".

To make the execution of more complex build scripts more convenient, we can create a setup script that we can include in every build script. It adds basic utilities to the PATH search environment variable, configures these additional environment variables, and sets the SHELL environment variable to the bash shell residing in the Nix store. We can create a package named: stdenv that provides a setup script to accomplish this:

{bash, basePackages, system}:

let
  shell = "${bash}/bin/sh";
in
derivation {
  name = "stdenv";
  inherit shell basePackages system;
  builder = shell;
  args = [ "-e" ./builder.sh ];
}

The builder script of the stdenv package can be implemented as follows:

set -e

# Setup PATH for base packages
for i in $basePackages
do
    basePackagesPath="$basePackagesPath${basePackagesPath:+:}$i/bin"
done

export PATH="$basePackagesPath"

# Create setup script
mkdir $out
cat > $out/setup <<EOF
export SHELL=$shell
export PATH="$basePackagesPath"
EOF

# Allow the user to install stdenv using nix-env and get the packages
# in stdenv.
mkdir $out/nix-support
echo "$basePackages" > $out/nix-support/propagated-user-env-packages

The above script adds all base packages (GNU Coreutils, Findutils, Diffutils, sed, grep, gawk and bash) to the PATH of the builder and creates a script in $out/setup that exports the PATH environment variable and the location of the bash shell.
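
The ${basePackagesPath:+:} expansion in the loop inserts a colon only when the variable is already non-empty, so the resulting search path never starts with a separator. A minimal standalone sketch of that idiom follows; the two store paths are made up purely for illustration:

```shell
# Hypothetical package paths; real ones would be Nix store paths
basePackages="/nix/store/aaa-coreutils /nix/store/bbb-gnugrep"

basePackagesPath=""
for i in $basePackages
do
    # ${var:+:} expands to ":" only if var is set and non-empty
    basePackagesPath="$basePackagesPath${basePackagesPath:+:}$i/bin"
done

echo "$basePackagesPath"
```

The first iteration appends only the directory, every later iteration appends a colon followed by the directory.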

We can use the stdenv (providing this setup script) as a dependency for building a package, such as:

{stdenv}:

derivation {
  name = "hello";
  inherit stdenv;
  builder = ./builder.sh;
  system = "x86_64-linux";
}

In the corresponding builder script, we first include the setup script and then invoke various external commands to generate a shell script that says: "Hello":

#!/bin/sh -e
source $stdenv/setup

mkdir -p $out/bin

cat > $out/bin/hello <<EOF
#!$SHELL -e

echo "Hello"
EOF

chmod +x $out/bin/hello

The above script works because the setup script adds GNU Coreutils (that includes cat, mkdir and chmod) to the PATH of the builder.

Writing more simple derivations


Using a setup script makes writing build scripts somewhat practical, but there are still a number of inconveniences we have to cope with.

The first inconvenience is the system parameter -- in most cases, we want to build a package for the same architecture as the host system's architecture and preferably we want the same architecture for all other packages that we intend to deploy.

Another issue is the shell. In sandbox-enabled Nix installations, /bin/sh is a minimal Bourne-compatible shell provided by Busybox; in non-sandboxed installations, it is a reference to the host system's shell. The latter case could be considered an impurity, because we do not know what kind of shell (e.g. bash, dash, ash?) or which version of a shell we are using (e.g. 3.2.57, 4.3.30?). Ideally, we want to use a shell that is provided as a Nix package in the Nix store, because that version is pure.

(As a sidenote: in Nixpkgs, we use the bash shell to run build commands, but this is not a strict requirement. For example, GNU Guix (a package manager that uses several components of the Nix package manager) uses Guile as both the host and guest language. In theory, we could also launch a different kind of interpreter than bash).

The third issue is the meta parameter -- for every package, it is possible to specify meta-data, such as a description, license and homepage reference as an attribute set. Unfortunately, attribute sets cannot be converted to environment variables. To deal with this problem, the meta attribute needs to be removed before we invoke derivation {} and be readded to the returned attribute set. (I believe this ideally should be something the Nix package manager could solve by itself).

We can hide all these inconveniences by creating a simple abstraction function that I will call: stdenv.simpleDerivation that can be implemented as follows:

{stdenv, system, shell}:
{builder, ...}@args:

let
  extraArgs = removeAttrs args [ "builder" "meta" ];

  buildResult = derivation ({
    inherit system stdenv;
    builder = shell; # Make bash the default builder
    args = [ "-e" builder ]; # Pass builder executable as parameter to bash
    setupSimpleDerivation = ./setup.sh;
  } // extraArgs);
in
buildResult //
# Readd the meta attribute to the resulting attribute set
(if args ? meta then { inherit (args) meta; } else {})

The above Nix expression basically removes the meta argument, then invokes the derivation {} function, sets the system parameter, uses bash as builder and passes the builder executable as an argument to bash. Finally, the meta attribute gets readded to the resulting attribute set.

With this abstraction, we can reduce the complexity of the previously shown Nix expression to something very simple:

{stdenv}:

stdenv.simpleDerivation {
  name = "hello";
  builder = ./builder.sh;
  meta = {
    description = "This is a simple testcase";
  };
}

The function abstraction is also sophisticated enough to build something more complex, such as GNU Hello. We can write the following Nix expression that passes all dependencies that it requires as function parameters:

{stdenv, fetchurl, gnumake, gnutar, gzip, gcc, binutils}:

stdenv.simpleDerivation {
  name = "hello-2.10";
  src = fetchurl {
    url = mirror://gnu/hello/hello-2.10.tar.gz;
    sha256 = "0ssi1wpaf7plaswqqjwigppsg5fyh99vdlb9kzl7c9lng89ndq1i";
  };
  inherit stdenv gnumake gnutar gzip gcc binutils;
  builder = ./builder.sh;
}

We can use the following builder script to build GNU Hello:

source $setupSimpleDerivation

export PATH=$PATH:$gnumake/bin:$gnutar/bin:$gzip/bin:$gcc/bin:$binutils/bin

tar xfv $src
cd hello-2.10
./configure --prefix=$out
make
make install

The above script imports a setup script configuring basic dependencies, then extends the PATH environment variable with additional dependencies, and then executes the commands to build GNU Hello -- unpacking the tarball, running the configure script, building the project, and installing the package.

The run command abstraction


We can still improve a bit upon the function abstraction shown previously -- one particular inconvenience that remains is that you have to write two files to get a package built -- a Nix expression that composes the build environment and a builder script that carries out the build steps.

Another repetitive task is configuring search path environment variables (e.g. PATH, PYTHONPATH, CLASSPATH etc.) to point to the appropriate directories in the Nix store. As may be noticed by looking at the code of the previous builder script, this process is tedious.

To address these inconveniences, I have created another abstraction function called: stdenv.runCommand that extends the previous abstraction function -- when no builder parameter has been provided, this function executes a generic builder that will evaluate the buildCommand environment variable containing a string with shell commands to execute. This feature allows us to rewrite the first example (that generates a shell script) to one file:

{stdenv}:

stdenv.runCommand {
  name = "hello";
  buildCommand = ''
    mkdir -p $out/bin
    cat > $out/bin/hello <<EOF
    #! ${stdenv.shell} -e

    echo "Test"
    EOF
    chmod +x $out/bin/hello
  '';
}
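
The generic builder itself is not shown here. As a rough sketch, assuming the commands arrive as a string in the buildCommand environment variable and are simply handed to eval, it could look as follows. The function name runBuildCommand is hypothetical, and a scratch directory stands in for the output path:

```shell
# Hypothetical sketch: evaluate the buildCommand variable if it is set
runBuildCommand()
{
    if [ -n "$buildCommand" ]
    then
        eval "$buildCommand"
    else
        echo "No buildCommand provided" >&2
        return 1
    fi
}

# Usage: a scratch directory stands in for the Nix store output path
out=$(mktemp -d)
buildCommand='mkdir -p $out/bin
echo "Test" > $out/bin/hello'

runBuildCommand
cat "$out/bin/hello"
```

Because the commands are evaluated in the builder's own shell, they can use every variable and function that the setup script has defined.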

Another feature of the stdenv.runCommand abstraction is to provide a generic mechanism to configure build-time dependencies -- all build-time dependencies that a package needs can be provided as a list of buildInputs. The generic builder carries out all necessary build steps to make them available. For example, when a package provides a bin/ sub folder, then it will be automatically added to the PATH environment variable.
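
To illustrate the bin/ handling, here is a minimal sketch of how a generic builder could walk over the build inputs. It assumes that buildInputs reaches the builder as a space-separated list of paths; the helper name addBuildInputsToPath is hypothetical:

```shell
# Hypothetical helper: add the bin/ sub folder of each build input to PATH
addBuildInputsToPath()
{
    for input in $buildInputs
    do
        # Only packages that actually provide executables are added
        if [ -d "$input/bin" ]
        then
            PATH="$input/bin${PATH:+:}$PATH"
        fi
    done
    export PATH
}

# Usage: two scratch directories stand in for packages in the Nix store
pkgWithBin=$(mktemp -d)
pkgWithoutBin=$(mktemp -d)
mkdir "$pkgWithBin/bin"

buildInputs="$pkgWithBin $pkgWithoutBin"
addBuildInputsToPath
echo "$PATH"
```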

Every package can bundle a setup-hook.sh script that modifies the build environment so that it knows how dependencies for this package can be configured. For example, the following partial expression represents the Perl package that bundles a setup hook:

{stdenv, ...}:

stdenv.mkDerivation {
  name = "perl";
  ...
  setupHook = ./setup-hook.sh;
}

The setup hook can automatically configure the PERL5LIB search path environment variable for all packages that provide Perl modules:

addPerlLibPath()
{
    addToSearchPath PERL5LIB $1/lib/perl5/site_perl
}

envHooks+=(addPerlLibPath)

When we add perl as a build input to a package, then its setup hook configures the generic builder in such a way that the PERL5LIB environment variable is automatically configured when we provide a Perl module as a build input.
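
The following sketch shows one way the pieces could fit together. It is an assumption, not the actual implementation: addToSearchPath is reimplemented here in a simplified form, runEnvHooks is a hypothetical name, and a space-separated string replaces the bash array so that the example also runs in a plain POSIX shell:

```shell
# Simplified stand-in for addToSearchPath: append a directory to the
# variable named by $1, but only when the directory exists
addToSearchPath()
{
    varName="$1"
    dir="$2"

    if [ -d "$dir" ]
    then
        eval "$varName=\"\${$varName:+\$$varName:}\$dir\""
        export "$varName"
    fi
}

# The hook that the Perl package would register
addPerlLibPath()
{
    addToSearchPath PERL5LIB "$1/lib/perl5/site_perl"
}

envHooks=""
envHooks="$envHooks addPerlLibPath"

# Hypothetical dispatcher: run every registered hook for every build input
runEnvHooks()
{
    for input in $buildInputs
    do
        for hook in $envHooks
        do
            "$hook" "$input"
        done
    done
}

# Usage: one scratch directory mimics a Perl module, the other does not
perlModule=$(mktemp -d)
otherPackage=$(mktemp -d)
mkdir -p "$perlModule/lib/perl5/site_perl"

unset PERL5LIB
buildInputs="$perlModule $otherPackage"
runEnvHooks
echo "$PERL5LIB"
```

Only the input that contains lib/perl5/site_perl ends up in PERL5LIB; the other input is silently ignored.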

We can also more conveniently build GNU Hello, by using the buildInputs parameter:

{stdenv, fetchurl, gnumake, gnutar, gzip, gcc, binutils}:

stdenv.runCommand {
  name = "hello-2.10";
  src = fetchurl {
    url = mirror://gnu/hello/hello-2.10.tar.gz;
    sha256 = "0ssi1wpaf7plaswqqjwigppsg5fyh99vdlb9kzl7c9lng89ndq1i";
  };
  buildInputs = [ gnumake gnutar gzip gcc binutils ];
  buildCommand = ''
    tar xfv $src
    cd hello-2.10
    ./configure --prefix=$out
    make
    make install
  '';
}

Compared to the previous GNU Hello example, this Nix expression is much simpler and more intuitive to write.

The run phases abstraction


We can improve the ease of use for build processes even further. GNU Hello, many other GNU packages, and much other system software used on Linux are based on GNU Autotools/GNU Make and follow similar conventions, including the build commands you need to carry out. Likewise, many other software projects use standardized build tools that follow conventions.

As a result, when you have to maintain a collection of packages, you probably end up writing the same kinds of build instructions over and over again.

To alleviate this problem, I have created another abstraction layer, named: stdenv.runPhases making it possible to define and execute phases in a specific order. Every phase has a pre and post hook (scripts that execute before and after the phase) and can be enabled or disabled with a do* or dont* flag.

With this abstraction function, we can divide builds into phases, such as:

{stdenv}:

stdenv.runPhases {
  name = "hello";
  phases = [ "build" "install" ];
  buildPhase = ''
    cat > hello <<EOF
    #! ${stdenv.shell} -e
    echo "Hello"
    EOF
    chmod +x hello
  '';
  installPhase = ''
    mkdir -p $out/bin
    mv hello $out/bin
  '';
}

The above Nix expression executes a build and install phase. In the build phase, we construct a script that echoes "Hello" and make it executable, and in the install phase we move the script into the Nix store.
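
A generic phase runner could be sketched as follows. This is an assumption about how such a mechanism might work: phases and hooks are read from variables named after the phase (for example preBuild, buildPhase, postBuild), a non-empty dont<Phase> variable skips the phase, and the helper names capitalize, evalVar and runPhases are hypothetical. The do* flags are left out for brevity:

```shell
# Turn "build" into "Build" so that hook names can be derived
capitalize()
{
    first=$(printf '%s' "$1" | cut -c1 | tr '[:lower:]' '[:upper:]')
    rest=$(printf '%s' "$1" | cut -c2-)
    printf '%s%s' "$first" "$rest"
}

# Evaluate the commands stored in the variable with the given name, if any
evalVar()
{
    eval "commands=\${$1}"
    [ -z "$commands" ] || eval "$commands"
}

# Run all phases in order, wrapped in their pre and post hooks
runPhases()
{
    for phase in $phases
    do
        suffix=$(capitalize "$phase")

        # A non-empty dont<Phase> variable disables the phase
        eval "skip=\${dont$suffix}"
        if [ -n "$skip" ]
        then
            continue
        fi

        evalVar "pre$suffix"
        evalVar "${phase}Phase"
        evalVar "post$suffix"
    done
}

# Usage: record the execution order in a variable
log=""
phases="build install"
preBuild='log="$log preBuild"'
buildPhase='log="$log build"'
postBuild='log="$log postBuild"'
installPhase='log="$log install"'
dontInstall=1

runPhases
echo "$log"
```

Since the hook and flag names are derived from the phase name, a newly defined phase gets its hooks and its dont flag without any additional code.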

In addition to environment variables, it is also possible to define the phases in a setup script as shell functions. For example, we can also use a builder script:

{stdenv}:

stdenv.runPhases {
  name = "hello2";
  builder = ./builder.sh;
}

and define the phases in the builder script:

source $setupRunPhases

phases="build install"

buildPhase()
{
    cat > hello <<EOF
#! $SHELL -e
echo "Hello"
EOF
    chmod +x hello
}

installPhase()
{
    mkdir -p $out/bin
    mv hello $out/bin
}

genericBuild

Another feature of this abstraction is that we can also define exitHook and failureHook parameters that will be executed if the builder succeeds or fails.
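
As a small sketch of the idea, assuming that exitHook is evaluated after a successful run and failureHook after a failed one (the wrapper name runWithHooks is hypothetical):

```shell
# Hypothetical wrapper: run a command and evaluate the matching hook
runWithHooks()
{
    if "$@"
    then
        [ -z "$exitHook" ] || eval "$exitHook"
    else
        status=$?
        [ -z "$failureHook" ] || eval "$failureHook"
        return $status
    fi
}

# Usage: the hooks only record which branch was taken
exitHook='result="succeeded"'
failureHook='result="failed"'

runWithHooks true
echo "$result"

runWithHooks false || echo "$result"
```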

In the next sections, I will show abstractions built on top of stdenv.runPhases that can be used to hide implementation details of common build procedures.

The generic build abstraction


For many build procedures, we need to carry out the same build steps, such as: unpacking the source archives, applying patches, and stripping debug symbols from the resulting ELF executables.

I have created another build function abstraction named: stdenv.genericBuild that implements a number of common build phases:

  • The unpack phase generically unpacks the provided sources, makes their contents writable and enters the source directory. The unpack command is determined by the unpack hook that each potential unpacker provides -- for example, the GNU tar package includes a setup hook that untars the file if it looks like a tarball or compressed tarball:

    _tryUntar()
    {
        case "$1" in
            *.tar|*.tar.gz|*.tar.bz2|*.tar.lzma|*.tar.xz)
                tar xfv "$1"
                ;;
            *)
                return 1
                ;;
        esac
    }
    
    unpackHooks+=(_tryUntar)
    
  • The patch phase applies any patch that is provided by the patches parameter uncompressing them when necessary. The uncompress file operation also works with setup hooks -- uncompressor packages (such as gzip and bzip2) provide a setup hook that uncompresses the file if it is of the right filetype.
  • The strip phase processes all sub directories containing ELF binaries (e.g. bin/ and lib/) and strips their debugging symbols. This reduces the size of the binaries and removes non-deterministic timestamps.
  • The patchShebangs phase processes all scripts with a shebang line and changes it to correspond to a path in the Nix store.
  • The compressManPages phase compresses all manual pages with gzip.
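
For the unpack phase described above, the dispatch over the registered unpack hooks could be sketched like this. The function name unpackFile is hypothetical, and a space-separated string replaces the bash array so that the example runs in a plain POSIX shell. Each hook is tried in turn until one of them accepts the file:

```shell
# Unpack hook for tarballs, as a tar package could provide it
_tryUntar()
{
    case "$1" in
        *.tar|*.tar.gz|*.tar.bz2|*.tar.lzma|*.tar.xz)
            tar xf "$1"
            ;;
        *)
            return 1
            ;;
    esac
}

unpackHooks=""
unpackHooks="$unpackHooks _tryUntar"

# Hypothetical dispatcher: the first hook that succeeds wins
unpackFile()
{
    for hook in $unpackHooks
    do
        if "$hook" "$1"
        then
            return 0
        fi
    done

    echo "Do not know how to unpack: $1" >&2
    return 1
}

# Usage: create a small tarball in a scratch directory and unpack it again
workDir=$(mktemp -d)
cd "$workDir"
mkdir hello-1.0
echo "hi" > hello-1.0/README
tar cf hello-1.0.tar hello-1.0
rm -r hello-1.0

unpackFile hello-1.0.tar
cat hello-1.0/README
```

A package that handles another archive format only has to append its own hook to the list; the dispatcher itself stays unchanged.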

We can also add GNU patch as a base package for this abstraction function, since it is required to execute the patch phase. As a result, it does not need to be specified as a build dependency for each package.

This function abstraction alone is not very useful, but it captures all common aspects shared by projects using most build tools, such as GNU Make, CMake or SCons.

I can reduce the size of the previously shown GNU Hello example Nix expression to the following:

{stdenv, fetchurl, gnumake, gnutar, gzip, gcc, binutils}:

stdenv.genericBuild {
  name = "hello-2.10";
  src = fetchurl {
    url = mirror://gnu/hello/hello-2.10.tar.gz;
    sha256 = "0ssi1wpaf7plaswqqjwigppsg5fyh99vdlb9kzl7c9lng89ndq1i";
  };
  buildInputs = [ gnumake gnutar gzip gcc binutils ];
  buildCommandPhase = ''
    ./configure --prefix=$out
    make
    make install
  '';
}

In the above expression, I no longer have to specify how to unpack the downloaded GNU Hello source tarball.

GNU Make/GNU Autotools abstraction


We can extend the previous function abstraction even further with phases that automate a complete GNU Make/GNU Autotools based workflow. This abstraction is what we can call stdenv.mkDerivation and is comparable in terms of features with the implementation in Nixpkgs.

We can adjust the phases to include a configure, build, check and install phase. The configure phase checks whether a configure script exists and executes it. The build, check and install phases will execute: make, make check and make install with appropriate parameters.
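
A minimal sketch of such phases is shown below. The variable names configureFlags and makeFlags are assumptions, and the real phases presumably pass more parameters. The usage part only exercises the configure phase, with a fake configure script that records its arguments:

```shell
# Run the configure script only when the project provides one
configurePhase()
{
    if [ -x ./configure ]
    then
        ./configure --prefix="$out" $configureFlags
    else
        echo "No configure script found, skipping" >&2
    fi
}

# The remaining phases delegate to make (defined but not invoked below)
buildPhase()
{
    make $makeFlags
}

checkPhase()
{
    make check $makeFlags
}

installPhase()
{
    make install $makeFlags
}

# Usage: a fake configure script that logs the arguments it receives
projectDir=$(mktemp -d)
cd "$projectDir"
cat > configure <<'EOF'
#!/bin/sh
echo "$@" > configure.log
EOF
chmod +x configure

out=/tmp/hello-out # hypothetical output path, never written to
configurePhase
cat configure.log
```

Projects without a configure script pass through the configure phase untouched, which is what allows plain GNU Make projects to reuse the same phases.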

We can also add common packages that we need to build these projects as base packages so that they no longer have to be provided as a build input: GNU Tar, gzip, bzip2, xz, GNU Make, Binutils and GCC.

With these additional phases and base packages, we can reduce the GNU Hello example to the following expression:

{stdenv, fetchurl}:

stdenv.mkDerivation {
  name = "hello-2.10";
  src = fetchurl {
    url = mirror://gnu/hello/hello-2.10.tar.gz;
    sha256 = "0ssi1wpaf7plaswqqjwigppsg5fyh99vdlb9kzl7c9lng89ndq1i";
  };
}

The above Nix expression does not contain any installation instructions -- the generic builder is able to figure out all steps on its own.

Composing custom function abstractions


I have shown several build abstraction layers implementing most features that are in the Nixpkgs version of stdenv.mkDerivation. Aside from clarity, another objective of splitting this function in layers is to make the composition of custom build abstractions more convenient.

For example, we can implement the trivial builder named: writeText whose only responsibility is to write a text file into the Nix store, by extending stdenv.runCommand. This abstraction suffices because writeText does not require any build tools, such as GNU Make and GCC, and it also does not need any generic build procedure executing phases:

{stdenv}:

{ name # the name of the derivation
, text
, executable ? false # run chmod +x ?
, destination ? ""   # relative path appended to $out eg "/bin/foo"
, checkPhase ? ""    # syntax checks, e.g. for scripts
}:

stdenv.runCommand {
  inherit name text executable;
  passAsFile = [ "text" ];

  # Pointless to do this on a remote machine.
  preferLocalBuild = true;
  allowSubstitutes = false;

  buildCommand = ''
    target=$out${destination}
    mkdir -p "$(dirname "$target")"

    if [ -e "$textPath" ]
    then
        mv "$textPath" "$target"
    else
        echo -n "$text" > "$target"
    fi

    [ "$executable" = "1" ] && chmod +x "$target" || true
  '';
}

We can also make a builder for Perl packages, by extending: stdenv.mkDerivation -- Perl packages also use GNU Make as a build system. Its only difference is the configuration step -- it runs Perl's MakeMaker script to generate the Makefile. We can simply replace the configuration phase for GNU Autotools by an implementation that invokes MakeMaker.

When developing custom abstractions, I basically follow this pattern:

{stdenv, foo, bar}:
{name, buildInputs ? [], ...}@args:

let
  extraArgs = removeAttrs args [ "name" "buildInputs" ];
in
stdenv.someBuildFunction ({
  name = "mypackage-"+name;
  buildInputs = [ foo bar ] ++ buildInputs;
} // extraArgs)

  • A build function is a nested function in which the first line is a function header that captures the common build-time dependencies required to build a package. For example, when we want to build Perl packages, then perl is such a common dependency.
  • The second line is the inner function header that captures the parameters that should be passed to the build function. The notation allows an arbitrary number of parameters. The parameters in the { } block (name, buildInputs) are considered to have a specific use in the body of the function. The remainder of parameters are non-essential -- they are used as environment variables in the builder environment or they can be propagated to other functions.
  • We compose an extraArgs variable that contains all non-essential arguments that we can propagate to the build function. Basically, all function arguments that are used in the body need to be removed, as well as function arguments that are attribute sets, because they cannot be converted to strings.
  • In the body of the function, we set up important aspects of the build environment, such as the mandatory build parameters, and we propagate the remaining function arguments to the builder abstraction function.

Following this pattern also ensures that the builder is flexible enough to be extended and modified. For example, by extending a function that is based on stdenv.runPhases the builder can be extended with custom phases and build hooks.

Discussion


In this blog post, I have derived my own reimplementation of Nixpkgs's stdenv.mkDerivation function that consists of the following layers each gradually adding functionality to the "raw" derivation {} builtin:

  1. "Raw" derivations
  2. The setup script ($stdenv/setup)
  3. Simple derivation (stdenv.simpleDerivation)
  4. The run command abstraction (stdenv.runCommand)
  5. The run phases abstraction (stdenv.runPhases)
  6. The generic build abstraction (stdenv.genericBuild)
  7. The GNU Make/GNU Autotools abstraction (stdenv.mkDerivation)

The features that the resulting stdenv.mkDerivation provides are very similar to the Nixpkgs version, but not entirely identical. Most notably, cross compiling support is completely absent.

From this experience, I have a number of improvement suggestions that we may want to implement in the Nixpkgs version to improve the quality and clarity of the generic builder infrastructure:

  • We could also split the implementation of stdenv.mkDerivation and the corresponding setup.sh script into layered sub functions. Currently, the setup.sh script is huge (over 1200 LOC) and has many responsibilities (perhaps too many). By splitting the build abstraction functions and their corresponding setup scripts, we can separate concerns better and reduce the size of the script so that it becomes more readable and more maintainable.
  • In the Nixpkgs implementation, the phases that the generic builder executes are built for GNU Make/GNU Autotools specifically. Furthermore, the invocation of pre and post hooks and do and dont flags are all hand coded for every phase (there is no generic mechanism that deals with them). As a result, when you define a new custom phase, you need to reimplement the same aspects over and over again. In my implementation, you only have to define phases -- the generic builder automatically executes the corresponding pre and post hooks and evaluates the do and dont flags.
  • In the Nixpkgs implementation there is no uncompressHook -- as a result, the decompression of patch files is completely handcoded for every uncompressor, e.g. gzip, bzip2, xz etc. In my implementation, we can delegate this responsibility to any potential uncompressor package.
  • In my implementation, I turned some of the phases of the generic builder into command-line tools that can be invoked outside the build environment (e.g. patch-shebangs, compress-man). This makes it easier to experiment with these tools and to make adjustments.

The biggest benefit of having separated concerns is flexibility when composing custom abstractions -- for example, the writeText function in Nixpkgs is built on top of stdenv.mkDerivation, which includes GNU Make and GCC as dependencies, even though writeText does not depend on them. As a result, when one of these packages gets updated, all generated text files need to be rebuilt as well, while there is no real dependency on them. When using a more minimalistic function, such as stdenv.runCommand, this problem goes away.

Availability


I have created a new GitHub repository called: nix-lowlevel-experiments. It contains the implementation of all function abstractions described in this blog post, including some test cases that demonstrate how these functions can be used.

In the future, I will probably experiment with other low level Nix concepts and add them to this repository as well.