Wednesday, November 28, 2012

On Nix and GNU Guix


Quite recently, the GNU project has announced Guix, a new package manager for the GNU system. Guix is described on their website as:

GNU Guix is a purely functional package manager, and associated free software distribution, for the GNU system. In addition to standard package management features, Guix supports transactional upgrades and roll-backs, unprivileged package management, per-user profiles, and garbage collection.

The announcement has apparently attracted quite a lot of attention and it has been in the news quite a lot, such as on Linux Weekly News, Phoronix and Reddit. As my frequent readers may probably notice, this description looks very much like the Nix package manager.

GNU Guix


In fact, Guix is not a new package package manager -- it's using several crucial components of the Nix package manager and gains the described unique deployment properties, such as transactional upgrades, from them.

What Guix basically provides is a new language front-end. The Nix package manager has its own domain-specific language (DSL), called the Nix expression language, to specify how packages can be built from source code and its required dependencies. I have shown many examples of Nix expressions in earlier blog posts. The Nix expression language is an external DSL, meaning that it has a custom syntax and parser to process the language.

Guix provides a different front-end using GNU Guile -- a Scheme programming language interpreter, which is blessed as the official extension language in the GNU project and embedded in a number free software programs, such as TeXmacs and Lilypond. Guix provides an internal DSL (or embedded DSL), meaning that it uses a general purpose host language (in this case Scheme) and its features to implement a DSL.

Furthermore, the Guix repository contains a small set of package specifications (comparable to Nixpkgs) that can be used to deploy a small subset of a system. The developers have the intention to allow a full GNU system to be deployed from these specifications at some point in the future.

A comparison of package specifications


So how does the way packages are specified in Nix and Guix differ to each other? Using the Nix package manager, a package such as GNU cpio, is specified as follows:
{stdenv, fetchurl}:

stdenv.mkDerivation {
  name = "cpio-2.11";

  src = fetchurl {
    url = mirror://gnu/cpio/cpio-2.11.tar.bz2;
    sha256 = "1gavgpzqwgkpagjxw72xgxz52y1ifgz0ckqh8g7cckz7jvyhp0mv";
  };

  patches = [ ./cpio-gets-undeclared.patch ];

  meta = {
    homepage = http://www.gnu.org/software/cpio/;
    longDescription = ''
      GNU cpio copies ...
    '';
    license = "GPLv3+";
  };
}

The above code fragment defines a function in the Nix expression language taking 2 arguments: stdenv is a component providing a collection of standard UNIX utilities and build tools, such as: cat, ls, gcc and make. fetchurl is used to download a file from an external source.

In the remainder of the expression, we do a function invocation to stdenv.mkDerivation, which is the Nix-way of describing a package build operation. As function arguments, we provide a package name, the source code (src which is bound to a function that fetches the tarball from a GNU mirror), a patch that fixes a certain issue and some build instructions. If the build instructions are omitted (which is the case in our example), the standard GNU Autotools build procedure is executed, i.e.: ./configure; make; make install.

The expression shown earlier merely defines a function specifying how to build a package, but does not provide the exact versions or variants of the dependencies that we should use to build it. Therefore we must also compose the package, by calling the function with the required function arguments. In Nix, we do this in a composition expression, containing an attribute set in which each attribute name is a package, while its value is a function invocation, importing package expressions and by providing their dependencies, which are defined in the same expression:
rec {
  stdenv = ...

  fetchurl = import ../pkgs/build-support/fetchurl {
    ...
  };

  cpio = import ../pkgs/tools/archivers/cpio {
    inherit stdenv fetchurl;
  };

  ...
}
In the above expression, the cpio package expression (shown earlier) is imported and called with its required function arguments that provide a particular stdenv and fetchurl instance. By running the following command-line instruction and by providing the above expression as a parameter, cpio can be built. Its result is stored in isolation in the Nix store:
$ nix-build all-packages.nix -A cpio
/nix/store/pl12qa4q1z...-cpio-2.11

In Guix, the GNU cpio package is specified as follows:

(define-module (distro packages cpio)
  #:use-module (distro)
  #:use-module (guix packages)
  #:use-module (guix download)
  #:use-module (guix build-system gnu))

(define-public cpio
  (package
    (name "cpio")
    (version "2.11")
    (source
     (origin
      (method url-fetch)
      (uri (string-append "mirror://gnu/cpio/cpio-"
                          version ".tar.bz2"))
      (sha256
       (base32
        "1gavgpzqwgkpagjxw72xgxz52y1ifgz0ckqh8g7cckz7jvyhp0mv"))))
    (build-system gnu-build-system)
    (arguments
     `(#:patches (list (assoc-ref %build-inputs
                                  "patch/gets"))))
    (inputs
     `(("patch/gets" ,(search-patch "cpio-gets-undeclared.patch"))))
    (home-page "https://www.gnu.org/software/cpio/")
    (synopsis
     "A program to create or extract from cpio archives")
    (description
     "GNU Cpio copies ...")
    (license "GPLv3+")))

As can be seen, the above code fragment defines a package in the Scheme programming language. The above code fragment defines a module (representing a single package), that depends on a collection of modules providing its build-time dependencies, such as all the other packages that are defined in the Guix repository, a module responsible for downloading files from external location and a module providing build instructions.

In the remainder of code fragment, a procedure is defined capturing the properties of the package (in this case: cpio). As can be observed, the information captured in this procedure is quite similar to the Nix expression, such as the package name, the external location from which the source code should be obtained, and the patch that fixes an issue. The build-system parameter says that the standard GNU autotools build procedure (./configure; make; make install) should be executed.

To my knowledge, the composition of the package is also done in the same specification (because it refers to modules defining package compositions, instead of a function, which arguments should be set elsewhere), as opposed to Nix, in which we typically split the build function and its composition.

GNU cpio can be built using Guix by running:
$ guix-build hello
/nix/store/pl12qa4q1z...-cpio-2.11

The above command connects to the nix-worker process (a daemon part of Nix, capable of arranging multi-user builds), generates a Nix derivation (a low-level build specification that the worker uses to perform builds) and finally builds the derivation, resulting in a Nix component containing cpio, which is (like the ordinary package manager) stored in isolation in the Nix store and achieving the same purely functional deployment properties.

Possible advantages of Guix over Nix


So you may probably wonder why Guix has been developed and what (potential) benefits it gives over Nix. The presentation given by Ludovic Court├Ęs at the GNU Hackers Meeting, lists the following advantages:

  • because it rocks!
  • because it's GNU!
  • it has a compiler, Unicode, gettext, libraries, etc.
  • it supports embedded DSLs via macros
  • can be used both for composition and build scripts

To be quite honest, I see some potential interesting advantages in these, but they are not entirely clear to me. The first two points are subjectively defined and should not be taken seriously I guess. I assume that it rocks, because it's cool to show that something can be done and I think it's GNU (probably) because it's using the GNU Guile language which has been used as an extension language for a number of GNU packages or due to the fact that Guix has been blessed as an official GNU project.

The third point lists some potential advantages, that are related to a number of potential interesting features of the host language (GNU Guile) that can be (re)used, which in an external DSL have to be developed from scratch, taking significantly more effort. This observation corresponds to one of the advantages described by others that internal DSLs have over external DSLs -- less time to invest in developing a language and host language features that can be reused.

The fourth point (supporting embedded DSLs) is also a bit unclear to me, why this is an advantage. Yes, I've seen macros that implement stuff, such as the standard GNU Autotools build procedure, but I'm not sure in what respect this is an advantage over the Nix expression language.

The fifth point refers to the fact that Scheme can be used for both writing package specifications and their build instructions, whereas in Nix, we typically use embedded strings containing shell code performing build steps. I'm not entirely sure what Guix does different (apart from using macros) and in what respect it offers benefits compared to Nix? Are strings statically checked? automatically escaped? I haven't seen the details or I may have missed something.

My thoughts


First of all, I'd like to point out that I don't disapprove Guix. First, Nix is free software and freedom 1 says:

The freedom to study how the program works, and change it so it does your computing as you wish (freedom 1). Access to the source code is a precondition for this.
So for me, it's perfectly fine that the GNU project scratches their itches. I also think Guix is interesting for the following reasons:

  • It's interesting to compare an internal vs external DSL approach in deployment. Although Ludovic has listed several potential benefits, I still don't see that these are proven yet by Guix. Some questions that came into my mind are:
    • Does compiling Guix specifications give any benefits, let's say in performance or in a different aspect?
    • Why do we need modules, such as gettext, for package specifications?
    • Can Guix expressions be better debugged than Nix expressions? (I know that the Nix expression language is a lazy purely functional language and that errors are very hard to debug).
    • I know that by using an internal DSL the language is extensible, but for what purposes is this useful and what can you do it what Nix currently cannot?

      On the other hand, I think the Nix expression language is also extensible in the sense that you can call any process from a derivation function (implementing an operation) and encapsulate derivation into a function with a nice interface. For example, I have used this for Disnix to integrate deployment planning algorithms. Maybe there are different kind of extensions or more efficient integration patterns possible with Guix? So far, I haven't seen anything concrete yet.
  • I also find the integration aspect with Nix interesting. I have seen many language/environment specific package managers, for e.g. Perl, Python, Eclipse and they all solve deployment issues in their own way. Furthermore, they do not offer all the features we care about, such as transactional upgrades and reproducible builds. By making it possible to integrate language specific package managers with Nix, we can remove this burden/annoyance.
  • If GNU packages can be easily packaged in a functional way, it will also make the lives of Nix packagers easier, as we don't have to implement any hacks/workarounds anymore and we can (semi-)automatically convert Nix and Guix expressions.
  • It's also good to have a critical look at the foundations Nix/Nixpkgs, such as the bootstrap. In Guix, this entire process is reimplemented that may yield useful techniques/lessons that we can apply in Nix as well.
  • To have Nix and the purely functional deployment model is the media always good, as we want to break conventional thoughts/ideas.

Apart from these potential benefits and the fact that Guix is far from finished, I currently see no reason to recommend Guix over Nix, or to see any reason using it myself. To me, it looks like a nice experiment, but I still have to be convinced that it adds value, apart from integration with language-specific package managers. The advantages of using an internal DSL approach still have to be proven.

On the other hand, I also think that external DSLs (which the Nix expression language is) have benefits over internal DSLs:

  • A more concise syntax, which is shorter and better comprehensible. However, I have to admit that I'm a bit biased on this, because I only have little hands-on experience with the Scheme programming language and quite some experience with the Nix expression language.
  • Better static consistency checking. External DSLs can produce more understandable error messages, whereas host language "abusement" may create all kinds of complex structures significantly making error reports much more complicated and difficult to grasp.

    Furthermore in an embedded DSL, you can also use the host language to do some unintended operations. For example, we could use Scheme to imperatively modify a variable, that is used by another package, affecting reproducibility (although the build of a derivation itself is pure, thanks to Nix). A packager has to manually take care that no side-effects are specified, while the Nix expression language prevents these side-effects to be programmed.

    Again, I have not done any comparisons with Nix and Guix, but I have worked several years in a research group (which besides me, includes Eelco Dolstra, the author of Nix) in which external DSLs are investigated, such as WebDSL, a domain-specific language for web applications.

    My former colleague Zef Hemel (along with a few others) wrote a series of blog posts covering issues with internal DSLs (one of them: 'When rails fails', covering some problems with internal DSLs in the Ruby programming language, has raised quite some (controversial) attention) and a paper titled: 'Static consistency checking of web applications with WebDSL' reporting on this matter.
  • External DSLs have often smaller dependencies. If an internal DSL (embedded in a host language) is used in a tool, it often needs the entire host language runtime, which for some programming languages is quite big, while we only need a small subset of it. One of the questions we have encountered frequently in the Nix community, is why we didn't use Haskell (a lazy purely functional language) to specify package configurations in Nix. One of the reasons is that the Haskell runtime is quite big and has many (optional) dependencies.

I have also observed a few other things in Guix:

  • The Nix expression language is a lazy purely functional language, whereas Guix uses eager evaluation, although the derivation files that are produced and built are still processed lazily by the Nix worker. In Nix, laziness offers various benefits, such as that only the desired packages and its required dependencies are built, while everything we don't need is not, improving efficiency. In Guix, other means have to be used to achieve this goal and I'm not sure if Guix has a better approach.
  • Guix's license is GPLv3 including the package descriptions, whereas Nix's license is LGPLv2.1 and the Nixpkgs is MIT licensed. Guix has a stronger copyleft than Nix. I don't want to have a debate about these licenses and the copyleft here, but for more information I'd like to redirect readers to an earlier blog post about free and open-source software and to form an opinion on this.

Concluding remarks


In this blog post I have covered GNU Guix, a package manager recently announced by the GNU project, which uses Nix under the roof to achieve its desired non-functional properties, such as transactional upgrades and reproducible builds. Guix differs from Nix, because it offers an internal DSL using Scheme (through GNU Guile), instead of the Nix expression language, an external DSL.

The main reason why I wrote this blog post is that GNU Guix has appeared on many news sites, which often copy details from each other. They do not always check facts, but just copy what others are saying. These news messages may sometimes suggest that GNU Guix is something entirely new and providing revolutionary new features to make deployment reliable and reproducible.

Moreover, they may suggest that these features are exclusive to GNU Guix, as they only mention Nix briefly (or sometimes not at all). These facts are not true -- these properties were already in Nix and Nix has been designed with these goals in mind. Currently, GNU Guix is merely a front-end to Nix and inherits these properties because of that. At some point in the future, this may give people the impression that "Nix is something like GNU Guix", which should be exactly the opposite. Furthermore, I'm a researcher and I have to look at stuff critically.

Finally, I'd like to stress out that there is no schism in the Nix project or that I have anything against the Guix effort, although I'm speaking on behalf of myself and not the entire Nix community. Furthermore, Nix is mentioned on the GNU Guix website and covered more in detail in the GNU Guix presentation and README.

Although I find it an interesting experiment, I don't see any benefits yet in using Guile over the Nix expression language. I have raised many questions in this blog post and its usefulness still has to be proven, in my opinion.

Wednesday, November 7, 2012

Building Android applications with the Nix package manager

Some time ago, I have used the Nix package manager to build and test software packages for AmigaOS, as a fun project. Furthermore, I have announced that I have switched jobs and that I was exploring the mobile device space. This blog post, is a report on the first step in which I show how to build and emulate Android Apps through the Nix package manager. The approach is comparable to what I have done with the AmigaOS emulator. I think it may be good to hear that I'm actively turning research into practice!

Packaging the Android SDK


The first step in automating a build process of Android Apps, is to package the Android SDK as a Nix package, which contains all required utilities for building, packaging and emulating. We must package it (as opposed to referring to an already installed instance), because all build-time dependencies must be handled through Nix in order to achieve reliability and reproducibility.

Unfortunately, the Android SDK is not very trivial to package:

  • The Android SDK from the website is not a source package. Google does not seem to provide any proper source releases, except for obtaining the sources from Git yourself. The downloadable distribution is a zip archive with Java JAR files and a hybrid of native i686 and x86_64 executables. Native executables do not work with Nix out of the box, as they try to lookup their run-time dependencies from global locations, which are not present on NixOS. Therefore, they must be patched using PatchELF.
  • The Android SDK is not self-contained. It requires developers to install a number of add-ons, such as platform tools, platform SDKs, system images, and support libraries, by running:

    $ android update

  • In the normal workflow, these additions are downloaded by the android utility and stored in the same base directory as the SDK, which is an imperative action. This conflicts with the Nix deployment model, as components are made immutable after they have been built. Moreover, these additions must be installed non-interactively.

Android SDK base package


I have packaged the Android SDK base package in Nix (which is obtained from the Android SDK page) by unzipping the zip distribution and by moving the resulting directory into the Nix store. Then I have patched a number of executables, scripts and libraries to allow them to work from the Nix store.

As explained earlier, we cannot run ELF executables out of the box on NixOS, as Nix has no global directories, such as /usr/lib, in which executables often look for their dependencies. Moreover, the dynamic linker also resides in a different location.

First, we have to patch executables to provide the correct path to the dynamic linker (which is an impurity and does not reside in /lib). For example, by running ldd on the ELF executables, we can see that all of them require libstdc++ (32-bit):

$ ldd ./emulator-x86 
linux-gate.so.1 =>  (0xf76e4000)
libdl.so.2 => /nix/store/7dvylm5crlc0sfafcc0n46mb5ch67q0j-glibc-2.13/lib/libdl.so.2 (0xf76de000)
libpthread.so.0 => /nix/store/7dvylm5crlc0sfafcc0n46mb5ch67q0j-glibc-2.13/lib/libpthread.so.0 (0xf76c4000)
librt.so.1 => /nix/store/7dvylm5crlc0sfafcc0n46mb5ch67q0j-glibc-2.13/lib/librt.so.1 (0xf76af000)
libstdc++.so.6 => not found
libm.so.6 => /nix/store/7dvylm5crlc0sfafcc0n46mb5ch67q0j-glibc-2.13/lib/libm.so.6 (0xf7689000)
libutil.so.1 => /nix/store/7dvylm5crlc0sfafcc0n46mb5ch67q0j-glibc-2.13/lib/libutil.so.1 (0xf7684000)
libgcc_s.so.1 => not found
libc.so.6 => /nix/store/7dvylm5crlc0sfafcc0n46mb5ch67q0j-glibc-2.13/lib/libc.so.6 (0xf7521000)
/nix/store/7dvylm5crlc0sfafcc0n46mb5ch67q0j-glibc-2.13/lib/ld-linux.so.2 (0xf76e5000)

In order to allow these executables to find a particular library, we have to add its full path (provided by evaluating a derivation) to the RPATH header of the ELF executable. The following build commands will patch most of the utilities:
cd tools

for i in dmtracedump emulator emulator-arm emulator-x86 hprof-conv \
  mksdcard sqlite3
do
  patchelf --set-interpreter ${stdenv.gcc.libc}/lib/ld-linux.so.2 $i
  patchelf --set-rpath ${stdenv.gcc.gcc}/lib $i
done
Two other tools apparently do zip compression/decompression and require zlib in addition to libstdc++:
for i in etc1tool zipalign
do
  patchelf --set-interpreter ${stdenv.gcc.libc}/lib/ld-linux.so.2 $i
  patchelf --set-rpath ${stdenv.gcc.gcc}/lib:${zlib}/lib $i
done

A shared library used by the monitor (lib/monitor-x86/libcairo-swt.so) requires many more libraries, which are mostly related to the GTK+ framework.

In addition to ELF binaries, we also have a number of shell scripts that start Java programs. They have a shebang refering to the bash shell residing at /bin/bash, which does not exist on NixOS. By running the shebangfix tool this line gets replaced to refer to the right Nix store path of bash:

for i in ddms draw9patch monkeyrunner monitor lint traceview
do
    shebangfix $i
done

After performing these patching steps, there are still a bunch of utilities not properly functioning, such as the emulator, showing:
SDL init failure, reason is: No available video device
I have used strace to check what's going on:

$ strace -f ./emulator
...
rt_sigaction(SIGINT, {SIG_DFL, [INT], SA_RESTART}, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGQUIT, {SIG_DFL, [QUIT], SA_RESTART}, {SIG_DFL, [], 0}, 8) = 0
futex(0xfffffffff7705064, FUTEX_WAKE_PRIVATE, 2147483647) = 0
open("./lib/tls/i686/sse2/libX11.so.6", O_RDONLY) = -1 ENOENT (No such file or directory)
open("./lib/tls/i686/libX11.so.6", O_RDONLY) = -1 ENOENT (No such file or directory)
open("./lib/tls/sse2/libX11.so.6", O_RDONLY) = -1 ENOENT (No such file or directory)
open("./lib/tls/libX11.so.6", O_RDONLY) = -1 ENOENT (No such file or directory)
open("./lib/i686/sse2/libX11.so.6", O_RDONLY) = -1 ENOENT (No such file or directory)
open("./lib/i686/libX11.so.6", O_RDONLY) = -1 ENOENT (No such file or directory)
open("./lib/sse2/libX11.so.6", O_RDONLY) = -1 ENOENT (No such file or directory)
open("./lib/libX11.so.6", O_RDONLY)     = -1 ENOENT (No such file or directory)
...
open("/nix/store/cy8rl8h4yp2j3h8987vkklg328q3wmjz-gcc-4.6.3/lib/libXext.so.6", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/nix/store/7dvylm5crlc0sfafcc0n46mb5ch67q0j-glibc-2.13/lib/libXext.so.6", O_RDONLY) = -1 ENOENT (No such file or directory)
...
open("/nix/store/cy8rl8h4yp2j3h8987vkklg328q3wmjz-gcc-4.6.3/lib/libXrandr.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/nix/store/7dvylm5crlc0sfafcc0n46mb5ch67q0j-glibc-2.13/lib/libXrandr.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
write(2, "SDL init failure, reason is: No "..., 55SDL init failure, reason is: No available video device
) = 55
unlink("/home/sander/.android/avd/foo.avd/hardware-qemu.ini.lock") = 0
exit_group(1) 

Apparently these utilities also open a number of libraries dynamically, such as the ones belonging to the X Window System, which are not in the executable's RPATH. I have fixed this by wrapping the paths to these additional libraries in a shell script that sets the LD_LIBRARY_PATH environment variable and then executes the real executable, so that these can be found:

for i in emulator emulator-arm emulator-x86
do
    wrapProgram `pwd`/$i \
      --prefix LD_LIBRARY_PATH : `pwd`/lib:${libX11}/lib:\
${libxcb}/lib:${libXau}/lib:${libXdmcp}/lib:\
${libXext}/lib
done

Supporting plugins and optional packages


As explained earlier, the Android SDK is not self contained and provides many additions and optional packages, depending on what classes of devices a developer wants to support and what features he wants to provide. Apparently, there is no web page to easily download these additions from. Moreover, we do not need all of them. Downloading all possible additions require developers to download many gigabytes of data.

However, running:

$ android list sdk

reveals some interesting information:

Fetching https://dl-ssl.google.com/android/repository/addons_list-2.xml
  Validate XML
  Parse XML
  Fetched Add-ons List successfully
  Refresh Sources
  Fetching URL: https://dl-ssl.google.com/android/repository/repository-7.xml
  Validate XML: https://dl-ssl.google.com/android/repository/repository-7.xml
  Parse XML:    https://dl-ssl.google.com/android/repository/repository-7.xml
  Fetching URL: https://dl-ssl.google.com/android/repository/addon.xml
  Validate XML: https://dl-ssl.google.com/android/repository/addon.xml
  Parse XML:    https://dl-ssl.google.com/android/repository/addon.xml

The output shows that the Android SDK fetches a collection of XML files from URLs providing package information. I have used these XML files to package all the additions I care about in separate Nix expressions.

Platform tools


One important addition that's not in the base package are the platform tools, which contains the Android debugger and a number of related utilities. The platform-tools' zip distribution is defined in the repository-7.xml file.

Packaging the platform tools in very straight forward. It must be unzipped and a number of native ELF executables need to be patched, such as adb. Fortunately, none of them uses dynamically loaded libraries. There is one shell script: dx that requires a shebang fix.

Finally, the platform tools must be accessible from the platform-tools directory from the Android SDK basedir. We can easily solve this by creating a symlink from the Android SDK base package to the platform tools package.

Platform SDKs and system images


Apart from the basic tools and platform tools, we have to be able to actually develop Android Apps. Android Apps are developed for a wide range of devices and operating system versions, ranging from the classic Android 1.5 OS to the recent Android 4.2 OS.

In order to be able to build an Android app for a particular device (or a class of devices), we require the appropriate Android SDK version for that particular Android version. Besides building, we also want to use the emulator for testing. The emulator requires the right system image for a particular Android OS version.

It would be very costly to have all Android versions supported by default, which requires developers to download a lot of data, while they often only need a small subset of it. Therefore, we want to package every SDK and system image separately.

Fortunately, the repository-7.xml XML file contains all the information that we need to do that. For example, each platform SDK is defined in an XML element, such as:

<sdk:sdk-repository ...>
  <sdk:platform>
    <sdk:version>2.2</sdk:version>
    <sdk:api-level>8</sdk:api-level>
    <sdk:codename/>
    <sdk:revision>03</sdk:revision>
    <sdk:min-tools-rev>
      <sdk:major>8</sdk:major>
    </sdk:min-tools-rev>
    <sdk:description>Android SDK Platform 2.2_r3</sdk:description>
    <sdk:desc-url>http://developer.android.com/sdk/</sdk:desc-url>
    <sdk:archives>
      <sdk:archive arch="any" os="any">
        <sdk:size>74652366</sdk:size>
        <sdk:checksum type="sha1">231262c63eefdff8...</sdk:checksum>
        <sdk:url>android-2.2_r03-linux.zip</sdk:url>
      </sdk:archive>
    </sdk:archives>
    <sdk:layoutlib>
      <sdk:api>4</sdk:api>
    </sdk:layoutlib>
  </sdk:platform>

  ...
</sdk:sdk-repository>
The given XML elements can be transformed into a Nix expression, using XSL in a straight forward manner:

let buildPlatform = ...
in
{
  ...  

  platform_8 = buildPlatform {
    name = "android-platform-2.2";
    src = fetchurl {
      url =
        https://dl-ssl.google.com/android/repository/android-2.2_r03-linux.zip;
      sha1 = "231262c63eefdff8fd0386e9ccfefeb27a8f9202";
    };
    meta = {
      description = "Android SDK Platform 2.2_r3";
      url = http://developer.android.com/sdk/;
    };
  };

  ...
}

The resulting Nix expression is an attribute set, in which every attribute refers to a package containing a platform SDK.

The buildPlatform function simply unzips the zip file and moves the contents into the Nix store. The <sdk:api-level> is an important element -- it's a unique version number that the Android SDK uses to make a distinction between various Android operating systems and is also used to make the attribute names in the above attribute set unique. As we will see later, we can use the API level number and this naming convention to relate optional components to a particular Android OS version.

To make a specific platform SDK available to developers, we must symlink it into the platforms/android-<api-level> directory of the Android base package.

For the system images, a similar approach is used that generates an attribute set in which each attribute refers to a system image package. Here, also a <sdk:api-level> element is defined, that we can use to relate the system image to a particular Android OS version. A system image can be made available by creating a symlink in the system-images/android-<api-level>.

Other additions


In addition to the platform SDKs and system images, there are many more optional additions, which are defined in the addon.xml file. For example, to allow Android Apps to use APIs, such as Google Maps, we need to make these package available as well. The Google API packages are defined in a similar manner in the XML file as the platform SDKs, with an api-level identifier and must be symlinked in into the addons/addon-google_apis-<api-level> directory of the Android SDK package.

There is also the support library that exposes certain newer functionality to older Android OSes and some utility APIs. The support library can be made available by symlinking it into support/ of the Android SDK base package.

Building Android applications


So far, I have described how we can package the Android SDK and its (optional) additions in Nix. How can we use this to automatically build Android Apps through the Nix package manager?

The first important aspect is that the Android command-line utility must be used to create an Android project, as opposed to using the Eclipse IDE. In addition to a basic project layout, the command-line utility produces an Apache Ant build file, that can be used to automatically build the project from the command line. An example of this is:

android create project --target android-8 --name MyFirstApp \
--path /home/sander/MyFirstApp --activity MainActivity \
--package com.example.myfirstapp

The above command-line instruction creates a new project targetting the Android API-level 8 (which corresponds to the Android 2.2 platform SDK, as shown earlier), with the name MyFirstApp, having a MainActivity and stores the code in the com.example.myfirstapp Java package.

By running the following command line instruction, an Android application can be built, which produces an APK archive (a zip archive containing all the files belonging to an App) signed with the debugger key:
$ ant debug
To create releases for production use, we also need to sign an APK with a custom key. A key can be created by running keytool, part of the Java SDK:
$ keytool --genkeypair --alias sander
If I add the following lines to the ant.properties file in the project directory, we can automatically sign the APK with a custom key:
key.store=/home/sander/.keystore
key.alias=sander
key.store.password=foobar
key.alias.password=foobar
By running the following command-line instruction:
$ ant release
A signed APK for release is produced.

I have encapsulated all the previous aspects into a Nix function, named: androidenv.buildApp, which can be used to conveniently build Android apps from source code and a number of specified options. The following code fragment shows an example invocation, building the trivial Android example application, that I have implemented to test this:
{androidenv}:

androidenv.buildApp {
  name = "MyFirstApp";
  src = ../../src/myfirstapp;
  platformVersions = [ "8" ];
  useGoogleAPIs = true;

  release = true;
  keyStore = /home/sander/keystore;
  keyAlias = "sander";
  keyStorePassword = "foobar";
  keyAliasPassword = "foobar";
}
The expression above looks similar to an ordinary expression -- it defines a function that requires androidenv containing all Android related properties. In the remainder of the function, we make a function call to androidenv.buildApp, which can be used to build an App. As function arguments, we provide a name that ends up in the Nix store, a reference to the source code (which resides on the local filesystem), the API-level which we want to target (as we have seen earlier, API-level 8 corresponds to Android OS 2.2) and whether we want to use the Google APIs.

In this example, we have also enabled key signing. If the release parameter is omitted (it is false by default), then the remaining arguments are not required and the resulting APK is signed with the debug key. In our example, we provide the location, alias and keystore passwords that we have created with keytool, so that signing can be done automatically.

As with ordinary expressions, we also have to compose an Android package:
rec {
  androidenv = import ./androidenv { ...  };

  myfirstapp = import ./myfirstapp {
    inherit androidenv;
  };
  ...
}
The above fragment contains a reference to the Android build infrastructure and invokes the earlier build expression with the given androidenv argument. The App can be built by calling (pkgs.nix corresponds to the above code fragement):
$ nix-build pkgs.nix -A myfirstapp
/nix/store/11fz1yxx33k9f9ail53cc1n65r1hhzlg-MyFirstApp
$ ls result/
MyFirstApp-release.apk
By running the above command-line instruction the complete build process is performed. The Android SDK is downloaded and installed, all the required platform SDKs and system images are installed, the App itself is built and signed, and a Nix component is produced containing the signed APK that is ready to be released.

Our build function composes the Android SDK with its optional features using the function parameters, so that only the additions that we need are downloaded and installed, ensuring reliability, reproducibility and efficiency. It would be a waste of time and disk space to download all possible additions, of course. :-)

Emulating Android apps


Besides building Android apps, it is also desirable to test them using the Android emulator. To run the emulator, we must first create an AVD (Android Virtual Device). On the command-line this can be done by:
$ android create avd -n device -t android-8
The above instruction generates an AVD named device targeting the Android API-level 8 (Android 2.2 OS). If we want to use the Google APIs, then we have to pick a different target, which is named: "Google Inc.:Google APIs:8", if which the integer represents the API-level.

Then we have to start the emulator representing the generated AVD:
$ emulator -avd device -no-boot-anim -port 5554
The above command-line instruction starts the emulator running our AVD, without displaying a boot animation. The debugger interface uses TCP port 5554. (As a sidenote, TCP ports are an impurity inside Nix expressions and it seems that the emulator cannot use Unix domain sockets. In order to cope with this, I wrote a procedure that scans for a free TCP port in the even number range between 5554-5584, by grepping the output of: adb devices).

When we start the emulator, we have to wait until its booted so that we can install our generated APK. I have discovered that the Android debugger can wait until a device has reached it's device state, so that the Android debugger is ready to talk to the emulator. This can be done by the following command-line instruction (the -s parameter provides the serial for our recently spawned emulator instance, which is composed of the string 'emulator' and the assigned port number shown earlier):
$ adb -s emulator-5554 wait-for-device
Although the device state has been reached, the device is not guaranteed to be booted. By running getprop command-line tool remotely on the device, we can query various device properties. When the device has been fully booted, the dev.bootcomplete should be 1, e.g.:
$ adb -s emulator-5554 shell getprop dev.bootcomplete
1
Then we should be able to install our APK through the debugger, and should we be able to pick it from the application menu on the device:
$ adb -s emulator-5554 install result/MyFirstApp-release.apk
Finally, we must launch the application, which is done by launching the start activity of an App. We can do this automatically, by remotely calling am that creates an intent to launch the start activity (MainActivity in the example that we have used) from our App package (com.example.my.first.app):
$ adb -s emulator-5554 shell am start -a android.intent.action.MAIN \
  -n com.example.my.first.app/.MainActivity
Because we always have to compose a SDK having all our desired additions and due to the fact that we have to execute a lot of steps, I have decided to conveniently automate this procedure. I have developed a function called: androidenv.emulateApp encapsulating these. The following Nix expression shows how it can be invoked:
{androidenv, myfirstapp}:

androidenv.emulateApp {
  name = "MyFirstApp";
  app = myfirstapp;
  platformVersion = "16";
  useGoogleAPIs = false;
  package = "com.example.my.first.app";
  activity = "MainActivity";
}
The above expression is a function that takes two parameters: androidenv is the Android build infrastructure, myfirstapp refers to the build function of the example application, I have shown earlier.

In the remainder of the expression, we invoke the androidenv.emulateApp function that generates a script that automatically instantiates and launches the emulator and finally deploys our APK in it automatically. Here, we also have to specify which app to use, what API-level we want to target (in this example we target API-level 16, which corresponds to Android 4.1) and whether we want to use the Google APIs). The API-level used for emulation may differ from the level we used for building (i.e. it makes sense to test older Apps on newer devices). Finally, we specify the package name and the name of the main activity, so that we can automatically start the App.

By evaluating this expression and executing the resulting script, an emulator is launched with our example app deployed in it, and it's started automatically:
$ nix-build pkgs.nix -A emulate_myfirstapp
./result/bin/run-test-emulator

The above screenshot shows that it works :-)

Deploying Android apps on real devices


I guess the remaining question is how to deploy Android apps on real devices. After building the App through Nix, the following command-line instruction suffices for me, if I attach my phone to the USB port and I enable debugging on my phone:
$ adb -d install result/MyFirstApp-release.apk
This is the result:


It probably does not look that exciting, but it works!

Conclusion


In this (lengthy, I'm sorry :P) blog post, I have packaged the Android SDK in Nix and a large collection of its additions. Furthermore, I have implemented two Nix functions, that may come in handy for Android App development:

  • androidenv.buildApp builds an Android App for a particular class of Android devices.
  • androidenv.emulateApp generates a script that launches a particular emulator instance and automatically starts an App in it.

These functions take care of almost the entire deployment process of Android Apps hiding most of its complexity, including all its dependencies and (optional) additions. Due to the unique advantages of Nix, we can safely use multiple variants of SDKs and their libraries next to each other, all dependencies are always guaranteed to be included (if they are specified), we can use laziness and function composition to ensure that only the required dependencies are used (which improves efficiency), and we can easily parallelise builds thanks to the purely functional nature of Nix. Furthermore, this function can also be used in conjunction with Hydra -- the Nix-based continuous build and integration server to continuously assess the state of Android App code.

The only nasty detail is that the emulator and debugger use TCP ports to communicate with each other, which is an impurity. I have implemented some sort of a work around, but it's not very elegant and has various drawbacks. As far as I know, there is no way to use Unix domain sockets.

I'd like to thank my new employer: Conference Compass, for giving me the space to develop this and taking interest in the deployment technology I was involved in as a researcher. (hmm 'was'? I'm still involved, and this still is research in some way :-) )

Availability


The Android build infrastructure is part of Nixpkgs, available under the MIT license. The androidenv component can be used by including the Nixpkgs top-level expression. The trivial example case and its composition expression (containing the myfirstapp and emulate_myfirstapp attributes) can be obtained from my Nix Android tests GitHub page.

Friday, November 2, 2012

An alternative explanation of the Nix package manager

As I have explained in my previous blog post, I started working at a new company as a software architect. One of the things I'm working on is (obviously) deployment.

Recently, I had to give my new employer some background information about my past experience and the Nix package manager. I have frequently presented Nix and related subjects to various kind of audiences. However, I nearly always use a standard explanation recipe, that is somewhat similar to what I have described in an old blog post about the Nix package manager.

This time, I have decided to give an alternative explanation, which I will describe in this blog post.

The Nix package manager


In short: Nix is a package manager, which is a collection of software tools to automate the process of installing, upgrading, configuring, and removing software packages. Nix is different compared to conventional package managers, because it borrows concepts from purely functional programming languages to make deployment reliable, reproducible and efficient. The Nix project has been initiated by Eelco Dolstra as part of his PhD research.

Purely functional programming languages


So, what are purely functional programming languages?

Many programming languages that are in use nowadays support functions. Functions in mathematics are an early inspiration source for "building bricks" in higher level languages.

For example, if we would have sticked ourselves to machine language or assembly, implementing and calling functions is not very obvious -- developers have to follow a function calling convention, that sets the function argument values in the right memory locations, pushes/pops memory addresses onto/from the stack, jumps from one memory location to another etc. For example, the following Intel assembly code fragment, shows how a function invocation can be implemented, using a certain calling convention:

.486
.MODEL FLAT
.CODE
PUBLIC _myFunc
_myFunc PROC
  ; Subroutine Prologue
  push ebp     ; Save the old base pointer value.
  mov ebp, esp ; Set the new base pointer value.
  sub esp, 4   ; Make room for one 4-byte local variable.
  push edi     ; Save the values of registers that the function
  push esi     ; will modify. This function uses EDI and ESI.
  ; (no need to save EBX, EBP, or ESP)

  ; Subroutine Body
  mov eax, [ebp+8]   ; Move value of parameter 1 into EAX
  mov esi, [ebp+12]  ; Move value of parameter 2 into ESI
  mov edi, [ebp+16]  ; Move value of parameter 3 into EDI

  mov [ebp-4], edi   ; Move EDI into the local variable
  add [ebp-4], esi   ; Add ESI into the local variable
  add eax, [ebp-4]   ; Add the contents of the local variable
                     ; into EAX (final result)

  ; Subroutine Epilogue 
  pop esi      ; Recover register values
  pop  edi
  mov esp, ebp ; Deallocate local variables
  pop ebp ; Restore the caller's base pointer value
  ret
_myFunc ENDP
END
I don't expect anyone to understand this code fragment, but it's pretty obvious that something as simple as invoking a function, is very complicated on machine level, as opposed to the following code fragment, written in the C programming language that defines a sum function that adds two integers to each other:

int sum(int a, int b)
{
    return a + b;
}

int main()
{
    return sum(1, 2);
}

Although many programming languages support functions, these are not the same as functions in mathematics. Consider the following mathematical theorem, known as Leibniz' Principle:
x = y => f(x) = f(y)
The above theorem states that if two function arguments are identical, the result of their function applications are identical as well, something which looks very obvious and makes sense. However, in most programming languages, this is not always the case, such as the C programming language. Consider the following C code example:
int v = 0;

int add(int a)
{
    v = v + a;
    return v;
}

int main()
{
    int p = add(1); /* 1 */
    int q = add(1); /* 2 */
}
The add function is called twice in this program with the same function argument. However, each function invocation yields a different result. The reason that this happens is because we have a global variable v that is accessed and overwritten in each function invocation to add.

There are many causes why function invocations with the same parameters yield different results, such as functions returning time-stamps, generating random numbers, performing I/O, accessing global variables. These causes are called side-effects in the functional programming community.

Because C (and many other commonly used programming languages) allow these side-effects to be programmed, they lack referential transparency, meaning that functions cannot be replaced by its value without changing the behaviour of a program.

Purely functional languages are an exception. In purely functional programming languages the result of a function invocation exclusively depends on the definition of the function itself and their parameters, as they do not allow side-effects to be programmed and they exclude destructive updates, such as having a global variable that can be updated. Purely functional programming languages have no variables, but identifiers that are bound to immutable objects. As a consequence, functions in these language support referential transparency.

Another characteristic of functional programming languages is that they often use lazy evaluation, which means that an expression only gets evaluated when it's really needed. A well known purely functional programming language is Haskell.

Because of these properties, purely functional programming languages, such as Haskell, have a number of interesting benefits -- thanks to the fact that functions always yield the same result based on their definition and their arguments, we can cache evaluation results, so that they only have to be executed once, improving efficiency. Laziness offers the advantage that a particular function only has to be evaluated when it's needed, improving efficiency even more. Because of referential transparency and the closures of the functions are known, we can also easily divide function execution over multiple cores/CPUs improving execution speed.

Although purely functional languages have some benefits, they also have some drawbacks. For example, the laziness and caching properties conflict with time predictability, which makes it very hard to develop real-time systems. Moreover, programs written in purely functional languages are much harder to debug.

Purely functional package management


Now I'm trying to draw an analogy to package management: What if we treat the deployment process of a package as a function in programming languages (consisting of steps, such as configuring, building from source, installing and/or upgrading)? In conventional package managers like RPM, such "functions" look very much like functions in imperative programming languages, such as C.

For example, most packages that are in use today, are rarely self contained -- they usually depend on other components that are needed during build-time, such as a compiler, and at run-time, such as libraries. Dependencies can be considered function arguments to a function that deploys a specific package.

However, conventional package managers, have a number of drawbacks. For example, while executing a function (i.e. deploying a package), we are often capable of destructively modifying other packages, by overwriting or removing files belonging to another package. Although it may look obvious now that these properties have drawbacks, it's very common that such operations happen. Especially upgrades are destructive and may result in problems, such as the "DLL hell", because after an upgrade, a program may utilise a library that is incompatible or does not work properly.

Furthermore, files belonging to packages are often stored in global locations, such as /usr/lib on Linux, or C:\WINDOWS\System32 on Windows, allowing packages to find their dependencies even if they are not declared (you could see these implicit dependencies as global variables in a programming language). These factors limit reproducibility. For example, if we would run the same function on another machine, the deployment may fail, because the undeclared dependency is not present.

Because the contents of packages deployed by conventional package managers are often installed in global locations, it's hard to allow multiple versions or variants to safely co-exist, unless the packager has manually checked that there is no file that shares the same name with another package.

The Nix package manager is designed to overcome these drawbacks, by borrowing concepts from purely functional programming languages. In Nix, we describe the build recipes of packages in a domain-specific language called the Nix expression language and every build is modeled after a function that describes the build and a function invocation that builds the package with its required dependencies.

The following example shows a Nix expression that describes how to build the GNU Hello package:
{stdenv, fetchurl}:

stdenv.mkDerivation {
  name = "hello-2.6";
  
  src = fetchurl {
    url = ftp://ftp.gnu.org/gnu/hello/hello-2.6.tar.gz;
    sha256 = "1h6fjkkwr7kxv0rl5l61ya0b49imzfaspy7jk9jas1fil31sjykl";
  };

  meta = {
    homepage = http://www.gnu.org/software/hello/manual/;
    license = "GPLv3+";
  };
}
The expression shown above defines a function that takes two arguments: the stdenv component is the standard environment providing common UNIX utilities, (such as cat and ls), GNU Make and GCC, fetchurl is a function that downloads a file from an external source.

In the remainder of the function definition, we invoke the stdenv.mkDerivation function that is used to build a package from source, its dependencies (which are passed as function arguments) and a specified build recipe. In our example, we have omitted the build recipe. If no build procedure is specified, the standard Autotools build procedure: ./configure; make; make install is executed.

The earlier code fragment only defines a function, but in order to build a package we need to compose it, by calling it with the right function arguments, which is done in the following expression:
rec {
  stdenv = ...;

  fetchurl = import ../build-support/fetchurl {
    inherit stdenv curl;
  };

  hello = import ../applications/misc/hello {
    inherit stdenv fetchurl;
  }

  ...
}
Here, the hello attribute is bound to the function defined in the earlier expression and invoked with the right function arguments. The dependencies of GNU Hello are also defined and composed in the above expression.

The derivation functions used to build packages as well as the Nix expression language itself are purely functional -- the build result of a package exclusively depends on the function definition and its given arguments, files belonging to another package are never overwritten or removed, dependencies can only be found if they are specified, and we can easily divide derivations over multiple CPUs, cores or machines to improve scalability.

As may become evident now, applying purely functional concepts in package management makes deployment of packages reliable, reproducible, scalable and efficient.

The remaining open question is how Nix achieves these purely functional properties:

  • To ensure that packages cannot be overridden or removed by build functions of other packages, we store the result of each build in a separate location on the filesystem, in directories that are made immutable (by removing write permission bits), so that they cannot be changed. To create, such a unique location, we store component in a store called the Nix store and we use hashing (derived from all inputs to the function, such as their dependencies, sources and build scripts) to generate a unique directory name, e.g. /nix/store/xq2bfcqdsbrmfr8h5ibv7n1qb8xs5s79-openssl-1.0.0c. If we, for example, change any of its build parameters, such as the used compiler, the hash will differ and thus its safe to allow multiple variants to coexist as they never share the same name.
  • By using the Nix store concept, we get another important property "for free" -- because every package variant is stored in a unique location, as opposed to global locations, such as /usr/lib, we have stricter guarantees that dependency specifications are complete, as they cannot be implicitly found. In order to allow a dependency to be found we have to explicitly specify it, for example, by adding it to the PATH environment variable.
  • To reduce the chances on side effects even more, we run build scripts in isolated environments with unprivileged user rights and we can optionally use a chroot() environment to limit access to the host filesystem even more.
  • We have also patched several common UNIX utilities, such as gcc and ld to ignore global locations, such as /usr/lib.

Nix applications


Apart from package management, Nix has been used for a number of other applications:
  • NixOS is a GNU/Linux distribution completely built around the Nix package manager. Apart from the fact that every component (including the Linux kernel and configuration files) are managed by Nix, it is also used to deploy entire system configurations from declarative specifications.
  • Disnix is an extension to Nix developed by me to automatically deploy service-oriented systems composed of distributable components, in networks of machines.
  • Hydra is a continuous build and integration server, built around Nix and using the Nix expression language to define build and test jobs.

Why am I writing this blog post?


As I have explained Nix on my blog before, you may probably wonder why I'm doing it again? It's not that since I have switched jobs, that I wanted to reboot my blog :-)

At my new working spot, my colleagues are not system administrators or packagers, but people with programming language experience, including functional languages, such as Lisp. Therefore, I have decided to give them an alternative explanation, instead of the traditional one.

In the past, I sometimes had some trouble properly explaining the Nix concepts to certain kind of audiences. In many ways I think this alternative explanation is better than the traditional one, albeit it's also a bit longer, but oh well... :-)

In the traditional explanation, I start explaining conventional package managers (and their drawbacks) and showing the Nix store concept and the hash-codes in the path name. Usually when people see this, they initially have a weird feeling. Some even think that we're crazy.

Then we show more stuff, such as how to write Nix expressions specifying how to build packages, increasing the confusion a bit more. Then we have to put some effort in 'justifying' this means. Quite often, after some confusion people see the point and understand some of the benefits. Some of them, however, have already lost interest by then.

Furthermore, we often avoid the term 'purely functional', because it causes extra confusion. In this alternative explanation, I do not omit the term 'purely functional'. The described means (such as the Nix store paths with hash codes) make sense in this explanation, as they are a direct consequence of mapping the goal of purely functional languages to the deployment of packages.

Downsides of this alternative explanation


Although I find this explanation better, there are also some drawbacks. In this explanation, programming languages are the main focus point. Usually our intended audience consists of system administrators, system engineers, packagers and (of course) researchers. Most of them are often not programmers, and have no affinity with programming language research concepts. Especially uncommon languages, such as purely functional ones (including the term: purely functional).

I have especially suffered from this while trying to publish papers in the systems community. They often slam our papers, because "their event is not about programming languages" and we never receive any valuable feedback from them either, even though I have always used the traditional explanation instead of this one.

Conclusion


In this blog post, I have given an alternative explanation of the Nix package manager, using programming languages as a basis concept. In many ways, I find this explanation better than the older/traditional explanation, although it also has its drawbacks. I think properly explaining Nix without any confusion is still an open challenge.

References