Friday, December 30, 2011

One year later

Today it's my blog's first anniversary, so I thought this would be a good opportunity to reflect on it.

Motivation


So why did I start this blog in the first place? The main reason is that, as a software engineer and researcher, I don't only write code or academic papers:

  • Sometimes I have to solve engineering problems instead of doing science (whatever that means). In my opinion, this is also interesting and useful to report about. For example, while most of my research is about Nix, NixOS and Disnix, sometimes I have to apply my work in specific scenarios/environments, which can be challenging enough to get working. A notable example is the deployment of .NET software. Without proper application of the stuff I'm investigating, the usefulness of my research isn't so great either.
  • Sometimes I have to express ideas and opinions, for example about what others say about the research we're doing or about general software engineering issues.
  • Sometimes I need to do some knowledge transfer about issues that I keep explaining over and over again, which resulted in posts like the one about software deployment in general.
  • Sometimes I have some fun projects as well.

The main purpose of my blog is to fill this gap that academic paper publishing leaves. It seems that some of my blog posts have attracted quite a bit of attention, which I'm very happy about. Moreover, some blog posts (like the .NET related stuff) also gave me early feedback, which helped me solve a problem I had been struggling with for a long time.

Writing for a blog


For some reason, I find writing blog posts more convenient than writing academic papers. In order to write an academic paper, I have to take a lot of things into account besides the scientific contribution I want to make. For example:

  • Because I'm doing research in software deployment, which is a neglected research topic, I have to adapt my paper to fit the scope of the conference I submit to, because conferences are typically about something else. This is not always easy.
  • I have to identify the audience of the particular conference and learn about the subjects and concepts they talk about.
  • I have to read a lot of related work papers to determine the exact scientific contribution and to describe what the differences are.
  • I have to integrate my idea into the concepts of the given conference.
  • I have to explain the same concepts over and over again, because they are not generally understood. What is software deployment? Why is software deployment difficult and important? What is Nix? Why is Nix conceptually different compared to conventional package managers? etc. etc.
  • In each paragraph, I have to convince the reader over and over again.
  • It should fit within the page limit.
  • I have to keep reminding myself that my paper must be a scientific paper and not an engineering paper.

Writing for my blog is actually much easier for me, because I don't have to spend so much time on these other issues besides the contribution I want to make. Furthermore, because I can link to earlier topics and I know my audience a bit, I don't have to explain the same things over and over again.

Although these observations are quite negative, I don't want to claim that I shouldn't write any academic papers or that academic papers are useless; nonetheless, it's still a funny observation.

Blog posts


Another funny observation comes from the top 10 most popular blog posts. The top 10 (at this time) is as follows:

  1. Software deployment complexity. I'm happy that my general blog post about software deployment complexity has attracted so much attention, because it's typically an ignored subject within the software engineering research community. In just one week, it surpassed the number of hits of all my other blog posts.
  2. Second computer. For some reason my blog post about my good ol' Commodore Amiga is very popular. For a long time, this was actually my most popular blog post, and it's not even research related! It seems that 17 years after Commodore's demise, the Amiga is far from dead.
  3. First blog post. This is the first blog post in which I briefly introduce myself. Apparently, people want to know who I am :-)
  4. NixOS: A purely functional Linux distribution. This is a general blog post explaining the idea and features of NixOS, the Linux distribution built on top of the Nix package manager, which we use as a foundation for our current research project.
  5. First computer. Apart from my good ol' Commodore Amiga, my first computer, the Commodore 128, also still lives on!
  6. The Nix package manager. This blog post covers the Nix package manager, on which NixOS is built. Apparently people are interested in the main Nix concepts as well.
  7. Concepts of programming languages. This blog post is about the 'Concepts of programming languages' course taught at the TU Delft, for which I was a teaching assistant. This course covers various programming languages that are conceptually different from each other. I actually have no idea why this blog post is so popular.
  8. Using NixOS for declarative deployment and testing. This blog post covers two very nice applications of NixOS in which you can automatically deploy a distributed network of NixOS machines and use the same specification to generate a network of virtual machines for testing. I have presented this topic at FOSDEM, Europe's biggest free and open source software event, and my presentation was well received there :-).
  9. On Nix, NixOS and the Filesystem Hierarchy Standard (FHS). This blog post responds to one of the common criticisms I have received about Nix and NixOS from other distributions. In this blog post I explain why we are different and that it's for a very good reason.
  10. Self-adaptive deployment with Disnix. This blog post is about one of my biggest research contributions, which I have presented at the SEAMS symposium (co-located with ICSE) in Hawaii. In it, I describe a framework built on top of Disnix and the underlying purely functional deployment model of Nix, which makes systems self-adaptable by redeploying them in case of an event.

My biggest surprise is the fact that the Amiga stuff is so popular. Actually, I have some fun projects that I'm working on, so if I can find any time for it, there may be more to report about.

Conclusion


It seems that having a blog pays off, so hopefully next year there will be many more interesting things to report about. I have one more thing to say and that is:


HAPPY NEW YEAR!!!!!!! (I took the fireworks picture above during the ICSE in Hawaii, just in case you wanted to know ;) )

Wednesday, December 21, 2011

Techniques and lessons for improvement of deployment processes

So far, my software deployment related blog posts have mostly been about techniques implemented in Nix, NixOS and Disnix, plus some general deployment information.
In my career as a Master's student and PhD student, I have written quite a number of Nix expressions for many packages, and I have also "Nixified" several large code bases of commercial companies to make it possible to use the distinct features of Nix to make their software deployment processes reproducible and reliable.

However, for many systems it turns out that implementing Nix support (for instance, to allow it to be deployed by our Nix-based deployment tools) is more difficult, tedious and laborious than expected. In this blog post, I'm going to describe a number of interesting techniques and lessons that can improve the deployment process of a system, based on my experience with Nix and Disnix. Moreover, by taking these lessons into account, it also becomes easier to adopt Nix as the underlying deployment system.

  1. Deal with the deployment complexity from the beginning of your development process. Although this lesson isn't technical and may sound very obvious to you, I can assure you that if you have a codebase which grows rapidly, without properly thinking about deployment, it becomes a big burden to implement the techniques described in this blog post later in the development process. In some cases, you may even have to re-engineer specific parts of your system, which can be very expensive, especially if your codebase consists of millions of lines of code.

  2. Make your deployment process scriptable. This lesson is also an obvious one, especially if you have adopted an agile software development process, in which you want to create value as soon as possible (without a deployed system there is no value).

    Although it's obvious, I have seen many cases where developers still build and package components of a software system by manually performing build and package tasks in the IDE and by doing all the installation steps by hand (with some deployment documentation). Such deployment processes typically take a lot of time and are subject to errors (because they are performed manually and humans make mistakes).

    Quite often, developers who perform such manual/ad hoc deployment processes tell me that it's much too complicated to learn how a build system (such as GNU Make or Apache Ant) works and to maintain its build specifications.

    The only thing I can recommend to them is that it really is worth the effort, because it significantly reduces the time needed to make a new release of the product and also allows you to test your software sooner and more frequently.
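
    To illustrate, here is a minimal sketch of such a release script (the product name, build system and archive layout are hypothetical), which captures the manual build, package and staging steps in a single command:

    #!/bin/sh -e
    # Hypothetical release script: compile, stage and package in one step,
    # instead of clicking through build and package tasks in an IDE.
    VERSION=${VERSION:-1.0}

    make clean all                               # build all components
    make DESTDIR=$(pwd)/dist install             # stage the build output
    tar czf myproduct-$VERSION.tar.gz -C dist .  # produce a release archive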

  3. Decompose your system and its build processes. Although I have seen many software projects using some kind of automation, I have also encountered a lot of build tools which treat the complete system as a single monolithic blob, which must be built and deployed as a whole.

    In my opinion it's better to decompose your system into parts which can be built separately, for the following reasons:

    • It may significantly reduce build times, because only components that have been changed have to be rebuilt. There is no need to reexamine the complete codebase for each change.
    • It increases flexibility, because it becomes easier to replace specific parts of a system with other variants.
    • It allows a better means of sharing components across systems and better integration in different environments. As I have stressed in an earlier blog post, software nowadays is rarely self-contained and runs in many types of environments.
    • It allows you to perform builds faster, because they can be performed in parallel.

    In order to decompose a system properly, you have to think early about a good architecture for your system and keep reflecting on it. As a general rule, the build process of a sub component should only depend on the source code, the build script and the build results of its dependencies. It should not be necessary to have cross references between source files of different components.

    Typically, a bad architecture for a system also implies a very complex and inefficient build process.

    I have encountered some extreme cases, in which the source code of a system is one big directory of files, containing hundreds of dependencies, including third-party libraries in binary form and infrastructure components (such as the database and web servers) with many tweaks and custom configuration files.

    Although such code bases can be automatically deployed, they offer almost no flexibility, because it is a big burden to replace a specific part (such as a third-party library) and still ensure that the system works as expected. They also require the system to be deployed in a clean environment, because they offer no integration. Typically, these kinds of systems aren't updated that frequently either.

  4. Make build-time and run-time properties of components configurable. Another obvious lesson, but I quite frequently encounter build processes which make implicit assumptions about the locations where specific dependencies can be found.

    For example, in some Linux packages, I have seen a number of Makefiles with hardcoded references to the /usr/bin directory for finding specific build tools, which I have to replace myself. Although most Linux systems store these build-time dependencies in this folder, there are some exceptions, such as NixOS and GoboLinux.

    In my opinion it's better to make all locations and settings configurable, either by allowing them to be specified as build parameters or environment variables (e.g. PATH), or by storing them in a configuration file which can be easily edited.
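
    For example, a build script could resolve its tools and target locations through such parameters (a minimal sketch; the variable defaults are just illustrative), so that a caller on a system like NixOS or GoboLinux can simply override them:

    #!/bin/sh -e
    # Resolve tools and locations through parameters instead of
    # hardcoding paths such as /usr/bin:
    CC=${CC:-cc}                  # compiler can be overridden by the caller
    PREFIX=${PREFIX:-/usr/local}  # installation prefix is a parameter

    $CC -o hello hello.c
    mkdir -p $PREFIX/bin
    cp hello $PREFIX/bin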

  5. Isolate your build artifacts. I have noticed that some development environments store the output of a build in separate folders (for example Eclipse Java projects) whereas others store all the build artifacts of all projects in a single folder (such as Visual Studio C# projects). Similarly, with end user installations in production environments, libraries are also stored in global directories, such as /usr/lib on Linux environments, or C:\Windows\System32 on Windows.

    Based on my experience with Nix, I think it's better to store the build output of every component in isolation, by storing them in separate folders. Although this makes it harder to find components, it also offers some huge benefits, such as the ability to store multiple versions next to each other. Also upgrades can be made more reliable, because you can install another version next to an existing one. Furthermore, builds can also be made more pure and reproducible, because it's required to provide the locations of these components explicitly. In global folders such as /usr/lib a dependency may still be implicitly found even if it's not specified, which cannot happen if all the dependencies are stored in separate locations. More information about isolation can be found in my blog post about NixOS and the FHS.
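
    For example, with an autotools-based package you can already approximate this kind of isolation by giving every package version its own prefix (the directory layout below is just an illustration). Nix takes the same idea further by making the prefix unique for every build variant:

    $ ./configure --prefix=/usr/local/packages/foo-1.2.3
    $ make && make install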

  6. Think about a component naming convention. Apart from storing components in separate directories, which offers some benefits, it's also important to think about how to name them. Typically, the most common naming scheme consists of a name and a version number. This naming scheme can be insufficient in some cases, for example:

    • It does not take the architecture of the component into account. In a name-version naming scheme a 32-bit variant and a 64-bit variant are considered the same, while they may not be compatible with another component.
    • It does not take the optional features into account. For example, another component may require certain optional features enabled in order to work. It may be possible that a program incorrectly uses a component which does not support a required optional feature.

    I recommend also taking relevant build-time properties into account when naming a component. In Nix you get such a naming convention for free, although it is also very strict. Nix hashes all the build-time dependencies, such as the source code, libraries, compilers and build scripts, and makes this hash code part of the component name, for example: /nix/store/r8vvq9kq18pz08v24918my6r9vs7s0n3-firefox-8.0.1. Because of this naming scheme, different variants of a component, which are (for example) compiled with another version of a compiler or for a different system architecture, do not share the same name.

    In some cases, it may not be necessary to adopt such a strict naming scheme, because the underlying component technology is robust enough to offer sufficient flexibility. In one of my case studies covering Visual Studio projects, we only distinguish between variants of components by taking the name, version, interface and architecture into account, because they have no optional features and we assume the .NET framework is always compatible. We organize the filesystem in such a way that 32-bit and 64-bit variants are stored in separate directories. Although a naming scheme like this is less powerful than Nix's, it already offers huge benefits compared to traditional deployment approaches.

    Another example of a more strictly named component store is the .NET Global Assembly Cache (GAC), which distinguishes between components based on the name, version, culture and a cryptographic key. However, the GAC is only suitable for .NET library assemblies, not for other types of components, and it cannot take other build properties into account.

  7. Think about component composition. A system must be able to find its dependencies at build-time and run-time. If components are stored in separate folders, you need to take extra effort to compose them. Some methods of addressing dependencies are:

    • Modify the environment. Most platforms use environment variables to address components, such as PATH for looking up executables, CLASSPATH for looking up Java class files, and PERL5LIB for addressing Perl libraries. In order to perform a build, these environment variables must be adapted to contain all the dependencies. In order to run the executable, it needs to be wrapped in a process which sets these environment variables. Essentially, this method binds dependencies statically to another component.
    • Composing symlink trees. You can also provide a means of looking up components from a single location by composing the contents of the required dependencies into a symlink tree and referring to its contents. In conjunction with a chroot environment and a bind mount of the component directory, you can make builds pure. Because of this dynamic binding approach, you can replace dependencies without a rebuild/reconfiguration, but you cannot easily run one executable that uses a particular variant of a library while another uses a different variant. This method of addressing components is used by GoboLinux.
    • Copy everything into a single folder. This is a solution which always works; however, it is also very inefficient, as you cannot share components.

    Each composition approach has its own pros and cons. The Nix package manager supports all three approaches; however, the first approach is mostly used for builds, as it is the most reliable one: static binding always ensures that the dependencies are present and correct, and you can safely run multiple variants of compositions simultaneously. The second approach is used by Nix for creating Nix profiles, which end users can use to conveniently access their installed programs.
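
    To illustrate the first approach, the sketch below shows a hypothetical wrapper script that statically binds a Java program to its dependencies by setting environment variables before starting the real executable (all paths and names are made up):

    #!/bin/sh
    # Wrapper that statically binds the dependencies to the program:
    export PATH=/opt/packages/jdk-6u27/bin:$PATH
    export CLASSPATH=/opt/packages/libfoo-1.2/share/java/foo.jar:$CLASSPATH
    exec java -jar /opt/packages/myapp-1.0/share/java/myapp.jar "$@"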

  8. Granularity of dependencies. This lesson is very important when deploying service-oriented systems in a distributed environment. If service components have dependencies on components with a large level of granularity, upgrades may become very expensive, because some components are unnecessarily rebuilt/reconfigured.

    I have encountered a case in which a single configuration file was used to specify the locations of all the service components of a system. When the location of a particular service component changed, every service component had to be reconfigured (even services that did not need access to that particular service component), which is very expensive.

    It's better to design these configuration files in such a way that they only contain the properties that a service component actually needs to know.


Conclusion


In this blog post I have listed some interesting lessons for improving the deployment process of complex systems, based on my experience with Nix and related tools. These lessons can be implemented in various deployment processes and tools. Furthermore, by taking them into account, it also becomes easier to adopt Nix related tooling.

I also have to remark that it is not always straightforward to implement all these lessons. For example, it is hard to integrate Visual Studio C# projects into a stricter deployment model supporting isolation, because the .NET runtime has no convenient way to address run-time dependencies in arbitrary locations. However, there is a solution for this particular problem, which I have described in an earlier blog post about .NET deployment with Nix.

UPDATE: I have given a presentation about this subject at Philips recently. As usual, the slides can be obtained from the talks page of my homepage.

Monday, December 12, 2011

An evaluation and comparison of GoboLinux


In this blog post, I'm going to evaluate the deployment properties of GoboLinux and compare them with NixOS. Both Linux distributions are unconventional, because they deviate from the Filesystem Hierarchy Standard (FHS). Furthermore, GoboLinux shares the idea that the filesystem can be used to organize the management of packages and other system components, instead of relying on databases.

The purpose of this blog post is not to argue which distribution is better, but to see how this distribution achieves certain deployment properties, what some of the differences compared to NixOS are, and what we can learn from it.

Filesystem organization


Unlike NixOS, which only deviates from the FHS where necessary, GoboLinux has a filesystem tree which deviates from it completely, as shown below:


GoboLinux stores static parts of programs in isolation in the /Programs folder. The /System folder is used to compose the system, to store system configuration settings and variable data. /Files is used for storing data not belonging to any program. /Mount is used for accessing external devices such as DVD-ROM drives. The /Users folder contains home directories. /Depot is a directory without a predefined structure, which can be organized by the user.

GoboLinux also provides compatibility with the FHS. For example, the /usr and /bin directories are actually also present, but not visible. These directories are in fact symlink trees referring to files in the /System directory (as you may see in the picture above). They are made invisible to end users by a special kernel module called GoboHide.

Package organization


Like NixOS, GoboLinux stores each package in a separate directory, which does not change after it has been built. However, GoboLinux uses a different naming convention than NixOS. In GoboLinux every package uses the /Programs/<Name>/<Version> naming convention, such as /Programs/Bzip2/1.0.4 or /Programs/LibOGG/1.1.3. Furthermore, each program directory contains a Current symlink which points to the version of the component that is actually in use. Some program directories also have a Settings/ directory containing system configuration files for the given package.

The use of isolated directories offers a number of benefits, as in NixOS:

  • Because every version is stored in its own directory, we can safely store multiple versions next to each other, without having to worry that a file of another version gets overwritten.
  • By using a symlink pointing to the current version, we can atomically switch between versions by flipping the symlink (a technique also used by Nix to switch profiles), which ensures that there is never a situation in which a composition contains files of both the old and the new version. A sketch of such a flip is shown below.
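
One way to implement such an atomic flip (with hypothetical version numbers) is to create the new symlink first and then rename it over the old one, which is a single atomic filesystem operation:

$ cd /Programs/Bzip2
$ ln -s 1.0.5 Current.new
$ mv -T Current.new Current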

In contrast to NixOS, GoboLinux uses a nominal naming convention and nominal dependency specifications, which are based on the name of the package and the version number. With a nominal naming convention, it cannot make a distinction between several variants of components, for example:

  • It does not reflect which optional features are enabled in a package. For example, many libraries and programs have optional dependencies on other libraries. Some programs may require a particular variant of a library with a specific option enabled.
  • It does not take build-time dependencies into account, such as the versions of build tools and libraries (e.g. the version of GCC or glibc). Older versions may have ABI incompatibilities with newer versions.
  • It does not take the architecture of the binaries into account. Using this naming scheme, it is harder to store 32-bit and 64-bit binaries safely next to each other.

In NixOS, however, every package name is an exact specification, because the component name contains a hash code derived from all the build-time dependencies used to build the package, such as the compiler version, libraries and build scripts.

For example, in NixOS bzip2 may be stored under the following path: /nix/store/6iyhp953ay3c0f9mmvw2xwvrxzr0kap5-bzip2-1.0.5. If bzip2 is compiled with a different version of GCC or linked to a different version of glibc, it gets a different hash-code and thus another filename. Because no component shares the same name, it can be safely stored next to other variants.

Another advantage of the Nix naming convention is that unprivileged users can also build and install software without interfering with other users. If, for example, a user injects a trojan horse into a particular package, the resulting component is stored in a different Nix store path and will not affect other users.

However, because of these exact dependency specifications, upgrades in NixOS may be more expensive. In order to link a particular program to a new version of a library, it must be rebuilt, while in GoboLinux the library dependency can be replaced without rebuilding (although this does not guarantee that the upgraded composition will work).

System composition


Of course, storing packages in separate directories does not immediately result in a working system. In GoboLinux, the system configuration and structure is composed in the /System directory.

The /System directory contains the following directories:

  • Kernel/, contains everything related to the Linux kernel, such as Boot/ storing kernel images, Modules/ storing kernel modules and Objects/ storing device files.
  • Links/, is a big symlink tree composing the contents of all packages that are currently in use. This symlink tree makes it possible to refer to executables in Executables/, libraries in Libraries/ and other files from a single location. The Environment/ directory is used to set essential environment variables for the installed programs.
  • Settings/ contains configuration files, which in other distributions are commonly found in /etc. Almost all files are symlinks to the Settings/ folder included in each program directory.
  • Variable/ contains all non-static (variable) data, such as cache and log files. In conventional distributions these files are stored in /var.

Like NixOS, GoboLinux uses symlink trees to compose a system and to make the contents of packages accessible to end users.

Recipes


Like NixOS, GoboLinux uses declarative specifications to build packages. GoboLinux calls these specifications recipes. Each recipe directory contains a file called Recipe which describes how to build a package from source code and a folder Resources/ defining various other package properties, such as a description and a specification of build-time and run-time dependencies.

compile_version=1.8.2
url="$httpSourceforge/lesstif/lesstif-0.95.2.tar.bz2"
file_size=2481073
file_md5=754187dbac09fcf5d18296437e72a32f
recipe_type=configure
make_variables=(
   "ACLOCALDIR=$target/Shared/aclocal"
   "aclocaldir=$target/Shared/aclocal"
)

An example of a recipe, describing how to build Lesstif, is shown above. Basically, for autotools-based projects, you only need to specify the location where the source code tarball can be obtained and some optional parameters. From this recipe, the tarball is downloaded from the SourceForge web server and the standard autotools build procedure is performed (i.e. ./configure; make; make install) with the given parameters.

In Resources/Dependencies, run-time dependencies can be specified, and in Resources/BuildDependencies, build-time dependencies. The dependencies are specified by giving a name and an optional version number or version number range.

GoboLinux recipes and Nix expressions both offer abstractions for declaratively specifying build actions. A big difference between these specifications is the way dependencies are specified. In GoboLinux, only nominal dependency specifications are used, which may not be complete enough, as explained earlier. In Nix expressions, you refer to build functions that build these dependencies from source code and their build-time dependencies, which are stored in isolation in the Nix store using hash codes.

Furthermore, in Nix expressions run-time and build-time dependencies are not specified separately. In Nix, every run-time dependency is specified as a build-time dependency. After a build has been performed, Nix conservatively scans the realized component for occurrences of the hashes of its build-time dependencies, to identify them as run-time dependencies. Although this sounds risky, it works extremely well, because the chances are very slim that an exact occurrence of a hash code represents something else.
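
Conceptually, it is as if the build output is scanned for the hash parts of all build-time dependency paths, roughly like the command below, which reuses the bzip2 path shown earlier (purely illustrative; Nix implements this scan internally):

$ grep -r 6iyhp953ay3c0f9mmvw2xwvrxzr0kap5 /nix/store/<hash>-mypackage-1.0

If the hash occurs anywhere in the build output of mypackage (e.g. in an RPATH or a shell script), bzip2 is registered as one of its run-time dependencies.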

Building packages


In GoboLinux, the Compile tool can be used to build a package from a recipe. Builds performed by this tool are not entirely pure, because it looks up dependencies in the /System/Links directory. Because this directory contains all the packages installed on the system, the build of a recipe may accidentally succeed when a dependency is not specified, because it can still be found implicitly.

In order to make builds pure, GoboLinux provides the ChrootCompile extension, which performs builds in a chroot environment. The ChrootCompile tool bind mounts the /Programs directory in the chroot environment and creates a small /System tree containing only symlinks to the dependencies that are specified in the recipe. Because only the specified dependencies can be found in the /System/Links directory, a build cannot accidentally succeed if a dependency has been omitted.

Both Nix and ChrootCompile are able to prevent builds with undeclared dependencies from accidentally succeeding, which improves reproducibility. In Nix, this goal is achieved differently: the environment variables in which a build is performed are completely cleared (well, not completely, but almost :-) ), and the specified dependencies are added to PATH and other environment variables, which allows build tools to find them.

Nix builds can optionally be performed in a chroot environment, but this is not mandatory. In NixOS, the traditional FHS directories, such as /usr, don't exist and cannot make a build impure. Furthermore, common utilities such as GCC have been patched so that they ignore standard directories, such as /usr/include, removing many impurities.

Furthermore, Nix also binds dependency relationships statically to executables (e.g. by modifying the RPATH header in an ELF binary), instead of allowing binaries to look in global directories like /System/Links, as GoboLinux executables do. Although GoboLinux builds are pure inside a chroot environment, their run-time behaviour may change when a user decides to upgrade a version of a dependency.
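
This static binding can be observed with standard ELF tools, for example (the store path below is a placeholder; readelf is part of GNU binutils):

$ readelf -d /nix/store/<hash>-bzip2-1.0.5/bin/bzip2 | grep RPATH

The output shows a library search path consisting of Nix store paths, so the executable keeps using exactly the library versions it was built against, regardless of what else gets installed later.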

Conclusion


In this blog post I have evaluated GoboLinux and compared some of its features with NixOS. In my opinion, this evaluation shows a number of interesting lessons:

  • There are many deployment related aspects (and perhaps other aspects as well) that can be solved and improved by just using the filesystem. I often see people writing additional utilities, abstraction layers and databases to achieve similar things, while you can also use symlink trees and bind mounts to provide abstractions and compositions. This also shows that the filesystem, an essential key component of UNIX-like systems, is still important and very powerful.
  • Storing packages in separate directories is a simple way to manage their contents, store multiple versions safely next to each other and to make upgrades more reliable.

GoboLinux and NixOS have a number of similar deployment properties because of their unorthodox filesystem organization, but also some differences and limitations. The following overview summarizes the differences covered in this blog post:

  • FHS compatibility: GoboLinux: yes (through hidden symlink trees); NixOS: no (deviates on some FHS aspects)
  • Component naming: GoboLinux: nominal (name-version); NixOS: exact (hash-name-version)
  • Component binding: GoboLinux: dynamic (a dependency can be replaced without a rebuild); NixOS: static (a rebuild is required if a dependency changes)
  • Granularity of component isolation: GoboLinux: only between versions; NixOS: between all build-time dependencies (including versions)
  • Unprivileged user installations: GoboLinux: no; NixOS: yes
  • Build specifications: GoboLinux: stand-alone recipes; NixOS: Nix expressions, which need to refer to all dependency expressions
  • Build-time dependency addressing: GoboLinux: the /System/Links symlink tree in a chroot environment; NixOS: environment variables + modified tools + an optional chroot environment
  • Run-time dependency identification: GoboLinux: specified manually; NixOS: extracted by scanning for hash occurrences
  • Run-time dependency resolving: GoboLinux: dynamic, by looking in /System/Links; NixOS: static (e.g. encoded in the RPATH)

The general conclusion of this blog post is that both distributions achieve better deployment properties than conventional Linux distributions through their unorthodox filesystem organisation. Because of the purely functional deployment model of the Nix package manager, NixOS is more powerful (and stricter) than GoboLinux when it comes to reliability, although this comes at the expense of extra rebuild time and additional disk space.

The only bad thing I can say about GoboLinux is that the latest release, 014.01, is outdated (2008), and it looks like 015 has been in progress for almost three years... I'm not sure whether there will be a new release soon, which is a pity.

And of course, apart from these deployment properties, there are many other differences I haven't covered here, but that's up to the reader to make a decision.

I'm planning to use these lessons for a future blog post, which elaborates more on techniques for making software deployment processes more reliable.

Tuesday, November 29, 2011

On Nix, NixOS and the Filesystem Hierarchy Standard (FHS)

Our work on NixOS has been a topic of discussion within various free software/open source projects and websites. For example, we have been discussed on the Debian mailing list and on Linux Weekly News.

One of the criticisms I often receive is that we don't comply with the Filesystem Hierarchy Standard (FHS). In this blog post, I'd like to respond to that criticism and explain why NixOS deviates from the FHS.

What is the Filesystem Hierarchy Standard?


The purpose of the Filesystem Hierarchy Standard (FHS) is to define the main directories and their contents in Linux operating systems. More specifically:

  • It defines directory names and their purposes. For example, the bin directory is for storing executable binaries, sbin is for storing executable binaries only accessible by the super-user, and lib is for storing libraries.
  • It defines several hierarchies within the filesystem, which have separate purposes:

    The primary hierarchy, /, contains essential components for the boot process and system recovery. The secondary hierarchy, /usr, contains components not relevant for booting and recovery. Moreover, files in this directory should be shareable across multiple machines, e.g. through a network filesystem. The tertiary hierarchy, /usr/local, is for the system administrator to install software locally.

    Hierarchies and special purpose directories are usually combined. For example, the /bin directory contains executable binaries relevant for booting and recovery which can be used by anyone. The /usr/bin directory contains executable binaries not relevant for booting or recovery, such as a web browser, which can be shared across multiple machines. The /usr/local/bin directory contains local executables installed by the system administrator.

    The FHS also defines the /opt directory for installing add-on application software packages. This directory can also be considered a separate hierarchy, although it is not defined as such in the standard. It contains the same special purpose directories, such as bin and lib, as the primary, secondary and tertiary hierarchies. Furthermore, the /opt/<package name> convention may also be used, which stores files specific to a particular application in a single folder.
  • It also makes a distinction between static and variable parts of a system. For example, the contents of the secondary hierarchy /usr are static and could be stored on a read-only mounted partition. However, many programs need to modify their state at runtime and store files on disks, such as cache and log files. These files are stored in variable storage directories, such as /var and /tmp.
  • It defines what the contents of some folders should look like. For example, a list of binaries which should reside in /bin, such as a Bourne compatible shell.
  • The standard has some practical problems. Some aspects of the filesystem are undefined and need clarification, such as a convention for storing cross-compiling libraries. Furthermore, the standard is quite old, and newer Linux features, such as the /sys directory, are not defined in it. To cope with these issues, many distributions have additional clarifications and policies, such as the Debian FHS policy.

Linux Standard Base


The Filesystem Hierarchy Standard (FHS) is part of another standard: The Linux Standard Base (LSB), which incorporates and extends several other standards, such as POSIX and the Single UNIX specification.

The LSB standard has been developed because there are many Linux-based systems out there. What they all have in common is that they run the Linux kernel, and quite often a set of GNU utilities (that's why the Free Software Foundation advocates GNU/Linux as the official name for these kinds of systems).

Apart from some similarities, there are many ways in which these systems differ from each other, such as the package manager which is being used, the way the file system is organized, the software which is supported etc.

Because there are many Linux systems available, the Linux Standard Base is designed to increase compatibility among these Linux distributions, so that it becomes easier for software vendors to ship and run Linux software, even in binary form. Many of the common Linux distributions, such as Debian, SuSE and Fedora, try to implement this standard, although it has several issues and criticisms.

NixOS and the FHS



In NixOS, we have an informal policy to follow the FHS as closely as possible and to deviate from it where necessary. An important aspect is that we can't follow the structure of the primary (/), secondary (/usr) and tertiary (/usr/local) hierarchies.

The main reason to refrain from using these hierarchies is that they don't provide isolation. For example, the /usr/lib directory contains a wide collection of shared libraries belonging to many packages. To determine to which package a particular library belongs, you need to look in the database of the package manager. Moreover, because most libraries can be found in the same location, e.g. /usr/lib, it is very tempting to forget to specify a dependency, because it can still be found implicitly.

The purpose of the Nix package manager is to achieve purity, which means that the build result of a package should exclusively depend on the source code and input parameters. Purity ensures that the build of a package can be reproduced anywhere and that a package can be transferred to any machine we want, with the guarantee that the dependencies are present and correct.

Nix achieves purity by using the filesystem as a database and by storing packages in isolation in a special directory called the Nix store. For example:

/nix/store/r8vvq9kq18pz08v249h8my6r9vs7s0n3-firefox-8.0.1

The path above refers to the Mozilla Firefox web browser. The first part of the directory name, r8vvq9kq18pz08v249h8my6r9vs7s0n3, is a hash code derived from all the build-time dependencies. By using this naming convention, components can be safely stored in isolation from each other, because no component shares the same name. If Firefox is compiled with a different version of GCC or linked against a different version of the GTK+ library, the hash code will differ and thus not interfere with another variant.

Because of the naming scheme of the Nix store and the fact that we don't use global directories to store binaries and libraries, the build of a Nix package will typically fail if a required build-time dependency is omitted. Furthermore, it also prevents undeclared dependencies from accidentally making a build succeed (and therefore also prevents incomplete dependency specifications).

Another deviation from the FHS is that we don't follow the list of required binaries and libraries in the /bin, /lib, etc. directories. For example, the FHS requires a Bourne compatible shell in /bin/sh, some mkfs.* utilities in /sbin and, for historic reasons, a sendmail executable in /usr/lib.

In NixOS, apart from the /bin/sh executable (which is a symlink to the bash shell in the Nix store), we don't store any binaries in these mandatory locations, because it is not necessary and it also makes builds impure.

Discussion


Now that I have explained why we deviate from the FHS on some points, you may wonder how we achieve certain properties defined in the FHS in NixOS:

  • How to determine files for the boot process and recovery? Because we don't use hierarchies, files relevant for the boot process can't be found in /bin and /lib. In NixOS, we basically generate a small base system as a Nix component serving this purpose, which is stored as a single component in the Nix store.
  • How to share components? Because we have no secondary hierarchy, you can't share components by storing the /usr directory on a network file system. In Nix, however, the entire Nix store is static and shareable. Furthermore, it's even more powerful, because the hash codes inside the component names prevent different variants of NixOS from using incorrect versions of a component. For example, you can safely share the same Nix store across 32-bit and 64-bit machines, because this parameter is reflected in the hash.

    In fact, we already use sharing extensively in our build farm to generate virtual machines for performing test cases. Apart from a Linux kernel image and an initial RAM disk, the components of the entire virtual network are shared through the host system's Nix store, which is mounted as a network file system.
  • How to find dependencies? Because we store all software packages in the Nix store, it is harder to address components in e.g. scripts and configuration files, because all these components are stored in separate directories, and aren't necessarily in the user's PATH.

    I have to admit that this is inconvenient in some cases; however, in NixOS we usually don't manually edit configuration files or write scripts.
    NixOS is a distribution that is unconventional in this sense, because its goal is to make the entire deployment process model-driven. Instead of manually installing packages and adapting configuration files in e.g. /etc, in NixOS we generate all the static parts of a system from Nix expressions, including all configuration files and scripts. The Nix expression language provides the paths to the components.
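
    For example, a tiny fragment of a NixOS system configuration could look as follows (a simplified sketch; the exact option names depend on the NixOS version):

    { config, pkgs, ... }:

    {
      boot.loader.grub.device = "/dev/sda";      # where GRUB is installed
      services.sshd.enable = true;               # run the OpenSSH daemon
      environment.systemPackages = [ pkgs.mc ];  # packages in the system profile
    }

    From such a declarative specification, NixOS generates the actual configuration files and scripts, in which the Nix expression language fills in the Nix store paths of all the required components.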

Improvements


Although we have to deviate from the FHS for a number of reasons, there are also a few improvements we could make in the organization of the filesystem:

  • The directory names and purposes within the hierarchies can be more consistently used within Nix packages. Essentially, you could say that every component in the Nix store is a separate hierarchy.

    For example, I think it's a good thing to use /nix/store/<package>/bin for storing binaries and /nix/store/<package>/lib for libraries. Sometimes the installation scripts of packages do not completely adhere to the FHS and need some fixing. For example, some packages install libraries in the libexec directory, which is not defined by the FHS.

    We could add a check to the generic builder of Nixpkgs, which verifies the structure of a package and reports inconsistencies.
  • For the variable parts of the system, we could adhere better to the FHS. For example, the current version of the FHS defines the /srv directory, used for files served to end-users through a service (e.g. a web server). In NixOS, this directory is not used.
  • Because the FHS standard has some gaps, we could also define some additional clarifications, like other Linux distributions do.

Conclusion


In this blog post I've explained the idea behind the Filesystem Hierarchy Standard (FHS) and why we deviate from it in NixOS. The main reason is that the organization of specific parts of the file system conflicts with important aspects of the Nix package manager, such as the ability to store components in isolation and to guarantee correct and complete dependencies.

Furthermore, apart from the FHS, NixOS cannot completely implement other parts of the LSB either. For example, the LSB defines a subset of the RPM package manager as the default package manager, which imperatively modifies the state of a system.

If we implemented this in NixOS, we could no longer ensure that the deployment of a system is pure and reproducible. Therefore, we sometimes have to break with the traditional organization and management of Linux systems. However, sometimes I get the impression that the FHS is considered a holy book and that by breaking with it, we are considered heretics.

Related work


  • GoboLinux is also a Linux distribution that does not obey the FHS and has a custom filesystem organization. They also have the vision that the filesystem can act as a database for organizing packages.
  • The Mancoosi project is also a research project investigating package management related problems, like we do. Their research differs from ours, because they don't break away from the traditional filesystem organization and management of Linux systems. For example, a lot of their research deals with the maintainer scripts of packages, whose goal is to "glue" files from a package to files on the system, e.g. by imperatively modifying configuration files.

    Keeping the traditional management intact introduces a whole bunch of challenges in making deployment efficient, reliable and reproducible. For example, they have investigated modeling techniques for maintainer scripts and system configurations, to simulate whether an upgrade will succeed and to determine inverse operations for performing rollbacks.

    In our research, we achieve more reliable deployment by breaking from the traditional model.

Monday, October 31, 2011

Deploying .NET services with Disnix

In two earlier blog posts, I have explained how the Nix package manager can be used to deploy .NET software. I'm happy to report that I have extended these possibilities to Disnix.

With these new features it is possible to develop databases for Microsoft SQL Server, implement WCF web services and ASP.NET web applications (which may be inter-connected with each other), and to automatically and reliably deploy them in a network of machines using Disnix.

Modifications to Disnix


The modifications I had to make to Disnix were relatively minor. Disnix can already be compiled on Cygwin, just like the Nix package manager. Since Disnix is built on top of Nix, it reuses the Nix functions I have developed in earlier blog posts to build Visual Studio projects.

The only missing piece in the deployment process is the activation and deactivation of Microsoft SQL Server databases and of ASP.NET web applications on Internet Information Services (IIS), for which activation scripts must be developed. Luckily, Microsoft SQL Server and IIS have command-line tools which can be scripted to do this job.

To support these new types of services, I have developed the following activation scripts:

  • mssql-database. This activation script loads a schema on initial startup if the database does not exist. It uses the OSQL.EXE tool included with Microsoft SQL Server to execute SQL instructions from shell scripts, checking whether the database exists and creating the tables if needed.
  • iis-webapplication. This activation script activates or deactivates a web application on Microsoft Internet Information Services. It uses the MSDeploy.exe tool to automatically activate or deactivate a web application.

These activation scripts are automatically used by assigning the mssql-database or iis-webapplication type to a service in the Disnix services model, as illustrated below.
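
For example, a fragment of a services model using these types could look as follows (a hedged sketch; the customPkgs reference and the exact attribute values are hypothetical, but the structure follows the usual Disnix conventions):

zipcodes = {
  name = "zipcodes";
  pkg = customPkgs.zipcodes;          # build function producing the schema
  type = "mssql-database";            # activated through OSQL.EXE
};

ZipcodeService = {
  name = "ZipcodeService";
  pkg = customPkgs.ZipcodeService;    # build function for the WCF service
  dependsOn = { inherit zipcodes; };  # the database back-end it connects to
  type = "iis-webapplication";        # activated through MSDeploy.exe
};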

Installing Disnix on Windows


It is important to know how to get Disnix working on Windows and how to enable support for .NET applications. Most of the installation steps on Windows/Cygwin are the same as on UNIX systems. Details of the Disnix installation process can be found in the Disnix documentation.

However, there are a number of details that must be taken care of which are not described in the manual (yet). Furthermore, the best way to get Disnix working is to compile it from source, so that all the required optional features are enabled.

In order to enable the mssql-database and iis-webapplication activation types, you must first manually install SQL server and IIS on your Windows system:



Moreover, the configure script of the disnix-activation-scripts package must be able to find the OSQL.EXE and MSDeploy command-line tools, which must be in your PATH. Otherwise, the activation types that we need are disabled and we cannot deploy .NET applications.

On my Windows 7 machine, OSQL can be found in C:\Program Files\Microsoft SQL Server\100\Tools\Binn and MSDeploy in C:\Program Files\IIS\Microsoft Web Deploy. I have included a screenshot above, which shows what the output of the configure script should look like. As you can see, the configure script was able to detect the locations of the command-line tools, because I have adapted the PATH environment variable.

Running the Disnix daemon


We also need to run the Disnix daemon on every machine in the network, so that we can remotely deploy the services we want. Probably the best way to get Cygwin services running is to use the cygrunsrv command, which runs Cygwin programs as Windows services.

Since the core Disnix daemon is a D-Bus service, we need to run the D-Bus system daemon, which can be configured by typing:

$ cygrunsrv -I dbus -p /usr/sbin/dbus-daemon.exe -a \
    '--system --nofork'

The Disnix service can be configured by typing:

$ cygrunsrv -I disnix -p /usr/local/bin/disnix-service.exe -a \
  '--activation-modules-dir /usr/local/libexec/disnix/activation-scripts' \
  -e 'PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin' \
  -y dbus

Disnix also needs to be remotely connectible. Disnix can use various interfaces, but the recommended one is SSH. In order to connect through SSH, you also need to configure an SSH server. This can be done by executing the following script on Cygwin:

$ ssh-host-config

And you probably need to configure some SSH keys as well, to prevent Disnix from asking for passwords for each operation. Check the OpenSSH documentation for more information.


After configuring the services, you probably need to activate them for the first time, which can be done via the Windows service manager (Control Panel -> System and Security -> Administrative Tools -> Services). You need to pick the Disnix service and select the start option. If you want to use the SSH server, you need to pick and start the 'CYGWIN sshd' service as well. A screenshot is included above.

Example case


Now that I have explained how Disnix can be installed and configured on Windows, we probably also want to see what its capabilities are. As an example case, I have ported the StaffTracker, a motivating example in our WASDeTT paper, from Java to .NET technology, using C# as the implementation language, ADO.NET as the database manager, WCF to implement the web services, and ASP.NET to create the web application front-end.


It was a nice opportunity to learn some of these technologies. I have to admit that Visual Studio 2010 was a very convenient development environment, and it didn't take me much time to port the example case. Although I was impressed by this, I currently have no plans to write any software using .NET technology except for this example case. (Perhaps I will port it to Mono as an experiment some day.)

Distributed deployment


In order to make our example deployable through Disnix, I had to write a Disnix expression for each service component, as well as a services, an infrastructure and a distribution model. A Disnix expression for a WCF web service looks like this:

{dotnetenv}:
{zipcodes}:

dotnetenv.buildSolution {
  name = "ZipcodeService";
  src = ../../../../services/webservices/ZipcodeService;
  baseDir = "ZipcodeService";
  slnFile = "ZipcodeService.csproj";
  targets = "Package";
  preBuild = ''
    sed -e 's|.\SQLEXPRESS|${zipcodes.target.hostname}\SQLEXPRESS|' \
        -e 's|Initial Catalog=zipcodes|Initial catalog=${zipcodes.name}|' \
        -e 's|User ID=sa|User ID=${zipcodes.target.msSqlUsername}|' \
        -e 's|Password=admin123$|Password=${zipcodes.target.msSqlPassword}|' \
        Web.config
  '';
}

As you may notice, the expression above looks similar to an ordinary Nix expression for building a Visual Studio project, except that it uses the inter-dependency parameter (zipcodes) to configure a database connection string defined in the Web.config file, so that the web service can connect to its database back-end. More information about setting a connection string through this configuration file can be found here: http://weblogs.asp.net/owscott/archive/2005/08/26/Using-connection-strings-from-web.config-in-ASP.NET-v2.0.aspx.

By using the Disnix expressions in conjunction with the services, infrastructure and distribution models, the .NET example can be deployed in a network of machines with a single command-line instruction:

$ disnix-env -s services.nix -i infrastructure.nix -d distribution.nix

Below, I have included a screenshot showing a Disnix deployment scenario with our .NET StaffTracker example. You can see a console showing Disnix output, a web browser displaying the entry page of the ASP.NET web application front-end, the IIS manager showing various deployed WCF web services, and the SQL Server Management Studio showing a number of deployed databases. The complete system is deployed using a single command-line instruction.


Limitations


In this blog post I have shown how service-oriented .NET applications can be automatically deployed with Disnix. There are several slight inconveniences, however:

  • Disnix is used to manage the service components of a system. Disnix does not deploy infrastructure components, such as a web server or a database server. DisnixOS is a NixOS-based extension that takes care of this. However, DisnixOS cannot be used on Windows, because SQL Server and IIS are tightly integrated into the Windows operating system and registry. We cannot use the Nix store to safely isolate them. You either need to install these infrastructure components manually or use other deployment solutions.
  • As mentioned in our previous blog post about .NET deployment, the .NET framework needs to be installed manually in order to be able to build Visual Studio projects. On most Windows installations, however, the .NET framework is already included.
  • The .NET build functions and activation scripts are quite new and not very well tested. Moreover, they could also break, because we currently have no way to test them automatically, like we do with Linux software.
  • Also, not all desired deployment features may be supported yet. I'd like to receive feedback on this :-)

References

Thursday, October 27, 2011

Software deployment complexity

In this blog post, I'd like to talk about the software deployment discipline in general. In my career as a PhD student and while visiting academic conferences, I have noticed that software deployment is not (and has never been) a very popular research subject within the software engineering community.

Furthermore, I have encountered many misconceptions about what software deployment is supposed to mean, and some people are even surprised that research is done in this field. I have also received some vague complaints from certain reviewers, saying that the things we do aren't novel, and comments such as: "hmm, what can the software engineering community learn from this? I don't see the point..." and "this is not a research paper".

What is software deployment?


So what is software deployment actually? One of the first academic research papers on software deployment, by Carzaniga et al. [1], describes the discipline as follows:

Software deployment refers to all the activities that make a software system
available for use

Some of the activities that may be required to make a software system available for use are:

  • Building software components from source code
  • Packaging software
  • Transferring the software from the producer site to consumer site
  • Installation of the software system
  • Activation of the software system
  • Software upgrades


An important thing to point out is that the activities described above are all steps to make a software system available for use. I have noticed that many people mistakenly think that software deployment is just the installation of a system, which is not true.

Essentially, the point of software deployment is that a particular software system is developed with certain goals, features and behaviour in mind by its developers. Once this software system is to be used by end-users, it typically has to be made available for use in the consumer environment. It is important that the software system behaves exactly the way the developers intended. It turns out that, for many reasons, this process has become very complicated nowadays, and that it is also very difficult to give any guarantees that a software system operates as intended. In some cases, systems may not work at all.

Relationship to software engineering


So what does software deployment have to do with software engineering? According to [2], software engineering can be defined as:

Software Engineering (SE) is the application of a systematic, disciplined, quantifiable approach to the development, operation, and maintenance of software, and the study of these approaches; that is, the application of engineering to software.

Within the software engineering research community, we investigate techniques to improve and/or study software development processes. The deployment step is typically the last phase of a software development project, performed when the development of the software system is completed and it is ready to be made available to end-users.

In traditional waterfall-style software development projects, the deployment phase is not performed very frequently. Nowadays, however, most software development projects are iterative: the features of the software are extended and improved in each cycle, so the system has to be redeployed for every cycle. Especially in Agile software projects, which have short iterations (of about 2 weeks), it is crucial to be able to deploy a system easily.

Because of the way we develop software nowadays, the deployment process has become much more of a burden and that's why it is also important to have systematic, disciplined, quantifiable approaches for software deployment.

Apart from delivering systems to end-users, we also need to deploy a system in order to test it. In order to run a test suite, all the necessary environmental dependencies must be present and correct. Without a reliable and reproducible deployment process, this becomes a burden, and it is difficult to guarantee that tests succeed in all circumstances.

Why is software deployment complicated?



Back in the old days, software was developed for a specific machine (or hardware architecture), stored on a disk or tape, and delivered to the customer. The customer then loaded the program from the tape or disk into memory and was able to run it. Apart from the operating system, all the required parts of the program were stored on the disk. Basically, my good old Commodore 64/128 worked like this. All software was made available on either cassette tapes or 5.25 inch floppy disks. Apart from the operating system and the BASIC interpreter (which were stored in the ROM of the Commodore), everything that was required to run a program was available on the disk.

Some time later, component based software engineering (CBSE) was introduced and received wide acceptance. The advantages of CBSE were that software components could be obtained from third parties without having to develop them yourself, and that components with the same or similar functionality could be shared and reused across multiple programs. CBSE greatly improved the quality of software and the productivity of developers. As a consequence, software products were no longer delivered as self-contained products, but became dependent on the components already residing on the target systems.


Although CBSE provides a number of advantages, it also introduced additional complexity and challenges. In order to run a software program, all its dependencies must be present and correct, and the program must be able to find them. There are all kinds of things that can go wrong while deploying a system: a dependency may be missing, or a program may require a newer version of a specific component. Newer components may also be incompatible with a program (sometimes this is intentional, but it may also happen accidentally, due to a bug on which a program relies).


For example, in Microsoft Windows (but also on other platforms) this led to a phenomenon called the DLL hell. Apart from Windows DLLs, this phenomenon occurs in many other contexts as well, such as the JAR hell for Java programs. Even the good old AmigaOS suffered from the same weakness, although the problems were not as severe as they are now, because the versions of libraries didn't change that frequently.

In UNIX-like systems, such as Linux, you will notice that the degree of sharing of components through libraries is pushed almost to the maximum. For these kinds of systems, it is crucial to have deployment tooling to properly manage the packages installed on a system. In Linux distributions, the package manager is a key aspect and also a distinct feature that sets one distribution apart from another. There are many package managers around, such as RPM, dpkg, portage, pacman, and Nix (which we use in our research as a basis for NixOS).


Apart from the challenges of deploying a system from scratch, many systems are also upgraded, because (in most cases) it's too costly and time-consuming to deploy them over and over again for each change. In most cases, upgrading is a risky process, because files get modified and overwritten. An interruption or crash during an upgrade may have disastrous results. Also, an upgrade may not always give the same results as a fresh installation of a system.

Importance of software deployment


So why is research in software deployment important?


  • First of all (and not surprisingly), software systems are becoming bigger and increasingly more complex. Nowadays, some software systems are not only composed of many components, but these components are also distributed and deployed on various machines in a network, working together to achieve a common goal. For example, service-oriented systems are composed this way. Deploying these kinds of systems manually is a very time-consuming, complex, error-prone and tedious process. The bigger the system gets, the more likely it becomes that an error occurs.
  • We have to be more flexible in reacting to events. For example, in a cloud infrastructure, if a machine breaks, we must be able to redeploy the system in such a way that services are still available to end-users, limiting the impact as much as possible.
  • We want to push changes to a system in a production environment faster. Because systems become increasingly more complex, an automated deployment solution is essential. In Agile software development projects, a team wants to generate value as quickly as possible, for which it is essential to have a working system in a production environment as soon as possible. To achieve this goal, it is crucial that the deployment process can be performed without much trouble. A colleague of mine (Rini van Solingen), who is also a Scrum consultant, has covered this importance in a video blog interview.

Research


What are software deployment research subjects?

  • Mechanics. This field concerns the execution of the deployment activities. How can we make these steps reproducible, reliable, efficient? Most of the research that I do covers deployment mechanics.
  • Deployment planning. Where to place a component in a network of machines? How to compose components together?
  • Empirical research covering various aspects of deployment activities, such as: How to quantify build maintenance effort? How much maintenance is needed to keep deployment specifications (such as build specifications) up to date?

Where are software deployment papers published? Currently, there is no subfield conference about software deployment. In the past (a few years before I started my research), there were three editions of the Working Conference on Component Deployment, but it has not been held since 2005.

Most of the deployment papers are published at various conferences, such as the top general ones and subfield conferences about software maintenance, testing or cloud computing. The challenging part of this is that (depending on the subject) I have to adapt my story to the conference where I want my paper to be published. This requires me to explain the same problems over and over again and to integrate them with the given problem domain, such as cloud computing or testing. This is not always trivial to do, nor will every reviewer understand what the point is.

Conclusion


In this blog post, I have explained what software deployment is about and why research in this field is important. Systems are becoming much bigger and more complicated, and we want to respond to changes faster. In order to manage this complexity, we need research into automated deployment solutions.

References

[1] A. Carzaniga, A. Fuggetta, R.S. Hall, D. Heimbigner, A. van der Hoek and A.L. Wolf. A Characterization Framework for Software Deployment Technologies. Technical Report CU-CS-857-98, Dept. of Computer Science, University of Colorado, 1998.
[2] IEEE Standard Glossary of Software Engineering Terminology. IEEE Std 610.12-1990. IEEE, 1990.


Wednesday, September 14, 2011

Deploying .NET applications with the Nix package manager (part 2)

In my previous blog post, I have explained how the Nix package manager can be used to deploy .NET applications. One of the open issues was that run-time dependencies can't be resolved in a convenient way. I explained three possible solutions, each having its pros and cons, none of which was ideal.

After writing that blog post, I received a number of suggestions and reactions from people on the #nixos freenode channel. Moreover, during the dinner of the SEAMS symposium in Hawaii, I heard similar suggestions. It seems that blogging about certain issues pays off after all!

Over the last two days, I have looked into these suggestions and done some experiments at Philips. I'm happy to report that I now have a follow-up story, in which I present a new solution for resolving the run-time dependencies of .NET executables. In my opinion, this solution is also the best option.

Implementing a wrapper for .NET applications


Apparently .NET has a reflection API. With this reflection API you can dynamically load classes and dynamically invoke methods. You can also load assemblies dynamically from any location whether they have a strong name or not.

The .NET runtime also seems to fire an AssemblyResolve event in case a library assembly can't be found. Apparently, you can create your own event handler that deals with such an event and loads the missing assembly through the reflection API.

So by taking these features into account, it is possible to create a wrapper executable capable of resolving the run-time dependencies that we need. This is what the wrapper I have developed for Nix looks like (I actually had to write some C# code for this):

using System;
using System.Reflection;
using System.IO;

namespace HelloWorldWrapper
{
    class HelloWorldWrapper
    {
        private String[] AssemblySearchPaths = {
          @"C:\cygwin\nix\store\23ga...-ALibrary",
          @"C:\cygwin\nix\store\833p...-BLibrary"
        };

        private String ExePath =
          @"C:\cygwin\nix\store\27f2...-Executable\Executable.exe";

        private String MainClassName =
          "SomeExecutable.Executable";

        public HelloWorldWrapper(string[] args)
        {
            // Attach the resolve event handler to the AppDomain
            // so that missing library assemblies will be searched
            AppDomain currentDomain = AppDomain.CurrentDomain;
            currentDomain.AssemblyResolve +=
              new ResolveEventHandler(MyResolveEventHandler);

            // Dynamically load the executable assembly
            Assembly exeAssembly = Assembly.LoadFrom(ExePath);

            // Lookup the main class
            Type mainClass = exeAssembly.GetType(MainClassName);

            // Lookup the main method
            MethodInfo mainMethod = mainClass.GetMethod("Main");

            // Invoke the main method
            mainMethod.Invoke(this, new Object[] {args});
        }

        static void Main(string[] args)
        {
            new HelloWorldWrapper(args);
        }

        private Assembly MyResolveEventHandler(object sender,
          ResolveEventArgs args)
        {
            // This handler is called only when the common language
            // runtime tries to bind to the assembly and fails.

            Assembly MyAssembly;
            String assemblyPath = "";
            String requestedAssemblyName =
              args.Name.Substring(0, args.Name.IndexOf(","));

            // Search for the right path of the library assembly
            foreach (String curAssemblyPath in AssemblySearchPaths)
            {
                assemblyPath = curAssemblyPath + "/" +
                  requestedAssemblyName + ".dll";

                if (File.Exists(assemblyPath))
                    break;
            }

            // Load the assembly from the specified path. 
            MyAssembly = Assembly.LoadFrom(assemblyPath);

            // Return the loaded assembly.
            return MyAssembly;
        }

    }
}

The wrapper class defined above has a number of fields. The AssemblySearchPaths field defines a String array containing all the Nix store paths of the run-time dependencies. The ExePath field defines a String referring to the path of the executable in the Nix store which we want to run. The MainClassName field defines the fully qualified name of the class containing the Main method we want to run.

In the Main method of this class, we create an instance of the wrapper. In the constructor, we attach our own custom resolve event handler to the current application domain. Then we use the reflection API to dynamically load the actual executable and to invoke the Main method of the specified main class.

When we load the executable assembly, the resolve event handler is triggered a number of times. Our custom MyResolveEventHandler tries to load the given assembly from each of the search paths defined in the AssemblySearchPaths string array, which should succeed if all run-time dependencies are present.

There is a small caveat, however, with dynamically invoking the Main method of another executable. By default, the Main method in C# programs is defined like this:

namespace SomeNamespace
{
    class SomeClass
    {
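       // No access modifier: the method is not public, so the
       // wrapper cannot find it with a plain GetMethod() call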
       static void Main(string[] args)
       {
       }
    }
}

Apparently, it has no access modifier, which means the method is not public (members of a C# class are private by default). This means that we cannot invoke an external Main method from a different assembly, such as the wrapper, because a plain GetMethod call only finds public methods. To counter this, we need to make the Main method of the actual executable public, i.e. declare it as: public static void Main(string[] args). (We could also look the method up with reflection flags for non-public members, but a small modification is needed anyway.)

Usage


I have implemented a convenience function in Nixpkgs: dotnetenv.buildWrapper, which automatically builds a .NET executable and generates a wrapper for it. The function can be invoked like this:

{dotnetenv, MyAssembly1, MyAssembly2}:

dotnetenv.buildWrapper {
  name = "My.Test.Assembly";
  src = /path/to/source/code;
  slnFile = "Assembly.sln";
  assemblyInputs = [
    dotnetenv.assembly20Path
    MyAssembly1
    MyAssembly2
  ];
  namespace = "TestNamespace";
  mainClassName = "MyTestApplication";
  mainClassFile = "MyTestApplication.cs";
}

As you may see, the structure of dotnetenv.buildWrapper is similar to the dotnetenv.buildSolution function, except that it requires several additional parameters for the wrapper, such as the namespace, the class name and the file location of the class containing the Main method of the actual executable. The function automatically makes the given Main method in the given main class file public, and it creates a wrapper class with the right properties to run the actual executable, such as the location of the actual executable and the paths of the run-time dependencies.

By using this wrapper function, it is possible to run a .NET executable assembly from the Nix store without much trouble.
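
For example, assuming that the expression above is composed in a pkgs.nix file like any other Nix package (the MyTestAssembly attribute name is hypothetical), the wrapped executable can be built with a command such as:

$ nix-env -f pkgs.nix -iA MyTestAssembly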

Conclusion


In this blog post, I have implemented a wrapper executable that deals with resolving the run-time dependencies of a .NET application. The wrapper uses a resolve event handler, which loads all the required library assemblies through the .NET reflection API. The wrapper can be automatically generated by a convenience function, which makes it possible to run .NET applications from the Nix store without much trouble.

Tuesday, September 6, 2011

Deploying .NET applications with the Nix package manager

This probably sounds like a very strange topic to some (or perhaps most) readers, but I have done some experiments in the past with deploying .NET applications by using the Nix package manager. The Nix package manager is mostly used on Unix-like systems (Linux, FreeBSD, etc.) and designed with Unix principles in mind. Furthermore, a lot of people know me as a Microsoft critic. So you probably wonder why I would want to do this.

Motivation


Being able to use Nix for deploying .NET applications has the following benefits:

  • For installing or upgrading .NET applications, you have the same deployment benefits that Nix provides: the ability to store multiple versions/variants next to each other, dependency completeness, atomic upgrades and rollbacks, and a garbage collector which safely removes components that are no longer in use.
  • You can use Hydra, our continuous build and integration server, for building and testing .NET applications in various environments, including their environmental dependencies.
  • You can use Disnix to manage the deployment of a service-oriented application developed using .NET technology in a network of machines. This also works for web applications. For example, you can deploy your ASP.NET / Microsoft SQL Server database environment from a declarative specification. Because Disnix is built on top of Nix, it also provides features such as dependency completeness, (almost) atomic upgrades and a garbage collector in a distributed environment.
  • The Nix deployment technology and related tooling are designed as generic tools (i.e. not developed for a particular component technology). Being able to support .NET applications is a useful addition.
  • And finally, we have an industry partner in our research project, who's interested in this.

Global Assembly Cache (GAC)


When I talk about Nix (and especially about the principle of the Nix store) to .NET people, I often hear that the Global Assembly Cache (GAC) already solves the DLL hell, so there is nothing to worry about. Although the GAC solves several common deployment issues, it has a number of drawbacks compared to the Nix store:

  • It only provides isolation for library assemblies. Other components such as executables, compilers, configuration files, or native libraries are not supported.
  • A library assembly must have a strong name, which gives the library a unique name. A strong name is composed of several attributes, such as a name, a version number and a culture. Furthermore, the library assembly is signed with a public/private key pair.
  • Creating a strong-named assembly is in many cases painful. A developer must take care that the combination of attributes is always unique. For example, for a new release, the version number must be increased. Because developers have to take care of this themselves, people typically don't use strong names for internal release cycles; it's too much work.
  • Creating a strong-named assembly can also go wrong: a developer may forget to update any of these strong-name attributes, which makes it possible to create a different assembly with the same strong name. In that case, the GAC can't provide isolation.

In contrast to the GAC, you can store any type of component in the Nix store, such as executables, configuration files, compilers etc. Furthermore, the Nix store uses hash codes derived from all build-time dependencies of a component, which always provides unique component file names.

Building Visual Studio projects in Nix


So how can Visual Studio projects be supported in Nix to compile .NET applications? We have implemented a Nix function to support this:

{stdenv, dotnetfx}:
{ name, src, slnFile, targets ? "ReBuild"
, options ? "/p:Configuration=Debug;Platform=Win32"
, assemblyInputs ? []
}:
stdenv.mkDerivation {
  inherit name src;
  buildInputs = [ dotnetfx ];
  installPhase = ''
    # Compose the AssemblySearchPaths environment variable
    # from the Nix store paths of the library dependencies,
    # so that MSBuild can find them
    for i in ${toString assemblyInputs}; do
      windowsPath=$(cygpath --windows $i)
      AssemblySearchPaths="$AssemblySearchPaths;$windowsPath"
    done
    export AssemblySearchPaths
    ensureDir $out
    # Convert the Nix store output path to a Windows path
    # and build the solution with MSBuild
    outPath=$(cygpath --windows $out)\\
    MSBuild.exe ${slnFile} /nologo /t:${targets} \
      /p:OutputPath=$outPath ${options} ...
  '';
}

The Nix expression code fragment above shows you the definition of the dotnetenv.buildSolution function, which builds Visual Studio projects and stores the output in the Nix store.

The idea behind this function is simple: it takes several parameters, such as the name of the component, a build-time options string, the filename of the Visual Studio solution file (SLN) and a list of library assemblies (assemblyInputs). It uses dotnetfx (the .NET framework) as a build-time dependency, which provides access to the MSBuild executable used to build Visual Studio solution files.

In order to let MSBuild find its library dependencies, we set the AssemblySearchPaths environment variable to contain the paths to the Nix store components containing the library assemblies. After setting the environment variable, the MSBuild command is invoked to build the given solution file and to produce the output in a unique path in the Nix store. The cygpath command is used to convert UNIX path names to Windows path names (and vice versa).
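
For example, with two library assemblies as inputs, AssemblySearchPaths ends up as a semicolon-separated list of Windows paths, e.g. (shortened, hypothetical store paths; the list may also start with a semicolon, since the variable is initially empty):

C:\cygwin\nix\store\23ga...-ALibrary;C:\cygwin\nix\store\833p...-BLibrary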

{dotnetenv, MyAssembly1, MyAssembly2}:

dotnetenv.buildSolution {
  name = "My.Test.Assembly";
  src = /path/to/source/code;
  slnFile = "Assembly.sln";
  assemblyInputs = [
    dotnetenv.assembly20Path
    MyAssembly1
    MyAssembly2
  ];
}

The above Nix expression shows you how this function can be used to build a Visual Studio project. Like ordinary Nix expressions, this expression is also a function taking several input arguments, such as dotnetenv, which provides the Visual Studio build function (shown in the previous code fragment), and the library assemblies which are required to build the project. In the body, we call the buildSolution function with the right parameters, such as the solution file and the library assemblies this project requires. The dotnetenv.assembly20Path refers to the .NET 2.0 system assemblies directory.

rec {
  dotnetfx = ...
  stdenv = ...
  dotnetenv = import ../dotnetenv {
    inherit stdenv dotnetfx;
  };

  MyAssembly1 = import ../MyAssembly1 {
    inherit dotnetenv;
  };

  MyAssembly2 = import ../MyAssembly2 {
    inherit dotnetenv;
  };

  MyTestAssembly = import ../MyTestAssembly {
    inherit dotnetenv MyAssembly1 MyAssembly2;
  };
}

Like ordinary Nix expressions, we also have to compose Visual Studio components by calling the build function in the previous code fragment with the right parameters. This is done in the Nix expression shown above. The last attribute, MyTestAssembly, imports the expression shown in the previous code fragment with the required function arguments. As you may see, all the dependencies of MyTestAssembly are also defined in this file. The assembly can now be built by running the following command-line instruction (pkgs.nix is the filename of the code fragment above):

nix-env -f pkgs.nix -iA MyTestAssembly

The assembly in our example is then built from source, including all its library dependencies, and the output is produced in:

/nix/store/ri0zzm2hmwg01w2wi0g4a3rnp0z24r8p-My.Test.Assembly

Running .NET applications from the Nix store


We have explained how we can build .NET applications and how MSBuild is able to find the required build-time dependencies. Apart from building a .NET application with Nix, we also have to be able to run it from the Nix store. To make this possible, an executable assembly needs to find its run-time dependencies, which turned out to be more complicated than I thought.

The .NET runtime locates assemblies as follows:

  • First, it tries to determine the correct version of the assembly (only for strong named assemblies)
  • If the strong named assembly has been bound before in memory, that version will be used.
  • If the assembly is not already in memory, it checks the Global Assembly Cache (GAC).
  • And otherwise, it probes for the assembly, by looking in a config file or by using probing heuristics.

Because Nix stores all components, including library assemblies, in unique folders in the Nix store, this poses some challenges. If an executable is started from the Nix store, the required libraries can't be found, because the probing heuristics look for libraries in the same base directory as the executable.
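
To illustrate, suppose an executable and one of its library assemblies reside in separate Nix store paths (shortened, hypothetical paths):

C:\cygwin\nix\store\27f2...-Executable\Executable.exe
C:\cygwin\nix\store\23ga...-ALibrary\ALibrary.dll

When Executable.exe is started, the probing heuristics only search C:\cygwin\nix\store\27f2...-Executable, so ALibrary.dll is never found there.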

Currently, I have implemented three methods to resolve runtime dependencies (each approach has its pros and cons and none of them is ideal):

  • Copying DLLs into the same folder as the executable. This is the most expensive and inefficient method, because libraries are not shared on the hard drive. However, it does work with both private and strong named assemblies and it also works on older versions of Windows, such as Windows XP.
  • Creating a config file, which specifies where the libraries can be found. A disadvantage of this approach is that .NET does not allow private assemblies to be looked up in other locations beyond the basedir of the executable. Therefore assemblies need a strong name, which is not very practical because these have to be generated by hand.
  • The third option is creating NTFS symlinks in the same folder as the executable. This also works for private libraries. A disadvantage of this approach is that NTFS symlinks are only supported from Windows Vista onwards. Furthermore, you need special user privileges to create them, and their semantics are not exactly the same as those of UNIX symlinks.

Usage


If you want to experiment with the Visual Studio build functions in Nix, you need to install Nix on Cygwin and you need a checkout of Nixpkgs. Check the Nix documentation for more instructions on this.

The dotnetenv component can be found in the pkgs/buildsupport/ directory of Nixpkgs. You need to install the .NET framework yourself and you have to edit some of the attributes of dotnetenv so that the .NET framework utilities can be found.

Unfortunately, the .NET framework can't be deployed through Nix (yet), because of its dependencies on the Windows registry. I also haven't looked into the scripting possibilities of the installer. So the .NET framework deployment isn't entirely pure. (Perhaps someone is able to make some tweaks to achieve this, but I don't know whether it is legal to do.)

Conclusion


In this blog post, I have described how .NET applications can be deployed with Nix. However, there are some minor issues, such as the fact that runtime dependencies can't be resolved in a convenient way. Furthermore, the deployment isn't entirely pure as the .NET framework must be installed manually. I know that Mono is able to use the MONO_PATH environment variable to look for libraries in arbitrary locations. Unfortunately, it seems that the .NET framework does not have something like this.

I have been told that it's also possible to resolve run time dependencies programmatically. This way you can load any library assembly you want from any location. I'm curious if somebody has more information on this. Any feedback would be welcome, since I'm not a .NET expert.
