Monday, June 17, 2013

Setting up a multi-user Nix installation on non-NixOS systems

I have written quite a few Nix-related articles on this blog. Nix is typically advertised as the core component of the NixOS Linux distribution. However, it can also be used separately on conventional Linux distributions and other UNIX-like systems, such as FreeBSD and Mac OS X.

Using Nix on conventional systems makes it possible to use the interesting features of Nix and its applications, such as Hydra and Disnix, while still being able to use your favorite distribution.

Single-user Nix installations


I have noticed that on non-NixOS systems, the Nix package manager is often installed for one single user only, as performing single-user installations is relatively simple. For example, in my blog post describing how to build apps for iOS with Nix, I perform a Nix installation that can only be used by my personal user account.

For most simple use cases, single-user installations are sufficient. However, they have a number of issues, besides the fact that only one user of a machine (in addition to the super-user) is able to use Nix:

  • Although Nix creates build environments that remove several important side-effects, e.g. by clearing environment variables and storing all packages in isolation in the Nix store, builds can still refer to executables and other files in global directories by having hardcoded references, such as /usr/bin or /var, which may influence the result of the build.

    On NixOS these references are typically not an issue since these directories do not exist, but on conventional distributions these do exist, which may cause (purity) problems.
  • Many packages have hard-coded references to the default Bourne-compatible shell, /bin/sh. Some of these packages assume that this shell is bash and use bash-specific features.

    However, some Linux distributions, such as Debian and Ubuntu, use dash as the default /bin/sh, causing some builds to break (see the example after this list). Moreover, BSDs such as FreeBSD also use a simpler Bourne shell implementation by default.
  • Some build processes may unknowingly try to download stuff from the Internet, causing impurities.
  • Package builds have the same privileges as the calling user, allowing external processes run by the user to interfere with the build process. As a consequence, impurities may sneak in when executing multiple package builds in parallel.
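
For example, the following bash-specific conditional works fine when /bin/sh is bash, but fails when it is dash (the exact error message may differ per dash version):

$ bash -c 'if [[ -n "$HOME" ]]; then echo it works; fi'
it works
$ dash -c 'if [[ -n "$HOME" ]]; then echo it works; fi'
dash: 1: [[: not found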

Multi-user Nix installations


As a remedy for the issues just described, Nix is also capable of executing each build with separate user privileges in a nearly clean chroot environment in which we bind mount the Nix store. However, in order to use these features, a multi-user Nix installation is required, as these operations require super-user privileges. In NixOS, a multi-user Nix installation comes for free.

In multi-user Nix installations, builds are not executed directly by each individual user, since users cannot be trusted. Instead, we run a server, called the nix-daemon, that builds packages on behalf of users. This daemon also takes care of running build processes as unique unprivileged users and setting up chroot environments.

Although the Nix manual provides some pointers to set up a multi-user installation, it turned out to be a bit trickier than I thought. Moreover, I have noticed that a few practical bits were missing in the manual and the Nix distribution.

In this blog post, I have investigated these issues and implemented a few improvements that provide a solution for these missing parts. I have performed these steps on an Ubuntu 12.04 LTS machine.

Installing Nix from source


As a first step, I installed Nix from source by running the following commands:

$ ./configure --prefix=/usr --sysconfdir=/etc
$ make
$ sudo make install
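
The commands above assume that a Nix source tarball has already been downloaded and unpacked, and that the typical build prerequisites (such as a C++ compiler, Perl and the bzip2 development headers) are installed. As an illustration (the version number and URL are just examples; pick a current release):

$ wget http://nixos.org/releases/nix/nix-1.5.3/nix-1.5.3.tar.xz
$ tar xJf nix-1.5.3.tar.xz
$ cd nix-1.5.3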

Adding the Nix build group and users


Since every concurrent build must be run as a separate user, we have to define a common group of which these users should be a member:
$ sudo groupadd -g 20000 nixbld
Then we need to add an unprivileged user account for each build that may get executed simultaneously. The -s /noshell parameter deliberately refers to a non-existent shell, so that these accounts cannot be used for interactive logins:

$ for i in `seq 1 10`
do
    sudo useradd -u `expr 20000 + $i` -g nixbld \
      -c "Nix build user $i" -d /var/empty -s /noshell
done

In the code fragment above, we assume that 10 users are sufficient, but if you want/need to utilise more processors/cores this number needs to be raised.
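
We can quickly verify that the group and the build users have been created as intended (the exact output format may differ):

$ id nixbld1
uid=20001(nixbld1) gid=20000(nixbld) groups=20000(nixbld)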

Finally, we have to specify the build users group in the Nix configuration. Note that we use sudo tee here, since the output redirection of a plain sudo echo would be performed by the unprivileged calling shell:

$ echo "build-users-group = nixbld" | sudo tee -a /etc/nix/nix.conf

Changing permissions of the Nix store


In ordinary installations, the user is made owner of /nix/store. In multi-user installations, it must be owned by root and group-owned by the nixbld group. The following shell commands grant the Nix store the right permissions (the sticky bit in mode 1775 prevents build users from deleting or renaming each other's files):

$ sudo chgrp nixbld /nix/store
$ sudo chmod 1775 /nix/store

Creating per-user profile and garbage collection root folders


In single-user installations, we only have one system-wide default profile (/nix/var/nix/profiles/default) owned by the user. In multi-user installations, each user should be capable of creating their own profiles and garbage collector roots. The following shell commands make this possible:

$ sudo mkdir -p -m 1777 /nix/var/nix/profiles/per-user
$ sudo mkdir -p -m 1777 /nix/var/nix/gcroots/per-user

Setting up the Nix daemon


We must also run the nix-daemon that executes builds on behalf of a user. To be able to start and stop it, I have created an init.d script for the Nix daemon. The interesting part of this script is the start operation:

DAEMON=/usr/bin/nix-daemon
NAME=nix-daemon

if test -f /etc/default/nix-daemon; then
    . /etc/default/nix-daemon
fi

...

case "$1" in

start)
    if test "$NIX_DISTRIBUTED_BUILDS" = "1"; then
        NIX_BUILD_HOOK=$(dirname $DAEMON)/../libexec/nix/build-remote.pl
                
        if test "$NIX_REMOTE_SYSTEMS" = "" ; then
            NIX_REMOTE_SYSTEMS=/etc/nix/remote-systems.conf
        fi
                
        # Set the current load facilities
        NIX_CURRENT_LOAD=/var/run/nix/current-load
                
        if test ! -d $NIX_CURRENT_LOAD; then
            mkdir -p $NIX_CURRENT_LOAD
        fi
    fi
                
    start-stop-daemon -b --start --quiet \
        --exec /usr/bin/env \
        NIX_REMOTE_SYSTEMS=$NIX_REMOTE_SYSTEMS \
        NIX_BUILD_HOOK=$NIX_BUILD_HOOK \
        NIX_CURRENT_LOAD=$NIX_CURRENT_LOAD \
        $DAEMON -- $DAEMON_OPTS
    echo "$NAME."
    ;;

...

esac

For the start operation, we have to spawn the nix-daemon in background mode. To allow Nix to perform distributed builds, we must also set a number of environment variables that provide the locations of the build hook script, the configuration file containing the properties of the external machines, and a directory containing files keeping track of the load of the machines. Finally, some directories may have to be created if they don't exist.

To ensure that it's automatically launched on startup, we must add the following symlinks for the relevant runlevels:

$ sudo -i
# cd /etc/rc2.d
# ln -s ../init.d/nix-daemon S60nix-daemon
# cd ../rc3.d
# ln -s ../init.d/nix-daemon S60nix-daemon
# cd ../rc4.d
# ln -s ../init.d/nix-daemon S60nix-daemon
# cd ../rc5.d
# ln -s ../init.d/nix-daemon S60nix-daemon
# exit
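
Alternatively, on Debian-based distributions such as Ubuntu, the update-rc.d tool can create these symlinks for us (using its default sequence numbers instead of S60):

$ sudo update-rc.d nix-daemon defaults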

I think the above init.d script can be trivially ported to other distributions.

Setting up user profiles


To allow users to install software through Nix and to refer to their installed programs from a simple command-line invocation, we need to add some settings to the user's shell profile, such as setting the PATH environment variable to point to certain Nix profiles.

However, the nix.sh profile.d script in the Nix distribution only performs the necessary steps for single-user installations. For example, it only adds the system-wide Nix profile to PATH and assumes that the user has all the rights to configure a channel.

I have ported the relevant features from NixOS to create /etc/profile.d/nix-multiuser.sh, which performs all the steps required to set up a shell profile for multi-user installations.

First, we have to set up a user's profile directory in the per-user profile directory, if it doesn't exist:

export NIX_USER_PROFILE_DIR=/nix/var/nix/profiles/per-user/$USER

mkdir -m 0755 -p $NIX_USER_PROFILE_DIR
if test "$(stat --printf '%u' $NIX_USER_PROFILE_DIR)" != "$(id -u)"; then
    echo "WARNING: bad ownership on $NIX_USER_PROFILE_DIR" >&2
fi

In single-user installations, a ~/.nix-profile symlink is created pointing to the system-wide default Nix profile. In multi-user installations, we must create a ~/.nix-profile symlink pointing to the per-user profile. For the root user, we can still use the system-wide Nix profile, which provides software to all users of the system:

if ! test -L $HOME/.nix-profile; then
    echo "creating $HOME/.nix-profile" >&2
    if test "$USER" != root; then
        ln -s $NIX_USER_PROFILE_DIR/profile $HOME/.nix-profile
    else
        # Root installs in the system-wide profile by default.
        ln -s /nix/var/nix/profiles/default $HOME/.nix-profile
    fi
fi

In single user installations, we add the bin directory of the system-wide Nix profile to PATH. In multi-user installations, we have to do this both for the system-wide and the user profile:

export NIX_PROFILES="/nix/var/nix/profiles/default $HOME/.nix-profile"

for i in $NIX_PROFILES; do
    export PATH=$i/bin:$PATH
done

In single-user installations, the user can subscribe to the Nixpkgs unstable channel providing pre-built substitutes for packages. In multi-user installations, only the super-user can do this (as ordinary users cannot be trusted). Although only root can subscribe to a channel, ordinary users can still install from the subscribed channels:

if [ "$USER" = root -a ! -e $HOME/.nix-channels ]; then
    echo "http://nixos.org/channels/nixpkgs-unstable nixpkgs" \
      > $HOME/.nix-channels
fi

We also have to create a garbage collector root folder for the user, if it does not exist:

NIX_USER_GCROOTS_DIR=/nix/var/nix/gcroots/per-user/$USER
mkdir -m 0755 -p $NIX_USER_GCROOTS_DIR
if test "$(stat --printf '%u' $NIX_USER_GCROOTS_DIR)" != "$(id -u)"; then
    echo "WARNING: bad ownership on $NIX_USER_GCROOTS_DIR" >&2
fi

We must also set the default Nix expression, so that we can conveniently install packages from Nix channels:

if [ ! -e $HOME/.nix-defexpr -o -L $HOME/.nix-defexpr ]; then
    echo "creating $HOME/.nix-defexpr" >&2
    rm -f $HOME/.nix-defexpr
    mkdir $HOME/.nix-defexpr
    if [ "$USER" != root ]; then
        ln -s /nix/var/nix/profiles/per-user/root/channels \
          $HOME/.nix-defexpr/channels_root
    fi
fi

Unprivileged users do not have the rights to build packages directly, since they cannot be trusted. Instead, the daemon must do that on their behalf. The following shell code fragment ensures that:

if test "$USER" != root; then
    export NIX_REMOTE=daemon
else
    export NIX_REMOTE=
fi
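
To activate these settings, the script has to be evaluated by each user's shell. Scripts residing in /etc/profile.d are typically sourced automatically on login; the script can also be sourced manually:

$ . /etc/profile.d/nix-multiuser.sh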

Using multi-user Nix


After having performed the previous steps, we can start the Nix daemon by running the init.d script as root user:

$ sudo /etc/init.d/nix-daemon start
Then, if we log in as root, we can update the Nix channels and install packages that are supposed to be available system-wide:

$ nix-channel --update
$ nix-env -i hello
$ hello
Hello, world!

We should also be able to log in as an unprivileged user and be capable of installing software:

$ nix-env -i wget
$ wget # Only available to the user that installed it
$ hello # Also works because it's in the system-wide profile

Enabling parallel builds


With a multi-user installation, we should also be able to safely run multiple builds concurrently. The following change can be made to allow 4 builds to be run in parallel:

$ sudo echo "build-max-jobs = 4" >> /etc/nix/nix.conf

Enabling distributed builds


To enable distributed builds (for example to delegate a build to a system with a different architecture) we can run the following:

$ sudo echo "NIX_DISTRIBUTED_BUILDS=1" > /etc/defaults/nix-daemon

$ sudo tee /etc/nix/remote-systems.conf << "EOF"
sander@macosx.local x86_64-darwin /root/.ssh/id_buildfarm 2
EOF

The above allows us to delegate builds for Mac OS X to a Mac OS X machine. Each line in remote-systems.conf specifies the SSH user and host name of a build machine, its Nix system type, the SSH private key used to connect to it, and the maximum number of concurrent jobs it accepts.

Enabling chroot builds


On Linux, we can also enable chroot builds allowing us to remove many undesired side-effects that single-user Nix installations have. Chroot environments require some directories of the host system to be bind mounted, such as /dev, /dev/pts, /proc and /sys.

Moreover, we need a default Bourne shell in /bin/sh that must be bash, as other more primitive Bourne-compatible shells may give us trouble. Unfortunately, we cannot bind mount the host system's /bin folder, as it's filled with all kinds of executables that cause impurities. Furthermore, these executables have requirements on shared libraries residing in /lib, which we do not want to expose in the chroot environment.

I know two ways to have bash as /bin/sh in our chroot environment:

  • We can install bash through Nix and expose that in the chroot environment:

    $ nix-env -i bash
    $ sudo mkdir -p /nix-bin
    $ sudo ln -s $(readlink -f $(which bash)) /nix-bin/sh
    
    This approach works because bash's dependencies all reside in the Nix store, which is available in the chroot environment.

  • We could also create a static bash that does not depend on anything. The following commands can be run from an unpacked bash source tree to compile a static bash manually:

    $ ./configure --prefix=$HOME/bash-static --enable-static-link \
        --without-bash-malloc --disable-nls
    $ make
    $ make install
    $ sudo mkdir -p /nix-bin
    $ sudo cp bash /nix-bin/sh
    

    Or by using the following Nix expression:

    with import <nixpkgs> {};
    
    stdenv.mkDerivation {
      name = "bash-static-4.2";
      src = fetchurl {
        url = mirror://gnu/bash/bash-4.2.tar.gz;
        sha256 = "1n5kbblp5ykbz5q8aq88lsif2z0gnvddg9babk33024wxiwi2ym2";
      };
      patches = [ ./bash-4.2-fixes-11.patch ];
      buildInputs = [ bison ];
      configureFlags = [
        "--enable-static-link"
        "--without-bash-malloc"
        "--disable-nls"
      ];
    }
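
    Assuming the above expression is stored in a file called bash-static.nix (an illustrative name) and the referenced patch file is present, we can build it with nix-build and copy the resulting binary into place:

    $ nix-build bash-static.nix
    $ sudo mkdir -p /nix-bin
    $ sudo cp result/bin/bash /nix-bin/sh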
    

Finally, we have to add a few properties to Nix's configuration to enable chroot builds:

$ sudo echo "build-use-chroot = true" >> /etc/nix/nix.conf
$ sudo echo "build-chroot-dirs = /dev /dev/pts /bin=/nix-bin $(nix-store -qR /nix-bin/sh | tr '\n' ' ')" \
    >> /etc/nix/nix.conf
The last line exposes /dev, /dev/pts and /nix-bin (mounted on /bin, containing only sh) of the host system to the chroot environment, allowing us to build purely. If we have installed bash through Nix, then we also have to add the Nix closure of bash, which we obtain through the nix-store -qR invocation. For a static bash, this invocation has to be omitted, as a static bash has no closure in the Nix store.
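
To convince ourselves that the chroot is effective, we can build a throw-away derivation that checks whether the host system's /usr directory is visible inside the build environment. The following is merely a sketch; runCommand is a trivial builder provided by Nixpkgs:

$ nix-build -E 'with import <nixpkgs> {};
    runCommand "chroot-test" {}
      "if [ -d /usr ]; then echo visible > $out; else echo not visible > $out; fi"'
$ cat result

If the chroot is set up properly, the result should read: not visible.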

Conclusion


In this blog post, I have described everything I did to set up a multi-user Nix installation supporting distributed and chroot builds on a conventional Linux distribution (Ubuntu), which was a bit tricky.

I'm planning to push some of the things I did upstream, so that others can benefit from them. This is a good thing, because I have the feeling that most non-NixOS Nix users will lose their interest if they have to figure out the same stuff I just did.

Tuesday, June 11, 2013

Securing Hydra with a reverse proxy (Setting up a Hydra cluster part 3)

In two earlier blog posts, I have given a general description about Hydra and described how Hydra can be set up.

Another important aspect to know is that most of the facilities of Hydra are publicly accessible to anyone, except for the administration tasks, such as maintaining projects and jobsets, which require a user account.

A way to prevent Hydra from having public user access is to secure the reverse proxy that is in front of it. In this blog post, I will describe how to set up an HTTPS virtual domain with password authentication for this purpose. Moreover, the approach is not restricted to Hydra -- it can be used to secure any web application by means of a reverse proxy.

Generating a self-signed SSL certificate


To be able to set up HTTPS connections we need an SSL certificate. The easiest way to get one is to create a self-signed SSL certificate. The following command-line instruction suffices for me to generate a private key:
$ openssl genrsa -des3 -out hydra.example.com.orig.key 1024
The above command asks you to provide a passphrase. Then we need to generate a certificate signing request (CSR) file to sign the certificate with our own identity:
$ openssl req -new -key hydra.example.com.orig.key \
    -out hydra.example.com.csr
The above command asks you some details about your identity, such as the company name, state or province and country. After the CSR has been created, we can generate a certificate by running:
$ openssl x509 -req -days 365 -in hydra.example.com.csr \
    -signkey hydra.example.com.orig.key -out hydra.example.com.crt
To prevent the web server from asking me for the passphrase on every start, I ran the following to strip the passphrase from the private key (note that the passphrase protects the key, not the certificate):
$ openssl rsa -in hydra.example.com.orig.key \
    -out hydra.example.com.key
Now we have successfully generated a self-signed certificate and a passphrase-free private key, allowing us to encrypt the remote connection to our Hydra instance.
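
We can inspect the resulting certificate, for example to verify its subject and validity period:

$ openssl x509 -in hydra.example.com.crt -noout -subject -dates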

Obviously, if you really care about security, it's better to obtain an SSL certificate signed by a trusted certificate authority, since that provides you (some sort of) guarantee that a remote host can be trusted, whereas our self-signed SSL certificate forces users to create a security exception in their browsers.

Creating user accounts


A simple way to secure an Apache HTTP server with user authentication is by using the htpasswd facility. I did the following to create a user account for myself:

$ htpasswd -c /etc/nixos/htpasswd sander
The above command creates a user account named: sander, asks me for a password and stores the resulting htpasswd file in /etc/nixos. More user accounts can be added by repeating the command without the -c parameter, since -c (re)creates the htpasswd file from scratch.
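
For example, the following adds a second, hypothetical user account named alice to the existing file:

$ htpasswd /etc/nixos/htpasswd alice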

Adapting the NixOS system configuration


As a final step, the reverse proxy must be reconfigured to accept HTTPS connections. The following fragment shows how the Apache HTTPD NixOS service can be configured to act as a reverse proxy having an unsecured HTTP virtual domain, and a secured HTTPS virtual domain requiring users to authenticate with a username and password:

services.httpd = {
  enable = true;
  adminAddr = "sander@example.com";
  hostName = "hydra.example.com";
  sslServerCert = "/etc/nixos/hydra.example.com.crt";
  sslServerKey = "/etc/nixos/hydra.example.com.key";
      
  virtualHosts = [
    { hostName = "hydra.example.com";
      extraConfig = ''
        <Proxy *>
          Order deny,allow
          Allow from all
        </Proxy>
            
        ProxyRequests     Off
        ProxyPreserveHost On
        ProxyPass         /    http://localhost:3000/ retry=5 disablereuse=on
        ProxyPassReverse  /    http://localhost:3000/
      '';
    }
        
    { hostName = "hydra.example.com";
      extraConfig = ''
        <Proxy *>
          Order deny,allow
          Allow from all
          AuthType basic
          AuthName "Hydra"
          AuthBasicProvider file
          AuthUserFile /etc/nixos/htpasswd
          Require user sander
        </Proxy>
            
        ProxyRequests     Off
        ProxyPreserveHost On
        ProxyPass         /    http://localhost:3000/ retry=5 disablereuse=on
        ProxyPassReverse  /    http://localhost:3000/
        RequestHeader set X-Forwarded-Proto https
        RequestHeader set X-Forwarded-Port 443
      '';
      enableSSL = true;
    }
  ];
};

There is one important aspect that I had to take into account in the configuration of the HTTPS virtual host. Without providing the RequestHeader properties, links in Hydra are not properly generated, as Catalyst (the MVC framework used by Hydra) does not know that links should use the https:// scheme instead of http://.

Restricting the unsecured HTTP virtual domain


It may also be desirable to prevent others from having access to the insecure HTTP virtual host. In a NixOS system configuration, you can set the firewall configuration to only accept HTTPS connections by adding the following lines:

networking.firewall.enable = true;
networking.firewall.allowedTCPPorts = [ 443 ];

Conclusion


The tricks described in this blog post allowed me to secure Hydra with an HTTPS connection and password authentication, so that I can open a web browser and use: https://hydra.example.com to access the secured virtual host.

The major disadvantage of this approach is that you cannot use Nix's channel mechanism to automatically obtain substitutes from Hydra. Perhaps, in the future, Nix can be adapted to also support connections requiring user authentication.

Sunday, June 9, 2013

Dr. Sander



As a follow-up to the previous two blog posts, I can tell you that I defended my PhD thesis last Monday, and it went successfully. Now I can officially call myself a Doctor.

It was quite a tense, busy and interesting day. I'm not really used to days in which you are the centre of attention most of the time. In the morning, our foreign guests, Sam Malek and Roberto di Cosmo (who are members of my committee), arrived to tell a bit about their ongoing research. We had some sort of small software deployment workshop with quite a few interesting ideas and discussions. Having them visit us gave me renewed excitement about ongoing research and some interesting future ideas, since software deployment (as I have explained) is typically a misunderstood and underappreciated research subject.

After our small workshop, I had to quickly pick up our suits and then return to the university. In the early afternoon, I gave a layman's talk to my family and friends in which I tried to explain the background and the goal of my PhD thesis. Then the actual defence started, lasting exactly one hour (no more, no less), in which the committee members asked me questions about my thesis and the accompanying propositions.

At Delft University of Technology, as at any other Dutch university, there is a lot of ceremonial stuff involved in a PhD defence. Professors have to be dressed up in gowns. My paranimphs (the two people sitting in front of me who assist me and take over my defence if I pass out) and I had to be dressed up in suits. I had to address the committee members formally depending on their roles, for example as "hooggeleerde opponent", and the committee members had to formally address me as "waarde promovendus".

I received some interesting questions during my defence round. To get an idea of what these questions were: Vadim Zaytsev has made 140-character transcriptions of each question and answer on Twitter.

Although I was experiencing some nervousness at the beginning of the day, as I didn't know what exactly was going to happen and what kind of questions I would receive, in the end it was fun and I liked it very much. After the defence I received my PhD diploma:


For people that want to know what my thesis is exactly about:

Saturday, June 1, 2013

My PhD thesis propositions and some discussion

Apart from their contents, PhD theses at our university are usually accompanied by a list of propositions. According to our university's Doctorate regulations, they must be defendable and opposable; at least six of the propositions should not be directly related to the research subject, and two of them may be slightly playful. Besides the contents of my thesis, committee members are also allowed to ask questions about the propositions.

A colleague of mine, Felienne Hermans, has covered her propositions on her blog to elaborate on them. I have decided to do the same thing, although I'm not planning to create separate blog posts for each individual proposition. Instead, I cover all of them in a single blog post.

Propositions


  1. Many of the non-functional requirements of a deployed service-oriented system can be realized by selecting an appropriate software deployment system.

    As I have explained earlier in a blog post about software deployment complexity, systems are rarely self-contained but composed of components. An important property of a component is that it can be deployed independently, significantly complicating a software deployment process.

    Components of service-oriented systems are called "services". It's a bit of an open debate what exactly they are, since people from industry often think in terms of web services (things that use SOAP, WSDL, UDDI), while I have also seen the description "autonomous platform independent entities that can be loosely coupled" in the academic literature.

    Although web services are some sort of platform independent entities, they still have implementations behind their interfaces depending on certain technology and can be deployed to various machines in a network. We have seen that deployment on a single machine is hard and that deploying components into networks of machines is even more complicated, time consuming and error prone.

    Besides deployment activities, there are many important non-functional requirements a system has to meet. Many of them can be achieved by designing an architecture, e.g. components, connectors and the way they interact through architectural patterns/styles. Architectural patterns (e.g. layers, pipes and filters, blackboard etc.) implement certain quality attributes.

    For service-oriented systems, it's required to deploy components properly into a network of machines to be able to compose systems. In other words: we have to design and implement a proper deployment architecture. This has several technical challenges, such as the fact that we have to deploy components in such a way that they exactly match the deployment architecture, the deployment activities themselves, and the fact that we have to determine whether a system is capable of running a certain component (e.g. a service using Windows technology cannot run on a Linux machine or vice versa).

    In addition to technical constraints, there are also many non-functional issues related to deployment that require attention, i.e. where to place components and how to combine them to achieve certain non-functional requirements? For example, privacy could be achieved by placing services providing access to privacy-sensitive data in a restricted zone and robustness by deploying multiple redundant instances of the same service.

    It can also be hard to manually find a deployment architecture that satisfies all non-functional requirements. In such cases, deployment planning algorithms are very helpful. In some cases it's even too hard or impossible to find an optimal solution.

    Because of all these cases, an automated deployment solution taking all relevant issues into account is very helpful in achieving many non-functional requirements of a deployed service-oriented system. This is what I have been trying to do in my PhD thesis.

  2. The intention of the Filesystem Hierarchy Standard (FHS) is to provide portability among Linux systems. However, due to ambiguity, over-specification, and legacy support, this standard limits adoption and innovation. This phenomenon applies to several other software standards as well.

    There are many standards in the software engineering domain. In fact, it's dominated by them. One standard that is particularly important in my research is the Filesystem Hierarchy Standard (FHS) defining the overall filesystem structure of Linux systems. I have written a blog post on the FHS some time ago.

    In short: the FHS defines the purposes of directories, it makes a distinction between static and variable parts of a system, and it defines hierarchies (e.g. / is for boot/recovery, /usr is for user software, and what /usr/local is for nobody really knows (ambiguity)). Moreover, it also defines the contents of certain directories, e.g. /bin should contain /bin/sh (over-specification).

    I have problems with the latter two aspects -- the hierarchies do not provide support for isolation, allowing side-effects to easily manifest themselves while deploying, and enabling destructive upgrades. I also have a minor problem with strict requirements on the contents of directories, as they easily allow builds to trigger side-effects while assuming that certain tools are always present.

    For all these reasons, we deviate on some aspects of the FHS in NixOS. Some people consider this unacceptable, and therefore they will not be able to incorporate most of our techniques to improve the quality of deployment processes.

    Moreover, as the FHS itself has issues, we observe that although the filesystem structure is standardized, file system layouts in many Linux distributions are still slightly different and portability issues still arise.

    In other domains, I have also observed various issues with standards:

    • Operating systems: nearly every operating system is more or less forced to implement POSIX and/or the Single UNIX Specification, taking a lot of effort. Furthermore, by implementing these standards, UNIX is basically reimplemented. These standards have many strict requirements on how certain library calls should be implemented, although they also specify undefined behaviour at the same time. Apart from the fact that it's difficult and time consuming to implement these standards, there is little room to implement an operating system that is conceptually different from UNIX, as that conflicts with portability.
    • The Web (e.g. HTML, CSS, DOM etc.): First, a draft is written in a natural language (which is inherently ambiguous) and sometimes underspecified. Then vendors start implementing these drafts. As initially these standards are ambiguous and underspecified, implementations behave very differently. Slowly these implementations converge into something that is uniform by collaborating with other vendors and the W3C to improve the draft version of a standard. Some vendors intentionally or accidentally implement conformance bugs, which don't get fixed for quite some time.

      These buggy implementations may become the de-facto standard, which has happened in the past, e.g. with Internet Explorer 6, requiring web developers to implement quirks code. Since the release of Internet Explorer 6 in 2001, Microsoft had 95% market share and did not release a new version until 2006. This seriously hindered innovation in web technology. It also took many years before other implementations with better web standards conformance and more features gained acceptance.

  3. So are standards bad? Not necessarily, but I think we have to critically evaluate them and not consider them holy books. Moreover, standards need to be developed with some formality and elegance in mind. If junk gets standardized, it will remain junk and requires everybody to cope with it for quite some time.

    One of the things that may help is using good formalisms. For example, a good one I can think of is BNF that was used in the ALGOL 60 specification.

  4. To move the software engineering community as a whole forward, industry and academia should collaborate. Unfortunately, their Key Performance Indicators (KPIs) drive them apart, resulting in a prisoner's dilemma.

    This proposition is related to an earlier rant blog post about software engineering fractions and collaborations. If I were to use stereotypes, then the primary KPI of academia is the number of (refereed) publications and the primary KPI of industry is how much they sell.

    It's obvious that both parties would benefit if they would let go of their KPIs a bit, i.e. if academia does a bit more in engineering tools and transferring knowledge, while industry spends some of its effort on experimenting and paying attention to "secondary tasks". However, in practice often the opposite happens (although there are exceptions, of course).

    This is analogous to a prisoner's dilemma, which is a peculiar phenomenon. Visualize the following situation: two prisoners have jointly committed a crime and got busted. If both of them remain silent, they each have to spend one year in prison. If one prisoner betrays the other while the other remains silent, then the betrayer walks free while the silent one goes to jail for ten years. If both betray each other, they each have to spend five years in prison.

    In this kind of situation the (obvious) win-win situation for both criminals is that they both remain silent. However, because they both give priority to their self-interests, each of them betrays the other, assuming that he will walk free. But instead, the situation has a worse outcome than necessary: both have to spend five years in prison.

  5. In software engineering, the use of social media, such as blogging and Twitter, is an effective and efficient way to strengthen collaboration between industry and academia.

    This proposition is related to the previous one. How can we create a win-win situation? Often I hear people saying: "Well collaboration is interesting, but it costs time and money, which we don't have right now and we have other stuff to do".

    I don't think the barrier has to be that high. Social media, such as blogging and Twitter, can be used for free and allow one to easily share stories, thoughts, results and so on. Moreover, recipients can also share these with people they know.

    My blog, for example, has attracted many more readers and has given me much more feedback than all my research papers combined. Moreover, I'm not limited by all kinds of constraints that program committee members impose on me.

    However, these observations are not unique to me. Many years ago, a famous Dutch computer scientist named Edsger Dijkstra wrote many manuscripts that he sent to his peers directly. He wrote about subjects that he found relevant. His peers spread these manuscripts among their colleagues, eventually allowing him to reach thousands of people.

  6. While the vision behind the definition of free software as described by the Free Software Foundation to promote freedom is compelling, the actual definition is ambiguous and inadequately promoted.

    The free software definition defines four freedoms. I can rephrase them in one sentence: "Software that can be used, studied, adapted and shared for any purpose". An essential precondition for this is the availability of the source code. I think this definition is clear and makes sense.

    However, there is a minor issue with the definition. The word 'free' is ambiguous in English. In the definition, it refers to free as in freedom, not free as in price (gratis). In Dutch or French this is not a problem: in these languages, free software translates to 'vrije software' and 'logiciel libre'.

    Moreover, although (almost) all free software is gratis, it's also allowed to sell free software for any price, which is often misunderstood.

    I have seen that the ambiguity of the word free is often used as an argument why the definition is not attracting a general audience.

    I think there is a bigger problem: the way free software is advertised. Most advertisements are not about what's good about free software, but about what's bad about proprietary software and how evil certain practices of certain companies are.

    Although I don't want to say that they are wrong or that we should tolerate such bad practices, I think it would also help to pay more attention to the good aspects of free software. The open source definition takes much better care of this, for example by emphasizing the ability to improve the quality of software. That's something I think would attract people from the other side, whereas negative campaigning does not.

  7. Compared to the definition of free software provided by the Free Software Foundation, the definition of Open Source as provided by the Open Source Initiative fails to improve on freedom. While it has been more effectively promoted, it lacks a vision and does not solve the ambiguity.

    The open source definition lists ten pragmatic points with the intention of having software that is free (as in freedom), e.g. availability of source code, means to share modified versions, and so on. However, it does not explain why it's desired for others to respect these ten pragmatic points and what their rationale is (although there is an annotated definition that does).

    Because of these reasons, I have seen that software is sometimes incorrectly advertised as being open source, while in reality it is not. For example, there is software available with source code whose license does not allow commercial redistribution, such as the LCC compiler. That's not open source (nor free software). Another prominent example is Microsoft's Shared Source initiative, which only allows someone to look at code, but not to modify or redistribute it.

    A very useful aspect of open source is the way it's advertised. It pays a lot of attention to selling its good points, for example that everyone is able to improve its quality and is allowed to collaborate. Companies (even those that sell proprietary software) acknowledge these positive aspects and are sometimes willing to work with open-source people on certain aspects, or to "open-source" pieces of their proprietary products. Examples of this are the Eclipse platform and the Quake 3 Arena video game.

  8. Just like making music is more than translating notes and rests into tones and pauses with specific durations, developing software systems is more than implementing functional requirements. In both domains, details, collaboration and listening to others are most important.

    I have observed that in both domains we make estimations. In software development, we try to estimate how much effort it takes to develop something and in music we try to estimate how much effort it takes to practice and master a composition.

    In software development, we often look at functional requirements (describing what a system should do) to estimate. I have seen that sometimes functional requirements may look ridiculously simple, such as displaying tabular data on a web page. Nearly every software developer would say: "that's easy".

    But even if functional requirements are simple, certain non-functional requirements (describing how and where) can make implementation very difficult. For example, properly implementing security facilities, conforming to a certain quality standard (such as ISO 9126), or providing scalability. These kinds of aspects may be much more complicated than the features of a system itself.

    Moreover, software is often developed in a team. Good communication and being able to divide work properly are important. In practice, you will almost always see that something goes wrong there, because people assume that all details are known by others, or because there is no clear architecture of the system that allows work to be properly divided among team members.

    All these kinds of issues may result in development times that are significantly longer than expected and in a failure to properly deliver what clients have asked for.

    In my spare time I'm a musician and in music I have observed similar things. People make effort estimations by looking at the notes written on paper. Essentially, you could see those as functional requirements as they tell you what to play.

    However, besides playing notes and pausing, there are many more aspects that are important, such as tempo, dynamics (sudden and gradual loudness) and articulation. You could compare these aspects to non-functional requirements in software, as they tell somebody how to play (series of) notes.

    Moreover, making music can also be a group effort, such as a band or an orchestra, requiring people to properly interact with each other. If others make mistakes they may confuse you as well.

    I vividly remember a classical composition from 15 years ago. I had just joined an orchestra and we were practising "Land of the Long White Cloud" by Philip Sparke. In the middle of the composition, there is a snare drum passage consisting of only sixteenth notes. I had already learned to play patterns like these in my first drum lesson, so I thought that it would be easy.

    However, I had to play these notes in a very fast tempo and very quietly, which are usually conflicting constraints for percussionists. Furthermore, I had to keep up the right tempo and not let the other members distract me. Unfortunately, I couldn't cope with all these additional constraints, and that particular passage had to be performed by somebody else. I felt like an idiot and I was very disappointed. However, we did win the contest in which we had to perform that particular composition.

    As a side note: "The land of the long white cloud" is also known as Aotearoa or New Zealand.

  9. Multilingualism is a good quality for a software engineer as it raises awareness that in natural languages as well as in software languages and techniques, there are things that cannot be translated literally.

    It's well known that some words or sentences cannot be literally translated from one natural language to another. In such cases, we have to reformulate a sentence into something that has an equivalent meaning, which is not always trivial.

    For example, non-native English speakers, like Dutch people, tend to make (subtle) mistakes now and then, which sometimes have hilarious outcomes. Make that the cat wise is a famous website that elaborates on this, calling the Dutch variant of English: Dunglish.

    Although we are aware of the fact that we cannot always translate things literally in natural languages, I have observed that the same phenomenon occurs in the software domain. One particular programming language may be more suitable for a certain goal than another programming language. Eventually, code written in a programming language gets compiled into machine code or into another programming language (an equivalent reformulation in a different language), or interpreted by an interpreter.

    However, I have also observed that in the software engineering domain there is a lot of programming language conservatism. Most conventional programming languages used nowadays (Python, C++, Java, C# etc.) use structured and imperative programming concepts in combination with class-based OO techniques. Unconventional languages such as purely functional programming languages (Haskell) or declarative languages (Prolog, Erlang) get little mainstream acceptance, although they have very powerful features. For example, programs implemented in a purely functional language can be scaled across multiple cores/processors more easily.

    Instead, many developers use conventional languages to achieve the same, imposing many additional problems that need to be solved and more chances of errors. Our research based on the purely functional deployment model also suffers from this conservatism. Therefore, I think multilingualism is a very powerful asset for an engineer, who is then not limited by a solution set that is too narrow.

  10. Stubbornness is both a positive and a negative trait of a researcher.

    I think that when you do research and discover something uncommon, others may reject it or tell you to do something that they consider more relevant. Some researchers choose to comply and give up work that they think is relevant. If every scientist did that, then I think certain things would never have been discovered. I think it's a scientist's duty to properly defend important discoveries.

    Centuries ago it was even worse. For example, the famous scientist Galileo Galilei defended the view that the Sun, not the Earth, is the centre of our solar system. He was sentenced by the Catholic Church to house arrest for the rest of his life.

    However, stubbornness also has a negative aspect. It often comes with ignorance and sometimes that's bad. For example, I have "ignored" some advice about properly studying related work and taking evaluation seriously, resulting in a paper that was badly rejected.

Conclusion


In this blog post, I have described some thoughts on my PhD propositions. The main reason for writing this down is to prepare myself for my defence. I know this blog post is lengthy, but that's good. This will probably prevent my committee members from reading all the details, so that they cannot use everything I have just written against me :-) (I'm very curious to see whether anyone has noticed that I just said this :P).