Monday, October 31, 2011

Deploying .NET services with Disnix

In two earlier blog posts, I have explained how the Nix package manager can be used to deploy .NET software. I'm happy to report that I have extended these possibilities to Disnix.

With these new features it is possible to develop databases for Microsoft SQL server, to implement WCF web services and ASP.NET web applications (which may be interconnected), and to automatically and reliably deploy them in a network of machines using Disnix.

Modifications to Disnix


The modifications I had to make to Disnix were relatively minor. Disnix can already be compiled on Cygwin, just like the Nix package manager. Since Disnix is built on top of Nix, it reuses the Nix functions I have developed in earlier blog posts to build Visual Studio projects.

The only missing piece in the deployment process is the activation and deactivation of Microsoft SQL server databases and ASP.NET web applications on Internet Information Services (IIS), for which activation scripts must be developed. Luckily, Microsoft SQL server and IIS have command-line tools which can be scripted to do this job.

To support these new types of services, I have developed the following activation scripts:

  • mssql-database. This activation script loads a schema on initial startup if the database does not exist. It uses the OSQL.EXE tool included with Microsoft SQL server to execute SQL instructions from shell scripts, in order to check whether the database exists and to create the tables if needed.
  • iis-webapplication. This activation script activates or deactivates a web application on Microsoft Internet Information Services. It uses the MSDeploy.exe tool to activate or deactivate a given web application.

These activation scripts are automatically used by assigning a mssql-database or iis-webapplication type to a service in the Disnix services model.
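
To give an impression, the sketch below shows what a fragment of such a services model could look like. It is only meant as an illustration: the pkgs bindings and the zipcodes/ZipcodeService service names are assumptions, loosely based on the example discussed later in this post.

{distribution, system}:

let pkgs = import ../top-level/all-packages.nix { inherit distribution system; };
in
rec {
  zipcodes = {
    name = "zipcodes";
    pkg = pkgs.zipcodes;
    type = "mssql-database";      # activated with the mssql-database script
  };

  ZipcodeService = {
    name = "ZipcodeService";
    pkg = pkgs.ZipcodeService;
    dependsOn = { inherit zipcodes; };
    type = "iis-webapplication";  # activated on IIS through MSDeploy
  };
}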

Installing Disnix on Windows


It is important to know how to get Disnix working on Windows and how to enable support for .NET applications. Most of the installation steps on Windows/Cygwin are the same as on UNIX systems. Details of the Disnix installation process can be found in the Disnix documentation.

However, there are a number of details that must be taken care of, which are not described in the manual (yet). Furthermore, the best way to get Disnix working is by compiling it from source so that all the required optional features are enabled.

In order to enable the mssql-database and iis-webapplication activation types, you must first manually install SQL server and IIS on your Windows system.



Moreover, the configure script of the disnix-activation-scripts package must be able to find the OSQL.EXE and MSDeploy command-line tools, which must be in your PATH. Otherwise, the activation types that we need are disabled and we cannot deploy .NET applications.

On my Windows 7 machine, OSQL can be found in: C:\Program Files\Microsoft SQL Server\100\Tools\Binn and MSDeploy in: C:\Program Files\IIS\Microsoft Web Deploy. I have included a screenshot above, which shows you what the output of the configure script should look like. As you can see, the configure script was able to detect the locations of the command-line tools, because I adapted the PATH environment variable accordingly.

Running the Disnix daemon


We also need to run the Disnix daemon on every machine in the network, so that we can remotely deploy the services we want. Probably the best way to get Cygwin services running is by using the cygrunsrv command, which runs Cygwin programs as Windows services.

Since the core Disnix daemon is a D-Bus service, we need to run the D-Bus system daemon, which can be configured by typing:

$ cygrunsrv -I dbus -p /usr/sbin/dbus-daemon.exe -a \
    '--system --nofork'

The Disnix service can be configured by typing:

$ cygrunsrv -I disnix -p /usr/local/bin/disnix-service.exe -a \
  '--activation-modules-dir /usr/local/libexec/disnix/activation-scripts' \
  -e 'PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin' \
  -y dbus

Disnix also needs to be remotely connectible. Disnix can use various interfaces, but the recommended interface is SSH. In order to connect through SSH, you also need to configure an SSH server. This can be done by executing the following script on Cygwin:

$ ssh-host-config

And you probably need to configure some SSH keys as well, to prevent Disnix from asking for a password for each operation. Check the OpenSSH documentation for more information.


After configuring the services, you probably need to activate them for the first time, which can be done through the Windows service manager (Control Panel -> System and Security -> Administrative Tools -> Services). You need to pick the Disnix service and select the start option. If you want to use the SSH server, you need to pick and start the 'CYGWIN sshd' service as well. A screenshot is included above.

Example case


Now that I have explained how Disnix can be installed and configured on Windows, you probably also want to see what its capabilities are. As an example case, I have ported the StaffTracker, a motivating example in our WASDeTT paper, from Java to .NET technology, using C# as the implementation language, ADO.NET for database access, WCF to implement web services, and ASP.NET to create the web application front-end.


It was a nice opportunity to learn some of these technologies. I have to admit that Visual Studio 2010 was a very convenient development environment and it didn't take much time to port the example case. Although I was impressed by this, I currently have no plans to write any software using .NET technology, except for this example case. (Perhaps I will port it to Mono as an experiment some day.)

Distributed deployment


In order to make our example deployable through Disnix, I had to write Disnix expressions for each service component and I had to write a services, infrastructure and distribution model. A Disnix expression for a WCF web service looks like this:

{dotnetenv}:
{zipcodes}:

dotnetenv.buildSolution {
  name = "ZipcodeService";
  src = ../../../../services/webservices/ZipcodeService;
  baseDir = "ZipcodeService";
  slnFile = "ZipcodeService.csproj";
  targets = "Package";
  preBuild = ''
    sed -e 's|.\SQLEXPRESS|${zipcodes.target.hostname}\SQLEXPRESS|' \
        -e 's|Initial Catalog=zipcodes|Initial catalog=${zipcodes.name}|' \
        -e 's|User ID=sa|User ID=${zipcodes.target.msSqlUsername}|' \
        -e 's|Password=admin123$|Password=${zipcodes.target.msSqlPassword}|' \
        Web.config
  '';
}

As you may notice, the expression above looks similar to an ordinary Nix expression building a Visual Studio project, except that it uses the inter-dependency parameter (zipcodes) to configure a database connection string defined in the Web.config file, so that the web service can connect to its database back-end. More information about setting a connection string through this configuration file can be found here: http://weblogs.asp.net/owscott/archive/2005/08/26/Using-connection-strings-from-web.config-in-ASP.NET-v2.0.aspx.
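
The target properties referenced above (such as hostname, msSqlUsername and msSqlPassword) are defined in the infrastructure model. A rough sketch of such a model for this example could look as follows; the machine name, credentials and system identifier are made-up values:

{
  test1 = {
    hostname = "test1";
    msSqlUsername = "sa";
    msSqlPassword = "secret";
    system = "i686-cygwin";
  };
}

The corresponding distribution model then simply maps both the database and the web service to that machine:

{infrastructure}:

{
  zipcodes = [ infrastructure.test1 ];
  ZipcodeService = [ infrastructure.test1 ];
}

Disnix exposes the properties of the machine to which an inter-dependency has been distributed through its target attribute, which is how zipcodes.target.hostname ends up in the connection string of the expression above.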

By using the Disnix expressions in conjunction with the services, infrastructure and distribution models, the .NET example can be deployed in a network of machines with a single command line instruction:

$ disnix-env -s services.nix -i infrastructure.nix -d distribution.nix

Below, I have included a screenshot showing a Disnix deployment scenario with our .NET StaffTracker example. In this screenshot, you can see a console showing Disnix output, a web browser displaying the entry page of the web application front-end, the IIS manager showing the various deployed WCF web services and the ASP.NET web front-end, and the SQL Server Management Studio showing a number of deployed databases. The complete system is deployed using a single command-line instruction.


Limitations


In this blog post I have shown how service-oriented .NET applications can be automatically deployed with Disnix. There are several slight inconveniences however:

  • Disnix is used to manage the service components of a system. Disnix does not deploy infrastructure components, such as a web server or a database server. DisnixOS is a NixOS-based extension that takes care of this. However, DisnixOS cannot be used on Windows, because SQL server and IIS are tightly integrated into the Windows operating system and registry. We cannot use the Nix store to safely isolate them. You need to either install these infrastructure components manually or use other deployment solutions.
  • As mentioned in our previous blog post about .NET deployment, the .NET framework needs to be installed manually in order to be able to build Visual Studio projects. On most Windows installations, however, the .NET framework is already included.
  • The .NET build functions and activation scripts are quite new and not very well tested. Moreover, they could also break, because we currently have no way to automatically test them like we do with Linux software.
  • Also not all desired deployment features may be supported. I'd like to have feedback on this :-)

Thursday, October 27, 2011

Software deployment complexity

In this blog post, I'd like to talk about the software deployment discipline in general. In my career as a PhD student and while visiting academic conferences, I have noticed that software deployment is not (and has never been) a very popular research subject within the software engineering community.

Furthermore, I have encountered many misconceptions about what software deployment is supposed to mean, and some people are even surprised that research is done in this field. I have also received some vague complaints from certain reviewers saying that the things we do aren't novel, and comments such as: "hmm, what can the software engineering community learn from this? I don't see the point..." and "this is not a research paper".

What is software deployment?


So what actually is software deployment? One of the first academic papers on software deployment, by Carzaniga et al. [1], describes this discipline as follows:

Software deployment refers to all the activities that make a software system available for use

Some of the activities that may be required to make a software system available for use are:

  • Building software components from source code
  • Packaging software
  • Transferring the software from the producer site to consumer site
  • Installation of the software system
  • Activation of the software system
  • Software upgrades


An important thing to point out is that the activities described above are all steps to make a software system available for use. I have noticed that many people mistakenly think that software deployment is just the installation of a system, which is not true.

Essentially, the point of software deployment is that a particular software system is developed with certain goals, features and behaviour in mind by the developers. Once this software system is to be used by end-users, it typically has to be made available for use in the consumer environment. It is important that the software system behaves exactly the way the developers intended. It turns out that, for many reasons, this process has become very complicated nowadays and it is also very difficult to give any guarantees that a software system operates as intended. In some cases, systems may not work at all.

Relationship to software engineering


So what does software deployment have to do with software engineering? According to [2], software engineering can be defined as:

Software Engineering (SE) is the application of a systematic, disciplined, quantifiable approach to the development, operation, and maintenance of software, and the study of these approaches; that is, the application of engineering to software.

Within the software engineering research community, we investigate techniques to improve and/or study software development processes. The deployment step is typically the last phase in a software development project, when the development of a software system is completed and it is ready to be made available to end-users.

In old, traditional waterfall-style software development projects, the deployment phase was not performed that frequently. Nowadays most software development projects are iterative, in which features of the software are extended and improved, so for each cycle the system has to be redeployed. Especially in Agile software projects, which have short iterations (of about 2 weeks), it is crucial to be able to deploy a system easily.

Because of the way we develop software nowadays, the deployment process has become much more of a burden and that's why it is also important to have systematic, disciplined, quantifiable approaches for software deployment.

Apart from delivering systems to end-users, we also need to deploy a system to test it. In order to run a test suite, all necessary environmental dependencies must be present and correct. Without a reliable and reproducible deployment process, this becomes a burden and it is difficult to guarantee that tests succeed in all circumstances.

Why is software deployment complicated?



Back in the old days, software was developed for a specific machine (or hardware architecture), stored on a disk/tape and delivered to the customer. Then the customer loaded the program from the tape/disk into memory and was able to run the program. Apart from the operating system, all the required parts of the program were stored on the disk. Basically, my good old Commodore 64/128 worked like this. All software was made available on either cassette tapes or 5.25 inch floppy disks. Apart from the operating system and BASIC interpreter (which were stored in the ROM of the Commodore) everything that was required to run a program was available on the disk.

Some time later, component based software engineering (CBSE) was introduced and received wide acceptance. The advantages of CBSE were that software components could be obtained from third parties without having to develop them yourself, and that components with the same or similar functionality could be shared and reused across multiple programs. CBSE greatly improved the quality of software and the productivity of developers. As a consequence, software products were no longer delivered as self-contained products, but became dependent on the components already residing on the target systems.


Although CBSE provides a number of advantages, it also introduced additional complexity and challenges. In order to run a software program, all dependencies must be present and correct and the program must be able to find them. There are all kinds of things that can go wrong while deploying a system: a dependency may be missing, or a program may require a newer version of a specific component. Newer components may also be incompatible with a program (sometimes this is intentional, but it can also happen accidentally, due to a bug on which a program relies).


For example, in Microsoft Windows (but also on other platforms) this led to a phenomenon called the DLL hell. Besides Windows DLLs, this phenomenon occurs in many other contexts as well, such as the JAR hell for Java programs. Even the good old AmigaOS suffered from the same weakness, although the problems were not as severe as they are now, because the versions of libraries didn't change that frequently.

In UNIX-like systems, such as Linux, you will notice that the degree of sharing of components through libraries is raised almost to the maximum. For these kinds of systems, it is crucial to have deployment tooling to properly manage the packages installed on a system. In Linux distributions the package manager is a key aspect and also a distinct feature that sets a particular distribution apart from others. There are many package managers around, such as RPM, dpkg, portage, pacman, and Nix (which we use in our research as a basis for NixOS).


Apart from the challenges of deploying a system from scratch, many systems are also upgraded, because (in most cases) it's too costly and time-consuming to deploy them from scratch over and over again, for each change. In most cases upgrading is a risky process, because files get modified and overwritten. An interruption or crash during an upgrade may have disastrous results. Also, an upgrade may not always give the same results as a fresh installation of a system.

Importance of software deployment


So why is research in software deployment important?


  • First of all, (not surprisingly) software systems are becoming bigger and increasingly more complex. Nowadays, some software systems are not only composed of many components, but these components are also distributed and deployed on various machines in a network, working together to achieve a common goal. For example, service-oriented systems are composed this way. Deploying these kinds of systems manually is a very time-consuming, complex, error-prone and tedious process. The bigger the system gets, the more likely it becomes that an error occurs.
  • We have to be more flexible in reacting to events. For example, in a cloud infrastructure, if a machine breaks, we must be able to redeploy the system in such a way that services are still available to end-users, limiting the impact as much as possible.
  • We want to push changes to a system in a production environment faster. Because systems become increasingly more complex, an automated deployment solution is essential. In Agile software development projects, a team wants to generate value as quickly as possible, for which it is essential to have a working system in a production environment as soon as possible. To achieve this goal, it is crucial that the deployment process can be performed without much trouble. A colleague of mine (Rini van Solingen), who is also a Scrum consultant, has covered this importance in a video blog interview.

Research


What are software deployment research subjects?

  • Mechanics. This field concerns the execution of the deployment activities. How can we make these steps reproducible, reliable, efficient? Most of the research that I do covers deployment mechanics.
  • Deployment planning. Where to place a component in a network of machines? How to compose components together?
  • Empirical research covering various aspects of deployment activities, such as: How to quantify build maintenance effort? How much maintenance is needed to keep deployment specifications (such as build specifications) up to date?

Where are software deployment papers published? Currently, there is no subfield conference about software deployment. In the past (a few years before I started my research), there were three editions of the Working Conference on Component Deployment, which has not been held since 2005.

Most deployment papers are published at various conferences, such as the top general conferences and subfield conferences about software maintenance, testing, and cloud computing. The challenging part of this is that (depending on the subject) I have to adapt my story to the conference where I want my paper to be published. This requires me to explain the same problems over and over again and to integrate them with the given problem domain, such as cloud computing or testing. This is not always trivial to do, nor will every reviewer understand what the point is.

Conclusion


In this blog post, I have explained what software deployment is about and why research in this field is important. Systems are becoming much bigger and more complicated and we want to respond to changes faster. In order to manage this complexity, we need research in providing automated deployment solutions.

References


Wednesday, September 14, 2011

Deploying .NET applications with the Nix package manager (part 2)

In my previous blog post, I have explained how the Nix package manager can be used to deploy .NET applications. One of the open issues was that run-time dependencies can't be resolved in a convenient way. I explained three possible solutions, each having its pros and cons, and none of them was ideal.

After writing that blog post, I received a number of suggestions and reactions from people on the #nixos freenode channel. Moreover, during the dinner of the SEAMS symposium in Hawaii, I heard similar suggestions. It seems that blogging about certain issues pays off after all!

Over the last two days, I have looked at these suggestions and done some experiments at Philips. I'm happy to report that I now have a follow-up story, with a new solution for resolving the run-time dependencies of .NET executables. In my opinion, this solution is also the best option.

Implementing a wrapper for .NET applications


Apparently .NET has a reflection API. With this reflection API you can dynamically load classes and dynamically invoke methods. You can also load assemblies dynamically from any location whether they have a strong name or not.

The .NET runtime also fires an AssemblyResolve event in case a library assembly can't be found. Apparently you can create your own event handler to deal with such an event and use it to load a missing assembly through the reflection API.

So by taking these features into account, it is possible to create a wrapper executable capable of resolving the run-time dependencies that we need. This is what the wrapper I have developed for Nix looks like (I actually had to write some C# code for this):

using System;
using System.Reflection;
using System.IO;

namespace HelloWorldWrapper
{
    class HelloWorldWrapper
    {
        private String[] AssemblySearchPaths = {
          @"C:\cygwin\nix\store\23ga...-ALibrary",
          @"C:\cygwin\nix\store\833p...-BLibrary"
        };

        private String ExePath =
          @"C:\cygwin\nix\store\27f2...-Executable\Executable.exe";

        private String MainClassName =
          "SomeExecutable.Executable";

        public HelloWorldWrapper(string[] args)
        {
            // Attach the resolve event handler to the AppDomain
            // so that missing library assemblies will be searched
            AppDomain currentDomain = AppDomain.CurrentDomain;
            currentDomain.AssemblyResolve +=
              new ResolveEventHandler(MyResolveEventHandler);

            // Dynamically load the executable assembly
            Assembly exeAssembly = Assembly.LoadFrom(ExePath);

            // Lookup the main class
            Type mainClass = exeAssembly.GetType(MainClassName);

            // Lookup the main method
            MethodInfo mainMethod = mainClass.GetMethod("Main");

            // Invoke the main method
            mainMethod.Invoke(this, new Object[] {args});
        }

        static void Main(string[] args)
        {
            new HelloWorldWrapper(args);
        }

        private Assembly MyResolveEventHandler(object sender,
          ResolveEventArgs args)
        {
            // This handler is called only when the common language
            // runtime tries to bind to the assembly and fails.

            Assembly MyAssembly;
            String assemblyPath = "";
            String requestedAssemblyName =
              args.Name.Substring(0, args.Name.IndexOf(","));

            // Search for the right path of the library assembly
            foreach (String curAssemblyPath in AssemblySearchPaths)
            {
                assemblyPath = curAssemblyPath + "/" +
                  requestedAssemblyName + ".dll";

                if (File.Exists(assemblyPath))
                    break;
            }

            // Load the assembly from the specified path. 
            MyAssembly = Assembly.LoadFrom(assemblyPath);

            // Return the loaded assembly.
            return MyAssembly;
        }

    }
}

The wrapper class defined above has a number of fields. The AssemblySearchPaths field defines a String array containing all the Nix store paths of the runtime dependencies. The ExePath field defines a String referring to the path of the executable in the Nix store which we want to run. The MainClassName field defines the fully qualified name of the class containing the Main method we want to run.

In the Main method of this class, we create an instance of the wrapper. In the constructor, we attach our custom resolve event handler to the current application domain. Then we use the reflection API to dynamically load the actual executable and to invoke the Main method of the specified main class.

When we load the executable assembly, the resolve event handler is triggered a number of times. Our custom MyResolveEventHandler tries to load the given assembly from the search paths defined in the AssemblySearchPaths string array, which should succeed if all runtime dependencies are present.

There is a small caveat, however, with dynamically invoking the Main method of another executable. By default, the Main method in C# programs is defined like this:

namespace SomeNamespace
{
    class SomeClass
    {
       static void Main(string[] args)
       {
       }
    }
}

Apparently, it has no access modifier, which means it is private by default and thus not accessible from outside its own assembly. This means that we cannot invoke an external Main method from a different assembly, such as the wrapper. To counter this, we need to make the access modifier of the Main method in the actual executable public (i.e. public static void Main), or expose it to the wrapper as a friend assembly; either way, we need to make a small modification.

Usage


I have implemented a convenience function in Nixpkgs, dotnetenv.buildWrapper, which automatically builds a .NET executable and generates a wrapper for the given executable. The function can be invoked like this:

{dotnetenv, MyAssembly1, MyAssembly2}:

dotnetenv.buildWrapper {
  name = "My.Test.Assembly";
  src = /path/to/source/code;
  slnFile = "Assembly.sln";
  assemblyInputs = [
    dotnetenv.assembly20Path
    MyAssembly1
    MyAssembly2
  ];
  namespace = "TestNamespace";
  mainClassName = "MyTestApplication";
  mainClassFile = "MyTestApplication.cs";
}

As you may see, the structure of dotnetenv.buildWrapper is similar to the dotnetenv.buildSolution function, except that it requires several additional parameters for the wrapper, such as the namespace, the class name and the file location of the class containing the Main method of the actual executable. The function automatically makes the given Main method in the given main class file public, and it creates a wrapper class containing the right properties to run the actual executable, such as the location of the actual executable and the paths of the run-time dependencies.

By using this wrapper function, it is possible to run a .NET executable assembly from the Nix store without much trouble.

Conclusion


In this blog post, I have implemented a wrapper executable that deals with resolving run-time dependencies of a .NET application. The wrapper uses a resolve event handler which loads all the required library assemblies through the .NET reflection API. This wrapper can be automatically generated from a convenience function, which makes it possible to run .NET applications from the Nix store, without much trouble.


Tuesday, September 6, 2011

Deploying .NET applications with the Nix package manager

This probably sounds like a very strange topic to some (or perhaps most) readers, but I have done some experiments in the past with deploying .NET applications by using the Nix package manager. The Nix package manager is mostly used on Unix-like systems (Linux, FreeBSD, etc.) and designed with Unix principles in mind. Furthermore, a lot of people know me as a Microsoft critic. So you probably wonder why I would want to do this.

Motivation


Being able to use Nix for deploying .NET applications has the following benefits:

  • For installing or upgrading .NET applications, you get the same deployment benefits that Nix provides: being able to store multiple versions/variants next to each other, dependency completeness, atomic upgrades and rollbacks, and a garbage collector which safely removes components no longer in use.
  • You can use Hydra, our continuous build and integration server, for building and testing .NET applications in various environments, including their environmental dependencies.
  • You can use Disnix to manage the deployment of service-oriented applications developed using .NET technology in a network of machines. This also works for web applications. For example, you can deploy your ASP.NET / Microsoft SQL server database environment from a declarative specification. Because Disnix is built on top of Nix, it also provides features such as dependency completeness, (almost) atomic upgrades and a garbage collector in a distributed environment.
  • The Nix deployment technology and related tooling are designed as generic tools (i.e. not developed for a particular component technology). Being able to support .NET applications is a useful addition.
  • And finally, we have an industry partner in our research project who is interested in this.

Global Assembly Cache (GAC)


When I talk about Nix (and especially about the principle of the Nix store) to .NET people, I often hear that the Global Assembly Cache (GAC) already solves the DLL hell, so there is nothing to worry about. Although the GAC solves several common deployment issues, it has a number of drawbacks compared to the Nix store:

  • It only provides isolation for library assemblies. Other components such as executables, compilers, configuration files, or native libraries are not supported.
  • A library assembly must have a strong name, which gives the library a unique name. A strong name is composed of several attributes, such as a name, version number and culture. Furthermore, the library assembly is signed with a public/private key pair.
  • Creating a strong-named assembly is in many cases painful. A developer must take care that the combination of attributes is always unique. For example, for a new release the version number must be increased. Because developers have to take care of this, people typically don't use strong names for internal release cycles, because it's too much work.
  • Creating a strong-named assembly can also go wrong: a developer may forget to update one of these strong name attributes, which makes it possible to create a different assembly with the same strong name. In that case, the GAC can't provide isolation.

In contrast to the GAC, you can store any type of component in the Nix store, such as executables, configuration files, compilers etc. Furthermore, the Nix store uses hash codes derived from all build-time dependencies of a component, which always provides unique component file names.

Building Visual Studio projects in Nix


So how can Visual Studio projects be supported in Nix to compile .NET applications? We have implemented a Nix function to support this:

{stdenv, dotnetfx}:
{ name, src, slnFile, targets ? "ReBuild"
, options ? "/p:Configuration=Debug;Platform=Win32"
, assemblyInputs ? []
}:
stdenv.mkDerivation {
  inherit name src;
  buildInputs = [ dotnetfx ];
  installPhase = ''
    # Compose the AssemblySearchPaths environment variable from the Nix store
    # paths of the given assembly inputs (converted to Windows path names),
    # so that MSBuild can find the library assemblies
    for i in ${toString assemblyInputs}; do
      windowsPath=$(cygpath --windows $i)
      AssemblySearchPaths="$AssemblySearchPaths;$windowsPath"
    done
    export AssemblySearchPaths

    # Build the solution and redirect the output to a unique path in the Nix store
    ensureDir $out
    outPath=$(cygpath --windows $out)\\
    MSBuild.exe ${slnFile} /nologo /t:${targets} \
      /p:OutputPath=$outPath ${options} ...
  '';
}

The Nix expression code fragment above shows you the definition of the dotnetenv.buildSolution function, which builds Visual Studio projects and stores the output in the Nix store.

The idea of this function is simple: it takes several parameters, such as the name of the component, a string with build-time options, the filename of the Visual Studio solution file (SLN), and a list of library assemblies (assemblyInputs). It uses dotnetfx (the .NET framework) as a build-time dependency, which provides access to the MSBuild executable, used to build Visual Studio solution files.

In order to let MSBuild find its library dependencies, we set the AssemblySearchPaths environment variable to contain the paths to the Nix store components containing the library assemblies. After setting the environment variable, the MSBuild command is invoked to build the given solution file and to produce the output in a unique path in the Nix store. The cygpath command is used to convert UNIX path names to Windows path names (and vice versa).

{dotnetenv, MyAssembly1, MyAssembly2}:

dotnetenv.buildSolution {
  name = "My.Test.Assembly";
  src = /path/to/source/code;
  slnFile = "Assembly.sln";
  assemblyInputs = [
    dotnetenv.assembly20Path
    MyAssembly1
    MyAssembly2
  ];
}

The above Nix expression shows you how this function can be used to build a Visual Studio project. Like ordinary Nix expressions, this expression is also a function taking several input arguments, such as dotnetenv which provides the Visual Studio build function (shown in the previous code fragment) and the library assemblies which are required to build the project. In the body we call the buildSolution function with the right parameters, such as the Solution file and the library assemblies which this project requires. The dotnetenv.assembly20Path refers to the .NET 2.0 system assemblies directory.

rec {
  dotnetfx = ...
  stdenv = ...
  dotnetenv = import ../dotnetenv {
    inherit stdenv dotnetfx;
  };

  MyAssembly1 = import ../MyAssembly1 {
    inherit dotnetenv;
  };

  MyAssembly2 = import ../MyAssembly2 {
    inherit dotnetenv;
  };

  MyTestAssembly = import ../MyTestAssembly {
    inherit dotnetenv MyAssembly1 MyAssembly2;
  };
}

Like ordinary Nix expressions, we also have to compose Visual Studio components by calling the build function in the previous code fragment with the right parameters. This is done in the Nix expression shown above. The last attribute, MyTestAssembly, imports the expression shown in the previous code fragment with the required function arguments. As you may see, all dependencies of MyTestAssembly are also defined in this file. We can then run the following command-line instruction (pkgs.nix is the filename of the code fragment above):

$ nix-env -f pkgs.nix -iA MyTestAssembly

The assembly in our example is then built from source, including all its library dependencies, and the output is produced in:

/nix/store/ri0zzm2hmwg01w2wi0g4a3rnp0z24r8p-My.Test.Assembly

Running .NET applications from the Nix store


We have explained how we can build .NET applications and how MSBuild is able to find the required build-time dependencies. Besides building a .NET application with Nix, we also have to be able to run it from the Nix store. To make this possible, an executable assembly needs to find its runtime dependencies, which turned out to be more complicated than I thought.

The .NET runtime locates assemblies as follows:

  • First, it tries to determine the correct version of the assembly (only for strong named assemblies)
  • If the strong named assembly has been bound before in memory, that version will be used.
  • If the assembly is not already in memory, it checks the Global Assembly Cache (GAC).
  • And otherwise it probes the assembly, by looking in a config file or by using some probing heuristics.

Because Nix stores all components, including library assemblies, in unique folders in the Nix store, this gives some challenges: if an executable is started from the Nix store, the required libraries can't be found, because the probing heuristics look for libraries in the same base directory as the executable.

Currently, I have implemented three methods to resolve runtime dependencies (each approach has its pros and cons and none of them is ideal):

  • Copying DLLs into the same folder as the executable. This is the most expensive and inefficient method, because libraries are not shared on the hard drive. However, it does work with both private and strong named assemblies and it also works on older versions of Windows, such as Windows XP.
  • Creating a config file, which specifies where the libraries can be found. A disadvantage of this approach is that .NET does not allow private assemblies to be looked up in other locations beyond the basedir of the executable. Therefore assemblies need a strong name, which is not very practical because these have to be generated by hand.
  • The third option is creating NTFS symlinks in the same folder as the executable. This works also for private libraries. A disadvantage of this approach is that NTFS symlinks are only supported from Windows Vista and upwards. Furthermore, you need special user privileges to create them and their semantics are not exactly the same as UNIX symlinks.

Usage


If you want to experiment with the Visual Studio build functions in Nix, you need to install Nix on Cygwin and you need a checkout of Nixpkgs. Check the Nix documentation for more instructions on this.

The dotnetenv component can be found in the pkgs/buildsupport/ directory of Nixpkgs. You need to install the .NET framework yourself and you have to edit some of the attributes of dotnetenv so that the .NET framework utilities can be found.
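
For illustration, a proxy component for a manually installed .NET framework could be as simple as the sketch below, which symlinks MSBuild.exe from a fixed Windows location into the Nix store. This is not the actual Nixpkgs expression: the framework path is just an example and the real attribute layout may differ.

{stdenv}:

stdenv.mkDerivation {
  name = "dotnetfx-3.5";
  buildCommand = ''
    ensureDir $out/bin
    # Expose MSBuild.exe of the locally installed framework through the Nix store
    ln -s "/cygdrive/c/WINDOWS/Microsoft.NET/Framework/v3.5/MSBuild.exe" $out/bin/MSBuild.exe
  '';
}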

Unfortunately, the .NET framework can't be deployed through Nix (yet), because of its dependencies on the Windows registry. I also haven't looked into the scripting possibilities of the installer. So the .NET framework deployment isn't entirely pure. (Perhaps someone is able to make some tweaks to achieve this, but I don't know whether that would be legal.)

Conclusion


In this blog post, I have described how .NET applications can be deployed with Nix. However, there are some minor issues, such as the fact that runtime dependencies can't be resolved in a convenient way. Furthermore, the deployment isn't entirely pure as the .NET framework must be installed manually. I know that Mono is able to use the MONO_PATH environment variable to look for libraries in arbitrary locations. Unfortunately, it seems that the .NET framework does not have something like this.

I have been told that it's also possible to resolve run time dependencies programmatically. This way you can load any library assembly you want from any location. I'm curious if somebody has more information on this. Any feedback would be welcome, since I'm not a .NET expert.


Wednesday, July 20, 2011

Second computer

Due to the bad weather conditions last week, I decided to clean up some mess on the hard drive of my home PC. While cleaning, I found some old files transferred from an old computer. Previously, I wrote a blog post about my first computer, so I've decided to write a follow-up story about my second one, which was the last 'classic computer' I used before I moved over to the boring PC era.


My second computer was a Commodore Amiga 500, produced by the same manufacturer as my good old Commodore 128. This computer had a completely new hardware architecture and was not compatible with any model of the 8-bit computer product line. A picture of this computer is shown above (I took somebody else's picture, because mine doesn't look that pretty anymore).

The Amiga 500 had a Motorola 68000 processor running at ~7 MHz and 512 KiB of chip memory (RAM shared by the video chipset and applications). In our model, we upgraded the amount of chip memory to 1 MiB and replaced the OCS chipset with the newer ECS chipset. Moreover, we also owned a 100 MiB hard drive, which included 8 MiB of fast memory (RAM which is not shared with the video chipset and thus faster).

Although these specs aren't that impressive by today's standards, back then the Amiga was far ahead of its time. At its launch in 1987, its multimedia features were superior to those of the PC. The sound chip was probably its most praised feature, supporting four sound channels (two for the left speaker and two for the right) with 8-bit resolution and a 6-bit volume control per channel. The Sound Blaster cards of PCs at that time weren't capable of this.

Usage


The Amiga was quite easy to use. For most users, the Amiga was used as a gaming machine. When turning the Amiga on, an image (or animation) appeared on the screen requesting the user to insert a floppy disk:



The picture on the left (shown above) is the famous splash screen which appeared if the Kickstart 1.3 ROM was used. The picture shown on the right is the splash screen used by the Kickstart 2.0 ROM and later versions.

By inserting a floppy disk with a game or program, the system automatically booted from the disk, starting the program or game. This was possible because operating system parts, such as the shell, were stored in the Kickstart ROM. Moreover, most commercial game disks used custom bootblocks, bypassing the operating system shell and filesystem completely.

Graphical desktop environment



Besides a command-line shell from which a disk booted automatically, the Amiga also included the Workbench, a graphical desktop environment offering features similar to those of modern desktop environments. Although the Workbench wasn't required for most tasks, for hard drive users this environment was very convenient for picking and managing the programs installed on the hard drive.

Music composing: Protracker



A program I used quite a lot was ProTracker, to make my own musical compositions. A screenshot is shown on the left.

The idea of ProTracker was simple: a user has four channels to create sequences of notes (called "patterns"), which are chained together to form a complete song. ProTracker was probably not the best composer available for the Amiga, but it was used quite frequently by game developers.

The MOD file format which ProTracker used was also the de facto file format for music and was supported by many programs, because it was close to the underlying Amiga hardware and only required about 4% of CPU time to play a song (so that plenty of system resources were still available for graphics and other tasks).

I composed quite a number of Protracker modules, which I used for various purposes. Although most of the songs I composed sucked completely, there were also some nice ones :-)

Painting: Deluxe Paint



I also did some painting with Deluxe Paint. The graphics capabilities of the Amiga were quite limited compared to the graphics hardware we use nowadays. On the Amiga 500, only 32 configurable color registers were available, storing color values out of 4096 possible colors. More modern Amiga models (having the AGA chipset) supported 256 configurable colors out of 16 million, but unfortunately I didn't have such an advanced model. :-)

The Amiga has a number of special screen modes to use more colors than the preconfigured 32. The Extra Halfbrite (EHB) screen mode offers 64 colors, in which the last 32 are half the color values of the first 32. The Hold-and-Modify (HAM) screen mode modifies one of the red, green and blue channels of the adjacent pixel or picks a new color from the palette, so that (in theory) all 4096 colors can be used on one screen, using only 6 bits per pixel.

Programming: AMOS Professional


Of course I also did some programming. There were many programming languages and environments available for the Amiga, but I only used the AMOS Programming language, a programming language based on BASIC. AMOS Professional was the most advanced edition.


Apart from a BASIC interpreter and IDE (shown above), AMOS also included complementary utilities for creating sprites/blitter objects and AMAL, an animation scripting language and a compiler to create native executables. Moreover, it also included many example programs and even a number of complete games including their source code.

Furthermore, AMOS also had high-level instructions to open IFF/ILBM image files (created by programs such as Deluxe Paint) and a music player routine for the MOD format (to support music created by ProTracker). These features allowed me to combine the two tools mentioned earlier to develop games. I made quite a number of attempts to develop a serious game and got some interesting results. Unfortunately, I did not have all the knowledge and skills needed to develop something really useful. Also, because of my messy programming style, things became so complex that I abandoned the stuff I was working on, to start all over from scratch.

AMOS was quite successful back in those days and it was used for a number of commercial games, such as Jetstrike and Scorched Tanks.

Exotic Ripper and Eagle player



I also frequently used Exotic Ripper (shown left). This tool was very handy for ripping music from games. Because most game disks used custom bootblocks and bypassed the filesystem completely, we had to use other ways to retrieve the music files they were using. Luckily, after starting a game the music files remained somewhere in memory, from where we were able to retrieve them using Exotic Ripper.


Another reason why ripping was so interesting was that most games used the MOD file format. This allowed me to study how these songs were composed with ProTracker and to steal their samples, which I used in my own songs. :-)

Eagle player (shown left) was an awesome player able to play many music formats, which allowed me to play the exotic formats that I've retrieved by using Exotic Ripper.

Scala Multimedia


Another notable application I used was Scala Multimedia. I didn't find this program that interesting for myself, but my mother and grandfather used it to create video presentations and to combine these presentations with video recordings using a Genlock device. I discovered the capabilities of this program much more easily than they did :-)

Scala could be considered a predecessor of the Microsoft PowerPoint presentation program, combined with video editing capabilities.


The idea of Scala is simple: users create scripts, which are sequences of scenes. For each scene, a user can specify a transition effect, a transition time and some background music (using a file in MOD format). Each scene may contain text, pictures and/or animations (which are in IFF/ILBM format). Below I've included some screenshots of Scala scripts:



Games





Of course, the Amiga was very popular as a gaming machine. Above, some screenshots are shown of Turrican 2 (a conversion of the Commodore 64 version), Superfrog, Pinball Dreams and Lemmings 2: The Tribes. Apart from good graphics, these games were also praised for their high-quality soundtracks.

Conclusion


The Amiga showed great potential in the early 90s. Back then, the PC was still a very primitive/dull platform. Many tasks that we perform on our computers nowadays were already possible back then, although not everything looked as pretty. My current PC has 8 GiB of RAM (roughly one thousand times the amount of RAM in the Amiga), but sometimes I get the impression that the things we do on our computers haven't improved that much.

Unfortunately, PC hardware kept improving and eventually overtook the capabilities of the Amiga. When the new Amiga models with the AGA chipset (supporting 256 color registers) were launched, the SVGA graphics capabilities of the PC were already superior to those of the Amiga. Moreover, due to some bad business decisions, Commodore went bankrupt in 1994. I still used the Amiga for about one year and then I moved to the PC.

I'm happy to report that my Amiga is still in my possession. The only bad thing is that I don't have a working monitor anymore.

Even after Commodore went bankrupt, the Amiga did not completely die. Currently, the brand is owned by a company called Amiga Inc. Moreover, a company called Hyperion Entertainment has continued the development of AmigaOS and has released version 4.0. There are rumours that some day a new Amiga product line will be sold, but nobody knows whether this is really going to happen.

References


  • To create the screenshots in this blog post, I have used UAE, the Ultimate/Unix/Unusable Amiga Emulator, available under the GPL license
  • The source code of AMOS Professional and other AMOS variants has been released under a permissive BSD-like license by Clickteam, the authors' current employer.
  • The music player shown in this blog post, Eagleplayer, is nowadays available as free/open source software under the GPL license. The Eagleplayer source code has been used for UADE, a plugin for audio players such as XMMS and Audacious to support many Amiga music formats.
  • Team 17, the game company behind Superfrog and many other Amiga titles, still exists today. Nowadays, they are well known for games, such as Worms Armageddon
  • Factor 5, the company that created the Amiga ports of the Turrican games, also still exists. Nowadays, they are well known for producing Star Wars games. You can download disk images of some of their Amiga games from their webpage.
  • Electronic Arts was the company responsible for developing Deluxe Paint and designing the IFF file format. Back in the old days they were a software development company. Nowadays, they have become a well-known games publishing company.

Monday, June 27, 2011

Concepts of programming languages

I'm quite busy with many things lately (too many things if I'm quite honest), but it's a good thing to point out that I'm still alive :-).

For the last couple of months, I have also been a teaching assistant for the new edition of TU Delft's 'Concepts of programming languages' course, given by my supervisor Eelco Visser. In this course, students learn a couple of new programming languages, which are conceptually different from the first programming language they have learned at TU Delft (which is Java). The programming languages covered in this course are:


  • Scala, a multi-paradigm programming language integrating object oriented and functional programming language features. In this course, students use this language to discover functional programming, functional objects, traits and pattern matching.
  • The C programming language is used to teach students low-level principles, such as pointers and stack and heap based memory allocation.
  • JavaScript is used as a different object oriented language, which uses prototypes instead of classes for inheritance. Moreover, students also explore the underlying structure of objects in JavaScript, which are represented as associative arrays with prototypes.

Besides learning new programming languages using different concepts, the lab assignments are also about programming languages themselves. A simple language based on the lambda calculus (which itself is a way to demonstrate programming language fundamentals) is used as a case study for the exercises.

In the final exercises of the lab, students have to implement an evaluator for the lambda calculus language in Scala, "translate" an object oriented Java program into a C program using structs and dispatch tables for virtual method lookup, and port the lambda calculus evaluator to JavaScript, so that it can be invoked through a web page in a web browser.

During my period as a PhD student I have had many discussions about programming languages, with colleagues and with peers during conferences. One of the things I have discussed the most is what language students should learn in the introductory programming course.

Some peers from a German university wanted students to learn C# instead of Java, which is normally taught at their university (because C# is the programming language they used for their research projects, and they argued that they wouldn't attract any master students otherwise, because those students were not familiar with it). I also had discussions with peers about using a minimalistic programming language such as Lisp, because these languages are less complicated.

In my opinion, it doesn't matter that much which programming language is chosen as an introduction. I think it's probably a better idea to pick a mainstream language which is not too complicated than a complex language that nobody uses.

The way I see it, the main purpose of the introductory programming courses is to teach people to deal with complexity. To achieve that goal, they need to understand a programming language and they have to learn to divide a big problem into smaller sub-problems and translate those into concepts which can be implemented in that language.

Moreover, I also find it important that software engineers are flexible with regard to programming languages, for the following reasons:

  • Once students start working in a software engineering project (e.g. in industry), they may have to use a programming language they are not familiar with.
  • No programming language is perfect for every job. By studying multiple programming languages you can make decisions about what language is the best solution for a particular task. For instance, for web programming JavaScript is in many cases a good choice, and for embedded systems programming C may be a better choice.

    There are many factors influencing a programming language choice, such as maintenance, portability (of the language itself and the underlying platform) and efficiency.
  • By studying multiple programming language concepts, it becomes easier to learn other languages. Programming languages have differences, but also many similarities (e.g. compare the syntax of C, C++, Java and JavaScript or the structure of if-statements). Furthermore, languages also adopt concepts from each other, e.g. closures.
  • You may have to develop a programming language or domain specific language (DSL) yourself. For these tasks, knowledge about language concepts is crucial.

For all the reasons mentioned above, I think that having a 'Concepts of programming languages' course is essential for a software engineer. It also "solves" the problem of deciding which language is crucial for a student to learn: you have to make them familiar with multiple concepts.

Monday, May 2, 2011

Deployment abstractions for WebDSL

Four years ago, I joined the Software Engineering Research Group as a master's student to work on a research assignment (which served as a preparation for my master's thesis about the initial prototype of Disnix). This is how I met my supervisor Eelco Visser and many of the colleagues I'm working with. Back then, my research was about the deployment aspects of WebDSL, a domain-specific language for developing web applications with a rich data model, using Nix tooling.


Although I'm not a primary developer of WebDSL, I am involved in its deployment-related aspects. In this blog post I'd like to give some general information about WebDSL and some ideas I'm currently working on as part of my research.

WebDSL is a Domain Specific Language (DSL) for web applications with a rich data model, started by Eelco Visser. While developing applications, people often make abstractions. This is done to deal with repetition and patterns which look similar and would otherwise have to be implemented over and over again. It also allows developers to talk about concrete (high-level) objects from their domain and prevents them from reinventing the wheel.

In many cases, programming language features are used to make abstractions, such as inheritance in an object-oriented programming language. These conventional means (e.g. methods and classes) are not always sufficient. Sometimes libraries or frameworks have awkward interfaces. Some parts of a program, such as SQL queries embedded in strings, are not checked by the compiler, even though that may be desirable. In many cases, there is a considerable distance between the programming language concepts and the domain concepts. By developing a DSL you can deal with such issues and provide better abstractions for a particular domain. The definition of a DSL is as follows (according to [1]):

A DSL is a high-level software implementation language that supports concepts and abstractions that are related to a particular (application) domain.

WebDSL is a case study in DSL engineering. The web application domain is quite challenging: many web applications are implemented in various languages, each used for a specific concern, e.g. CSS for styling, HTML for web pages (which are sometimes generated on the server using a PHP script or a Java servlet), JavaScript for manipulating the DOM on the client, and so on. Moreover, sometimes a lot of boilerplate code has to be written, e.g. getters and setters in Java code.

WebDSL has a collection of integrated sub-languages for developing web applications, allowing a developer to specify various aspects of a web application. The WebDSL compiler generates all the implementation code for those aspects. Currently, WebDSL has sub-languages to describe a data model capturing entities and their relationships, pages, access control and workflows.

Although building abstractions for the web is not really new (there are many frameworks available providing similar concepts), WebDSL has some unique aspects, such as the fact that it is a statically typed and checked language and that it uses a custom syntax, which can be transformed to various back-ends.


WebDSL also has a number of sister projects. Spoofax is an Eclipse based platform for developing textual domain-specific languages with full-featured Eclipse editor plugins. Spoofax itself is based on SDF2 and Stratego, which are used to implement the WebDSL grammars and compiler. Spoofax is used to provide an Eclipse IDE for WebDSL (shown in the screenshot above). As a sidenote, Spoofax is also used for other DSLs, such as Mobl, a DSL for mobile web applications and even for an IDE for Nix expressions (which is still highly experimental). Another sister project is Acoda, which can be used for automated coupled evolution of data models in WebDSL.

Although a high-level DSL for web applications and all its related features seems very nice, it will not solve all web application issues. Besides developing a web application, it must also be deployed in a test environment or production environment, which is also quite challenging.

In order to make a WebDSL application ready for use, the WebDSL compiler itself and all its dependencies (such as SDF2 and Stratego) must be built and present. Apart from the compiler, we must also set up the infrastructure capable of running a WebDSL application. For example, for the Java back-end, you need to install a MySQL DBMS instance and an Apache Tomcat server (or another Java servlet container), configured with the right properties to host the generated Java web application and the underlying data model.
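
On NixOS, such an infrastructure can in principle be described declaratively. The sketch below gives an impression of what enabling the required services in a NixOS configuration might look like; the exact option names and the webapps path are assumptions and depend on the NixOS version:

{pkgs, ...}:

{
  services.mysql = {
    enable = true;
    package = pkgs.mysql;
  };

  services.tomcat = {
    enable = true;
    # A generated WebDSL web application could be deployed by adding it here
    webapps = [ /path/to/generated/webapp ];
  };
}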

Moreover, production environments, which may serve many users, require other facilities, such as multiple instances of Apache Tomcat and MySQL for load balancing (or as fallbacks), a reverse proxy, and so on. Furthermore, components of web applications are not always deployed on physical machines; they may also be deployed in virtual machines for testing or in a cloud environment such as Amazon EC2, which have different characteristics and constraints.

Currently, apart from a prototype that deploys a WebDSL application on a single physical machine, there are no facilities to deal with complex networks of machines and cloud providers using virtual machines, and their deployment processes. So in order to get a WebDSL application working in a complex distributed environment, deployment steps must be performed manually by a developer or system administrator, which can be quite a burden.

I'm currently working on a solution that allows developers to deploy WebDSL applications in a custom infrastructure (consisting of physical machines and/or virtual machines) using simple high-level specifications. The tooling builds upon our earlier work on Nix, Disnix and NixOS.

References


The WebDSL researchr page gives some references to published WebDSL papers. [1] refers to the GTTSE paper titled 'WebDSL: A Case Study in Domain-Specific Language Engineering', written by Eelco Visser.