Sander van der Burg's blog: October 2011

Monday, October 31, 2011

Deploying .NET services with Disnix

In two earlier blog posts, I have explained how the Nix package manager can be used to deploy .NET software. I'm happy to report that I have extended these possibilities to Disnix.

With these new features it is possible to develop databases for Microsoft SQL server, implement WCF web services and ASP.NET web applications (which may be inter-connected to each other) and to automatically and reliably deploy them in a network of machines using Disnix.

Modifications to Disnix

The modifications I had to make to Disnix were relatively minor. Disnix can already be compiled on Cygwin, just as the Nix package manager. Since Disnix is built on top of Nix, it reuses the Nix functions I have developed in earlier blog posts to build Visual Studio projects.

The only missing piece in the deployment process is the activation and deactivation of Microsoft SQL server databases and ASP.NET web applications on Internet Information Services (IIS), for which activation scripts must be developed. Luckily, Microsoft SQL server and IIS have command-line tools that can be scripted to do this job.

To support these new types of services, I have developed the following activation scripts:

mssql-database. This activation script loads a schema on initial startup if the database does not exist. It uses the OSQL.EXE tool included with Microsoft SQL server to execute SQL instructions from shell scripts, which check whether the database exists and create the tables if needed.
iis-webapplication. This activation script activates or deactivates a web application on Microsoft Internet Information Services. It uses the MSDeploy.exe tool to perform both operations automatically.

These activation scripts are automatically used by assigning a mssql-database or iis-webapplication type to a service in the Disnix services model.
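
To illustrate the "load the schema only if the database does not exist yet" behaviour described above, here is a minimal sketch of such a script. It is not the actual Disnix activation script: the function names, the activate/deactivate argument convention and the no-op deactivation are assumptions, and the calls to OSQL.EXE are replaced by stand-ins that keep one file per "database" in a directory, so that the control flow can be tried on any machine.

```shell
#!/bin/bash
# Hypothetical sketch of an idempotent activation script (not the real one).
# STATE_DIR stands in for the SQL server: one file per "database".
STATE_DIR="${STATE_DIR:-/tmp/fake-mssql}"

database_exists() {
    # Stand-in for a check that the real script performs through OSQL.EXE
    [ -e "$STATE_DIR/$1" ]
}

load_schema() {
    # Stand-in for executing the SQL instructions of the schema
    mkdir -p "$STATE_DIR"
    cp "$2" "$STATE_DIR/$1"
    echo "created $1"
}

run_activation() {
    local action="$1" name="$2" schema="$3"

    case "$action" in
        activate)
            # Only load the schema on the initial activation
            if database_exists "$name"; then
                echo "skipped $name"
            else
                load_schema "$name" "$schema"
            fi
            ;;
        deactivate)
            # Assumption in this sketch: deactivation leaves the data alone
            echo "deactivated $name"
            ;;
        *)
            echo "usage: run_activation activate|deactivate NAME SCHEMA" >&2
            return 1
            ;;
    esac
}
```

Calling run_activation activate zipcodes schema.sql twice creates the database the first time and skips it the second time, which is what makes it safe to run on every deployment.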

Installing Disnix on Windows

It is important to know how to get Disnix working on Windows and how to enable support for .NET applications. Most of the installation steps on Windows/Cygwin are the same as on UNIX systems. Details of the Disnix installation process can be found in the Disnix documentation.

However, there are a number of details that must be taken care of, which are not described in the manual (yet). Furthermore, the best way to get Disnix working is by compiling it from source so that all the required optional features are enabled.

In order to enable the mssql-database and iis-webapplication activation types, you must first manually install SQL server and IIS on your Windows system:

I have used the following instructions to install Microsoft SQL server: http://visualcsharptutorials.com/2011/05/installing-sql-server-2008-express. It is important to configure a SQL server user account with administration rights, a.k.a. the 'sa' user, which must use SQL server authentication. More information about this can be found here: http://www.eukhost.com/forums/f15/login-failed-user-sa-microsoft-sql-server-error-18456-a-12544/
Microsoft IIS was already installed on my machine, but perhaps this page can act as a reference: http://learn.iis.net/page.aspx/28/installing-iis-on-windows-vista-and-windows-7. It is important to include the Web Deployment tool, which is not enabled by default.

Moreover, the configure script of the disnix-activation-scripts package must be able to find the OSQL.EXE and MSDeploy command-line tools, which must be in your PATH. Otherwise, the activation types that we need are disabled and we cannot deploy .NET applications.

On my Windows 7 machine, OSQL can be found in: C:\Program Files\Microsoft SQL Server\100\Tools\Binn and MSDeploy in: C:\Program Files\IIS\Microsoft Web Deploy. I have included a screenshot above, which shows you what the output of the configure script should look like. As you can see, the configure script was able to detect the locations of the command-line tools, because I have adapted the PATH environment variable.
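
As a rough sketch of how the PATH could be adapted in a Cygwin shell before running the configure script: the two directories are the ones mentioned above, written in Cygwin's /cygdrive/c notation. The check_tools helper is hypothetical and only added here to verify the result; it also assumes that Cygwin resolves the tool names without the .exe suffix and regardless of case.

```shell
#!/bin/bash
# Append the tool directories from my Windows 7 machine to the PATH.
# Cygwin exposes C:\ as /cygdrive/c; adjust the paths to your installation.
export PATH="$PATH:/cygdrive/c/Program Files/Microsoft SQL Server/100/Tools/Binn:/cygdrive/c/Program Files/IIS/Microsoft Web Deploy"

# Hypothetical helper: report for each given tool whether it is in the PATH.
# Returns a non-zero exit status if at least one tool is missing.
check_tools() {
    local missing=0 tool

    for tool in "$@"; do
        if command -v "$tool" > /dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done

    return "$missing"
}
```

Running check_tools osql msdeploy before ./configure gives an early hint whether the mssql-database and iis-webapplication activation types are going to be enabled.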

Running the Disnix daemon

We also need to run the Disnix daemon on every machine in the network, so that we can remotely deploy the services we want. Probably the best way to get Cygwin services running is by using the cygrunsrv command, which runs Cygwin programs as Windows services.

Since the core Disnix daemon is a D-Bus service, we need to run the D-Bus system daemon, which can be configured by typing:

$ cygrunsrv -I dbus -p /usr/sbin/dbus-daemon.exe -a \
    '--system --nofork'

The Disnix service can be configured by typing:

$ cygrunsrv -I disnix -p /usr/local/bin/disnix-service.exe -a \
  '--activation-modules-dir /usr/local/libexec/disnix/activation-scripts' \
  -e 'PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin' \
  -y dbus

Disnix also needs to be reachable remotely. Disnix can use various interfaces, but the recommended interface is SSH. In order to connect through SSH, you also need to configure an SSH server. This can be done by executing the following script on Cygwin:

$ ssh-host-config

And you probably need to configure some SSH keys as well, to prevent Disnix asking for passwords for each operation. Check the OpenSSH documentation for more information.

After configuring the services, you probably need to activate them for the first time, which can be done through the Windows service manager (Control Panel -> System and Security -> Administrative Tools -> Services). You need to pick the Disnix service and select the start option. If you want to use the SSH server, you need to pick and start the 'CYGWIN sshd' service as well. A screenshot is included above.
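
Instead of clicking through the service manager, the services can presumably also be started from a Cygwin shell with the -S (start) option of cygrunsrv. The sketch below is an assumption on my side and not something I have described in the steps above: dbus and disnix are the service names used in the install commands, any additional names (such as sshd, the name under which the SSH server is typically registered) are passed as arguments, and the CYGRUNSRV variable is a hypothetical hook to replace the command for a dry run.

```shell
#!/bin/bash
# Start the D-Bus and Disnix services, followed by any extra services
# given as arguments. D-Bus goes first, because Disnix depends on it.
start_services() {
    # CYGRUNSRV is a hypothetical override, e.g. CYGRUNSRV=echo for a dry run
    local runner="${CYGRUNSRV:-cygrunsrv}" svc

    for svc in dbus disnix "$@"; do
        "$runner" -S "$svc" || return 1
    done
}
```

For example, start_services sshd starts all three services; this most likely requires a shell with administrator rights.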

Example case

Now that I have explained how Disnix can be installed and configured on Windows, we would probably also like to see what its capabilities are. As an example case, I have ported the StaffTracker, a motivating example in our WASDeTT paper, from Java to .NET technology, using C# as an implementation language, ADO.NET as database manager, WCF to implement web services, and ASP.NET to create the web application front-end.

It was a nice opportunity to learn some of these technologies. I have to admit that Visual Studio 2010 was a very convenient development environment and it didn't take much time for me to port the example case. Although I was impressed by this, I currently have no plans to write any software using .NET technology except for this example case. (Perhaps I will port it to Mono as an experiment some day).

Distributed deployment

In order to make our example deployable through Disnix, I had to write Disnix expressions for each service component and I had to write a services, infrastructure and distribution model. A Disnix expression for a WCF web service looks like this:

{dotnetenv}:
{zipcodes}:

dotnetenv.buildSolution {
  name = "ZipcodeService";
  src = ../../../../services/webservices/ZipcodeService;
  baseDir = "ZipcodeService";
  slnFile = "ZipcodeService.csproj";
  targets = "Package";
  preBuild = ''
    # Rewrite the connection string in place (-i); the dot and backslash are escaped to match literally
    sed -i -e 's|\.\\SQLEXPRESS|${zipcodes.target.hostname}\\SQLEXPRESS|' \
        -e 's|Initial Catalog=zipcodes|Initial catalog=${zipcodes.name}|' \
        -e 's|User ID=sa|User ID=${zipcodes.target.msSqlUsername}|' \
        -e 's|Password=admin123$|Password=${zipcodes.target.msSqlPassword}|' \
        Web.config
  '';
}

As you may notice, the expression above looks similar to an ordinary Nix expression building a Visual Studio project, except that it uses the inter-dependency parameter (zipcodes) to configure a database connection string defined in the Web.config file, so that the web service can connect to its database back-end. More information about setting a connection string through this configuration file can be found here: http://weblogs.asp.net/owscott/archive/2005/08/26/Using-connection-strings-from-web.config-in-ASP.NET-v2.0.aspx.

By using the Disnix expressions in conjunction with the services, infrastructure and distribution models, the .NET example can be deployed in a network of machines with a single command line instruction:

$ disnix-env -s services.nix -i infrastructure.nix -d distribution.nix

Below, I have included a screenshot showing a Disnix deployment scenario with our .NET StaffTracker example. In this screenshot, you can see a console showing Disnix output, a web browser displaying the entry page of the web application front-end, the IIS manager showing the deployed WCF web services and the ASP.NET web front-end, and the SQL server management studio showing a number of deployed databases. The complete system is deployed using a single command-line instruction.

Limitations

In this blog post I have shown how service-oriented .NET applications can be automatically deployed with Disnix. There are several slight inconveniences however:

Disnix is used to manage the service components of a system. Disnix does not deploy infrastructure components, such as a web server or database server. DisnixOS is a NixOS-based extension that takes care of this. However, DisnixOS cannot be used on Windows, because SQL server and IIS are tightly integrated into the Windows operating system and registry. We cannot use the Nix store to safely isolate them. You need to either install these infrastructure components manually or use other deployment solutions.
As mentioned in our previous blog post about .NET deployment, the .NET framework needs to be installed manually in order to be able to build Visual Studio projects. On most Windows installations, however, the .NET framework is already included.
The .NET build functions and activation scripts are quite new and not very well tested. Moreover, they could also break, because we currently have no way to automatically test them like we do with Linux software.
Also, not all desired deployment features may be supported. I'd like to have feedback on this :-)

References

To use the .NET deployment features described in this blog post, you need to obtain the latest pre-release of Disnix, which can be downloaded from the Disnix Hydra project page.
The StaffTracker .NET example case can also be downloaded from the Disnix Hydra project page and can be freely used under the MIT license.
A description of the StaffTracker example case can be found in our WASDeTT paper, titled: 'Disnix: A toolset for distributed deployment' which can be obtained from my publications page.
I have used the following web page as a reference for writing WCF web services: http://www.codeproject.com/KB/WCF/first_WCF_Service.aspx
The following page is used as a reference for ADO.NET: http://www.csharp-station.com/Tutorials/AdoDotNet/Lesson02.aspx
The W3Schools tutorial about ASP.NET was sufficient for me to understand it.
I have recently presented this subject at Philips. The slides can be found on my talks page.

Thursday, October 27, 2011

Software deployment complexity

In this blog post, I'd like to talk about the software deployment discipline in general. In my career as a PhD student and while visiting academic conferences, I have noticed that software deployment is not (and has never been) a very popular research subject within the software engineering community.

Furthermore, I have encountered many misconceptions about what software deployment is supposed to mean, and some people are even surprised that people do research in this field. I have also received some vague complaints from certain reviewers saying that the things we do aren't novel, and comments such as: "hmm, what can the software engineering community learn from this? I don't see the point..." and "this is not a research paper".

What is software deployment?

So what actually is software deployment? One of the first software deployment papers in academic research, by Carzaniga et al. [1], describes this discipline as follows:

Software deployment refers to all the activities that make a software system
available for use

Some of the activities that may be required to make a software system available for use are:

Building software components from source code
Packaging software
Transferring the software from the producer site to consumer site
Installation of the software system
Activation of the software system
Software upgrades

An important thing to point out is that the activities described above are all steps to make a software system available for use. I have noticed that many people mistakenly think that software deployment is just the installation of a system, which is not true.

Essentially, the point of software deployment is that a particular software system is developed with certain goals, features and behaviour in mind by the developers. Once this software system is to be used by end-users, it typically has to be made available for use in the consumer environment. It is important that the software system behaves exactly the way the developers have intended. It turns out that, for many reasons, this process has become very complicated nowadays and it is also very difficult to give any guarantees that a software system operates as intended. In some cases, systems may not work at all.

Relationship to software engineering

So what does software deployment have to do with software engineering? According to [2], software engineering can be defined as:

Software Engineering (SE) is the application of a systematic, disciplined, quantifiable approach to the development, operation, and maintenance of software, and the study of these approaches; that is, the application of engineering to software.

Within the software engineering research community, we investigate techniques to improve and/or study software development processes. Typically, the deployment step is the last phase in a software development project, when the development of a software system is completed and the system is ready to be made available to end-users.

In traditional waterfall-style software development projects, the deployment phase is not performed very frequently. Nowadays most software development projects are iterative: the features of the software are extended and improved in cycles, and for each cycle the system has to be redeployed. Especially in Agile software projects, which have short iterations (of about 2 weeks), it is crucial to be able to deploy a system easily.

Because of the way we develop software nowadays, the deployment process has become much more of a burden and that's why it is also important to have systematic, disciplined, quantifiable approaches for software deployment.

Apart from delivering systems to end-users, we also need to deploy a system to test it. In order to run a test suite, all necessary environmental dependencies must be present and correct. Without a reliable and reproducible deployment process, testing becomes a burden and it is difficult to guarantee that tests succeed in all circumstances.

Why is software deployment complicated?

Back in the old days, software was developed for a specific machine (or hardware architecture), stored on a disk/tape and delivered to the customer. Then the customer loaded the program from the tape/disk into memory and was able to run the program. Apart from the operating system, all the required parts of the program were stored on the disk. Basically, my good old Commodore 64/128 worked like this. All software was made available on either cassette tapes or 5.25 inch floppy disks. Apart from the operating system and BASIC interpreter (which were stored in the ROM of the Commodore) everything that was required to run a program was available on the disk.

Some time later, component based software engineering (CBSE) was introduced and received wide acceptance. The advantages of CBSE were that software components can be obtained from third parties without having to develop those yourself and that components with the same or similar functionality can be shared and reused across multiple programs. CBSE greatly improved the quality of software and the productivity of developers. As a consequence, software products were no longer delivered as self-contained products, but became dependent on the components already residing on the target systems.

Although CBSE provides a number of advantages, it also introduced additional complexity and challenges. In order to be able to run a software program, all dependencies must be present and correct and the program must be able to find them. There are all kinds of things that could go wrong while deploying a system. A dependency may be missing, or a program may require a newer version of a specific component. Also, newer components may be incompatible with a program (sometimes this is intentional, but it can also happen accidentally, for example when a program relies on a bug that has since been fixed).

For example, in Microsoft Windows (but also on other platforms) this led to a phenomenon called the DLL hell. Apart from Windows DLLs, this phenomenon occurs in many other contexts as well, such as the JAR hell for Java programs. Even the good old AmigaOS suffered from the same weakness, although the problems were not as severe as they are now, because the versions of libraries didn't change that frequently.

In UNIX-like systems, such as Linux, you will notice that the degree of sharing of components through libraries is raised to almost a maximum. For these kinds of systems, it is crucial to have deployment tooling to properly manage the packages installed on a system. In Linux distributions the package manager is a key aspect and also a distinct feature that sets a particular distribution apart from another. There are many package managers around, such as RPM, dpkg, portage, pacman, and Nix (which we use in our research as a basis for NixOS).

Apart from the challenges of deploying a system from scratch, many systems are also upgraded, because (in most cases) it's too costly and time consuming to deploy them over and over again for each change. In most cases upgrading is a risky process, because files get modified and overwritten. An interruption or crash during an upgrade phase may have disastrous results. Also, an upgrade may not always give the same results as a fresh installation of a system.

Importance of software deployment

So why is research in software deployment important?

First of all, (not surprisingly) software systems become bigger and increasingly complex. Nowadays, some software systems are not only composed of many components, but these components are also distributed and deployed on various machines in a network, working together to achieve a common goal. For example, service-oriented systems are composed this way. Deploying these kinds of systems manually is a very time consuming, complex, error prone and tedious process. The bigger the system gets, the more likely it becomes that an error occurs.
We have to be more flexible in reacting to events. For example, in a cloud infrastructure, if a machine breaks, we must be able to redeploy the system in such a way that services are still available to end-users, limiting the impact as much as possible.
We want to push changes to a system in a production environment faster. Because systems become increasingly complex, an automated deployment solution is essential. In Agile software development projects, a team wants to generate value as quickly as possible, for which it is essential to have a working system in a production environment as soon as possible. To achieve this goal, it is crucial that the deployment process can be performed without much trouble. A colleague of mine (Rini van Solingen), who is also a Scrum consultant, has covered this importance in a video blog interview.

Research

What are software deployment research subjects?

Mechanics. This field concerns the execution of the deployment activities. How can we make these steps reproducible, reliable, efficient? Most of the research that I do covers deployment mechanics.
Deployment planning. Where to place a component in a network of machines? How to compose components together?
Empirical research covering various aspects of deployment activities, such as: How to quantify build maintenance effort? How much maintenance is needed to keep deployment specifications (such as build specifications) up to date?

Where are software deployment papers published? Currently, there is no subfield conference about software deployment. In the past (a few years before I started my research), there were three editions of the Working Conference on Component Deployment, which has not been held since 2005.

Most of the deployment papers are published in various conferences, such as the top general ones and subfield conferences about software maintenance, testing, and cloud computing. The challenging part of this is that (depending on the subject) I have to adapt my story to the conference where I want my paper to be published. This requires me to explain the same problems over and over again and integrate these problems with the given problem domain, such as cloud computing or testing. This is not always trivial to do, nor will every reviewer understand what the point is.

Conclusion

In this blog post, I have explained what software deployment is about and why research in this field is important. Systems are becoming much bigger and more complicated and we want to respond to changes faster. In order to manage this complexity, we need research in providing automated deployment solutions.

References

[1] Carzaniga et al., A Characterization Framework for Software Deployment Technologies, 1998
[2] The Joint Task Force on Computing Curricula, Software Engineering 2004