Monday, August 22, 2016

An extended self-adaptive deployment framework for service-oriented systems


Five years ago, while I was still in academia, I built an extension framework around Disnix (named: Dynamic Disnix) that enables self-adaptive redeployment of service-oriented systems. It was an interesting application as it demonstrated the full potential of service-oriented systems having their deployment processes automated with Disnix.

Moreover, the corresponding research paper was accepted for presentation at the SEAMS 2011 symposium (co-located with ICSE 2011) in Honolulu (Hawaii), which was (obviously!) a nice place to visit. :-)

Disnix's development progressed at a slow pace for a while after I left academia, but since the end of 2014 I have made some significant improvements. In contrast to the basic toolset, I did not improve Dynamic Disnix -- apart from the addition of a port assigner tool, I only kept the implementation in sync with Disnix's API changes to prevent it from breaking.

Recently, I have used Dynamic Disnix to give a couple of demos. As a result, I have improved some of its aspects a bit. For example, some basic documentation has been added. Furthermore, I have extended the framework's architecture to take a couple of new deployment planning aspects into account.

Disnix


For readers unfamiliar with Disnix: the primary purpose of the basic Disnix toolset is executing deployment processes of service-oriented systems. Deployments are driven by three kinds of declarative specifications:

  • The services model captures the services (distributed units of deployments) of which a system consists, their build/configuration properties and their inter-dependencies (dependencies on other services that may have to be reached through a network link).
  • The infrastructure model describes the target machines where services can be deployed to and their characteristics.
  • The distribution model maps services in the services model to machines in the infrastructure model.

By writing instances of the above specifications and running disnix-env:

$ disnix-env -s services.nix -i infrastructure.nix -d distribution.nix

Disnix executes all activities to get the system deployed, such as building their services from source code, distributing them to the target machines in the network and activating them. Changing any of these models and running disnix-env again causes the system to be upgraded. In case of an upgrade, Disnix will only execute the required activities making the process more efficient than deploying a system from scratch.
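
To sketch what these models look like, consider the following minimal example (the service and machine names are hypothetical, and customPkgs refers to a local package composition expression that provides the builds of the services):

# services.nix (sketch): two services, one with an inter-dependency on the other
{distribution, invDistribution, system, pkgs}:

let
  customPkgs = import ./custom-packages.nix { inherit pkgs system; };
in
rec {
  db = {
    name = "db";
    pkg = customPkgs.db;
    dependsOn = {};
    type = "mysql-database";
  };

  webapp = {
    name = "webapp";
    pkg = customPkgs.webapp;
    dependsOn = { inherit db; };
    type = "tomcat-webapplication";
  };
}

# infrastructure.nix (sketch): two target machines
{
  test1.properties.hostname = "test1";
  test2.properties.hostname = "test2";
}

# distribution.nix (sketch): maps each service to one or more machines
{infrastructure}:

{
  db = [ infrastructure.test1 ];
  webapp = [ infrastructure.test2 ];
}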

"Static" Disnix


So, what makes Disnix's deployment approach static? When looking at software systems from a very abstract point of view, they are supposed to meet a collection of functional and non-functional requirements. A change in a network of machines affects the ability of a service-oriented system to meet them, as the services of which these systems consist are typically distributed.

If a system relies on a critical component that has only one instance deployed and the machine that hosts it crashes, the functional requirements can no longer be met. However, even if we have multiple instances of the same component, giving better guarantees that no functional requirements will be broken, important non-functional requirements may be affected, such as the responsiveness of a system.

We may also want to optimize a system's non-functional properties, such as its responsiveness, by adding more machines to the network that offer more system resources, or by changing the configuration of an existing machine, e.g. upgrading the amount of available RAM.

The basic Disnix toolset is considered static, because all these events require manual modifications to the Disnix models for redeployment, so that a system can meet its requirements under the changed conditions.

For simple systems, manual reconfiguration is still doable, but with one hundred services, one hundred machines or a high frequency of events (or a combination of the three), it becomes too complex and time consuming.

For example, when a machine has been added or removed, we must rewrite the distribution model in such a way that all services are deployed to at least one machine and that none of them are mapped to machines that are not capable or allowed to host them. Furthermore, with microservices (one of their traits is that they typically embed HTTP servers), we must typically bind them to unique TCP ports that do not conflict with system services or other services deployed by Disnix. None of these configuration aspects are trivial for large service-oriented systems.

Dynamic Disnix


Dynamic Disnix extends Disnix's architecture with additional models and tools to cope with the dynamism of service-oriented systems. In the latest version, I have extended its architecture (which is based on the old architecture described in the SEAMS 2011 paper and the corresponding blog post):


The above diagram shows the structure of the dydisnix-self-adapt tool. The ovals denote command-line utilities, the rectangles denote files and the arrows denote files as inputs or outputs. As with the basic Disnix toolset, dydisnix-self-adapt is composed of command-line utilities each being responsible for executing an individual deployment activity:

  • On the top right, the infrastructure generator is shown that captures the configurations of the machines in the network and generates an infrastructure model from it. Currently, two different kinds of generators can be used: disnix-capture-infra (included with the basic toolset) that uses a bootstrap infrastructure model with connectivity settings, or dydisnix-geninfra-avahi that uses multicast DNS (through Avahi) to retrieve the machines' properties.
  • dydisnix-augment-infra is responsible for augmenting the generated infrastructure model with additional settings, such as passwords. It is typically undesired to automatically publish privacy-sensitive settings over a network using insecure connection protocols.
  • disnix-snapshot can be optionally used to preemptively capture the state of all stateful services (services with property: deployState = true; in the services model) so that the state of these services can be restored if a machine crashes or disappears. This tool is new in the extended architecture.
  • dydisnix-gendist generates a mapping of services to machines based on technical and non-functional properties defined in the services and infrastructure models.
  • dydisnix-port-assign assigns unique TCP port numbers to previously undeployed services and retains assigned TCP ports in a previous deployment for optimization purposes. This tool is new in the extended architecture.
  • disnix-env redeploys the system with the (statically) provided services model and the dynamically generated infrastructure and distribution models.

An example usage scenario


When a system has been configured to be (statically) deployed with Disnix (such as the infamous StaffTracker example cases that come in several variants), we need to add a few additional deployment specifications to make it dynamically deployable.

Auto discovering the infrastructure model


First, we must configure the machines in such a way that they publish their own configurations. The basic toolset comes with a primitive solution called: disnix-capture-infra that does not require any additional configuration -- it consults the Disnix service that is installed on every target machine.

By providing a simple bootstrap infrastructure model (e.g. infrastructure-bootstrap.nix) that only provides connectivity settings:

{
  test1.properties.hostname = "test1";
  test2.properties.hostname = "test2";
}

and running disnix-capture-infra, we can obtain the machines' configuration properties:

$ disnix-capture-infra infrastructure-bootstrap.nix

By setting the following environment variable, we can configure Dynamic Disnix to use the above command to capture the machines' infrastructure properties:

$ export DYDISNIX_GENINFRA="disnix-capture-infra infrastructure-bootstrap.nix"

Alternatively, there is the Dynamic Disnix Avahi publisher that is more powerful, but at the same time much more experimental and unstable than disnix-capture-infra.

When using Avahi, each machine uses multicast DNS (mDNS) to publish its own configuration properties. As a result, no bootstrap infrastructure model is needed. Simply gathering the data published by the machines on the same subnet suffices.

When using NixOS on a target machine, the Avahi publisher can be enabled by cloning the dydisnix-avahi Git repository and adding the following lines to /etc/nixos/configuration.nix:

imports = [ /home/sander/dydisnix/dydisnix-module.nix ];
services.dydisnixAvahiTest.enable = true;

To allow the coordinator machine to capture the configurations that the target machines publish, we must enable the Avahi system service. In NixOS, this can be done by adding the following lines to /etc/nixos/configuration.nix:

services.avahi.enable = true;

When running the following command-line instruction, the machines' configurations can be captured:

$ dydisnix-geninfra-avahi

Likewise, when setting the following environment variable:

$ export DYDISNIX_GENINFRA=dydisnix-geninfra-avahi

Dynamic Disnix uses the Avahi-discovery service to obtain an infrastructure model.

Writing an augmentation model


The Java version of StaffTracker, for example, uses MySQL to store data. Typically, it is undesired to publish the authentication credentials over the network (in particular with mDNS, which is quite insecure). We can add these properties to the captured infrastructure model with the following augmentation model (augment.nix):

{infrastructure, lib}:

lib.mapAttrs (targetName: target:
  target // (if target ? containers && target.containers ? mysql-database then {
    containers = target.containers // {
      mysql-database = target.containers.mysql-database //
        { mysqlUsername = "root";
          mysqlPassword = "secret";
        };
    };
  } else {})
) infrastructure

The above model implements a very simple password policy, by iterating over each target machine in the discovered infrastructure model and adding the same mysqlUsername and mysqlPassword property when it encounters a MySQL container service.

Mapping services to machines


In addition to a services model and a dynamically generated (and optionally augmented) infrastructure model, we must map each service to a machine in the network using a configured strategy. A strategy can be programmed in a QoS model, such as:

{ services
, infrastructure
, initialDistribution
, previousDistribution
, filters
, lib
}:

let
  distribution1 = filters.mapAttrOnList {
    inherit services infrastructure;
    distribution = initialDistribution;
    serviceProperty = "type";
    targetPropertyList = "supportedTypes";
  };

  distribution2 = filters.divideRoundRobin {
    distribution = distribution1;
  };
in
distribution2

The above QoS model implements the following policy:

  • First, it takes the initialDistribution model that is a cartesian product of all services and machines. It filters the machines on the relationship between the type attribute and the list of supportedTypes. This ensures that services will only be mapped to machines that can host them. For example, a MySQL database should only be deployed to a machine that has a MySQL DBMS installed.
  • Second, it divides the services over the candidate machines using the round robin strategy. That is, it divides services over the candidate target machines in equal proportions and in circular order.
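
For example, given a hypothetical system with three services and two machines that are allowed to host them, the generated distribution model may end up looking like this (the names and the mapping are purely illustrative):

{infrastructure}:

{
  staff-database = [ infrastructure.test1 ];
  web-frontend = [ infrastructure.test2 ];
  zipcode-service = [ infrastructure.test1 ];
}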

Dynamically deploying a system


With the services model, augmentation model and QoS model, we can dynamically deploy the StaffTracker system (without manually specifying the target machines and their properties, and how to map the services to machines):

$ dydisnix-env -s services.nix -a augment.nix -q qos.nix

The Node.js variant of the StaffTracker example requires unique TCP ports for each web service and web application. By providing the --ports parameter we can include a port assignment specification that is internally managed by dydisnix-port-assign:

$ dydisnix-env -s services.nix -a augment.nix -q qos.nix --ports ports.nix

When providing the --ports parameter, the specification gets automatically updated when ports need to be reassigned.

Making a system self-adaptable from a deployment perspective


With dydisnix-self-adapt we can make a service-oriented system self-adaptable from a deployment perspective -- this tool continuously monitors the network for changes, and runs a redeployment when a change has been detected:

$ dydisnix-self-adapt -s services.nix -a augment.nix -q qos.nix

For example, when shutting down a machine in the network, you will notice that Dynamic Disnix automatically generates a new distribution and redeploys the system to get the missing services back.

Likewise, by adding the ports parameter, you can include port assignments as part of the deployment process:

$ dydisnix-self-adapt -s services.nix -a augment.nix -q qos.nix --ports ports.nix

By adding the --snapshot parameter, we can preemptively capture the state of all stateful services (services annotated with deployState = true; in the services model), such as the databases in which the records are stored. If a machine hosting databases disappears, Disnix can restore the state of the databases elsewhere:

$ dydisnix-self-adapt -s services.nix -a augment.nix -q qos.nix --snapshot

Keep in mind that this feature uses Disnix's snapshotting facilities, which may not be the best solution to manage state, in particular with large databases.

Conclusion


In this blog post, I have described an extended architecture of Dynamic Disnix. In comparison to the previous version, a port assigner has been added that automatically provides unique port numbers to services, as well as the disnix-snapshot utility that can preemptively capture the state of services so that they can be restored if a machine disappears from the network.

Despite the fact that Dynamic Disnix now has some basic documentation and other improvements from a usability perspective, it remains a very experimental prototype that should not be used for any production purposes. In contrast to the basic toolset, I have only used it for testing/demo purposes and I still have no real-life production experience with it. :-)

Moreover, I still have no plans to officially release it yet, as many aspects still need to be improved/optimized. For now, you have to obtain the Dynamic Disnix source code from GitHub and use the included release.nix expression to install it. Furthermore, you probably need a lot of courage. :-)

Finally, I have extended the Java and Node.js versions of the StaffTracker example as well as the virtual hosts example with simple augmentation and QoS models.

Wednesday, August 3, 2016

Porting node-simple-xmpp from the Node.js ecosystem to Titanium

As may have become obvious by reading some of my previous blog posts, I am frequently using JavaScript for a variety of programming purposes. Although JavaScript was originally conceived as a programming language for use in web browsers (mainly to make web pages more interactive), it is also becoming increasingly more popular in environments outside the browser, such as Node.js, a runtime environment for developing scalable network applications and Appcelerator Titanium, an environment to develop cross-platform mobile applications.

Apart from the fact that these environments share a common programming language -- JavaScript -- and a number of basic APIs that come with the language, they all have their own platform-specific APIs to implement most of an application's basic functionality.

Moreover, they have their own ecosystem of third-party software packages. For example, in Node.js the NPM package manager is the ubiquitous way of publishing and obtaining software. For web browsers, bower can be used, although its adoption is not as widespread as NPM.

Because of these differences, reuse between JavaScript environments is not optimal, in particular for packages that have dependencies on functionality that is not part of JavaScript's core API.

In this blog post, I will describe our experiences with porting the simple-xmpp library from the Node.js ecosystem to Titanium. This library has dependencies on Node.js-specific APIs, but with our porting strategy we were able to integrate it into our Titanium app without making any modifications to the original package.

Motivation: Adding chat functionality


Not so long ago, my colleagues and I were looking into extending our mobile app product-line with chat functionality. As described in an earlier blog post, we use Titanium for developing our mobile apps and one of my responsibilities is automating their builds with Nix (and Hydra, the Nix-based continuous integration server).

Developing chat functionality is quite complex, and requires one to think about many concerns, such as the user-experience, security, reliability and scalability. Instead of developing a chat infrastructure from scratch (which would be much too costly for us), we have decided to adopt the XMPP protocol, for the following reasons:

  • Open standard. Everyone is allowed to make software implementing aspects of the XMPP standard. There are multiple server implementations and many client libraries available, in many programming languages including JavaScript.
  • Decentralized. Users do not have to connect to a single server -- a server can relay messages to users connected to another server. A decentralized approach is good for reliability and scalability.
  • Web-based. The XMPP protocol is built on technologies that empower the web (XML and HTTP). The fact that HTTP is used as a transport protocol, means that we can also support clients that are behind a proxy server.
  • Mature. XMPP has been in use for quite some time and has some very prominent users. Currently, Sony uses it to enrich the PlayStation platform with chat functionality. In the past, it was also used as the basis for Google and Facebook's chat infrastructure.

Picking a server implementation was not too hard, as ejabberd was something I had experience with in my previous job (and as an intern at Philips Research) -- it supports all the XMPP features we need, and has proven to be very mature.

Unfortunately, for Titanium, there was no module available that implements XMPP client functionality, except for an abandoned project named titanium-xmpp that is no longer supported, and no longer seems to work with any recent versions of Titanium.

Finding a suitable library


As there was no working XMPP client module available for Titanium and we considered developing such a client from scratch too costly, we first tried to fix titanium-xmpp, but it turned out that too many things were broken. Moreover, it used all kinds of practices (such as an old-fashioned way of module loading through Ti.include()) that were deprecated a long time ago.

Then we tried porting other JavaScript-based libraries to Titanium. The first candidate was strophe.js, which is mainly browser-oriented (and can be used in Node.js through phantomjs, an environment providing a non-interactive web technology stack), but it had too many areas that had to be modified and too many browser-specific APIs that required substitutes.

Finally, I discovered node-xmpp, an XMPP framework for Node.js that has a very modular architecture. For example, the client and server aspects are very well separated, as is the XML parsing infrastructure. Moreover, we discovered simple-xmpp, a library built on top of it to make a number of common tasks easier to implement. Furthermore, the infrastructure has been ported to web browsers using browserify.

Browserify is an interesting porting tool -- its main purpose is to provide a replacement for the CommonJS module system, which is a first-class citizen in Node.js, but unavailable in the browser. Browserify statically analyzes closures of CommonJS modules, and packs them into a single JavaScript file so that the module system is no longer needed.

Furthermore, browserify provides browser-equivalent substitutes for many core Node.js APIs, such as events, stream and path, making it considerably easier to migrate software from Node.js to the browser.
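
For example, bundling a CommonJS entry module (a hypothetical main.js) for use in the browser is typically a matter of running:

$ browserify main.js -o bundle.js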

Porting simple-xmpp to Titanium


In addition to browserify, there also exists a similar approach for Titanium: titaniumifier, that has been built on top of the browserify architecture.

Similar to browserify, titaniumifier also packs a collection of CommonJS modules into a single JavaScript file. Moreover, it constructs a Titanium module from it and packs it into a Zip file that can be distributed to any Titanium developer. The module can be used by simply placing it into the root folder of the app project and adding the following module requirement to tiapp.xml:

<module platform="commonjs">ti-simple-xmpp</module>

Furthermore, it provides Titanium-equivalent substitute APIs for Node.js core APIs, but its library is considerably slimmer and less complete than browserify's.

We can easily apply titaniumifier to simple-xmpp, by creating a new NPM project (package.json file) that has a dependency on simple-xmpp:

{
  "name": "ti-simple-xmpp",
  "version": "0.0.1",
  "dependencies": {
    "simple-xmpp": "1.3.x"
  }
}

and a proxy CommonJS module (index.js) that simply exposes the Simple XMPP module:

exports.SimpleXMPP = require('simple-xmpp');

After installing the project dependencies (simple-xmpp only) with:

$ npm install

we can attempt to migrate it to Titanium, by running the following command-line instruction:

$ titaniumifier

In my first titaniumifier attempt, the tool reported that some mandatory Titanium-specific properties, such as a unique GUID identifier, were missing and needed to be added to package.json:

"titaniumManifest": {
    "guid": "76cb731c-5abf-3b79-6cde-f04202e9ea6d"
},

After adding the missing GUID property, a CommonJS Titanium module gets produced that we can integrate into any Titanium project we want:

$ titaniumifier
$ ls
ti-simple-xmpp-commonjs-0.0.1.zip

Fixing API mismatches


With our generated CommonJS package, we can start experimenting by creating a simple app that only connects to a remote XMPP server, by adding the following lines to a Titanium app's entry module (app.js):

var xmpp = require('ti-simple-xmpp').SimpleXMPP;

xmpp.connect({
    websocket: { url: 'ws://myserver.com:5280/websocket/' },
    jid : 'username@myserver.com',
    password : 'password',
    reconnect: true,
    preferred: 'PLAIN',
    skipPresence: false
});

In our first trial run, the app crashed with the following error message:

Object prototype may only be an Object or null

This problem seemed to be caused by the following line that constructs an object with a prototype:

ctor.prototype = Object.create(superCtor.prototype, {

After adding a couple of debugging statements in front of the Object.create() line that print the constructor and the prototype's properties, I noticed that it tries to instantiate a stream object (not a constructor function) without a prototype member. Referring to a prototype that is undefined is apparently not allowed.

Deeper inspection revealed the following code block:

/*<replacement>*/
var Stream;
(function() {
  try {
    Stream = require('st' + 'ream');
  } catch (_) {} finally {
    if(!Stream) Stream = require('events').EventEmitter;
  }
})();
/*</replacement>*/

The above code block attempts to load the stream module, and provides the event emitter as a fallback if it cannot be loaded. The stream string has been scrambled to prevent browserify from statically bundling the module. It appears that the titaniumifier tool provides a very simple substitute that is an object. As a result, it does not use the event emitter as a fallback.

We can easily fix the stream object's prototype problem by supplying it with an empty prototype property, through a module (overrides.js) that modifies it:

try {
    var stream = require('st' + 'ream');
    stream.prototype = {};
} catch(ex) {
    // Just ignore if it didn't work
}

and by importing the overrides module in the index module (index.js) before including simple-xmpp:

exports.overrides = require('./overrides');
exports.SimpleXMPP = require('simple-xmpp');

After fixing the prototype problem, the next trial run crashed the app with the following error message:

undefined is not an object (evaluation process.version.slice)

which seemed to be caused by the following line:

var asyncWrite = !process.browser && [ 'v0.10', 'v0.9.'].indexOf(process.version.slice(0, 5)) > -1 ? setImmediate : processNextTick;

Apparently, titaniumifier does not provide any substitute for process.version and as a result invoking the slice member throws an exception. Luckily, we can circumvent this by making sure that process.browser yields true, by adding the following line to the overrides module (overrides.js):

process.browser = true;

The third trial run crashed the app with the following message:

Can't find variable: XMLHttpRequest at ti-simple.xmpp.js (line 1943)

This error is caused by the fact that there is no XMLHttpRequest object -- an API that a web browser would normally provide. I have found a Titanium-based XHR implementation on GitHub that provides an identical API.

By copying the xhr.js file into our project and wrapping it in a module (XMLHttpRequest.js), we can provide a constructor that is identical to the browser API:

exports.XMLHttpRequest = require('./xhr');

global.XMLHttpRequest = module.exports;

By adding it to our index module:

exports.overrides = require('./overrides');
exports.XMLHttpRequest = require('./XMLHttpRequest');
exports.SimpleXMPP = require('simple-xmpp');

we have provided a substitute for the XMLHttpRequest API that is identical to the browser's.

In the fourth run, the app crashed with the following error message:

Can't find variable: window at ti-simple-xmpp.js (line 1789)

which seemed to be caused by the following line:

var WebSocket = require('faye-websocket') && require('faye-websocket').Client ? require('faye-websocket').Client : window.WebSocket

Apparently, there is neither a window object nor a WebSocket constructor, as they are browser-specific and not substituted by titaniumifier.

Fortunately, there seems to be a WebSocket module for Titanium that works both on iOS and Android. The only inconvenience is that its API is similar to, but not exactly identical to, the browser's WebSocket API. For example, creating a WebSocket in the browser is done as follows:

var ws = new WebSocket("ws://localhost/websocket");

whereas with the TiWS module, it must be done as follows:

var tiws = require("net.iamyellow.tiws");

var ws = tiws.open("ws://localhost/websocket");

These differences make it very tempting to manually fix the converted simple XMPP library, but fortunately we can create an adapter that has an identical interface to the browser's WebSocket API, translating calls to the Titanium WebSockets module:

var tiws = require('net.iamyellow.tiws');

function WebSocket() {
    this.ws = tiws.createWS();
    var url = arguments[0];
    this.ws.open(url);

    var self = this;
    
    this.ws.addEventListener('open', function(ev) {
        self.onopen(ev);
    });
    
    this.ws.addEventListener('close', function() {
        self.onclose();
    });
    
    this.ws.addEventListener('error', function(err) {
        self.onerror(err);
    });
    
    this.ws.addEventListener('message', function(ev) {
        self.onmessage(ev);
    });
}

WebSocket.prototype.send = function(message) {
    return this.ws.send(message);
};

WebSocket.prototype.close = function() {
    return this.ws.close();
};

if(global.window === undefined) {
    global.window = {};
}

global.window.WebSocket = module.exports = WebSocket;

Adding the above module to the index module (index.js):

exports.overrides = require('./overrides');
exports.XMLHttpRequest = require('./XMLHttpRequest');
exports.WebSocket = require('./WebSocket');
exports.SimpleXMPP = require('simple-xmpp');

seems to be the last missing piece in the puzzle. In the fifth attempt, the app seems to properly establish an XMPP connection. Better still, all the other chat functions also seem to work like a charm! Yay! :-)
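
For instance, exercising some of the chat functionality can be done by attaching a couple of event handlers to the connection created earlier. The following fragment is a sketch that assumes simple-xmpp's usual event-based API; the buddy JID is made up:

// Invoked when the connection has been established
xmpp.on('online', function(data) {
    Ti.API.info('Connected as: ' + data.jid.user);

    // Send a test message to a (hypothetical) buddy
    xmpp.send('buddy@myserver.com', 'Hello from Titanium!');
});

// Invoked when a chat message arrives
xmpp.on('chat', function(from, message) {
    Ti.API.info('Received message from ' + from + ': ' + message);
});

// Invoked when something goes wrong
xmpp.on('error', function(err) {
    Ti.API.error('XMPP error: ' + err);
});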

Conclusion


In this blog post, I have described a process in which I have ported simple-xmpp from the Node.js ecosystem to Titanium. The process was mostly automated, followed by a number of trial, error and fix runs.

The fixes we have applied are substitutes (identical APIs for Titanium), adapters (modules that translate calls to a particular API into calls to a Titanium-specific API) and overrides (imperative modifications to existing modules). These changes did not require me to modify the original package (the original package is simply a dependency of the ti-simple-xmpp project). As a result, we do not have to maintain a fork and we have only a little maintenance work on our side.

Limitations


Although the porting approach seems to fit our needs, there are a number of things missing. Currently, only XMPP over WebSocket connections are supported. Ordinary XMPP connections require a Titanium-equivalent replacement for Node.js' net.Socket API, which is completely missing.

Moreover, the Titanium WebSockets library has some minor issues. The first time we tested a secure web socket wss:// connection, the app crashed on iOS. Fortunately, this problem has been fixed now.

References


The ti-simple-xmpp package can be obtained from GitHub. Moreover, I have created a bare bones Alloy/Titanium-based example chat app (XMPPTestApp) exposing most of the library's functionality. The app can be used on both iOS and Android:


Acknowledgements


The work described in this blog post is a team effort -- Yiannis Tsirikoglou first attempted to port strophe.js and manually ported simple-xmpp to Titanium before I managed to complete the automated approach described in this blog post. Carlos Henrique Lustosa Zinato provided us with Titanium-related advice and helped us diagnose the TiWS module problems.

Monday, June 20, 2016

Using Disnix as a remote package deployer

Recently, I was asked whether it is possible to use Disnix as a tool for remote package deployment.

As described in a number of earlier blog posts, Disnix's primary purpose is not remote (or distributed) package management, but deploying systems that can be decomposed into services to networks of machines. To deploy these kinds of systems, Disnix executes all required deployment activities, including building services from source code, distributing them to target machines in the network and activating or deactivating them.

However, a service deployment process is basically a superset of an "ordinary" package deployment process. In this blog post, I will describe how we can do remote package deployment by instructing Disnix to only use a relevant subset of features.

Specifying packages as services


In the Nix packages collection, it is a common habit to write each package specification as a function in which the parameters denote the (local) build and runtime dependencies (something that Disnix's manual refers to as intra-dependencies) that the package needs. The remainder of the function describes how to build the package from source code and its provided dependencies.

Disnix has adopted this habit and extended this convention to services. The main difference between Nix package expressions and Disnix service expressions is that the latter also take inter-dependencies into account that refer to run-time dependencies on services that may have been deployed to other machines in the network. For services that have no inter-dependencies, a Disnix expression is identical to an ordinary package expression.

This means that, for example, an expression for a package such as the Midnight Commander is also a valid Disnix service with no inter-dependencies:

{ stdenv, fetchurl, pkgconfig, glib, gpm, file, e2fsprogs
, libX11, libICE, perl, zip, unzip, gettext, slang
}:

stdenv.mkDerivation {
  name = "mc-4.8.12";
  
  src = fetchurl {
    url = http://www.midnight-commander.org/downloads/mc-4.8.12.tar.bz2;
    sha256 = "15lkwcis0labshq9k8c2fqdwv8az2c87qpdqwp5p31s8gb1gqm0h";
  };
  
  buildInputs = [ pkgconfig perl glib gpm slang zip unzip file gettext
      libX11 libICE e2fsprogs ];

  meta = {
    description = "File Manager and User Shell for the GNU Project";
    homepage = http://www.midnight-commander.org;
    license = "GPLv2+";
    maintainers = [ stdenv.lib.maintainers.sander ];
  };
}
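
For comparison, a service that does have inter-dependencies is written as a nested function, in which the second parameter set refers to the services it depends on. The following fragment is a sketch with hypothetical names (the configuration file and the properties used in it are purely illustrative):

{stdenv}:
{staff}:

stdenv.mkDerivation {
  name = "StaffService";
  buildCommand = ''
    mkdir -p $out/conf

    # Generate a configuration file that tells the web service which
    # database it inter-depends on
    cat > $out/conf/settings.properties <<EOF
    database.name=${staff.name}
    EOF
  '';
}

In the corresponding services model, the dependsOn attribute of such a service would then provide the staff database (e.g. dependsOn = { inherit staff; };), which Disnix uses to fill in the second parameter set. For plain package deployment, however, inter-dependencies play no role, so we can ignore this pattern in the remainder of this post.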

Composing packages locally


Package and service expressions are functions that do not specify the versions or variants of the dependencies that should be used. To allow services to be deployed, we must compose them by providing the desired versions or variants of the dependencies as function parameters.

As with ordinary Nix packages, Disnix has also adopted this convention for services. In addition, we have to compose a Disnix service twice -- first its intra-dependencies and later its inter-dependencies.

Intra-dependency composition in Disnix is done in a similar way as in the Nix packages collection:

{pkgs, system}:

let
  callPackage = pkgs.lib.callPackageWith (pkgs // self);

  self = {
    pkgconfig = callPackage ./pkgs/pkgconfig { };
  
    gpm = callPackage ./pkgs/gpm { };
  
    mc = callPackage ./pkgs/mc { };
  };
in
self

The above expression (custom-packages.nix) composes the Midnight Commander package by providing its intra-dependencies as function parameters. The third attribute (mc) invokes the callPackage { } function, which imports the previous package expression and automatically provides the dependencies that have the same names as the function parameters.

The callPackage { } function first consults the self attribute set (that composes some of Midnight Commander's dependencies as well, such as gpm and pkgconfig) and then any package from the Nixpkgs repository.

Writing a minimal services model


Previously, we have shown how to build packages from source code and their dependencies, and how to compose packages locally. For the deployment of services, more information is needed. For example, we need to compose their inter-dependencies so that services know how to reach them.

Furthermore, Disnix's end objective is to get a running service-oriented system, and it carries out extra deployment activities for services to accomplish this, such as activation and deactivation. The latter two steps are executed by a Dysnomia plugin that is determined by annotating a service with a type attribute.

For package deployment, specifying these extra attributes and executing these remaining activities are in principle not required. Nonetheless, we still need to provide a minimal services model so that Disnix knows which units can be deployed.

Exposing the Midnight Commander package as a service can be done as follows:

{pkgs, system, distribution, invDistribution}:

let
  customPkgs = import ./custom-packages.nix {
    inherit pkgs system;
  };
in
{
  mc = {
    name = "mc";
    pkg = customPkgs.mc;
    type = "package";
  };
}

In the above expression, we import our intra-dependency composition expression (custom-packages.nix), and we use the pkg sub attribute to refer to the intra-dependency composition of the Midnight Commander. We annotate the Midnight Commander service with a package type to instruct Disnix that no additional deployment steps need to be performed beyond the installation of the package, such as activation or deactivation.

Since the above pattern is common to all packages, we can also automatically generate services for any package in the composition expression:

{pkgs, system, distribution, invDistribution}:

let
  customPkgs = import ./custom-packages.nix {
    inherit pkgs system;
  };
in
pkgs.lib.mapAttrs (name: pkg: {
  inherit name pkg;
  type = "package";
}) customPkgs

The above services model exposes all packages in our composition expression as a service.

Configuring the remote machine's search paths


With the services models shown in the previous section, we have all ingredients available to deploy packages with Disnix. To allow users on the remote machines to conveniently access their packages, we must add Disnix's Nix profile to the PATH of a user on the remote machines:

$ export PATH=/nix/var/nix/profiles/disnix/default/bin:$PATH

When using NixOS, this variable can be extended by adding the following line to /etc/nixos/configuration.nix:

environment.variables.PATH = [ "/nix/var/nix/profiles/disnix/default/bin" ];

Deploying packages with Disnix


In addition to a services model, Disnix needs an infrastructure and distribution model to deploy packages. For example, we can define an infrastructure model that may look as follows:

{
  test1.properties.hostname = "test1";
  test2 = {
    properties.hostname = "test2";
    system = "x86_64-darwin";
  };
}

The above infrastructure model describes two machines that have hostname test1 and test2. Furthermore, machine test2 has a specific system architecture: x86_64-darwin that corresponds to a 64-bit Intel-based Mac OS X.

We can distribute packages to these two machines with the following distribution model:

{infrastructure}:

{
  gpm = [ infrastructure.test1 ];
  pkgconfig = [ infrastructure.test2 ];
  mc = [ infrastructure.test1 infrastructure.test2 ];
}

In the above distribution model, we distribute package gpm to machine test1, pkgconfig to machine test2 and mc to both machines.

When running the following command-line instruction:

$ disnix-env -s services.nix -i infrastructure.nix -d distribution.nix

Disnix executes all activities to get the packages in the distribution model deployed to the machines, such as building them from source code (including their dependencies), and distributing their dependency closures to the target machines.

Because machine test2 may have a different system architecture than the coordinator machine responsible for carrying out the deployment, Disnix can use Nix's delegation mechanism to forward a build to a machine that is capable of doing it.
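
In NixOS, for example, such build delegation can be configured by adding something along the following lines to the coordinator machine's /etc/nixos/configuration.nix (the SSH user and key path shown here are hypothetical):

nix.distributedBuilds = true;
nix.buildMachines = [
  { hostName = "test2";
    system = "x86_64-darwin";
    sshUser = "builder";
    sshKey = "/root/.ssh/id_buildfarm";
    maxJobs = 2;
  }
];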

Alternatively, packages can also be built on the target machines through Disnix:

$ disnix-env --build-on-targets \
  -s services.nix -i infrastructure.nix -d distribution.nix

After the above command-line instructions have succeeded, we should be able to start the Midnight Commander on any of the target machines, by running:

$ mc

Deploying any package from the Nixpkgs repository


Besides deploying a custom set of packages, it is also possible to use Disnix to remotely deploy any package in the Nixpkgs repository, but doing so is a bit tricky.

The main challenge lies in the fact that the Nix packages set is a nested set of attributes, whereas Disnix expects services to be addressed in one attribute set only. Fortunately, the Nix expression language and Disnix models are flexible enough to implement a solution. For example, we can define a distribution model as follows:

{infrastructure}:

{
  mc = [ infrastructure.test1 ];
  git = [ infrastructure.test1 ];
  wget = [ infrastructure.test1 ];
  "xlibs.libX11" = [ infrastructure.test1 ];
}

Note that we use a dot notation: xlibs.libX11 as an attribute name to refer to libX11 that can only be referenced as a sub attribute in Nixpkgs.

We can write a services model that uses the attribute names in the distribution model to refer to the corresponding package in Nixpkgs:

{pkgs, system, distribution, invDistribution}:

pkgs.lib.mapAttrs (name: targets:
  let
    attrPath = pkgs.lib.splitString "." name;
  in
  { inherit name;
    pkg = pkgs.lib.attrByPath attrPath
      (throw "package: ${name} cannot be referenced in the package set")
      pkgs;
    type = "package";
  }
) distribution

With the above service model we can deploy any Nix package to any remote machine with Disnix.

Multi-user package management


Besides supporting single-user installations, Nix also supports multi-user installations in which every user has their own private Nix profile with their own set of packages. With Disnix we can also manage multiple profiles. For example, by adding the --profile parameter, we can deploy another Nix profile that contains a set of packages for the user sander:

$ disnix-env -s services.nix -i infrastructure.nix -d distribution.nix \
  --profile sander

The user sander can access their own set of packages by setting the PATH environment variable as follows:

$ export PATH=/nix/var/nix/profiles/disnix/sander/bin:$PATH

Conclusion


Although Disnix has not been strictly designed for this purpose, I have described in this blog post how Disnix can be used as a remote package deployer by using a relevant subset of Disnix features.

Moreover, I now consider the underlying Disnix primitives to be mature enough. As such, I am announcing the release of Disnix 0.6!

Acknowledgements


I gained the inspiration for writing this blog post from discussions with Matthias Beyer on the #nixos IRC channel.

Saturday, June 11, 2016

Deploying containers with Disnix as primitives for multi-layered service deployments

As explained in an earlier blog post, Disnix is a service deployment tool that can only be used after a collection of machines have been predeployed providing a number of container services, such as a service manager (e.g. systemd), a DBMS (e.g. MySQL) or an application server (e.g. Apache Tomcat).

To deploy these machines, we need an external solution. Some solutions are:

  • Manual installations requiring somebody to obtain a few machines, manually installing operating systems (e.g. a Linux distribution), and finally installing all required software packages, such as Nix, Dysnomia, Disnix and any additional container services. Manually configuring a machine is typically tedious, time consuming and error prone.
  • NixOps. NixOps is capable of automatically instantiating networks of virtual machines in the cloud (such as Amazon EC2) and deploying entire NixOS system configurations to them. These NixOS configurations can be used to automatically deploy Dysnomia, Disnix and any container service that we need. A drawback is that NixOps is NixOS-based and not really useful if you want to deploy services to machines running different kinds of operating systems.
  • disnixos-deploy-network. In a Disnix-context, services are basically undefined units of deployment, and we can also automatically deploy entire NixOS configurations to target machines as services. A major drawback of this approach is that we require predeployed machines running Disnix first.

Although there are several ways to manage the underlying infrastructure of services, they are basically all-or-nothing solutions with regard to automation -- we either have to manually deploy entire machine configurations ourselves or we are stuck with a NixOS-based solution that completely automates it.

In some scenarios (e.g. when it is desired to deploy services to non-Linux operating systems), the initial deployment phase becomes quite tedious. For example, it took me quite a bit of effort to set up the heterogeneous network deployment demo I gave at NixCon2015.

In this blog post, I will describe an approach that serves as an in-between solution -- since services in a Disnix-context can be (almost) any kind of deployment unit, we can also use Disnix to deploy container configurations as services. These container services can also be deployed to non-NixOS systems, which means that we can alleviate the effort of setting up the initial target system configurations to which Disnix can deploy services.

Deploying containers as services with Disnix


As with services, containers in a Disnix-context could take any form. For example, in addition to MySQL databases (that we can deploy as services with Disnix), we can also deploy the corresponding container (the MySQL DBMS server) as a Disnix service:

{ stdenv, mysql, dysnomia
, name ? "mysql-database"
, mysqlUsername ? "root", mysqlPassword ? "secret"
, user ? "mysql-database", group ? "mysql-database"
, dataDir ? "/var/db/mysql", pidDir ? "/run/mysqld"
}:

stdenv.mkDerivation {
  inherit name;
  
  buildCommand = ''
    mkdir -p $out/bin
      
    # Create wrapper script
    cat > $out/bin/wrapper <<EOF
    #! ${stdenv.shell} -e
      
    case "\$1" in
        activate)
            # Create group, user and the initial database if it does not exists
            # ...

            # Run the MySQL server
            ${mysql}/bin/mysqld_safe --user=${user} --datadir=${dataDir} --basedir=${mysql} --pid-file=${pidDir}/mysqld.pid &
            
            # Change root password
            # ...
            ;;
        deactivate)
            ${mysql}/bin/mysqladmin -u ${mysqlUsername} --password="${mysqlPassword}" shutdown
            
            # Delete the user and group
            # ...
            ;;
    esac
    EOF
    
    chmod +x $out/bin/wrapper

    # Add Dysnomia container configuration file for the MySQL DBMS
    mkdir -p $out/etc/dysnomia/containers

    cat > $out/etc/dysnomia/containers/${name} <<EOF
    mysqlUsername="${mysqlUsername}"
    mysqlPassword="${mysqlPassword}"
    EOF
    
    # Copy the Dysnomia module that manages MySQL databases
    mkdir -p $out/etc/dysnomia/modules
    cp ${dysnomia}/libexec/dysnomia/mysql-database $out/etc/dysnomia/modules
  '';
}

The above code fragment is a simplified Disnix expression that can be used to deploy a MySQL server. It produces a wrapper script, which carries out a set of deployment activities invoked by Disnix:

  • On activation, the wrapper script starts the MySQL server by spawning the mysqld_safe daemon process in background mode. Before starting the daemon, it also initializes some of the server's state, such as creating the user accounts under which the daemon runs and setting up the system database if it does not exist (these steps are left out of the example for simplicity).
  • On deactivation it shuts down the MySQL server and removes some of the attached state, such as the user accounts.

Besides composing a wrapper script, we must allow Dysnomia (and Disnix) to deploy databases as Disnix services to the MySQL server that we have just deployed:

  • We generate a Dysnomia container configuration file with the MySQL server settings to allow a database (that gets deployed as a service) to know what credentials it should use to connect to the DBMS server.
  • We bundle a Dysnomia plugin module that implements the deployment activities for MySQL databases, such as activation and deactivation. Because Dysnomia offers this plugin as part of its software distribution, we make a copy of it, but we could also compose our own plugin from scratch.

With the earlier shown Disnix expression, we can define the MySQL server as a service in a Disnix services model:

mysql-database = {
  name = "mysql-database";
  pkg = customPkgs.mysql-database;
  dependsOn = {};
  type = "wrapper";
};

and distribute it to a target machine in the network by adding an entry to the distribution model:

mysql-database = [ infrastructure.test2 ];

Configuring Disnix and Dysnomia


Once we have deployed containers as Disnix services, Disnix (and Dysnomia) must know about their availability so that we can deploy services to these recently deployed containers.

Each time Disnix has successfully deployed a configuration, it generates Nix profiles on the target machines in which the contents of all services can be accessed from a single location. This means that we can simply extend Dysnomia's module and container search paths:

export DYSNOMIA_MODULES_PATH=$DYSNOMIA_MODULES_PATH:/nix/var/nix/profiles/disnix/containers/etc/dysnomia/modules
export DYSNOMIA_CONTAINERS_PATH=$DYSNOMIA_CONTAINERS_PATH:/nix/var/nix/profiles/disnix/containers/etc/dysnomia/containers

with the paths to the Disnix profiles that have containers deployed.

A simple example scenario


I have modified the Java variant of the ridiculous Disnix StaffTracker example to support a deployment scenario with containers as Disnix services.

First, we need to start with a collection of machines having a very basic configuration without any additional containers. The StaffTracker package contains a bare network configuration that we can deploy with NixOps, as follows:

$ nixops create ./network-bare.nix ./network-virtualbox.nix -d vbox
$ nixops deploy -d vbox

By configuring the following environment variables, we can connect Disnix to the machines in the network that we have just deployed with NixOps:

$ export NIXOPS_DEPLOYMENT=vbox
$ export DISNIX_CLIENT_INTERFACE=disnix-nixops-client

We can write a very simple bootstrap infrastructure model (infrastructure-bootstrap.nix), to dynamically capture the configuration of the target machines:

{
  test1.properties.hostname = "test1";
  test2.properties.hostname = "test2";
}

Running the following command:

$ disnix-capture-infra infrastructure-bootstrap.nix > infrastructure-bare.nix

yields an infrastructure model (infrastructure-bare.nix) that may have the following structure:

{
  "test1" = {
    properties = {
      "hostname" = "test1";
      "system" = "x86_64-linux";
    };
    containers = {
      process = {
      };
      wrapper = {
      };
    };
    "system" = "x86_64-linux";
  };
  "test2" = {
    properties = {
      "hostname" = "test2";
      "system" = "x86_64-linux";
    };
    containers = {
      process = {
      };
      wrapper = {
      };
    };
    "system" = "x86_64-linux";
  };
}

As may be observed in the captured infrastructure model shown above, we have a very minimal configuration only hosting the process and wrapper containers, which integrate with the host system's service manager, such as systemd.

We can deploy a Disnix configuration having Apache Tomcat and the MySQL DBMS as services, by running:

$ disnix-env -s services-containers.nix \
  -i infrastructure-bare.nix \
  -d distribution-containers.nix \
  --profile containers

Note that we have provided an extra parameter to Disnix: --profile to isolate the containers from the default deployment environment. If the above command succeeds, we have a deployment architecture that looks as follows:


Both machines have Apache Tomcat deployed as a service and machine test2 also runs a MySQL server.

When capturing the target machines' configurations again:

$ disnix-capture-infra infrastructure-bare.nix > infrastructure-containers.nix

we will receive an infrastructure model (infrastructure-containers.nix) that may have the following structure:

{
  "test1" = {
    properties = {
      "hostname" = "test1";
      "system" = "x86_64-linux";
    };
    containers = {
      tomcat-webapplication = {
        "tomcatPort" = "8080";
      };
      process = {
      };
      wrapper = {
      };
    };
    "system" = "x86_64-linux";
  };
  "test2" = {
    properties = {
      "hostname" = "test2";
      "system" = "x86_64-linux";
    };
    containers = {
      mysql-database = {
        "mysqlUsername" = "root";
        "mysqlPassword" = "secret";
        "mysqlPort" = "3306";
      };
      tomcat-webapplication = {
        "tomcatPort" = "8080";
      };
      process = {
      };
      wrapper = {
      };
    };
    "system" = "x86_64-linux";
  };
}

As may be observed in the above infrastructure model, both machines provide a tomcat-webapplication container exposing the TCP port number that the Apache Tomcat server has been bound to. Machine test2 exposes the mysql-database container with its connectivity settings.

We can now deploy the StaffTracker system (that consists of multiple MySQL databases and Apache Tomcat web applications) by running:

$ disnix-env -s services.nix \
  -i infrastructure-containers.nix \
  -d distribution.nix \
  --profile services

Note that I use a different --profile parameter, to tell Disnix that the StaffTracker components belong to a different environment than the containers. If I were to use --profile containers again, Disnix would undeploy the previously shown containers environment with the MySQL DBMS and Apache Tomcat, and deploy the databases and web applications in its place, which would lead to a failure.

If the above command succeeds, we have the following deployment architecture:


The result is that we have all the service components of the StaffTracker example deployed to containers that are also deployed by Disnix.

An advanced example scenario: multi-containers


We could go one step further than the example I have shown in the previous section. In the first example, we deploy no more than one instance of each container to a machine in the network -- this is quite common, as it rarely happens that you want to run two MySQL or Apache Tomcat servers on a single machine. Most Linux distributions (including NixOS) do not support deploying multiple instances of system services out of the box.

However, with a few relatively simple modifications to the Disnix expressions of the MySQL DBMS and Apache Tomcat services, it becomes possible to allow multiple instances to co-exist on the same machine. What we basically have to do is identify the conflicting runtime resources, make them configurable and change their values in such a way that they no longer conflict.

{ stdenv, mysql, dysnomia
, name ? "mysql-database"
, mysqlUsername ? "root", mysqlPassword ? "secret"
, user ? "mysql-database", group ? "mysql-database"
, dataDir ? "/var/db/mysql", pidDir ? "/run/mysqld"
, port ? 3306
}:

stdenv.mkDerivation {
  inherit name;
  
  buildCommand = ''
    mkdir -p $out/bin
    
    # Create wrapper script
    cat > $out/bin/wrapper <<EOF
    #! ${stdenv.shell} -e
       
    case "\$1" in
        activate)
            # Create group, user and the initial database if it does not exists
            # ...

            # Run the MySQL server
            ${mysql}/bin/mysqld_safe --port=${toString port} --user=${user} --datadir=${dataDir} --basedir=${mysql} --pid-file=${pidDir}/mysqld.pid --socket=${pidDir}/mysqld.sock &
            
            # Change root password
            # ...
            ;;
        deactivate)
            ${mysql}/bin/mysqladmin --socket=${pidDir}/mysqld.sock -u ${mysqlUsername} --password="${mysqlPassword}" shutdown
            
            # Delete the user and group
            # ...
            ;;
    esac
    EOF
    
    chmod +x $out/bin/wrapper
  
    # Add Dysnomia container configuration file for the MySQL DBMS
    mkdir -p $out/etc/dysnomia/containers
    
    cat > $out/etc/dysnomia/containers/${name} <<EOF
    mysqlUsername="${mysqlUsername}"
    mysqlPassword="${mysqlPassword}"
    mysqlPort=${toString port}
    mysqlSocket=${pidDir}/mysqld.sock
    EOF
    
    # Copy the Dysnomia module that manages MySQL databases
    mkdir -p $out/etc/dysnomia/modules
    cp ${dysnomia}/libexec/dysnomia/mysql-database $out/etc/dysnomia/modules
  '';
}

For example, I have revised the MySQL server Disnix expression with additional parameters that change the TCP port the service binds to, the UNIX domain socket that is used by the administration utilities and the filesystem location where the databases are stored. Moreover, these additional configuration properties are also exposed by the Dysnomia container configuration file.

These additional parameters make it possible to define multiple variants of container services in the services model:

{distribution, invDistribution, system, pkgs}:

let
  customPkgs = import ../top-level/all-packages.nix {
    inherit system pkgs;
  };
in
rec {
  mysql-production = {
    name = "mysql-production";
    pkg = customPkgs.mysql-production;
    dependsOn = {};
    type = "wrapper";
  };
  
  mysql-test = {
    name = "mysql-test";
    pkg = customPkgs.mysql-test;
    dependsOn = {};
    type = "wrapper";
  };
  
  tomcat-production = {
    name = "tomcat-production";
    pkg = customPkgs.tomcat-production;
    dependsOn = {};
    type = "wrapper";
  };
  
  tomcat-test = {
    name = "tomcat-test";
    pkg = customPkgs.tomcat-test;
    dependsOn = {};
    type = "wrapper";
  };
}

I can, for example, map two MySQL DBMS instances and the two Apache Tomcat servers to the same machines in the distribution model:

{infrastructure}:

{
  mysql-production = [ infrastructure.test1 ];
  mysql-test = [ infrastructure.test1 ];
  tomcat-production = [ infrastructure.test2 ];
  tomcat-test = [ infrastructure.test2 ];
}

Deploying the above configuration:

$ disnix-env -s services-multicontainers.nix \
  -i infrastructure-bare.nix \
  -d distribution-multicontainers.nix \
  --profile containers

yields the following deployment architecture:


As can be observed, we have two instances of the same container hosted on the same machine. When capturing the configuration:

$ disnix-capture-infra infrastructure-bare.nix > infrastructure-multicontainers.nix

we will receive a Nix expression that may look as follows:

{
  "test1" = {
    properties = {
      "hostname" = "test1";
      "system" = "x86_64-linux";
    };
    containers = {
      mysql-production = {
        "mysqlUsername" = "root";
        "mysqlPassword" = "secret";
        "mysqlPort" = "3306";
        "mysqlSocket" = "/run/mysqld-production/mysqld.sock";
      };
      mysql-test = {
        "mysqlUsername" = "root";
        "mysqlPassword" = "secret";
        "mysqlPort" = "3307";
        "mysqlSocket" = "/run/mysqld-test/mysqld.sock";
      };
      process = {
      };
      wrapper = {
      };
    };
    "system" = "x86_64-linux";
  };
  "test2" = {
    properties = {
      "hostname" = "test2";
      "system" = "x86_64-linux";
    };
    containers = {
      tomcat-production = {
        "tomcatPort" = "8080";
        "catalinaBaseDir" = "/var/tomcat-production";
      };
      tomcat-test = {
        "tomcatPort" = "8081";
        "catalinaBaseDir" = "/var/tomcat-test";
      };
      process = {
      };
      wrapper = {
      };
    };
    "system" = "x86_64-linux";
  };
}

In the above expression, there are two MySQL instances and two Apache Tomcat instances, each pair deployed to the same machine. These containers have their resources configured in such a way that they do not conflict. For example, both MySQL instances bind to different TCP ports (3306 and 3307) and different UNIX domain sockets (/run/mysqld-production/mysqld.sock and /run/mysqld-test/mysqld.sock), and both Apache Tomcat instances listen on different HTTP ports (8080 and 8081) and use separate base directories.
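
The Apache Tomcat wrapper is parameterized in a similar fashion. The following is a minimal hypothetical sketch that only shows how the conflicting resources (the HTTP port and the Tomcat base directory) can be made configurable and exposed as Dysnomia container properties -- the wrapper script that actually starts and stops Tomcat with these settings is omitted, and the default values shown are assumptions:

{ stdenv, tomcat
, name ? "tomcat-webapplication"
, port ? 8080
, catalinaBaseDir ? "/var/tomcat"
}:

stdenv.mkDerivation {
  inherit name;

  buildCommand = ''
    # The wrapper script that starts and stops Tomcat with CATALINA_BASE set
    # to the configured base directory and the HTTP connector bound to the
    # configured port is omitted in this sketch.

    # Expose the configurable resources as Dysnomia container properties
    mkdir -p $out/etc/dysnomia/containers

    cat > $out/etc/dysnomia/containers/${name} <<EOF
    tomcatPort=${toString port}
    catalinaBaseDir=${catalinaBaseDir}
    EOF
  '';
}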

After deploying the containers, we can also deploy the StaffTracker components (databases and web applications) to them. As described in my previous blog post, we can use an alternative (and more verbose) notation in the distribution model to directly map services to containers:

{infrastructure}:

{
  GeolocationService = {
    targets = [
      { target = infrastructure.test2; container = "tomcat-test"; }
    ];
  };
  RoomService = {
    targets = [
      { target = infrastructure.test2; container = "tomcat-production"; }
    ];
  };
  StaffService = {
    targets = [
      { target = infrastructure.test2; container = "tomcat-test"; }
    ];
  };
  StaffTracker = {
    targets = [
      { target = infrastructure.test2; container = "tomcat-production"; }
    ];
  };
  ZipcodeService = {
    targets = [
      { target = infrastructure.test2; container = "tomcat-test"; }
    ];
  };
  rooms = {
    targets = [
      { target = infrastructure.test1; container = "mysql-production"; }
    ];
  };
  staff = {
    targets = [
      { target = infrastructure.test1; container = "mysql-test"; }
    ];
  };
  zipcodes = {
    targets = [
      { target = infrastructure.test1; container = "mysql-production"; }
    ];
  };
}

As may be observed in the distribution model above, we deploy databases and web applications to both container instances hosted on the same machines.

We can deploy the services of which the StaffTracker consists, as follows:

$ disnix-env -s services.nix \
  -i infrastructure-multicontainers.nix \
  -d distribution-advanced.nix \
  --profile services

and the result is the following deployment architecture:


As may be observed in the picture above, we now have a running StaffTracker system that uses two MySQL and two Apache Tomcat servers, with both instances of each container hosted on the same machine. Isn't it awesome? :-)

Conclusion


In this blog post, I have demonstrated an approach in which we deploy containers as services with Disnix. Containers serve as potential deployment targets for other Disnix services.

Previously, we only had NixOS-based solutions to manage the configuration of containers, which made using Disnix on platforms other than NixOS painful, as the containers had to be deployed manually. The approach described in this blog post serves as an in-between solution.

In theory, the process in which we deploy containers as services first, followed by the "actual" services, could be generalized and extended into a layered service deployment model, with a new tool automating the process and declarative specifications capturing the properties of the layers.

However, I have decided not to implement this new model any time soon for practical reasons -- in my experience with service deployment, I have almost never encountered the need for more than two layers. The only exception I can think of is the deployment of Axis2 web services to an Axis2 container -- the Axis2 container is itself a Java web application that must be deployed to Apache Tomcat first.

Availability


I have integrated the two container deployment examples into the Java variant of the StaffTracker example.

The new concepts described in this blog post are part of the development version of Disnix and will become available in the next release.

Thursday, May 19, 2016

Mapping services to containers with Disnix and a new notational convention

In the last couple of months, I have made a number of major changes to the internals of Disnix. As described in a couple of older blog posts, deployment with Disnix is driven by three models each capturing a specific concern:

  • The services model specifies the available distributable components, how to construct them (from source code, intra-dependencies and inter-dependencies), and their types so that they can be properly activated or deactivated on the target machines.
  • The infrastructure model specifies the available target machines and their relevant deployment properties.
  • The distribution model maps services in the service model to target machines in the infrastructure model.

By running the following command-line instruction with the three models as parameters:

$ disnix-env -s services.nix -i infrastructure.nix -d distribution.nix

Disnix executes all required activities to get the system deployed, including building, distributing, activating and deactivating services.

I have always described the final step, the activation phase, as deactivating obsolete and activating new services on the target machines. However, this is an oversimplification of what really happens.

In reality, Disnix does more than just carrying out an activation step on a target machine -- to get a service activated or deactivated, Disnix invokes Dysnomia, which modifies the state of a so-called container hosting a collection of components. As with components, the definition of a container in Dysnomia is deliberately left abstract and can represent anything, such as a Java Servlet container (e.g. Apache Tomcat), a DBMS (e.g. MySQL) or the operating system's service manager (e.g. systemd).

So far, these details were always hidden in Disnix and the container mapping was an implicit operation, which I never really liked. Furthermore, there are situations in which you may want to have more control over this mapping.

In this blog post, I will describe my recent modifications and a new notational convention that can be used to treat containers as first-class citizens.

A modified infrastructure model formalism


Previously, a Disnix infrastructure model had the following structure:

{
  test1 = {
    hostname = "test1.example.org";
    tomcatPort = 8080;
    system = "i686-linux";
  };
  
  test2 = {
    hostname = "test2.example.org";
    tomcatPort = 8080;
    mysqlPort = 3306;
    mysqlUsername = "root";
    mysqlPassword = "admin";
    system = "x86_64-linux";
    numOfCores = 1;
    targetProperty = "hostname";
    clientInterface = "disnix-ssh-client";
  }; 
}

The above Nix expression is an attribute set in which each key corresponds to a target machine in the network and each value is an attribute set containing arbitrary machine properties.

These properties are used for a variety of deployment activities. Disnix made no hard distinction between them -- some properties have a special meaning, but most of them can be freely chosen, yet this distinction does not become clear from the model.

In the new notational convention, the target machine properties have been categorized:

{
  test1 = {
    properties = {
      hostname = "test1.example.org";
    };
    
    containers = {
      tomcat-webapplication = {
        tomcatPort = 8080;
      };
    };
    
    system = "i686-linux";
  };
  
  test2 = {
    properties = {
      hostname = "test2.example.org";
    };
    
    containers = {
      tomcat-webapplication = {
        tomcatPort = 8080;
      };
      
      mysql-database = {
        mysqlPort = 3306;
        mysqlUsername = "root";
        mysqlPassword = "admin";
      };
    };
    
    system = "x86_64-linux";
    numOfCores = 1;
    targetProperty = "hostname";
    clientInterface = "disnix-ssh-client";
  }; 
}

The above expression has a more structured notation:

  • The properties attribute refers to arbitrary machine-level properties that are used at build-time and to connect from the coordinator to the target machine.
  • The containers attribute set defines the available container services on a target machine and their relevant deployment properties. The container properties are used at build-time and activation time. At activation time, they are passed as parameters to the Dysnomia module that activates a service in the corresponding container.
  • The remainder of the target attributes are optional system properties. For example, targetProperty defines which attribute in properties contains the address to connect to the target machine. clientInterface refers to the executable that establishes a remote connection, system defines the system architecture of the target machine (so that services will be correctly built for it), and numOfCores defines how many concurrent activation operations can be executed on the target machine.

In the new notation, it becomes clear which container services a target machine provides, whereas in the old notation they were hidden.

An alternative distribution model notation


I have also introduced an alternative notation for mappings in the distribution model. A traditional Disnix distribution model typically looks as follows:

{infrastructure}:

{
  ...
  StaffService = [ infrastructure.test2 ];
  StaffTracker = [ infrastructure.test1 infrastructure.test2 ];
}

In the above expression, each attribute name refers to a service in the service model and each value to a list of machines in the infrastructure model.

As explained earlier, besides deploying a service to a machine, a service also gets deployed to a container hosted on the machine, which is not reflected in the distribution model.

When using the above notation, Disnix executes a so-called auto-mapping strategy to containers. It simply takes the type attribute from the services model (which also determines the Dysnomia module that carries out the activation and deactivation steps):

StaffTracker = {
  name = "StaffTracker";
  pkg = customPkgs.StaffTracker;
  dependsOn = {
    inherit GeolocationService RoomService;
    inherit StaffService ZipcodeService;
  };
  type = "tomcat-webapplication";
};

and deploys the service to the container with the same name as the type. For example, all services of type: tomcat-webapplication will be deployed to a container named: tomcat-webapplication (and Disnix uses the Dysnomia module named: tomcat-webapplication to activate or deactivate them).

In most cases auto-mapping suffices -- we typically only run one container service on a machine, e.g. one MySQL DBMS, one Apache Tomcat application server. That is why the traditional notation remains the default in Disnix.

However, sometimes it may also be desired to have more control over the container mappings. The new Disnix also supports an alternative and more verbose notation. For example, the following mapping of the StaffTracker service is equivalent to the traditional mapping shown in the previous distribution model:

StaffTracker = {
  targets = [
    { target = infrastructure.test1; }
    { target = infrastructure.test2; }
  ];
};

We can use the alternative notation to control the container mapping, for example:

{infrastructure}:

{
  ...

  StaffService = {
    targets = [
      { target = infrastructure.test1;
        container = "tomcat-production";
      }
    ];
  };
  StaffTracker = {
    targets = [
      { target = infrastructure.test1;
        container = "tomcat-test";
      }
    ];
  };
}

By adding the container attribute to a mapping, we can override the auto mapping strategy and specify the name of the container that we want to deploy to. This alternative notation allows us to deploy to a container whose name does not match the type or to manage networks of machines having multiple instances of the same container deployed.

For example, in the above distribution model, both services are Apache Tomcat web applications. We map StaffService to a container called: tomcat-production and StaffTracker to a container called: tomcat-test. Both containers are hosted on the same machine: test1.

A modified formalism to refer to inter-dependency parameters


As a consequence of modifying the infrastructure and distribution model notations, referring to inter-dependency parameters in Disnix expressions also slightly changed:

{stdenv, StaffService}:
{staff}:

let
  contextXML = ''
    <Context>
      <Resource name="jdbc/StaffDB" auth="Container"
        type="javax.sql.DataSource"
        maxActive="100" maxIdle="30" maxWait="10000"
        username="${staff.target.container.mysqlUsername}"
        password="${staff.target.container.mysqlPassword}"
        driverClassName="com.mysql.jdbc.Driver"
        url="jdbc:mysql://${staff.target.properties.hostname}:${toString (staff.target.container.mysqlPort)}/${staff.name}?autoReconnect=true" />
    </Context>
  '';
in
stdenv.mkDerivation {
  name = "StaffService";
  buildCommand = ''
    mkdir -p $out/conf/Catalina
    cat > $out/conf/Catalina/StaffService.xml <<EOF
    ${contextXML}
    EOF
    ln -sf ${StaffService}/webapps $out/webapps
  '';
}

The above example is a Disnix expression that configures the StaffService service. The StaffService connects to a remote MySQL database (named: staff), which is provided as an inter-dependency parameter. The Disnix expression uses the properties of the inter-dependency parameter to configure a so-called context XML file, which Apache Tomcat uses to establish a (remote) JDBC connection so that the web service can connect to it.

Previously, each inter-dependency parameter provided a targets sub attribute referring to targets in the infrastructure model to which the inter-dependency has been mapped in the distribution model. Because it is quite common to map to a single target only, there is also a target sub attribute that refers to the first element for convenience.

In the new Disnix, the targets now refer to container mappings instead of machine mappings and implement a new formalism to reflect this:

  • The properties sub attribute refers to the machine-level properties in the infrastructure model.
  • The container sub attribute refers to the properties of the container to which the inter-dependency has been deployed.

Both sub attributes are used in the expression shown above to allow the service to connect to the remote MySQL database.
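
To make this more concrete, the staff parameter that the above expression receives roughly has the following structure (a hypothetical sketch derived from the description above -- the attribute set that Disnix actually composes contains more properties):

rec {
  # Name of the inter-dependency, as declared in the services model
  name = "staff";

  # One element per container deployment of the inter-dependency
  targets = [
    {
      # Machine-level properties from the infrastructure model
      properties = {
        hostname = "test1";
      };

      # Properties of the container to which the service has been deployed
      container = {
        mysqlUsername = "root";
        mysqlPassword = "secret";
        mysqlPort = 3306;
      };
    }
  ];

  # Convenience reference to the first container mapping
  target = builtins.head targets;
}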

Visualizing containers


Besides modifying the notational conventions and the underlying deployment mechanisms, I have also modified disnix-visualize to display containers. The following picture shows an example:



In the above picture, the light grey boxes denote machines, the dark grey boxes denote containers, the ovals denote services, and the arrows denote inter-dependency relationships. In my opinion, these new visualizations are much more intuitive -- I still remember that in an old blog post summarizing my PhD thesis, I used a hand-drawn diagram to illustrate why deployment of service-oriented systems is complicated. In that diagram I already showed containers, yet they were missing from the visualizations generated by disnix-visualize. Now, finally, this mismatch has been removed from the tooling.

(As a sidenote: it is still possible to generate the classic non-containerized visualizations by providing the: --no-containers command-line option).

Capturing the infrastructure model from the machines' Dysnomia container configuration files


The new notational conventions also make it possible to more easily implement yet another use case. As explained in an earlier blog post, when we want to deploy services with Disnix, we first need predeployed machines that have Nix, Dysnomia and Disnix installed, as well as a number of container services (such as MySQL and Apache Tomcat).

After deploying the machines, we must write an infrastructure model reflecting their properties by hand. Writing infrastructure models by hand is sometimes tedious and error prone. In my previous blog post, I have shown that it is possible to automatically generate Dysnomia container configuration files from NixOS configurations that capture the properties of entire machines.

We can now also do the opposite: generate an expression from a machine's Dysnomia container configuration files and compose an infrastructure model from it. This takes away the majority of the burden of writing infrastructure models by hand.

For example, we can write a Dysnomia-enabled NixOS configuration:

{config, pkgs, ...}:

{
  services = {
    openssh.enable = true;
    
    mysql = {
      enable = true;
      package = pkgs.mysql;
      rootPassword = ../configurations/mysqlpw;
    };
    
    tomcat = {
      enable = true;
      commonLibs = [ "${pkgs.mysql_jdbc}/share/java/mysql-connector-java.jar" ];
      catalinaOpts = "-Xms64m -Xmx256m";
    };
  };
  
  dysnomia = {
    enable = true;
    enableAuthentication = true;
    properties = {
      hostname = config.networking.hostName;
      mem = "$(grep 'MemTotal:' /proc/meminfo | sed -e 's/kB//' -e 's/MemTotal://' -e 's/ //g')";
    };
  };
}

The above NixOS configuration deploys two container services: MySQL and Apache Tomcat. Furthermore, it defines some non-functional machine-level properties, such as the hostname and the amount of RAM (mem) the machine has (which is composed dynamically by consulting the kernel's /proc filesystem).

As shown in the previous blog post, when deploying the above configuration with:

$ nixos-rebuild switch

the Dysnomia NixOS module automatically composes the /etc/dysnomia/properties and /etc/dysnomia/containers configuration files. When running the following command:

$ dysnomia-containers --generate-expr
{
  properties = {
    "hostname" = "test1";
    "mem" = "1023096";
    "supportedTypes" = [
      "mysql-database"
      "process"
      "tomcat-webapplication"
    ];
    "system" = "x86_64-linux";
  };
  containers = {
    mysql-database = {
      "mysqlPassword" = "admin";
      "mysqlPort" = "3306";
      "mysqlUsername" = "root";
    };
    tomcat-webapplication = {
      "tomcatPort" = "8080";
    };
  };
}

Dysnomia generates a Nix expression from the general properties and container configuration files.

We can do the same operation in a network of machines by running the disnix-capture-infra tool. First, we need to write a very minimal infrastructure model that only captures the connectivity attributes:

{
  test1.properties.hostname = "test1";
  test2.properties.hostname = "test2";
}

When running:

$ disnix-capture-infra infrastructure-basic.nix
{
  test1 = {
    properties = {
      "hostname" = "test1";
      "mem" = "1023096";
      "supportedTypes" = [
        "mysql-database"
        "process"
        "tomcat-webapplication"
      ];
      "system" = "x86_64-linux";
    };
    containers = {
      mysql-database = {
        "mysqlPassword" = "admin";
        "mysqlPort" = "3306";
        "mysqlUsername" = "root";
      };
      tomcat-webapplication = {
        "tomcatPort" = "8080";
      };
    };
  };
  test2 = ...
}

Disnix captures the configurations of all machines in the basic infrastructure model and returns an augmented infrastructure model containing all their properties.

(As a sidenote: disnix-capture-infra is not the only infrastructure model generator I have developed. In the self-adaptive deployment framework built on top of Disnix, I have developed an Avahi-based discovery service that can also generate infrastructure models. It is also more powerful (but quite hacky and immature) because it dynamically discovers the machines in the network, so it does not require a basic infrastructure model to be written. Moreover, it automatically responds to events when a machine's configuration changes.

I have modified the Avahi-based discovery tool to use Dysnomia's expression generator as well.

Also, the DisnixOS toolset can generate infrastructure models from networked NixOS configurations).

Discussion


In this blog post, I have described the result of a number of major internal changes to Disnix that make the containers concept a first class citizen. Fortunately, from an external perspective the changes are minor, but still backwards incompatible -- we must follow a new convention for the infrastructure model and refer to the target properties of inter-dependency parameters in a slightly different way.

In return you will get:

  • A more intuitive notation. As explained, we do not only deploy to a machine, but also to a container hosted on the machine. Now the deployment models and corresponding visualizations reflect this concept.
  • More control and power. We can deploy to multiple containers of the same type on the same machine, e.g. we can have two MySQL DBMSes on the same machine.
  • More correctness. Previously, when activating or deactivating a service, all infrastructure properties were propagated as parameters to the corresponding Dysnomia module. Why would the mysql-database module need to know about a postgresql-database container, and vice versa? Now Dysnomia modules only get to know what they need to know.
  • Discovery. We can generate the infrastructure model from the Dysnomia container configuration files hosted on the target machines with relative ease.

A major caveat is that deployment planning (implemented in the Dynamic Disnix framework) could also be extended from machine level to container level.

At the moment, I have not made these modifications. This means that Dynamic Disnix can still generate distribution models, but only on machine level. As a consequence, Dynamic Disnix only allows a user to refer to a target's machine-level properties (i.e. the properties attribute in the infrastructure model) for deployment planning purposes, and not to any container-specific properties.

Container-level deployment planning is also something I intend to support at some point in the future.

Availability


The new notational conventions and containers concepts are part of the development version of Disnix and will become available in the next release. Moreover, I have modified the Disnix examples to use the new notations.

Tuesday, April 19, 2016

Managing the state of mutable components in NixOS configurations with Dysnomia


In an old blog post (and research paper) from a couple of years ago, I have described a prototype version of Dysnomia -- a toolset that can be used to deploy so-called "mutable components". In the middle of last year, I have integrated the majority of its concepts into the mainstream version of Dysnomia, because I had found some practical use for it.

So far, I have only used Dysnomia in conjunction with Disnix -- Disnix executes all activities required to deploy a service-oriented system, such as:

  • Building services and their intra-dependencies from source code. By default, Disnix performs the builds on the coordinator machine, but can also optionally delegate them to target machines in the network.
  • Distributing services and their intra-dependency closures to the appropriate target machines in the network.
  • Activating newly deployed services, and deactivating obsolete services.
  • Optionally snapshotting, transferring and restoring the state of services (or a subset of services) that have moved from a target machine to another.

For carrying out the building and distribution activities, Disnix invokes the Nix package manager as it provides a number of powerful features that makes deployment of packages more reliable and reproducible.

However, not all activities required to deploy service-oriented systems are supported by Nix and this is where Dysnomia comes in handy -- one of Dysnomia's objectives is to uniformly activate and deactivate mutable components in containers by modifying the latter's state. The other objective is to uniformly support snapshotting and restoring the state of mutable components deployed in a container.

The definitions of mutable components and containers are deliberately left abstract in a Dysnomia context. Basically, they can represent anything, such as:

  • A MySQL database schema component and a MySQL DBMS container.
  • A Java web application component (WAR file) and an Apache Tomcat container.
  • A UNIX process component and a systemd container.
  • Even NixOS configurations can be considered mutable components.

To support many kinds of component and container flavours, Dysnomia has been designed as a plugin system -- each Dysnomia module has a standardized interface (basically a process taking two standard command line parameters) and implements a set of standard deployment activities (e.g. activate, deactivate, snapshot and restore) for each type of container.
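
To illustrate the idea, the following is a minimal hypothetical sketch of such a module, packaged as a Nix derivation. It assumes that the first parameter denotes the activity to execute and the second parameter the mutable component to operate on, and it does not perform any real work:

{ stdenv }:

stdenv.mkDerivation {
  name = "example-dysnomia-module";

  buildCommand = ''
    mkdir -p $out/libexec

    # The module is an executable that receives the activity to execute as its
    # first parameter and (assumed here) the component to operate on as its
    # second parameter.
    cat > $out/libexec/example-dysnomia-module <<EOF
    #! ${stdenv.shell} -e

    case "\$1" in
        activate)
            echo "Activating component: \$2"
            ;;
        deactivate)
            echo "Deactivating component: \$2"
            ;;
        snapshot)
            echo "Snapshotting the state of component: \$2"
            ;;
        restore)
            echo "Restoring the state of component: \$2"
            ;;
    esac
    EOF
    chmod +x $out/libexec/example-dysnomia-module
  '';
}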

Despite the fact that Dysnomia has originally been designed for use with Disnix (the package was historically known as Disnix activation scripts), it can also be used as a standalone tool or in combination with other deployment solutions. (As a sidenote: the reason why I picked the name Dysnomia is that, like Nix, it is the name of a moon of a Trans-Neptunian object).

Similar to Disnix, when deploying NixOS configurations, all activities to deploy the static parts of a system are carried out by the Nix package manager.

However, in the final step (the activation step) a big generated shell script is executed that is responsible for deploying the dynamic parts of a system, such as updating the GRUB bootloader, reloading systemd units, creating folders that store variable data (e.g. /var), creating user accounts and so on.

In some cases, it may also be desired to deploy mutable components as part of a NixOS system configuration:

  • Some systems are monolithic and cannot be decomposed into services (i.e. distributable units of deployment).
  • Some NixOS modules have scripts to initialize the state of a system service (such as a database) on first startup, but they do so in their own ad-hoc way -- there is no real formalism behind it.
  • You may also want to use Dysnomia's (primitive) snapshotting facilities for backup purposes.

Recently, I did some interesting experiments with Dysnomia on the NixOS level. In this blog post, I will show how Dysnomia can be used in conjunction with NixOS.

Deploying NixOS configurations


As described in earlier blog posts, in NixOS, deployment is driven by a single NixOS configuration file (/etc/nixos/configuration.nix), such as:

{pkgs, ...}:

{
  boot.loader.grub = {
    enable = true;
    device = "/dev/sda";
  };

  fileSystems."/" = {
    device = "/dev/disk/by-label/nixos";
    fsType = "ext4";  
  };

  services = {
    openssh.enable = true;
    
    mysql = {
      enable = true;
      package = pkgs.mysql;
      rootPassword = ../configurations/mysqlpw;
    };
  };
}

The above configuration file states that we want to deploy a system using the GRUB bootloader, having a single root partition, running OpenSSH and MySQL as system services. The configuration can be deployed with a single-command line instruction:

$ nixos-rebuild switch

When running the above command-line instruction, the Nix package manager deploys all required packages and configuration files. After all packages have been successfully deployed, the activation script gets executed. As a result, we have a system running OpenSSH and MySQL.

By modifying the above configuration and adding another service after MySQL:

...

mysql = {
  enable = true;
  package = pkgs.mysql;
  rootPassword = ../configurations/mysqlpw;
};

tomcat = {
  enable = true;
  commonLibs = [ "${pkgs.mysql_jdbc}/share/java/mysql-connector-java.jar" ];
  catalinaOpts = "-Xms64m -Xmx256m";
};

...

and running the same command-line instruction again:

$ nixos-rebuild switch

The NixOS configuration gets upgraded to also run Apache Tomcat as a system service in addition to MySQL and OpenSSH. When upgrading, Nix only builds or downloads the packages that have not been deployed before, making the upgrade process much more efficient than redeploying the system from scratch.

Managing collections of mutable components


Analogous to NixOS configurations (which represent entire system configurations), we need to manage the deployment of the mutable components belonging to a system configuration as a whole. I have developed a new tool called: dysnomia-containers for this purpose.

The following command-line instruction queries all available containers on a system that serve as potential deployment targets:

$ dysnomia-containers --query-containers
mysql-database
process
tomcat-webapplication
wrapper

The above command-line instruction searches all folders in the DYSNOMIA_CONTAINERS_PATH environment variable (which defaults to: /etc/dysnomia/containers) for container configuration files and displays their names, such as mysql-database corresponding to a MySQL DBMS server, and process and wrapper, which are virtual containers integrating with the host system's service manager, such as systemd.

We can also query the available mutable components that we can deploy to the above listed containers:

$ dysnomia-containers --query-available-components
mysql-database/rooms
mysql-database/staff
mysql-database/zipcodes
tomcat-webapplication/GeolocationService
tomcat-webapplication/RoomService
tomcat-webapplication/StaffService
tomcat-webapplication/StaffTracker
tomcat-webapplication/ZipcodeService

The above command-line instruction displays all the available mutable component configurations that reside in directories provided by the DYSNOMIA_COMPONENTS_PATH environment variable, such as three MySQL databases and five Apache Tomcat web applications.

We can deploy all the available mutable components to the available containers, by running:

$ dysnomia-containers --deploy
Activating component: rooms in container: mysql-database
Activating component: staff in container: mysql-database
Activating component: zipcodes in container: mysql-database
Activating component: GeolocationService in container: tomcat-webapplication
Activating component: RoomService in container: tomcat-webapplication
Activating component: StaffService in container: tomcat-webapplication
Activating component: StaffTracker in container: tomcat-webapplication
Activating component: ZipcodeService in container: tomcat-webapplication

Besides displaying the available mutable components and deploying them, we can also query which ones have been deployed already:

$ dysnomia-containers --query-activated-components
mysql-database/rooms
mysql-database/staff
mysql-database/zipcodes
tomcat-webapplication/GeolocationService
tomcat-webapplication/RoomService
tomcat-webapplication/StaffService
tomcat-webapplication/StaffTracker
tomcat-webapplication/ZipcodeService

The dysnomia-containers tool uses the set of available and activated components to make an upgrade more efficient -- when deploying a new system configuration, it will deactivate the activated components that are no longer available, and activate the available components that have not been activated yet. The components that are both in the old and new configuration remain untouched.
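
The idea behind this upgrade step can be illustrated with a small Nix sketch (for illustration only -- this is not how dysnomia-containers is implemented internally):

let
  # Hypothetical example sets of container/component pairs
  activatedComponents = [ "mysql-database/rooms" "tomcat-webapplication/RoomService" ];
  availableComponents = [ "mysql-database/rooms" "mysql-database/staff" ];

  inherit (builtins) filter elem;
in
{
  # Activated components that are no longer available must be deactivated
  componentsToDeactivate = filter (c: !(elem c availableComponents)) activatedComponents;

  # Available components that have not been activated yet must be activated
  componentsToActivate = filter (c: !(elem c activatedComponents)) availableComponents;

  # Components that appear in both sets remain untouched
}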

For example, if we would run dysnomia-containers --deploy again, then nothing will be deployed or undeployed as the configuration remained identical.

We can also take snapshots of all activated mutable components (for example, for backup purposes):

$ dysnomia-containers --snapshot

After running the above command, the Dysnomia snapshot utility may show you the following output:

$ dysnomia-snapshots --query-all
mysql-database/rooms/faede34f3bf658884020a31ca98f16503da9a90bf3313cc96adc5c2358c0b054
mysql-database/staff/e9af7042064c33379ba9fe9272f61986b5a85de63c57732f067695e499a3a18f
mysql-database/zipcodes/637faa3e79ec6c2db71ac4023e86f29890e54233ea6592680fd88481725d44a3

As may be noticed, for each MySQL database (we have three of them) we have taken a snapshot. (For the Apache Tomcat web applications, no snapshots have been taken because state management for these kinds of components is unsupported).

We can also restore the state from the snapshots that we just have taken:

$ dysnomia-containers --restore

The above command restores the state of all three databases.

Finally, as with services deployed by Disnix, deactivating a mutable component does not imply that its state is removed automatically. Instead, it is marked as garbage and must be explicitly removed by running:

$ dysnomia-containers --collect-garbage

NixOS integration


To actually make the previously shown deployment activities work, we need configuration files for all the containers and mutable components, and we must put them into locations that are reachable from the DYSNOMIA_CONTAINERS_PATH and DYSNOMIA_COMPONENTS_PATH environment variables.

Obviously, they can be written by hand (as demonstrated in my previous blog post about Dysnomia), but this is not always very practical to do on a system-level. Moreover, there is some repetition involved as a NixOS configuration and container configuration files capture common properties.

I have developed a Dysnomia NixOS module to automate Dysnomia's configuration through NixOS. It can be enabled by adding the following property to a NixOS configuration file:

dysnomia.enable = true;

We can specify container properties in a NixOS configuration file as follows:

dysnomia.containers = {
  mysql-database = {
    mysqlUsername = "root";
    mysqlPassword = "secret";
    mysqlPort = 3306;
  };
  tomcat-webapplication = {
    tomcatPort = 8080;
  };
  ...
};

The Dysnomia module generates a container configuration file for each attribute in the dysnomia.containers set (using the attribute name as the file name) and composes its contents from the corresponding sub attribute set by translating it into a text file with key=value pairs.
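
For example, the mysql-database properties shown above could roughly be translated as follows (a sketch for illustration only; the quoting and ordering in the files that the module actually generates may differ):

let
  lib = (import <nixpkgs> {}).lib;

  containerProperties = {
    mysqlUsername = "root";
    mysqlPassword = "secret";
    mysqlPort = 3306;
  };

  # Translate an attribute set into a string with key=value pairs, one per line
  toKeyValue = attrs:
    lib.concatMapStrings
      (key: "${key}=\"${toString attrs.${key}}\"\n")
      (builtins.attrNames attrs);
in
toKeyValue containerProperties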

Most of the dysnomia.containers properties can be automatically generated by the Dysnomia NixOS module as well, since most of them have already been specified elsewhere in a NixOS configuration. For example, by enabling MySQL in a Dysnomia-enabled NixOS configuration:

services.mysql = {
  enable = true;
  package = pkgs.mysql;
  rootPassword = ../configurations/mysqlpw;
};

The Dysnomia module automatically generates the corresponding container properties as shown previously. The Dysnomia NixOS module integrates with all NixOS features for which Dysnomia provides a plugin.

In addition to containers, we can also specify the available mutable components as part of a NixOS configuration:

dysnomia.components = {
  mysql-database = {
    rooms = pkgs.writeTextFile {
      name = "rooms";
      text = ''
        create table room
        ( Room     VARCHAR(10)    NOT NULL,
          Zipcode  VARCHAR(6)     NOT NULL,
          PRIMARY KEY(Room)
        );
      '';
    };
    staff = ...
    zipcodes = ...
  };

  tomcat-webapplication = {
    ...
  };
};

As can be observed in the above example, the dysnomia.components attribute set captures the available mutable components per container. For the mysql-database container, we have defined three databases: rooms, staff and zipcodes. Each attribute refers to a Nix build function that produces an SQL file representing the initial state of the database on first activation (typically a schema).

Besides MySQL databases, we can use the tomcat-webapplication attribute to automatically deploy Java web applications to the Apache Tomcat servlet container. The corresponding value of each mutable component refers to the result of a Nix build function that produces a Java web application archive (WAR file).
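
For example, a web application could be packaged as follows (a hypothetical sketch: the ./StaffService.war path and the webapps/ output layout are assumptions for illustration and may not exactly match what the tomcat-webapplication plugin expects):

dysnomia.components = {
  tomcat-webapplication = {
    # Hypothetical sketch: expose a prebuilt WAR file as a deployable
    # mutable component for the Apache Tomcat container.
    StaffService = pkgs.runCommand "StaffService" {} ''
      mkdir -p $out/webapps
      cp ${./StaffService.war} $out/webapps/StaffService.war
    '';
  };
};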

The Dysnomia module automatically composes a directory with symlinks referring to the generated mutable component configurations reachable through the DYSNOMIA_COMPONENTS_PATH environment variable.

Distributed infrastructure state management


In addition to deploying mutable components belonging to a single NixOS configuration, I have mapped the NixOS-level Dysnomia deployment concepts to networks of NixOS machines by extending the DisnixOS toolset (the Disnix extension integrating Disnix' service deployment concepts with NixOS' infrastructure deployment).

It may not have been stated explicitly in any of my previous blog posts, but DisnixOS can also be used to deploy a network of NixOS configurations to target machines in a network. For example, we can compose a networked NixOS configuration that includes the machine configuration shown previously:

{
  test1 = import ./configurations/mysql-tomcat.nix;
  test2 = import ./configurations/empty.nix;
}

The above configuration file is an attribute set defining two machine configurations. The first attribute (test1) refers to our previous NixOS configuration running MySQL and Apache Tomcat as system services.

We can deploy the networked configuration with the following command-line instruction:

$ disnixos-deploy-network network.nix

As a sidenote: although DisnixOS can deploy networks of NixOS configurations, NixOps does a better job in accomplishing this. Moreover, DisnixOS only supports deployment of NixOS configurations to bare-metal servers and cannot instantiate any VMs in the cloud.

Furthermore, what DisnixOS also does differently compared to NixOps is invoking Dysnomia to activate or deactivate NixOS configurations -- the corresponding Dysnomia plugin executes the big monolithic NixOS activation script for the activation step and runs nixos-rebuild --rollback switch for the deactivation step.

I have extended Dysnomia's nixos-configuration plugin with state management operations. Snapshotting the state of a NixOS configuration simply means running:

$ dysnomia-containers --snapshot

Likewise, restoring the state of a NixOS configuration can be done with:

$ dysnomia-containers --restore

And removing obsolete state with:

$ dysnomia-containers --collect-garbage

When using Disnix to manage state, we may have mutable components deployed as part of a system configuration and mutable components deployed as services in the same environment. To prevent the snapshots of the services from conflicting with the ones belonging to a machine's system configuration, we set the DYSNOMIA_STATEDIR environment variable to: /var/state/dysnomia-nixos for system-level state management and to /var/state/dysnomia for service-level state management to keep them apart.

With these additional operations, we can capture the state of all mutable components that are part of the system configurations in a network:

$ disnixos-snapshot-network network.nix

This yields a snapshot of the test1 machine stored in the Dysnomia snapshot store on the coordinator machine:

$ dysnomia-snapshots --query-latest
nixos-configuration/nixos-system-test1-16.03pre-git/4c4751f10648dfbbf8e25c924391e80913c8a6a600f7b481d73cd88ff3d32730

When inspecting the contents of the NixOS system configuration snapshot, we will observe:

$ cd /var/state/dysnomia/snapshots/$(dysnomia-snapshots --query-latest)
$ find -maxdepth 3 -mindepth 3 -type d
./mysql-database/rooms/faede34f3bf658884020a31ca98f16503da9a90bf3313cc96adc5c2358c0b054
./mysql-database/staff/e9af7042064c33379ba9fe9272f61986b5a85de63c57732f067695e499a3a18f
./mysql-database/zipcodes/637faa3e79ec6c2db71ac4023e86f29890e54233ea6592680fd88481725d44a3

The NixOS system configuration snapshot consists of the snapshots of all mutable components belonging to that system configuration.

Similar to restoring the state of individual mutable components, we can restore the state of all mutable components that are part of the system configurations in a network of machines:

$ disnixos-restore-network network.nix

And remove their obsolete state, by running:

$ disnixos-delete-network-state network.nix

TL;DR: Discussion


In this blog post, I have described an extension to Dysnomia that makes it possible to manage the state of mutable components belonging to a system configuration, and a NixOS module making it possible to automatically configure Dysnomia from a NixOS configuration file.

This new extension makes it possible to deploy mutable components belonging to systems that cannot be divided into distributable deployment units (or services in a Disnix-context), such as monolithic system configurations.

To summarize: if it is desired to manage the state of mutable components in a NixOS configuration, you need to provide a number of additional configuration settings. First, we must enable Dysnomia:

dysnomia.enable = true;

Then enable a number of container services, such as MySQL:

services.mysql.enable = true;

(As explained earlier, the Dysnomia module will automatically generate its corresponding container properties).

Finally, we can specify a number of available mutable components that can be deployed automatically, such as a MySQL database:

dysnomia.components = {
  mysql-database = {
    rooms = pkgs.writeTextFile {
      name = "rooms";
      text = ''
        create table room
        ( Room     VARCHAR(10)    NOT NULL,
          Zipcode  VARCHAR(6)     NOT NULL,
          PRIMARY KEY(Room)
        );
      '';
    };
  };
};

After deploying a Dysnomia-enabled NixOS system configuration through:

$ nixos-rebuild switch

We can deploy the mutable components belonging to it, by running:

$ dysnomia-containers --deploy

Unfortunately, managing mutable components on a system-level also has a huge drawback, in particular in distributed environments. Snapshots of entire system configurations are typically too coarse -- whenever the state of any of the mutable components change, a new system-level composite snapshot is generated that is composed of the snapshots of all mutable components.

Typically, these snapshots contain redundant data that is not shared among snapshot generations (although there are potential solutions to cope with this, I have not implemented any optimizations yet). As explained in my previous Dysnomia-related blog posts, snapshotting individual components (such as large databases) can already be quite expensive, and these costs may become significantly larger on a system level.

Likewise, restoring state on a system level implies that the state of all mutable components will be restored. This is also typically undesired as it may be too destructive and time consuming. Moreover, moving the state from one machine to another when a mutable component gets migrated is also much more expensive.

For more control and more efficient deployment of mutable components, it would typically be better to develop a Disnix service-model so that they can be managed individually.

Because of these drawbacks, I am not prominently advertising DisnixOS' distributed state management features. Moreover, I also did not attempt to integrate these features into NixOps, for the same reasons.

References


The dysnomia-containers tool as well as the distributed infrastructure management facilities have been integrated into the development versions of Dysnomia and DisnixOS, and will become part of the next Disnix release.

I have also added a sub example to the Java version of the Disnix staff tracker example to demonstrate how these features can be used.

As a final note, the Dysnomia NixOS module has not yet been integrated in NixOS. Instead, the module must be imported from a Dysnomia Git clone, by adding the following line to a NixOS configuration file:

imports = [ /home/sander/dysnomia/dysnomia-module.nix ];