Sander van der Burg's blog: August 2016

Monday, August 22, 2016

An extended self-adaptive deployment framework for service-oriented systems

Five years ago, while I was still in academia, I built an extension framework around Disnix (named: Dynamic Disnix) that enables self-adaptive redeployment of service-oriented systems. It was an interesting application as it demonstrated the full potential of service-oriented systems having their deployment processes automated with Disnix.

Moreover, the corresponding research paper was accepted for presentation at the SEAMS 2011 symposium (co-located with ICSE 2011) in Honolulu (Hawaii), which was (obviously!) a nice place to visit. :-)

Disnix's development was progressing at a very low pace for a while after I left academia, but since the end of 2014 I made some significant improvements. In contrast to the basic toolset, I did not improve Dynamic Disnix -- apart from the addition of a port assigner tool, I only kept the implementation in sync with Disnix's API changes to prevent it from breaking.

Recently, I have used Dynamic Disnix to give a couple of demos. As a result, I have improved some of its aspects a bit. For example, some basic documentation has been added. Furthermore, I have extended the framework's architecture to take a couple of new deployment planning aspects into account.

Disnix

For readers unfamiliar with Disnix: the primary purpose of the basic Disnix toolset is executing deployment processes of service-oriented systems. Deployments are driven by three kinds of declarative specifications:

The services model captures the services (distributed units of deployments) of which a system consists, their build/configuration properties and their inter-dependencies (dependencies on other services that may have to be reached through a network link).
The infrastructure model describes the target machines where services can be deployed to and their characteristics.
The distribution model maps services in the services model to machines in the infrastructure model.

By writing instances of the above specifications and running disnix-env:

$ disnix-env -s services.nix -i infrastructure.nix -d distribution.nix

Disnix executes all activities to get the system deployed, such as building their services from source code, distributing them to the target machines in the network and activating them. Changing any of these models and running disnix-env again causes the system to be upgraded. In case of an upgrade, Disnix will only execute the required activities making the process more efficient than deploying a system from scratch.

"Static" Disnix

So, what makes Disnix's deployment approach static? When looking at software systems from a very abstract point of view, they are supposed to meet a collection of functional and non-functional requirements. A change in a network of machines affects the ability for a service-oriented system to meet them, as the services of which these systems consist are typically distributed.

If a system relies on a critical component that has only one instance deployed and the machine that hosts it crashes, the functional requirements can no longer be met. However, even if we have multiple instances of the same components giving better guarantees that no functional requirements will be broken, important non-functional requirements may be affected, such as the responsiveness of a system.

We may also want to optimize a system's non-functional properties, such as its responsiveness, by adding more machines to the network that offer more system resources, or by changing the configuration of existing machine, e.g. upgrading the amount available RAM.

The basic Disnix toolset is considered static, because all these events events require manual modifications to the Disnix models for redeployment, so that a system can meet its requirements under the changed conditions.

For simple systems, manual reconfiguration is still doable, but with one hundred services, one hundred machines or a high frequency of events (or a combination of the three), it becomes too complex and time consuming.

For example, when a machine has been added or removed, we must rewrite the distribution model in such a way that all services are deployed to at least one machine and that none of them are mapped to machines that are not capable or allowed to host them. Furthermore, with microservices (one of their traits is that they typically embed HTTP servers), we must typically bind them to unique TCP ports that do not conflict with system services or other services deployed by Disnix. None of these configuration aspects are trivial for large service-oriented systems.

Dynamic Disnix

Dynamic Disnix extends Disnix's architecture with additional models and tools to cope with the dynamism of service oriented-systems. In the latest version, I have extended its architecture (which has been based on the old architecture described in the SEAMS 2011 paper and corresponding blog post):

The above diagram shows the structure of the dydisnix-self-adapt tool. The ovals denote command-line utilities, the rectangles denote files and the arrows denote files as inputs or outputs. As with the basic Disnix toolset, dydisnix-self-adapt is composed of command-line utilities each being responsible for executing an individual deployment activity:

On the top right, the infrastructure generator is shown that captures the configurations of the machines in the network and generates an infrastructure model from it. Currently, two different kinds of generators can be used: disnix-capture-infra (included with the basic toolset) that uses a bootstrap infrastructure model with connectivity settings, or dydisnix-geninfra-avahi that uses multicast DNS (through Avahi) to retrieve the machines' properties.
dydisnix-augment-infra is responsible for augmenting the generated infrastructure model with additional settings, such as passwords. It is typically undesired to automatically publish privacy-sensitive settings over a network using insecure connection protocols.
disnix-snapshot can be optionally used to preemptively capture the state of all stateful services (services with property: deployState = true; in the services model) so that the state of these services can be restored if a machine crashes or disappears. This tool is new in the extended architecture.
dydisnix-gendist generates a mapping of services to machines based on technical and non-functional properties defined in the services and infrastructure models.
dydisnix-port-assign assigns unique TCP port numbers to previously undeployed services and retains assigned TCP ports in a previous deployment for optimization purposes. This tool is new in the extended architecture.
disnix-env redeploys the system with the (statically) provided services model and the dynamically generated infrastructure and distribution models.

An example usage scenario

When a system has been configured to be (statically) deployed with Disnix (such as the infamous StaffTracker example cases that come in several variants), we need to add a few additional deployment specifications to make it dynamically deployable.

Auto discovering the infrastructure model

First, we must configure the machines in such a way that they publish their own configurations. The basic toolset comes with a primitive solution called: disnix-capture-infra that does not require any additional configuration -- it consults the Disnix service that is installed on every target machine.

By providing a simple bootstrap infrastructure model (e.g. infrastructure-bootstrap.nix) that only provides connectivity settings:

{
  test1.properties.hostname = "test1";
  test2.properties.hostname = "test2";
}

and running disnix-capture-infra, we can obtain the machines' configuration properties:

$ disnix-capture-infra infrastructure-bootstrap.nix

By setting the following environment variable, we can configure Dynamic Disnix to use the above command to capture the machines' infrastructure properties:

$ export DYDISNIX_GENINFRA="disnix-capture-infra infrastructure-bootstrap.nix"

Alternatively, there is the Dynamic Disnix Avahi publisher that is more powerful, but at the same time much more experimental and unstable than disnix-capture-infra.

When using Avahi, each machine uses multicast DNS (mDNS) to publish their own configuration properties. As a result, no bootstrap infrastructure model is needed. Simply gathering the data published by the machines on the same subnet suffices.

When using NixOS on a target machine, the Avahi publisher can be enabled by cloning the dydisnix-avahi Git repository and adding the following lines to /etc/nixos/configuration.nix:

imports = [ /home/sander/dydisnix/dydisnix-module.nix ];
services.dydisnixAvahiTest.enable = true;

To allow the coordinator machine to capture the configurations that the target machines publish, we must enable the Avahi system service. In NixOS, this can be done by adding the following lines to /etc/nixos/configuration.nix:

services.avahi.enable = true;

When running the following command-line instruction, the machines' configurations can be captured:

$ dydisnix-geninfra-avahi

Likewise, when setting the following environment variable:

$ export DYDISNIX_GENINFRA=dydisnix-geninfra-avahi

Dynamic Disnix uses the Avahi-discovery service to obtain an infrastructure model.

Writing an augmentation model

The Java version of StaffTracker for example uses MySQL to store data. Typically, it is undesired to publish the authentication credentials over the network (in particular with mDNS, which is quite insecure). We can augment these properties to the captured infrastructure model with the following augmentation model (augment.nix):

{infrastructure, lib}:

lib.mapAttrs (targetName: target:
  target // (if target ? containers && target.containers ? mysql-database then {
    containers = target.containers // {
      mysql-database = target.containers.mysql-database //
        { mysqlUsername = "root";
          mysqlPassword = "secret";
        };
    };
  } else {})
) infrastructure

The above model implements a very simple password policy, by iterating over each target machine in the discovered infrastructure model and adding the same mysqlUsername and mysqlPassword property when it encounters a MySQL container service.

Mapping services to machines

In addition to a services model and a dynamically generated (and optionally augmented) infrastructure model, we must map each service to machine in the network using a configured strategy. A strategy can be programmed in a QoS model, such as:

{ services
, infrastructure
, initialDistribution
, previousDistribution
, filters
, lib
}:

let
  distribution1 = filters.mapAttrOnList {
    inherit services infrastructure;
    distribution = initialDistribution;
    serviceProperty = "type";
    targetPropertyList = "supportedTypes";
  };

  distribution2 = filters.divideRoundRobin {
    distribution = distribution1;
  };
in
distribution2

The above QoS model implements the following policy:

First, it takes the initialDistribution model that is a cartesian product of all services and machines. It filters the machines on the relationship between the type attribute and the list of supportedTypes. This ensures that services will only be mapped to machines that can host them. For example, a MySQL database should only be deployed to a machine that has a MySQL DBMS installed.
Second, it divides the services over the candidate machines using the round robin strategy. That is, it divides services over the candidate target machines in equal proportions and in circular order.

Dynamically deploying a system

With the services model, augmentation model and QoS model, we can dynamically deploy the StaffTracker system (without manually specifying the target machines and their properties, and how to map the services to machines):

$ dydisnix-env -s services.nix -a augment.nix -q qos.nix

The Node.js variant of the StaffTracker example requires unique TCP ports for each web service and web application. By providing the --ports parameter we can include a port assignment specification that is internally managed by dydisnix-port-assign:

$ dydisnix-env -s services.nix -a augment.nix -q qos.nix --ports ports.nix

When providing the --ports parameter, the specification gets automatically updated when ports need to be reassigned.

Making a system self-adaptable from a deployment perspective

With dydisnix-self-adapt we can make a service-oriented system self-adaptable from a deployment perspective -- this tool continuously monitors the network for changes, and runs a redeployment when a change has been detected:

$ dydisnix-self-adapt -s services.nix -a augment.nix -q qos.nix

For example, when shutting down a machine in the network, you will notice that Dynamic Disnix automatically generates a new distribution and redeploys the system to get the missing services back.

Likewise, by adding the ports parameter, you can include port assignments as part of the deployment process:

$ dydisnix-self-adapt -s services.nix -a augment.nix -q qos.nix --ports ports.nix

By adding the --snapshot parameter, we can preemptively capture the state of all stateful services (services annotated with deployState = true; in the services model), such as the databases in which the records are stored. If a machine hosting databases disappears, Disnix can restore the state of the databases elsewhere.

$ dydisnix-self-adapt -s services.nix -a augment.nix -q qos.nix --snapshot

Keep in mind that this feature uses Disnix's snapshotting facilities, which may not be the best solution to manage state, in particular with large databases.

Conclusion

In this blog post, I have described an extended architecture of Dynamic Disnix. In comparison to the previous version, a port assigner has been added that automatically provides unique port numbers to services, and the disnix-snapshot utility that can preemptively capture the state of services, so that they can be restored if a machine disappears from the network.

Despite the fact that Dynamic Disnix has some basic documentation and other improvements from a usability perspective, Dynamic Disnix remains a very experimental prototype that should not be used for any production purposes. In contrast to the basic toolset, I have only used it for testing/demo purposes and I still have no real-life production experience with it. :-)

Moreover, I still have no plans to officially release it yet as many aspects still need to be improved/optimized. For now, you have to obtain the Dynamic Disnix source code from Github and use the included release.nix expression to install it. Furthermore, you probably need to a lot of courage. :-)

Finally, I have extended the Java and Node.js versions of the StaffTracker example as well as the virtual hosts example with simple augmentation and QoS models.

Wednesday, August 3, 2016

Porting node-simple-xmpp from the Node.js ecosystem to Titanium

As may have become obvious by reading some of my previous blog posts, I am frequently using JavaScript for a variety of programming purposes. Although JavaScript was originally conceived as a programming language for use in web browsers (mainly to make web pages more interactive), it is also becoming increasingly more popular in environments outside the browser, such as Node.js, a runtime environment for developing scalable network applications and Appcelerator Titanium, an environment to develop cross-platform mobile applications.

Apart from the fact that these environments share a common programming language -- JavaScript -- and a number of basic APIs that come with the language, they all have their own platform-specific APIs to implement most of an application's basic functionality.

Moreover, they have their own ecosystem of third-party software packages. For example, in Node.js the NPM package manager is the ubiquitous way of publishing and obtaining software. For web browsers, bower can be used, although its adoption is not as widespread as NPM.

Because of these differences, reuse between JavaScript environments is not optimal, in particular for packages that have dependencies on functionality that is not part of JavaScript's core API.

In this blog post, I will describe our experiences with porting the simple-xmpp from the Node.js ecosystem to Titanium. This library has dependencies on Node.js-specific APIs, but with our porting strategy we were able to integrate it in our Titanium app without making any modifications to the original package.

Motivation: Adding chat functionality

Not so long ago, me and my colleagues have been looking into extending our mobile app product-line with chat functionality. As described in an earlier blog post, we use Titanium for developing our mobile apps and one of my responsibilities is automating their builds with Nix (and Hydra, the Nix-based continuous integration server).

Developing chat functionality is quite complex, and requires one to think about many concerns, such as the user-experience, security, reliability and scalability. Instead of developing a chat infrastructure from scratch (which would be much too costly for us), we have decided to adopt the XMPP protocol, for the following reasons:

Open standard. Everyone is allowed to make software implementing aspects of the XMPP standard. There are multiple server implementations and many client libraries available, in many programming languages including JavaScript.
Decentralized. Users do not have to connect to a single server -- a server can relay messages to users connected to another server. A decentralized approach is good for reliability and scalability.
Web-based. The XMPP protocol is built on technologies that empower the web (XML and HTTP). The fact that HTTP is used as a transport protocol, means that we can also support clients that are behind a proxy server.
Mature. XMPP has been in use for a quite some time and has some very prominent users. Currently, Sony uses it to enrich the PlayStation platform with chat functionality. In the past, it was also used as the basis for Google and Facebook's chat infrastructure.

Picking a server implementation was not too hard, as ejabberd was something I had experience with in my previous job (and as an intern at Philips Research) -- it supports all the XMPP features we need, and has proven to be very mature.

Unfortunately, for Titanium, there was no module available that implements XMPP client functionality, except for an abandoned project named titanium-xmpp that is no longer supported, and no longer seems to work with any recent versions of Titanium.

Finding a suitable library

As there was no working XMPP client module available for Titanium and we consider developing such a client for Titanium from scratch too costly, we first tried to fix titanium-xmpp, but it turned out that too many things were broken. Moreover, it used all kinds of practices (such as an old fashioned way of module loading through Ti.include()) that have been deprecated a long time ago.

Then we tried porting other JavaScript-based libraries to Titanium. The first candidate was strophe.js which is mainly browser-oriented (and can be used in Node.js through phantomjs, an environment providing a non-interactive web technology stack), but had too many areas that had to be modified and browser-specific APIs that require substitutes.

Finally, I discovered node-xmpp, an XMPP framework for Node.js that has a very modular architecture. For example, the client and server aspects are very-well separated as well as the XML parsing infrastructure. Moreover, we discovered simple-xmpp, a library built on top of it to make a number of common tasks easier to implement. Moreover, the infrastructure has been ported to web browsers using browserify.

Browserify is an interesting porting tool -- its main purpose is to provide a replacement for the CommonJS module system, which is a first-class citizen in Node.js, but unavailable in the browser. Browserify statically analyzes closures of CommonJS modules, and packs them into a single JavaScript file so that the module system is no longer needed.

Furthermore, browserify provides browser-equivalent substitutes for many core Node.js APIs, such as events, stream and path, making it considerably easier to migrate software from Node.js to the browser.

Porting simple-xmpp to Titanium

In addition to browserify, there also exists a similar approach for Titanium: titaniumifier, that has been built on top of the browserify architecture.

Similar to browserify, titaniumifier also packs a collection of CommonJS modules into a single JavaScript file. Moreover, it constructs a Titanium module from it, packs it into a Zip file that can be distributed to any Titanium developer so that it can be used by simply placing it into the root folder of the app project and adding the following module requirement to tiapp.xml:

<module platform="commonjs">ti-simple-xmpp</module>

Furthermore, it provides Titanium-equivalent substitute APIs for Node.js core APIs, but its library is considerably more slim and incomplete than browserify.

We can easily apply titatiumifier to simple-xmpp, by creating a new NPM project (package.json file) that has a dependency on simple-xmpp:

{
  "name": "ti-simple-xmpp",
  "version": "0.0.1",
  "dependencies": {
    "simple-xmpp": "1.3.x"
  }
}

and a proxy CommonJS module (index.js) that simply exposes the Simple XMPP module:

exports.SimpleXMPP = require('simple-xmpp');

After installing the project dependencies (simple-xmpp only) with:

$ npm install

We can attempt to migrate it to Titanium, by running the following command-line instruction:

$ titaniumifier

In my first titaniumify attempt, the tool says that some mandatory Titanium specific properties, such as a unique GUID identifier, are missing that need to be added to package.json:

"titaniumManifest": {
    "guid": "76cb731c-5abf-3b79-6cde-f04202e9ea6d"
},

After adding the missing GUID property, a CommonJS titanium module gets produced that we can integrate in any Titanium project we want:

$ titaniumifier
$ ls
ti-simple-xmpp-commonjs-0.0.1.zip

Fixing API mismatches

With our generated CommonJS package, we can start experimenting by creating a simple app that only connects to a remote XMPP server, by adding the following lines to a Titanium app's entry module (app.js):

var xmpp = require('ti-simple-xmpp').SimpleXMPP;

xmpp.connect({
    websocket: { url: 'ws://myserver.com:5280/websocket/' },
    jid : 'username@myserver.com',
    password : 'password',
    reconnect: true,
    preferred: 'PLAIN',
    skipPresence: false
});

In our first trial run, the app crashed with the following error message:

Object prototype may only be an Object or null

This problem seemed to be caused by the following line that constructs an object with a prototype:

ctor.prototype = Object.create(superCtor.prototype, {

After adding a couple of debugging statements in front of the Object.create() line that print the constructor and the prototype's properties, I noticed that it tries to instantiate a stream object (not a constructor function) without a prototype member. Referring to a prototype that is undefined, is apparently not allowed.

Deeper inspection revealed the following code block:

/*<replacement>*/
var Stream;
(function() {
  try {
    Stream = require('st' + 'ream');
  } catch (_) {} finally {
    if(!Stream) Stream = require('events').EventEmitter;
  }
})();
/*</replacement>*/

The above code block attempts to load the stream module, and provides the event emitter as a fallback if it cannot be loaded. The stream string has been scrambled to prevent browserify to statically bundle the module. It appears that the titaniumifier tool provides a very simple substitute that is an object. As a result, it does not use the event emitter as a fallback.

We can easily fix the stream object's prototype problem, by supplying it with an empty prototype property by creating a module (overrides.js) that modifies it:

try {
    var stream = require('st' + 'ream');
    stream.prototype = {};
} catch(ex) {
    // Just ignore if it didn't work
}

and by importing the overrides module in the index module (index.js) before including simple-xmpp:

exports.overrides = require('./overrides');
exports.SimpleXMPP = require('simple-xmpp');

After fixing the prototype problem, the next trial run crashed the app with the following error message:

undefined is not an object (evaluation process.version.slice)

which seemed to be caused by the following line:

var asyncWrite = !process.browser && [ 'v0.10', 'v0.9.'].indexOf(process.version.slice(0, 5)) > -1 ? setImmediate : processNextTick;

Apparently, titaniumifier does not provide any substitute for process.version and as a result invoking the slice member throws an exception. Luckily, we can circumvent this by making sure that process.browser yields true, by adding the following line to the overrides module (overrides.js):

process.browser = true;

The third trial run crashed the app with the following message:

Can't find variable: XMLHttpRequest at ti-simple.xmpp.js (line 1943)

This error is caused by the fact that there is no XMLHttpRequest object -- an API that a web browser would normally provide. I have found a Titanium-based XHR implementation on GitHub that provides an identical API.

By copying the xhr.js file into our project, wrapping it in a module (XMLHttpRequest.js), we can provide a constructor that is identical to the browser API:

exports.XMLHttpRequest = require('./xhr');

global.XMLHttpRequest = module.exports;

By adding it to our index module:

exports.overrides = require('./overrides');
exports.XMLHttpRequest = require('./XMLHttpRequest');
exports.SimpleXMPP = require('simple-xmpp');

we have provided a substitute for the XMLHttpRequest API that is identical to a browser.

In the fourth run, the app crashed with the following error message:

Can't find variable: window at ti-simple-xmpp.js (line 1789)

which seemed to be caused by the following line:

var WebSocket = require('faye-websocket') && require('faye-websocket').Client ? require('faye-websocket').Client : window.WebSocket

Apparently, there is no window object nor a WebSocket constructor, as they are browser-specific and not substituted by titaniumifier.

Fortunately, there seems to be a Websocket module for Titanium that works both on iOS and Android. The only inconvenience is that its API is similar, but not exactly identical to the browser's WebSocket API. For example, creating a WebSocket in the browser is done as follows:

var ws = new WebSocket("ws://localhost/websocket");

whereas with the TiWS module, it must be done as follows:

var tiws = require("net.iamyellow.tiws");

var ws = tiws.open("ws://localhost/websocket");

These differences make it very tempting to manually fix the converted simple XMPP library, but fortunately we can create an adapter that has an identical interface to the browser's WebSocket API, translating calls to the Titanium WebSockets module:

var tiws = require('net.iamyellow.tiws');

function WebSocket() {
    this.ws = tiws.createWS();
    var url = arguments[0];
    this.ws.open(url);

    var self = this;
    
    this.ws.addEventListener('open', function(ev) {
        self.onopen(ev);
    });
    
    this.ws.addEventListener('close', function() {
        self.onclose();
    });
    
    this.ws.addEventListener('error', function(err) {
        self.onerror(err);
    });
    
    this.ws.addEventListener('message', function(ev) {
        self.onmessage(ev);
    });
}

WebSocket.prototype.send = function(message) {
    return this.ws.send(message);
};

WebSocket.prototype.close = function() {
    return this.ws.close();
};

if(global.window === undefined) {
    global.window = {};
}

global.window.WebSocket = module.exports = WebSocket;

Adding the above module to the index module (index.js):

exports.overrides = require('./overrides');
exports.XMLHttpRequest = require('./XMLHttpRequest');
exports.WebSocket = require('./WebSocket');
exports.SimpleXMPP = require('simple-xmpp');

seems to be the last missing piece in the puzzle. In the fifth attempt, the app seems to properly establish an XMPP connection. Coincidentally, all the other chat functions also seem to work like a charm! Yay! :-)

Conclusion

In this blog post, I have described a process in which I have ported simple-xmpp from the Node.js ecosystem to Titanium. The process was mostly automated, followed by a number of trial, error and fix runs.

The fixes we have applied are substitutes (identical APIs for Titanium), adapters (modules that translate calls to a particular API into a calls to a Titanium-specific API) and overrides (imperative modifications to existing modules). These changes did not require me to modify the original package (the original package is simply a dependency of the ti-simple-xmpp project). As a result, we do not have to maintain a fork and we have only little maintenance on our side.

Limitations

Although the porting approach seems to fit our needs, there are a number of things missing. Currently, only XMPP over WebSocket connections are supported. Ordinary XMPP connections require a Titanium-equivalent replacement for Node.js' net.Socket API, which is completely missing.

Moreover, the Titanium WebSockets library has some minor issues. The first time we tested a secure web socket wss:// connection, the app crashed on iOS. Fortunately, this problem has been fixed now.

References

The ti-simple-xmpp package can be obtained from GitHub. Moreover, I have created a bare bones Alloy/Titanium-based example chat app (XMPPTestApp) exposing most of the library's functionality. The app can be used on both iOS and Android:

Acknowledgements

The work described in this blog post is a team effort -- Yiannis Tsirikoglou first attempted to port strophe.js and manually ported simple-xmpp to Titanium before I managed to complete the automated approach described in this blog post. Carlos Henrique Lustosa Zinato provided us Titanium-related advice and helped us diagnosing the TiWS module problems.