Monday, August 12, 2019

A new input model transformation pipeline for Disnix

As explained in earlier blog posts, Disnix (as well as other tools in the Nix project) is driven by declarative specifications -- instead of describing the activities that need to be carried out to deploy a system (such as building and distributing packages), we specify all the relevant properties of a service-oriented system:

  • The services model describes all the services that can be deployed to target machines in a network, how they can be built from their sources, how they depend on each other and what their types are, so that the deployment system knows how they can be activated.
  • The infrastructure model captures all target machines in the network, their properties, and the containers they provide. Containers in a Disnix-context are services that manage the life-cycle of a component, such as an application server, service manager or database management system (DBMS).
  • The distribution model maps services to containers on the target machines.

By running the following command-line instruction:

$ disnix-env -s services.nix -i infrastructure.nix -d distribution.nix

Disnix infers all the activities that need to be executed to get the system in a running state, such as building packages from source code (or downloading substitutes from a binary cache), distributing packages, activating the system, and taking and restoring state snapshots.

Conceptually, this approach may sound very simple, but the implementation that infers the deployment process is not. Whilst the input models are declarative, they are not executable -- there is no one-to-one mapping between properties in the input models and the activities that Disnix needs to carry out.

To be able to execute deployment activities, Disnix transforms the three input models into a single declarative specification (called a deployment manifest file) that contains one-to-one mappings between deployment artifacts (e.g. Nix profiles, Nix packages and snapshots) and deployment targets (the target machines and/or container services). The transformation pipeline fills in the blanks with default settings, and transforms the input models into several intermediate representations, before producing the manifest file.

So far, the intermediate representations and final result were never well defined. Instead, they have organically evolved and were heavily revised several times. As a result of adding new features and not having well defined representations, it became very hard to make changes and reason about the correctness of the models.

In my previous blog post, I described libnixxml, a library that I have developed to make the integration between a data model defined in the Nix expression language and external tools (that implement deployment activities that Nix does not support) more convenient. I am primarily using this library to simplify the integration of manifest files with the Disnix tools.

As an additional improvement, I have revised the transformation pipeline, with well-defined intermediate representations. Besides a better quality transformation pipeline with well-defined intermediate stages, the Disnix toolset can now also take the intermediate model representations as input parameters, which is quite convenient for integration with external tooling and experimentation purposes. Furthermore, a new input model has been introduced.

In this blog post, I will describe the steps in the transformation pipeline, and the intermediate representations of the deployment models.

Separated concerns: services, infrastructure, distribution models


As explained earlier in this blog post, Disnix deployments are primarily driven by three input models: the services, infrastructure and distribution models. The reason why I have picked three input models (as opposed to a single configuration file) is to separate concerns and allow these concerns to be reused in different kinds of deployment scenarios.

For example, we can write a simple services model (services.nix) that describes two services that have an inter-dependency on each other:

{distribution, invDistribution, system, pkgs}:

let customPkgs = import ../top-level/all-packages.nix { 
  inherit system pkgs;
};
in
rec {
  HelloMySQLDB = {
    name = "HelloMySQLDB";
    pkg = customPkgs.HelloMySQLDB;
    dependsOn = {};
    type = "mysql-database";
  };

  HelloDBService = {
    name = "HelloDBService";
    pkg = customPkgs.HelloDBServiceWrapper;
    dependsOn = {
      inherit HelloMySQLDB;
    };
    type = "tomcat-webapplication";
  };
}

The above services model captures two services with the following properties:

  • The HelloMySQLDB service refers to a MySQL database backend that stores data. The type property: mysql-database specifies which Dysnomia module should be used to manage the lifecycle of the service. For example, the mysql-database Dysnomia module will create the database on initial startup.
  • The HelloDBService is a web service that exposes the data stored in the database backend to the outside world. Since it requires the presence of a MySQL database backend and needs to know where it has been deployed, the database backend has been declared as an inter-dependency of the service (by means of the dependsOn attribute).

    The tomcat-webapplication type specifies that Disnix should use the Apache Tomcat Dysnomia module, to activate the corresponding Java-based web service inside the Apache Tomcat servlet container.

The services model captures the aspects of a service-oriented system from a functional perspective, without exposing much of the details of the environments they may run in. This is intentional -- the services are meant to be deployed to a variety of environments. Target agnostic services make it possible, for example, to write an infrastructure model of a test environment (infrastructure-test.nix):

{
  test1 = {
    properties = {
      hostname = "test1.example.org";
    };
    
    containers = {
      tomcat-webapplication = {
        tomcatPort = 8080;
      };
    };
  };
  
  test2 = {
    properties = {
      hostname = "test2.example.org";
    };
    
    containers = {
      tomcat-webapplication = {
        tomcatPort = 8080;
      };
      
      mysql-database = {
        mysqlPort = 3306;
        mysqlUsername = "mysqluser";
        mysqlPassword = builtins.readFile ./mysqlpw;
      };
    };
  };
}

and a distribution model that maps the services to the target machines in the infrastructure model (distribution-test.nix):

{infrastructure}:

{
  HelloMySQLDB = [ infrastructure.test2 ];
  HelloDBService = [ infrastructure.test1 ];
}

With these three deployment models, we can deploy a system to a test environment, by running:

$ disnix-env -s services.nix \
  -i infrastructure-test.nix \
  -d distribution-test.nix

and later switch to a production environment using the same functional services model, after the system has been properly validated in the test environment:

$ disnix-env -s services.nix \
  -i infrastructure-prod.nix \
  -d distribution-prod.nix

Similarly, we can adjust the distribution model to only deploy a subset of the services of a system for, say, experimentation purposes.
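For example, a trimmed down variant of the distribution model shown above that only maps the database backend (leaving HelloDBService out) could look as follows:

{infrastructure}:

{
  # only the database backend is deployed; HelloDBService is omitted
  HelloMySQLDB = [ infrastructure.test2 ];
}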

Unifying the input models into a single specification: the deployment architecture model


The first step in transforming the input models into a single executable specification is unifying them into one declarative specification, that I will call the deployment architecture model. The name is derived from the concept of deployment architectures in software architecture terminology:
a description that specifies the distribution of software components over hardware nodes.

A Disnix deployment architecture model may look as follows:

{system, pkgs}:

let customPkgs = import ../top-level/all-packages.nix { 
  inherit system pkgs;
};
in
rec {
  services = rec {
    HelloMySQLDB = {
      name = "HelloMySQLDB";
      pkg = customPkgs.HelloMySQLDB;
      dependsOn = {};
      type = "mysql-database";

      targets = [ infrastructure.test2 ];
    };

    HelloDBService = {
      name = "HelloDBService";
      pkg = customPkgs.HelloDBServiceWrapper;
      dependsOn = {
        inherit HelloMySQLDB;
      };
      type = "tomcat-webapplication";

      targets = [ infrastructure.test1 ];
    };
  };

  infrastructure = {
    test1 = {
      properties = {
        hostname = "test1.example.org";
      };
    
      containers = {
        tomcat-webapplication = {
          tomcatPort = 8080;
        };
      };
    };
  
    test2 = {
      properties = {
        hostname = "test2.example.org";
      };
    
      containers = {
        tomcat-webapplication = {
          tomcatPort = 8080;
        };
      
        mysql-database = {
          mysqlPort = 3306;
          mysqlUsername = "mysqluser";
          mysqlPassword = builtins.readFile ./mysqlpw;
        };
      };
    };
  };
}

The above deployment architecture model has the following properties:

  • The services and infrastructure models are unified into a single attribute set in which the services attribute refers to the available services and the infrastructure attribute to the available deployment targets.
  • The separated distribution concern is completely eliminated -- the mappings in the distribution model are attached to the corresponding services, by means of the targets attribute. The transformation step checks whether a targets property has already been specified, and if not, it uses the targets in the distribution model as the deployment targets of the service.

    The fact that the targets attribute is never overridden also makes it possible to specify the targets in the services model already, if desired.

In addition to the three deployment models, it is now also possible for an end-user to write a deployment architecture model and use that to automate deployments. The following command-line instruction will deploy a service-oriented system from a deployment architecture model:

$ disnix-env -A architecture.nix

Normalizing the deployment architecture model


Unifying models into a single deployment architecture specification is a good first step in producing an executable specification, but more needs to be done to fully reach that goal.

There are certain deployment properties that are unspecified in the examples shown earlier. For some configuration properties, Disnix provides reasonable default values, such as:

  • Each service can indicate whether it wants its state to be managed by Dysnomia (with the property deployState), so that data will automatically be migrated when moving the service from one machine to another. The default setting is false and can be overridden with the --deploy-state parameter.

    If a service does not specify this property, then Disnix will automatically propagate the default setting as a parameter (see the sketch after this list).
  • Every target machine in the infrastructure model also has specialized settings for connecting to it, for building packages and for running tasks concurrently:

    test2 = {
      properties = {
        hostname = "test2.example.org";
      };
        
      containers = {
        tomcat-webapplication = {
          tomcatPort = 8080;
        };

        mysql-database = {
          mysqlPort = 3306;
          mysqlUsername = "mysqluser";
          mysqlPassword = builtins.readFile ./mysqlpw;
        };
      };

      clientInterface = "disnix-ssh-client";
      targetProperty = "hostname";
      numOfCores = 1;
      system = "x86_64-linux";
    };
    

    If none of these advanced settings are provided, Disnix will assume that every target machine has the same system architecture (system) as the coordinator machine (so that the Nix package manager does not have to delegate builds to a machine with a compatible architecture), that the Disnix SSH client executable (disnix-ssh-client) is used as the client interface (clientInterface) to connect to the target machine (using the hostname property as a connection string), and that only one activity is run per target machine concurrently (numOfCores).
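As an illustration of the first point above, the following sketch shows how a service in the services model could opt in to state management with the deployState property. The surrounding service definition is taken from the earlier example; only the deployState line is new:

HelloMySQLDB = {
  name = "HelloMySQLDB";
  pkg = customPkgs.HelloMySQLDB;
  dependsOn = {};
  type = "mysql-database";

  # let Dysnomia take and restore state snapshots when this service moves
  deployState = true;
};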

In addition to unspecified properties (that need to be augmented with default values), we also have properties that are abstract specifications. These specifications need to be translated into more concrete representations:

  • As explained in an older blog post, the targets property -- that maps services to targets -- does not only map services to machines, but also to container services hosted on those machines. In most cases, you will only use one container instance per service type -- for example, running two MySQL DBMS instances (e.g. one on TCP port 3306 and another on 3307) is a far less common scenario.

    If no container mapping is provided, Disnix will do an auto-mapping to a container service that corresponds to the service's type property.

    The HelloMySQLDB service's targets property shown in the last deployment architecture model gets translated into the following property:

    {system, pkgs}:
    
    rec
    {
      services = rec {
        HelloMySQLDB = {
          name = "HelloMySQLDB";
          ...
    
          targets = [
            rec {
              selectedContainer = "mysql-database";
    
              container = {
                mysqlPort = 3306;
                mysqlUsername = "mysqluser";
                mysqlPassword = builtins.readFile ./mysqlpw;
              };
    
              properties = {
                hostname = "test2.example.org";
              };
    
              clientInterface = "disnix-ssh-client";
              targetProperty = "hostname";
              numOfCores = 1;
              system = "x86_64-linux";
            }
          ];
        };
      };
    
      infrastructure = ...
    }
    

    As may be observed, the target provides a selectedContainer property to indicate to which container the service needs to be deployed. The properties of all the other containers, which the service does not need to know about, are discarded.
  • Other properties that need to be extended are the inter-dependency specifications (dependsOn and connectsTo). Typically, inter-dependency specifications are only specified on a functional level -- a service typically only specifies that it depends on another service, disregarding the location where that service may have been deployed.

    If no target location is specified, then Disnix will assume that the service has an inter-dependency on all possible locations where that dependency may be deployed. If an inter-dependency is redundantly deployed, then that service also has an inter-dependency on all redundant replicas.

    The fact that it is also possible to specify the targets of the inter-dependencies makes it possible to optimize certain deployments. For example, you can optimize a service's performance by forcing it to bind to an inter-dependency that is deployed to the same target machine, so that it will not be affected by slow network connectivity.

    The dependsOn property of the HelloDBService will translate to:

    dependsOn = {
      HelloMySQLDB = {
        name = "HelloMySQLDB";
        pkg = customPkgs.HelloMySQLDB;
        dependsOn = {};
        type = "mysql-database";
    
        targets = [
          {
            selectedContainer = "mysql-database";
    
            container = {
              mysqlPort = 3306;
              mysqlUsername = "mysqluser";
              mysqlPassword = builtins.readFile ./mysqlpw;
            };
    
            properties = {
              hostname = "test2.example.org";
            };        
          }
        ];
      };
    };
    

    In the above code fragment, the inter-dependency has been augmented with a targets property corresponding to the targets where that inter-dependency has been deployed to.


The last ingredient to generate an executable specification is building the services from source code so that we can map their build results to the target machines. To accomplish this, Disnix generates two invisible helper attributes for each service:

HelloDBService = {
  name = "HelloDBService";
  pkg = customPkgs.HelloDBServiceWrapper;
  dependsOn = {
    inherit HelloMySQLDB;
  };
  type = "tomcat-webapplication";

  ...

  _systemsPerTarget = [ "x86_64-linux" "x86_64-darwin" ];
  _pkgsPerSystems = {
    "x86_64-linux" = "/nix/store/91abq...-HelloDBService";
    "x86_64-darwin" = "/nix/store/f1ap2...-HelloDBService";
  };
};

The above code example shows the two "hidden" properties augmented to the HelloDBService:

  • The _systemsPerTarget attribute specifies for which CPU architectures/operating systems the service must be built. Normally, services are target agnostic and should always yield the same Nix store path (with a build that is nearly bit-identical), but the system architecture of the target machine is an exception to this property -- the same service can be deployed to different CPU architectures/operating systems, and in such cases the build results may differ.
  • The _pkgsPerSystems attribute specifies, for each system architecture, the Nix store path to the build result. A side effect of evaluating the Nix store path is that the service also gets built from source code.

Finally, Disnix composes a deployment architecture model attribute named: targetPackages that maps each machine in the network to a list of Nix store paths that need to be distributed to it:

{
  targetPackages = {
    test1 = [
      "/nix/store/91abq...-HelloDBService"
    ];

    test2 = [
      "/nix/store/p9af1...-HelloMySQLDB"
    ];
  };

  services = ...
  infrastructure = ...
}

The targetPackages attribute is useful for a variety of reasons, as we will see later.

Generating a deployment model


With a normalized architecture model, we can generate an executable specification that I will call a deployment model. The deployment model can be used for executing all remaining activities after the services have been built.

An example of a deployment model could be:

{
  profiles = {
    test1 = "/nix/store/...-test1";
    test2 = "/nix/store/...-test2";
  };

  services = {
    "ekfekrerw..." = {
      name = "HelloMySQLDB";
      pkg = "/nix/store/...";
      type = "mysql-database";
      dependsOn = [
      ];
      connectsTo = [
      ];
    };

    "dfsjs9349..." = {
      name = "HelloDBService";
      pkg = "/nix/store/...";
      type = "tomcat-webapplication";
      dependsOn = [
        { target = "test1";
          container = "mysql-database";
          service = "ekfekrerw...";
        }
      ];
      connectsTo = [
      ];
    };
  };

  infrastructure = {
    test1 = {
      properties = {
        hostname = "test1.example.org";
      };
      containers = {
        apache-webapplication = {
          documentRoot = "/var/www";
        };
      };
      system = "x86_64-linux";
      numOfCores = 1;
      clientInterface = "disnix-ssh-client";
      targetProperty = "hostname";
    };
    test2 = {
      properties = {
        hostname = "test2.example.org";
      };
      containers = {
        mysql-database = {
          mysqlPort = "3306";
        };
      };
      system = "x86_64-linux";
      numOfCores = 1;
      clientInterface = "disnix-ssh-client";
      targetProperty = "hostname";
    };
  };

  serviceMappings = [
    { service = "ekfekrerw...";
      target = "test2";
      container = "mysql-database";
    }
    { service = "dfsjs9349...";
      target = "test1";
      container = "tomcat-webapplication";
    }
  ];

  snapshotMappings = [
    { service = "ekfekrerw...";
      component = "HelloMySQLDB";
      container = "mysql-database";
      target = "test2";
    }
  ];
}

  • The profiles attribute refers to Nix profiles mapped to target machines and is derived from the targetPackages property in the normalized deployment architecture model. From the profiles property Disnix derives all steps of the distribution phase in which all packages and their intra-dependencies are copied to machines in the network.
  • The services attribute refers to all services that can be mapped to machines. The keys in this attribute set are SHA256 hashes that are recursively computed from the Nix store path of the package, the type, and all the inter-dependency mappings. Using hash codes to identify the services makes it possible to easily see whether a service is identical to another or not (by comparing hash codes), so that upgrades can be done more efficiently.
  • The infrastructure attribute is unchanged compared to the deployment architecture model and still stores target machine properties.
  • The serviceMappings attribute maps services in the services attribute set, to target machines in the network stored in the infrastructure attribute set and containers hosted on the target machines.

    From these mappings, Disnix can derive the steps to activate and deactivate the services of which a system is composed, ensure that all dependencies are present and that the services are activated or deactivated in the right order.
  • The snapshotMappings attribute states that for each service mapped to a target machine and container, we also want to migrate the state (by taking and restoring snapshots) if the service gets moved from one machine to another.

Although a deployment model is quite low-level, it is now also possible to manually write one, and deploy it by running:

$ disnix-env -D deployment.nix

disnix-env invokes an external executable called: disnix-deploy that executes the remaining activities of the deployment process after the build process succeeds. disnix-deploy, as well as the tools that execute the individual deployment activities, are driven by a manifest file. A manifest file is simply a one-to-one translation of the deployment model in the Nix expression language to XML, following the NixXML convention.
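To give an impression of what such a translation looks like, the fragment below sketches how the profiles attribute of the deployment model shown earlier could be rendered in XML, using the simple NixXML notation for attribute sets described in my previous blog post about libnixxml. The fragment is purely illustrative -- the actual element names and structure of Disnix manifest files may differ:

<?xml version="1.0"?>
<manifest>
  <profiles>
    <test1>/nix/store/...-test1</test1>
    <test2>/nix/store/...-test2</test2>
  </profiles>
  ...
</manifest>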

Generating a build model


To build the services from source code, Disnix simply uses Nix's build facilities to execute the build. If nothing special has been configured, all builds will be executed on the coordinator machine, but this may not always be desired.

Disnix also facilitates heterogeneous architecture support. For example, if the coordinator machine is a Linux machine and a target machine is macOS (which is not compatible with the Linux system architecture), then Nix should delegate the build to a remote machine that is capable of building it. This is not something that Disnix handles for you out of the box -- you must configure Nix yourself to allow builds to be delegated.

It is also possible to let Disnix delegate builds to the target machines in the network. To make build delegation work, Disnix generates a build model from a normalized deployment architecture model:

{
  derivations = [
    { "/nix/store/HelloMySQLDB-....drv"; interface = "test1"; }
    { "/nix/store/HelloDBService-....drv"; interface = "test2"; }
  ];

  interfaces = {
    test1 = {
      targetAddress = "test1.example.org";
      clientInterface = "disnix-ssh-client";
    };

    test2 = {
      targetAddress = "test2.example.org";
      clientInterface = "disnix-ssh-client";
    };
  };
}

The build model shown above defines the following properties:

  • The derivations attribute maps Nix store derivation files (low-level Nix specifications that capture build procedures and dependencies) to machines in the network that should perform the build. This information is used by Disnix to copy the store derivation closures to the target machines, use Nix to build the packages remotely, and fetch the build results back to the coordinator machine.
  • The interfaces attribute is a subset of the infrastructure model that contains the connectivity settings for each target machine.

By running the following command, you can execute a build model to delegate builds to remote machines and fetch their results back:

$ disnix-delegate -B build.nix

If the build delegation option is enabled (for example, by passing the --build-on-targets parameter to disnix-env), then Disnix will work with a so-called distributed derivation file. Similar to a manifest file, a distributed derivation file is a one-to-one translation from the build model written in the Nix expression language to XML using the NixXML convention.

Packages model


In the normalized architecture model and deployment model, we generate a targetPackages property that we can use to compose Nix profiles with packages for each target machine.

For a variety of reasons, I thought it would also be interesting to give the user direct control over this property. A new feature in Disnix is that you can now also write a packages model:

{pkgs, system}:

{
  test1 = [
    pkgs.mc
  ];

  test2 = [
    pkgs.wget
    pkgs.curl
  ];
}

The above packages model says that we should distribute the Midnight Commander package to the test1 machine, and wget and curl to the test2 machine.

Running the following command will deploy the packages to the target machines in the network:

$ disnix-env -i infrastructure.nix -P pkgs.nix

You can also combine the three common Disnix models with a packages model:

$ disnix-env -s services.nix \
  -i infrastructure.nix \
  -d distribution.nix \
  -P pkgs.nix

Disnix will then deploy both the services that are distributed to the target machines and the supplemental packages defined in the packages model.

The packages model is useful for a variety of reasons:

  • Although it is already possible to use Disnix as a simple package deployer (by setting the types of services to: package), the packages model approach makes it even easier. Furthermore, you can also more easily specify sets of packages for target machines. The only thing you cannot do is deploy packages that have inter-dependencies on services, e.g. a client that is preconfigured to connect to a service.
  • The hybrid approach makes it possible to make a smoother transition to Disnix when automating the deployment process of a system. You can start by managing the dependencies with Nix, then package pieces of the project as Nix packages, then use Disnix to deploy them to remote machines, and finally turn pieces of the system into services that can be managed by Disnix.

Conclusion


In this blog post, I have described a new transformation pipeline in Disnix with well-defined intermediate steps that transforms the input models to a deployment model that is consumable by the tools that implement the deployment activities.

The following diagram summarizes the input models, intermediate models and output models:


The new transformation pipeline has the following advantages over the old infrastructure:

  • The implementation is much easier to maintain and we can more easily reason about its correctness.
  • We have access to a broader range of configuration properties. For example, it was previously not possible to select the targets of the inter-dependencies.
  • The output models (the deployment and build models) are much more easily consumable by the Disnix tools that execute the remainder of the deployment activities. The domain models in the code also closely resemble the structure of the build and deployment models. This can partially be attributed to libnixxml, which I have described in my previous blog post.
  • We can more easily implement new input models, such as the packages model.
  • The implementation of the disnix-reconstruct tool that reconstructs the manifest on the coordinator machine from metadata stored on the target machines also has become much simpler -- we can get rid of most of the custom code and generate a deployment model instead.

Availability


The new pipeline is available in the current development version of Disnix and will become available for general use in the next Disnix release.

The deployment models described in this blog post are incompatible with the manifest file format used in the last stable release of Disnix. This means that after upgrading Disnix, you need to convert any previous deployment configuration by running the disnix-convert tool.

Monday, May 27, 2019

A Nix-friendly XML-based data exchange library for the Disnix toolset

In the last few months, I have been intensively working on a variety of internal improvements to the Disnix toolset.

One of the increasingly complex and tedious aspects of the Disnix toolset is data exchange -- Disnix implements declarative deployment in the sense that it takes three specifications written in the Nix expression language as inputs: a services model that specifies the deployable units, their properties and how they depend on each other, an infrastructure model that specifies the available target machines and their properties, and a distribution model that specifies the mapping between services in the services model and target machines in the infrastructure model.

From these three declarative models, Disnix derives all the activities that need to be carried out to get a system in a running state: building services from source, distributing services (and their dependencies) to target machines, activating the services, and (optionally) restoring state snapshots.

Using the Nix expression language for these input models is useful for a variety of reasons:

  • We can use the Nix package manager's build infrastructure to reliably build services from source code, including all their required dependencies, and store them in isolation in the Nix store. The Nix store ensures that multiple variants and versions can co-exist and that we can always roll back to previous versions.
  • Because the Nix expression language is a purely functional domain-specific language (that in addition to data structures supports functions), we can make all required configuration parameters (such as the dependencies of the services that we intend to deploy) explicit by using functions so that we know that all mandatory settings have been specified.

Although the Nix expression language is a first-class citizen for tasks carried out by the Nix package manager, we also want to use the same specifications to instruct tools that carry out activities that Nix does not implement, such as the tools that activate the services and restore state snapshots.

The Nix expression language is not designed to be consumed by tools other than Nix (as a sidenote: despite this limitation, it is still somewhat possible to use the Nix expression language independently of the package manager in experimental setups, such as this online tutorial, but the libexpr component of Nix does not have a stable interface or commitment to make the language portable across tools).

As a solution, I convert objects in the Nix expression language to XML, so that they can be consumed by any of the tools that implement the activities that Nix does not support.

Although this may sound conceptually straightforward, the amount of data that needs to be converted, and the code that needs to be written to parse that data, is growing bigger and more complex, and becomes increasingly harder to adjust and maintain.

To cope with this growing complexity, I have standardized a collection of Nix-XML conversion patterns, and wrote a library named: libnixxml that can be used to make data interchange in both directions more convenient.

Converting objects in the Nix expression language to XML


The Nix expression language supports a variety of language integrations. For example, it can export Nix objects to XML and JSON, and import from JSON and TOML data.
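For instance, the JSON integration works in both directions through the builtins.toJSON and builtins.fromJSON primops. A small illustration (the evaluation results are shown as comments):

builtins.toJSON { message = "This is a test"; tags = [ "test" "example" ]; }
# yields the string: {"message":"This is a test","tags":["test","example"]}

builtins.fromJSON ''{"message": "This is a test"}''
# yields the attribute set: { message = "This is a test"; }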

The following Nix attribute set:
{
  message = "This is a test";
  tags = [ "test" "example" ];
}

can be converted to XML (with the builtins.toXML primop) or by running:

$ nix-instantiate --eval-only --xml --strict example.nix

resulting in the following XML data:

<?xml version='1.0' encoding='utf-8'?>
<expr>
  <attrs>
    <attr column="3" line="2" name="message" path="/home/sander/example.nix">
      <string value="This is a test" />
    </attr>
    <attr column="3" line="3" name="tags" path="/home/sander/example.nix">
      <list>
        <string value="test" />
        <string value="example" />
      </list>
    </attr>
  </attrs>
</expr>

Although the above XML code fragment is valid XML, it is basically also just a literal translation of the underlying abstract syntax tree (AST) to XML.

An AST dump is not always very practical for consumption by an external application -- it is not very "readable", contains data that we do not always need (e.g. line and column data), and imposes (due to the structure) additional complexity on a program to parse the XML data to a domain model. As a result, exported XML data almost always needs to be converted to an XML format that is more practical for consumption.

For all the input models that Disnix consumes, I originally handwrote XSL stylesheets to convert the XML data to a format that can be more easily consumed, as well as all the corresponding parsing code. Eventually, I derived a number of standard patterns.

For example, a more practical XML representation of the earlier shown Nix expression could be:

<?xml version="1.0"?>
<expr>
  <message>This is a test</message>
  <tags>
    <elem>test</elem>
    <elem>example</elem>
  </tags>
</expr>

In the above expression, the type and meta information is discarded. The attribute set is translated to a collection of XML sub elements in which the element names correspond to the attribute keys. The list elements are translated to generic sub elements (the above example uses elem, but any element name can be picked). The above notation is, IMO, more readable, more concise and easier to parse by an external program.

Attribute keys may be identifiers, but can also be strings containing characters that are invalid in XML element names (e.g. < or >). It is also possible to use a slightly more verbose notation in which a generic element name is used and the name property is used for each attribute set key:

<?xml version="1.0"?>
<expr>
  <attr name="message">This is a test</attr>
  <attr name="tags">
    <elem>test</elem>
    <elem>example</elem>
  </attr>
</expr>

When an application has a static domain model, it is not necessary to know any types (e.g. this conversion can be done in the application code using the application domain model). However, it may also be desired to construct data structures dynamically.

For dynamic object construction, type information needs to be known. Optionally, XML elements can be annotated with type information:

<?xml version="1.0"?>
<expr type="attrs">
  <attr name="message" type="string">This is a test</attr>
  <attr name="tags" type="list">
    <elem type="string">test</elem>
    <elem type="string">example</elem>
  </attr>
</expr>

To automatically convert data to XML format following the above listed conventions, I have created a standardized XSL stylesheet and command-line tool that can automatically convert Nix expressions.

The following command generates the first XML code fragment:

$ nixexpr2xml --attr-style simple example.nix

We can use the verbose notation for attribute sets, by running:

$ nixexpr2xml --attr-style verbose example.nix

Type annotations can be enabled by running:

$ nixexpr2xml --attr-style verbose --enable-types example.nix

The root, attribute and list element representations, as well as the name and type properties, use generic element and property names. Their names can also be adjusted, if desired:

$ nixexpr2xml --root-element-name root \
  --list-element-name item \
  --attr-element-name property \
  --name-attribute-name key \
  --type-attribute-name mytype \
  example.nix
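Assuming that the verbose attribute style and type annotations are also enabled (--attr-style verbose --enable-types), the output of the above invocation would look roughly as follows. This is a sketch derived from the conventions shown earlier, not verbatim tool output:

<?xml version="1.0"?>
<root mytype="attrs">
  <property key="message" mytype="string">This is a test</property>
  <property key="tags" mytype="list">
    <item mytype="string">test</item>
    <item mytype="string">example</item>
  </property>
</root>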

Parsing a domain model


In addition to producing more "practical" XML data, I have also implemented utility functions that help me consume the XML data to construct a domain model in the C programming language, consisting of values (strings, integers etc.), structs, list-like data structures (e.g. arrays, linked lists) and table-like data structures, such as hash tables.

For example, the following XML document only containing a string:

<expr>hello</expr>

can be parsed to a string in C as follows:

#include <nixxml-parse.h>

xmlNodePtr element;
/* Open XML file and obtain root element */
xmlChar *value = NixXML_parse_value(element, NULL);
printf("value is: %s\n"); // value is: hello

We can also use functions to parse (nested) data structures. For example, to parse the following XML code fragment representing an attribute set:

<expr>
  <attr name="firstName">Sander</attr>
  <attr name="lastName">van der Burg</attr>
</expr>

We can use the following code snippet:

#include <stdlib.h>
#include <nixxml-parse.h>

xmlNodePtr element;

typedef struct
{
    xmlChar *firstName;
    xmlChar *lastName;
}
ExampleStruct;

void *create_example_struct(xmlNodePtr element, void *userdata)
{
    return calloc(1, sizeof(ExampleStruct));
}

void parse_and_insert_example_struct_member(xmlNodePtr element, void *table, const xmlChar *key, void *userdata)
{
    ExampleStruct *example = (ExampleStruct*)table;

    if(xmlStrcmp(key, (xmlChar*) "firstName") == 0)
        example->firstName = NixXML_parse_value(element, userdata);
    else if(xmlStrcmp(key, (xmlChar*) "lastName") == 0)
        example->lastName = NixXML_parse_value(element, userdata);
}

/* Open XML file and obtain root element */

ExampleStruct *example = NixXML_parse_verbose_heterogeneous_attrset(element, "attr", "name", NULL, create_example_struct, parse_and_insert_example_struct_member);

To parse the attribute set in the XML code fragment above (that uses a verbose notation) and derive a struct from it, we invoke the NixXML_parse_verbose_heterogeneous_attrset() function. The parameters specify that the XML code fragment should be parsed as follows:

  • It expects the name of the XML element of each attribute to be called: attr.
  • The property that refers to the name of the attribute is called: name.
  • To create a struct that stores the attributes in the XML file, the function: create_example_struct() will be executed that allocates memory for it and initializes all fields with NULL values.
  • The logic that parses the attribute values and assigns them to the struct members is in the parse_and_insert_example_struct_member() function. The implementation uses NixXML_parse_value() (as shown in the previous example) to parse the attribute values.

In addition to parsing values as strings and attribute sets as structs, it is also possible to:

  • Parse lists, by invoking: NixXML_parse_list()
  • Parse uniformly typed attribute sets (in which every attribute set member has the same type), by invoking: NixXML_parse_verbose_attrset()
  • Parse attribute sets using the simple XML notation for attribute sets (as opposed to the verbose notation): NixXML_parse_simple_attrset() and NixXML_parse_simple_heterogeneous_attrset()

Printing Nix or XML representation of a domain model


In addition to parsing NixXML data to construct a domain model, the inverse process is also possible -- the API also provides convenience functions to print an XML or Nix representation of a domain model.

For example, the following string in C:

char *greeting = "Hello";

can be displayed as a string in the Nix expression language as follows:

#include <nixxml-print-nix.h>

NixXML_print_string_nix(stdout, greeting, 0, NULL); // outputs: "Hello"

or as an XML document, by running:

#include <nixxml-print-xml.h>

NixXML_print_open_root_tag(stdout, "expr");
NixXML_print_string_xml(stdout, greeting, 0, NULL, NULL);
NixXML_print_close_root_tag(stdout, "expr");

producing the following output:

<expr>Hello</expr>

The example struct shown in the previous section can be printed as a Nix expression with the following code:

#include <nixxml-print-nix.h>

void print_example_attributes_nix(FILE *file, const void *value, const int indent_level, void *userdata, NixXML_PrintValueFunc print_value)
{
    ExampleStruct *example = (ExampleStruct*)value;
    NixXML_print_attribute_nix(file, "firstName", example->firstName, indent_level, userdata, NixXML_print_string_nix);
    NixXML_print_attribute_nix(file, "lastName", example->lastName, indent_level, userdata, NixXML_print_string_nix);
}

NixXML_print_attrset_nix(stdout, &example, 0, NULL, print_example_attributes_nix, NULL);

The above code fragment executes the function: NixXML_print_attrset_nix() to print the example struct as an attribute set. The attribute set printing function invokes the function: print_example_attributes_nix() to print the attribute set members.

The print_example_attributes_nix() function prints each attribute assignment. It uses the NixXML_print_string_nix() function (shown in the previous example) to print each member as a string in the Nix expression language.

The result of running the above code is the following Nix expression:

{
  "firstName" = "Sander";
  "lastName" = "van der Burg";
}

The same struct can be printed as XML (using the verbose notation for attribute sets) with the following code:

#include <nixxml-print-xml.h>

void print_example_attributes_xml(FILE *file, const void *value, const char *child_element_name, const char *name_property_name, const int indent_level, const char *type_property_name, void *userdata, NixXML_PrintXMLValueFunc print_value)
{
    ExampleStruct *example = (ExampleStruct*)value;
    NixXML_print_verbose_attribute_xml(file, child_element_name, name_property_name, "firstName", example->firstName, indent_level, NULL, userdata, NixXML_print_string_xml);
    NixXML_print_verbose_attribute_xml(file, child_element_name, name_property_name, "lastName", example->lastName, indent_level, NULL, userdata, NixXML_print_string_xml);
}

NixXML_print_open_root_tag(stdout, "expr");
NixXML_print_verbose_attrset_xml(stdout, &example, "attr", "name", 0, NULL, NULL, print_example_attributes_xml, NULL);
NixXML_print_close_root_tag(stdout, "expr");

The above code fragment uses a similar strategy as the previous example (by invoking NixXML_print_verbose_attrset_xml()) to print the example struct as an XML file using a verbose notation for attribute sets.

The attribute set members are printed by the print_example_attributes_xml() function.

The result of running the above code is the following XML output:

<expr>
  <attr name="firstName">Sander</attr>
  <attr name="lastName">van der Burg</attr>
</expr>

In addition to printing values and attribute sets, it is also possible to:

  • Print lists in Nix and XML format: NixXML_print_list_nix(), NixXML_print_list_xml()
  • Print attribute sets in simple XML notation: NixXML_print_simple_attrset_xml()
  • Print strings as int, float or bool: NixXML_print_string_as_*_xml.
  • Print integers: NixXML_print_int_xml()
  • Disable indentation by setting the indent_level parameter to -1.
  • Print type annotated XML, by setting the type_property_name parameter to a string that is not NULL.

Using abstract data structures


There is no standardized library for abstract data structures in C, e.g. lists, maps, trees etc. As a result, each framework provides its own implementations of them. To parse lists and attribute sets (that have arbitrary structures), you need generalized data structures that are list-like or table-like.

libnixxml provides two sub libraries to demonstrate how integration with abstract data structures can be implemented. One sub library is called libnixxml-data that uses pointer arrays for lists and xmlHashTable for attribute sets, and another is called libnixxml-glib that integrates with GLib using GPtrArray structs for lists and GHashTables for attribute sets.

The following XML document:

<expr>
  <elem>test</elem>
  <elem>example</elem>
</expr>

can be parsed as a pointer array (array of strings) as follows:

#include <nixxml-ptrarray.h>

xmlNodePtr element;
/* Open XML file and obtain root element */
void **array = NixXML_parse_ptr_array(element, "elem", NULL, NixXML_parse_value);

and printed as a Nix expression with:

NixXML_print_ptr_array_nix(stdout, array, 0, NULL, NixXML_print_string_nix);

and as XML with:

NixXML_print_open_root_tag(stdout, "expr");
NixXML_print_ptr_array_xml(stdout, array, "elem", 0, NULL, NULL, NixXML_print_string_xml);
NixXML_print_close_root_tag(stdout, "expr");

Similarly, there is a module that works with xmlHashTables, providing a similar function interface to the pointer array module.
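To illustrate how these pieces fit together, below is a minimal, self-contained sketch that parses the XML document shown above (assumed to be stored in a file called example.xml) into a pointer array and prints it back as a Nix expression. It only uses the libnixxml calls demonstrated in this post, plus standard libxml2 calls for opening the document; error handling and freeing of the parsed array are kept to a minimum:

#include <stdio.h>
#include <libxml/parser.h>
#include <nixxml-parse.h>
#include <nixxml-ptrarray.h>
#include <nixxml-print-nix.h>

int main(void)
{
    /* Open the XML file and obtain the root element */
    xmlDocPtr doc = xmlParseFile("example.xml");
    if(doc == NULL)
        return 1;

    xmlNodePtr element = xmlDocGetRootElement(doc);

    /* Parse the <elem> children of the root element into a pointer array of strings */
    void **array = NixXML_parse_ptr_array(element, "elem", NULL, NixXML_parse_value);

    /* Print the array as a list in the Nix expression language, e.g. [ "test" "example" ] */
    NixXML_print_ptr_array_nix(stdout, array, 0, NULL, NixXML_print_string_nix);
    printf("\n");

    xmlFreeDoc(doc);
    xmlCleanupParser();
    return 0;
}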

Working with generic NixXML nodes


By using generic data structures to represent lists and tables, type-annotated NixXML data, and a generic NixXML_Node struct (that indicates what kind of node we have, such as a value, list or attribute set), we can also automatically parse an entire document by using a single function call:

#include <nixxml-ptrarray.h>
#include <nixxml-xmlhashtable.h>
#include <nixxml-parse-generic.h>

xmlNodePtr element;
/* Open XML file and obtain root element */
NixXML_Node *node = NixXML_generic_parse_expr(element,
    "type",
    "name",
    NixXML_create_ptr_array,
    NixXML_create_xml_hash_table,
    NixXML_add_value_to_ptr_array,
    NixXML_insert_into_xml_hash_table,
    NixXML_finalize_ptr_array);

The above function composes a generic NixXML_Node object. The function interface uses function pointers to compose lists and tables. These functions are provided by the pointer array and xmlHashTable modules in the libnixxml-data library.

We can also print an entire NixXML_Node object structure as a Nix expression:

#include <nixxml-print-generic-nix.h>

NixXML_print_generic_expr_nix(stdout,
    node,
    0,
    NixXML_print_ptr_array_nix,
    NixXML_print_xml_hash_table_nix);

as well as XML (using simple or verbose notation for attribute sets):

#include <nixxml-print-generic-xml.h>

NixXML_print_generic_expr_verbose_xml(stdout,
    node,
    0,
    "expr",
    "elem",
    "attr",
    "name",
    "type",
    NixXML_print_ptr_array_xml,
    NixXML_print_xml_hash_table_verbose_xml);

Summary


The following overview summarizes the concepts described in this blog post:

  • value
    Nix expression representation: "hello"
    XML representation: hello
    C application domain model: char*
  • list
    Nix expression representation: [ "hello" "bye" ]
    XML representation: <elem>hello</elem><elem>bye</elem>
    C application domain model: void**, linked list, ...
  • attribute set (simple notation)
    Nix expression representation: { a = "hello"; b = "bye"; }
    XML representation: <a>hello</a><b>bye</b>
    C application domain model: xmlHashTablePtr, struct, ...
  • attribute set (verbose notation)
    Nix expression representation: { a = "hello"; b = "bye"; }
    XML representation: <attr name="a">hello</attr><attr name="b">bye</attr>
    C application domain model: xmlHashTablePtr, struct, ...

The above overview shows the concepts that NixXML defines, and how they can be represented in the Nix expression language, XML and in a domain model of a C application.

The representations of these concepts can be translated as follows:

  • To convert a raw AST XML representation of a Nix expression to NixXML, we can use the included XSL stylesheet or run the nixexpr2xml command.
  • XML concepts can be parsed to a domain model in a C application by invoking NixXML_parse_* functions for the appropriate concepts and XML representation.
  • Domain model elements can be printed as XML by invoking NixXML_print_*_xml functions.
  • Domain model elements can be printed in the Nix expression language by invoking NixXML_print_*_nix functions.

Benefits


I have re-engineered the current development versions of the Disnix and Dynamic Disnix toolsets to use libnixxml for data exchange. For Disnix, there is much less boilerplate code that I need to write for the parsing infrastructure, making it significantly easier to maintain.

In the Dynamic Disnix framework, libnixxml provides even more benefits beyond a simpler parsing infrastructure. The Dynamic Disnix toolset provides deployment planning methods, and documentation and visualization tools. These concerns are orthogonal to the features of the core Disnix toolset -- there is first-class Nix/Disnix integration, but the features of Dynamic Disnix should work with any service-oriented system (having a model that works with services and dependencies) regardless of what technology is used to carry out the deployment process itself.

With libnixxml it is now quite easy to make all these tools both accept Nix and XML representations of their input models, and make them output data in both Nix and XML. It is now also possible to use most features of Dynamic Disnix, such as the visualization features described in the previous blog post, independently of Nix and Disnix.

Moreover, the deployment planning methods should now also be able to more conveniently invoke external tools, such as SAT-solvers.

Related work


libnixxml is not the only Nix language integration facility I wrote. I also wrote NiJS (that is JavaScript-based) and PNDP (that is PHP-based). Aside from the implementation language (the C programming language), the purpose of libnixxml is not to replicate the functionality of these two libraries in C.

Basically, libnixxml has the inverse purpose -- NiJS and PNDP are useful for systems that already have a domain model (e.g. a domain-specific configuration management tool), and make it possible to generate the required Nix expression language code to conveniently integrate with Nix.

In libnixxml, the Nix expression representation is the basis and libnixxml makes it more convenient for external programs to consume such a Nix expression. Moreover, libnixxml only facilitates data interchange, and not all Nix expression language features.

Conclusion


In this blog post, I have described libnixxml that makes XML-based data interchange with configurations in the Nix expression language and domain models in the C programming language more convenient. It is part of the current development version of Disnix and can be obtained as a separate project from my GitHub page.

Thursday, February 28, 2019

Generating functional architecture documentation from Disnix service models

In my previous blog post, I have described a minimalistic architecture documentation approach for service-oriented systems based on my earlier experiences with setting up basic configuration management repositories. I used this approach to construct a documentation catalog for the platform I have been developing at Mendix.

I also explained my motivation -- it improves developer effectiveness, team consensus and the on-boarding of new team members. Moreover, it is a crucial ingredient in improving the quality of a system.

Although we are quite happy with the documentation, my biggest inconvenience is that I had to derive it entirely by hand -- I consulted various kinds of sources, but since existing documentation and information provided by people may be incomplete or inconsistent, I considered the source code and deployment configuration files the ultimate source of truth, because no matter how elegantly a diagram is drawn, it is useless if it does not match the actual implementation.

Because a manual documentation process is very costly and time consuming, a more ideal situation would be to have an automated approach that automatically derives architecture documentation from deployment specifications.

Since I am developing a deployment framework for service-oriented systems myself (Disnix), I have decided to extend it with a generator that can derive architecture diagrams and supplemental descriptions from the deployment models using the conventions I have described in my previous blog post.

Visualizing deployment architectures in Disnix


As explained in my previous blog post, the notation that I used for the diagrams was not something I invented from scratch, but something I borrowed from Disnix.

Disnix has had a feature for quite some time that can visualize deployment architectures: a description that shows how the functional parts (the services/components) are mapped to physical resources (e.g. machines/containers) in a network.

For example, after deploying a service-oriented system, such as my example web application system, by running:

$ disnix-env -s services.nix -i infrastructure.nix \
  -d distribution-bundles.nix

You can visualize the corresponding deployment architecture of the system, by running:

$ disnix-visualize > out.dot

The above command-line instruction generates a directed graph in the DOT language. The resulting dot file can be converted into a displayable image (such as a PNG or SVG file) by running:

$ dot -Tpng out.dot > out.png

Resulting in a diagram of the deployment architecture that may look as follows:


The above diagram uses the following notation:

  • The light grey boxes denote machines in a network. In the above deployment scenario, we have two of them.
  • The ovals denote services (more specifically: in a Disnix-context, they reflect any kind of distributable deployment unit). Services can have almost any shape, such as web services, web applications, and databases. Disnix uses a plugin system called Dysnomia to make sure that the appropriate deployment steps are carried out for a particular type of service.
  • The arrows denote inter-dependencies. When a service points to another service, it means that the latter is an inter-dependency of the former service. Inter-dependency relationships ensure that the dependent service gets all the configuration properties it needs to reach the dependency, and the deployment system makes sure that the inter-dependencies of a specific service are deployed first.

    In some cases, enforcing the right activation order may be expensive. It is also possible to drop the ordering requirement, as denoted by the dashed arrows. This is acceptable for redirects from the portal application, but not acceptable for database connections.
  • The dark grey boxes denote containers. Containers can be any kind of runtime environment that hosts zero or more distributable deployment units. For example, the container service of a MySQL database is a MySQL DBMS, whereas the container service of a Java web application archive can be a Java Servlet container, such as Apache Tomcat.

Visualizing the functional architecture of service-oriented systems


The services of which a service-oriented system is composed are flexible -- they can be deployed to various kinds of environments, such as a test environment, a second fail-over production environment or a local machine.

Because services can be deployed to a variety of targets, it may also be desired to get an architectural view of the functional parts only.

I created a new tool called: dydisnix-visualize-services that can be used to generate functional architecture diagrams by visualizing the services in the Disnix services model:


The above diagram is a visual representation of the services model of the example web application system, using a similar notation as the deployment architecture without showing any environment characteristics:

  • Ovals denote services and arrows denote inter-dependency relationships.
  • Every service is annotated with its type, so that it becomes clear what kind of a shape a service has and what kind of deployment procedures need to be carried out.

Despite the fact that the above diagram is focused on the functional parts, it may still look quite detailed, even from a functional point of view.

Essentially, the architecture of my example web application system is a "system of sub systems" -- each sub system provides an isolated piece of functionality consisting of a database backend and web application front-end bundle. The portal sub system is the entry point and responsible for guiding the users to the sub systems implementing the functionality that they want to use.

It is also possible to annotate services in the Disnix services model with a group and description property:

{distribution, invDistribution, pkgs, system}:

let
  customPkgs = import ../top-level/all-packages.nix {
    inherit pkgs system;
  };

  groups = {
    homework = "Homework";
    literature = "Literature";
    ...
  };
in
{
  homeworkdb = {
    name = "homeworkdb";
    pkg = customPkgs.homeworkdb;
    type = "mysql-database";
    group = groups.homework;
    description = "Database backend of the Homework subsystem";
  };

  homework = {
    name = "homework";
    pkg = customPkgs.homework;
    dependsOn = {
      inherit usersdb homeworkdb;
    };
    type = "apache-webapplication";
    appName = "Homework";
    group = groups.homework;
    description = "Front-end of the Homework subsystem";
  };

  ...
}

In the above services model, I have grouped every database and web application front-end bundle in a group that represents a sub system (such as Homework). By adding the --group-subservices parameter to the dydisnix-visualize-services command invocation, we can simplify the diagram to only show the sub systems and how these sub systems are inter-connected:

$ dydisnix-visualize-services -s services.nix -f png \
  --group-subservices

resulting in the following functional architecture diagram:


As may be observed in the picture above, all services have been grouped. The service groups are denoted by ovals with dashed borders.

We can also query sub architecture diagrams of every group/sub system. For example, the following command generates a sub architecture diagram for the Homework group:

$ dydisnix-visualize-services -s services.nix -f png \
  --group Homework --group-subservices

resulting in the following diagram:


The above diagram will only show the services in the Homework group and their context -- i.e. non-transitive dependencies and services that have a dependency on any service in the requested group.

Services that exactly fit the group or any of its parent groups will be displayed verbatim (e.g. the homework database back-end and front-end). The other services will be categorized into the lowest common sub group (the Users and Portal sub systems).

For more complex architectures consisting of many layers, you probably want to generate all available architecture diagrams in one command invocation. It is also possible to run the visualization tool in batch mode. In batch mode, it will recursively generate diagrams for the top-level architecture and every possible sub group, and store them in a specified output folder:

$ dydisnix-visualize-services --batch -s services.nix -f svg \
  --output-dir out

Generating supplemental documentation


Another thing I have explained in my previous blog post is that providing diagrams is useful, but they cannot clear up all confusion -- you also need to document and clarify additional details, such as the purposes of the services.

It is also possible to generate a documentation page for each group, showing a table of services with their descriptions and types.

The following command generates a documentation page for the Homework group:

$ dydisnix-document-services -s services.nix --group Homework

It is also possible to adjust the generation process by providing a documentation configuration file (by using the --docs parameter):

$ dydisnix-document-services -s services.nix --docs docs.nix \
  --group Homework

There are a variety of settings that can be provided in a documentation configuration file:

{
  groups = {
    Homework = "Homework subsystem";
    Literature = "Literature subsystem";
    ...
  };

  fields = [ "description" "type" ];

  descriptions = {
    type = "Type";
    description = "Description";
  };
}

The above configuration file specifies the following properties:

  • The descriptions for every group.
  • Which fields should be displayed in the overview table. It is possible to display any property of a service.
  • A description of every field in the services model.

Like the visualization tool, the documentation tool can also be used in batch mode to generate pages for all possible groups and sub groups.
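
A batch invocation could, for example, look as follows -- a sketch that assumes the documentation tool accepts the same --batch and --output-dir parameters as the visualization tool:

$ dydisnix-document-services --batch -s services.nix --docs docs.nix \
  --output-dir out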

Generating a documentation catalog


In addition to generating architecture diagrams and descriptions, it is also possible to combine both tools to automatically generate a complete documentation catalog for a service-oriented system, such as the web application example system:

$ dydisnix-generate-services-docs -s services.nix --docs docs.nix \
  -f svg --output-dir out

By opening the entry page in the output folder, you will get an overview of the top-level architecture, with a description of the groups.


By clicking on a group hyperlink, you can inspect the sub architecture of the corresponding group, such as the 'Homework' sub system:


The above page displays the sub architecture diagram of the 'Homework' subsystem and a description of all services belonging to that group.

Another particularly interesting aspect is the 'Portal' sub system:


The portal's purpose is to redirect users to functionality provided by the other sub systems. The above architecture diagram displays all the sub systems in grouped form to illustrate that there is a dependency relationship, without revealing their internal details, which would clutter the diagram with unnecessary implementation details.

Other features


The tools support more use cases than those described in this blog post -- it is also possible, for example, to create arbitrary layers of sub groups by using the '/' character as a delimiter in the group identifier, as shown in the sketch below. I also used our company platform as an example case, which can be decomposed into four layers.
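
A hypothetical group annotation with nested layers could look as follows (the group path shown here is made up for illustration purposes):

homeworkdb = {
  name = "homeworkdb";
  pkg = customPkgs.homeworkdb;
  type = "mysql-database";
  group = "Platform/Education/Homework";  # three nested group layers
  description = "Database backend of the Homework subsystem";
};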

Availability


The tools described in this blog post are part of the latest development version of Dynamic Disnix -- a very experimental extension framework built on top of Disnix that can be used to make service-oriented systems self-adaptive by redeploying their services in case of events.

The reason why I have added these tools to Dynamic Disnix (and not the core Disnix toolset) is because the extension toolset has an infrastructure to parse and reflect over individual Disnix models.

Although I promised to make an official release of Dynamic Disnix a very long time ago, this has still not happened. However, the documentation feature is a compelling reason to stabilize the code and make the framework more usable.

Thursday, January 31, 2019

A minimalistic discovery and architecture documentation process

In a blog post written a couple of years ago, I described how to set up a basic configuration management process in a small organization, based on the process framework described in the IEEE 828-2012 configuration management standard. The most important prerequisite for setting up such a process is identifying all configuration items (CIs) and storing them in a well-organized repository.

There are many ways to organize configuration items ranging from simple to very sophisticated solutions. I used a very small set of free and open source tools, and a couple of simple conventions to set up a CI repository:

  • A Git repository with a hierarchical directory structure referring to configuration items. Each path component in the directory structure serves a specific purpose to group configuration items. The overall strategy was to use a directory structure with a maximum of three levels: environment/machine/application (see the example layout after this list). Using Git makes it possible to version configuration items and share the repository with team members.
  • Using Markdown to write down the purposes of the configuration items and descriptions of how they can be reproduced. Markdown works well for two reasons: it can be nicely formatted in a browser, but also read from a terminal when logged in to remote servers via SSH.
  • Using Dia for drawing diagrams of systems consisting of more complex application components. Dia is not the most elegant program around, but it works well enough, it is free and open source, and it is supported on Linux, Windows and macOS.
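
For example, a repository following the environment/machine/application convention could be laid out as follows (the environment, machine and application names are hypothetical):

production/
  webserver1/
    nginx/
      README.md
  dbserver1/
    postgresql/
      README.md
test/
  testserver1/
    nginx/
      README.md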

My main motivation to formalize configuration management (but only lightly), despite being in a small organization, is to prevent errors and minimize delays and disruptions while remaining flexible by not being bound to all kinds of complex management procedures.

I wrote this blog post while I was still employed at a small-sized startup company with only one development team. In the meantime, I have joined a much bigger organization (Mendix) that has many cross-disciplinary development teams that work concurrently on various aspects of our service and product portfolio.

About microservices


When I had just joined, the amount of information I had to absorb was quite overwhelming. I also learned that we have heavily adopted the microservices architecture paradigm for our entire online service platform.

According to Martin Fowler's blog post on microservices, using microservices offers the following benefits:

  • Strong module boundaries. You can divide the functionality of a system into microservices and make separate teams responsible for the development of each service. This makes it possible to iterate faster and offer better quality, because teams can focus on a subset of features only.
  • Independent deployment. Microservices can be deployed independently making it possible to ship features when they are done, without having complex integration cycles.
  • Technology diversity. Microservices are language and technology agnostic. You can pick almost any programming language (e.g. Java, Python, Mendix, Go), data storage solution (e.g. PostgreSQL, MongoDB, InfluxDB) or operating system (e.g. Linux, FreeBSD, Windows) to implement a microservice, making it possible to pick the most suitable combination of technologies and use them to their full advantage.

However, decomposing a system into a collection of collaborating services also comes at a (sometimes substantial!) price:

  • There is typically much more operational complexity. Because there are many components and typically a large infrastructure to manage, activities such as deploying, upgrading, and monitoring the condition of a system are much more time consuming and complex. Furthermore, because of technology diversity, there are also many kinds of specialized deployment procedures that you need to carry out.
  • Data is eventually consistent. You have to live with the fact that (temporary) inconsistencies could end up in your data, and you must invest in implementing facilities that keep your data consistent.
  • Because of distribution, development is harder in general -- it is more difficult to diagnose errors (e.g. a failure in one service could trigger a chain reaction of errors, without proper error traces), and it is harder to test a system because of the additional deployment complexity. The network links between services may be slow and subject to failure, causing all kinds of unpredictable problems. Also, machines that host critical services may crash.

Studying the architecture


When applied properly, e.g. functionality is well separated and there is strong cohesion and weak coupling between services, while investing in solutions to cope with the challenges listed above, the benefits of microservices can be reaped, resulting in a scalable system that can be developed by multiple teams working on features concurrently.

However, making changes in such an environment, while maintaining or improving the quality properties of a system, requires discipline and a relatively good understanding of the environment -- in the beginning, I faced all kinds of practical problems when I wanted to make even a subtle change. Some areas of our platform were documented, while others were not. Some documentation was also outdated, slightly incomplete and sometimes inconsistent with the actual implementation.

Certain areas of our platform were also highly complex resulting in very complex architectural views, with many boxes and arrows. Furthermore, information was also scattered around many different places.

As part of my on-boarding process, and as a means to cope with some of my practical problems, I have created a documentation repository of the platform that our team develops by extending the (minimalistic) principles for configuration management described in the earlier blog post.

I realized that simply identifying the service components of which the system consists is not enough to get an understanding of the system -- there are many items and complex details that need to be captured.

In addition to the identification of all configuration items, I also want:

  • Proper guidance. To understand a particular piece of functionality, I should not need to study every component in detail. Instead, I want to know the full context and only the details of the relevant components.
  • Completeness. I want all service components to be visible. I do not want any details to be covered up. For example, I have seen quite a few diagrams that hide complex implementation details. I would much rather have flaws visible so that they can be resolved at a later point in time.
  • Clear boundaries. Our platform is not self-contained, but relies on services provided by other teams. I want to know which components are our responsibility and which are managed by external teams.
  • Clarity. I want to know what the purpose of a component is. Their names may not necessarily reflect or explain what they do.
  • Consistency. No matter how nicely a diagram is drawn, it should match the actual implementation or it is of very little use.
  • References to the actual implementation. I also want to know where I can find the implementation of a component, such as its Git repository.

Documenting the architecture


To visualize the architecture of our platform and organize all relevant information, I used the following strategy:

  • I took the components (typically their source code repositories) as the basis for everything else -- every component translates to a box in the architecture diagram.
  • I analyzed the dependency relationships between the components and denoted them as arrows. When a box points to another box by means of an arrow, this means that the other box is a dependency that should be deployed first. When a dependency is absent, the service will (most likely) not work.
  • I also discovered that the platform diagram easily gets cluttered by the sheer amount of components -- I decided to combine components with very strongly correlated functionality into feature groups (denoted by dashed borders). Every feature group in an architecture diagram refers to another sub architecture diagram that provides a more specialized view of the feature group.
  • To clearly illustrate the difference between components that are our responsibility and those that are maintained by other teams, I make all external dependencies visible in the top-level architecture diagram.

The notation I used for these diagrams is not something I have entirely invented from scratch -- it is inspired by graph theory, package management and service management concepts. Disnix, for example, can visualize deployment architectures by using a similar notation.

To find all relevant information to create the diagrams, I consulted various sources:

  • I studied existing documents and diagrams to get a basic understanding of the system and an idea of the details I should look at.
  • I talked to a variety of people from various teams.
  • I looked inside the configuration settings of all deployment solutions used, e.g. the Amazon AWS console, Docker, CloudFoundry, Kubernetes, Nix configuration files.
  • I peeked inside the source code repositories and looked for settings that reference other systems, such as configuration values that store URLs.
  • When in doubt, I consider the deployment configuration files and source code the "ultimate source of truth", because no matter how nice a diagram looks, it is useless if it is implemented differently.

Finally, just drawing diagrams will not completely suffice when the goal is to provide clarity. I also observed that I needed to document some leftover details.

Foremost, a diagram whose semantics are not explained will typically leave too many details open to interpretation, so you need to explain the notation.

Second, you need to provide additional details about the services. I typically enumerate the following properties in a table for every component (see the example table after this list):

  • The name of the component.
  • A one line description stating its purpose.
  • The type of project (e.g. a Python/Java/Go project, Docker container, AWS Lambda function, etc.). This is useful to determine the kind of deployment procedure for the component.
  • A reference to the source code repository, e.g. a Git repository. The README of the corresponding repository should provide more detailed information about the project.
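
Since the repository documentation is already written in Markdown, such a table could, for example, look as follows (the component names, descriptions and repository URLs are made up for illustration purposes):

| Name     | Description                             | Type                | Repository                          |
|----------|-----------------------------------------|---------------------|-------------------------------------|
| orders   | Manages customer orders                 | Java project        | https://github.com/example/orders   |
| invoices | Generates invoices for completed orders | AWS Lambda function | https://github.com/example/invoices |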

Benefits


Although it is quite a bit of work to set up, having a well documented architecture provides us with the following benefits:

  • More effective development. Because of the feature groups and dividing the architecture into multiple layers, general concepts and details are separated. This makes it easier for developers to focus and absorb the right detailed knowledge to change a service.
  • More consensus in the team about the structure of the system and general quality attributes, such as scalability and security.
  • Better on-boarding for new team members.

Discussion


Writing architecture documentation IMO is not rocket science, just discipline. Obviously, there are much more sophisticated tools available to organize and visualize architectures (even tools that can generate code and reverse engineer code), but this is IMO not a hard requirement to start documenting.

However, you cannot take all confusion away -- even if you have the best possible architecture documentation, people's thinking habits are shaped by the concepts they know, and there will always be a slight mismatch (which is documented in academic research: 'Why Is It So Hard to Define Software Architecture?' written by Jason Baragry and Karl Reed).

Finally, architecture documentation is only a first good step to improve the quality of service-oriented systems. To make it a success, much more is needed, such as:

  • Automated (and reproducible) deployment processes.
  • More documentation (such as the APIs, end-user documentation, project documentation).
  • Automated unit, integration and acceptance testing.
  • Monitoring.
  • Measuring and improving code quality, test coverage, etc.
  • Using design patterns, architectural patterns, good programming abstractions.
  • And many more aspects.

But to do these things properly, having proper architecture documentation is an important prerequisite.

Related work


UPDATE: after publishing this blog post and giving an internal presentation at Mendix about this subject, I received a couple of questions about architectural approaches that share similarities with my approach, such as the C4 model.

The C4 model also uses a layered approach in which the top-level diagram displays the context of the system (the relation of the system with external users and systems), and deeper layers gradually reveal more details of the inner workings of the system while limiting the view to a subset of components to prevent details obfuscating the purpose.

I did not use this approach as an example reference, but my work is basically built on top of the same underlying principles that the C4 model builds on -- creating abstractions.

Creating abstractions in modeling was already popularized in the 70s by various computer scientists, such as Edward Yourdon and Tom DeMarco, who brought concepts from structured programming to other domains, such as modeling (as explained in the paper: 'The Choice of New Software Development Methodologies for Software Development Projects' by Edward Yourdon).

One of the mental aids in structured programming is abstraction, so that it "only matters what something does, disregarding how it works". I took data flow diagrams (DFDs) as an example technique (which also facilitates layers of abstraction, including a top-level context DFD), but I replaced the data flow notation with dependency modeling.

Furthermore, the C4 model also provides a number of diagram types for each abstraction layer with specific purposes, but in my approach the notation and purposes of each layer are left abstract.

Sunday, December 30, 2018

8th yearly blog reflection

Similar to previous years, I will reflect on last year's writings. Again the occasion is my blog's anniversary -- today, it has been exactly 8 years since I started this blog.

Disnix


In the first two months of this year, most of my work was focused on Disnix. I added a number of extra features to Disnix that are particularly useful to diagnose errors in an environment in which services are distributed over a collection of machines in a network.

I also revised the internals of Disnix in such a way that it has become possible to deploy systems with circular dependencies, by dropping the ordering requirement on inter-dependencies when this is desired.

Finally, I extended my custom web application framework's example applications repository (that I released last year) and made it a public Disnix example case, to provide a more realistic/meaningful public example in addition to my trivial/hypothetical examples.

The new features described in these three blog posts are part of Disnix 0.8, released in March this year.

Mendix


The biggest news of the year is probably the fact that I embarked on a new challenge. In April, I joined Mendix, a company that provides a low-code application development platform.

Since the moment I joined, there have been many things I learned and quite a few new people I got to know.

From my learning experiences, I wrote an introduction blog post to the Mendix platform, specifically aimed at people with an advanced programming background (as a sidenote: the Mendix platform targets various kinds of audiences, including users with a limited programming background).

At Mendix, the Nix project and its tools are still largely unknown technology. In the company, sharing learning experiences within the entire R&D team is generally encouraged.

Mendix already uses rivalling technologies to automate application deployments. As an exercise to learn the technical architecture of running applications better, I automated the deployment process of Mendix applications with the Nix package manager and NixOS making it possible to automatically deploy a Mendix application by writing a simple NixOS configuration file and running only a single command-line instruction.

I also presented the Nix automation process to the entire R&D department and wrote an article about it on the public Mendix blog. (I kept a transcription on my own blog for archiving purposes).

Nix


In addition to Disnix, and joining Mendix (where I gave an introduction to Nix), I did some general Nix-related work as well.

To improve my (somewhat unorthodox) way of working, I created a syntax highlighter for the Nix expression language for one of my favourite tools: the editor that comes with the Midnight Commander.

As a personal experiment and proposal to tidy up some internals in the Nix packages repository, I developed my own build function abstractions to mimic Nix's stdenv.mkDerivation {} function abstraction with the purpose to clearly identify and separate its concerns.

Later, I extended this function abstraction approach to tidy up the deployment automation of pluggable SDKs, such as the Android SDK, to make it easier to automatically compose SDKs with plugins and the corresponding applications that they build.

Finally, I experimented with an approach that can automatically patch all ELF binaries in the Android SDK (and other binary only software projects), so that they will run from the Nix store without any problems.

Some of the function abstraction techniques described in the blog posts listed above as well as the auto patching strategy are used in the revised version of the Android build infrastructure that I merged into the master version of Nixpkgs last week. Aside from upgrading the Android SDK to the latest version, these improvements make maintaining the Android SDK and its plugins much easier.

In addition to writing Nix-related blog posts, I did much more Nix related stuff. I also gave a talk at NixCon 2018 (the third conference held about Nix and its related technologies) about Dysnomia's current state of affairs and I released a new major version of node2nix that adds Node.js 10.x support.

Overall top 10 of my blog posts


As with previous years, I will publish my overall top 10 of most popular blog posts:

  1. Managing private Nix packages outside the Nixpkgs tree. As I predicted in my blog reflection last year, this blog post is going to overtake the popularity of the blog post about GNU Guix. It seems that this blog post proves that we should provide more practical/hands-on information to people who just want to start using the Nix package manager.
  2. On Nix and GNU Guix. This has been my most popular blog post since 2012 and has now dropped to the second place. I think this is caused by the fact that GNU Guix is not as controversial as it used to be around the time it was first announced.
  3. An evaluation and comparison of Snappy Ubuntu. Remains my third most popular blog post but is gradually dropping in popularity. I have not heard much about the Snappy package manager developments in the last two years.
  4. Setting up a multi-user Nix installation on non-NixOS systems. Is still my fourth most popular blog post. I believe this blog post should drop in popularity soon, thanks to a number of improvements made to the Nix package manager. There is now a Nix installer for non-NixOS systems that supports multi-user and single-user installations on any Linux and macOS system.
  5. Yet another blog post about Object Oriented Programming and JavaScript. Still at the same place compared to last year. I noticed that still quite a few people use this blog post as a resource to learn more about prototypes in JavaScript, despite the fact that newer implementations of the ECMAScript standard have better functions (such as: Object.create) to manage objects with prototypes.
  6. An alternative explanation of the Nix package manager. Remains at exactly the same spot compared to last year. It probably remains popular due to the fact that this blog post is my preferred explanation recipe.
  7. On NixOps, Disnix, service deployment and infrastructure deployment. No change compared to last year. It still seems to clear up confusion to groups of people.
  8. Asynchronous programming with JavaScript. Maintains the same position compared to last year. I have no idea why it remains so popular, despite many improvements to the Node.js runtime and JavaScript language.
  9. Composing FHS-compatible chroot environments with Nix (or deploying Steam in NixOS). The popularity of this blog post has been gradually dropping in the last few years, but all of a sudden it increased again. As a result, it now rose to the 9th place.
  10. A more realistic public Disnix example. This is the only blog post I wrote in 2018. It seems that this public example case is helpful for people to understand the kind of systems and requirements you have to meet in order to use Disnix to its full potential.

Discussion


What you may have noticed is that there is a drop in the number of my more technical blog posts this year. This drop is caused by the fact that I am basically in a transition period -- I still have to familiarize myself with my new environment and it is probably going to take a bit of time before I am back in my usual rhythm.

But do not be worried. As usual, I have plenty of ideas and there will be more interesting stuff coming next year!

The final thing I would like to say is:

HAPPY NEW YEAR!!!!