Thursday, June 11, 2020

Using Disnix as a simple and minimalistic dependency-based process manager

In my previous blog post, I demonstrated that I can deploy an entire service-oriented system locally with Disnix, without needing to obtain any external physical or virtual machines (or even Linux containers).

The fact that I could do this with relative ease is a benefit of the experimental process manager-agnostic deployment framework that I developed earlier, which allows you to target a variety of process management solutions with the same declarative deployment specifications.

Most notably, the fact that the framework can work with processes that daemonize, and can automatically daemonize foreground processes, makes it very convenient to do local unprivileged user deployments.

To refresh your memory: a process that daemonizes spawns another process that keeps running in the background, while the invoking process terminates after the initialization is done. Since there is no way for the caller to know the PID of the daemon process, daemons typically follow the convention of writing a PID file to disk (containing the daemon's process ID), so that the daemon can eventually be reliably terminated.

In addition to spawning a daemon process that remains in the background, a service should also implement a number of steps to make itself well-behaved, such as resetting signal handlers, clearing privacy-sensitive environment variables, and dropping privileges.

In earlier blog posts, I argued that managing foreground processes with a process manager is typically more reliable (e.g. the PID of a foreground process is always known to be correct).

On the other hand, processes that daemonize also have certain advantages:

  • They are self-contained -- they do not rely on any external services to operate. This makes it very easy to run a collection of processes for local experimentation.
  • They have a standard means to notify the caller that the service is ready. By convention, the executable that spawns the daemon process is only supposed to terminate when the daemon has been successfully initialized. By comparison, foreground processes that are managed by systemd should invoke the non-standard sd_notify() function to notify systemd that they are ready.

Although these concepts are nice, properly daemonizing a process is the responsibility of the service implementer -- as a consequence, there is no guarantee that every service implements all the steps required to make a daemon well-behaved.

Since the management of daemons is straightforward and self-contained, since the Nix expression language offers all kinds of advantages over data-oriented configuration languages (such as JSON or YAML), and since Disnix has a flexible deployment model that works with a dependency graph and a plugin system that can activate and deactivate all kinds of components, I realized that I could combine these facilities into my own simple dependency-based process manager.

In this blog post, I will describe how this process management approach works.

Specifying a process configuration


A simple Nix expression capturing a daemon deployment configuration might look as follows:

{writeTextFile, mydaemon}:

writeTextFile {
  name = "mydaemon";
  text = ''
    process=${mydaemon}/bin/mydaemon
    pidFile=/var/run/mydaemon.pid
  '';
  destination = "/etc/dysnomia/process";
}

The above Nix expression generates a textual configuration file:

  • The process field specifies the path to the executable to start (which in turn spawns a daemon process that keeps running in the background).
  • The pidFile field indicates the location of the PID file containing the process ID of the daemon process, so that it can be reliably terminated.

Most common system services (e.g. the Apache HTTP server, MySQL and PostgreSQL) can daemonize on their own and follow the same conventions. As a result, the deployment system can save you some configuration work by providing reasonable default values:

  • If no pidFile is provided, then the deployment system assumes that the daemon generates a PID file that has the same name as the executable and resides in /var/run, the directory that is commonly used for storing PID files.
  • If a package provides only a single executable in the bin/ sub folder, then the process field can be omitted as well.

The fact that the configuration system provides reasonable defaults means that for trivial services we do not have to specify any configuration properties at all -- simply providing a single executable in the package's bin/ sub folder suffices.
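For instance, because mydaemon writes its PID file to /var/run/mydaemon.pid (the default location derived from the executable name), the configuration shown earlier can be reduced to the following sketch, in which the pidFile field is omitted:

{writeTextFile, mydaemon}:

writeTextFile {
  name = "mydaemon";
  # The pidFile field is omitted: /var/run/mydaemon.pid is assumed by default
  text = ''
    process=${mydaemon}/bin/mydaemon
  '';
  destination = "/etc/dysnomia/process";
}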

Do these simple configuration facilities really suffice to manage all kinds of system services? The answer is most likely no, because we may also want to manage processes that cannot daemonize on their own, or we may need to initialize some state first before the service can be used.

To provide these additional facilities, we can create a wrapper script around the executable and refer to it in the process field of the deployment specification.

The following Nix expression generates a deployment configuration for a service that requires state and only runs as a foreground process:

{stdenv, writeTextFile, writeScript, daemon, myForegroundService}:

let
  myForegroundServiceWrapper = writeScript "myforegroundservice-wrapper" ''
    #! ${stdenv.shell} -e

    # Create the state directory that the service requires
    mkdir -p /var/lib/myservice

    # Use libslack's daemon command to daemonize the foreground process
    # and write a PID file
    exec ${daemon}/bin/daemon -U -F /var/run/mydaemon.pid -- \
      ${myForegroundService}/bin/myservice
  '';
in
writeTextFile {
  name = "mydaemon";
  text = ''
    process=${myForegroundServiceWrapper}
    pidFile=/var/run/mydaemon.pid
  '';
  destination = "/etc/dysnomia/process";
}

As you may observe, the Nix expression shown above generates a wrapper script that does the following:

  • First, it creates the required state directory (/var/lib/myservice) so that the service can work properly.
  • Then it invokes libslack's daemon command to automatically daemonize the service. The daemon command stores a PID file containing the daemon's process ID, so that the configuration system knows how to terminate it. The -F parameter passed to the daemon executable and the pidFile configuration property refer to the same path.

Typically, in deployment systems that use a data-driven configuration language (such as YAML or JSON), generating a wrapped executable is a burden, but in the Nix expression language this is quite convenient -- the language allows you to automatically build packages and other static artifacts, such as configuration files and scripts, and pass their corresponding Nix store paths as parameters to configuration files.

The combination of wrapper scripts and a simple configuration file suffices to manage all kinds of services, but it is fairly low-level -- to automate the deployment process of a system service, you basically need to re-implement the same kinds of configuration properties over and over again.

In the Nix process management framework, I have developed a high-level abstraction function for creating managed processes that can be used to target all kinds of process managers:

{createManagedProcess, runtimeDir}:
{port}:

let
  webapp = import ../../webapp;
in
createManagedProcess rec {
  name = "webapp";
  description = "Simple web application";

  # This expression can both run in foreground or daemon mode.
  # The process manager can pick which mode it prefers.
  process = "${webapp}/bin/webapp";
  daemonArgs = [ "-D" ];

  environment = {
    PORT = port;
    PID_FILE = "${runtimeDir}/${name}.pid";
  };
}

The above Nix expression is a constructor function that generates a configuration for a web application process (with an embedded HTTP server) that returns a static HTML page.

The createManagedProcess abstraction function can be used to generate configuration artifacts for systemd, supervisord and launchd, as well as various kinds of scripts, such as sysvinit scripts and BSD rc scripts.

I can also easily adjust the generator infrastructure to generate the configuration files shown earlier (capturing the path of an executable and a PID file) by means of a wrapper script.
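For the webapp example above, such a generated artifact could follow the same pattern as the configuration files shown earlier. The following is a hypothetical sketch (the actual generator output may differ in its details): a wrapper script that exports the environment variables declared in the process model and starts the webapp in daemon mode, plus the textual configuration that refers to it:

{writeTextFile, writeScript, stdenv, webapp, runtimeDir}:
{port}:

let
  # Hypothetical wrapper script: export the environment variables declared
  # in the process model and start the webapp with its daemon flag (-D)
  webappWrapper = writeScript "webapp-wrapper" ''
    #! ${stdenv.shell} -e

    export PORT=${toString port}
    export PID_FILE=${runtimeDir}/webapp.pid

    exec ${webapp}/bin/webapp -D
  '';
in
writeTextFile {
  name = "webapp";
  text = ''
    process=${webappWrapper}
    pidFile=${runtimeDir}/webapp.pid
  '';
  destination = "/etc/dysnomia/process";
}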

Managing daemons with Disnix


As explained in earlier blog posts about Disnix, services in a Disnix deployment model are abstract representations of basically any kind of deployment unit.

Every service is annotated with a type field. Disnix consults a plugin system named Dysnomia to invoke the corresponding plugin that can manage the lifecycle of that service, e.g. by activating or deactivating it.

Implementing a Dysnomia module for directly managing daemons is quite straightforward -- as an activation step, I just have to start the process defined in the configuration file (or the single executable that resides in the bin/ sub folder of the package).

As a deactivation step (whose purpose is to stop the process), I simply need to send a TERM signal to the PID in the PID file, by running:

$ kill $(cat $pidFile)

Translation to a Disnix deployment specification


The last remaining bits of the puzzle are process dependency management and the translation to a Disnix services model, so that Disnix can carry out the deployment.

Deployments managed by the Nix process management framework are driven by so-called processes models, which capture the properties of running process instances, such as:

{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager ? "disnix"
}:

let
  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir forceDisableUserChange processManager;
  };
in
rec {
  webapp = rec {
    port = 5000;
    dnsName = "webapp.local";

    pkg = constructors.webapp {
      inherit port;
    };
  };

  nginxReverseProxy = rec {
    port = 8080;

    pkg = constructors.nginxReverseProxyHostBased {
      webapps = [ webapp ];
      inherit port;
    } {};
  };
}

The above Nix expression is a simple example of a processes model defining two running processes:

  • The webapp process is the web application process described earlier that runs an embedded HTTP server and serves a static HTML page.
  • The nginxReverseProxy process is an Nginx web server that acts as a reverse proxy for the webapp process. To make this service work properly, it needs to be activated after the webapp process. To ensure that the activation is done in the right order, webapp is passed as a process dependency to the nginxReverseProxyHostBased constructor function.

As explained in previous blog posts, Disnix deployments are driven by three kinds of deployment specifications: a services model that captures the service components of which a system consists, an infrastructure model that captures all available target machines and their configuration properties, and a distribution model that maps services in the services model to machines in the infrastructure model.

The processes model and Disnix services model are quite similar -- the latter is actually a superset of the processes model.

We can translate process instances to Disnix services in a straightforward manner. For example, the nginxReverseProxy process can be translated into the following Disnix service configuration:

nginxReverseProxy = rec {
  name = "nginxReverseProxy";
  port = 8080;

  pkg = constructors.nginxReverseProxyHostBased {
    webapps = [ webapp ];
    inherit port;
  } {};

  activatesAfter = {
    inherit webapp;
  };

  type = "process";
};

In the above specification, the process configuration has been augmented with the following properties:

  • A name property, because this is a mandatory field for every service.
  • In the process management framework, all process instances are managed by the same process manager, but in Disnix, services can have all kinds of shapes and forms and require a plugin to manage their life-cycles.

    To allow Disnix to manage daemons, we set the type property to refer to our process Dysnomia module, which starts and terminates a daemon from a simple textual specification.
  • The process dependencies are translated to Disnix inter-dependencies by using the activatesAfter property.

    In Disnix, inter-dependency parameters serve two purposes -- they provide the inter-dependent services with configuration parameters and they ensure the correct activation ordering.

    The activatesAfter parameter disregards the first purpose (propagating configuration parameters), because we are already using the process management framework's own convention for propagating process dependencies.

To allow Disnix to carry out the deployment of processes, a services model alone does not suffice. Since we are only interested in local deployment, we can simply provide an infrastructure model with only a localhost target and a distribution model that maps all services to localhost.

To accomplish this, we can use the same principles for local deployments described in the previous blog post.
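As an illustration, a minimal infrastructure model for this scenario could look like the following sketch (the models that the tooling generates may contain additional properties):

{
  localhost = {
    properties = {
      hostname = "localhost";
    };
  };
}

and the corresponding distribution model could simply map both services to that target:

{infrastructure}:

{
  webapp = [ infrastructure.localhost ];
  nginxReverseProxy = [ infrastructure.localhost ];
}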

An example deployment scenario


I have added a new tool called nixproc-disnix-switch to the Nix process management framework that automatically converts processes models into Disnix deployment models and invokes Disnix to locally deploy a system.

The following command will carry out the complete deployment of our webapp example system, shown earlier, using Disnix as a simple dependency-based process manager:

$ nixproc-disnix-switch --state-dir /home/sander/var \
  --force-disable-user-change processes.nix

In addition to using Disnix for deploying processes, we can also use its other features. For example, I typically find the deployment visualization tool useful.

We can use Disnix to generate a DOT graph from the deployment architecture of the currently deployed system and render it as an image:

$ disnix-visualize > out.dot
$ dot -Tpng out.dot > out.png

Resulting in the following diagram:


In the first blog post that I wrote about the Nix process management framework (in which I explored a functional discipline using sysvinit scripts as a basis), I used hand-drawn diagrams to illustrate deployments.

With the Disnix backend, I can use Disnix's visualization tool to automatically generate these diagrams.

Discussion


In this blog post, I have shown that by implementing a few very simple concepts, we can use Disnix as a process management backend for the experimental Nix-based process management framework.

Although it was fun to develop a simple process management solution, my goal is not to compete with existing process management solutions (such as systemd, launchd or supervisord) -- this solution is primarily designed for simple use cases and local experimentation.

For production deployments, you probably still want to use a more sophisticated solution. For example, in production scenarios you also want to check the status of running processes and send them reload instructions. These are features that the Disnix backend does not support.

The Nix process management framework supports a variety of process managers, but none of them can be universally used on all platforms that Disnix can run on. For example, the sysvinit-script module works conveniently for local deployments, but is restricted to Linux only. Likewise, the bsdrc-script module only works on FreeBSD (and theoretically on NetBSD and OpenBSD). supervisord works on most UNIX-like systems, but is not self-contained -- processes rely on the availability of the supervisord service to run.

This Disnix-based process management solution is simple and portable to all UNIX-like systems that Disnix has been tested on.

The process module described in this blog post is a replacement for the process module that already exists in the current release of Dysnomia. The reason I want to replace it is that Dysnomia now provides better alternatives to the old process module.

For example, when it is desired to have your process managed by systemd, then the new systemd-unit module should be used, which is more reliable, supports many more features, and has a simpler implementation.

Furthermore, I made a couple of mistakes in the past. The old process module was originally implemented as a simple module that would start a foreground process in the background by using the nohup command. At the time I developed that module, I did not know much about developing daemons, nor about the additional steps daemons need to carry out to make themselves well-behaved.

nohup is not a proper solution for daemonizing foreground processes, such as critical system services -- a process might inherit privacy-sensitive environment variables; it does not change the current working directory to the root folder, which keeps external drives mounted; and it could behave unpredictably if signal handlers have been changed from their default behaviour.

At some point, I believed that it would be more reliable to use a process manager to manage the lifecycle of a process, and adjusted the process module to do so. Originally I used Upstart for this purpose, and later I switched to systemd, with sysvinit scripts and the direct nohup approach as alternative implementations.

Basically, the process module provided three kinds of implementations, none of which offered an optimal deployment experience.

I made a similar mistake with Dysnomia's wrapper module. Originally, its only purpose was to delegate the execution of deployment activities to a wrapper script included with the component that needs to be deployed. Because I was mostly using this module to deploy daemons, I also adjusted the wrapper module to use an external process manager to manage the lifecycle of the daemon that the wrapper script might spawn.

Because of these mistakes and the poor separation of functionality, I have decided to deprecate the old process and wrapper modules. Since they are frequently used and I do not want to break compatibility with old deployments, they can still be used if Dysnomia is configured in legacy mode, which is the default setting for the time being.

When using the old modules, Dysnomia will display a warning message explaining that you should migrate to the better alternatives.

Availability


The process Dysnomia module described in this blog post is part of the current development version of Dysnomia and will become available in the next release.

The Nix process management framework (which is still a highly experimental prototype) includes the disnix backend described in this blog post, allowing you to automatically translate a processes model to Disnix deployment models and use Disnix to deploy the system.
