Saturday, February 15, 2020

A declarative process manager-agnostic deployment framework based on Nix tooling

In a previous blog post written two months ago, I introduced a new experimental Nix-based process framework that provides the following features:

  • It uses the Nix expression language for configuring running process instances, including their dependencies. The configuration process is based on only a few simple concepts: function definitions to define constructors that generate process manager configurations, function invocations to compose running process instances, and Nix profiles to make collections of process configurations accessible from a single location.
  • The Nix package manager delivers all packages and configuration files and isolates them in the Nix store, so that they never conflict with other running processes and packages.
  • It identifies process dependencies, so that a process manager can ensure that processes are activated and deactivated in the right order.
  • The ability to deploy multiple instances of the same process, by making conflicting resources configurable.
  • Deploying processes/services as an unprivileged user.
  • Advanced concepts and features, such as namespaces and cgroups, are not required.

Another objective of the framework is that it should work with a variety of process managers on a variety of operating systems.

In my previous blog post, I deliberately used sysvinit scripts (also known as LSB Init compliant scripts) to manage the life-cycle of running processes as a starting point, because they are universally supported on Linux and self-contained -- sysvinit scripts only require the right packages to be installed, and they do not rely on external programs that manage the processes' life-cycle. Moreover, sysvinit scripts can also conveniently be used as an unprivileged user.

I have also developed a Nix function that can be used to more conveniently generate sysvinit scripts. Traditionally, these scripts are written by hand, which basically requires the implementer to write the same boilerplate code over and over again, such as the activities that start and stop the process.

The sysvinit script generator function can also be used to directly specify the implementation of all activities that manage the life-cycle of a process, such as:

{createSystemVInitScript, nginx, stateDir}:
{configFile, dependencies ? [], instanceSuffix ? ""}:

let
  instanceName = "nginx${instanceSuffix}";
  nginxLogDir = "${stateDir}/${instanceName}/logs";
in
createSystemVInitScript {
  name = instanceName;
  description = "Nginx";
  activities = {
    start = ''
      mkdir -p ${nginxLogDir}
      log_info_msg "Starting Nginx..."
      loadproc ${nginx}/bin/nginx -c ${configFile} -p ${stateDir}
      evaluate_retval
    '';
    stop = ''
      log_info_msg "Stopping Nginx..."
      killproc ${nginx}/bin/nginx
      evaluate_retval
    '';
    reload = ''
      log_info_msg "Reloading Nginx..."
      killproc ${nginx}/bin/nginx -HUP
      evaluate_retval
    '';
    restart = ''
      $0 stop
      sleep 1
      $0 start
    '';
    status = "statusproc ${nginx}/bin/nginx";
  };
  runlevels = [ 3 4 5 ];

  inherit dependencies instanceName;
}

In the above Nix expression, we specify five activities to manage the life-cycle of Nginx, a free/open source web server:

  • The start activity initializes the state of Nginx and starts the process (as a daemon that runs in the background).
  • stop stops the Nginx daemon.
  • reload instructs Nginx to reload its configuration.
  • restart restarts the process.
  • status shows whether the process is running or not.

Besides directly implementing activities, the createSystemVInitScript function can also be used at a much higher level -- typically, sysvinit scripts follow the same conventions. Nearly all sysvinit scripts implement the activities described above to manage the life-cycle of a process, and these typically need to be re-implemented over and over again.

We can also generate the implementations of these activities automatically from a high level specification, such as:

{createSystemVInitScript, nginx, stateDir}:
{configFile, dependencies ? [], instanceSuffix ? ""}:

let
  instanceName = "nginx${instanceSuffix}";
  nginxLogDir = "${stateDir}/${instanceName}/logs";
in
createSystemVInitScript {
  name = instanceName;
  description = "Nginx";
  initialize = ''
    mkdir -p ${nginxLogDir}
  '';
  process = "${nginx}/bin/nginx";
  args = [ "-c" configFile "-p" stateDir ];
  runlevels = [ 3 4 5 ];

  inherit dependencies instanceName;
}

You could basically say that the above createSystemVInitScript function invocation makes the configuration process of a sysvinit script "more declarative" -- you do not need to specify the activities that need to be executed to manage processes, but instead, you specify the relevant characteristics of a running process.

From this high level specification, the implementations for all required activities will be derived, using conventions that are commonly used to write sysvinit scripts.

After completing the initial version of the process management framework that works with sysvinit scripts, I have also been investigating other process managers. I discovered that their configuration processes have many things in common with the sysvinit approach. As a result, I have decided to explore these declarative deployment concepts a bit further.

In this blog post, I will describe a declarative process manager-agnostic deployment approach that we can integrate into the experimental Nix-based process management framework.

Writing declarative deployment specifications for managed running processes


As explained in the introduction, I have also been experimenting with other process managers than sysvinit. For example, instead of generating a sysvinit script that manages the life-cycle of a process, such as the Nginx server, we can also generate a supervisord configuration file to define Nginx as a program that can be managed with supervisord:

{createSupervisordProgram, nginx, stateDir}:
{configFile, dependencies ? [], instanceSuffix ? ""}:

let
  instanceName = "nginx${instanceSuffix}";
  nginxLogDir = "${stateDir}/${instanceName}/logs";
in
createSupervisordProgram {
  name = instanceName;
  command = "mkdir -p ${nginxLogDir}; "+
    "${nginx}/bin/nginx -c ${configFile} -p ${stateDir}";
  inherit dependencies;
}

Invoking the above function will generate a supervisord program configuration file, instead of a sysvinit script.

With the following Nix expression, we can generate a systemd unit file so that Nginx's life-cycle can be managed by systemd:

{createSystemdService, nginx, stateDir}:
{configFile, dependencies ? [], instanceSuffix ? ""}:

let
  instanceName = "nginx${instanceSuffix}";
  nginxLogDir = "${stateDir}/${instanceName}/logs";
in
createSystemdService {
  name = instanceName;
  Unit = {
    Description = "Nginx";
  };
  Service = {
    ExecStartPre = "+mkdir -p ${nginxLogDir}";
    ExecStart = "${nginx}/bin/nginx -c ${configFile} -p ${stateDir}";
    Type = "simple";
  };

  inherit dependencies;
}

What you may notice when comparing the above two Nix expressions with the last sysvinit example (that captures process characteristics instead of activities) is that they all contain very similar properties. Their main difference is a slightly different organization and naming convention, because each abstraction function is tailored towards the configuration conventions of its target process manager.

As discussed in my previous blog post about declarative programming and deployment, declarativity is a spectrum -- the above specifications are (somewhat) declarative because they do not capture the activities to manage the life-cycle of the process (the how). Instead, they specify what process we want to run. The process manager derives and executes all activities to bring that process in a running state.

sysvinit scripts themselves are not declarative, because they specify all activities (i.e. shell commands) that need to be executed to accomplish that goal. supervisord configurations and systemd unit configuration files are (somewhat) declarative, because they capture process characteristics -- the process manager derives and executes all required activities to bring the process in a running state.

Despite the fact that I am not specifying any process management activities, these Nix expressions could still be considered somewhat of a "how specification", because each configuration is tailored towards a specific process manager. A process manager, such as sysvinit, is a means to accomplish something else: getting a running process whose life-cycle can be conveniently managed.

If I revise the above specifications to only express what kind of running process I want, disregarding the process manager, then I could simply write:

{createManagedProcess, nginx, stateDir}:
{configFile, dependencies ? [], instanceSuffix ? ""}:

let
  instanceName = "nginx${instanceSuffix}";
  nginxLogDir = "${stateDir}/${instanceName}/logs";
in
createManagedProcess {
  name = instanceName;
  description = "Nginx";
  initialize = ''
    mkdir -p ${nginxLogDir}
  '';
  process = "${nginx}/bin/nginx";
  args = [ "-c" configFile "-p" "${stateDir}/${instanceName}" ];

  inherit dependencies instanceName;
}

The above Nix expression simply states that we want to run a managed Nginx process (using certain command-line arguments), and that before starting the process, we want to initialize the state by creating the log directory, if it does not exist yet.

I can translate the above specification to all kinds of configuration artifacts that can be used by a variety of process managers to accomplish the same outcome. I have developed six kinds of generators, allowing me to target the following process managers:

  • sysvinit scripts
  • BSD rc scripts
  • supervisord
  • systemd
  • launchd
  • cygrunsrv

Translating the properties of the process manager-agnostic configuration to process manager-specific properties is quite straightforward for most concepts -- in many cases, there is a direct mapping between a property in the process manager-agnostic configuration and a process manager-specific property.

For example, when we intend to target supervisord, then we can translate the process and args parameters to a command invocation. For systemd, we can translate process and args to the ExecStart property that refers to a command-line instruction that starts the process.

Although the process manager-agnostic abstraction function supports enough features to get some well known system services working (e.g. Nginx, the Apache HTTP server, PostgreSQL, MySQL etc.), it does not facilitate all possible features of each process manager -- it provides a reasonable set of common features to get a process running and to impose some restrictions on it.

It is still possible to work around the feature limitations of process manager-agnostic deployment specifications. We can influence the generation process by defining overrides to get process manager-specific properties supported:

{createManagedProcess, nginx, stateDir}:
{configFile, dependencies ? [], instanceSuffix ? ""}:

let
  instanceName = "nginx${instanceSuffix}";
  nginxLogDir = "${stateDir}/${instanceName}/logs";
in
createManagedProcess {
  name = instanceName;
  description = "Nginx";
  initialize = ''
    mkdir -p ${nginxLogDir}
  '';
  process = "${nginx}/bin/nginx";
  args = [ "-c" configFile "-p" "${stateDir}/${instanceName}" ];

  inherit dependencies instanceName;

  overrides = {
    sysvinit = {
      runlevels = [ 3 4 5 ];
    };
  };
}

In the above example, we have added an override specifically for sysvinit to tell the init system that the process should be started in runlevels 3, 4 and 5 (which implies that the process should be stopped in the remaining runlevels: 0, 1, 2, and 6). The other process managers that I have worked with do not have a notion of runlevels.

Similarly, we can use an override to, for example, use systemd-specific features to run a process in a Linux namespace etc.
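
For example, assuming that the systemd override follows the same Unit/Service structure as the createSystemdService parameters shown earlier (an assumption on my part), requesting a private /tmp namespace could look as follows -- PrivateTmp is a real systemd directive that relies on Linux namespaces:

overrides = {
  systemd = {
    Service = {
      PrivateTmp = true;
    };
  };
};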

Simulating process manager-agnostic concepts with no direct equivalents


For some process manager-agnostic concepts, process managers do not always have direct equivalents. In such cases, there is still the possibility to apply non-trivial simulation strategies.

Foreground processes or daemons


What all deployment specifications shown in this blog post have in common is that their main objective is to bring a process in a running state. How these processes are expected to behave is different among process managers.

sysvinit and BSD rc scripts expect processes to daemonize -- on invocation, a process spawns another process that keeps running in the background (the daemon process). After the initialization of the daemon process is done, the parent process terminates. If processes do not daemonize, the startup process execution blocks indefinitely.

Daemons introduce another complexity from a process management perspective -- when invoking an executable from a shell session in background mode, the shell can tell you its process ID, so that it can be stopped when it is no longer necessary.

With daemons, an invoked process forks another child process (or, when it is supposed to really behave well, it double forks) that becomes the daemon process. The daemon process gets adopted by the init system, and thus remains in the background even if the shell session ends.

The shell that invokes the executable does not know the PIDs of the resulting daemon processes, because that value is only propagated to the daemon's parent process, not to the calling shell session. To still be able to control it, a well-behaving daemon typically writes its process ID to a so-called PID file, so that it can be reliably terminated by a shell command when it is no longer required.

sysvinit and BSD rc scripts extensively use PID files to control daemons. By using a process' PID file, the managing sysvinit/BSD rc script can tell you whether a process is running or not and reliably terminate a process instance.
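
For example, by using a PID file, terminating a daemon typically boils down to a command such as (with a hypothetical PID file path):

$ kill "$(cat /var/run/nginx.pid)"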

"More modern" process managers, such as launchd, supervisord, and cygrunsrv, do not work with processes that daemonize -- instead, these process managers are daemons themselves that invoke processes that work in "foreground mode".

One of the advantages of this approach is that services can be controlled more reliably -- because their PIDs are directly propagated to the controlling daemon from the fork() library call, it is no longer required to work with PID files, which may not always work reliably (for example: a process might abruptly terminate and never clean up its PID file, giving the system the false impression that it is still running).

systemd improves process control even further by using Linux cgroups -- although foreground processes may be controlled more reliably than daemons, they can still fork other processes (e.g. a web service that creates processes per connection). When the controlling parent process terminates, and does not properly terminate its own child processes, they may keep running in the background indefinitely. With cgroups it is possible for the process manager to retain control over all processes spawned by a service and terminate them when a service is no longer needed.

systemd has another unique advantage over the other process managers -- it can work both with foreground processes and daemons, although foreground processes seem to have the preference according to the documentation, because they are much easier to control and develop.

Many common system services, such as OpenSSH, MySQL or Nginx, have the ability to both run as a foreground process and as a daemon, typically by providing a command-line parameter or defining a property in a configuration file.

To provide an optimal user experience for all supported process managers, it is typically a good idea to specify in the process manager-agnostic deployment specification both how a process can be used as a foreground process and as a daemon:

{createManagedProcess, nginx, stateDir, runtimeDir}:
{configFile, dependencies ? [], instanceSuffix ? ""}:

let
  instanceName = "nginx${instanceSuffix}";
  nginxLogDir = "${stateDir}/${instanceName}/logs";
in
createManagedProcess {
  name = instanceName;
  description = "Nginx";
  initialize = ''
    mkdir -p ${nginxLogDir}
  '';
  process = "${nginx}/bin/nginx";
  args = [ "-p" "${stateDir}/${instanceName}" "-c" configFile ];
  foregroundProcessExtraArgs = [ "-g" "daemon off;" ];
  daemonExtraArgs = [ "-g" "pid ${runtimeDir}/${instanceName}.pid;" ];

  inherit dependencies instanceName;

  overrides = {
    sysvinit = {
      runlevels = [ 3 4 5 ];
    };
  };
}

In the above example, we have revised the Nginx expression to specify both how the process can be started as a foreground process and as a daemon. The only thing that needs to be configured differently is one global directive in the Nginx configuration file -- by default, Nginx runs as a daemon, but by adding the daemon off; directive to the configuration we can run it in foreground mode.

When we run Nginx as a daemon, we configure a PID file that refers to the instance name, so that multiple instances can co-exist.

To make this conveniently configurable, the above expression does the following:

  • The process parameter specifies the process that needs to be started both in foreground mode and as a daemon. The args parameter specifies common command-line arguments that both the foreground and daemon process will use.
  • The foregroundProcessExtraArgs parameter specifies additional command-line arguments that are only used when the process is started in foreground mode. In the above example, it is used to provide Nginx the global directive that disables the daemon setting.
  • The daemonExtraArgs parameter specifies additional command-line arguments that are only used when the process is started as a daemon. In the above example, it is used to provide Nginx a global directive with a PID file path that uniquely identifies the process instance.

For custom software and services implemented in a different language than C, e.g. Node.js, Java or Python, it is far less common that they have the ability to daemonize -- they can typically only be used as foreground processes.

Nonetheless, we can still daemonize foreground-only processes, by using an external tool, such as libslack's daemon command:

$ daemon -U -i myforegroundprocess

The above command daemonizes the foreground process and creates a PID file for it, so that it can be managed by the sysvinit/BSD rc utility scripts.

The opposite kind of "simulation" is also possible -- if a process can only be used as a daemon, then we can use a proxy process to make it appear as a foreground process:

export _TOP_PID=$$

# Handle the SIGTERM and SIGINT signals and forward them to the daemon process
_term()
{
    trap "exit 0" TERM
    kill -TERM "$pid"
    kill $_TOP_PID
}

_interrupt()
{
    kill -INT "$pid"
}

trap _term SIGTERM
trap _interrupt SIGINT

# Start process in the background as a daemon
${executable} "$@"

# Wait for the PID file to become available.
# Useful to work with daemons that don't behave well enough.
count=0

while [ ! -f "${_pidFile}" ]
do
    if [ $count -eq 10 ]
    then
        echo "It does not seem that there isn't any pid file! Giving up!"
        exit 1
    fi

    echo "Waiting for ${_pidFile} to become available..."
    sleep 1

    count=$((count + 1))
done

# Determine the daemon's PID by using the PID file
pid=$(cat ${_pidFile})

# Wait in the background for the PID to terminate
${if stdenv.isDarwin then ''
  lsof -p $pid +r 3 &>/dev/null &
'' else if stdenv.isLinux || stdenv.isCygwin then ''
  tail --pid=$pid -f /dev/null &
'' else if stdenv.isBSD || stdenv.isSunOS then ''
  pwait $pid &
'' else
  throw "Don't know how to wait for process completion on system: ${stdenv.system}"}

# Wait for the blocker process to complete.
# We use wait, so that bash can still
# handle the SIGTERM and SIGINT signals that may be sent to it by
# a process manager
blocker_pid=$!
wait $blocker_pid

The idea of the proxy script shown above is that it runs as a foreground process as long as the daemon process is running and relays any relevant incoming signals (e.g. a terminate and interrupt) to the daemon process.

Implementing this proxy was a bit tricky:

  • In the beginning of the script we configure signal handlers for the TERM and INT signals so that the process manager can terminate the daemon process.
  • We must start the daemon and wait for it to become available. Although the parent process of a well-behaving daemon should only terminate when the initialization is done, this turns out not to be a hard guarantee -- to make the process a bit more robust, we deliberately wait for the PID file to become available, before we attempt to wait for the termination of the daemon.
  • Then we wait for the PID to terminate. The bash shell has an internal wait command that can be used to wait for a background process to terminate, but this only works with processes in the same process group as the shell. Daemons are in a new session (with different process groups), so they cannot be monitored by the shell by using the wait command.

    From this Stackoverflow article, I learned that we can use the tail command of GNU Coreutils, or lsof on macOS/Darwin, and pwait on BSDs and Solaris/SunOS to monitor processes in other process groups.
  • When a command is being executed by a shell script (e.g. in this particular case: tail, lsof or pwait), the shell script can no longer respond to signals until the command completes. To still allow the script to respond to signals while it is waiting for the daemon process to terminate, we must run the previous command in background mode, and we use the wait instruction to block the script. While a wait command is running, the shell can respond to signals.

The generator function will automatically pick the best solution for the selected target process manager -- this means that when our target process manager is sysvinit or BSD rc scripts, the generator automatically picks the configuration settings to run the process as a daemon. For the remaining process managers, the generator picks the configuration settings that run it as a foreground process.

If a desired process model is not supported, then the generator will automatically simulate it. For instance, if we have a foreground-only process specification, then the generator will automatically configure a sysvinit script to call the daemon executable to daemonize it.

A similar process happens when a daemon-only process specification is deployed for a process manager that cannot work with it, such as supervisord.

State initialization


Another important aspect in process deployment is state initialization. Most system services require the presence of state directories in which they can store their PID, log and temp files. If these directories do not exist, the service may not work and refuse to start.

To cope with this problem, I typically make processes self-initializing -- before starting the process, I check whether the state has been initialized (e.g. check if the state directories exist) and re-initialize the initial state if needed.

With most process managers, state initialization is easy to facilitate. For sysvinit and BSD rc scripts, we just use the generator to first execute the shell commands to initialize the state before the process gets started.

Supervisord allows you to execute multiple shell commands in a single command directive -- we can just execute a script that initializes the state before we execute the process that we want to manage.
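
For example, a generated program definition could delegate to a shell along the following lines (a hypothetical rendering with made-up store paths; the framework's actual output may differ):

[program:nginx]
command=/bin/sh -c 'mkdir -p /home/sander/var/nginx/logs; exec /nix/store/...-nginx/bin/nginx -c /nix/store/...-nginx.conf -p /home/sander/var'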

systemd has an ExecStartPre directive that can be used to specify shell commands to execute before the main process starts.

Apple launchd and cygrunsrv, however, do not have a generic shell execution mechanism or a facility that allows you to execute things before a process starts. Nonetheless, we can still ensure that the state is going to be initialized by creating a wrapper script -- first the wrapper script does the state initialization and then it executes the main process.

If a state initialization procedure was specified and the target process manager does not support scripting, then the generator function will transparently wrap the main process into a wrapper script that supports state initialization.
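
Such a wrapper script could look like this minimal sketch (with hypothetical store paths):

#!/bin/sh -e

# First, initialize the state
mkdir -p /home/sander/var/nginx/logs

# Then replace the wrapper process with the main process
exec /nix/store/...-nginx/bin/nginx -c /nix/store/...-nginx.conf -p /home/sander/var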

Process dependencies


Another important generic concept is process dependency management. For example, Nginx can act as a reverse proxy for another web application process. To provide a functional Nginx service, we must be sure that the web application process gets activated as well, and that the web application is activated before Nginx.

If the web application process is activated after Nginx or missing completely, then Nginx is (temporarily) unable to redirect incoming requests to the web application process causing end-users to see bad gateway errors.

The process managers that I have experimented with all have a different notion of process dependencies.

sysvinit scripts can optionally declare dependencies in their comment sections. Tools that know how to interpret these dependency specifications can use them to decide the right activation order. Systems using sysvinit typically ignore this specification. Instead, they work with sequence numbers in the script file names -- in each run level configuration directory, the script names consist of a prefix (S or K) followed by two numeric digits that define the start or stop order.

supervisord does not work with dependency specifications, but every program can optionally provide a priority setting that can be used to order the activation and deactivation of programs -- programs with lower priority numbers are activated before programs with higher priority numbers.

From dependency specifications in a process management expression, the generator function can automatically derive sequence numbers for process managers that require it.

Similar to sysvinit scripts, BSD rc scripts can also declare dependencies in their comment sections. Contrary to sysvinit scripts, BSD rc scripts can use the rcorder tool to parse these dependencies from the comments section and automatically derive the order in which the BSD rc scripts need to be activated.

cygrunsrv also allows you to directly specify process dependencies. The Windows service manager makes sure that the services get activated in the right order and that all process dependencies are activated first. The only limitation is that cygrunsrv only allows up to 16 dependencies to be specified per service.

To simulate process dependencies with systemd, we can use two properties. The Wants property can be used to tell systemd that another service must be activated as well. The After property can be used to specify the activation ordering.
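
For example, a generated Nginx unit that depends on a web application process could carry directives along these lines (webapp.service is a hypothetical unit name):

[Unit]
Description=Nginx
Wants=webapp.service
After=webapp.service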

Sadly, it seems that launchd has no notion of process dependencies at all -- processes can be activated by certain events, e.g. when a kernel module was loaded or through socket activation, but it does not seem to have the ability to configure process dependencies or the activation ordering. When our target process manager is launchd, then we simply have to inform the user that proper activation ordering cannot be guaranteed.

Changing user privileges


Another general concept that has subtle differences in each process manager is changing user privileges. Typically, for the deployment of system services, you do not want to run these services as the root user (that has full access to the filesystem), but as an unprivileged user.

sysvinit and BSD rc scripts have to change users through the su command. The su command can be used to change the user ID (UID), and will automatically adopt the primary group ID (GID) of the corresponding user.
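
For example, a generated sysvinit start activity could wrap the process invocation roughly as follows (webapp is a hypothetical unprivileged user and the store path is made up):

su webapp -s /bin/sh -c '/nix/store/...-webapp/bin/webapp'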

Supervisord and cygrunsrv can also only change user IDs (UIDs), and will adopt the primary group ID (GID) of the corresponding user.

Systemd and launchd can both change the user IDs and group IDs of the processes that they invoke.

Because only changing UIDs is universally supported amongst process managers, I did not add a configuration property that allows you to change GIDs in a process manager-agnostic way.

Deploying process manager-agnostic configurations


With a processes Nix expression, we can define which process instances we want to run (and how they can be constructed from source code and their dependencies):

{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager
}:

let
  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir;
    inherit forceDisableUserChange processManager;
  };
in
rec {
  webapp = rec {
    port = 5000;
    dnsName = "webapp.local";

    pkg = constructors.webapp {
      inherit port;
    };
  };

  nginxReverseProxy = rec {
    port = 8080;

    pkg = constructors.nginxReverseProxy {
      webapps = [ webapp ];
      inherit port;
    } {};
  };
}

In the above Nix expression, we compose two running process instances:

  • webapp is a trivial web application process that will simply return a static HTML page by using the HTTP protocol.
  • nginxReverseProxy is an Nginx server configured as a reverse proxy. It will forward incoming HTTP requests to the appropriate web application instance, based on the virtual host name. If the virtual host name is webapp.local, then Nginx forwards the request to the webapp instance.

To generate the configuration artifacts for the process instances, we refer to a separate constructors Nix expression. Each constructor will call the createManagedProcess function abstraction (as shown earlier) to construct a process configuration in a process manager-agnostic way.

With the following command-line instruction, we can generate sysvinit scripts for the webapp and Nginx processes declared in the processes expression, and run them as an unprivileged user with the state files managed in our home directory:

$ nixproc-build --process-manager sysvinit \
  --state-dir /home/sander/var \
  --force-disable-user-change processes.nix

By adjusting the --process-manager parameter we can also generate artefacts for a different process manager. For example, the following command will generate systemd unit config files instead of sysvinit scripts:

$ nixproc-build --process-manager systemd \
  --state-dir /home/sander/var \
  --force-disable-user-change processes.nix

The following command will automatically build and deploy all processes, using sysvinit as a process manager:

$ nixproc-sysvinit-switch --state-dir /home/sander/var \
  --force-disable-user-change processes.nix

We can also run a life-cycle management activity on all previously deployed processes. For example, to retrieve the statuses of all processes, we can run:

$ nixproc-sysvinit-runactivity status

We can also traverse the processes in reverse dependency order. This is particularly useful to reliably stop all processes, without breaking any process dependencies:

$ nixproc-sysvinit-runactivity -r stop

Similarly, there are command-line tools to use the other supported process managers. For example, to deploy systemd units instead of sysvinit scripts, you can run:

$ nixproc-systemd-switch processes.nix

Distributed process manager-agnostic deployment with Disnix


As shown in the previous process management framework blog post, it is also possible to deploy processes to machines in a network and have inter-dependencies between processes. These kinds of deployments can be managed by Disnix.

Compared to the previous blog post (in which we could only deploy sysvinit scripts), we can now also use any process manager that the framework supports. The Dysnomia toolset provides plugins that support all process managers that this framework supports:

{ pkgs, distribution, invDistribution, system
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager ? "sysvinit"
}:

let
  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir;
    inherit forceDisableUserChange processManager;
  };

  processType =
    if processManager == "sysvinit" then "sysvinit-script"
    else if processManager == "systemd" then "systemd-unit"
    else if processManager == "supervisord" then "supervisord-program"
    else if processManager == "bsdrc" then "bsdrc-script"
    else if processManager == "cygrunsrv" then "cygrunsrv-service"
    else throw "Unknown process manager: ${processManager}";
in
rec {
  webapp = rec {
    name = "webapp";
    port = 5000;
    dnsName = "webapp.local";
    pkg = constructors.webapp {
      inherit port;
    };
    type = processType;
  };

  nginxReverseProxy = rec {
    name = "nginxReverseProxy";
    port = 8080;
    pkg = constructors.nginxReverseProxy {
      inherit port;
    };
    dependsOn = {
      inherit webapp;
    };
    type = processType;
  };
}

In the above expression, we have extended the previously shown processes expression into a Disnix service expression, in which every attribute in the attribute set represents a service that can be distributed to a target machine in the network.

The type attribute of each service indicates which Dysnomia plugin needs to manage its life-cycle. We can automatically select the appropriate plugin for our desired process manager by deriving it from the processManager parameter.

The above Disnix expression has a drawback -- in a heterogeneous network of machines (that run multiple operating systems and/or process managers), we need to compose all desired variants of each service with configuration files for each process manager that we want to use.

It is also possible to have target-agnostic services, by delegating the translation steps to the corresponding target machines. Instead of directly generating a configuration file for a process manager, we generate a JSON specification containing all parameters that are passed to createManagedProcess. We can use this JSON file to build the corresponding configuration artefacts on the target machine:

{ pkgs, distribution, invDistribution, system
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager ? null
}:

let
  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir;
    inherit forceDisableUserChange processManager;
  };
in
rec {
  webapp = rec {
    name = "webapp";
    port = 5000;
    dnsName = "webapp.local";
    pkg = constructors.webapp {
      inherit port;
    };
    type = "managed-process";
  };

  nginxReverseProxy = rec {
    name = "nginxReverseProxy";
    port = 8080;
    pkg = constructors.nginxReverseProxy {
      inherit port;
    };
    dependsOn = {
      inherit webapp;
    };
    type = "managed-process";
  };
}

In the above services model, we have set the processManager parameter to null, causing the generator to emit JSON representations of the function parameters passed to createManagedProcess.

The managed-process type refers to a Dysnomia plugin that consumes the JSON specification and invokes the createManagedProcess function to convert the JSON configuration to a configuration file used by the preferred process manager.
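
For the Nginx example shown earlier, such a JSON specification could resemble the following sketch (an illustration only, with made-up store paths; the framework's exact schema may differ):

{
  "name": "nginx",
  "description": "Nginx",
  "initialize": "mkdir -p /home/sander/var/nginx/logs",
  "process": "/nix/store/...-nginx/bin/nginx",
  "args": [ "-c", "/nix/store/...-nginx.conf", "-p", "/home/sander/var/nginx" ]
}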

In the infrastructure model, we can configure the preferred process manager for each target machine:

{
  test1 = {
    properties = {
      hostname = "test1";
    };
    containers = {
      managed-process = {
        processManager = "sysvinit";
      };
    };
  };

  test2 = {
    properties = {
      hostname = "test2";
    };
    containers = {
      managed-process = {
        processManager = "systemd";
      };
    };
  };
}

In the above infrastructure model, the managed-process container on the first machine (test1) has been configured to use sysvinit scripts to manage processes. On the second test machine (test2), the managed-process container is configured to use systemd to manage processes.

If we distribute the services in the services model to targets in the infrastructure model as follows:

{infrastructure}:

{
  webapp = [ infrastructure.test1 ];
  nginxReverseProxy = [ infrastructure.test2 ];
}

and then deploy the system as follows:

$ disnix-env -s services.nix -i infrastructure.nix -d distribution.nix

Then the webapp process will be distributed to the test1 machine in the network and will be managed with a sysvinit script.

The nginxReverseProxy will be deployed to the test2 machine and managed as a systemd job. The Nginx reverse proxy forwards incoming connections to the webapp.local domain name to the web application process hosted on the first machine.

Discussion


In this blog post, I have introduced a process manager-agnostic function abstraction making it possible to target all kinds of process managers on a variety of operating systems.

By using a single set of declarative specifications, we can:

  • Target six different process managers on four different kinds of operating systems.
  • Implement various kinds of deployment scenarios, such as production deployments and test deployments as an unprivileged user.
  • Construct multiple instances of processes.

In a distributed context, the advantage is that we can uniformly target all supported process managers and operating systems in a heterogeneous environment from a single declarative specification.

This is particularly useful to facilitate technology diversity -- for example, one of the key selling points of Microservices is that "any technology" can be used to implement them. In many cases, technology diversity is "restricted" to frameworks, programming languages, and storage technologies.

One particular aspect that is rarely changed is the choice of operating systems, because of the limitations of deployment tools -- most deployment solutions for Microservices are container-based and heavily rely on Linux-only concepts, such as Namespaces and cgroups.

With this process management framework and the recent Dysnomia plugin additions for Disnix, it is possible to target all kinds of operating systems that support the Nix package manager, making the operating system component selectable as well. This, for example, allows you to pick the best operating system to implement a certain requirement -- for example, when performance is important you might pick Linux, and when there is a strong emphasis on security, you could pick OpenBSD to host a mission critical component.

Limitations


The following table summarizes the differences between the process manager solutions that I have investigated:

                               sysvinit          bsdrc             supervisord       systemd           launchd           cygrunsrv
Process type                   daemon            daemon            foreground        foreground,       foreground        foreground
                                                                                     daemon
Process control method         PID files         PID files         Process PID       cgroups           Process PID       Process PID
Scripting support              yes               yes               yes               yes               no                no
Process dependency management  Numeric ordering  Dependency-based  Numeric ordering  Dependency-based  None              Dependency-based
                                                                                     + dependency                        + dependency
                                                                                     loading                             loading
User changing capabilities     user              user              user and group    user and group    user and group    user
Unprivileged user deployments  yes*              yes*              yes               yes*              no                no
Operating system support       Linux             FreeBSD,          Many UNIX-like:   Linux (+glibc)    macOS (Darwin)    Windows (Cygwin)
                                                 OpenBSD,          Linux, macOS,     only
                                                 NetBSD            FreeBSD, Solaris

Although we can facilitate lifecycle management from a common specification with a variety of process managers, only the most important common features are supported.

Not every concept can be supported in a process manager-agnostic way. For example, we cannot generically isolate resources (except for packages, because we use Nix). It is difficult to generalize these concepts because they are not standardized, e.g. the POSIX standard does not describe namespaces and cgroups (or similar concepts).

Furthermore, most process managers (with the exception of supervisord) are operating system specific. As a result, it still matters what process manager is picked.

Related work


Process manager-agnostic deployment is not entirely a new idea. Dysnomia has had a target-agnostic 'process' plugin for quite a while, that translates a simple deployment specification (consisting of key-value pairs) to a systemd unit configuration file or a sysvinit script.

The features of Dysnomia's process plugin are much more limited compared to the createManagedProcess abstraction function described in this blog post. It does not support any process managers other than sysvinit and systemd, and it can only work with foreground processes.

Furthermore, target-agnostic configurations cannot be easily extended -- it is possible to (ab)use the templating mechanism, but it has no first-class override facilities.

I also found a project called pleaserun that has the objective to generate configuration files for a variety of process managers (both my approach and pleaserun support sysvinit scripts, systemd and launchd).

It seems to use template files to generate the configuration artefacts, and it does not seem to have a generic extension mechanism. Furthermore, it provides no framework to configure the location of shared resources, automatically install package dependencies or to compose multiple instances of processes.

Some remaining thoughts


Although the Nix package manager (not the NixOS distribution) should be portable amongst a variety of UNIX-like systems, it turns out that the only two operating systems that are well supported are Linux and macOS. Nix was reported to work on a variety of other UNIX-like systems in the past, but recently it seems that many things are broken.

To make Nix work on FreeBSD 12.1, I have used the latest stable Nix package manager version with patches from this repository. It turns out that there is still a patch missing to work around a bug in FreeBSD that incorrectly kills all processes in a process group. Fortunately, when we run Nix as an unprivileged user, this bug does not seem to cause any serious problems.

Recent versions of Nixpkgs turn out to be horribly broken on FreeBSD -- the FreeBSD stdenv does not seem to work at all. I tried switching back to stdenv-native (a stdenv environment that impurely uses the host system's compiler and core executables), but that also no longer seems to work in the last three major releases -- the Nix expression evaluation breaks in several places. Due to the sheer amount of changes and assumptions that the stdenv infrastructure currently makes, it was as good as impossible for me to fix the infrastructure.

As another workaround, I reverted to a very old version of Nixpkgs (version 17.03 to be precise), that still has a working stdenv-native environment. With some tiny adjustments (e.g. adding some shell aliases for some GNU variants of certain shell executables to stdenv-native), I have managed to get some basic Nix packages working, including Nginx on FreeBSD.

Surprisingly, running Nix on Cygwin was less painful than on FreeBSD (because of all the GNUisms that Cygwin provides). Similar to FreeBSD, recent versions of Nixpkgs also appear to be broken, including the Cygwin stdenv environment. By reverting back to release-18.03 (that still has a somewhat working stdenv for Cygwin), I have managed to build a working Nginx version.

As a future improvement to Nixpkgs, I would like to propose a testing solution for stdenv-native. Although I understand that it is difficult to dedicate manpower to maintaining all unconventional Nix/Nixpkgs ports, stdenv-native is something that we can conveniently test on Linux and prevent from breaking in the future.

Availability


The latest version of my experimental Nix-based process framework, that includes the process manager-agnostic configuration function described in this blog post, can be obtained from my GitHub page.

In addition, the repository also contains some example cases, including the web application system described in this blog post, and a set of common system services: MySQL, Apache HTTP server, PostgreSQL and Apache Tomcat.

Monday, January 6, 2020

Writing a well-behaving daemon in the C programming language

Slightly over one month ago, I wrote a blog post about a new experimental Nix-based process management framework that I have been developing. For this framework, I need to experiment with processes that run in the foreground (i.e. they block the shell of the user that invokes them as long as they are running), and daemons -- processes that run in the background and are not directly controlled by the user.

Daemons are (still) a common practice in the UNIX world (although this is changing nowadays with process managers, such as systemd and launchd) to make system services available to end users, such as web servers, the secure shell, and FTP.

To make experimentation more convenient, I wanted to write a very simple service that can run both in the foreground and as a daemon. Initially, I thought writing a daemon would be straightforward, but this turned out to be much more difficult than I initially anticipated.

I have learned that daemonizing a process is quite simple, but writing a well-behaving daemon is quite complicated. I have been studying a number of sources on how to properly write one and none of them provided me with all the information that I needed. As a result, I have decided to do some investigation myself and write a blog post about my findings.

The basics


As I have stated earlier, the basics of writing a daemon in the C programming language are simple. For example, I can write a very trivial service whose only purpose is to print Hello! on the terminal every second until it receives a terminate or interrupt signal:

#include <stdio.h>
#include <unistd.h>
#include <signal.h>

#define TRUE 1
#define FALSE 0

/* sig_atomic_t guarantees that the flag can be safely updated from a signal handler */
volatile sig_atomic_t terminated = FALSE;

static void handle_termination(int signum)
{
    terminated = TRUE;
}

static void init_service(void)
{
    signal(SIGINT, handle_termination);
    signal(SIGTERM, handle_termination);
}

static void run_main_loop(void)
{
    while(!terminated)
    {
        fprintf(stderr, "Hello!\n");
        sleep(1);
    }
}

The following trivial main method allows us to let the service run in "foreground mode":

int main()
{
    init_service();
    run_main_loop();
    return 0; 
}

The above main method initializes the service (that configures the signal handlers) and invokes the main loop (as defined in the previous code example). The main loop keeps running until it receives a terminate (SIGTERM) or interrupt (SIGINT) signal that unblocks the main loop.

When we run the above program in a shell session, we should observe:

$ ./simpleservice
Hello!
Hello!
Hello!

We will see that the service prints Hello! every second until it gets terminated. Moreover, we will notice that the shell is blocked from receiving user input until we terminate the process. Furthermore, if we terminate the shell (for example by sending it a TERM signal from another shell session), the service gets terminated as well.

We can easily change the main method, shown earlier, to turn our trivial service (that runs in foreground mode) into a daemon:

#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>

int main()
{
    pid_t pid = fork();

    if(pid == -1)
    {
        fprintf(stderr, "Can't fork daemon process!\n");
        return 1;
    }
    else if(pid == 0)
        run_main_loop();

    return 0;
}

The above code forks a child process, the child process executes the main loop, and the parent process terminates immediately.

When running the above program on the terminal, we should see that the ./simpleservice command returns almost immediately and a daemon process keeps running in the background. Stopping our shell session (e.g. with the exit command or killing it by sending a TERM signal to it), does not cause the daemon process to be stopped.

This behaviour can be easily explained -- because the shell only waits for the completion of the process that it invokes (the parent process), it will no longer block indefinitely, because it terminates directly after forking the child process.

The daemon process keeps running (even if we end our shell session), because it gets orphaned from the parent and adopted by the process that runs at PID 1 -- the init system.

Writing a well-behaving daemon


The above code fragments may look very trivial. Is this really sufficient to create a daemon? You could probably already guess that the answer is: no.

To learn more about properly writing a daemon, I studied various sources. The first source I consulted was the Linux Daemon HOWTO, but that document turned out to be a bit outdated (to be precise: it was last updated in 2004). This document basically shows how to implement a very minimalistic version of a well-behaving daemon. It does much more than just forking a child process, for reasons that I will explain later in this blog post.

After some additional searching, I stumbled on systemd's recommendations for writing a traditional SysV daemon (this information can also be found by opening the following manual page: man 7 daemon). systemd's daemon manual page specifies even more steps. Contrary to the Linux Daemon HOWTO, it does not provide any code examples.

Despite the fact that the HOWTO implements more requirements than just a simple fork, it still looked quite simple. Implementing all of systemd's recommendations, however, turned out to be much more complicated than I expected.

It also made me realize: why is all this stuff needed? None of the sources that I studied so far explain why all these additional steps need to be implemented.

After some thinking, I believe I understand why: a well-behaving daemon needs to be fully detached from user control, be controllable from an external program, and act safely and predictably.

In the following sections I will explain what I believe is the rationale for each step described in the systemd daemon manual page. Moreover, I will describe the means that I used to implement each requirement:

Closing all file descriptors, except the standard ones: stdin, stdout, stderr


Closing all but the standard file descriptors is a good practice, because the daemon process inherits all open files from the calling process (e.g. the shell session from which the daemon is invoked).

Not closing any additional open file descriptors may cause them to remain open for an indefinite amount of time, making it impossible to cleanly unmount the partition where these files may have been stored. Moreover, it also keeps file descriptors unnecessarily allocated.

The daemon manual page describes two strategies to implement closing these non-standard file descriptors. On Linux, it is possible to iterate over the content of the /proc/self/fd directory. A portable, but less efficient, way is to iterate from file descriptor 3 to the value returned by getrlimit for RLIMIT_NOFILE.

I ended up implementing this step with the following function:

#include <unistd.h>
#include <sys/time.h>
#include <sys/resource.h>

static int close_non_standard_file_descriptors(void)
{
    rlim_t i;
    struct rlimit rlim;

    /* getrlimit() returns 0 on success; the actual limit is stored in rlim */
    if(getrlimit(RLIMIT_NOFILE, &rlim) == -1)
        return FALSE;

    /* Close every file descriptor, except stdin (0), stdout (1) and stderr (2) */
    for(i = 3; i < rlim.rlim_cur; i++)
        close(i);

    return TRUE;
}

Resetting all signal handlers to their defaults


Similar to file descriptors, the daemon process also inherits the signal handler configuration of the caller process. If signal handlers have been altered, then the daemon process may behave in a non-standard and unpredictable way.

For example, the TERM signal handler could have been overridden so that the daemon no longer cleanly shuts down when it receives a TERM signal. As a countermeasure, the signal handlers must be reset to their default behaviour.

The systemd daemon manual page suggests iterating over all signals up to the limit of _NSIG and resetting them to SIG_DFL.

I did some investigation and it seems that this method is not standardized by e.g. POSIX -- _NSIG is a constant that glibc defines and there is no guarantee that other libc implementations provide the same constant.

I ended up implementing the following function:

#include <signal.h>

static int reset_signal_handlers_to_default(void)
{
#if defined _NSIG
    unsigned int i;

    for(i = 1; i < _NSIG; i++)
    {
         if(i != SIGKILL && i != SIGSTOP)
             signal(i, SIG_DFL);
    }
#endif
    return TRUE;
}

The above implementation iterates from the first signal handler, up until the maximum signal handler. It will ignore SIGKILL and SIGSTOP because they cannot be overridden.

Unfortunately, this implementation will not work with libc implementations that lack the _NSIG constant. I am really curious if somebody could suggest me a standards compliant way to reset all signal handlers.

Resetting the signal mask


It is also possible to completely block certain signals by adjusting the signal mask. The signal mask also gets inherited by the daemon from the calling process. To make a daemon act predictably, e.g. it should do a proper shutdown when it receives the TERM signal, it is a good thing to reset the signal mask to the default configuration.

I ended up implementing this requirement with the following function:

#include <signal.h>

static int clear_signal_mask(void)
{
    sigset_t set;

    return((sigemptyset(&set) == 0)
      && (sigprocmask(SIG_SETMASK, &set, NULL) == 0));
}

Sanitizing the environment block


Another property that a daemon process inherits from the caller are the environment variables. Some environment variables might negatively affect the behaviour of the daemon. Furthermore, environment variables may also contain privacy-sensitive information that could get exposed if the security of a daemon gets compromised.

As a countermeasure, it would be good to sanitize the environment block, for example by removing environment variables with clearenv() or by using a whitelisting approach.

For my trivial example case, I did not need to sanitize the environment block because no environment variables are used.
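
Had sanitizing been necessary, a minimal sketch could look like this (assuming a glibc-based system -- clearenv() is a GNU extension that is not part of POSIX, and the PATH whitelist below is made up):

#include <stdlib.h>

static int sanitize_environment(void)
{
    /* Drop all inherited environment variables */
    if(clearenv() != 0)
        return FALSE;

    /* Re-add only the variables that the daemon really needs (hypothetical whitelist) */
    return setenv("PATH", "/usr/bin:/bin", 1) == 0;
}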

Forking a background process


After closing all non-standard file descriptors, and resetting the signal handlers to their default behaviour, we can fork a background process. The primary reason to fork a background process, as explained earlier, is to get it orphaned from the parent so that it gets adopted by PID 1, the init system, and stays in the background.

We must actually fork twice, as I will explain later. First, I fork a child process that I call the helper process. The helper process does some more housekeeping work and forks another child process, that becomes our daemon process.
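Schematically, the procedure looks like this -- a simplified sketch that leaves out the error handling and the notification pipe covered in the next sections:

#include <unistd.h>
#include <sys/types.h>

/* Simplified sketch of the double fork; the setsid() call and the
   notification pipe are explained in the next sections */
static pid_t fork_twice(void)
{
    pid_t pid = fork();

    if(pid == 0)
    {
        /* First child: the helper process */

        if(setsid() == -1) /* Detach from the controlling terminal */
            _exit(1);

        if(fork() == 0)
            return 0; /* Second child: the daemon process does its work here */

        _exit(0); /* Exit the helper so that the daemon gets adopted by PID 1 */
    }

    return pid; /* The caller continues here (pid is the helper's PID, or -1) */
}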

Detaching from the terminal


The child process is still attached to the terminal of the caller process, and can still read input from the terminal and send output to the terminal. To completely detach it from the terminal (and any user interaction), we must adjust the session ID:

if(setsid() == -1)
{
    /* Do some error handling */
}

and then we must fork again, so that the daemon can never re-acquire a terminal again. The second fork will create the real daemon process. The helper process should terminate so that the newly created daemon process gets adopted by the init system (that runs on PID 1):

if(fork_daemon_process(pipefd[1],
  pid_file,
  data,
  initialize_daemon,
  run_main_loop) == -1)
{
    /* Do some error handling */
}

/*
 * Exit the helper process,
 * so that the daemon process gets adopted by PID 1
 */
exit(0);

Connecting /dev/null to standard input, output and error in the daemon process


Since we have detached from the terminal, we should connect /dev/null to the standard file descriptors in the daemon process, because these file descriptors are still connected to the terminal from which we have detached.

I implemented this requirement with the following function:

#include <fcntl.h>
#include <unistd.h>

#define NULL_DEV_FILE "/dev/null" /* Path to the null device */

static int attach_standard_file_descriptors_to_null(void)
{
    int null_fd_read, null_fd_write;

    return(((null_fd_read = open(NULL_DEV_FILE, O_RDONLY)) != -1)
      && (dup2(null_fd_read, STDIN_FILENO) != -1)
      && ((null_fd_write = open(NULL_DEV_FILE, O_WRONLY)) != -1)
      && (dup2(null_fd_write, STDOUT_FILENO) != -1)
      && (dup2(null_fd_write, STDERR_FILENO) != -1));
}

Resetting the umask to 0 in the daemon process


The umask (a setting that globally alters file permissions of newly created files) may have been adjusted by the calling process, causing directories and files created by the daemon to have unpredictable file permissions.

As a countermeasure, we should reset the umask to 0 with the following function call:

umask(0);

Changing current working directory to / in the daemon process


The daemon process also inherits the current working directory of the caller process. It may happen that the current working directory refers to an external drive or partition. As a result, it can no longer be cleanly unmounted while the daemon is running.

To prevent this from happening, we should change the current working directory to the root folder, because that is the only partition that is guaranteed to stay mounted while the system is running:

if(chdir("/") == -1)
{
    /* Do some error handling */
}

Creating a PID file in the daemon process


Because a program that daemonizes forks another process, and terminates immediately, there is no way for the caller (e.g. the shell) to know what the process ID (PID) of the daemon process is. The caller can only know the PID of the parent process, that terminates right after setting up the daemon.

A common practice to make the PID of the daemon process known is to write a PID file that contains its process ID (PID). A PID file can be used to reliably terminate the service when it is no longer needed.
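For example, a control program could use the PID file to terminate the daemon. A minimal sketch (terminate_daemon() is a hypothetical helper, not part of my implementation):

#include <stdio.h>
#include <signal.h>
#include <sys/types.h>

/* Hypothetical sketch: read a daemon's PID from its PID file and send it the TERM signal */
static int terminate_daemon(const char *pid_file)
{
    FILE *file = fopen(pid_file, "r");
    int pid;

    if(file == NULL)
        return FALSE;

    if(fscanf(file, "%d", &pid) != 1)
    {
        fclose(file);
        return FALSE;
    }

    fclose(file);

    return kill((pid_t)pid, SIGTERM) == 0;
}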

According to the systemd recommendations, a PID file must be created in a race-free fashion, e.g. when a daemon has already been started, it should not attempt to create another PID file with the same name.

I ended up implementing this requirement as follows:

#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>

static int create_pid_file(const char *pid_file)
{
    pid_t my_pid = getpid();
    char my_pid_str[16];
    int fd;

    snprintf(my_pid_str, sizeof(my_pid_str), "%d", (int)my_pid);

    if((fd = open(pid_file, O_CREAT | O_EXCL | O_WRONLY, S_IRUSR | S_IWUSR)) == -1)
        return FALSE;

    if(write(fd, my_pid_str, strlen(my_pid_str)) == -1)
    {
        close(fd);
        return FALSE;
    }

    close(fd);

    return TRUE;
}

In the above implementation, the O_EXCL flag (in combination with O_CREAT) makes sure that the open fails when a previously generated PID file already exists or belongs to another process. If a PID file happens to exist already, the initialization of the daemon fails.

Dropping privileges in the daemon process, if applicable


Since daemons are typically long running, and they are typically started by the super user (root), they are also typically a security risk. By default, if a process is started as root, the daemon process also has root privileges and full access to the entire filesystem, if its security gets compromised.

For this reason, it is typically a good idea to drop privileges in the daemon process. There are a variety of restrictions you can impose, such as changing the ownership of the process to an unprivileged user:

/* Drop group privileges first: after setuid() to an unprivileged user, setgid() is no longer permitted */
if(setgid(100) == 0 && setuid(1000) == 0)
{
    /* Execute some code with restrictive user permissions */
    ...
}
else
{
    fprintf(stderr, "Cannot change user permissions!\n");
    exit(1);
}

In my trivial example case, I had no such requirement.

Notifying the parent process when the initialization of the daemon is complete


Another practical problem you may run into with daemons is that you do not know for sure when they are ready to be used. Because the parent process terminates immediately and delegates most of the work, including the initialization steps, to the daemon process (that runs in the background), you may already attempt to use it before the initialization is done. For example, if the daemon provides a network service, then connecting to it right after starting the daemon may fail, because it is not listening yet.

Furthermore, there is no way to know for sure how long it takes before all the daemon's services become available. This is particularly inconvenient for scripting.

For me personally, notification was the most complicated requirement to implement.

systemd's daemon manual page suggests using an unnamed pipe for this purpose. I ended up with an implementation that looks as follows:

Before doing any forking, I create a pipe, and pass the corresponding file descriptors to the utility function that creates the helper process, as described earlier:

int pipefd[2];

if(pipe(pipefd) == -1)
    return STATUS_CANNOT_CREATE_PIPE;
else
{
    if(fork_helper_process(pipefd, pid_file, data, initialize_daemon, run_main_loop) == -1)
        return STATUS_CANNOT_FORK_HELPER_PROCESS;
    else
    {
        /* Wait for a notification message from the helper or daemon process */
    }
}

The helper and daemon processes use the write end of the pipe to send notification messages. I ended up using it as follows:

static pid_t fork_helper_process(int pipefd[2],
  const char *pid_file,
  void *data,
  int (*initialize_daemon) (void *data),
  int (*run_main_loop) (void *data))
{
    pid_t pid = fork();

    if(pid == 0)
    {
        close(pipefd[0]); /* Close unneeded read-end */

        if(setsid() == -1)
        {
            notify_parent_process(pipefd[1], STATUS_CANNOT_SET_SID);
            exit(STATUS_CANNOT_SET_SID);
        }

        /* Fork again, so that the terminal can not be acquired again */
        if(fork_daemon_process(pipefd[1], pid_file, data, initialize_daemon, run_main_loop) == -1)
        {
            notify_parent_process(pipefd[1], STATUS_CANNOT_FORK_DAEMON_PROCESS);
            exit(STATUS_CANNOT_FORK_DAEMON_PROCESS);
        }

        exit(0); /* Exit the helper process, so that the daemon process gets adopted by PID 1 */
    }

    return pid;
}

If something fails, or the entire initialization process completes successfully, the helper and daemon processes invoke the notify_parent_process() function to send a message over the write end of the pipe to notify the parent. In case of an error, the helper or daemon process also terminates with the same exit status.

I implemented the notification function as follows:

#include <errno.h>
#include <unistd.h>

static void notify_parent_process(int writefd, DaemonStatus message)
{
    char byte = (char)message;

    /* Retry the one-byte write when it gets interrupted by a signal */
    while(write(writefd, &byte, 1) == -1 && errno == EINTR)
        ;

    close(writefd);
}

The above function simply sends a message (of only one byte in size) over the pipe and then closes the connection. The possible messages are encoded in the following enumeration:

typedef enum
{
    STATUS_INIT_SUCCESS                  = 0x0,
    STATUS_CANNOT_ATTACH_STD_FDS_TO_NULL = 0x1,
    STATUS_CANNOT_CHDIR                  = 0x2,
    ...
    STATUS_CANNOT_SET_SID                = 0xc,
    STATUS_CANNOT_FORK_DAEMON_PROCESS    = 0xd,
    STATUS_UNKNOWN_DAEMON_ERROR          = 0xe
}
DaemonStatus;

The parent process will not terminate immediately, but waits for a notification message from the helper or daemon processes:

DaemonStatus exit_status;

close(pipefd[1]); /* Close unneeded write end */
exit_status = wait_for_notification_message(pipefd[0]);
close(pipefd[0]);
return exit_status;

When the parent receives a notification message, it will simply propagate the value as an exit status (which is 0 if everything succeeds, and non-zero when the process fails somewhere). The non-zero exit status corresponds to a value in the enumeration (shown earlier) allowing us to trace the origins of the error.

The function that waits for the notification of the daemon process is implemented as follows:

static DaemonStatus wait_for_notification_message(int readfd)
{
    char buf[BUFFER_SIZE];
    ssize_t bytes_read = read(readfd, buf, 1);

    if(bytes_read == -1)
        return STATUS_CANNOT_READ_FROM_PIPE;
    else if(bytes_read == 0)
        return STATUS_UNKNOWN_DAEMON_ERROR;
    else
        return buf[0];
}

The above function reads a single byte from the pipe, blocking as long as no data has been sent and the write end of the pipe has not been closed, and returns the byte that it has received.

Exiting the parent process after the daemon initialization is done


This requirement overlaps with the previous requirement and can be met by calling exit() after the notification message was sent (and/or the write end of the pipe was closed).

Discussion


I really did not expect that writing a well-behaving daemon (that follows systemd's recommendations) would be so difficult. I ended up writing 206 LOC to implement all the functionality listed above. Maybe I could reduce this amount a bit with some clever programming tricks, but my objective was to keep the code clear, have it decomposed into functions and make it understandable.

There are solutions that alleviate the burden of creating a daemon. A prominent example would be BSD's daemon() function (that is also included with glibc). It is a single function call that can be used to automatically daemonize a process. Unfortunately, it does not seem to meet all requirements that systemd specifies.
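Using it is nevertheless a one-liner (on glibc, exposing the declaration may require the _DEFAULT_SOURCE feature test macro):

#include <unistd.h>

/* Forks a background process, changes the current working directory to /
   and attaches the standard file descriptors to /dev/null */
if(daemon(0, 0) == -1)
{
    /* Do some error handling */
}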

I also looked at many Stack Overflow posts, and although they correctly cite systemd's daemon manual page with the requirements for a well-behaving daemon, none of the solutions that I could find fully meet all requirements -- in particular, I could not find any good examples that implement a protocol that notifies the parent process when the daemon process has been successfully initialized.

Because none of these Stack Overflow posts provide what I need, I have decided not to use any of these articles as an example, but to start from scratch and look for all relevant pieces myself.

One aspect that still puzzles me is how to "properly" iterate over all signal handlers. The solution hinted at by systemd is non-standard, requiring a glibc-specific constant. Some sources say that there is no standardized equivalent, so I am still curious whether there is a recipe that can reset all signal handlers to their default behaviour in a standards-compliant way.

In the introduction section, I mentioned that daemons are still a common practice in UNIX-like systems, such as Linux, but that this is changing. IMO this is for a good reason -- services typically need to reimplement the same kind of functionality over and over again. Furthermore, I have noticed that not all daemons meet all requirements and could behave incorrectly. For example, there is no guarantee that a daemon correctly writes a PID file with the PID of the daemon process.

For these reasons, systemd's daemon manual page also describes "new style daemons", that are considerably easier to implement with less boilerplate code. Apple has similar recommendations for launchd.

With "new style daemons", processes just spawn in foreground mode, and the process manager (e.g. systemd, launchd or supervisord) takes care of all "housekeeping tasks" -- the process manager makes sure that it runs in the background, drops user privileges etc.

Furthermore, because the process manager directly invokes the daemon process (and as a result knows its PID), controlling a daemon is also less fragile -- the requirement that a PID file needs to be properly created is also dropped.
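To illustrate the contrast: a minimal sketch of a new-style daemon is just a foreground program that handles the TERM signal gracefully -- everything else is the process manager's job:

#include <string.h>
#include <signal.h>
#include <unistd.h>

static volatile sig_atomic_t terminated = 0;

static void handle_term_signal(int signum)
{
    terminated = 1;
}

int main(void)
{
    /* A new-style daemon stays in the foreground; the process manager
       takes care of all the housekeeping tasks described in this post */
    struct sigaction act;

    memset(&act, 0, sizeof(struct sigaction));
    act.sa_handler = handle_term_signal;
    sigaction(SIGTERM, &act, NULL);

    while(!terminated)
        sleep(1); /* Do the real work here, e.g. serve requests */

    /* Clean up and shut down */
    return 0;
}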

Availability


The daemonize infrastructure described in this blog post is used by the example webapp that can be found in my experimental Nix process framework repository.

Monday, December 30, 2019

9th annual blog reflection

Today, it is my blog's 9th anniversary. Similar to previous years, this is a good opportunity for reflection.

A summary of 2019


There are a variety of things that I did in 2019 that can be categorized as follows:

Documenting the architecture of service-oriented systems


In the beginning of the year, most of my efforts were concentrated on software architecture related aspects. At Mendix, I have spent quite a bit of time on documenting the architecture of the platform that I and my team work on.

I wrote a blog post about my architecture documentation approach, that is based on a small number of simple configuration management principles that I came up with several years ago. I gave a company-wide presentation about this subject, that was well-received.

Later, I have extended Dynamic Disnix, a self-adaptive deployment framework that I have developed as part of my past research, with new tools to automate the architecture documentation approach that I have presented previously -- with these new tools, it has become possible to automatically generate architecture diagrams and descriptions from Disnix service models.

Furthermore, the documentation tools can also group components (making it possible to decompose diagrams into layers) and generate architecture documentation catalogs for complete systems.

Improving Disnix's data handling and data integration capabilities


While implementing the architecture documentation tools in Dynamic Disnix, I ran into several limitations of Disnix's internal architecture.

To alleviate the burden of developing new potential future integrations, I have "invented" NixXML, an XML file format that is moderately readable, that can easily be converted from and to Nix expressions, and that is still useful without Nix. I wrote a library called libnixxml that can be used to produce and consume NixXML data.

In addition to writing libnixxml, I also rewrote significant portions of Disnix and Dynamic Disnix to work with libnixxml (instead of hand-written parsing and Nix expression generation code). The result is that we need far less boilerplate code, we have more robust data parsing, and we have significantly better error reporting.

In the process of rewriting the Disnix data handling portions, I have been following a number of conventions that I have documented in this blog post.

Implementing a new input model transformation pipeline in Disnix


Besides data integration and data handling, I have also re-engineered the input model transformation pipeline in Disnix, that translates the Disnix input models (a services, infrastructure and distribution model), capturing aspects of a service-oriented system, to an executable model (a deployment manifest/model) from which the deployment activities that need to run can be derived.

The old implementation evolved organically and has grown so much that it became too complicated to adjust and maintain. In contrast, the new transformation pipeline consists of well-defined intermediate models.

As a result of having a better, more well-defined transformation pipeline, more properties can be configured.

Moreover, it is now also possible to use a fourth and new input model: a packages model that can be used to directly deploy a set of packages to a target machine in the network.

The new data handling features and transformation pipeline were integrated into Disnix 0.9 that was released in September 2019.

An experimental process management framework


In addition to the release of Disnix 0.9 that contains a number of very big internal improvements, I have started working on a new experimental Nix-based process management framework.

I wrote a blog post about the first milestone: a functional organization that makes it possible to construct multiple instances of managed processes, that can be deployed by privileged and unprivileged users.

Miscellaneous


Earlier this year, I was also involved in several discussions about developer motivation. Since this is a common and recurring subject in the developer community, I have decided to write about my own experiences.

Some thoughts


Compared to previous years, this year is my least productive from a blogging perspective -- this can be attributed to several reasons. First and foremost, there are a number of major personal and work-related events that contributed to a drop in productivity for a while.

Second, some of the blog posts that I wrote are results of very big undertakings -- for example, the internal improvements to Disnix took me several months to complete.

As an improvement for next year, I will try to more carefully decompose my tasks, prioritize and order them in a better way. For example, implementing the architecture documentation tools was unnecessarily difficult before NixXML and the revised data models were implemented. I should have probably done these tasks in the opposite order.

Overall top 10 of my blog posts


As with previous years, I am going to publish the top 10 of my most frequently read blog posts. Surprisingly, not much has changed compared to last year:

  1. Managing private Nix packages outside the Nixpkgs tree. As with last year, this blog post remains the most popular. I believe this can be attributed to the fact that there is still no official, simple hands-on tutorial available that allows users to easily experiment with building packages.
  2. On Nix and GNU Guix. This used to be my most popular blog post since 2012, but it dropped to second place last year. I think this shows that the distinction between Nix and GNU Guix has become more clear these days.
  3. An evaluation and comparison of Snappy Ubuntu. This post remains very popular, but I have not heard much from Snappy lately, so it is still a bit of a mystery to me why it stays in third place.
  4. Setting up a multi-user Nix installation on non-NixOS systems. Some substantial improvements have been made to support multi-user installations on other operating systems than NixOS, but I believe we can still do better. There are some ongoing efforts to improve the Nix installer even further.
  5. Yet another blog post about Object Oriented Programming and JavaScript. This blog post is still at the same spot compared to last year, and still seems to be read quite frequently. It seems that documenting my previous misunderstandings is quite helpful.
  6. An alternative explanation of the Nix package manager. This blog post is a somewhat unconventional Nix explanation recipe, that typically has my preference. It seems to catch quite a bit of attention.
  7. On NixOps, Disnix, service deployment and infrastructure deployment. This blog post is still as popular as last year. It seems that it is quite frequently used as an introduction to NixOps, although I did not write this blog post for that purpose.
  8. Asynchronous programming with JavaScript. A blog post I wrote several years ago, when I was learning about Node.js and asynchronous programming concepts. It still seems to remain popular even though the blog post is over six years old.
  9. Composing FHS-compatible chroot environments with Nix (or deploying Steam in NixOS). This blog post is still read quite frequently, probably because of Steam, which is a very popular application for gamers.
  10. Auto patching prebuilt binary software packages for deployment with the Nix package manager. This is the only new blog post in the overall top 10 compared to last year. Patching binaries to deploy software on NixOS is a complicated subject. The popularity of this blog post proves that developing solutions to make this process more convenient is valuable.

Conclusion


As with previous years, I am still not out of ideas so stay tuned!

There is one more thing I'd like to say and that is:


HAPPY NEW YEAR!!!!!!!!!!!

Monday, November 11, 2019

A Nix-based functional organization for managing processes

The Nix expression language and the Nix packages repository follow a number of unorthodox, but simple conventions that provide all kinds of benefits, such as the ability to conveniently construct multiple variants of packages and store them safely in isolation without any conflicts.

The scope of the Nix package manager, however, is limited to package deployment only. Other tools in the Nix project extend deployment to other kinds of domains, such as machine level deployment (NixOS), networks of machines (NixOps) and service-oriented systems (Disnix).

In addition to packages, there is also a category of systems (such as systems following the microservices paradigm) that are composed of running processes.

Recently, I have been automating deployments of several kinds of systems that are composed of running processes and I have investigated how we can map the most common Nix packaging conventions to construct specifications that we can use to automate the deployment of these kinds of systems.

Some common Nix packaging conventions


The Nix package manager implements a so-called purely functional deployment model. In Nix, packages are constructed in the Nix expression language from pure functions, in which side effects, such as undeclared dependencies residing in global directories (e.g. /lib and /bin), are eliminated as much as possible.

The function parameters of a build function refer to all required inputs to construct the package, such as the build instructions, the source code, environment variables and all required build-time dependencies, such as compilers, build tools and libraries.

A big advantage of eliminating side effects (or more realistically: significantly reducing side effects) is to support reproducible deployment -- when building the same package with the same inputs on a different machine, we should get a (nearly) bit-identical result.

Strong reproducibility guarantees, for example, make it possible to optimize package deployments by only building a package from source code once and then downloading binary substitutes from remote servers that can be trusted.

In addition to the fact that packages are constructed by executing pure functions (with some caveats), the Nixpkgs repository -- that contains a large set of well known free and open source packages -- follows a number of conventions. One such convention is that most package build recipes reside in separate files and that each recipe declares a function.

An example of such a build recipe is:

{ stdenv, fetchurl, pkgconfig, glib, gpm, file, e2fsprogs
, perl, zip, unzip, gettext, slang, libssh2, openssl}:

stdenv.mkDerivation rec {
  pname = "mc";
  version = "4.8.23";

  src = fetchurl {
    url = "http://www.midnight-commander.org/downloads/${pname}-${version}.tar.xz";
    sha256 = "077z7phzq3m1sxyz7li77lyzv4rjmmh3wp2vy86pnc4387kpqzyx";
  };

  buildInputs = [
    pkgconfig perl glib slang zip unzip file gettext libssh2 openssl
  ];

  configureFlags = [ "--enable-vfs-smb" ];

  meta = {
    description = "File Manager and User Shell for the GNU Project";
    homepage = http://www.midnight-commander.org;
    maintainers = [ stdenv.lib.maintainers.sander ];
    platforms = with stdenv.lib.platforms; linux ++ darwin;
  };
}

The Nix expression shown above (pkgs/tools/misc/mc/default.nix) describes how to build the Midnight Commander from source code and its inputs:

  • The first line declares a function in which the function arguments refer to all dependencies required to build Midnight Commander: stdenv refers to an environment that provides standard UNIX utilities, such as cat and ls and basic build utilities, such as gcc and make. fetchurl is a utility function that can be used to download artifacts from remote locations and that can verify the integrity of the downloaded artifact.

    The remainder of the function arguments refer to packages that need to be provided as build-time dependencies, such as tools and libraries.
  • In the function body, we invoke the stdenv.mkDerivation function to construct a Nix package from source code.

    By default, if no build instructions are provided, it automatically executes the standard GNU Autotools/GNU Make build procedure (./configure; make; make install), automatically downloads and unpacks the tarball specified by the src parameter, and uses buildInputs to instruct the configure script to automatically find the dependencies it needs.

A function definition that describes a package build recipe is not very useful on its own -- to be able to build a package, it needs to be invoked with the appropriate parameters.

A Nix package is composed in a top-level Nix expression (pkgs/top-level/all-packages.nix) that declares one big data structure: an attribute set, in which every attribute name refers to a possible variant of a package (typically only one) and each value to a function invocation that builds the package, with the desired versions or variants of the dependencies that a package may need:

{ system ? builtins.currentSystem }:

rec {
  stdenv = ...
  fetchurl = ...
  pkgconfig = ...
  glib = ...

  ...

  openssl = import ../development/libraries/openssl {
    inherit stdenv fetchurl zlib ...;
  };

  mc = import ../tools/misc/mc {
    inherit stdenv fetchurl pkgconfig glib gpm file e2fsprogs perl;
    inherit zip unzip gettext slang libssh2 openssl;
  };
}

The last attribute (mc) in the attribute set shown above, builds a specific variant of Midnight Commander, by passing the dependencies that it needs as parameters. It uses the inherit language construct to bind the parameters that are declared in the same lexical scope.

All the dependencies that Midnight Commander needs are declared in the same attribute set and composed in a similar way.

(As a sidenote: in the above example, we explicitly propagate all function parameters, which is quite verbose and tedious. In Nixpkgs, it is also possible to use a convenience function called: callPackage that will automatically pass the attributes with the same names as the function arguments as parameters.)

With the composition expression above, we can build Midnight Commander with the following command-line instruction:

$ nix-build all-packages.nix -A mc
/nix/store/wp3r8qv4k510...-mc-4.8.23

The Nix package manager will first deploy all build-time dependencies that Midnight Commander needs, and will then build Midnight Commander from source code. The build result is stored in the Nix store (/nix/store/...-mc-4.8.23), in which all build artifacts reside in isolation in their own directories.

We can start Midnight Commander by providing the full path to the mc executable:

$ /nix/store/wp3r8qv4k510...-mc-4.8.23/bin/mc

The prefix of every artifact in the Nix store is a SHA256 hash code derived from all inputs provided to the build function. The SHA256 hash prefix makes it possible to safely store multiple versions and variants of the same package next to each other, because they never share the same name.

If Nix happens to compute a SHA256 hash that already exists in the Nix store, then the build result is already present, so Nix does not need to do the same build again.

Because the Midnight Commander build recipe is a function, we can also adjust the function parameters to build different variants of the same package. For example, by changing the openssl parameter, we can build a Midnight Commander variant that uses a specific version of OpenSSL that is different than the default version:

{ system ? builtins.currentSystem }:

rec {
  stdenv = ...
  fetchurl = ...
  pkgconfig = ...
  glib = ...

  ...

  openssl_1_1_0 = import ../development/libraries/openssl/1.1.0.nix {
    inherit stdenv fetchurl zlib ...;
  };

  mc_alternative = import ../tools/misc/mc {
    inherit stdenv fetchurl pkgconfig glib gpm file e2fsprogs perl;
    inherit zip unzip gettext slang libssh2;
    openssl = openssl_1_1_0; # Use a different OpenSSL version
  };
}

We can build our alternative Midnight Commander variant as follows:

$ nix-build all-packages.nix -A mc_alternative
/nix/store/0g0wm23y85nc0y...-mc-4.8.23

As may be noticed, we get a different Nix store path, because we build Midnight Commander with different build inputs.

Although the purely functional model provides all kinds of nice benefits (such as reproducibility, the ability to conveniently construct multiple variants of a package, and storing them in isolation without any conflicts), it also has a big inconvenience from a user point of view -- as a user, it is very impractical to remember the SHA256 hash prefixes of a package to start a program.

As a solution, Nix also makes it possible to construct user environments (probably better known as Nix profiles), by using the nix-env tool or using the buildEnv {} function in Nixpkgs.

User environments are symlink trees that blend the content of a set of packages into a single directory in the Nix store so that they can be accessed from one single location. By adding the bin/ sub folder of a user environment to the PATH environment variable, it becomes possible for a user to start a command-line executable without specifying a full path.

For example, with the nix-env tool we can install the Midnight Commander in a Nix profile:

$ nix-env -f all-packages.nix -iA mc

and then start it as follows:

$ mc

The above command works if the Nix profile is in the PATH environment variable of the user.

Mapping packaging conventions to process management


There are four important packaging conventions that the Nix package manager and the Nixpkgs repository follow that I want to emphasize:

  • Invoking the derivation function (typically through stdenv.mkDerivation or an abstraction built around it) builds a package from its build inputs.
  • Every package build recipe defines a function in which the function parameters refer to all possible build inputs. We can use this function to compose all kinds of variants of a package.
  • Invoking a package build recipe function constructs a particular variant of a package and stores the result in the Nix store.
  • Nix profiles blend the content of a collection of packages into one directory and makes them accessible from a single location.

(As a sidenote: There is some discussion in the Nix community about these concepts. For example, one of the (self-)criticisms is that the Nix expression language, that is specifically designed as a DSL for package management, has no package concept in the language.

Despite this oddity, I personally think that functions are a simple and powerful concept. The only thing that is a bit of a poor decision in my opinion is to call the mechanism that executes a build: derivation).

Process management is quite different from package management -- we need to have an executable deployed first (typically done by a package manager, such as Nix), but in addition, we also need to manage the life-cycle of a process, such as starting and stopping it. These facilities are not Nix's responsibility. Instead, we need to work with a process manager that can facilitate these.

Furthermore, systems composed of running processes have a kind of dependency relationship that Nix does not manage -- they may also communicate with other processes (e.g. via a network connection or UNIX domain sockets).

As a consequence, they require the presence of other processes in order to work. This means that processes need to be activated in the right order or, alternatively, the communication between two dependent processes needs to be queued until both are available.

If these dependency requirements are not met, then a system may not work. For example, a web application process is useless if the database backend is not available.

In order to fully automate the deployment of systems that are composed of running processes, we can do package management with Nix first and then we need to:

  • Integrate with a process manager, by generating artifacts that a process manager can work with, such as scripts and/or configuration files.
  • Make it possible to specify the process dependencies so that they can be managed (by a process manager or by other means) and activated in the right order.

Generating sysvinit scripts


There are a variety of means to manage processes. A simple (and by today's standards maybe old-fashioned and perhaps controversial) way to manage processes is by using sysvinit scripts (also known as LSB Init compliant scripts).

A sysvinit script implements a set of activities and a standardized interface allowing us to manage the lifecycle of a specific process, or a group of processes.

For example, on a traditional Linux distribution, we can start a process, such as the Nginx web server, with the following command:

$ /etc/init.d/nginx start

and stop it as follows:

$ /etc/init.d/nginx stop

A sysvinit script is straightforward to implement and follows a number of conventions:

#!/bin/bash

### BEGIN INIT INFO
# Provides:      nginx
# Default-Start: 3 4 5
# Default-Stop:  0 1 2 6
# Should-Start:  webapp
# Should-Stop:   webapp
# Description:   Nginx
### END INIT INFO

. /lib/lsb/init-functions

case "$1" in
  start)
    log_info_msg "Starting Nginx..."
    mkdir -p /var/nginx/logs
    start_daemon /usr/bin/nginx -c /etc/nginx.conf -p /var/nginx 
    evaluate_retval
    ;;

  stop)
    log_info_msg "Stopping Nginx..."
    killproc /usr/bin/nginx
    evaluate_retval
    ;;

  reload)
    log_info_msg "Reloading Nginx..."
    killproc /usr/bin/nginx -HUP
    evaluate_retval
    ;;

  restart)
    $0 stop
    sleep 1
    $0 start
    ;;

  status)
    statusproc /usr/bin/nginx
    ;;

  *)
    echo "Usage: $0 {start|stop|reload|restart|status}"
    exit 1
    ;;
esac

  • A sysvinit script typically starts by providing some metadata, such as a description, in which runlevels it needs to be started and stopped, and which dependencies the script has.

    In classic Linux distributions, meta information is typically ignored, but more sophisticated process managers, such as systemd, can use it to automatically configure the activation/deactivation ordering.
  • The body defines a case statement that executes a requested activity.
  • Activities use a special construct (in the example above it is: evaluate_retval) to display the status of an instruction, typically whether a process has started or stopped successfully or not, using appropriate colors (e.g. red in case of failure, green in case of success).
  • sysvinit scripts typically define a number of commonly used activities: start starts a process, stop stops a process, reload sends a HUP signal to the process to let it reload its configuration (if applicable), restart restarts the process, status indicates the status, and there is a fallback activity that displays the usage to the end user to show which activities can be executed.

sysvinit scripts use a number of utility functions that are defined by the Linux Standard Base (LSB):

  • start_daemon is a utility function that is typically used for starting a process. It has the expectation that the process daemonizes -- a process that daemonizes will fork another process that keeps running in the background and then terminates immediately.

    Controlling a daemonized process is a bit tricky -- when spawning a process, the shell can tell you its process id (PID), so that it can be controlled, but it cannot tell you the PID of the process that gets daemonized by the invoked process, because that is beyond the shell's control.

    As a solution, most programs that daemonize will write a PID file (e.g. /var/run/nginx.pid) that can be used to determine the PID of the daemon so that it can be controlled.

    To do proper housekeeping, the start_daemon function will check whether such a PID file already exists, and will only start the process when it needs to.
  • Stopping a process, or sending it a different kind of signal, is typically done with the killproc function.

    This function will search for the corresponding PID file of the process (by default, a PID file that has the same name as the executable or a specified PID file) and uses the corresponding PID content to terminate the daemon. As a fallback, if no PID file exists, it will scan the entire process table and kills the process with the same name.
  • We can determine the status of a process (e.g. whether it is running or not), with the statusproc function that also consults the corresponding PID file or scans the process table if needed.

Most common system software has the ability to daemonize, such as nginx, the Apache HTTP server, MySQL and PostgreSQL. Unfortunately, application services (such as microservices) that are implemented with technologies such as Python, Node.js or Java Spring Boot do not have this ability out of the box.

Fortunately, we can use an external utility, such as libslack's daemon command, to let these foreground-only processes daemonize. Although it is possible to conveniently daemonize external processes, this functionality is not part of the LSB standard.

For example, the following command starts the web application front-end process: it automatically daemonizes the foreground process (a simple Node.js web application) and creates a PID file, so that it can be controlled by the sysvinit utility functions:

$ daemon -U -i /home/sander/webapp/app.js

In addition to manually starting and stopping sysvinit scripts, sysvinit scripts are also typically started on startup and stopped on shutdown, or when a user switches between runlevels. These processes are controlled by symlinks that reside in an rc.d directory that have specific prefixes:

/etc/
  init.d/
    webapp
    nginx
  rc0.d/
    K98nginx -> ../init.d/nginx
    K99webapp -> ../init.d/webapp
  rc1.d/
    K98nginx -> ../init.d/nginx
    K99webapp -> ../init.d/webapp
  rc2.d/
    K98nginx -> ../init.d/nginx
    K99webapp -> ../init.d/webapp
  rc3.d/
    S00webapp -> ../init.d/webapp
    S01nginx -> ../init.d/nginx
  rc4.d/
    S00webapp -> ../init.d/webapp
    S01nginx -> ../init.d/nginx
  rc5.d/
    S00webapp -> ../init.d/webapp
    S01nginx -> ../init.d/nginx
  rc6.d/
    K98nginx -> ../init.d/nginx
    K99webapp -> ../init.d/webapp

In the above directory listing, every rc?.d directory contains symlinks to scripts in the init.d directory.

The first character of each symlink file indicates whether an init.d script should be started (S) or stopped (K). The two numeric digits that follow indicate the order in which the scripts need to be started and stopped.

Each runlevel has a specific purpose as described in the LSB standard. In the above situation, when we boot the system in multi-user mode on the console (run level 3), first our Node.js web application will be started, followed by nginx. On a reboot (when we enter runlevel 6) nginx and then the web application will be stopped. Basically, the stop order is the reverse of the start order.

To conveniently automate the deployment of sysvinit scripts, I have created a utility function called: createSystemVInitScript that makes it possible to generate sysvinit scripts with the Nix package manager.

We can create a Nix expression that generates a sysvinit script for nginx, such as:

{createSystemVInitScript, nginx}:

let
  configFile = ./nginx.conf;
  stateDir = "/var";
in
createSystemVInitScript {
  name = "nginx";
  description = "Nginx";
  activities = {
    start = ''
      mkdir -p ${stateDir}/logs
      log_info_msg "Starting Nginx..."
      loadproc ${nginx}/bin/nginx -c ${configFile} -p ${stateDir}
      evaluate_retval
    '';
    stop = ''
      log_info_msg "Stopping Nginx..."
      killproc ${nginx}/bin/nginx
      evaluate_retval
    '';
    reload = ''
      log_info_msg "Reloading Nginx..."
      killproc ${nginx}/bin/nginx -HUP
      evaluate_retval
    '';
    restart = ''
      $0 stop
      sleep 1
      $0 start
    '';
    status = "statusproc ${nginx}/bin/nginx";
  };
  runlevels = [ 3 4 5 ];
}

The above expression defines a function in which the function parameters refer to all dependencies that we need to construct the sysvinit script to manage a nginx server: createSystemVInitScript is the utility function that creates sysvinit scripts, nginx is the package that provides Nginx.

In the body, we invoke the createSystemVInitScript function to construct a sysvinit script:

  • The name corresponds to the name of the sysvinit script and the description to the description displayed in the metadata header.
  • The activities parameter refers to an attribute set in which every name refers to an activity and every value to the shell commands that need to be executed for this activity.

    We can use this parameter to specify the start, stop, reload, restart and status activities for nginx. The function abstraction will automatically configure the fallback activity that displays the usage to the end-user including the activities that the script supports.
  • The runlevels parameter indicates in which runlevels the init.d script should be started. For these runlevels, the function will create start symlinks. An implication is that for the runlevels that are not specified (0, 1, 2, and 6) the script will automatically create stop symlinks.

As explained earlier, sysvinit scripts follow a number of conventions. One such convention is that most activities typically display a description, then execute a command, and finally display the status of that command, such as:

log_info_msg "Starting Nginx..."
loadproc ${nginx}/bin/nginx -c ${configFile} -p ${stateDir}
evaluate_retval

The createSystemVInitScript function also has a notion of instructions, that are automatically translated into activities displaying task descriptions (derived from the general description) and the status. Using the instructions parameter allows us to simplify the above expression to:

{createSystemVInitScript, nginx}:

let
  configFile = ./nginx.conf;
  stateDir = "/var";
in
createSystemVInitScript {
  name = "nginx";
  description = "Nginx";
  instructions = {
    start = {
      activity = "Starting";
      instruction = ''
        mkdir -p ${stateDir}/logs
        loadproc ${nginx}/bin/nginx -c ${configFile} -p ${stateDir}
      '';
    };
    stop = {
      activity = "Stopping";
      instruction = "killproc ${nginx}/bin/nginx";
    };
    reload = {
      activity = "Reloading";
      instruction = "killproc ${nginx}/bin/nginx -HUP";
    };
  };
  activities = {
    status = "statusproc ${nginx}/bin/nginx";
  };
  runlevels = [ 3 4 5 ];
}

In the above expression, the start, stop and reload activities have been simplified by defining them as instructions allowing us to write less repetitive boilerplate code.

We can reduce the amount of boilerplate code even further -- the kind of activities that we need to implement for managing a process are typically mostly the same. When we want to manage a process, we typically want a start, stop, restart, status activity and, if applicable, a reload activity if a process knows how to handle the HUP signal.

Instead of specifying activities or instructions, it is also possible to specify which process we want to manage, and what kind of parameters the process should take:

{createSystemVInitScript, nginx}:

let
  configFile = ./nginx.conf;
  stateDir = "/var";
in
createSystemVInitScript {
  name = "nginx";
  description = "Nginx";
  initialize = ''
    mkdir -p ${stateDir}/logs
  '';
  process = "${nginx}/bin/nginx";
  args = [ "-c" configFile "-p" stateDir ];
  runlevels = [ 3 4 5 ];
}

From the process and args parameters, the createSystemVInitScript function automatically derives all relevant activities that we need to manage the process. It is also still possible to augment or override the generated activities by means of the instructions or activities parameters.

Besides processes that already have the ability to daemonize, it is also possible to automatically daemonize foreground processes with this function abstraction. This is particularly useful to generate a sysvinit script for the Node.js web application service, that lacks this ability:

{createSystemVInitScript}:

let
  webapp = (import ./webapp {}).package;
in
createSystemVInitScript {
  name = "webapp";
  process = "${webapp}/lib/node_modules/webapp/app.js";
  processIsDaemon = false;
  runlevels = [ 3 4 5 ];
  environment = {
    PORT = 5000;
  };
}

In the above Nix expression, we set the parameter: processIsDaemon to false (the default value is: true) to indicate that the process is not a daemon, but a foreground process. The createSystemVInitScript function will generate a start activity that invokes the daemon command to daemonize it.

Another interesting feature is that we can specify process dependency relationships. For example, an nginx server can act as a reverse proxy for the Node.js web application.

To reliably activate the entire system, we must make sure that the web application process is deployed before Nginx is deployed. If we activate the system in the opposite order, then the reverse proxy may redirect users to a non-existent web application, causing them to see 502 bad gateway errors.

We can use the dependencies parameter with a reference to a sysvinit script to indicate that this sysvinit script has a dependency. For example, we can revise the Nginx sysvinit script expression as follows:

{createSystemVInitScript, nginx, webapp}:

let
  configFile = ./nginx.conf;
  stateDir = "/var";
in
createSystemVInitScript {
  name = "nginx";
  description = "Nginx";
  initialize = ''
    mkdir -p ${stateDir}/logs
  '';
  process = "${nginx}/bin/nginx";
  args = [ "-c" configFile "-p" stateDir ];
  runlevels = [ 3 4 5 ];
  dependencies = [ webapp ];
}

In the above example, we pass the webapp sysvinit script as a dependency (through the dependencies parameter). Adding it as a dependency causes the generator to compute a start sequence number for the nginx script that is higher than that of the web app's sysvinit script, and a stop sequence number that is lower.

The different sequence numbers ensure that the webapp is started before nginx, and that nginx is stopped before the webapp.

Configuring managed processes


So far composing sysvinit scripts is still very similar to composing ordinary Nix packages. We can also extend the four Nix packaging conventions described in the introduction to create a process management discipline.

Similar to the convention in which every package is in a separate file, and defines a function in which the function parameters refers to all package dependencies, we can extend this convention for processes to also include relevant parameters to configure a service.

For example, we can write a Nix expression for the web application process as follows:

{createSystemVInitScript, port ? 5000}:

let
  webapp = (import /home/sander/webapp {}).package;
in
createSystemVInitScript {
  name = "webapp";
  process = "${webapp}/lib/node_modules/webapp/app.js";
  processIsDaemon = false;
  runlevels = [ 3 4 5 ];
  environment = {
    PORT = port;
  };
}

In the above expression, the port function parameter allows us to configure the TCP port where the web application listens to (and defaults to 5000).

We can also make the configuration of nginx configurable. For example, we can create a function abstraction that creates a configuration for nginx to let it act as a reverse proxy for the web application process shown earlier:

{createSystemVInitScript, stdenv, writeTextFile, nginx
, runtimeDir, stateDir, logDir, port ? 80, webapps ? []}:

let
  nginxStateDir = "${stateDir}/nginx";
in
import ./nginx.nix {
  inherit createSystemVInitScript nginx;
  stateDir = nginxStateDir;

  dependencies = map (webapp: webapp.pkg) webapps;

  configFile = writeTextFile {
    name = "nginx.conf";
    text = ''
      error_log ${nginxStateDir}/logs/error.log;
      pid ${runtimeDir}/nginx.pid;

      events {
        worker_connections 190000;
      }

      http {
        ${stdenv.lib.concatMapStrings (dependency: ''
          upstream webapp${toString dependency.port} {
            server localhost:${toString dependency.port};
          }
        '') webapps}

        ${stdenv.lib.concatMapStrings (dependency: ''
          server {
            listen ${toString port};
            server_name ${dependency.dnsName};

            location / {
              proxy_pass  http://webapp${toString dependency.port};
            }
          }
        '') webapps}
      }
    '';
  };
}

The above Nix expression's function header defines, in addition to the package dependencies, process configuration parameters that make it possible to configure the TCP port that Nginx listens to (port 80 by default) and the web applications it should forward requests to, based on their virtual host property.

In the body, these properties are used to generate an nginx.conf file that defines virtual hosts for each web application process. It forwards incoming requests to the appropriate web application instance. To connect to a web application instance, it uses the port number that the webapp instance configuration provides.

Similar to ordinary Nix expressions, Nix expressions for processes also need to be composed, by passing the appropriate function parameters. This can be done in a process composition expression that has the following structure:

{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
}:

let
  createSystemVInitScript = import ./create-sysvinit-script.nix {
    inherit (pkgs) stdenv writeTextFile daemon;
    inherit runtimeDir tmpDir;

    createCredentials = import ./create-credentials.nix {
      inherit (pkgs) stdenv;
    };

    initFunctions = import ./init-functions.nix {
      basePackages = [
        pkgs.coreutils
        pkgs.gnused
        pkgs.inetutils
        pkgs.gnugrep
        pkgs.sysvinit
      ];
      inherit (pkgs) stdenv;
      inherit runtimeDir;
    };
  };
in
rec {
  webapp = rec {
    port = 5000;
    dnsName = "webapp.local";

    pkg = import ./webapp.nix {
      inherit createSystemVInitScript port;
    };
  };

  nginxReverseProxy = rec {
    port = 80;

    pkg = import ./nginx-reverse-proxy.nix {
      inherit createSystemVInitScript;
      inherit stateDir logDir runtimeDir port;
      inherit (pkgs) stdenv writeTextFile nginx;
      webapps = [ webapp ];
    };
  };
}

The above expression (processes.nix) has the following structure:

  • The expression defines a function in which the function parameters allow common properties that apply to all processes to be configured: pkgs refers to the set of Nixpkgs that contains a big collection of free and open source packages, system refers to the system architecture to build packages for, and stateDir to the directory where processes should store their state (which is /var according to the LSB standard).

    The remaining parameters specify the runtime, log and temp directories, that are typically sub directories in the state directory.
  • In the let block, we compose our createSystemVInitScript function using the relevant state directory parameters, base packages and utility functions.
  • In the body, we construct an attribute set in which every name represents a process name and every value an attribute set that contains process properties.
  • One reserved process property of a process attribute set is the pkg property that refers to a package providing the sysvinit script.
  • The remaining process properties can be freely chosen and can be consumed by any process that has a dependency on it.

    For example, the nginxReverseProxy service uses the port and dnsName properties of the webapp process to configure nginx to forward requests to the provided DNS host name (webapp.local) to the web application process listening on the specified TCP port (5000).

Using the above composition Nix expression for processes and the following command-line instruction, we can build the sysvinit script for the web application process:

$ nix-build processes.nix -A webapp

We can start the web application process by using the generated sysvinit script, as follows:

$ ./result/etc/rc.d/init.d/webapp start

and stop it as follows:

$ ./result/etc/rc.d/init.d/webapp stop

We can also build the nginx reverse proxy in a similar way, but to properly activate it, we must make sure that the webapp process is activated first.

To reliably manage a set of processes and activate them in the right order, we can also generate a Nix profile that contains all init.d scripts and rc.d symlinks for stopping and starting:

{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
}:

let
  buildSystemVInitEnv = import ./build-sysvinit-env.nix {
    inherit (pkgs) buildEnv;
  };
in
buildSystemVInitEnv {
  processes = import ./processes.nix {
    inherit pkgs system;
  };
}

The above expression imports the process composition expression shown earlier, and invokes the buildSystemVInitEnv function to compose a Nix profile out of it. We can build this environment as follows:

$ nix-build profile.nix

Visually, the content of the Nix profile can be presented as follows:

[Diagram: the webapp and nginxReverseProxy processes, connected by an arrow]

In the above diagram the ovals denote processes and the arrows denote process dependency relationships. The arrow indicates that the webapp process needs to be activated before the nginxReverseProxy.

We can use the system's rc script to manage starting and stopping the processes when runlevels are switched. Runlevels 1-5 make it possible to start the processes on startup, and runlevels 0 and 6 to stop them on shutdown or reboot.

In addition to the system's rc script, we can also directly control the processes in a Nix profile -- I have created a utility script called: rcswitch that makes it possible to manually start all processes in a profile:

$ rcswitch ./result/etc/rc.d/rc3.d

We can also use the rcswitch command to do an upgrade from one set of processes to another:

$ rcswitch ./result/etc/rc.d/rc3.d ./oldresult/etc/rc.d/rc3.d

The above command checks which of the sysvinit scripts exist in both profiles and will only deactivate obsolete processes and activate new processes.

With the rcrunactivity command it is possible to run arbitrary activities on all processes in a profile. For example, the following command will show all statuses:

$ rcrunactivity status ./result/etc/rc.d/rc3.d

Deploying services as an unprivileged user


The process composition expression shown earlier is also a Nix function that takes various kinds of state properties as parameters.

By default, it has been configured in such a way that it facilitates production deployments. For example, it stores the state of all services in the global /var directory. Only the super user has the permissions to alter the structure of the global /var directory.

It is also possible to change these configuration parameters in such a way that it becomes possible for an unprivileged user to do process deployment.

For example, by changing the port number of the nginxReverseProxy process to a value higher than 1024, such as 8080 (an unprivileged user is not allowed to bind any services to ports below 1024), and changing the stateDir parameter to a directory in a user's home directory, we can deploy our web application service and Nginx reverse proxy as an unprivileged user:

$ nix-build processes.nix --argstr stateDir /home/sander/var \
  -A nginxReverseProxy

By overriding the stateDir parameter, the resulting Nginx process has been configured to store all state in /home/sander/var as opposed to the global /var that cannot be modified by an unprivileged user.

As an unprivileged user, I should be able to start the Nginx reverse proxy as follows:

$ ./result/etc/rc.d/init.d/nginx start

The above Nginx instance can be reached by opening: http://localhost:8080 in a web browser.
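
Since the reverse proxy dispatches requests based on the virtual host name, we can also test the forwarding from the command line by setting the Host header explicitly (assuming the webapp.local virtual host shown earlier):

$ curl -H 'Host: webapp.local' http://localhost:8080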

Creating multiple process instances


So far, we have only been deploying single instances of processes. For the Nginx reverse proxy example, it may also be desired to deploy multiple instances of the webapp process so that we can manage forwardings for multiple virtual domains.

We can adjust the Nix expression for the webapp to make it possible to create multiple process instances:

{createSystemVInitScript}:
{port, instanceSuffix ? ""}:

let
  webapp = (import ./webapp {}).package;
  instanceName = "webapp${instanceSuffix}";
in
createSystemVInitScript {
  name = instanceName;
  inherit instanceName;
  process = "${webapp}/lib/node_modules/webapp/app.js";
  processIsDaemon = false;
  runlevels = [ 3 4 5 ];
  environment = {
    PORT = port;
  };
}

The above Nix expression is a modified webapp build recipe that facilitates instantiation:

  • We have split the Nix expression into two nested functions. The first line (the outer function header) defines all dependencies and configurable properties that apply to all service instances.
  • The inner function header allows all instance-specific properties to be configured, so that multiple instances can co-exist. An example of such a property is the port parameter -- only one service can bind to a specific TCP port. Configuring each instance to bind to a different port allows two instances to co-exist.

    The instanceSuffix parameter makes it possible to give each webapp process a unique name (e.g. by providing a numeric value).

    From the package name and the instance suffix, a unique instanceName is composed. Propagating the instanceName to the createSystemVInitScript function instructs the daemon command to create a unique PID file for each daemon process (rather than a PID file named after the executable), so that multiple instances can be controlled independently. A small usage sketch follows this list.
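
For example, a minimal sketch (assuming that the expression above is stored in webapp.nix and that createSystemVInitScript is in scope) that invokes the inner function twice to obtain two co-existing instances:

let
  createWebapp = import ./webapp.nix {
    inherit createSystemVInitScript;
  };
in
{
  # Two instances that can co-exist: unique ports and instance names
  webapp1 = createWebapp { port = 5000; instanceSuffix = "1"; };
  webapp2 = createWebapp { port = 5001; instanceSuffix = "2"; };
}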

Although this may sound like a very uncommon use case, it is also possible to change the Nix expression for the Nginx reverse proxy to support multiple instances.

For system services, such as web servers and database servers, it is typically very uncommon to run multiple instances at the same time. Despite being uncommon, it is actually possible and quite useful for development and/or experimentation purposes:

{ createSystemVInitScript, stdenv, writeTextFile, nginx
, runtimeDir, stateDir, logDir}:

{port ? 80, webapps ? [], instanceSuffix ? ""}:

let
  instanceName = "nginx${instanceSuffix}";
  nginxStateDir = "${stateDir}/${instanceName}";
in
import ./nginx.nix {
  inherit createSystemVInitScript nginx instanceSuffix;
  stateDir = nginxStateDir;

  dependencies = map (webapp: webapp.pkg) webapps;

  configFile = writeTextFile {
    name = "nginx.conf";
    text = ''
      error_log ${nginxStateDir}/logs/error.log;
      pid ${runtimeDir}/${instanceName}.pid;

      events {
        worker_connections 190000;
      }

      http {
        ${stdenv.lib.concatMapStrings (dependency: ''
          upstream webapp${toString dependency.port} {
            server localhost:${toString dependency.port};
          }
        '') webapps}

        ${stdenv.lib.concatMapStrings (dependency: ''
          server {
            listen ${toString port};
            server_name ${dependency.dnsName};

            location / {
              proxy_pass  http://webapp${toString dependency.port};
            }
          }
        '') webapps}
      }
    '';
  };
}

The code fragment above shows a revised Nginx expression that supports instantiation:

  • Again, the Nix expression defines a nested function in which the outer function header refers to configuration properties for all services, whereas the inner function header refers to all conflicting parameters that need to be changed so that multiple instances can co-exist.
  • The port parameter makes the TCP port that Nginx binds to configurable. For two instances to co-exist, they both need to bind to unreserved ports.
  • As with the previous example, the instanceSuffix parameter makes it possible to compose a unique name for each Nginx instance. The instanceName variable that is composed from it is used to create and configure a dedicated state directory and a unique PID file that does not conflict with other Nginx instances.

This new convention of nested functions for instantiatable services means that we have to compose these expressions twice. First, we need to pass all parameters that configure properties that apply to all service instances. This can be done in a Nix expression that has the following structure:

{ pkgs
, system
, stateDir
, logDir
, runtimeDir
, tmpDir
}:

let
  createSystemVInitScript = import ./create-sysvinit-script.nix {
    inherit (pkgs) stdenv writeTextFile daemon;
    inherit runtimeDir tmpDir;

    createCredentials = import ./create-credentials.nix {
      inherit (pkgs) stdenv;
    };

    initFunctions = import ./init-functions.nix {
      basePackages = [
        pkgs.coreutils
        pkgs.gnused
        pkgs.inetutils
        pkgs.gnugrep
        pkgs.sysvinit
      ];
      inherit (pkgs) stdenv;
      inherit runtimeDir;
    };
  };
in
{
  webapp = import ./webapp.nix {
    inherit createSystemVInitScript;
  };

  nginxReverseProxy = import ./nginx-reverse-proxy.nix {
    inherit createSystemVInitScript stateDir logDir runtimeDir;
    inherit (pkgs) stdenv writeTextFile nginx;
  };
}

The above Nix expression is something we could call a constructors expression (constructors.nix) that returns an attribute set in which each member refers to a function that allows us to compose a specific process instance.

By using the constructors expression shown above, we can create a processes composition expression that works with multiple instances:

{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/home/sbu"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
}:

let
  constructors = import ./constructors.nix {
    inherit pkgs system stateDir runtimeDir logDir tmpDir;
  };
in
rec {
  webapp1 = rec {
    port = 5000;
    dnsName = "webapp1.local";

    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "1";
    };
  };

  webapp2 = rec {
    port = 5001;
    dnsName = "webapp2.local";

    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "2";
    };
  };

  webapp3 = rec {
    port = 5002;
    dnsName = "webapp3.local";

    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "3";
    };
  };

  webapp4 = rec {
    port = 5003;
    dnsName = "webapp4.local";

    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "4";
    };
  };

  nginxReverseProxy = rec {
    port = 8080;

    pkg = constructors.nginxReverseProxy {
      webapps = [ webapp1 webapp2 webapp3 webapp4 ];
      inherit port;
    };
  };

  webapp5 = rec {
    port = 6002;
    dnsName = "webapp5.local";

    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "5";
    };
  };

  webapp6 = rec {
    port = 6003;
    dnsName = "webapp6.local";

    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "6";
    };
  };

  nginxReverseProxy2 = rec {
    port = 8081;

    pkg = constructors.nginxReverseProxy {
      webapps = [ webapp5 webapp6 ];
      inherit port;
      instanceSuffix = "2";
    };
  };
}

In the above expression, we import the constructors expression shown earlier. In the body, we construct multiple instances of these processes by using the constructor functions:

  • We compose six web application instances (webapp1, webapp2, ..., webapp6), each of them listening on a unique TCP port.
  • We compose two Nginx instances (nginxReverseProxy, nginxReverseProxy2). The first instance listens on TCP port 8080 and redirects the user to any of the first four web application processes, based on the virtual host name. The other Nginx instance listens on TCP port 8081, redirecting the user to the remaining web apps based on the virtual host name.

We can represent the above composition expression visually, as follows:


As with the previous examples, we can deploy each process instance individually:

$ nix-build processes.nix -A webapp3
$ ./result/etc/rc.d/init.d/webapp3 start

Or the whole set as a Nix profile:

$ nix-build profile.nix
$ rcswitch ./result/etc/rc.d/rc3.d

Again, the rcswitch command will make sure that all processes are activated in the right order. This means that the webapp processes are activated first, followed by the Nginx reverse proxies.
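
The activation order is encoded in the sequence numbers of the rc3.d start symlinks. For example, listing the profile's rc3.d directory could yield something like this (the sequence numbers shown are hypothetical):

$ ls ./result/etc/rc.d/rc3.d
S00webapp1  S00webapp2  S00webapp3  S00webapp4  S00webapp5  S00webapp6
S01nginx    S01nginx2

Scripts with lower sequence numbers are activated first, so all webapp processes are running before either Nginx instance is started.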

Managing user accounts/state with Dysnomia


Most of the deployment of the processes can be automated in a stateless way -- Nix can deploy the executable as a Nix package and the sysvinit script can manage the lifecycle.

There is another concern that we may also want to address. For security and safety reasons, it is typically not recommended to run processes, such as essential system services, as the root user.

In order to run a process as an unprivileged user, an unprivileged group and user account must be created first by some means. Furthermore, when undeploying a process, we may also want to remove the dedicated user and group.

User account management is a feature that the Nix package manager does not support -- Nix only works with files stored in the Nix store and cannot/will not (by design) change any files on the host system, such as /etc/passwd where the user accounts are stored.

I have created a deployment tool for state management (Dysnomia) that can be used for this purpose. It facilitates a plugin system that can manage deployment activities for components that Nix does not support: activating, deactivating, taking snapshots, restoring snapshots etc.

I have created a Dysnomia plugin called: sysvinit-script that can activate or deactivate a process by invoking a sysvinit script. It can also create or discard users and groups from a declarative configuration file that is included with a sysvinit script.
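
For example (a hypothetical invocation, assuming that ./result refers to a built sysvinit script package and using Dysnomia's generic command-line interface), activating a process with this plugin could look as follows:

$ dysnomia --type sysvinit-script --operation activate \
  --component ./result --environment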

We can revise a process Nix expression to start a process as an unprivileged user:

{createSystemVInitScript}:
{port, instanceSuffix ? ""}:

let
  webapp = (import ./webapp {}).package;
  instanceName = "webapp${instanceSuffix}";
in
createSystemVInitScript {
  name = instanceName;
  inherit instanceName;
  process = "${webapp}/lib/node_modules/webapp/app.js";
  processIsDaemon = false;
  runlevels = [ 3 4 5 ];
  environment = {
    PORT = port;
  };
  user = instanceName;

  credentials = {
    groups = {
      "${instanceName}" = {};
    };
    users = {
      "${instanceName}" = {
        group = instanceName;
        description = "Webapp";
      };
    };
  };
}

The above Nix expression is a revised webapp Nix expression that facilitates user switching:

  • The user parameter specifies that we want to run the process as an unprivileged user. Because this process can also be instantiated, we have to make sure that it gets a unique name. To facilitate that, we create a user with the same username as the instance name.
  • The credentials parameter refers to a specification that instructs the sysvinit-script Dysnomia plugin to create an unprivileged user and group on activation, and discard them on deactivation.

For production purposes (e.g. when we deploy processes as the root user), switching to unprivileged users is useful, but for development purposes, such as running a set of processes as an unprivileged user, we cannot switch users because we may not have the permissions to do so.

For convenience purposes, it is also possible to globally disable user switching, which we can do as follows:

{ pkgs
, stateDir
, logDir
, runtimeDir
, tmpDir
, forceDisableUserChange
}:

let
  createSystemVInitScript = import ./create-sysvinit-script.nix {
    inherit (pkgs) stdenv writeTextFile daemon;
    inherit runtimeDir tmpDir forceDisableUserChange;

    createCredentials = import ./create-credentials.nix {
      inherit (pkgs) stdenv;
    };

    initFunctions = import ./init-functions.nix {
      basePackages = [
        pkgs.coreutils
        pkgs.gnused
        pkgs.inetutils
        pkgs.gnugrep
        pkgs.sysvinit
      ];
      inherit (pkgs) stdenv;
      inherit runtimeDir;
    };
  };
in
{
  ...
}

In the above example, the forceDisableUserChange parameter can be used to globally disable user switching for all sysvinit scripts composed in the expression. It invokes a feature of the createSystemVInitScript to ignore any user settings that might have been propagated to it.

With the following command we can deploy a process that does not switch users, despite having user settings configured in the process Nix expressions:

$ nix-build processes.nix --arg forceDisableUserChange true
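
This parameter can also be combined with the state directory override shown earlier, making a fully unprivileged test deployment possible:

$ nix-build processes.nix --argstr stateDir /home/sander/var \
  --arg forceDisableUserChange true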

Distributed process deployment with Disnix


As explained earlier, I have adopted four common Nix package conventions and extended them to suit the needs of process management.

This is not the only solution I have implemented that builds on these four conventions -- the other is Disnix, which extends Nix's packaging principles to (distributed) service-oriented systems.

Disnix extends Nix expressions for ordinary packages with another category of dependencies: inter-dependencies that model dependencies on services that may have been deployed to remote machines in a network and require a network connection to work.

In Disnix, a service expression is a nested function in which the outer function header specifies all intra-dependencies (local dependencies, such as build tools and libraries), and the inner function header refers to inter-dependencies.

It is also possible to combine the concepts of process deployment described in this blog post with the service-oriented system concepts of Disnix, such as inter-dependencies -- the example with Nginx reverse proxies and web application processes can be extended to work in a network of machines.

Besides deploying a set of processes (that may have dependencies on each other) to a single machine, it is also possible to deploy the web application processes to different machines in the network than the machine where the Nginx reverse proxy is deployed to.

We can configure the reverse proxy in such a way that it forwards requests to the machines where the web application processes have been deployed:

{ createSystemVInitScript, stdenv, writeTextFile, nginx
, runtimeDir, stateDir, logDir
}:

{port ? 80, instanceSuffix ? ""}:

interDeps:

let
  instanceName = "nginx${instanceSuffix}";
  nginxStateDir = "${stateDir}/${instanceName}";
in
import ./nginx.nix {
  inherit createSystemVInitScript nginx instanceSuffix;
  stateDir = nginxStateDir;

  dependencies = map (dependencyName:
    let
      dependency = builtins.getAttr dependencyName interDeps;
    in
    dependency.pkg
  ) (builtins.attrNames interDeps);

  configFile = writeTextFile {
    name = "nginx.conf";
    text = ''
      error_log ${nginxStateDir}/logs/error.log;
      pid ${runtimeDir}/${instanceName}.pid;

      events {
        worker_connections 190000;
      }

      http {
        ${stdenv.lib.concatMapStrings (dependencyName:
          let
            dependency = builtins.getAttr dependencyName interDeps;
          in
          ''
            upstream webapp${toString dependency.port} {
              server ${dependency.target.properties.hostname}:${toString dependency.port};
            }
          '') (builtins.attrNames interDeps)}

        ${stdenv.lib.concatMapStrings (dependencyName:
          let
            dependency = builtins.getAttr dependencyName interDeps;
          in
          ''
            server {
              listen ${toString port};
              server_name ${dependency.dnsName};

              location / {
                proxy_pass  http://webapp${toString dependency.port};
              }
            }
          '') (builtins.attrNames interDeps)}
      }
    '';
  };
}

The above Nix expression is a revised Nginx configuration that also works with inter-dependencies:

  • The above Nix expression defines three nested functions. The purpose of the outermost function (the first line) is to configure all local dependencies that are common to all process instances. The middle function defines all process instance parameters that are potentially conflicting and need to be configured with unique values, so that multiple instances can co-exist. The third (innermost) function refers to the inter-dependencies of this process: services that may reside on a different machine in the network and need to be reached over a network connection.
  • The inter-dependency function header (interDeps:) takes an arbitrary number of dependencies. These inter-dependencies refer to all web application process instances that the Nginx reverse proxy should redirect to.
  • In the body, we generate an nginx.conf that uses the inter-dependencies to set up the forwardings.

    Compared to the previous Nginx reverse proxy example, it uses the dependency.target.properties.hostname property, which refers to the hostname of the machine where the web application process is deployed, instead of forwarding to localhost. This makes it possible to connect to a web application process that may have been deployed to another machine.
  • The inter-dependencies are also passed to the dependencies function parameter of the Nginx function. This will ensure that if Nginx and a web application process are distributed to the same machine by Disnix, they will also get activated in the right order by the system's rc script on startup.

As with the previous examples, we need to compose the above Disnix expression multiple times. The composition of the constructors can be done in the constructors expression (as shown in the previous examples).

The processes' instance properties and inter-dependencies can be configured in the Disnix services model, which shares many similarities with the process composition expression shown earlier. As a matter of fact, a Disnix services model is a superset of it:

{ pkgs, distribution, invDistribution, system
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? true
}:

let
  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir;
    inherit forceDisableUserChange;
  };
in
rec {
  webapp = rec {
    name = "webapp";
    port = 5000;
    dnsName = "webapp.local";
    pkg = constructors.webapp {
      inherit port;
    };
    type = "sysvinit-script";
  };

  nginxReverseProxy = rec {
    name = "nginxReverseProxy";
    port = 8080;
    pkg = constructors.nginxReverseProxy {
      inherit port;
    };
    dependsOn = {
      inherit webapp;
    };
    type = "sysvinit-script";
  };
}

The above Disnix services model defines two services (representing processes) that have an inter-dependency on each other, as specified with the dependsOn parameter property of each service.

The sysvinit-script type property instructs Disnix to deploy the services as processes managed by a sysvinit script. In a Disnix context, services have no specific form or meaning and can basically represent anything. The type property is used to tell Disnix what kind of service we are dealing with.

To properly configure remote dependencies, we also need to know the target machines that we can deploy to and what their properties are. This is what we can use an infrastructure model for.

For example, a simple infrastructure model of two machines could be:

{
  test1.properties.hostname = "test1";
  test2.properties.hostname = "test2";
}

We must also tell Disnix to which target machines we want to distribute the services. This can be done in a distribution model:

{infrastructure}:

{
  webapp = [ infrastructure.test1 ];
  nginxReverseProxy = [ infrastructure.test2 ];
}

In the above distribution model we distribute the webapp process to the first target machine and the nginxReverseProxy to the second machine. Because both services are deployed to different machines in the network, the nginxReverseProxy uses a network link to forward incoming requests to the web application.
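
With this distribution, the nginx.conf that is generated for the nginxReverseProxy on test2 contains an upstream section that points to the remote machine. A sketch, derived from the configuration template shown earlier:

upstream webapp5000 {
  server test1:5000;
}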

By running the following command-line instruction:

$ disnix-env -s services.nix -i infrastructure.nix -d distribution.nix

Disnix will deploy the processes to the target machines defined in the distribution model.

The result is the following deployment architecture:


As may be noticed by looking at the above diagram, the process dependency manifests itself as a network link managed as an inter-dependency by Disnix.

Conclusion


In this blog post, I have described a Nix-based functional organization for managing processes based on four simple Nix packaging conventions. This approach offers the following benefits:

  • Integration with many process managers that manage the lifecycle of a process (in this particular blog post: using sysvinit scripts).
  • The ability to relocate state to other locations, which is useful to facilitate unprivileged user deployments.
  • The ability to create multiple instances of processes, by making conflicting properties configurable.
  • Disabling user switching, which is useful to facilitate unprivileged user deployments.
  • It can be used on any Linux system that has the Nix package manager installed. It can be used on NixOS, but NixOS is not a requirement.

Related work


Integrating process management with Nix package deployment is not a new subject, nor is this the first time it has been attempted.

Many years ago, there was the "trace" Subversion repository (that was named after the research project TraCE: Transparent Configuration Environments funded by NWO/Jacquard), the repository in which all Nix-related development was done before the transition was made to GitHub (before 2012).

In the trace repository, there was also a services project that could be used to generate sysvinit-like scripts that could be used on any Linux distribution, and several non-Linux systems as well, such as FreeBSD.

Eelco Dolstra's PhD thesis Chapter 9 describes a distributed deployment prototype that extends the init script approach to networks of machines. The prototype facilitates the distribution of init scripts to remote machines and heterogeneous operating systems deployment -- an init script can be built for multiple operating systems, such as Linux and FreeBSD.

Although the prototype shares some concepts with Disnix and the process management support described in this blog post, it also lacks many features -- it has no notion of process dependencies or inter-dependencies, no means to separate services/processes from infrastructure, and no way to specify distribution mappings between processes and target machines, including the deployment of redundant instances.

Originally, NixOS used to work with the generated scripts from the services sub project in the trace repository, but it quite quickly adopted Upstart as its init system. Gradually, the init scripts and Upstart jobs got integrated, and eventually the init scripts were replaced by Upstart jobs completely. As a result, it was no longer possible to run services independently of NixOS.

NixOS is a Linux distribution whose static aspects are fully managed by Nix, including user packages, configuration files, the Linux kernel, and kernel modules. NixOS machine configurations are deployed from a single declarative specification.

Although NixOS is an extension of Nix deployment principles to machine-level deployment, a major conceptual difference between NixOS and the Nix packages repository is that NixOS generates a big data structure made out of all potential configuration options that NixOS provides. It uses this (very big) generated data structure as an input for an activation script that will initialize all dynamic system parts, such as populating the state directories (e.g. /var) and loading systemd jobs.

In early incarnations of NixOS, the organization of the repository was quite monolithic -- there was one NixOS file that defined the configuration options for all possible system configuration aspects, one file that defined all the system user accounts, and one file that defined all global configuration files in /etc. When it was desired to add a new system service, all these global configuration files needed to be modified.

Some time later (mid 2009), the NixOS module system was introduced that makes it possible to isolate all related configuration aspects of, for example, a system service into a separate module. Despite the fact that configuration aspects are isolated, the NixOS module system has the ability (through a concept called fixed points) to refer to properties of the entire configuration. The NixOS module system merges all configuration aspects of all modules into a single configuration data structure.
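
For example, the following minimal module sketch (with a hypothetical service name) illustrates this: it declares its own option and, through the configuration fixed point (config), contributes to an option declared by another module, in this case the firewall:

{ config, lib, ... }:

{
  # An option declared by this module
  options.services.mywebapp.enable = lib.mkEnableOption "my web app";

  # A contribution to the merged configuration: when the service is
  # enabled, open its TCP port in the firewall module's configuration
  config = lib.mkIf config.services.mywebapp.enable {
    networking.firewall.allowedTCPPorts = [ 5000 ];
  };
}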

The NixOS module system is quite powerful. In many ways, it is much more powerful than the process management approach described in this blog post. The NixOS module system allows you to refer, override and adjust any system configuration aspect in any module.

For example, a system service, such as the OpenSSH server, can automatically configure the firewall module in such a way that it will open the SSH port (port 22). With the functional approach described in this blog post, everything has to be made explicit and must be propagated through function arguments. This is probably more memory efficient, but a lot less flexible, and more tedious to write.

There are also certain things that NixOS and the NixOS module system cannot do. For example, with NixOS it is not possible to create multiple instances of system services, something that the process management conventions described in this blog post do support.

NixOS has another drawback -- evaluating system configurations requires all possible NixOS configuration options to be evaluated. There are actually quite a few of them.

As a result, evaluating a NixOS configuration is quite slow and memory consuming. For single systems, this is typically not a big problem, but for networked NixOS/NixOps configurations, this may be a problem -- for example, I have an old laptop with 4 GiB of RAM that can no longer deploy a test network of three VirtualBox machines using the latest stable NixOS release (19.09), because the Nix evaluator runs out of memory.

Furthermore, NixOS system services can only be used when you install NixOS as your system's software distribution. It is currently not possible to install Nix on a conventional Linux distribution and use NixOS' system services (systemd services) independently of the entire operating system.

The lack of being able to deploy system services independently is not a limitation of the NixOS module system itself -- there is also an external project called nix-darwin that uses the NixOS module system to generate launchd services that run on top of macOS, an operating system that is not managed by the Nix package manager.

The idea of having a separate function header for creating instances of processes is also not entirely new -- a couple of years ago, I revised the internal deployment model of Disnix to support multiple container instances.

In a Disnix-context, containers can represent anything that can host multiple service instances, such as a process manager, application container, or database management system. I was already using the convention to have a separate function header that makes it possible to create multiple instances of services. In this blog post, I have extended this formalism specifically for managing processes.

Discussion


In this blog post, I have picked sysvinit scripts for process management. The reason why I have picked an old-fashioned solution is not that I consider this to be the best process management facility, or that systemd, the init system that NixOS uses, is a bad solution.

My first reason to choose sysvinit scripts is that they are more universally supported than systemd.

The second reason is that I want to emphasize the value that a functional organization can provide, independent of the process management solution.

Using sysvinit scripts for managing processes has all kinds of drawbacks, and IMO there is a legitimate reason why alternatives exist, such as systemd (but also other solutions).

For example, controlling daemonized processes is difficult and fragile -- the convention is that daemons should create PID files, but there is no hard guarantee that daemons will comply and that nothing will go wrong. As a result, a daemonized process may escape the control of the process manager. systemd, for example, puts all processes that it needs to control in a cgroup, so that they cannot escape systemd's control.

Furthermore, you may also want to use more advanced features of the Linux kernel, such as namespaces and cgroups, to prevent processes from interfering with other processes on the system and to restrict the system resources that they are allowed to consume. Namespaces and cgroups are first-class features in systemd.

If you do not like sysvinit scripts: the functional organization described in this blog post is not specifically designed for sysvinit -- it is actually process manager agnostic. I have also implemented a function called: createSystemdService that makes it possible to construct systemd services.

The following Nix expression composes a systemd service for the web application process, shown earlier:

{stdenv, createSystemdService}:
{port, instanceSuffix ? ""}:

let
  webapp = (import ./webapp {}).package;
  instanceName = "webapp${instanceSuffix}";
in
createSystemdService {
  name = instanceName;

  environment = {
    PORT = port;
  };

  Unit = {
    Description = "Example web application";
    Documentation = "http://example.com";
  };

  Service = {
    ExecStart = "${webapp}/lib/node_modules/webapp/app.js";
  };
}

I also tried supervisord -- we can write the following Nix expression to compose a supervisord program configuration file for the web application process:

{stdenv, createSupervisordProgram}:
{port, instanceSuffix ? ""}:

let
  webapp = (import ./webapp {}).package;
  instanceName = "webapp${instanceSuffix}";
in
createSupervisordProgram {
  name = instanceName;

  command = "${webapp}/lib/node_modules/webapp/app.js";
  environment = {
    PORT = port;
  };
}
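
Under the hood, such an expression would generate a supervisord program section that roughly looks as follows (a sketch: the store path is abbreviated and the exact output may differ):

[program:webapp]
command=/nix/store/...-webapp/lib/node_modules/webapp/app.js
environment=PORT=5000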

Switching process managers retains our ability to benefit from the facilities that the functional configuration framework provides -- we can use it to manage process dependencies, configure state directories, disable user management, and, when we use Disnix, manage inter-dependencies and combine processes with services that are not processes.

Despite the fact that sysvinit scripts are primitive, there are also a number of advantages that I see over more "modern" alternatives, such as systemd:

  • systemd and supervisord require the presence of a daemon that manages processes (i.e. the systemd and supervisord daemons). sysvinit scripts are self-contained from a process management perspective -- the Nix package manager provides the package dependencies that the sysvinit scripts need (e.g. basic shell utilities, sysvinit commands), but other than that, they do not require anything else.
  • We can also easily deploy sysvinit scripts to any Linux distribution that has the Nix package manager installed. There are no additional requirements. systemd services, for example, require the presence of the systemd daemon. Furthermore, we would also have to interfere with the host system's systemd daemon, which may also be used to manage essential system services.
  • We can also easily use sysvinit scripts to deploy processes as an unprivileged user to a machine that has a single-user Nix installation -- the sysvinit script infrastructure does not require any tools or daemons that require super user privileges.

Acknowledgements


I have borrowed the init-functions script from the LFS Bootscripts package of the Linux from Scratch project to get an implementation of the utility functions that the LSB standard describes.

Availability and future work


The functionality described in this blog post is still a work in progress and only a first milestone in a bigger objective.

The latest implementation of the process management framework can be found in my experimental Nix process management repository. The sysvinit-script Dysnomia plugin resides in an experimental branch of the Dysnomia repository.

In the next blog post, I will introduce another interesting concept that we can integrate into the functional process management framework.