Sander van der Burg's blog: September 2020

As described in some of my recent blog posts, one of the more advanced features of Disnix as well as the experimental Nix process management framework is to deploy multiple instances of the same service to the same machine.

To make running multiple service instances on the same machine possible, these tools rely on conflict avoidance rather than isolation (which is typically used for containers). To allow multiple service instances to co-exist on the same machine, they need to be configured in such a way that they do not allocate any conflicting resources.

Although it is doable to configure multiple instances by hand for small systems, this process gets tedious and time-consuming for larger and more technologically diverse systems.

Numeric IDs, such as TCP/UDP port numbers, user IDs (UIDs), and group IDs (GIDs), are one particular kind of conflicting resource that could be configured automatically.

In this blog post, I will describe how multiple service instances are configured (in Disnix and the process management framework) and how we can automatically assign unique numeric IDs to them.

Configuring multiple service instances

To facilitate conflict avoidance in Disnix and the Nix process management framework, services are configured as follows:

{createManagedProcess, tmpDir}:
{port, instanceSuffix ? "", instanceName ? "webapp${instanceSuffix}"}:

let
  webapp = import ../../webapp;
in
createManagedProcess {
  name = instanceName;
  description = "Simple web application";
  inherit instanceName;

  # This expression can both run in foreground or daemon mode.
  # The process manager can pick which mode it prefers.
  process = "${webapp}/bin/webapp";
  daemonArgs = [ "-D" ];

  environment = {
    PORT = port;
    PID_FILE = "${tmpDir}/${instanceName}.pid";
  };
  user = instanceName;
  credentials = {
    groups = {
      "${instanceName}" = {};
    };
    users = {
      "${instanceName}" = {
        group = instanceName;
        description = "Webapp";
      };
    };
  };

  overrides = {
    sysvinit = {
      runlevels = [ 3 4 5 ];
    };
  };
}

The Nix expression shown above is a nested function that describes how to deploy a simple self-contained REST web application with an embedded HTTP server:

The outer function header (first line) specifies all common build-time dependencies and configuration properties that the service needs:
- createManagedProcess is a function that can be used to define process manager agnostic configurations that can be translated to configuration files for a variety of process managers (e.g. systemd, launchd, supervisord etc.).
- tmpDir refers to the temp directory in which temp files are stored.
The inner function header (second line) specifies all instance parameters -- these are the parameters that must be configured in such a way that conflicts with other process instances are avoided:
- The instanceName parameter (that can be derived from the instanceSuffix) is a value used by some of the process management backends (e.g. the ones that invoke the daemon command) to derive a unique PID file for the process. When running multiple instances of the same process, each of them requires a unique PID file name.
- The port parameter specifies the TCP port to which the service binds. Binding the service to a port that is already taken by another service causes the deployment of this service to fail.
In the function body, we invoke the createManagedProcess function to construct configuration files for all supported process manager backends to run the webapp process:
- As explained earlier, the instanceName is used to configure the daemon executable in such a way that it allocates a unique PID file.
- The process parameter specifies which executable we need to run, either as a foreground process or as a daemon.
- The daemonArgs parameter specifies which command-line parameters need to be propagated to the executable when the process should daemonize on its own.
- The environment parameter specifies all environment variables. The webapp service uses these variables for runtime property configuration.
- The user parameter is used to specify that the process should run as an unprivileged user. The credentials parameter is used to configure the creation of the user account and corresponding user group.
- The overrides parameter is used to override the process manager-agnostic parameters with process manager-specific parameters. For the sysvinit backend, we configure the runlevels in which the service should run.

Although the convention shown above makes it possible to avoid conflicts (assuming that all potential conflicts have been identified and exposed as function parameters), these parameters are typically configured manually:

{ pkgs, system
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager ? "sysvinit"
, ...
}:

let
  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir forceDisableUserChange processManager;
  };

  processType = import ../../nixproc/derive-dysnomia-process-type.nix {
    inherit processManager;
  };
in
rec {
  webapp1 = rec {
    name = "webapp1";
    port = 5000;
    dnsName = "webapp.local";
    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "1";
    };
    type = processType;
  };

  webapp2 = rec {
    name = "webapp2";
    port = 5001;
    dnsName = "webapp.local";
    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "2";
    };
    type = processType;
  };
}

The above Nix expression is both a valid Disnix services model and a valid processes model. It composes two web application process instances that can run concurrently on the same machine by invoking the nested constructor function shown in the previous example:

- Each webapp instance has its own unique instance name, by specifying a unique numeric instanceSuffix that gets appended to the service name.
- Every webapp instance binds to a unique TCP port (5000 and 5001) that should not conflict with system services or other process instances.

Previous work: assigning port numbers

Although configuring two process instances is still manageable, the configuration process becomes more tedious and time-consuming as the number and the kinds of processes (each having their own potential conflicts) grow.

Five years ago, I already identified a resource that could be automatically assigned to services: port numbers.

I have created a very simple port assigner tool that allows you to specify a global ports pool and a target-specific pool. The former is used to assign globally unique port numbers to all services in the network, whereas the latter assigns port numbers that are unique to the target machine to which the service is deployed (this is to cope with the scarcity of port numbers).

Although the tool is quite useful for systems that do not consist of too many different kinds of components, I ran into a number of limitations when I wanted to manage a more diverse set of services:

- Port numbers are not the only numeric IDs that services may require. When deploying systems that consist of self-contained executables, you typically want to run them as unprivileged users for security reasons. User accounts on most UNIX-like systems require unique user IDs, and the corresponding users' groups require unique group IDs.
- We typically want to manage multiple resource pools, for a variety of reasons. For example, when we have a number of HTTP server instances and a number of database instances, then we may want to pick port numbers in the 8000-9000 range for the HTTP servers, whereas for the database servers we want to use a different pool, such as 5000-6000.

Assigning unique numeric IDs

To address these shortcomings, I have developed a replacement tool that acts as a generic numeric ID assigner.

This new ID assigner tool works with ID resource configuration files, such as:

rec {
  ports = {
    min = 5000;
    max = 6000;
    scope = "global";
  };

  uids = {
    min = 2000;
    max = 3000;
    scope = "global";
  };

  gids = uids;
}

The above ID resource configuration file (idresources.nix) defines three resource pools: ports is a resource that represents port numbers to be assigned to the webapp processes, uids refers to user IDs and gids to group IDs. The group IDs' resource configuration is identical to the users' IDs configuration.

Each resource attribute supports the following configuration properties:

- The min value specifies the minimum ID to hand out, max the maximum ID.
- The scope value specifies the scope of the resource pool. global (which is the default option) means that the IDs assigned from this resource pool to services are globally unique for the entire system.

The machine scope can be used to assign IDs that are unique for the machine to which a service is distributed. When the latter option is used, services that are distributed to two separate machines may have the same ID.
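As a hedged illustration, an ID resources configuration that mixes both scopes could look as follows. I am assuming here that the machine scope is selected with the string "machine", by analogy with "global"; the pool names and ranges are simply taken from the example above:

```nix
rec {
  # Port numbers only need to be unique per target machine
  # (assumed scope value: "machine"), so two services that are
  # distributed to different machines may both receive port 5000.
  ports = {
    min = 5000;
    max = 6000;
    scope = "machine";
  };

  # User IDs remain unique for the entire system
  uids = {
    min = 2000;
    max = 3000;
    scope = "global";
  };

  # Group IDs reuse the same settings as the user IDs
  gids = uids;
}
```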

We can adjust the services/processes model in such a way that every service will use dynamically assigned IDs and that each service specifies for which resources it requires a unique ID:

{ pkgs, system
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager ? "sysvinit"
, ...
}:

let
  ids = if builtins.pathExists ./ids.nix then (import ./ids.nix).ids else {};

  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir forceDisableUserChange processManager ids;
  };

  processType = import ../../nixproc/derive-dysnomia-process-type.nix {
    inherit processManager;
  };
in
rec {
  webapp1 = rec {
    name = "webapp1";
    port = ids.ports.webapp1 or 0;
    dnsName = "webapp.local";
    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "1";
    };
    type = processType;
    requiresUniqueIdsFor = [ "ports" "uids" "gids" ];
  };

  webapp2 = rec {
    name = "webapp2";
    port = ids.ports.webapp2 or 0;
    dnsName = "webapp.local";
    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "2";
    };
    type = processType;
    requiresUniqueIdsFor = [ "ports" "uids" "gids" ];
  };
}

In the above services/processes model, we have made the following changes:

- At the beginning of the expression, we import the dynamically generated ids.nix expression that provides ID assignments for each resource. If the ids.nix file does not exist, we fall back to an empty attribute set. We implement this construction (in which the absence of ids.nix can be tolerated) to allow the ID assigner to bootstrap the ID assignment process.
- Every hardcoded port attribute of every service is replaced by a reference to the ids attribute set that is dynamically generated by the ID assigner tool. To allow the ID assigner to open the services model in the first run, we provide a fallback port value of 0.
- Every service specifies for which resources it requires a unique ID through the requiresUniqueIdsFor attribute. In the above example, both service instances require a unique port number, a unique user ID for their user, and a unique group ID for their group.
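The bootstrapping behaviour relies on the or keyword of the Nix expression language, which supplies a default when any attribute in a selection path is missing. The following stand-alone expression is only a small sketch of that mechanism; the names idsBefore, idsAfter, firstRun and laterRun are made up for illustration and do not appear in the real models:

```nix
let
  # What the model sees when ids.nix does not exist yet
  idsBefore = {};

  # What the model sees after the ID assigner has generated ids.nix
  idsAfter = {
    ports = { webapp1 = 5000; };
  };
in
{
  # The whole path ports.webapp1 is missing, so the fallback 0 is used
  firstRun = idsBefore.ports.webapp1 or 0;

  # The path exists, so the assigned port number is used
  laterRun = idsAfter.ports.webapp1 or 0;
}
```

Evaluating this expression (for example with nix-instantiate --eval --strict) yields 0 for firstRun and 5000 for laterRun.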

The port assignments are propagated as function parameters to the constructor functions that configure the services (as shown earlier in this blog post).

We could also implement a similar strategy with the UIDs and GIDs, but a more convenient mechanism is to compose the function that creates the credentials, so that it transparently uses our uids and gids assignments.

As shown in the expression above, the ids attribute set is also propagated to the constructors expression. The constructors expression indirectly composes the createCredentials function as follows:

{pkgs, ids ? {}, ...}:

{
  createCredentials = import ../../create-credentials {
    inherit (pkgs) stdenv;
    inherit ids;
  };

  ...
}

The ids attribute set is propagated to the function that composes the createCredentials function. As a result, it will automatically assign the UIDs and GIDs in the ids.nix expression when the user configures a user or group with a name that exists in the uids and gids resource pools.
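To give an impression of what such a transparent lookup could look like, here is a minimal sketch. It is not the actual create-credentials implementation: the augmentUser helper and the uid attribute it adds are assumptions that I only introduce for illustration.

```nix
let
  # Hypothetical helper: add a uid to a user configuration when
  # the ids attribute set contains an assignment for that user name
  augmentUser = ids: name: userConfig:
    let
      uids = ids.uids or {};
    in
    userConfig // (if builtins.hasAttr name uids
      then { uid = uids.${name}; }
      else {});

  ids = {
    uids = { webapp1 = 2000; };
  };
in
{
  # A name that exists in the uids assignments gets uid = 2000
  assigned = augmentUser ids "webapp1" { group = "webapp1"; };

  # An unknown name is left untouched
  unassigned = augmentUser ids "other" { group = "other"; };
}
```

A sketch like this also shows why matching names matter: the lookup key is the user name, so it only finds an assignment when the user is named after the service.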

To make these UIDs and GIDs assignments go smoothly, it is recommended to give a process instance the same process name, instance name, user and group names.

Using the ID assigner tool

By combining the ID resources specification with the three Disnix models: a services model (that defines all distributable services, shown above), an infrastructure model (that captures all available target machines and their properties) and a distribution model (that maps services to target machines in the network), we can automatically generate an ids configuration that contains all ID assignments:

$ dydisnix-id-assign -s services.nix -i infrastructure.nix \
  -d distribution.nix \
  --id-resources idresources.nix --output-file ids.nix

The above command will generate an ids configuration file (ids.nix) that provides, for each resource in the ID resources model, a unique assignment to services that are distributed to a target machine in the network. (Services that are not distributed to any machine in the distribution model will be skipped, to not waste too many resources).

The output file (ids.nix) has the following structure:

{
  "ids" = {
    "gids" = {
      "webapp1" = 2000;
      "webapp2" = 2001;
    };
    "uids" = {
      "webapp1" = 2000;
      "webapp2" = 2001;
    };
    "ports" = {
      "webapp1" = 5000;
      "webapp2" = 5001;
    };
  };
  "lastAssignments" = {
    "gids" = 2001;
    "uids" = 2001;
    "ports" = 5001;
  };
}

- The ids attribute contains for each resource (defined in the ID resources model) the unique ID assignments per service. As shown earlier, both service instances require unique IDs for ports, uids and gids. The above attribute set stores the corresponding ID assignments.
- The lastAssignments attribute memorizes the last ID assignment per resource. Once an ID is assigned, it will not be immediately reused. This is to allow rollbacks and to prevent data from ending up owned by the wrong user accounts. Once the maximum ID limit is reached, the ID assigner will start searching for a free assignment from the beginning of the resource pool.
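The search order described above can be sketched in a few lines of Nix. This is only my approximation of the behaviour, not the tool's actual code: the nextId function and its parameters (last, used) are hypothetical names.

```nix
let
  # Hypothetical helper: pick the next free ID from a resource pool.
  # last is the last assignment (null if none), used lists IDs in use.
  nextId = { min, max, last ? null, used ? [] }:
    let
      start = if last == null then min else last + 1;

      # Candidates after the last assignment, up to the maximum
      afterLast = builtins.genList (i: start + i) (max - start + 1);

      # Candidates from the beginning of the pool (the wrap-around)
      wrapped = builtins.genList (i: min + i) (start - min);

      free = builtins.filter (id: !(builtins.elem id used)) (afterLast ++ wrapped);
    in
    if free == [] then null else builtins.head free;
in
{
  # Empty pool: the minimum ID is handed out (5000)
  fresh = nextId { min = 5000; max = 5002; };

  # 5000 was released, but the search continues after 5001 (5002)
  afterRemoval = nextId { min = 5000; max = 5002; last = 5001; used = [ 5001 ]; };

  # The maximum was reached, so the search wraps around (5000)
  wrapAround = nextId { min = 5000; max = 5002; last = 5002; used = [ 5001 5002 ]; };
}
```

The afterRemoval case illustrates why the last assignment is memorized: a released ID is skipped for as long as there are untouched IDs left in the pool.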

In addition to assigning IDs to services that are distributed to machines in the network, it is also possible to assign IDs to all services (regardless of whether they have been deployed or not):

$ dydisnix-id-assign -s services.nix \
  --id-resources idresources.nix --output-file ids.nix

Since the above command does not know anything about the target machines, it only works with an ID resources configuration that defines global scope resources.

When you intend to upgrade an existing deployment, you typically want to retain already assigned IDs, while obsolete ID assignments should be removed, and new IDs should be assigned to services that have none yet. This is to prevent unnecessary redeployments.

When removing the first webapp service and adding a third instance:

{ pkgs, system
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager ? "sysvinit"
, ...
}:

let
  ids = if builtins.pathExists ./ids.nix then (import ./ids.nix).ids else {};

  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir forceDisableUserChange processManager ids;
  };

  processType = import ../../nixproc/derive-dysnomia-process-type.nix {
    inherit processManager;
  };
in
rec {
  webapp2 = rec {
    name = "webapp2";
    port = ids.ports.webapp2 or 0;
    dnsName = "webapp.local";
    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "2";
    };
    type = processType;
    requiresUniqueIdsFor = [ "ports" "uids" "gids" ];
  };
  
  webapp3 = rec {
    name = "webapp3";
    port = ids.ports.webapp3 or 0;
    dnsName = "webapp.local";
    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "3";
    };
    type = processType;
    requiresUniqueIdsFor = [ "ports" "uids" "gids" ];
  };
}

And running the following command (that provides the current ids.nix as a parameter):

$ dydisnix-id-assign -s services.nix -i infrastructure.nix -d distribution.nix \
  --id-resources idresources.nix --ids ids.nix --output-file ids.nix

we will get the following ID assignment configuration:

{
  "ids" = {
    "gids" = {
      "webapp2" = 2001;
      "webapp3" = 2002;
    };
    "uids" = {
      "webapp2" = 2001;
      "webapp3" = 2002;
    };
    "ports" = {
      "webapp2" = 5001;
      "webapp3" = 5002;
    };
  };
  "lastAssignments" = {
    "gids" = 2002;
    "uids" = 2002;
    "ports" = 5002;
  };
}

As may be observed, since the webapp2 process is in both the current and the previous configuration, its ID assignments will be retained. webapp1 gets removed because it is no longer in the services model. webapp3 gets the next numeric IDs from the resources pools.

Because the configuration of webapp2 stays the same, it does not need to be redeployed.
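The retain/remove/assign behaviour for a single resource pool can be approximated with the following sketch. The variable names (previous, last, services) are hypothetical, and the sketch deliberately ignores the maximum limit and the wrap-around search:

```nix
let
  # Previous port assignments and the memorized last assignment
  previous = { webapp1 = 5000; webapp2 = 5001; };
  last = 5001;

  # Services in the updated model (only the names matter here)
  services = { webapp2 = {}; webapp3 = {}; };

  # Keep assignments of services that are still in the model;
  # the assignment of webapp1 is dropped
  retained = builtins.intersectAttrs services previous;

  # Services that do not have an assignment yet
  newNames = builtins.filter
    (name: !(builtins.hasAttr name retained))
    (builtins.attrNames services);

  # Hand out fresh IDs that follow the last assignment
  fresh = builtins.listToAttrs (builtins.genList (i: {
    name = builtins.elemAt newNames i;
    value = last + 1 + i;
  }) (builtins.length newNames));
in
retained // fresh
```

This evaluates to an attribute set in which webapp2 keeps 5001 and webapp3 receives 5002, matching the ports section of the generated configuration shown above.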

The models shown earlier are valid Disnix services models. As a consequence, they can be used with Dynamic Disnix's ID assigner tool: dydisnix-id-assign.

Although these Disnix services models are also valid processes models (used by the Nix process management framework) not every processes model is guaranteed to be compatible with a Disnix service model.

For processes models that are not compatible, it is possible to use the nixproc-id-assign tool that acts as a wrapper around the dydisnix-id-assign tool:

$ nixproc-id-assign --id-resources idresources.nix processes.nix

Internally, the nixproc-id-assign tool converts a processes model to a Disnix service model (augmenting the process instance objects with missing properties) and propagates it to the dydisnix-id-assign tool.
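Conceptually, such a conversion could resemble the sketch below. Which properties the tool really adds is not spelled out here, so the choice of name and type, as well as the placeholder value "process", are assumptions of mine:

```nix
let
  # Hypothetical processes model fragment that lacks the
  # name and type properties of a Disnix service
  processes = {
    apache = {
      port = 8080;
      requiresUniqueIdsFor = [ "httpPorts" "uids" "gids" ];
    };
  };
in
# Add default properties, while properties that a process
# already defines take precedence
builtins.mapAttrs (name: process: {
  inherit name;
  type = "process"; # placeholder, not necessarily the real type
} // process) processes
```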

A more advanced example

The webapp processes example is fairly trivial and only needs unique IDs for three kinds of resources: port numbers, UIDs, and GIDs.

I have also developed a more complex example for the Nix process management framework that exposes several commonly used system services on Linux systems:

{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager
}:

let
  ids = if builtins.pathExists ./ids.nix then (import ./ids.nix).ids else {};

  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir cacheDir forceDisableUserChange processManager ids;
  };
in
rec {
  apache = rec {
    port = ids.httpPorts.apache or 0;

    pkg = constructors.simpleWebappApache {
      inherit port;
      serverAdmin = "root@localhost";
    };

    requiresUniqueIdsFor = [ "httpPorts" "uids" "gids" ];
  };

  postgresql = rec {
    port = ids.postgresqlPorts.postgresql or 0;

    pkg = constructors.postgresql {
      inherit port;
    };

    requiresUniqueIdsFor = [ "postgresqlPorts" "uids" "gids" ];
  };

  influxdb = rec {
    httpPort = ids.influxdbPorts.influxdb or 0;
    rpcPort = httpPort + 2;

    pkg = constructors.simpleInfluxdb {
      inherit httpPort rpcPort;
    };

    requiresUniqueIdsFor = [ "influxdbPorts" "uids" "gids" ];
  };
}

The above processes model exposes three service instances: an Apache HTTP server (that works with a simple configuration that serves web applications from a single virtual host), PostgreSQL and InfluxDB. Each service requires a unique user ID and group ID so that their privileges are separated.

To make these services more accessible/usable, we do not use a shared ports resource pool. Instead, each service type consumes port numbers from its own resource pool.

The following ID resources configuration can be used to provision the unique IDs to the services above:

rec {
  uids = {
    min = 2000;
    max = 3000;
  };

  gids = uids;

  httpPorts = {
    min = 8080;
    max = 8085;
  };

  postgresqlPorts = {
    min = 5432;
    max = 5532;
  };

  influxdbPorts = {
    min = 8086;
    max = 8096;
    step = 3;
  };
}

The above ID resources configuration defines a shared UIDs and GIDs resource pool, but separate ports resource pools for each service type. This has the following implications if we deploy multiple instances of each service type:

- All Apache HTTP server instances get a TCP port assignment between 8080-8085.
- All PostgreSQL server instances get a TCP port assignment between 5432-5532.
- All InfluxDB server instances get a TCP port assignment between 8086-8096. An InfluxDB instance allocates two port numbers: one for the HTTP server and one for the RPC service (the latter's port number is the base port number + 2). We use a step count of 3 so that we can retain this convention for each InfluxDB instance.
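To see what a step count of 3 means for the InfluxDB pool, the following sketch enumerates the base ports. It assumes that every value min + k * step that does not exceed max is a candidate, which is my reading of the step property rather than a statement about the tool's internals:

```nix
let
  influxdbPorts = {
    min = 8086;
    max = 8096;
    step = 3;
  };

  # Number of base ports that fit in the pool (integer division)
  count = (influxdbPorts.max - influxdbPorts.min) / influxdbPorts.step + 1;

  # Base (HTTP) ports: 8086, 8089, 8092, 8095
  basePorts = builtins.genList
    (i: influxdbPorts.min + i * influxdbPorts.step)
    count;
in
# Derive the RPC port of each instance from its base port
map (httpPort: {
  inherit httpPort;
  rpcPort = httpPort + 2;
}) basePorts
```

Under this assumption, consecutive instances never overlap: the first one uses 8086 and 8088, the second one 8089 and 8091, and so on. Note that the RPC port derived from the last base port (8095) is 8097, which lies just outside the pool range.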

Conclusion

In this blog post, I have described a new tool: dydisnix-id-assign that can be used to automatically assign unique numeric IDs to services in Disnix service models.

Moreover, I have described nixproc-id-assign, which acts as a thin wrapper around this tool to automatically assign numeric IDs to services in the Nix process management framework's processes model.

This tool replaces the old dydisnix-port-assign tool in the Dynamic Disnix toolset (described in the blog post written five years ago) that is much more limited in its capabilities.

Availability

The dydisnix-id-assign tool is available in the current development version of Dynamic Disnix. The nixproc-id-assign tool is part of the current implementation of the Nix process management framework prototype.

Thursday, September 24, 2020

Assigning unique IDs to services in Disnix deployment models