Sander van der Burg's blog: On-demand service activation and self termination

I have written quite a few blog posts on service deployment with Disnix this year. The deployment mechanics that Disnix implements work quite well for my own purposes.

Unfortunately, having a relatively good deployment solution does not necessarily mean that a system functions well in a production environment -- there are also many other concerns that must be dealt with.

Another important concern of service-oriented systems is dealing with resource consumption, such as RAM, CPU and disk space. Obviously, services need them to accomplish something. However, since they are typically long running, they also consume resources even if they are not doing any work.

These problems could become quite severe if services have been poorly developed. For example, they may leak memory and never fully release the RAM they have allocated. As a result, an entire machine may eventually run out of memory. Moreover, "idle" services may degrade the performance of other services running on the same machine.

There are various ways to deal with resource problems:

The most obvious solution is buying bigger or additional hardware resources, but this typically increases the costs of maintaining a production environment. Moreover, it does not take the source of some of the problems away.
Another solution would be to fix and optimize problematic services, but this could be a time-consuming and costly process, in particular when there is a lot of technical debt.
A third solution would be to support on-demand service activation and self termination -- a service gets activated the first time it is consulted and terminates itself after a period of idleness.

In this blog post, I will describe how to implement and deploy a system supporting the last solution.

To accomplish this goal, we need to modify the implementations of the services: a service must retrieve its server socket from the host system's service manager (which activates the service when a client connects) and terminate itself when the moment is right.

Furthermore, we need to adapt a service's deployment procedure to use these facilities.

Retrieving a socket from the host system's service manager

In many conventional deployment scenarios, the services themselves are responsible for creating the sockets to which clients can connect. However, this property conflicts with on-demand activation -- the socket must already exist before the process runs, so that the process can be started when a client connects.

We can use a service manager that supports socket activation to accomplish on-demand activation. There are various solutions supporting this property. The most prominently advertised solution is probably systemd, but there are other solutions that can do this as well, such as launchd, inetd, or xinetd, although the protocols that activated processes must implement differ between them.

In one of my toy example systems used for testing Disnix (the TCP proxy example) I used to do the following:

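#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>

static int create_server_socket(int source_port)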
{
    int sockfd, on = 1;
    struct sockaddr_in client_addr;
        
    /* Create socket */
    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if(sockfd < 0)
    {
        fprintf(stderr, "Error creating server socket!\n");
        return -1;
    }    

    /* Create address struct */
    memset(&client_addr, '\0', sizeof(client_addr));
    client_addr.sin_family = AF_INET;
    client_addr.sin_addr.s_addr = htonl(INADDR_ANY);
    client_addr.sin_port = htons(source_port);
        
    /* Set socket options to reuse the address */
    setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));

    /* Bind the name (ip address) to the socket */
    if(bind(sockfd, (struct sockaddr *)&client_addr, sizeof(client_addr)) < 0)
    {
        fprintf(stderr, "Error binding on port: %d, %s\n", source_port, strerror(errno));
        close(sockfd);
        return -1;
    }

    /* Listen for connections on the socket */
    if(listen(sockfd, 5) < 0)
    {
        fprintf(stderr, "Error listening on port %d\n", source_port);
        close(sockfd);
        return -1;
    }

    /* Return the socket file descriptor */
    return sockfd;
}

The function listed above is responsible for creating a socket file descriptor, binding the socket to an IP address and TCP port, and listening for incoming connections.

To support on-demand activation, I need to modify this function to retrieve the server socket from the service manager. Systemd's socket activation protocol works by passing the socket to the process that it spawns as file descriptor 3 (SD_LISTEN_FDS_START), the first descriptor after standard input, standard output and standard error. By adjusting the previously listed code into the following:

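#ifdef SYSTEMD_SOCKET_ACTIVATION
/* Provides sd_listen_fds() and SD_LISTEN_FDS_START */
#include <systemd/sd-daemon.h>
#endif

static int create_server_socket(int source_port)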
{
    int sockfd, on = 1;

#ifdef SYSTEMD_SOCKET_ACTIVATION
    int n = sd_listen_fds(0);
    
    if(n > 1)
    {
        fprintf(stderr, "Too many file descriptors received!\n");
        return -1;
    }
    else if(n == 1)
        sockfd = SD_LISTEN_FDS_START + 0;
    else
    {
#endif
        struct sockaddr_in client_addr;
        
        /* Create socket */
        sockfd = socket(AF_INET, SOCK_STREAM, 0);
        if(sockfd < 0)
        {
            fprintf(stderr, "Error creating server socket!\n");
            return -1;
        }
        
        /* Create address struct */
        memset(&client_addr, '\0', sizeof(client_addr));
        client_addr.sin_family = AF_INET;
        client_addr.sin_addr.s_addr = htonl(INADDR_ANY);
        client_addr.sin_port = htons(source_port);
        
        /* Set socket options to reuse the address */
        setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));

        /* Bind the name (ip address) to the socket */
        if(bind(sockfd, (struct sockaddr *)&client_addr, sizeof(client_addr)) < 0)
        {
            fprintf(stderr, "Error binding on port: %d, %s\n", source_port, strerror(errno));
            close(sockfd);
            return -1;
        }

        /* Listen for connections on the socket */
        if(listen(sockfd, 5) < 0)
        {
            fprintf(stderr, "Error listening on port %d\n", source_port);
            close(sockfd);
            return -1;
        }

#ifdef SYSTEMD_SOCKET_ACTIVATION
    }
#endif

    /* Return the socket file descriptor */
    return sockfd;
}

the server will use the socket that has been created by systemd (and passed as file descriptor 3). Moreover, if the server is started as a standalone process, it will revert to its old behaviour and allocate the server socket itself.
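Because the descriptor is inherited rather than created by the server, it can be worth checking that it really is a listening stream socket before accepting connections on it. libsystemd offers sd_is_socket() for this purpose; the following is only a minimal sketch of a similar check using plain getsockopt() calls. The helper name is_listening_stream_socket is hypothetical and not part of the TCP proxy example.

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: returns 1 if fd refers to a stream socket that is
 * in the listening state, 0 if it does not, and -1 if the check itself
 * fails (for example because fd is not a socket at all).
 */
static int is_listening_stream_socket(int fd)
{
    int type = 0, accepting = 0;
    socklen_t len = sizeof(type);

    /* Query the socket type (SOCK_STREAM, SOCK_DGRAM, ...) */
    if(getsockopt(fd, SOL_SOCKET, SO_TYPE, &type, &len) < 0)
        return -1;

    /* Query whether listen() has been called on the socket */
    len = sizeof(accepting);
    if(getsockopt(fd, SOL_SOCKET, SO_ACCEPTCONN, &accepting, &len) < 0)
        return -1;

    return (type == SOCK_STREAM && accepting != 0) ? 1 : 0;
}
```

In create_server_socket(), such a check could be applied to the descriptor received from the service manager, returning -1 when the result is not 1.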

I have wrapped the systemd specific functionality inside a conditional preprocessor block so that it only gets included when I explicitly ask for it. The downside of supporting systemd's socket activation protocol is that we require some functionality that is exposed by a shared library that has been bundled with systemd. As systemd is Linux (and glibc) specific, it makes no sense to build a service with this functionality enabled on non-systemd based Linux distributions and non-Linux operating systems.
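When linking against the systemd library is undesirable, the check can be approximated by hand, because the protocol behind sd_listen_fds() comes down to two environment variables: LISTEN_PID and LISTEN_FDS. The following is a minimal sketch under that assumption and not the code used in the TCP proxy example. The function name listen_fds_from_env is hypothetical, and the sketch leaves out work that the real library function does, such as setting the close-on-exec flag on the passed descriptors.

```c
#include <limits.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* First passed descriptor; same value as SD_LISTEN_FDS_START */
#define LISTEN_FDS_START 3

/*
 * Hypothetical helper: returns the number of sockets passed by the
 * service manager, or 0 if the process was not socket activated.
 */
static int listen_fds_from_env(void)
{
    const char *pid_str = getenv("LISTEN_PID");
    const char *fds_str = getenv("LISTEN_FDS");
    char *end;
    long value;

    if(pid_str == NULL || fds_str == NULL)
        return 0;

    /* The variables are only meant for us if LISTEN_PID is our own PID */
    value = strtol(pid_str, &end, 10);
    if(*pid_str == '\0' || *end != '\0' || value != (long)getpid())
        return 0;

    /* LISTEN_FDS holds the number of descriptors, starting at fd 3 */
    value = strtol(fds_str, &end, 10);
    if(*fds_str == '\0' || *end != '\0' || value < 0 || value > INT_MAX)
        return 0;

    return (int)value;
}
```

If the function returns 1, the server socket is LISTEN_FDS_START + 0, just as in the libsystemd-based variant.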

Besides conditionally including the code, I also made linking against the systemd library conditional in the Makefile:

CC = gcc

ifeq ($(SYSTEMD_SOCKET_ACTIVATION),1)
    EXTRA_BUILDFLAGS=-DSYSTEMD_SOCKET_ACTIVATION=1 $(shell pkg-config --cflags --libs libsystemd)
endif

all:
	$(CC) $(EXTRA_BUILDFLAGS) hello-world-server.c -o hello-world-server

...

so that the systemd-specific code block and library only get included if I run 'make' with socket activation explicitly enabled:

$ make SYSTEMD_SOCKET_ACTIVATION=1

Implementing self termination

As with on-demand activation, there is no way to do self termination generically and we must modify the service to support this property in some way.

In the TCP proxy example, I have implemented a simple approach using a counter (that is initially set to 0):

volatile sig_atomic_t num_of_connections = 0;

For each client that connects to the server, we fork a child process that handles the connection. Each time we fork, I also raise the connection counter in the parent process:

while(TRUE)
{
    /* Create client socket if there is an incoming connection */
    if((client_sockfd = wait_for_connection(server_sockfd)) >= 0)
    {
        /* Fork a new process for each incoming client */
        pid_t pid = fork();
     
        if(pid == 0)
        {
            /* Handle the client's request and terminate
             * when it disconnects */
        }
        else if(pid == -1)
            fprintf(stderr, "Cannot fork connection handling process!\n");
#ifdef SELF_TERMINATION
        else
            num_of_connections++;
#endif
    }

    close(client_sockfd);
    client_sockfd = -1;
}

(As with socket activation, I have wrapped the termination functionality in a conditional preprocessor block -- it makes no sense to include this functionality in a service that cannot be activated on demand.)

When a client disconnects, the process handling its connection terminates and sends a SIGCHLD signal to the parent. We can configure a signal handler for this type of signal as follows:

#ifdef SELF_TERMINATION
    signal(SIGCHLD, sigreap);
#endif

and use the corresponding signal handler function to decrease the counter and wait for the client process to terminate:

#ifdef SELF_TERMINATION

void sigreap(int sig)
{
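    pid_t pid;
    int status;
    
    /* Event handler when a child terminates */
    signal(SIGCHLD, sigreap);
    
    /* Reap all terminated child processes without blocking and decrease
     * the counter once for each of them, because a single SIGCHLD may
     * stand for several terminated children */
    while((pid = waitpid(-1, &status, WNOHANG)) > 0)
        num_of_connections--;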

Finally, the server can terminate itself when the counter has reached 0 (which means that it is not handling any connections and the server has become idle):

    if(num_of_connections == 0)
        _exit(0);
}
#endif

Deploying services with on demand activation and self termination enabled

Besides implementing socket activation and self termination, we must also deploy the server with these features enabled. When using Disnix as a deployment system, we can write the following service expression to accomplish this:

{stdenv, pkgconfig, systemd}:
{port, enableSystemdSocketActivation ? false}:

let
  makeFlags = "PREFIX=$out port=${toString port}${stdenv.lib.optionalString enableSystemdSocketActivation " SYSTEMD_SOCKET_ACTIVATION=1"}";
in
stdenv.mkDerivation {
  name = "hello-world-server";
  src = ../../../services/hello-world-server;
  buildInputs = if enableSystemdSocketActivation then [ pkgconfig systemd ] else [];
  buildPhase = "make ${makeFlags}";
  installPhase = ''
    make ${makeFlags} install
    
    mkdir -p $out/etc
    cat > $out/etc/process_config <<EOF
    container_process=$out/bin/process
    EOF
    
    ${stdenv.lib.optionalString enableSystemdSocketActivation ''
      mkdir -p $out/etc
      cat > $out/etc/socket <<EOF
      [Unit]
      Description=Hello world server socket
      
      [Socket]
      ListenStream=${toString port}
      EOF
    ''}
  '';
}

In the expression shown above, we do the following:

We make the socket activation and self termination features configurable by exposing them as a function parameter (which defaults to false, disabling them).
If the socket activation parameter has been enabled, we pass the SYSTEMD_SOCKET_ACTIVATION=1 flag to 'make' so that these facilities are enabled in the build system.
We must also provide two extra dependencies: pkgconfig and systemd to allow the program to find the required library functions to retrieve the socket from systemd.
We also compose a systemd socket unit file that configures systemd on the target system to allocate a server socket that activates the process when a client connects to it.

Modifying Dysnomia modules to support socket activation

As explained in an older blog post, Disnix consults a plugin system called Dysnomia that takes care of executing various kinds of deployment activities, such as activating and deactivating services. A plugin system is used because services can be any kind of deployment unit, for which no generic activation procedure exists.

For services of the 'process' and 'wrapper' type, Dysnomia integrates with the host system's service manager. To support systemd's socket activation feature, we must modify the corresponding Dysnomia modules to start the socket unit instead of the service unit on activation. For example:

$ systemctl start disnix-53bb1pl...-hello-world-server.socket

starts the socket unit, which in turn starts the service unit with the same name when a client connects to it.

To deactivate the service, we must first stop the socket unit and then the service unit:

$ systemctl stop disnix-53bb1pl...-hello-world-server.socket
$ systemctl stop disnix-53bb1pl...-hello-world-server.service

Discussion

In this blog post, I have described an on-demand service activation and self termination approach using systemd, Disnix, and a number of code modifications. Some benefits of this approach are that we can save system resources such as RAM and CPU, improve the performance of non-idle services running on the same machine, and reduce the impact of poorly implemented services that (for example) leak memory.

There are also some disadvantages. For example, connecting to an inactive service introduces latency, in particular when a service has a slow startup procedure, which makes the approach less suitable for systems that must remain responsive.

Moreover, it does not cope with potential disk space issues -- a non-running service still consumes disk space for storing its package dependencies and persistent state, such as databases.

Finally, there are some practical notes on the solutions described in the blog post. The self termination procedure in the example program terminates the server immediately after it has discovered that there are no active connections. In practice, it may be better to implement a timeout to prevent unnecessary latencies.
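A minimal sketch of such a timeout, assuming the server waits for clients in a single loop, is to wait on the server socket with poll() and to treat an expired timeout as a sign of idleness. It is not part of the example package, and the function name wait_for_client_or_idle and its return convention are hypothetical.

```c
#include <errno.h>
#include <poll.h>

/*
 * Hypothetical helper: waits until a client can be accepted on
 * server_sockfd. Returns 1 if a connection is pending, 0 if timeout_ms
 * milliseconds passed without one, and -1 on error.
 */
static int wait_for_client_or_idle(int server_sockfd, int timeout_ms)
{
    struct pollfd pfd;
    int result;

    pfd.fd = server_sockfd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* A signal such as SIGCHLD interrupts poll(); retry in that case.
     * Retrying restarts the full timeout, which is good enough for a
     * sketch but makes the idle period approximate. */
    do
    {
        result = poll(&pfd, 1, timeout_ms);
    }
    while(result < 0 && errno == EINTR);

    if(result < 0)
        return -1;

    return result > 0 ? 1 : 0;
}
```

The main loop could call this function before accepting a connection and only terminate when it returns 0 while the connection counter is also 0, instead of terminating from within the signal handler.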

Furthermore, I have only experimented with systemd's socket activation features. However, it is also possible to modify the Dysnomia modules to support different kinds of activation protocols, such as the ones provided by launchd, inetd or xinetd.

The TCP proxy example uses C as an implementation language, but systemd's socket activation protocol is not limited to C programs. For instance, an example program on GitHub demonstrates how a Python program running an embedded HTTP server can be activated with systemd's socket activation mechanism.

References

I have modified the development version of Dysnomia to support the socket activation feature of systemd. Moreover, I have extended the TCP proxy example package with a sub example that implements the on-demand activation and self termination approach described in this blog post.

Both packages can be obtained from my GitHub page.

Friday, December 4, 2015