Thursday, March 17, 2016

The NixOS project and deploying systems declaratively


Last weekend I was in Wrocław, Poland to attend wroc_love.rb, a conference tailored towards (but not restricted to) Ruby-related applications. I went there because I was invited to give a talk about NixOS.

As I had never visited Poland or a Ruby-related conference before, I did not really know what to expect, but it turned out to be a nice experience. The city, venue and people were all quite interesting, and I liked it very much.

In my talk I basically had two objectives: providing a brief introduction to NixOS and diving into one of its underlying visions: declarative deployment. From my perspective, the former aspect is not particularly new as I have given talks about the NixOS project many times (for example, I also crafted three explanation recipes).

Something that I have not done before is diving into the latter aspect. In this blog post, I'd like to elaborate on it, discuss why it is appealing, and to what extent certain tools achieve it.

On being declarative


I have used the word declarative in many of my articles. What is it supposed to mean?

I have found a nice presentation online that elaborates on four kinds of sentences in linguistics. One of the categories covered in the slides is declarative sentences, which (according to the presentation) can be defined as:

A declarative sentence makes a statement. It is punctuated by a period.

As an example, the presentation shows:

The dog in the neighbor's yard is barking.

Another class of sentences that the presentation describes is imperative sentences, which it defines as:

An imperative sentence is a command or polite request. It ends in a period or exclamation mark.

The following xkcd comic shows an example:


(Besides the two categories of sentences described above, the presentation also covers interrogative sentences and exclamatory sentences, but I won't go into detail on those.)

On being declarative in programming


In linguistics, the distinction between declarative and imperative sentences is IMO mostly clear -- declarative sentences state facts and imperative sentences are commands or requests.

A similar distinction exists in programming as well. For example, on Wikipedia I found the following definition for declarative programming (the Wikipedia article cites the article: "Practical Advantages of Declarative Programming" written by J.W. Lloyd, which I unfortunately could not find anywhere online):

In computer science, declarative programming is a programming paradigm -- a style of building the structure and elements of computer programs -- that expresses the logic of a computation without describing its control flow.

Imperative programming is sometimes seen as the opposite of declarative programming, but not everybody agrees. I found an interesting blog post written by William Cook that elaborates on their differences.

His definitions of declarative and imperative are:

Declarative: describing "what" is to be computed rather than "how" to compute the result/behavior

Imperative: a description of a computation that involves implicit effects, usually mutable state and input/output.

Moreover, he says the following:

I agree with those who say that "declarative" is a spectrum. For example, some people say that Haskell is a declarative language, but in my view Haskell programs are very much about *how* to compute a result.

I also agree with William Cook's opinion that declarative is a spectrum -- unlike in linguistics, it is hard to draw a sharp line between what and how in programming. Some programming languages that are considered imperative, e.g. C, modify mutable state such as variables:

int a = 5;
a += 3;

But if we modify the code to work without mutable state, it still remains more of a "how" description than a "what" description IMO:

int sum(int a, int b)
{
    return a + b;
}

int result = sum(5, 3);

Two prominent languages that are more about what than how are HTML and CSS. Both technologies empower the web. For example, in HTML I can express the structure of a page:

<!DOCTYPE html>

<html>
    <head>
        <title>Test</title>
        <link rel="stylesheet" href="style.css" type="text/css">
    </head>
    <body>
        <div id="outer">
            <div id="inner">
                <p>HTML and CSS are declarative and so cool!</p>
            </div>
        </div>
    </body>
</html>

In the above code fragment, I define two nested divisions in which a paragraph of text is displayed.

In CSS, I can specify the style of these page elements:

#outer {
    margin-left: auto;
    margin-right: auto;
    width: 20%;
    border-style: solid;
}

#inner {
    width: 500px;
}

In the above example, we state that the outer div should be centered, have a width of 20% of the page, and a solid border should be drawn around it. The inner div has a width of 500 pixels.

This approach can be considered declarative, because you do not have to specify how to render the page and the style of the elements (e.g. the text, the border). Instead, the browser's layout engine figures this out. Besides sparing you the rendering work, this approach has a number of additional benefits, such as:

  • Because it does not matter (much) how a page is rendered, we can fully utilize a system's resources (e.g. a GPU) to render a page faster and in a fancier way, and optionally degrade a page's appearance if a system's resources are limited.
  • We can also interpret the page in many ways. For example, we can pass the text in paragraphs to a text-to-speech engine for people who are visually impaired.

Despite these potential advantages, HTML and CSS are far from perfect. If you actually check how the example gets rendered in your browser, you will observe one of CSS's many odd traits, but I am not going to reveal what it is. :-)

Moreover, despite being more declarative (than code written in an imperative programming language such as C) even HTML and CSS can sometimes be considered a "how" specification. For example, you may want to render a photo gallery on your web page. There is nothing in HTML and CSS that allows you to concisely express that. Instead, you need to decompose it into "lower level" page elements, such as paragraphs, hyperlinks, forms and images.

So IMO, being declarative depends on what your goal is -- in some contexts you can exactly express what you want, but in others you can only express things that are in service of something else.

On being declarative in deployment


In addition to development, you eventually have to deploy a system (typically to a production environment) to make it available to end users. To deploy a system you must carry out a number of activities, such as:

  • Building (if a compiled language is used, such as Java).
  • Packaging (e.g. into a JAR file).
  • Distributing (transferring artifacts to the production machines).
  • Activating (e.g. a Java web application in a Servlet container).
  • In case of an upgrade: deactivating obsolete components.

Deployment is often much more complicated than most people expect. Some things that make it complicated are:

  • Many kinds of steps need to be executed, in particular when the technology used is diverse. Without any automation, it becomes extra complicated and time consuming.
  • Deployment in production must typically be done on a large scale. In development, a web application/web service typically serves one user only (the developer), while in production it may need to serve thousands or millions of users. In order to serve many users, you need to manage a cluster of machines having complex constraints in terms of system resources and connectivity.
  • There are non-functional requirements that must be met. For example, while upgrading you want to minimize a system's downtime as much as possible. You probably also want to roll back to a previous version if an upgrade went wrong. Accomplishing these properties is often much more complicated than expected (sometimes even impossible!).

As with linguistics and programming, I see a similar distinction in deployment as well -- carrying out the activities listed above is simply the means to accomplish deployment.

What I want (if I need to deploy) is that my system on my development machine becomes available in production, while meeting certain quality attributes of the system that is being deployed (e.g. it could serve thousands of users) and quality attributes of the deployment process itself (e.g. that I can easily roll back in case of an error).

Mainstream solutions: convergent deployment


There are a variety of configuration management tools claiming to support declarative deployment. The best-known category of tools implements convergent deployment, for example CFEngine, Puppet, Chef and Ansible.

For example, Chef is driven by declarative deployment specifications (implemented in a Ruby DSL) that may look as follows (I took this example from a Chef tutorial):

...

wordpress_latest = Chef::Config[:file_cache_path] + "/wordpress-latest.tar.gz"

remote_file wordpress_latest do
  source "http://wordpress.org/latest.tar.gz"
  mode "0644"
end

directory node["phpapp"]["path"] do
  owner "root"
  group "root"
  mode "0755"
  action :create
  recursive true
end

execute "untar-wordpress" do
  cwd node['phpapp']['path']
  command "tar --strip-components 1 -xzf " + wordpress_latest
  creates node['phpapp']['path'] + "/wp-settings.php"
end

The objective of the example shown above is deploying a WordPress web application. What the specification defines is a tarball that must be fetched from the WordPress web site, a directory that must be created in which the web application is hosted, and a tarball that needs to be extracted into that directory.

The specification can be considered declarative, because you do not have to describe the exact steps that need to be executed. Instead, the specification captures the intended outcome of a set of changes and the deployment system converges to the outcome. For example, for the directory that needs to be created, it first checks if it already exists. If so, it will not be created again. It also checks whether it can be created, before attempting to do it.

Converging, instead of directly executing steps, provides additional safety mechanisms and makes deployment processes more efficient as duplicate work is avoided as much as possible.
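
The check-then-act behaviour described above can be sketched in a few lines of plain Ruby. This is only an illustration of the idea and not how Chef implements its directory resource: the helper name converge_directory and the path used in the example are assumptions of mine.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper (not part of Chef): bring a directory into the
# desired state and report whether anything had to be changed.
def converge_directory(path, mode: 0o755)
  changed = false

  # Only create the directory (and its parents) when it is missing.
  unless File.directory?(path)
    FileUtils.mkdir_p(path)
    changed = true
  end

  # Only touch the permissions when they differ from the desired ones.
  if (File.stat(path).mode & 0o777) != mode
    File.chmod(mode, path)
    changed = true
  end

  changed
end

Dir.mktmpdir do |root|
  target = File.join(root, "srv", "phpapp")
  puts converge_directory(target) # first run: the directory is created
  puts converge_directory(target) # second run: nothing left to do
end
```

Running the same specification twice is safe: the second run inspects the current state, finds that it already matches, and skips the work.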

There are also a number of drawbacks -- it is not guaranteed (in case of an upgrade) that the system can converge to a new set of outcomes. Moreover, while upgrading a system we may observe downtime (e.g. when a new version of WordPress is being unpacked). Also, rolling back to a previous configuration cannot be done instantly.

Finally, convergent deployment specifications do not guarantee reproducible deployment. For example, the above code does not capture the configuration process of a web server and a PHP extension module, which are required dependencies to run WordPress. If we apply the changes to a machine where these components are missing, the changes may still apply but yield a non-working configuration.

The NixOS approach


NixOS also supports declarative deployment, but in a different way. The following code fragment is an example of a NixOS configuration:

{pkgs, ...}:

{
  boot.loader.grub.device = "/dev/sda";

  fileSystems = [ { mountPoint = "/"; device = "/dev/sda2"; } ];
  swapDevices = [ { device = "/dev/sda1"; } ];
  
  services = {
    openssh.enable = true;
    
    xserver = {
      enable = true;
      desktopManager.kde4.enable = true;
    };
  };
  
  environment.systemPackages = [ pkgs.mc pkgs.firefox ];
}

In a NixOS configuration you describe what components constitute a system, rather than the outcome of changes:

  • The GRUB bootloader should be installed on the MBR of the hard drive: /dev/sda.
  • The /dev/sda2 partition should be mounted as a root partition, /dev/sda1 should be used as a swap partition.
  • We want Mozilla Firefox and Midnight Commander as end user packages.
  • We want to use the KDE 4.x desktop.
  • We want to run OpenSSH as a system service.

The entire machine configuration can be deployed by running a single command-line instruction:

$ nixos-rebuild switch

NixOS executes all required deployment steps to deploy the machine configuration -- it downloads or builds all required packages from source code (including all their dependencies), it generates the required configuration files and finally (if all the previous steps have succeeded) it activates the new configuration, including the new system services, and deactivates the system services that have become obsolete.
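
The ordering matters here: everything is built before anything is activated. The following Ruby sketch shows that ordering under simplified assumptions. It is not NixOS code; the deploy function, the builders table and the "cups" and "broken" services are made up for illustration.

```ruby
require "set"

# Hypothetical sketch, not NixOS code: `running` and `desired` are sets of
# service names, `builders` maps each name to a lambda that may raise.
def deploy(running, desired, builders)
  # Build everything first; nothing has been activated at this point.
  artifacts = desired.to_h { |name| [name, builders.fetch(name).call] }

  # Only after all builds succeeded do we compute what to start and stop.
  {
    running: desired,
    start: (desired - running).sort,
    stop: (running - desired).sort,
    artifacts: artifacts
  }
rescue StandardError => e
  # A failed build leaves the running set exactly as it was.
  { running: running, start: [], stop: [], error: e.message }
end

builders = {
  "openssh" => -> { "openssh-build" },
  "xserver" => -> { "xserver-build" },
  "broken"  => -> { raise "build failed" }
}
running = Set["openssh", "cups"]

p deploy(running, Set["openssh", "xserver"], builders).slice(:start, :stop)
p deploy(running, Set["openssh", "broken"], builders)[:running].to_a
```

In the second call one build fails, so the function reports the old set of running services and schedules nothing to start or stop.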

Besides executing the required deployment activities, NixOS has a number of important quality attributes as well:

  • Reliability. Nix (the underlying package manager) ensures that all dependencies are present. It stores new versions of packages next to old versions, without overwriting them. As a result, you can always switch back to older versions if needed.
  • Reproducibility. Undeclared dependencies do not influence builds -- if a build works on one machine, then it works on others as well.
  • Efficiency. Nix only deploys packages and configuration files that are needed.
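
To give an impression of why storing versions next to each other makes rollbacks cheap, here is a toy store in Ruby. It only mimics the idea (a path derived from a hash over the declared inputs, and a symbolic link that points at the active version). The TinyStore class, its directory layout and its hashing scheme are assumptions for this sketch and do not reflect how Nix is actually implemented.

```ruby
require "digest"
require "fileutils"
require "tmpdir"

# Hypothetical miniature store, NOT Nix's real implementation or path scheme.
class TinyStore
  def initialize(root)
    @store = File.join(root, "store")
    @profile = File.join(root, "current")
    FileUtils.mkdir_p(@store)
  end

  # The path is derived from a hash over the name and all declared inputs,
  # so different inputs never overwrite each other and identical inputs
  # are only "built" once.
  def add(name, inputs)
    digest = Digest::SHA256.hexdigest([name, inputs.sort].inspect)[0, 12]
    path = File.join(@store, "#{digest}-#{name}")
    unless File.directory?(path)
      FileUtils.mkdir_p(path)
      File.write(File.join(path, "inputs"), inputs.sort.inspect)
    end
    path
  end

  # Switching replaces one symlink via rename; old versions stay in place.
  def switch(path)
    tmp = "#{@profile}.tmp"
    File.symlink(path, tmp)
    File.rename(tmp, @profile)
  end

  def current
    File.readlink(@profile)
  end
end

Dir.mktmpdir do |root|
  store = TinyStore.new(root)
  v1 = store.add("firefox", { "version" => "44.0" })
  v2 = store.add("firefox", { "version" => "45.0" })
  store.switch(v1)
  store.switch(v2) # upgrade
  store.switch(v1) # rollback: v1 was never overwritten
  puts store.current == v1
end
```

Because an upgrade never modifies the old version, rolling back amounts to pointing the link at a path that still exists.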

NixOS is a Linux distribution, but the NixOS project provides other tools bringing the same (or similar) deployment properties to other areas. Nix works on package level (and works on other systems besides NixOS, such as conventional Linux distributions and Mac OS X), NixOps deploys networks of NixOS machines and Disnix deploys (micro)services in networks of machines.

The Nix way of deploying is typically my preferred approach, but these tools also have their limits -- to benefit from the quality properties they provide, everything must be deployed with Nix (and as a consequence: specified in Nix expressions). You cannot take an existing system (deployed by other means) first and change it later, something that you can actually do with convergent deployment tools, such as Chef.

Moreover, Nix (and its sub projects) only manages the static parts of a system such as packages and configuration files (which are made immutable by Nix by making them read-only), but not any state, such as databases.

For managing state, external solutions must be used. For example, I developed a tool called Dysnomia with similar semantics to Nix, but it is not always a good solution, especially for big chunks of state.

How declarative are these deployment solutions?


I have heard some people claiming that the convergent deployment models are not declarative at all, and the Nix deployment models are actually declarative because they do not specify imperative changes.

Again, I think it depends on how you look at it -- basically, the Nix tools solve problems in a technical domain from declarative specifications, e.g. Nix deploys packages, NixOS entire machine configurations, NixOps networks of machines etc., but typically you would do these kinds of things to accomplish something else, so in a sense you could still consider these approaches a "how" rather than a "what".

I have also developed domain-specific deployment tools on top of the tools that are part of the Nix project, allowing me to express concisely what I want in a specific domain:

WebDSL


WebDSL is a domain-specific language for developing web applications with a rich data model, supporting features such as domain modelling, user interfaces and access control. The WebDSL compiler produces Java web applications.

In order to deploy a WebDSL application in a production environment, all kinds of complicated tasks need to be carried out -- we must install a MySQL server, Apache Tomcat server, deploy the web application to the Tomcat server, tune specific settings, and install a reverse proxy that does caching etc.

You typically do not want to express such things in a deployment model. I have developed a tool called webdsldeploy allowing someone to only express the deployment properties that matter for WebDSL applications on a high level. Underneath, the tool consults NixOps (formerly known as Charon) to compose system configurations hosting the components required to run the WebDSL application.
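
The idea of expanding a small domain-specific model into lower-level machine descriptions can be sketched as follows. This is not webdsldeploy's actual input format or output: the model fields, the machine names and the expand function are all invented for illustration.

```ruby
# Hypothetical high-level model; the field names are invented here.
APP_MODEL = { name: "webshop", app_servers: 2 }.freeze

# Expand the model into the lower-level machine descriptions that would
# otherwise have to be written by hand.
def expand(model)
  machines = {
    "database" => { services: ["mysql"] },
    "proxy"    => { services: ["reverse-proxy"], backends: [] }
  }

  model[:app_servers].times do |i|
    host = "app#{i + 1}"
    machines[host] = {
      services: ["tomcat"],
      webapps: [model[:name]],
      database: "database"
    }
    machines["proxy"][:backends] << host
  end

  machines
end

expand(APP_MODEL).each do |host, config|
  puts "#{host}: #{config[:services].join(', ')}"
end
```

The person deploying the application only states its name and how many application servers it needs; the database, Tomcat and reverse proxy machines follow from that.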

Conference Compass


Conference Compass sells services to conference organizers. The most visible part of their service is the apps for conference attendees, providing features such as displaying a conference program, list of speakers and floor maps of the venue.

Each customer basically gets "their own app" -- an app for a specific customer has their preferred colors, artwork, content etc. We use a single code base to produce specialized apps.

To produce such specialized apps, we do not want to specify things such as how to build an app for Android through Nix, an app for iOS through Nix, and how to produce debug and release versions etc. These are basically just technical details.

Instead, we have developed our own custom tool that is driven by a specification that concisely expresses what customizations we want (e.g. artwork) and produces the artefacts we want accordingly.
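
As a rough sketch of what such a specification-driven tool does, the following Ruby fragment expands a list of customizations into one build job per platform and build type. It is not our actual tool: the customer names, the fields and the build_jobs function are made up for illustration.

```ruby
PLATFORMS = %w[android ios].freeze
BUILD_TYPES = %w[debug release].freeze

# Hypothetical customization spec: the field names are invented here.
CUSTOMERS = [
  { name: "conf-a", color: "#003366", artwork: "conf-a/logo.png" },
  { name: "conf-b", color: "#aa0000", artwork: "conf-b/logo.png" }
].freeze

# Expand each customer into one build job per platform and build type.
def build_jobs(customers)
  customers.flat_map do |customer|
    PLATFORMS.product(BUILD_TYPES).map do |platform, type|
      customer.merge(platform: platform, type: type)
    end
  end
end

build_jobs(CUSTOMERS).each do |job|
  puts "#{job[:name]}-#{job[:platform]}-#{job[:type]}"
end
```

The specification only mentions what differs per customer; the platforms and build variants are technical details that the tool fills in.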

We use a similar approach for our backends -- each app connects to its own dedicated backend allowing users to configure the content displayed in the app. The configurator can also be used to dynamically update the content that is displayed in the apps. For big customers, we offer an additional service in which we develop programs that automatically import data from their information systems.

For the deployment of these backend instances, we do not want to express things such as machines, database services, and the deployment of NPM and Python packages.

Instead, we use a domain-specific tool that is driven by a model that concisely expresses what configurators we want and which third party integrations they provide. The tool is responsible for instantiating virtual machines in the cloud and deploying the services to them.

Conclusion


In this blog post I have elaborated on being declarative in deployment and discussed to what extent certain tools achieve it. As with declarative programming, being declarative in deployment is a spectrum.

References


Some aspects discussed in this blog post are covered in my PhD thesis:
  • I did a more elaborate comparison of infrastructure deployment solutions in Chapter 6. I also cover convergent deployment and used CFEngine as an example.
  • I have covered webdsldeploy in Chapter 11, including some background information about WebDSL and its deployment aspects.
  • The overall objective of my PhD thesis is constructing deployment tools for specific domains. Most of the chapters cover the ingredients to do so, but Chapter 3 explains a reference architecture for deployment tools, having similar (or comparable) properties to tools in the Nix project.

For convenience, I have also embedded the slides of my presentation into this web page: