Sunday, February 25, 2018

A more realistic public Disnix example

It has almost been ten years ago when I started developing Disnix -- February 2008 marked the start of my master's thesis internship at Philips Research that resulted in the first prototype version.

Originally, Disnix was specifically developed for one use case only -- a medical service-oriented system called the "Service Development Support System" (SDS2) that can be used for asset tracking and utilisation analysis for medical devices in a hospital environment. More information about this case study can be found in my master's thesis, some of my research papers and my PhD thesis (all of them can be found on my publications page).

Many developments have happened since the realization of the first prototype -- its feature set has been extended considerably, its architecture has been overhauled several times and the code has evolved significantly. Most notably, I have been maintaining a production system for over three years with it.

In all these years, there is always one recurring question that I regularly receive from various kinds of people:

Why should I use Disnix and why would it be useful?

The answer is that Disnix becomes useful when you have a system that can be decomposed into distributable services, such as web services, RESTful services, web applications or processes.

In addition to the fact that Disnix automates its deployment and offers a number of powerful quality properties (e.g. non-destructive upgrades for the static parts of a system), it also helps componentized systems in reaching their full potential -- for example, when services can be built, deployed, and managed individually you can scale a system up and down (e.g. by distributing services to dedicated machines or consolidating all services on a single machine) and you can anticipate more flexibly to events (e.g. by redeploying services when we encounter a crashing machine).

Although the answer may sound simple, service-oriented systems are complicated -- besides facing all kinds of deployment complexities, properly dividing a system into distributable components is also quite challenging. For all the systems I have seen in the last decade, the requirements and their modularization strategies were all quite different from each other. I have also seen a number of systems for which decomposing into services did not work and unnecessary complexities were introduced.

Moreover, it is hard to find representative public examples that people can use as a reference. I was fortunate that I had access to an industrial case study during my research. Nonetheless, I was suffering from many difficulties because of the lack of any meaningful public case studies. As a countermeasure, I developed a collection of example cases in addition to SDS2, but because of their over-simplicity, proving my point often remained hard.

Roughly half a year ago, I have released most parts of my ancient web framework that I used to actively develop before I started doing research in software deployment and I created a couple of example applications for it.


Although my web framework development predates my deployment research, I was already using it to implement information systems that followed some modularity principles that are beneficial when using Disnix as a deployment system.

Recently, I have extended my web framework's example applications repository (providing a homework assistant, CMS, photo gallery and literature survey assistant) to become another public Disnix example case following the same modularity principles I used for the information systems I used to implement at that time.

Creating a componentized web information system


As mentioned earlier in this blog post, I have already implemented a (fairly simple) componentized web information system before I started working on Disnix using my ancient custom made web framework. The "componentization process" (a term that I had neither learned about yet nor something I was consciously implementing at that time) was partially driven by evolution and partially by non-functional requirements.

Originally, the system started out as just one single web application for one specific purpose and consisted of only two components -- a MySQL database responsible for storing the data and web front-end implemented in PHP, which is quite a common separation pattern for PHP applications.

Later, I was asked to implement another PHP application with similar functionality. Initially, I wrote the application from scratch without any reuse in mind, but at some point I made two important decisions:

  • I decided to keep the databases of each applications separate as opposed to integrating all the tables into one single database. My main motivating factor was that I wanted to prevent another developer's wrong decisions from messing up the other application. Moreover, I realized that for the data that was specific to the application domain that other systems did not have to know about it.
  • In addition to domain specific data, I noticed that both databases also stored the same kind of data, namely: user accounts -- both systems had a user account system to allow users to change the data. This also did not motivate me to integrate both databases into one database. Instead, I created a separate user database and authentication system (as a library API) that was shared among both applications.

After completing the two web applications, I had to implement more functionality. I decided to keep all of these new features for these new problem domains in separate applications with separate databases. The only thing they had in common was a shared user authentication system.

At some point I ended up having many sub applications. As a result, I needed a portal application that redirected users to these sub applications. Essentially, what I implemented became a system of systems.

Deployment with Disnix


The "architectural decisions" that I described earlier resulted in a system composed of several kinds of components:

  • Domain-specific web applications exposing functionality that logically belongs together.
  • Domain-specific databases storing tables that are strongly correlated.
  • A shared user database.
  • A portal application redirecting users to the domain-specific web applications.

The above listed components can be distributed over multiple machines in a network, because they connect to each other through network links (e.g. connecting to a MySQL database can be done with a TCP connection and connecting to a domain specific web application can be done through HTTP). As a result, they can also be modeled as services that can be deployed with Disnix.

To replicate the same patterns for demo purposes, I integrated my framework's example applications into a similar system of sub systems. We can deploy the corresponding example system to one single target machine with Disnix, by running:

$ disnixos-env -s services.nix \
  -n network-single.nix \
  -d distribution-single.nix --use-nixops

The entire system gets deployed to a single machine because of the distribution model (distribution.nix) that maps all services to one target machine:

{infrastructure}:

{
  usersdb = [ infrastructure.test1 ];
  cmsdb = [ infrastructure.test1 ];
  cmsgallerydb = [ infrastructure.test1 ];
  homeworkdb = [ infrastructure.test1 ];
  literaturedb = [ infrastructure.test1 ];
  portaldb = [ infrastructure.test1 ];

  cms = [ infrastructure.test1 ];
  cmsgallery = [ infrastructure.test1 ];
  homework = [ infrastructure.test1 ];
  literature = [ infrastructure.test1 ];
  users = [ infrastructure.test1 ];
  portal = [ infrastructure.test1 ];
}

The resulting deployment architecture looks as follows:


The above visualization of the deployment architecture shows the following aspects:

  • The surrounding light grey colored box denotes a target machine. In this particular example, we only have one single target machine where services are deployed to.
  • The dark grey colored boxes correspond to container environments. For our example system, we have two of them: mysql-database corresponding to a MySQL DBMS server and apache-webapplication corresponding to an Apache HTTP server.
  • The ovals denote services corresponding to MySQL databases and web applications.
  • The arrows denote inter-dependency links that correspond to network connections. As explained in my previous blog post, solid arrows are dependencies with a strict ordering requirement while dashed arrows are dependencies without an ordering requirement.

Some people may argue that it is not really beneficial to deploy such a system with Disnix -- with NixOps you can define a machine configuration having a MySQL DBMS server and an Apache HTTP server with the corresponding databases and web application components. With Disnix, you must first ensure that the machines, the MySQL and Apache HTTP servers are configured by other means first (that could for example be done with NixOps), and then you have to deploy the system's components with Disnix.

In a single machine deployment scenario, it may indeed not be that beneficial. However, what you get in addition to automated deployment is also more flexibility. Since Disnix manages the services directly, as opposed to entire machine configurations as a whole, you can anticipate better in case of events by redeploying the system.

For example, when the amount of visitors keeps growing, you may run into the problem that a single server can no longer handle all the traffic. In such cases, you can easily add another machine to the network and adjust the distribution model to move (for example) the databases to another machine:

{infrastructure}:

{
  usersdb = [ infrastructure.test2 ];
  cmsdb = [ infrastructure.test2 ];
  cmsgallerydb = [ infrastructure.test2 ];
  homeworkdb = [ infrastructure.test2 ];
  literaturedb = [ infrastructure.test2 ];
  portaldb = [ infrastructure.test2 ];

  cms = [ infrastructure.test1 ];
  cmsgallery = [ infrastructure.test1 ];
  homework = [ infrastructure.test1 ];
  literature = [ infrastructure.test1 ];
  users = [ infrastructure.test1 ];
  portal = [ infrastructure.test1 ];
}

By redeploying the system, we can take advantage of the additional system resources that the new machine provides:

$ disnixos-env -s services.nix \
  -n network-separate.nix \
  -d distribution-separate.nix --use-nixops

resulting in the following deployment architecture:


Likewise, there are countless of other deployment strategies possible to meet all kinds of non-functional requirements. For example, we can also distribute bundles of domain specific application and database pairs over two machines:

$ disnixos-env -s services.nix \
  -n network-bundles.nix \
  -d distribution-bundles.nix --use-nixops

resulting in the following deployment architecture:


This approach is even more scalable than simply offloading the databases to another server.

In addition to scalability, there are countless of other reasons to pick a certain distribution strategy. You could also, for example, distribute redundant instances of databases and applications as a failover to improve availability or improve security by deploying the databases with privacy sensitive data to a machine with restrictive network access.

State management


When updating the deployment of systems with Disnix (such as moving a database from one machine to another), there may be a recurring limitation that you could run frequently into -- like Nix, Disnix only manages the static parts of the system, but not any state. This means that a service's deployment can be reproduced elsewhere, but data, such as the content of a database is not migrated.

For example, the sub system of example applications stores two kinds of data -- records in the MySQL database and files, such as images uploaded in the photo gallery or PDF files uploaded to the literature application. When moving these applications around the data is not migrated.

As a possible solution, Disnix also provides simple state management facilities. When enabled, Disnix will take snapshots of the databases and filesets on the source machines, transfers the snapshots to the target machines, and finally restores the snapshots when moving a service one machine to another in the distribution model.

State management can be enabled globally by passing the --deploy-state parameter to (disnix-env or annotating the services with deployState = true; in the services model):

$ disnixos-env -s services.nix \
  -n network-bundles.nix \
  -d distribution-bundles.nix --use-nixops --deploy-state

We can also directly use the state management system, e.g. for backup purposes. When running the following command:

$ disnix-snapshot

Disnix takes snapshots of all databases and web application state (e.g. the images in the photo gallery and uploaded PDF files) and transfers them to the coordinator machine. With the dysnomia-snapshots tool we can inspect the snapshot store:

$ dysnomia-snapshots --query-all
apache-webapplication/cms/1f9ed847885d2b3e3c67c51231122d958751eb5e2443c281e02e1d7108a505a3
apache-webapplication/cmsgallery/28d17a6941cb195a92e748aae737ccf524747477c6943436b734891d0f36fd53
apache-webapplication/literature/ed5ec4f8b9b4fcdb8b740ad1fa7ecb40b10dece03548f1d6e09a6a82c804131b
apache-webapplication/portal/5bbea499f8f8a4f708bb873ad683dbf088afa4c553f90ab287a9249a7ef02651
mysql-database/cmsdb/aa75992f780991c39a0969dcac5f69b04685c4fa764937476b816e938d6972ba
mysql-database/cmsgallerydb/31ebdaba658ca376123ff6a91a3e275731b383346a07840b1acaa1e44d921b65
mysql-database/homeworkdb/f0fda91545af0cb300afd84592d4914dcd48257053401e232438e34d83af828d
mysql-database/literaturedb/cb881c2200a5f1562f0b66f1394d0902bbb8e2361068fe096faac3bc31f76b5d
mysql-database/portaldb/5d8a5cb952f40ce76f93eb939d0b37eab33736d7b1e1426038322f8a572034ee
mysql-database/usersdb/64d11fc7f8969da5da318276a666f2e00e0a020ba619a1d82ed9b84a7f1c2ca6

and with some shell scripting, the actual contents of the snapshot store:

$ find $(dysnomia-snapshots --resolve $(dysnomia-snapshots --query-all)) -type f
/home/sander/state/snapshots/apache-webapplication/cms/1f9ed847885d2b3e3c67c51231122d958751eb5e2443c281e02e1d7108a505a3/state.tar.xz
/home/sander/state/snapshots/apache-webapplication/cmsgallery/28d17a6941cb195a92e748aae737ccf524747477c6943436b734891d0f36fd53/state.tar.xz
/home/sander/state/snapshots/apache-webapplication/literature/ed5ec4f8b9b4fcdb8b740ad1fa7ecb40b10dece03548f1d6e09a6a82c804131b/state.tar.xz
/home/sander/state/snapshots/apache-webapplication/portal/5bbea499f8f8a4f708bb873ad683dbf088afa4c553f90ab287a9249a7ef02651/state.tar.xz
/home/sander/state/snapshots/mysql-database/cmsdb/aa75992f780991c39a0969dcac5f69b04685c4fa764937476b816e938d6972ba/dump.sql.xz
/home/sander/state/snapshots/mysql-database/cmsgallerydb/31ebdaba658ca376123ff6a91a3e275731b383346a07840b1acaa1e44d921b65/dump.sql.xz
/home/sander/state/snapshots/mysql-database/homeworkdb/f0fda91545af0cb300afd84592d4914dcd48257053401e232438e34d83af828d/dump.sql.xz
/home/sander/state/snapshots/mysql-database/literaturedb/cb881c2200a5f1562f0b66f1394d0902bbb8e2361068fe096faac3bc31f76b5d/dump.sql.xz
/home/sander/state/snapshots/mysql-database/portaldb/5d8a5cb952f40ce76f93eb939d0b37eab33736d7b1e1426038322f8a572034ee/dump.sql.xz
/home/sander/state/snapshots/mysql-database/usersdb/64d11fc7f8969da5da318276a666f2e00e0a020ba619a1d82ed9b84a7f1c2ca6/dump.sql.xz

The above output shows that for each MySQL database, we store a compressed SQL dump of the database and for each stateful web application, a compressed tarball of state files.

Conclusion


In this blog post, I have described a more realistic public Disnix example that is inspired by my web framework developments a long time ago. Aside from automating a system's deployment, the purpose of this blog post is to describe how a system that can be decomposed into distributable services that can be deployed with Disnix. Implementing such a system is all but trivial and driven by various kinds of design decisions.

Availability


The example web application system can be obtained from my GitHub page. The Disnix deployment expressions can be found in the deployment/ sub folder.

In addition, I have created a Dysnomia module named: fileset that can capture the state files of web applications in a compressed tarball.

After the recent developments the Disnix toolset has reached a new stable point. As a result, I have decided to release Disnix 0.8. Consult the Disnix homepage for more information!

3 comments:

  1. For the state snapshots, do you use a filesystem like LVM?

    ReplyDelete
    Replies
    1. dysnomia takes the snapshots with an application-specific script.

      Delete
    2. Yes that's correct. Dysnomia implements a plugin system and plugin decides how a snapshot should be taken or restored.

      In this particular example, the databases are snapshotted by invoking the mysqldump command and the state of the web applications (e.g. the photo gallery) by invoking the tar command.

      I don't use any specialized filesystem. This has a number of performance implications, in particular for big databases, but for small datasets it works well enough.

      Delete