Sander van der Burg's blog: December 2020

Thursday, December 31, 2020

Annual blog reflection over 2020

In my previous blog post that I wrote yesterday, I celebrated my blog's 10th anniversary and did a reflection over the last decade. However, I did not elaborate much about 2020.

Because 2020 is a year for the history books, I have decided to also do an annual reflection over the last year (similar to my previous annual blog reflections).

A summary of blog posts written in 2020

Nearly all of the blog posts that I have written this year were in service of only two major goals: developing the Nix process management framework and implementing service container support in Disnix.

Both of them took a substantial amount of development effort. Much more than I initially anticipated.

Investigating process management

I already started working on this topic last year. In November 2019, I wrote a blog post about packaging sysvinit scripts and a Nix-based functional organization for configuring process instances, that could potentially also be applied to other process management solutions, such as systemd and supervisord.

After building my first version of a framework (which was already a substantial leap in reaching my full objective), I thought it would not take me that much time to get all details that I originally planned finished. It turns out that I heavily underestimated the complexity.

To test my framework, I needed a simple test program that could daemonize on its own, which was (and still is) a common practice for running services on Linux and many other UNIX-like operating systems.

I thought writing such a tool that daemonizes would be easy, but after some research, I discovered that it is actually quite complicated to do it properly. I wrote a blog post about my findings.

It took me roughly three months to finish the first implementation of the process manager-agnostic abstraction layer that makes it possible to write a high-level specification of a running process, that could universally target all kinds of process managers, such as sysvinit, systemd, supervisord and launchd.
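To make the idea of a process manager-agnostic specification a bit more concrete, here is a minimal Python sketch. It is not the framework's actual implementation (the framework is written around Nix expressions); the `ProcessSpec` class and both generator functions are hypothetical names invented for this illustration. It only shows how one high-level description could be rendered into configuration for two different process managers:

```python
from dataclasses import dataclass, field


@dataclass
class ProcessSpec:
    """Hypothetical high-level description of a running process."""
    name: str
    command: str
    user: str = "root"
    environment: dict = field(default_factory=dict)


def to_systemd_unit(spec: ProcessSpec) -> str:
    """Render the spec as the text of a systemd service unit."""
    lines = [
        "[Unit]",
        f"Description={spec.name}",
        "",
        "[Service]",
        f"ExecStart={spec.command}",
        f"User={spec.user}",
    ]
    # Sort the variables so that the generated output is deterministic
    for key, value in sorted(spec.environment.items()):
        lines.append(f'Environment="{key}={value}"')
    lines += ["", "[Install]", "WantedBy=multi-user.target"]
    return "\n".join(lines) + "\n"


def to_supervisord_program(spec: ProcessSpec) -> str:
    """Render the same spec as a supervisord [program:x] section."""
    lines = [
        f"[program:{spec.name}]",
        f"command={spec.command}",
        f"user={spec.user}",
    ]
    if spec.environment:
        pairs = ",".join(
            f'{key}="{value}"' for key, value in sorted(spec.environment.items())
        )
        lines.append(f"environment={pairs}")
    return "\n".join(lines) + "\n"


# Example: one specification, two targets (the service itself is made up)
webapp = ProcessSpec(
    name="webapp",
    command="/usr/bin/webapp",
    user="webapp",
    environment={"PORT": "8080"},
)
print(to_systemd_unit(webapp))
print(to_supervisord_program(webapp))
```

The real abstraction layer has to deal with many more properties (daemonizing versus foreground processes, PID files, dependencies, and so on), which is where most of the complexity comes from.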

After completing the abstraction layer, I also discovered that a sufficiently high-level deployment specification of running processes could also target other kinds of deployment solutions.

I have developed two additional backends for the Nix process management framework: one that uses Docker and another using Disnix. Both solutions are technically not qualified as process managers, but they can still be used as such by only using a limited set of features of these tools.

To be able to develop the Docker backend, I needed to dive deep into the underlying concepts of Docker. I wrote a blog post about the relevant Docker deployment concepts, and also gave a presentation about it at Mendix.

While implementing more examples, I also realized that to more securely run long-running services, they typically need to run as unprivileged users. To get predictable results, these unprivileged users require stable user IDs and group IDs.

Several years ago, I worked on a port assigner tool that could assign unique TCP/UDP port numbers to services, so that multiple instances can co-exist.

I have extended the port assigner tool to assign arbitrary numeric IDs to generically solve this problem. It turns out that implementing this tool was much more difficult than expected -- the Dynamic Disnix toolset was originally developed under very high time pressure and had a substantial amount of technical debt.
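The core idea of assigning stable, unique numeric IDs can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not the algorithm used by the Dynamic Disnix toolset; the function name `assign_ids` and its parameters are hypothetical. Services that already received an ID in a previous run keep it, and new services get the first free ID in the configured range:

```python
def assign_ids(services, previous, first_id, last_id):
    """Assign a unique numeric ID from [first_id, last_id] to each service.

    `previous` maps service names to IDs from an earlier run. A service
    keeps its previous ID when that ID is still inside the range and not
    claimed by another service, so that IDs stay stable between runs.
    """
    assignments = {}
    used = set()

    # First pass: retain previous assignments that are still valid
    for name in services:
        old = previous.get(name)
        if old is not None and first_id <= old <= last_id and old not in used:
            assignments[name] = old
            used.add(old)

    # Second pass: give the remaining services the lowest free ID
    candidate = first_id
    for name in services:
        if name in assignments:
            continue
        while candidate in used:
            candidate += 1
        if candidate > last_id:
            raise ValueError("ID range exhausted")
        assignments[name] = candidate
        used.add(candidate)

    return assignments


# The service names and the ID range are made up for this example
first_run = assign_ids(["nginx", "mysql"], {}, 2000, 2002)
second_run = assign_ids(["mysql", "tomcat", "nginx"], first_run, 2000, 2002)
```

In this example, `nginx` and `mysql` receive 2000 and 2001 in the first run and keep them in the second run, while the newly added `tomcat` gets 2002. The same approach works for user IDs, group IDs and port numbers alike, because all of them are just numbers drawn from a range.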

In order to implement the numeric ID assigner tool, I needed to revise the model parsing libraries, which broke the implementations of some of the deployment planner algorithms.

To fix them, I was forced to study how these algorithms worked again. I wrote a blog post about the graph-based deployment planning algorithms and a new implementation that should be more maintainable. Retrospectively, I wish I had done my homework better at the time when I wrote the original implementation.

In September, I gave a talk about the Nix process management framework at NixCon 2020, that was held online.

I pretty much reached all my objectives that I initially set for the Nix process management framework, but there is still some leftover work to bring it to an acceptable usability level -- to be able to more easily add new backends (somebody suggested s6 as an option), the transformation layer needs to be standardized.

Moreover, I still need to develop a test strategy for services so that you can be (reasonably) sure that they work with a variety of process managers and under a variety of conditions (e.g. unprivileged user deployments).

Exposing services as containers in Disnix

Disnix is a Nix-based distributed service deployment tool. Services can basically be any kind of deployment unit whose life-cycle can be managed by a companion tool called Dysnomia.

There is one practical problem though: in order to deploy a service-oriented system with Disnix, it typically requires the presence of already deployed containers (not to be confused with Linux containers), which are environments in which services are managed by another service.

Some examples of container providers and corresponding services are:

The MySQL DBMS (as a container) and multiple hosted MySQL databases (as services)
Apache Tomcat (as a container) and multiple hosted Java web applications (as services)
systemd (as a container) and multiple hosted systemd unit configuration files (as services)

Disnix deploys the services (as described above), but not the containers. These need to be deployed by other means first.

In the past, I have been working on solutions that manage the underlying infrastructure of services as well (I typically used to call this problem domain: infrastructure deployment). For example, NixOps can deploy a network of NixOS machines that also expose container services that can be used by Disnix. It is also possible to deploy the containers as services, in a separate deployment layer managed by Disnix.

When the Nix process management framework became more usable, I wanted to make the deployment of container providers also a more accessible feature. I heavily revised Disnix with a new feature that makes it possible to expose services as container providers, making it possible to deploy both the container services and application services from a single deployment model.

To make this feature work reliably, I was again forced to revise the model transformation pipeline. This time I concluded that the lack of references in the Nix expression language was an impediment.

Another nice benefit of combining the Nix process management framework and Disnix is that you can more easily deploy a heterogeneous system locally.

I have released a new version of Disnix: version 0.10, that provides all these new features.

The Monitoring playground

Besides working on the two major topics shown above, the only other thing I did was a Mendix crafting project in which I developed a monitoring playground, allowing me to locally experiment with alerting scripts, including the visualization and testing.

Some thoughts

From a blogging perspective, I am happy with what I have accomplished this year -- not only have I managed to reach my usual level of productivity again (last year was somewhat disappointing), I also managed to both develop a working Nix process management framework (making it possible to use all kinds of process managers), and use Disnix to deploy both container and application services. Both of these features have been on my wish list for many years.

In the Nix community, having the ability to also use other process managers than systemd, is something we have been discussing already since late 2014.

However, there are also two major things that kept me mentally occupied in the last year.

Open source work

Many blog posts are about the open source work I do. Some of my open source work is done as part of my day time job as a software engineer -- sometimes we can write a new useful feature, make an extension to an existing project that may come in handy, or do something as part of an investigation/learning project.

However, the majority of my open source work is done in my spare time -- in many cases, my motivation is not as altruistic as people may think: typically I need something to solve my own problems or there is some technical concept that I would like to explore. However, I still do a substantial amount of work to help other people or for the "greater good".

Open source projects are typically quite satisfying to work on, but they also have negative aspects (typically the negative aspects are negligible in the early stages of a project). Sadly, as projects get more popular and gain more exposure, the negativity attached to them also grows.

For example, although I got quite a few positive reactions on my work on the Nix process management framework (especially at NixCon 2020), I know that not everybody is or will be happy about it.

I have worked with people in the past, who consider this kind of work a complete waste of time -- in their opinion, we already have Kubernetes that has already solved all relevant service management problems (some people even think it is a solution to all problems).

I have to admit that, while Kubernetes can be used to solve similar kinds of problems (and what is not supported as a first-class feature can still be scripted in ad-hoc ways), there is still much to think about:

As explained in my blog post about Docker concepts, the Nix store typically supports much more efficient sharing of common dependencies between packages than layered Docker images, resulting in much lower disk space consumption and RAM consumption of processes that have common dependencies.
Docker containers support the deployment of so-called microservices because of a common denominator: processes. Almost all modern operating systems and programming languages have a notion of processes.

As a consequence, lots of systems nowadays typically get constructed in such a way that they can be easily decomposed into processes (translating to container instances), imposing substantial overhead on each process instance (because these containers typically need to embed common sub services).

Services can also be more efficiently deployed (in terms of storage and RAM usage) as units managed by a common runtime (e.g. multiple Java web applications managed by Apache Tomcat or multiple PHP applications managed by the Apache HTTP server).

The latter form of reuse is now slowly disappearing, because it does not fit nicely in a container model. In Disnix, this form of reuse is a first-class concept.
Microservices managed by Docker (somewhat) support technology diversity, because of the fact that all major programming languages support the concept of processes.

However, one particular kind of technology that you cannot choose is the operating system -- Docker/Kubernetes relies on non-standardized Linux-only concepts.

I have also been thinking about the option to pick your operating system as well: you need security, then pick: OpenBSD, you want performance, then pick: Linux etc. The Nix process management framework allows you to also target process managers on different operating systems than Linux, such as BSD rc scripts and Apple's launchd.

I personally believe that these goals are still important, and that keeps me motivated to work on it.

Furthermore, I also believe that it is important to have multiple implementations of tools that solve the same or similar kinds of problems -- in the open source world, there are a lot of "battles" between communities about which technology should be the standard for a certain problem.

My favourite example of such a battle is the system's process manager -- many Linux distributions nowadays have adopted systemd, but this is not without any controversy, such as in the Debian project.

It took them many years to come to the decision to adopt it, and still there are people who want to discuss "init system diversity". Likewise, there are people who find the systemd-adoption decision unacceptable, and have forked Debian into Devuan, providing a Debian-based distribution without systemd.

With the Nix process management framework the fact that systemd exists (and may not be everybody's first choice) is not a big deal -- you can actually switch to other solutions, if desired. A battle between service managers is not required. A sufficiently high-level specification of a well understood problem allows you to target multiple solutions.

Another problem I face is that these two projects are not the only projects I have been working on or maintain. There are many other projects I have been working on in the past.

Sadly, I am also a very bad multitasker. If there are problems reported with my other projects, and the fix is straightforward, or there is a straightforward pull request, then it is typically no big deal to respond.

However, I also learned that for some of the problems other people face, there is no quick fix. Sometimes I get pull requests that partially solve a problem, or in other cases fix a specific problem but break other features. These pull requests cannot always be directly accepted and also need a substantial amount of my time for reviewing.

For certain kinds of reported problems, I need to work on a fundamental revision that requires a substantial amount of development effort -- however, it is impossible to pick up such a task while working on another "major project".

Alternatively, I need to make the decision to abandon what I am currently working on and make the switch. However, this option also does not have my preference because I know it will significantly delay my original goal.

I have noticed that lots of people get dissatisfied and frustrated, including myself. Moreover, I also consider it a bad thing to feel pressure on the things I am working on in my spare time.

So what to do about it? Maybe I can write a separate blog post on this subject.

Anyway, I was not planning to abandon or stop anything. Eventually, I will pick up these other problems as well -- my strategy for now, is to do it when I am ready. People simply have to wait (so if you are reading this and waiting for something: yes, I will pick it up eventually, just be patient).

The COVID-19 crisis

The other subject that kept me (and pretty much everybody in the world) busy is the COVID-19 crisis.

I still remember the beginning of 2020 -- for me personally, it started out very well. I visited some friends that I had not seen in a long time, and then FOSDEM came, the Free and Open source Software Developers' European Meeting.

Already in January, I heard on the news about this new virus that was rapidly spreading in the Wuhan region. At that time, nobody in the Netherlands (or in Europe) was really worried yet. Even to questions, such as: "what will happen when it reaches Europe?", people typically responded with: "ah yes, well influenza has an impact on people too, it will not be worse!".

A few weeks later, it started to spread to countries close to Europe. The first problematic country I heard about was Iran, and a couple of weeks later it reached Italy. In Italy, it spread so rapidly that within only a few weeks, the intensive care capacity was completely drained, forcing medical personnel to make choices who could be helped and who could not.

By then, it sounded pretty serious to me. Furthermore, I was already quite sure that it was only a matter of time before it would reach the Netherlands. And indeed, at the end of February, the first COVID-19 case was reported. Apparently this person contracted the virus in Italy.

Then the spreading went quickly -- every day, more and more COVID-19 cases were reported and this number grew exponentially. Similar to other countries, we also slowly ran into capacity problems in hospitals (materials, equipment, personnel, intensive care capacity etc.). In particular, the intensive care capacity reached a very critical level. Fortunately, there were hospitals in Germany willing to help us out.

In March, a country-wide lockdown was announced -- basically all group activities were forbidden, schools and non-essential shops were closed, and everybody who is capable of working from home should do so. As a consequence, since March, I have been permanently working from home.

As with pretty much everybody in the world, COVID-19 has negative consequences for me as well. Fortunately, I do not have much to complain about -- I did not become unemployed, I did not get sick, and also nobody in my direct neighbourhood ran into any serious problems.

The biggest implication of the COVID-19 pandemic for me is social contacts -- despite the lockdown I still regularly meet up with family and frequent acquaintances, but I have barely met any new people. For example, at Mendix, I typically came in contact with all kinds of people in the company, especially those that do not work in the same team.

Moreover, I also learned that quite a few of my contacts got isolated because of all group activities that were canceled -- for example I did not have any music rehearsals in a while, causing me not to see or speak to any of my friends there.

Same thing with conferences and meet ups -- because most of them were canceled or turned into online events, it is very difficult to have good interactions with new people.

I also did not do any traveling -- my summer holiday was basically a staycation. Fortunately, in the summer, we managed to minimize the number of infections, making it possible to open up public places. I visited some touristic places in the Netherlands that are normally crowded with people from abroad. That by itself was quite interesting -- I normally tend to neglect national touristic sites.

Although the COVID-19 pandemic brought all kinds of negative things, there were also a couple of things that I consider a good thing:

At Mendix, we have an open office space that typically tends to be very crowded and noisy. It is not that I cannot work in such an environment, but I also realize that I do appreciate silence, especially for programming tasks that require concentration. At home, it is quiet, I have much fewer distractions and I also typically feel much less tired after a busy work day.
I also typically used to neglect home improvements a lot. The COVID-19 crisis helped me to finally prioritize some non-urgent home improvements tasks -- for example, on the attic, where my musical instruments are stored, I finally took the time to organize everything in such a way that I can rehearse conveniently.
Besides the fact that rehearsals and concerts were cancelled, I actually practiced a lot -- I even studied many advanced solo pieces that I have not looked at in years. Playing music became a standard activity between my programming tasks, to clear my mind. Normally, I would use this time to talk to people at the coffee machine in the office.
During busy times I also used to tend to neglect house keeping tasks a lot. I still remember (many years ago) when I just moved into my first house, doing the dishes was already a problem (I had no dish washer at that time). When working from home, it is not a problem to keep everything tidy.
It is also much easier to maintain healthy daily habits. In the first lockdown (that was in spring), cycling/walking/running was a daily routine that I could maintain with ease.

In the Netherlands, we have managed to overcome the first lockdown in just a matter of weeks by social distancing. Sadly, after the restrictions were relaxed we got sloppy and at the end of the summer the infection rate started to grow. We also ran into all kinds of problems to mitigate the infections -- limited test capacity, people who got tired of all the countermeasures not following the rules, illegal parties etc.

For a couple of weeks, we have been in our second lockdown with a comparable level of strictness -- again, the schools and non-essential shops are closed etc. The second lockdown feels a lot worse than the first -- now it is in the winter, people are no longer motivated (the number of people that revolt in the Netherlands has grown substantially, including people spreading news that everything is a hoax and/or one big scheme organized by left-wing politicians) and it is already taking much longer than the first.

Fortunately, there is a tiny light at the end of the tunnel. In Europe, one vaccine (the Pfizer vaccine) has been approved and more submissions are pending (with good results). By Monday, the authorities will start to vaccinate people in the Netherlands.

If we can keep the infection rates and the mutations under control (such as the mutation that appeared in England) then we will eventually build up the required group immunity to finally get the situation under control (this probably is going to take many more months, but at least it is a start).

Conclusion

This elaborate reflection blog post (that is considerably longer than all my previous yearly reflections combined) reflects over 2020, which is probably a year that will not go unnoticed in the history books.

I hope everybody remains in good health and stays motivated to do what is needed to get the virus under control.

Moreover, when the crisis is over, I also hope we can retain the positive things learned in this crisis, such as making it more a habit to allow people to work (at least partially) from home. The open-source complaint in this blog post is just a minor inconvenience compared to the COVID-19 crisis and the impact that it has on many people in the world.

The final thing I would like to say is:

HAPPY NEW YEAR!!!!!!!!!!!!

Wednesday, December 30, 2020

Blog reflection over the last decade

Today it is exactly ten years ago that I started this blog. As with previous years, I will do a reflection, but this time it will be over the last decade.

What was holding me back

The idea to have my own blog was already there for a long time. I always thought it was an interesting medium. For example, I considered it a good instrument to express my thoughts on the technical work I do, and in particular, I liked having the ability to get feedback.

The main reason why it still took me so long to start was because I never considered it "the right time". For example, I was already close to starting a blog 15 years ago (while I was still early in my studies) when web development was still one of my main technical interests, but still refrained from doing so.

At that time I did some interesting "discoveries" and I had some random ideas I could elaborate about, but these ideas never materialized enough so that I could write a story about it.

Moreover, I also did not feel comfortable enough yet to express myself, because I did not have much writing experience in English. Retrospectively, I learned that there is never a right time for having a blog; I should just start.

Entering the research domain

A couple of years later, while I was working on my master's thesis, I made the decision to go for a PhD degree, because I was genuinely interested in the research domain of my master's thesis: software deployment, mostly because of my prior experience in industry and building Linux distributions from scratch.

Even before starting my PhD, I already knew that writing is an important component in research -- as a researcher, you have to regularly report about your work by means of research papers that typically need to be anonymously peer reviewed.

In most scientific disciplines, academic papers are published in journals. In the computer science domain, it is more common to publish papers in conference proceedings.

Only a certain percentage of paper submissions that are considered good quality (as judged by the peer reviews) are accepted for publication. Rejection of a paper typically requires you to make revisions and submit that paper to a different conference.

For top general conferences in the software engineering domain the acceptance rate is typically lower than 20% (this ratio used to be even lower, close to 15%).

In my PhD, I had a very quick publication start -- in the first month, a paper about atomic upgrades for distributed systems was accepted that covered an important topic of my master's thesis.

Roughly half a year later, my co-workers and I published a research paper about the objectives of the Pull Deployment of Services (PDS) research project (in which my research was one of the sub topics) funded by NWO/Jacquard.

Although I had a very good start, I slowly started to learn (the hard way) that you cannot simply publish research papers about all the work you do -- as a matter of fact, it only represents a modest sub set of your daily work.

To write a good research paper, it takes quite a bit of time and effort to decide about the topic (including the paper's title) and to get all the details right. I had all kinds of interesting ideas but many of these ideas were not considered novel -- they were interesting engineering efforts but they did not add interesting new (significant) scientific knowledge.

Moreover, in a research paper, you also need to put your contribution in context (e.g. explain/show how it compares to similar work and how it expands existing knowledge), and provide validation (this can be a proof, but in most cases you evaluate in what degree your contribution meets its claims, for example, by providing empirical data).

After instant acceptance of the first two papers, things did not work out that smoothly anymore. I had several paper rejections in a row -- one paper was badly rejected because I did not put it into the right context (for example, I ignored some important related work) and I did not make my contribution very clear (I basically left it open to the interpretation of the reader, which is a bad thing).

Fortunately, I learned a lot from this rejection. The reviewers even suggested an alternative conference to which I could submit my revised paper. After addressing the reviewers' criticisms, the paper got accepted.

Another paper was rejected twice in a row for IMO very weak reasons. Most notably, it turned out that many reviewers believed that the subject was not really software engineering related (which is strange, because software deployment is explicitly listed as one of the subjects in the conference's call for papers).

When I explained this peculiarity to Eelco Visser (one of my supervisors and co-promotor), he suggested that I should have more frequent interaction with the scientific community and write about the subject on my blog. Software deployment is generally a neglected subject in the software engineering research community.

Eventually, we managed to publish the problematic papers (one is about Disnix, the tool implementation of my research, and the other about the testing aspect of the previously rejected paper).

After that problematic period, I have managed to publish two more papers that got instantly accepted bringing me to all kinds of interesting conferences.

The decision to start my blog

Although having a 3 paper acceptance streak and traveling to the conferences to present them felt nice for a while, I still was not too happy.

In late 2010, one day before New Year's Eve (I typically reflect over things in the past year at New Year's Eve), I realized that research papers alone are just a very narrow representation of the work that I do as a researcher (although the number of papers and their impact are typically used as the only metric to judge the performance of a researcher).

In addition to getting research papers accepted and doing the required writing, there is much more that the work of an academic researcher (and in particular the software engineering domain) is about:

Research in software engineering is about constructing tools. For example, the paper 'Research Paradigms in Computer Science' by Peter Wegner from Brown University says:

Research in engineering is directed towards the efficient accomplishment of specific tasks and towards the development of tools that will enable classes of tasks to be accomplished more efficiently.

In addition to the problems that tools try to solve or optimize, the construction of these tools is typically also very challenging, similar to conventional software development projects.

Although the construction aspects of tools may not always be novel and too detailed for a research paper (that typically has a page limit), it is definitely useful to work towards a good and stable design and implementation. Writing about these aspects can be very useful for yourself, your colleagues and peers in the field.

Moreover, having a tool that is usable and works also mattered to me and to the people in my research group. For example, my deployment research was built on top of the Nix package manager, that in addition to research, was also used to solve our internal deployment problems.
I did not completely start all the development work of my tooling from scratch -- I was building my deployment tooling on top of the Nix package manager that was both a research project, and an open source project (more accurately called a community project) with a small group of external contributors.

(As a sidenote: the Nix package manager was started by Eelco Dolstra who was a Postdoc in the same research project and one of my university supervisors).

I considered my blog a good instrument to communicate with the Nix community about ideas and implementation aspects.
Research is also about having frequent interaction with your peers that work for different universities, companies and/or research institutes.

A research paper is useful to get feedback, but at the same time, it is also quite an inaccessible medium -- people can obtain a copy from publishers (typically behind a paywall) or from your personal homepage and communicate by e-mail, but the barrier is typically high.
I was also frequently in touch with software engineering practitioners, such as former study friends, open source communities and people from our research project's industry partner: Philips Healthcare.

I regularly received all kinds of interesting questions related to the practical aspects of my work. For example, how to apply our research tools to industry problems or how our research tools compare to conventional tools.

Not all of these questions can be transformed into research papers, but were definitely useful to investigate and write about.
Being in academia is more than just working on publications. You also travel to conferences, get involved in all kinds of different (and sometimes related) research subjects of your colleagues and peers and you may also help in teaching. These subjects are also worth writing about.

Because of the above reasons, I was finally convinced that the time was right to start my blog.

The beginning: catching up with my research papers

Since I was already working on my PhD research for more than 2 years, there was still a lot of catching up I had to do. It did not make sense to just randomly start writing about something technical or research related. Basically, I wanted all information on my blog "to fit together".

For the first half year, my blog was basically about writing things down I had already done and published about.

After my blog announcement, I started explaining what the Pull Deployment of Services research project is about, then explaining the Nix package manager that serves as the fundamental basis of all the deployment tooling that I was developing, followed by NixOS, a Linux distribution that is entirely managed by the Nix package manager that can be deployed from a single declarative specification.

The next blog post was about declarative deployment and testing with NixOS. It was used as an ingredient for a research paper that already got published, and a talk with the same title for FOSDEM: the Free and Open source Software Developers' European Meeting in Brussels. Writing about the subject on my blog was a useful preparation session for my talk.

After giving my talk at FOSDEM, there was more catching up work to do. After explaining the basic Nix concepts, I could finally elaborate about Disnix, the tool I have been developing as part of my research that uses Nix to extend deployment to the domain of service-oriented systems.

After writing about the self-adaptive deployment framework built on top of Disnix (I had submitted my paper at the beginning of that year, and it got accepted shortly before writing the corresponding blog post), I was basically up-to-date with all research aspects.

Using my blog for research

After my catch up phase was completed, I could finally start writing about things that were not directly related to any research papers already written in the past.

One of the things I have been struggling with for a while was making our tools work with .NET technology. The Nix package manager (and sister projects, such as Disnix) were primarily developed for UNIX-like operating systems (most notably Linux) and technologies that run on these operating systems.

Our industry partner, Philips Healthcare, mostly uses Microsoft technologies in their development stack: .NET as a runtime, C# for coding, SQL Server for storage, and IIS as web server.

At that time, .NET was heavily tied to the Windows eco-system (Mono already existed and provided a somewhat compatible runtime for operating systems other than Windows, but it did not provide compatible implementations of all libraries to work with the Philips platform).

With some small modifications, I could use Nix on Cygwin to build .NET projects. However, running .NET applications that rely on shared libraries (called library assemblies in .NET terminology) was still a challenge. I could only provide a number of suboptimal solutions, none of which was ideal.

I wrote about it on my blog, and during my trip to ICSE 2011 in Hawaii I learned from a discussion with a co-attendee that you could also use an event listener that triggers when a library assembly is missing. The reflection API can be used in this event handler to load these missing assemblies. This efficiently solved my dependency problem, making it possible to use both Nix and Disnix to deploy .NET services on Windows without any serious obstacles.

I also managed to discuss one of my biggest frustrations in the research community: the fact that software deployment is a neglected subject. Thanks to spreading the blog post on Twitter (that in turn got retweeted by all kinds of people in the research community) it attracted quite a few visitors and a large number of helpful comments. I even got in touch with a company that develops a software deployment automation solution as their main product.

Another investigation that I did as part of my blog (without publishing in mind) was addressing a common criticism from various communities, such as the Debian community, that Nix would not qualify as a viable package management solution because it does not comply with the Filesystem Hierarchy Standard (FHS).

I also did a comparison with the deployment properties of GoboLinux, another Linux distribution that deliberately deviates from the FHS to show that a different filesystem organisation has clear benefits for making deployments more reliable and reproducible. The GoboLinux blog post appeared on Reddit (both the NixOS and Linux channels) and attracted quite a few visitors.

From these practical investigations I wrote a blog post that draws some general conclusions.

Reaching the end of my PhD research

After an interesting year, both from a research and blogging perspective, I was reaching the final year of my PhD research (in the Netherlands, a contract of a PhD student is typically only valid for 4 years).

I had already slowly started writing my PhD thesis, but there was still some unfinished business. There were four (!!!) more research ideas that I wanted to publish about (which was, in retrospect, a very overambitious goal).

One of these papers was a collaboration project in which we combined our knowledge about software deployment and construction with license compliance engineering to determine which source files are actually used in a binary so that we could detect whether it meets the terms and conditions of free and open-source licenses.

Although our contribution looked great and we were able to detect a compliance issue in FFmpeg, a widely used open source project, the paper was rejected twice in a row. The second time the reviews were really vague and not helpful at all. One of my co-authors called the reviewers extremely incompetent.

After the second rejection, I was (sort of) done with it and extremely disappointed. I did not even want to revise it and submit it anywhere else. Nonetheless, I have published the paper as a technical report, reported about it on my blog, and added it as a chapter to my PhD thesis.

(As a sidenote: more than 2 years later, we made another attempt to resurrect the paper. The revisions were quite a bit of work, but the third version finally got accepted at ASE 2014: one of the top general conferences in the software engineering domain.

This was a happy moment for me -- I was so disappointed about the process, and I was happy to see that there were people who could motivate and convince me that we should not give up).

Another research idea was formalizing infrastructure deployment. Sadly, the idea was not really considered novel -- it was mostly just an incremental improvement over our earlier work. As a result, I got two paper rejections in a row. After the second rejection, I abandoned the idea of publishing about it, but I still wrote a chapter about it in my PhD thesis.

All the above rejections (and the corresponding reviews) really started to negatively impact my motivation. I wrote two blog posts about my observations: one blog post was about a common reason for rejecting a paper: the complaint that a contribution is engineering, but not science (which is quite weird for research in software engineering). Another blog post was about the difficulties in connecting academic research with software engineering practice. From my experiences thus far, I concluded that there is a huge gap between the two.

Fortunately, I still managed to gather enough energy to finish my third idea. I already had a proof-of-concept implementation for managing state of services deployed by Disnix for a while. By pulling a few all-nighters, I managed to write a research paper (all by myself) and submitted it to HotSWUp 2012. That paper got instantly accepted, which was a good boost for my motivation.

In the last few months, the only thing I could basically do was finish up my PhD thesis. To still keep my blog somewhat active, I wrote a number of posts about my conference experiences.

Although I already had a very early proof-of-concept implementation, I never managed to finish my fourth research paper idea. This was not a problem for finishing my PhD thesis as I already had enough material to complete it, but I still consider it one of the more interesting research ideas that I never got to finish. As of today, I still have not finished or published about it (neither on my blog nor in a research paper).

Leaving academia, working for industry

A couple of weeks before my contract with the university was about to expire, I finished the first draft of my PhD thesis and submitted it to the reading committee for review.

Although the idea of having an academic research career crossed my mind several times, I ultimately decided that this was not something I wanted to pursue, for a variety of reasons. Most notably, the discrepancy between topics suitable for publishing and things that could be applied in practice was a major reason.

All that was left was looking for a new job. After an interesting job searching month I joined Conference Compass, a startup company that consisted of fewer than 10 people when I joined.

One of the interesting technical challenges they were facing was setting up a product-line for their mobile conference apps. My past experience with deployment technologies turned out to come in quite handy.

The Nix project did not disappear after all involved people in the PDS project left the university (besides me, Eelco Dolstra (the author of the Nix package manager) and Rob Vermaas also joined an industrial company) -- the project moved to GitHub, increasing its popularity and the number of contributors.

Because the Nix project continued and blogging had so many advantages for me personally, I decided to resume my blog. The only thing that changed is that my blog was no longer in service of a research project, but just a personal means to dive into technical subjects.

Reintroducing Nix to different audiences

Almost at the same time that the Nix project moved to GitHub, the GNU Guix project was announced: GNU Guix is a package manager with similar objectives to the Nix package manager, but with some notable differences too: instead of the Nix expression language, it uses Scheme as a configuration language.

Moreover, the corresponding software distribution, GuixSD, exclusively provides free software.

GNU Guix reuses the Nix daemon, and related components such as the Nix store from the Nix package manager to organize and isolate software packages.

I wrote a comparison blog post that was posted on Reddit and Hacker News, attracting a huge number of visitors. The number of visitors was several orders of magnitude higher than for all the blog posts I had written before that. As of today, this blog post is still in my overall top 10.

One of the things I did in the first month at Conference Compass was explaining the Nix package manager to my colleagues who did not have much system administration experience or knowledge about package managers.

I decided to use a programming language-centered Nix explanation recipe, as opposed to a system administration-centered explanation. In many ways, I consider this explanation recipe the best of the three that I wrote.

This blog post also got posted on Reddit and Hacker News, attracting a huge number of visitors. In only one month, with two blog posts, I attracted more visitors to my blog than with all my previous blog posts combined.

Developing an app building infrastructure

As explained earlier, Conference Compass was looking into developing a product-line for mobile conference apps.

I did some of the work in the open, by using a variety of tools from the Nix project and making contributions to the Nix project.

I have packaged many components of the Android SDK and developed a function abstraction that automatically builds Android APKs. Similarly, I also built a function for iOS apps (that works both with the simulator and real devices), and for Appcelerator Titanium: a JavaScript-based cross-platform framework allowing you to target a variety of mobile platforms including Android and iOS.

In addition to the Nix-based app building infrastructure, I have also described how you can set up Hydra: a Nix-based continuous integration service to automatically build mobile apps and other software projects.

It turns out that in addition to ordinary software projects, Hydra also works well for distributing bleeding edge builds of mobile apps -- for example, you can use your phone or tablet's web browser to automatically download and install any bleeding edge build that you want.

The only thing that was a bit of a challenge was distributing apps to iOS devices with Hydra, but with some workarounds that was also possible.

I have also developed a Node.js package to conveniently integrate custom applications with Hydra.

Finishing up my PhD and defending my thesis

Although I left academia, the transition to industry was actually very gradual -- as explained earlier, while being employed at Conference Compass, I still had to finish and defend my PhD thesis.

Several weeks before my planned defence date, I received feedback from my reading committee about my draft that I finished in my last month at the university. This was a very stressful period -- in addition to making revisions to my PhD thesis, I also had to arrange the printing and the logistics of the ceremony.

I also wrote three more blog posts about my thesis and the defence process: I provided a summary of my PhD thesis as a blog post, I wrote about the defence ceremony, and about my PhD thesis propositions.

Writing thesis propositions is also a tradition in the Netherlands. Earlier that year, my former colleague Felienne Hermans decided to blog and tweet about her PhD thesis propositions, and I did the same thing.

PhD thesis propositions are typically not supposed to have a direct relationship to your PhD thesis, but they should be defendable. In addition to your thesis, the committee members are also allowed to ask you questions about your propositions.

The blog post about my PhD thesis propositions (as of today) still regularly attracts visitors. The number of visitors of this blog post far exceeds that of the summary blog post about my PhD thesis.

In addition to my PhD thesis, there were more interesting post-academia research events: a journal paper submission finally got officially published (4 years after submitting the first draft!) and we managed to get our paper about discovering license compliance inconsistencies, which had previously been rejected twice, accepted at ASE 2014.

Learning Node.js and more about JavaScript

In addition to the app building infrastructure at Conference Compass, I have also spent considerable amounts of time learning things about Node.js and its underlying concept: the asynchronous event loop. Although I already had some JavaScript programming experience, all my knowledge thus far was limited to the web browser.

I learned about all kinds of new concepts, such as callbacks (and function-level scoping), promises, asynchronous programming (in general) and mixing callbacks with promises. Moreover, I also learned that (despite my earlier experiences in the concepts of programming languages course) working with prototypes in JavaScript was more difficult than expected. I have decided to address my earlier shortcomings in my teachings with a blog post.
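To give an impression of why prototypes can be surprising, here is a minimal TypeScript sketch of prototype-based delegation. It is only an illustration and not taken from the blog post mentioned above; the names `rectanglePrototype` and `createRectangle` are made up for this example.

```typescript
// A plain object that serves as the prototype for all rectangles
const rectanglePrototype = {
    width: 0,
    height: 0,
    area(): number {
        return this.width * this.height;
    }
};

// Hypothetical helper: creates an object that delegates to rectanglePrototype
function createRectangle(width: number, height: number): typeof rectanglePrototype {
    const rectangle = Object.create(rectanglePrototype) as typeof rectanglePrototype;
    // These become own properties that shadow the defaults on the prototype
    rectangle.width = width;
    rectangle.height = height;
    return rectangle;
}

const rectangle = createRectangle(2, 3);

// area() is not stored on the rectangle itself, but looked up via its prototype
const hasOwnArea = Object.prototype.hasOwnProperty.call(rectangle, "area");
const computedArea = rectangle.area();
```

The object itself only stores `width` and `height`. The `area` function is found by following the link to the prototype, which also means that changing the prototype afterwards affects all objects that delegate to it.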

With Titanium (the cross-platform mobile app development framework that uses JavaScript as an implementation language), beyond regular development work, I investigated how we can port a Node.js-based XMPP library to Titanium and how we can separate concerns well enough to make a simple, reliable chat application.

Building a service management platform and implementing major Disnix improvements

At Conference Compass, somewhere in the middle of 2013, we decided to shift away from a single monolithic backend application for all our apps, to a more modular approach in which each app has its own backend and its own storage.

After a couple of brief experiments with Heroku, we shifted to a Nix-based approach in mid 2014. NixOps was used to automatically deploy virtual machines in the cloud (using Amazon's EC2 service), and Disnix became responsible for deploying all services to these virtual machines.

In the Nix community, there was quite a bit of confusion about these two tools, because both use the Nix package manager and are designed for distributed deployment. I wrote a blog post to explain in what ways they are similar and different.

Over the course of 2015, most of my company work was concentrated on the service management platform. In addition to automating the deployment of all machines and services, I also implemented the following functionality:

Backup support (using the experimental state management facilities of Dysnomia)
Monitoring support with Datadog
A general configuration management framework to organize and document all relevant configuration items
Various optimizations: target-specific services that do not require unnecessary reconfigurations and on-demand activation and self-termination of services, to save RAM.

In late 2015, the first NixCon conference was organized, in which I gave a presentation about Disnix and explained how it can be used for the deployment of microservices. I received all kinds of useful feedback that I implemented in the first half of 2016:

Most notably, I changed the internal model of Disnix to also work with the notion of containers (environments that manage services), a feature that Dysnomia already supported, but could not be directly controlled in Disnix.
You can also manage multiple instances of container services on a single machine.

Over time, I did many more interesting Disnix developments:

I made modifications to use it as a remote package deployer
I worked on an abstraction layer to more easily deal with concurrency
I built a tool that can reconstruct the Disnix deployment models from a network of already deployed services.
I built a tool that helps diagnose problems.
I created more public examples (based on Chord and my own web framework).

Furthermore, the Dynamic Disnix framework (an extension toolset that I developed for a research paper many years ago), also got all kinds of updates. For example, it was extended to automatically assign TCP/UDP port numbers and to work with state migrations.
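As a rough illustration of what automatic port assignment involves, here is a minimal TypeScript sketch. It is not the algorithm that Dynamic Disnix uses: the function `assignPorts`, its parameters, and the choice to retain earlier assignments are all assumptions made for this example.

```typescript
// Hypothetical sketch: assign each service a unique port from a fixed range,
// keeping valid assignments that were made in an earlier run.
function assignPorts(
    services: string[],
    previous: Record<string, number>,
    minPort: number,
    maxPort: number
): Record<string, number> {
    const assignments: Record<string, number> = {};
    const used = new Set<number>();

    // Retain earlier assignments for services that still exist
    for (const service of services) {
        const port = previous[service];
        if (port !== undefined && port >= minPort && port <= maxPort && !used.has(port)) {
            assignments[service] = port;
            used.add(port);
        }
    }

    // Give the remaining services the lowest free port in the range
    let candidate = minPort;
    for (const service of services) {
        if (assignments[service] !== undefined) {
            continue;
        }
        while (used.has(candidate)) {
            candidate++;
        }
        if (candidate > maxPort) {
            throw new Error("No free ports left in the range");
        }
        assignments[service] = candidate;
        used.add(candidate);
    }

    return assignments;
}

// Example: service "b" keeps its earlier port, the others get free ports
const ports = assignPorts(["a", "b", "c"], { b: 8001 }, 8000, 8002);
```

Retaining earlier assignments is a design choice in this sketch: it prevents services that were already deployed from having to be reconfigured just because a new service was added.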

While working on the service management platform, five new Disnix versions were released (the first was 0.3, the last 0.8). I wrote a blog post for the 0.5 release that explains all previously released versions, including the first two prototype iterations.

Brief return to web technology

As explained in the introduction, I already had the idea to start my blog while I was still actively doing web development.

At some point I needed to make some updates to web applications that I had developed for my voluntary work that still use pieces of my old custom web framework.

I had already released some pieces of it (most notably the layout manager) on my GitHub page as a side project, but at some point I also decided to release the remainder of the components.

I also wrote a blog post about my struggles composing a decent layout and some pointers on "rational" layout decisions.

Working on Nix generators

In addition to JavaScript development at Conference Compass, I was also using Nix-related tools for automating deployments of Node.js projects.

Eventually, I created node2nix to make deployments with the NPM package manager possible in Nix expressions (at the time this was already possible with npm2nix, but node2nix was developed to address important shortcomings of npm2nix, such as circular dependencies).

Over time, I faced many more challenges that were node2nix/NPM related:

To make NPM more compatible on Windows, the NPM authors introduced a "flattening strategy" that required a substantial rewrite of node2nix.
Simulating global NPM package installations in Nix expressions.
Substituting impure version specifiers that may trigger accidental remote network requests.
Bypassing NPM's content-addressable cache for local installations (that is fundamentally incompatible with how NPM installations are done in Nix expressions).

When I released my custom web framework, I also built a similar generator for PHP: I created composer2nix to allow PHP Composer projects to be deployed with the Nix package manager.

In addition to building these generators, I also heavily invested in working towards identifying common concepts and implementation decisions for both node2nix and composer2nix.

Both tools use an internal DSL to generate Nix expressions (NiJS for JavaScript, and PNDP for PHP) as opposed to using strings.

Both tools implement a domain model (that is close to NPM and composer concepts) that gets translated to an object structure in the Nix expression language with a generic translation strategy.
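To illustrate why generating Nix expressions from an object structure is more robust than concatenating strings, here is a minimal TypeScript sketch. It is not the NiJS or PNDP API: the `NixValue` type and the `toNix` and `escapeNixString` functions are hypothetical, and they only cover strings, booleans, null, lists and attribute sets.

```typescript
// Hypothetical value model: a small subset of what can be expressed in Nix
type NixValue = string | boolean | null | NixValue[] | { [key: string]: NixValue };

// Escape characters that have a special meaning inside a double-quoted Nix string
function escapeNixString(value: string): string {
    return value
        .replace(/\\/g, () => "\\\\")
        .replace(/"/g, () => '\\"')
        .replace(/\$\{/g, () => "\\${");
}

// Recursively translate a value to the text of a Nix expression
function toNix(value: NixValue): string {
    if (value === null) {
        return "null";
    }
    if (typeof value === "string") {
        return '"' + escapeNixString(value) + '"';
    }
    if (typeof value === "boolean") {
        return String(value);
    }
    if (Array.isArray(value)) {
        return "[ " + value.map((element) => toNix(element)).join(" ") + " ]";
    }
    const attributes = Object.keys(value).map(
        (key) => '"' + escapeNixString(key) + '" = ' + toNix(value[key]) + ";"
    );
    return "{ " + attributes.join(" ") + " }";
}

// A made-up package description, translated in one generic step
const expression = toNix({ name: "hello", dependencies: ["a", "b"], private: false });
```

Because escaping and nesting are handled in one place, a domain model only needs to produce a plain object structure. With string concatenation, every place that emits a value would have to get the quoting right on its own.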

Joining Mendix, working on Nix concepts and improvements

Slightly over 2 years ago I joined Mendix, a company that develops a low-code application development platform and related services.

While I was learning about Mendix, I wrote a blog post that explains its basic concepts.

In addition, as a crafting project, I also automated the deployment of Mendix applications with Nix technologies (and even wrote about it on the public Mendix blog).

While learning about the Mendix cloud deployment platform, I also got heavily involved in documenting its architecture. I wrote a blog post about my practices (the notation that I used was inspired by the diagrams that I generate with Disnix). I even implemented some of these ideas in the Dynamic Disnix toolset.

When I had just joined Mendix, I was mostly learning about the company and their development stack. In my spare time, I made quite a few random Nix contributions:

As a personal learning exercise and attempt to make the stdenv.mkDerivation function abstraction in Nix more understandable, I wrote layered build function abstractions.
I also extended the lessons for building these abstractions to automate the deployment of SDKs with Nix, most notably the Android SDK.
I also worked on automating the process of patching prebuilt ELF binaries so that these programs can be conveniently deployed by Nix. Most notably, it came in handy for the Android SDK.

Furthermore, I made some structural Disnix improvements as well:

I wrote a data exchange library to more reliably consume Disnix deployment models (that are generated by Nix expressions), that also provides much better error reporting.
I revised the Disnix model transformation pipeline to improve maintainability. In addition, it also provides more configuration properties and a new model: the packages model.

Side projects

In addition to all the major themes above, there are also many in between projects and blog posts about all kinds of random subjects.

For example, one of my long-running side projects is the IFF file format experiments project (a container file format commonly used on the Commodore Amiga) that I already started in the middle of my PhD.

In addition to the viewer, I also developed a hacky Nix function to build software projects on AmigaOS, explained how to emulate the Amiga graphics modes, ported the project to SDL 2.0, and to Visual Studio so that it could run on Windows.

I also wrote many general Nix-related blog posts between major projects.

In addition, I covered aspects of developer culture, such as my experiences with Agile software development and Scrum, and developer motivation.

Some thoughts

In this blog post, I have explained my motivation for starting my blog 10 years ago, and covered all the major projects I have been working on, including most of the blog posts that I have written.

If you are a PhD student or a more seasoned researcher, then I would definitely encourage you to start a blog -- it gave me the following benefits:

It makes your job much more interesting. All aspects of your research and teaching get attention, not just the papers, which typically only reflect a modest subset of your work.
It is a good and more accessible means to get frequent interaction with peers, practitioners, and outsiders who might have an interest in your work.
It improves your writing skills, which is also useful for writing papers.
It helps you to structure your work, by working on focused goals one at a time. You can use some of these pieces as ingredients for a research paper and/or your PhD thesis.
It may attract more visitors than research papers.

About the last benefit: in academia, there are all kinds of metrics to measure the impact of a researcher, such as the g-index and the h-index. These metrics are sometimes taken very seriously, for example, by organizations that decide whether you can get a research grant or not.

To give you a comparison: my most "popular" research paper titled: "Software deployment in a dynamic cloud: From device to service orientation in a hospital environment" was only downloaded (at the time of writing this blog post) 625 times from the ACM digital library and 240 times from IEEE Xplore. According to Google Scholar, it got 28 citations.

My most popular blog post (that I wrote as an ingredient for my PhD research) is: On Nix, NixOS and the Filesystem Hierarchy Standard (FHS) that attracted 5654 views, which is several times higher than the download counts of my most popular research paper. In addition, I wrote several more research-related blog posts that got a comparable number of views, such as the blog post about my PhD thesis propositions.

After completing my PhD research, I wrote blog posts that attracted even several orders of magnitude more visitors than the two blog posts mentioned above.

(As a sidenote: I personally am not a big believer in the relevance of these numbers. What matters to me is the quality of my work, not quantity).

Regularly writing for yourself as part of your job is not a habit that is unique to me. For example, the famous computer scientist Edsger Dijkstra wrote more than 1300 manuscripts (called EWDs) about topics that he considered important, without publishing in mind.

In EWD 1000, he says:

If there is one "scientific" discovery I am proud of, it is the discovery of the habit of writing without publication in mind. I experience it as a liberating habit: without it, doing the work becomes one thing and writing it down becomes another one, which is often viewed as an unpleasant burden. When working and writing have merged, that burden has been taken away.

If you feel hesitant to start your blog, he says the following about a writer's block:

I only freed myself from that writer's block by the conscious decision not to write for a target audience but to write primarily to please myself.

For software engineering practitioners (which I effectively became after leaving academia) a blog has benefits too:

I consider good writing skills important for practitioners as well, for example to write specifications, API documentation, other technical documentation and end-user documentation. A blog helps you develop them.
Structuring your thoughts and work is also useful for software development projects, in particular free and open source projects.
It is also a good instrument to get in touch with development and open source communities. In addition to the Nix community, I also got a bit of attention in the Titanium community (with my XMPP library porting project), the JavaScript community (for example, with my blog post about prototypes) and more recently: the InfluxData community (with my monitoring playground project).

Concluding remarks

In this blog post, I covered most of my blog posts written in the last decade, but I did not elaborate much about 2020. Since 2020 is a year that will definitely not go unnoticed in the history books, I will write (as an exception) an annual reflection over 2020 tomorrow.

Moreover, after browsing over all my blog posts since the beginning of my blog, I also realized that it is a bit hard to find relevant old information.

To alleviate that problem, I have reorganized/standardized all my labels so that you can more easily search on subjects. On my homepage, I have added an overview of all labels that I am currently using.