<h1>13th annual blog reflection</h1>
Today, it is the 13th anniversary of my blog. As usual, this is a nice opportunity to reflect on last year's writings.<br />
<br />
Similar to 2022, 2023 was not a very productive year compared to the years before -- I am still somewhat in a state of recovery because of the pressure that I was exposed to two years ago. Another reason is that a substantial amount of my spare time is spent on voluntary work for the musical society that I am a member of.<br />
<br />
<h2>Web framework</h2>
<br />
I have been maintaining the website for the musical society for quite a few years, and it is one of the last applications that still uses my custom web framework. Thanks to my voluntary work, I made a new feature addition to the layout framework: <a href="https://sandervanderburg.blogspot.com/2023/07/using-site-map-for-generating-dynamic.html">generating dynamic menus by using an HTML representation of a site map as a basis</a>, such as mobile navigation/hamburger menus and dropdown menus.<br />
<br />
I never liked these kinds of menus very much, but they are quite commonly used. In particular, on mobile devices, a web application feels weird if it does not provide a mobile navigation menu.<br />
<br />
The nice thing about using a site map as a basis is that web pages are still able to degrade gracefully -- when using a text-oriented browser or when JavaScript is disabled (JavaScript is mandatory to create an optimal mobile navigation menu), a web site remains usable.<br />
<br />
<h2>Nix development</h2>
<br />
Last year, I explained that I had put my Nix development work on hold. This year, I still did not write any Nix-related blog posts, but I have picked up Nix development work again and I am much more active in the community.<br />
<br />
I visited <a href="https://2023.nixcon.org">NixCon 2023</a> and had a great time. While I was there, I decided to pick up my work on the experimental process management framework from where I left off -- I started writing <a href="https://github.com/NixOS/rfcs/pull/163">RFC 163</a>, which explains its features so that they can be integrated into the main Nixpkgs/NixOS distribution.<br />
<br />
Writing an RFC had already been on my TODO list for two years, and I always had the intention to integrate the good ideas of this framework into Nixpkgs/NixOS so that the community can benefit from them.<br />
<br />
The RFC is still being discussed and we are investigating some of the raised questions and concerns.<br />
<br />
<h2>Research papers</h2>
<br />
I have also been sorting files on my hard drive, something that I commonly do at the end of the year. The interesting thing is that I also ran into the research papers that I have collected over the last sixteen years.<br />
<br />
Since reading papers and maintaining your knowledge is quite important for researchers and not something that is easy to do, <a href="https://sandervanderburg.blogspot.com/2023/12/on-reading-research-papers-and.html">I wrote a blog post about my experiences</a>.<br />
<br />
<h2>Retro computing</h2>
<br />
Another area that I worked on is retro computing. I finally found the time to get all my old 8-bit Commodore machines (a Commodore 64 and a 128) working the way they should. The necessary repairs were made, and I ordered new and replacement peripherals. <a href="https://sandervanderburg.blogspot.com/2023/12/using-my-commodore-64-and-128-in-2023.html">I wrote a blog post that shows how I have been using these 8-bit Commodore machines</a>.<br />
<br />
<h2>Conclusion</h2>
<br />
Next year, I intend to focus more on Nix development. I already have enough ideas that I am working on, so stay tuned!<br />
<br />
The last thing I would like to say is:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgfJ6_b6zJeBVbfKyfmB4r_gL_aa_BeNPKkK5aVtwLv4HJzsJtdAk_VJ-V3zfnPPrfJX3iG_WWp0m0HjSTppZhDIW4uw5nK_PIelE795rgU2GkA1UEE1C9V6Clzi7nouCIjqT141WgpCNYippPbIbl_AYo2izhApUMiCIMMfPzqRjR59p_hKWIGWTR-DnKz/s1024/White_bright_fireworks.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="768" data-original-width="1024" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgfJ6_b6zJeBVbfKyfmB4r_gL_aa_BeNPKkK5aVtwLv4HJzsJtdAk_VJ-V3zfnPPrfJX3iG_WWp0m0HjSTppZhDIW4uw5nK_PIelE795rgU2GkA1UEE1C9V6Clzi7nouCIjqT141WgpCNYippPbIbl_AYo2izhApUMiCIMMfPzqRjR59p_hKWIGWTR-DnKz/s600/White_bright_fireworks.jpg"/></a></div>
<br />
HAPPY NEW YEAR!!!<br />
<h1>Using my Commodore 64 and 128 in 2023</h1>
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgmvo8t7fV1F0kznA_naTVVHRancfE_ekey9jHlrL-n4y45H0iixaXtf7_nk0i1JoLcUegUoNxQQ1qIzS934fH1eyWG4Slu3GkeRM_j7mG0o6QzJEqDhD9j_HmRNYPgnqLBABQRzLFARzaBvqPB5Gip5Z0_mUzb_YkxQvvn2BxBz1uQHPWKfWOA5DzHQdXB/s4032/20210905_123050.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="3024" data-original-width="4032" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgmvo8t7fV1F0kznA_naTVVHRancfE_ekey9jHlrL-n4y45H0iixaXtf7_nk0i1JoLcUegUoNxQQ1qIzS934fH1eyWG4Slu3GkeRM_j7mG0o6QzJEqDhD9j_HmRNYPgnqLBABQRzLFARzaBvqPB5Gip5Z0_mUzb_YkxQvvn2BxBz1uQHPWKfWOA5DzHQdXB/s600/20210905_123050.jpg"/></a></div>
<br />
Two years ago, I wrote <a href="https://sandervanderburg.blogspot.com/2021/10/using-my-commodore-amiga-500-in-2021.html">a blog post about using my Commodore Amiga 500 in 2021</a> after not having touched it in ten years. Although the computer was still mostly functional, some peripherals were broken.<br />
<br />
To fix my problems, I brought it to the <a href="https://www.homecomputermuseum.nl">Home Computer Museum in Helmond</a> for repairs.<br />
<br />
Furthermore, I ordered replacement peripherals, such as a GoTek floppy emulator, so that the machine can be used more conveniently. The GoTek floppy emulator makes it possible to use disk images stored on a USB memory stick as a replacement for physical floppy disks.<br />
<br />
I also briefly mentioned that I had been using <a href="https://sandervanderburg.blogspot.com/2011/03/first-computer.html">my first computer</a>, a Commodore 128, again for a while. Moreover, I also have a functional Commodore 64, which used to be my third computer.<br />
<br />
Although I have already been using these 8-bit machines on a more regular basis since 2022, I was not yet satisfied enough to write about them, because there were still some open issues, such as a broken joystick cable and the unavailability of the <a href="https://ultimate64.com/CartridgesWithTape">1541 Ultimate II cartridge</a>. The delivery of the latter took a while, because the cartridge had to be redesigned/reproduced due to chip shortages.<br />
<br />
A couple of weeks ago, the cartridge was finally delivered. During my Christmas holiday, I found the time to do some more experiments and write about these old 8-bit Commodore machines.<br />
<br />
<h2>My personal history</h2>
<br />
The Commodore 128, which is still in my possession, originally belonged to my parents and was the first computer I was exposed to. As a six-year-old, I already knew the essential BASIC commands to control it, such as requesting a disk's contents (e.g. <i>LOAD"$",8: LIST</i>), loading programs from tape and disk (e.g. <i>LOAD</i>, <i>LOAD"*",8,1</i>) and running programs (<i>RUN</i>).<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiGRyqLnT7xhwRTlc86nH5wVdFW1AwjSVa5neDcg9ZRBewVYnI2PA8Jfd97qoXL935m12sjwLspBJ6x1138LA5gru6FgPwoP4pUAdqnwKKhQr-7OAijrsLImzdf-yi668pNBL23HrGTpLusKogRlQ8tzhKePXH9I9O1Y4_J5lYtKX8A53y2k4Hxqi1BxwZN/s1600/tmnt.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" data-original-height="272" data-original-width="384" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiGRyqLnT7xhwRTlc86nH5wVdFW1AwjSVa5neDcg9ZRBewVYnI2PA8Jfd97qoXL935m12sjwLspBJ6x1138LA5gru6FgPwoP4pUAdqnwKKhQr-7OAijrsLImzdf-yi668pNBL23HrGTpLusKogRlQ8tzhKePXH9I9O1Y4_J5lYtKX8A53y2k4Hxqi1BxwZN/s1600/tmnt.png"/></a></div>
<br />
One of my favorite games was the <a href="https://www.lemon64.com/game/teenage-mutant-ninja-turtles">Commodore 64 version of Teenage Mutant Ninja Turtles developed by Ultra Software</a>, as can be seen in the above screenshot.<br />
<br />
I liked the game very much because I was a fan of the TV show, but it was also quite buggy and notoriously difficult. Some parts of the game were practically impossible to finish. As a result, I was never able to complete the game, despite having played it for many hours.<br />
<br />
Many relatives of mine used to have an 8-bit Commodore machine. A cousin and uncle used to own a Commodore 64C, and another uncle owned a Commodore 128. We used to exchange ideas and software quite a lot.<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi3qrhxflvdK2paBnlXjQIig2vRcBVCRXT9fP_ce5LiUJQ5xSIQc0Y0SE9I9W_VBCFBMyCvXzuH4S7dDdDjEJ-ekg16uliHj7b7QZw6HGb0S_lrBxYi9vNYhjRZHuoXpEVVjjP4O0P-bNQZDKT60OLrXXkN5-cTvwbFQhbb0JRB7GPYm1VlB_FLYx2meFjs/s4032/20210905_122720.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="3024" data-original-width="4032" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi3qrhxflvdK2paBnlXjQIig2vRcBVCRXT9fP_ce5LiUJQ5xSIQc0Y0SE9I9W_VBCFBMyCvXzuH4S7dDdDjEJ-ekg16uliHj7b7QZw6HGb0S_lrBxYi9vNYhjRZHuoXpEVVjjP4O0P-bNQZDKT60OLrXXkN5-cTvwbFQhbb0JRB7GPYm1VlB_FLYx2meFjs/s600/20210905_122720.jpg"/></a></div>
<br />
At first, I did not know that a Commodore 128 was a more capable machine than an ordinary Commodore 64. My parents used to call it a Commodore 64, and for quite some time I did not know any better.<br />
<br />
The main reason behind the confusion is that a Commodore 128 is nearly 100% backwards compatible with a Commodore 64 -- it contains the same kinds of chips and it offers a so-called Commodore 64 mode.<br />
<br />
You can switch to Commodore 64 mode by holding the Commodore logo key during bootup or by typing <i>GO64</i> at the BASIC prompt. When a utility cartridge is inserted, the machine always boots in Commodore 64 mode. The picture above shows my Commodore 128 running in Commodore 64 mode.<br />
<br />
At the time, we had a utility cartridge inserted into the cartridge socket that offered fast loading, preventing us from seeing the Commodore 128 mode altogether. Moreover, with the exception of the standard software that was bundled with the machine, we only had Commodore 64 software at our disposal.<br />
<br />
In 1992, I wrote my first BASIC program. The program was very simple -- it changed the colors of the text, screen and screen border, asked somebody to provide their name, and then printed a greeting.<br />
<br />
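The original listing is long gone, but a minimal reconstruction in Commodore BASIC could look like this (the POKE addresses are the standard border and background color registers; the exact colors and the greeting text are made up):<br />
<br />
<pre>
10 POKE 53280,0  : REM BLACK SCREEN BORDER
20 POKE 53281,6  : REM BLUE SCREEN BACKGROUND
30 PRINT CHR$(5) : REM SWITCH THE TEXT COLOR TO WHITE
40 INPUT "WHAT IS YOUR NAME";N$
50 PRINT "HELLO, ";N$;"!"
</pre>
<br />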
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi7zAJKWh6qkkga6O1Orm-u7bMpJWVrBvL4M_-w0PnRTRobhySu1oom3OSoXh4bP3UBBlDQdzDHIiGnek5mOo3DJKK2Blj3tLhIUrSxMDsnTRyEap-wRCmbGGtUlJD-JhOziJnkhPMyH0luh54T0F0lwXk2X7O9WM_ioTBy6kYHlDnSHdjTd4CTgncHvE-8/s4032/20210905_122601.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="3024" data-original-width="4032" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi7zAJKWh6qkkga6O1Orm-u7bMpJWVrBvL4M_-w0PnRTRobhySu1oom3OSoXh4bP3UBBlDQdzDHIiGnek5mOo3DJKK2Blj3tLhIUrSxMDsnTRyEap-wRCmbGGtUlJD-JhOziJnkhPMyH0luh54T0F0lwXk2X7O9WM_ioTBy6kYHlDnSHdjTd4CTgncHvE-8/s600/20210905_122601.jpg"/></a></div>
<br />
At some point, by accident, the utility cartridge was removed and I discovered the Commodore 128 mode, as can be seen in the picture above. I learned that the Commodore 128 ROM had a more advanced BASIC version that, for example, also allows you to play music with the <i>PLAY</i> command.<br />
<br />
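For example, the following BASIC 7.0 snippet (a minimal sketch, assuming the default voice and octave settings) plays a scale through the SID chip:<br />
<br />
<pre>
10 VOL 8          : REM SET THE SID VOLUME (0-15)
20 PLAY "CDEFGAB" : REM PLAY A SCALE WITH THE DEFAULT VOICE AND OCTAVE
</pre>
<br />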
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgfNEerTyDjEUCdn8vgNBjBi9sq9P56zgjVAcp-s1r8XQKwNfPmQRBP7RfmgNxzc4Y4QxZj33W-bKqL_Z6WwSNufy6CCIzdQHXvHmqqUlorYwVK5xpBzK9mUTM-zqCWjGOGVmU-HE6Q4d5MGPKToIMD18wE725TY8VYdmFEo_d4-E1ipYGDu62dHubiIbNJ/s4032/20210905_123217.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="3024" data-original-width="4032" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgfNEerTyDjEUCdn8vgNBjBi9sq9P56zgjVAcp-s1r8XQKwNfPmQRBP7RfmgNxzc4Y4QxZj33W-bKqL_Z6WwSNufy6CCIzdQHXvHmqqUlorYwVK5xpBzK9mUTM-zqCWjGOGVmU-HE6Q4d5MGPKToIMD18wE725TY8VYdmFEo_d4-E1ipYGDu62dHubiIbNJ/s600/20210905_123217.jpg"/></a></div>
<br />
I also discovered the CP/M disk that was included with the machine and tried it a few times. It looked interesting (as can be seen in the picture above) but I had no applications for it, so I had no idea what to do with it. :)<br />
<br />
I liked the Commodore 128 features very much, but not long after my discovery, my parents bought a Commodore Amiga 500 and gave the Commodore 128 to another uncle. All my relatives that used to have an 8-bit Commodore machine had already made the switch to the Amiga, and we were the last to make the transition.<br />
<br />
Although switching to a next-generation machine may sound exciting, I felt disappointed. In the last year that the Commodore 128 was still our main machine, I learned so much, and I did not like that I would no longer be able to use the machine and learn more about my discoveries. Fortunately, I could still play with the old Commodore 128 once in a while when we visited the uncle that we gave the machine to.<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEheQdUgHuPcXttmijAVHhpk-iw1eZBxWgWSdRtu7m1rsdtcutzGptM0ffsIS_MUukGf-Ga-JnX1WF4q5R0FDn7G0b6tfpgMzCuLgeNOym3sCj-QKh1rCjslvOYDnkVJpQCW-1EHNrbUUINHBCvxOn8evV_lUhPo1FoW1Z29770uMov6_224hEJ3gLBzMuau/s4032/20231228_142649.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="3024" data-original-width="4032" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEheQdUgHuPcXttmijAVHhpk-iw1eZBxWgWSdRtu7m1rsdtcutzGptM0ffsIS_MUukGf-Ga-JnX1WF4q5R0FDn7G0b6tfpgMzCuLgeNOym3sCj-QKh1rCjslvOYDnkVJpQCW-1EHNrbUUINHBCvxOn8evV_lUhPo1FoW1Z29770uMov6_224hEJ3gLBzMuau/s600/20231228_142649.jpg"/></a></div>
<br />
Some time later, in late 1993, my parents gave me a Commodore 64 (the old-fashioned breadbin model) that they found at a garage sale (as shown in the picture above). This was the third computer model that I was exposed to and the first computer that was truly mine, because I did not have to share it with my parents and brother. This machine gave me my second 8-bit Commodore experience, and I used it for quite some time, until mid-1997.<br />
<br />
Originally, the Commodore 64 did not come with any additional peripherals. It was just the computer with a cassette drive and no utility cartridges for fast loading. I had a cassette with some games and a fast loading program that was the first program on the tape. Nothing more. :)<br />
<br />
I was given a few books and picked up Commodore 64 programming again. In the following years, I learned much more about programming and the capabilities of the Commodore 64, such as for-loops, how to do I/O, how to render sprites and how to reprogram characters.<br />
<br />
I have also been playing around with audio, but my sound and music skills were far too limited to do anything that made sense. Moreover, I made quite a few interesting attempts to create games, but nothing truly usable came out of them. :)<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj0vIzzPIsftbwHArdSv9rzWjajS6l7ZKdKYgOUXy7mTAC3bZYnraACyLkj2aSbRO_8GqALkfM0KRwhTjEZx7DLnZlF1Eg4q1sN9yG31IUNJsnnDq-W_EOOhtcpUNBaisLZ3NqTFtYyri8QUQ5oMUuxCNMihBJuX1uDTcrq0SYI2_1_latLatqHzDv6VAS1/s4032/20231228_143209.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" height="500" data-original-height="4032" data-original-width="3024" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj0vIzzPIsftbwHArdSv9rzWjajS6l7ZKdKYgOUXy7mTAC3bZYnraACyLkj2aSbRO_8GqALkfM0KRwhTjEZx7DLnZlF1Eg4q1sN9yG31IUNJsnnDq-W_EOOhtcpUNBaisLZ3NqTFtYyri8QUQ5oMUuxCNMihBJuX1uDTcrq0SYI2_1_latLatqHzDv6VAS1/s600/20231228_143209.jpg"/></a></div>
<br />
In 1994, I bought a 1541 disk drive and several utility cartridges at a garage sale, such as the Final Cartridge (shown above). The Final Cartridge provides all kinds of useful features, such as fast loading and the ability to interrupt the machine and inspect its memory with a monitor.<br />
<br />
Owning a disk drive also allowed me to make copies of the games that I used to play on my parents' Commodore 128.<br />
<br />
Eventually, in 1998, I switched to the Commodore Amiga 500 as my personal machine, but I kept my Commodore 64. By that time, Commodore had already been out of business for four years and had completely lost its relevance; my parents had bought a PC in 1996. After I used the Amiga in the attic for a while, its display broke, rendering it unusable, until I discovered in 1998 how to attach the Amiga to a TV.<br />
<br/>
In late 1999, I was finally able to buy my own PC. I kept the Amiga 500, because I still considered it a cool machine.<br />
<br />
Several years later, the Commodore 128 was returned to me. My uncle no longer had any use for it and was considering throwing it away. Because I still remembered its unique features (compared to a regular Commodore 64), I decided to take it back.<br />
<br />
<h2>Some facts</h2>
<br />
Why is the Commodore 64 such an interesting machine? Besides the fact that it was the first machine that I truly owned, it has also been <a href="https://content.time.com/time/specials/2007/article/0,28804,1638782_1638778_1638764,00.html">listed in the Guinness World Records as the highest-selling single computer model of all time</a>.<br />
<br />
Moreover, it also has interesting capabilities, such as:<br />
<br />
<ul>
<li>64 KiB of RAM. This may not sound very impressive by today's standards (for example, my current desktop PC has 64 GiB of RAM, a million times as much :) ), but in 1982 it was a huge amount.</li>
<li>A 6510 CPU, which is a modified 6502 CPU that has an 8-bit I/O port added. On machines with a PAL display, it runs at a clock speed slightly under 1 MHz.<br />
Compared to modern CPUs, this may not sound impressive (a single core of my current CPU runs at 3.7 GHz :) ), but in the 80s the CPU was quite good -- it was cheap and very efficient.<br />
<br />
Despite the fact that there were competing CPUs at the time that ran at higher clock speeds, most 6502 instructions take only a few cycles, and the CPU fetches the next instruction from memory while the previous instruction is still in execution. As a result, it could still keep up with most of its higher-clocked competitors.</li>
<li>A nice video chip: the VIC-II chip. It supports 16 preconfigured colors and various screen modes, including high-resolution screen modes that can be used for addressing individual pixels. It also supports eight hardware-managed <strong>sprites</strong> -- movable objects directly controlled by the video chip.</li>
<li>A nice sound chip: the SID chip. It offers three audio channels, four kinds of waveforms (triangle, sawtooth, square wave and white noise) and analog mixing. This may not sound impressive, but at the time, the fact that the three audio channels could be used to mix waveforms arbitrarily was a very powerful capability.</li>
<li>An operating system using a BASIC programming language interpreter (Commodore BASIC v2) as a shell. In the 70s and 80s, BASIC was a very popular programming language due to its simplicity.</li>
</ul>
<br />
Other interesting capabilities of the Commodore 64 were:<br />
<br />
<ul>
<li>The RAM is shared between the CPU and the other chips, such as the VIC-II and the SID, which fetch their data from memory themselves. As a result, the CPU is offloaded and only needs to do calculation work.</li>
<li>The CPU's clock speed is aligned with the video beam. The screen updates 50 times per second and is rendered from top to bottom, left to right. Each raster line takes a fixed number of CPU cycles to render (see the back-of-the-envelope calculation after this list).<br />
<br />
These properties make it possible to change the screen while it is being rendered (a technique called <strong>racing the beam</strong>). For example, while the screen is drawn, it is possible to adjust the colors in color RAM, multiplex sprites (by default, only eight sprites can be configured), change screen modes (e.g. from text to high-res), and so on.<br />
<br />
For example, the following screenshot of a computer game called <a href="https://www.lemon64.com/game/mayhem-in-monsterland">Mayhem in Monsterland</a> demonstrates what is possible by "racing the beam". In the intro screen (which uses <a href="https://www.c64-wiki.com/wiki/Multicolor_Bitmap_Mode">multi-color bitmap mode</a>), we can clearly see more colors per scanline than the three unique colors plus a background color per 8x8 block that are normally possible in this screen mode:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjHV5gY8DQNk_RiVW9WGeTiDvSnWoPE8dlu-MEU70rJ_Yr-dtMrjW-qcHICopZSNXZayYHn-xW3ZRhwZfhMXijL4YeMtmAgd6ypJDZRuVW7i13x4pJxeFhuwJDA2FB-Pzrr9ZxnPMWwLXI_8ZqIi2zIMfL7P_AbOhpieRZvtrXW7VrjfWGvO9bDnwVhVtQH/s384/mayhem.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="272" data-original-width="384" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjHV5gY8DQNk_RiVW9WGeTiDvSnWoPE8dlu-MEU70rJ_Yr-dtMrjW-qcHICopZSNXZayYHn-xW3ZRhwZfhMXijL4YeMtmAgd6ypJDZRuVW7i13x4pJxeFhuwJDA2FB-Pzrr9ZxnPMWwLXI_8ZqIi2zIMfL7P_AbOhpieRZvtrXW7VrjfWGvO9bDnwVhVtQH/s600/mayhem.png"/></a></div>
</li>
<li>And of course, the Commodore 64 has <a href="http://www.lemon64.com">a huge collection of games</a> and applications.</li>
</ul>
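<br />
As a back-of-the-envelope check of how the clock speed and the video beam relate (assuming the standard PAL VIC-II timings of 63 CPU cycles per raster line and 312 raster lines per frame -- figures that are not from the original post):<br />
<br />
<pre>
  985248 cycles/second (PAL clock speed)
/  19656 cycles/frame  (63 cycles/line x 312 lines/frame)
= ~50.12 frames/second
</pre>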
<br />
The Commodore 128 has similar kinds of chips (as a result, it is nearly 100% compatible with the Commodore 64).<br />
<br />
It has the following changes and additions:<br />
<br />
<ul>
<li>Double the amount of RAM: 128 KiB</li>
<li>A second video chip: the VDC chip, which can render 80-column text and higher-resolution graphics, but no sprites. To render 80-column output, you need to use an RGBI cable to connect a monitor that is capable of displaying 80-column graphics. The VDC chip does not work with a TV screen.</li>
<li>A CPU that can run twice as fast: the 8502, which is entirely backwards compatible with the 6510. However, if you use the VIC-II chip for rendering graphics (40-column mode), the CPU still runs at half its maximum speed, which is the same speed as an ordinary 6510. In 80-column mode, or when the screen is disabled, it can run at twice the speed of a 6510 (see the sketch after this list).</li>
<li>A second CPU: the Zilog Z80. This CPU is used after booting the machine in CP/M mode from the CP/M boot disk.</li>
<li>An improved BASIC interpreter that supports many more features: Commodore BASIC v7.0</li>
</ul>
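<br />
In BASIC 7.0, switching the CPU speed is a matter of two statements. A minimal sketch (note that FAST blanks the 40-column VIC-II screen, which is why the doubled speed and 40-column graphics cannot be combined):<br />
<br />
<pre>
FAST : REM SWITCH THE 8502 TO 2 MHZ (BLANKS THE 40-COLUMN SCREEN)
SLOW : REM SWITCH BACK TO 1 MHZ (THE 40-COLUMN SCREEN REAPPEARS)
</pre>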
<br />
<h2>Using the Commodore machines</h2>
<br />
To conveniently use the Commodore machines in 2023, I have made a couple of repairs and I have ordered new peripherals.<br />
<br />
<h3>Power supplies</h3>
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiebI8eIkCqMKSV2UV_Ovhd_qCbkmyKSGJL6_vrOTMqj6eZMcE6wse_znK8mJ_P6g4oYgkidl4Pr30prbpjPpjr1F0fgkmxvNt9Lq0E1UOOeTraM9L1ZrsVGtNkCX8XIVYeIatw9y3_TXDYj5kQroxHLpU-lM0H3tKXlviy-LWqXmJSiPYEO1xUhkDMy5q_/s4032/powersupplies.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="3024" data-original-width="4032" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiebI8eIkCqMKSV2UV_Ovhd_qCbkmyKSGJL6_vrOTMqj6eZMcE6wse_znK8mJ_P6g4oYgkidl4Pr30prbpjPpjr1F0fgkmxvNt9Lq0E1UOOeTraM9L1ZrsVGtNkCX8XIVYeIatw9y3_TXDYj5kQroxHLpU-lM0H3tKXlviy-LWqXmJSiPYEO1xUhkDMy5q_/s600/powersupplies.jpg"/></a></div>
<br />
I bought new power supplies. I learned that <a href="https://retrogamestart.com/answers/replace-c64-power-supply-voltage-failure-will-kill-your-c64">it is not safe to use the original Commodore 64 power supply</a> as it gets older -- it may damage your motherboard and chips:<br />
<br />
<blockquote>
In a nutshell, the voltage regulator on the 5 volt DC output tends to fail in such a way that it lets a voltage spike go straight to your C64 motherboard, frying the precious chips.
</blockquote>
<br />
Fortunately, modern replacement power supplies exist. I bought one from <a href="http://www.c64lover.com">c64lover.com</a> that seems to work just fine.<br />
<br />
I also bought <a href="https://www.keelog.com/power-supply/">a replacement power supply for the Commodore 128 from Keelog</a>. The original Commodore 128 power supply is more robust than the Commodore 64 power supply, but I still want some extra safety.<br />
<br />
<h3>Cassette drive</h3>
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg8NZlSPEb1ZUM_4VV5Uv7qJkqJTgvHPrCbWAw8QuLkjv5m0bj0yicWK28eHMj_aQV66Q5BwmTozqYutUtfD_CDGkev-mvCC4a2HmABocvxg1V2jWJYCdsApn4ZHTAevyMryn1uWuPZl8SzXgZhkFpMHucuYUWtAoDCfcjuG0PEuB5R1zyLwWQ1sRUcPJH3/s4032/20231228_141758.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="3024" data-original-width="4032" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg8NZlSPEb1ZUM_4VV5Uv7qJkqJTgvHPrCbWAw8QuLkjv5m0bj0yicWK28eHMj_aQV66Q5BwmTozqYutUtfD_CDGkev-mvCC4a2HmABocvxg1V2jWJYCdsApn4ZHTAevyMryn1uWuPZl8SzXgZhkFpMHucuYUWtAoDCfcjuG0PEuB5R1zyLwWQ1sRUcPJH3/s600/20231228_141758.jpg"/></a></div>
<br />
As I have already explained, my Commodore 64 breadbin model only included a cassette drive. I still have that cassette drive (as shown in the picture above), but after obtaining a 1541 disk drive, I never used it again.<br />
<br />
Two years ago, I ordered an SD2IEC that included a bonus cassette with a game: <a href="https://misfit.itch.io/hi-score">Hi-Score</a>. I wanted to try the game, but it turned out there was a problem with the cassette drive -- it seemed to spin irregularly.<br />
<br />
After a brief investigation, I learned that the drive belt was in bad condition. I ordered a replacement belt from eBay. Installing it was easy, and the game works great:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjPfUuFoMAPZkD4oVAqAznd-HYioSlL0AJra4lcVSmh9wI808Xy3wuSaEu1xp86BIaADYeoiHq6_6TZjpqNDkB4WlUfC3IjZMDBW1rgnsVMIBIGgJu7hzp86iW6gyd5VagNWnha41zE8qDW5Vt2y2nkh2wtqnM0YHep067ZE2ST0wJz9j0Kmi7ege9hAdln/s4032/20231228_142851.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="3024" data-original-width="4032" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjPfUuFoMAPZkD4oVAqAznd-HYioSlL0AJra4lcVSmh9wI808Xy3wuSaEu1xp86BIaADYeoiHq6_6TZjpqNDkB4WlUfC3IjZMDBW1rgnsVMIBIGgJu7hzp86iW6gyd5VagNWnha41zE8qDW5Vt2y2nkh2wtqnM0YHep067ZE2ST0wJz9j0Kmi7ege9hAdln/s600/20231228_142851.jpg"/></a></div>
<br />
<h3>Disk drives</h3>
<br />
<div class="separator" style="float: left;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgFOCl0J_lVDxrCzR7hh_dKde3OT3QV6YvBhs3SRPS_xE-UOJzp3SGQJuyC1fDlpmrTmveIT62EATVK-g909knjOe99P3Xl6-rFwMJ8qop7QqnWUKrZccL7QW3WrFYE4-k2ashQrtHdJ8Tn6J-eQK86J2uJTpJLHSQNelviqwa2MTBJkuu57pmu4KbzYgwj/s4032/1541.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="300" data-original-height="3024" data-original-width="4032" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgFOCl0J_lVDxrCzR7hh_dKde3OT3QV6YvBhs3SRPS_xE-UOJzp3SGQJuyC1fDlpmrTmveIT62EATVK-g909knjOe99P3Xl6-rFwMJ8qop7QqnWUKrZccL7QW3WrFYE4-k2ashQrtHdJ8Tn6J-eQK86J2uJTpJLHSQNelviqwa2MTBJkuu57pmu4KbzYgwj/s600/1541.jpg"/></a></div>
<div class="separator" style="float: right;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj2m_GJOwaHd7IRGqLuLQTmbgL7POUdpCcwQ5a95q5zOkKB0TLtZvfhR1bGMsuMhNtL2MWGbh2CcL0iCErdjtoU4BjpegwscxC7pZxrUcZnr1bJr7vjl5QYTC96HtOSFOTSWanYjvdmK3KBix3QriX4wwhJgFH17QAMt2kofBYt3EQCcBYWvSSVRklIpnmw/s4032/20220205_145551.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" height="250" data-original-height="4032" data-original-width="3024" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj2m_GJOwaHd7IRGqLuLQTmbgL7POUdpCcwQ5a95q5zOkKB0TLtZvfhR1bGMsuMhNtL2MWGbh2CcL0iCErdjtoU4BjpegwscxC7pZxrUcZnr1bJr7vjl5QYTC96HtOSFOTSWanYjvdmK3KBix3QriX4wwhJgFH17QAMt2kofBYt3EQCcBYWvSSVRklIpnmw/s600/20220205_145551.jpg"/></a></div>
<div style="clear: both;"></div>
<br />
I have two disk drives. As I have already explained, I have a 1541 drive that I bought from a garage sale for my Commodore 64. The pictures above show the exterior and interior of the disk drive.<br />
<br />
The disk drive still works, but I had a few subtle problems with running modern demos that load data concurrently while the demo is playing. Such demos would sometimes fail (crash, or the sound would run out of sync with the rest of the demo) because of timing problems.<br />
<br />
I have <a href="https://retro64.altervista.org/blog/commodore-1541-disk-drive-maintenance-part-1/">cleaned the disk drive head with some alcohol</a> and that seemed to improve the situation.<br />
<br />
<div class="separator" style="float: left;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6LZ2v1dT8xWhyr8YbFeMPrqf4uDw-G0BRWrR0Un7Ijd3sDEe4f3Wj34xVGyAQGPwrA2LxlFuvEs_gODmRmFcJUgZy65w3Kqfnh073hs3y1WFR1w-FZLx5g94EGdAEcTninzKHkwUK9c5LQtvp1zfal3z9E7dfQTKTKVgu_rWWvhIVNVMp0RpXYjtwFHyw/s4032/1571.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="300" data-original-height="3024" data-original-width="4032" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6LZ2v1dT8xWhyr8YbFeMPrqf4uDw-G0BRWrR0Un7Ijd3sDEe4f3Wj34xVGyAQGPwrA2LxlFuvEs_gODmRmFcJUgZy65w3Kqfnh073hs3y1WFR1w-FZLx5g94EGdAEcTninzKHkwUK9c5LQtvp1zfal3z9E7dfQTKTKVgu_rWWvhIVNVMp0RpXYjtwFHyw/s600/1571.jpg"/></a></div>
<div class="separator" style="float: right;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh9Rjdsk4y98ZaCFDfovJq82pftKrYtjvbcLBVPyfH2KVgq6RZtRz5GgvATL5eFKbL7CoCI75po1OeK_tfgFatbvbuzYn6u55UnTVUFMmGztjGSQRjVOqqnRtoQGvPLg-CqqOJEwZubR76UldrKDgGvkk3PcRQ2TZFbHG863Qh1QgGnBHwa84-ReKcqPKiz/s4032/20220206_152658.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" height="250" data-original-height="4032" data-original-width="3024" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh9Rjdsk4y98ZaCFDfovJq82pftKrYtjvbcLBVPyfH2KVgq6RZtRz5GgvATL5eFKbL7CoCI75po1OeK_tfgFatbvbuzYn6u55UnTVUFMmGztjGSQRjVOqqnRtoQGvPLg-CqqOJEwZubR76UldrKDgGvkk3PcRQ2TZFbHG863Qh1QgGnBHwa84-ReKcqPKiz/s600/20220206_152658.jpg"/></a></div>
<div style="clear: both;"></div>
<br />
I also have a 1571 disk drive that came with the Commodore 128. The 1571 is a more advanced disk drive, which is largely backwards compatible with the 1541. The pictures above show the exterior and interior of the drive.<br />
<br />
In addition to ordinary Commodore 64 disks, it can read both sides of a floppy disk at the same time and use disks that are formatted as such. It can also operate in MFM mode to read CP/M floppy disks.<br />
<br />
My 1571 disk drive still seems to work fine. The main reason why I did not have to clean it is that I have not used it as much as the 1541 disk drive.<br />
<br />
The 1541 and 1571 disk drives are interesting devices. They are computer systems themselves -- each of them contains its own 6502 CPU and runs two logical subsystems. One subsystem is responsible for managing filesystem operations and the communication with the main computer, while the other subsystem is used for controlling the drive mechanism.<br />
<br />
The 1541 disk drive contains 2 KiB of RAM and runs its own software from a ROM chip that provides the Commodore Disk Operating System.<br />
<br />
Technically, using a disk drive on an 8-bit Commodore machine amounts to two computer systems communicating with each other over Commodore's proprietary serial interface (the IEC bus).<br />
<br />
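This can be illustrated from BASIC: the computer merely sends DOS commands over the IEC bus and reads the drive's replies back, while the drive does all the actual work. The following minimal sketch asks the drive to reset itself and prints the status message that the drive's DOS reports back (the variable names are arbitrary):<br />
<br />
<pre>
10 OPEN 15,8,15      : REM OPEN THE COMMAND CHANNEL OF DEVICE 8
20 PRINT#15,"UI"     : REM ASK THE DRIVE'S DOS TO PERFORM A SOFT RESET
30 INPUT#15,E,E$,T,S : REM READ THE STATUS: CODE, MESSAGE, TRACK, SECTOR
40 PRINT E;E$;T;S
50 CLOSE 15
</pre>
<br />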
<h3>Monitor</h3>
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgxPZTG6oFBaWfLcsoLXPC9DM-PYt3veH37itlTotYIBoW7vIJaGsWwZRdf7Z4al_6vpNZmlcmH9PjRxzaUc9-JUbE9tZ2MaTtQvtxsuQU0j0iJ3t0jm2GxkOK1PaC8-zExqpPeCwSxsbyTXUcWCRTOXtkypag7fWbyXEj6U6AzxBYfps1mcbROzRuQJ8iw/s4032/20210905_122646.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="3024" data-original-width="4032" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgxPZTG6oFBaWfLcsoLXPC9DM-PYt3veH37itlTotYIBoW7vIJaGsWwZRdf7Z4al_6vpNZmlcmH9PjRxzaUc9-JUbE9tZ2MaTtQvtxsuQU0j0iJ3t0jm2GxkOK1PaC8-zExqpPeCwSxsbyTXUcWCRTOXtkypag7fWbyXEj6U6AzxBYfps1mcbROzRuQJ8iw/s600/20210905_122646.jpg"/></a></div>
<br />
I also have a monitor for the Commodore 128: the Commodore 1901, which is capable of displaying graphics in 40- and 80-column modes. It has an RGBI socket for 80-column graphics and RCA sockets for 40-column graphics. I need to use a switch on the front panel to switch between the 40- and 80-column graphics modes. In the picture shown above, I have switched the monitor to 80-column mode.<br />
<br />
The monitor still works fine, but in 2019 a capacitor burned out, causing smoke to come out of the monitor, which was a scary experience.<br />
<br />
Fortunately, the monitor was not irreparably damaged, and the Home Computer Museum managed to replace the broken capacitors. After it was returned to me, it seemed to work just fine again.<br />
<br />
In 1992, besides "The Very First": a programming tutorial and CP/M, I did not have any software for the Commodore 128. In the recent years I have downloaded a few interesting applications for the Commodore 128, such as a Tetris game that works in 80-column mode:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEimqaJSrcARvYxdPmgmfnIAjfvkdLZFqsw49Glq85JKAKcw_XI-9aid7vYefAYeRGAc6mDmLFfReoHKNhkUjA9-oCVc360lkaOf5FPoqKIU1e08uJiQHORmJ27hSx72Vn7r6Xol_e1AEvLDBRuymC9WtncrC6j7PFOynWOnSTnV9fl4FhNp4Sd1-8itYedC/s4032/20231229_144742.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="3024" data-original-width="4032" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEimqaJSrcARvYxdPmgmfnIAjfvkdLZFqsw49Glq85JKAKcw_XI-9aid7vYefAYeRGAc6mDmLFfReoHKNhkUjA9-oCVc360lkaOf5FPoqKIU1e08uJiQHORmJ27hSx72Vn7r6Xol_e1AEvLDBRuymC9WtncrC6j7PFOynWOnSTnV9fl4FhNp4Sd1-8itYedC/s600/20231229_144742.jpg"/></a></div>
<br />
<h3>Joysticks</h3>
<br />
As already shown in earlier pictures, I am using <a href="https://en.wikipedia.org/wiki/The_Arcade_(joystick)">The Arcade joysticks</a> produced by Suzo International. I have four of them.<br />
<br />
Unfortunately, three of them no longer worked properly because their cables were damaged. I managed to repair them by using <a href="https://www.amiga-shop.net/en/Amiga-Hardware/Amiga-cables-adapters/Joystick-joypad-replacement-cable-set-DB9::897.html">this cable from the Amiga shop</a> as a replacement.<br />
<br />
<h3>Mouse</h3>
<br />
The Commodore 128 also came with a 1351 mouse (a.k.a. tank mouse), but it was lost. I never used a mouse much, except for <a href="https://www.c64-wiki.com/wiki/GEOS">GEOS</a>: a graphical operating system.<br />
<br />
To get that GEOS experience back, I first bought <a href="https://www.pcbway.com/project/shareproject/PS_2_Mouse_adapter_for_Commodore_64__1351_mouse_hardware_emulation_.html">an adapter device that allows you to use a PS/2 mouse as a 1351 mouse</a>. Later, I found the same 1351 mouse model on eBay:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjxB6AoFI_jFhl6YoHU2UcLlTCZVV9BO7VEicdsycOHf_JTQvv6yvjSDmlank9PuCeIV6GkdBaYT8DEGaty47EcocxptzCQmqDiBtS3RaoArMz5PQbmotoruhXPmMhsX6I03OlN9LnGInLeGEPbyVOBku543HYdzn5bFnQg1DUjtKfza3QKedtbVMmw0thE/s4032/20231228_143738.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="3024" data-original-width="4032" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjxB6AoFI_jFhl6YoHU2UcLlTCZVV9BO7VEicdsycOHf_JTQvv6yvjSDmlank9PuCeIV6GkdBaYT8DEGaty47EcocxptzCQmqDiBtS3RaoArMz5PQbmotoruhXPmMhsX6I03OlN9LnGInLeGEPbyVOBku543HYdzn5bFnQg1DUjtKfza3QKedtbVMmw0thE/s600/20231228_143738.jpg"/></a></div>
<br />
<h3>SD2IEC</h3>
<br />
I have also been looking into more convenient ways to use software that I have downloaded from the Internet. Transferring downloaded disk images from and to physical floppy disks is possible, but quite inconvenient.<br />
<br />
The <a href="https://www.c64-wiki.com/wiki/SD2IEC">SD2IEC</a> is a nice modern replacement for a 1541 disk drive -- it is cheap, it can be attached to the IEC port and all high-level disk commands seem to be compatible. Physically, it also looks quite nice:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiTTcYfQ1nlDd9csu32qY6NBtZ-7nYu5RC_7YLeZkj4MrCapVb1w26IIyVug4OxokEyn1Y4O-ESsZsM8qtx-rXeq9wSKqEpXiaXDKyG89WR6kX1rvQWAazaUD7KhdDmBXu2ggcG0UuWf1brWG7LAMULomWePoN9lwwTEfVCYGkLm6guL51BcWd28UxHw1de/s4032/20211103_111916.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="3024" data-original-width="4032" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiTTcYfQ1nlDd9csu32qY6NBtZ-7nYu5RC_7YLeZkj4MrCapVb1w26IIyVug4OxokEyn1Y4O-ESsZsM8qtx-rXeq9wSKqEpXiaXDKyG89WR6kX1rvQWAazaUD7KhdDmBXu2ggcG0UuWf1brWG7LAMULomWePoN9lwwTEfVCYGkLm6guL51BcWd28UxHw1de/s600/20211103_111916.jpg"/></a></div>
<br />
As can be seen in the picture above, it looks very similar to an ordinary 1541 disk drive.<br />
<br />
Unfortunately, low-level disk drive operations are not supported -- some applications, such as demos, need to carry out low-level operations for fast loading.<br />
<br />
Nonetheless, the SD2IEC is still a great solution, because there are plenty of applications and games that do not require any fast loading capabilities.<br />
<br />
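For example, assuming that a disk image named <i>GAME.D64</i> (a made-up name) is stored on the SD card, the sd2iec firmware makes it possible to mount the image by "changing into" it with a DOS CD command, after which the device behaves as if that floppy disk were inserted:<br />
<br />
<pre>
OPEN 15,8,15,"CD:GAME.D64": CLOSE 15
LOAD"$",8
LIST
</pre>
<br />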
<h3>1541 Ultimate II cartridge</h3>
<br />
After happily using the SD2IEC for a while, I wanted better compatibility with the 1541 disk drive. For example, many modern demos are not compatible, and I do not have enough physical floppy disks to store these demos on.<br />
<br />
As I have already explained, this year, I have ordered the 1541 Ultimate II cartridge, but it took a while before it could be delivered.<br />
<br />
The 1541 Ultimate II cartridge is an impressive device -- it can be attached to the IEC port and the tape port (by using a tape adapter) and provides many interesting features, such as:<br />
<br />
<ul>
<li>It can emulate two 1541 disk drives in a cycle-exact way</li>
<li>It offers two emulated SID chips</li>
<li>It can load cartridge images</li>
<li>It simulates disk drive sounds</li>
<li>You can attach up to two USB storage devices. You can load disk images and tape images, but you can also address files from the file system directly.</li>
<li>It has an Ethernet adapter.</li>
</ul>
<br />
<div class="separator" style="float: left;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh9TLRrbgyKs9_DjS557ftC38KhTffr2vpTJVQjUXqFRlMrTzz7azxTdGBdAe2r-_fjpXt_i_1-o-KVgPPFgUxZuucmuzGmrz28-ZShXfHzQD5mAYy73AZ1JtYQFpAkO6vTygereaOVrb9FmGtevdsFiDMT06cFnz0WGhi7yJ3gf8cavA_r_VslRcD-XxUS/s4032/20231228_143358.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="250" data-original-height="3024" data-original-width="4032" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh9TLRrbgyKs9_DjS557ftC38KhTffr2vpTJVQjUXqFRlMrTzz7azxTdGBdAe2r-_fjpXt_i_1-o-KVgPPFgUxZuucmuzGmrz28-ZShXfHzQD5mAYy73AZ1JtYQFpAkO6vTygereaOVrb9FmGtevdsFiDMT06cFnz0WGhi7yJ3gf8cavA_r_VslRcD-XxUS/s600/20231228_143358.jpg"/></a></div>
<div class="separator" style="float: right;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjQS_gGK3Cf0ri3ycxExgWO6DaHqQc-Pyjm0Q9wrThrbik4HnFoaYJZvxsKfyqxvxaC4CWLDhgEXmg8rmz2yGKe_1GYhSbMctMOkbNTc6l0AdzL2scgq5_jp4QUL6jOa5fnQZbzLuU4LpL0JZgXsEBmFSkLlvTq5zPg7zixDO2fWIev3YLRawT6JPoteV6j/s4032/20231228_143406.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="250" data-original-height="3024" data-original-width="4032" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjQS_gGK3Cf0ri3ycxExgWO6DaHqQc-Pyjm0Q9wrThrbik4HnFoaYJZvxsKfyqxvxaC4CWLDhgEXmg8rmz2yGKe_1GYhSbMctMOkbNTc6l0AdzL2scgq5_jp4QUL6jOa5fnQZbzLuU4LpL0JZgXsEBmFSkLlvTq5zPg7zixDO2fWIev3YLRawT6JPoteV6j/s600/20231228_143406.jpg"/></a></div>
<div style="clear: both;"></div>
<br />
The above two pictures demonstrate how the cartridge works and how it is attached to the Commodore 64.<br />
<br />
I am very happy that I was able to run many modern demos developed by the demoscene community, such as <a href="https://www.pouet.net/prod.php?which=64283">Comaland by Oxyron</a> and <a href="https://www.pouet.net/prod.php?which=94504">Next Level by Performers</a>, on a real Commodore 64 without using physical floppy disks:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEihySp_BpDZk8tCXScOaVZnZSCer1k_rhI-iVjJGU31sH4NgECnIIjG_3SFDeRU-cnQ79pnzyZvvUlUJbSgdorxaTyQdpztlbm7yeNIgaALzrr1TqkqZFzn8gR7O0ELYqE2GMqXRJNAqcnANqdK-bNqRIukxEkcEqA4osWs7xpit84XFdhxp1_GDfvU3AYq/s4032/20231228_145443.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="3024" data-original-width="4032" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEihySp_BpDZk8tCXScOaVZnZSCer1k_rhI-iVjJGU31sH4NgECnIIjG_3SFDeRU-cnQ79pnzyZvvUlUJbSgdorxaTyQdpztlbm7yeNIgaALzrr1TqkqZFzn8gR7O0ELYqE2GMqXRJNAqcnANqdK-bNqRIukxEkcEqA4osWs7xpit84XFdhxp1_GDfvU3AYq/s600/20231228_145443.jpg"/></a></div>
<br />
<h3>ZoomFloppy</h3>
<br />
Sometimes I also want to transfer data from my PC to physical floppy disks and vice versa.<br />
<br />
For example, a couple of years ago, I wanted to make a backup of the programs that I wrote when I was young.<br />
<br />
In 2014, I ordered a <a href="https://www.go4retro.com/products/zoomfloppy">ZoomFloppy</a> device to make this possible.<br />
<br />
ZoomFloppy is a device that offers an IEC socket to which a Commodore disk drive can be connected. As I have already explained, Commodore disk drives are self-contained computers. As a result, connecting just the disk drive (without a Commodore computer) suffices.<br />
<br />
The ZoomFloppy device can be connected to a PC through the USB port:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg5QyTlAs4W9mZcwYDPZcClwW_ne1wSMYXEpXdZCnAgFwJOcDFbahMS892ftZGkOGVDd4LQs_WGW1HCNtoYDYVpyoza9Gapx9ft29ophuIdeWdAdSad0A_cL2qyLuAjaobAGyzn4tK3LkNnkl6L6xEAozRt4eG5zsXzZx_lFPvCZ_Gi-xoRDIz7NMWYDTto/s4032/20231228_151223.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="3024" data-original-width="4032" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg5QyTlAs4W9mZcwYDPZcClwW_ne1wSMYXEpXdZCnAgFwJOcDFbahMS892ftZGkOGVDd4LQs_WGW1HCNtoYDYVpyoza9Gapx9ft29ophuIdeWdAdSad0A_cL2qyLuAjaobAGyzn4tK3LkNnkl6L6xEAozRt4eG5zsXzZx_lFPvCZ_Gi-xoRDIz7NMWYDTto/s600/20231228_151223.jpg"/></a></div>
<br />
The above picture shows how my 1541 disk drive is connected to my ThinkPad laptop by using the ZoomFloppy device. If you look carefully at the screen, you can see that I have requested an overview of the contents of the disk that is currently in the drive.<br />
<br />
I use <a href="https://github.com/OpenCBM/OpenCBM">OpenCBM</a> on my Linux machine to carry out disk operations. Although graphical shells exist (for example, the OpenCBM project provides a graphical shell for Windows called <i>cbm4wingui</i>), I have been using command-line instructions. They may look scary, but I learned them quite quickly.<br />
<br />
Here are some command-line instructions that I use frequently:<br />
<br />
Request the status of the disk drive:<br />
<br />
<pre>
$ cbmctrl status 8
73,speeddos 2.7 1541,00,00
</pre>
<br />
Format a floppy disk (with label: <i>mydisk</i> and id: <i>aa</i>):<br />
<br />
<pre>
$ cbmformat 8 "mydisk,aa"
</pre>
<br />
Transfer a disk image's contents (<i>mydisk.d64</i>) to the floppy disk in the drive:<br />
<br />
<pre>
$ d64copy mydisk.d64 8
</pre>
<br />
Make a D64 disk image (<i>mydisk.d64</i>) from the disk in the drive:<br />
<br />
<pre>
$ d64copy 8 mydisk.d64
</pre>
<br />
Request and display the directory contents of a floppy:<br />
<br />
<pre>
$ cbmctrl dir 8
</pre>
<br />
Transfer a file (<i>myfile</i>) from floppy disk to PC:<br />
<br />
<pre>
$ cbmcopy --read 8 myfile
</pre>
<br />
Transfer a file (<i>myfile</i>) from PC to floppy disk:<br />
<br />
<pre>
$ cbmcopy --write 8 myfile
</pre>
<br />
<h2>Conclusion</h2>
<br />
In this blog post, I have explained how I have been using my old 8-bit Commodore 64 and 128 machines in 2023. I made some repairs and ordered some replacement peripherals, with which I can conveniently run software that I have downloaded from the Internet.<br />
<h1>On reading research papers and maintaining knowledge</h1>
Ten years ago, <a href="https://sandervanderburg.blogspot.com/2013/06/dr-sander.html">I obtained my PhD degree</a> and made <a href="https://sandervanderburg.blogspot.com/2012/10/my-post-phd-carreer-aka-leaving-academia.html">my (somewhat gradual) transition from academia to industry</a>. Despite the fact that I made this transition a long time ago, I still often get questions from people who are considering doing a PhD.<br />
<br />
Most of the discussions that I typically have with such people are about <strong>writing</strong> -- I have already explained plenty about writing in the past, including <a href="https://sandervanderburg.blogspot.com/2020/12/blog-reflection-over-last-decade.html">a recommendation to start a blog so that writing becomes a habit</a>. Having a blog allows you to break up your work into manageable pieces and build up an audience for your work.<br />
<br />
Recently, I have been extensively reorganizing the files on my hard drive, a tedious task that I often do at the end of the year. This year, I have also been restructuring my private collection of research papers.<br />
<br />
Reading research papers became a habit while working on my master's thesis and doing my PhD. Although I left academia a long time ago, I have retained the habit, even though the number of papers and articles that I read today is much lower than in my PhD days. I no longer need to study research works much, but I still absorb existing knowledge and put things into context whenever I intend to do something new, for example for my blog posts or software projects.<br />
<br />
In 2020, during the first year of the COVID pandemic, my interest in research papers increased somewhat, because <a href="https://sandervanderburg.blogspot.com/2020/10/transforming-disnix-models-to-graphs.html">I had to revise some of the implementations of algorithms in the Dynamic Disnix framework</a> that were based on work done by other researchers. Fortunately, the ACM temporarily opened their entire <a href="http://dl.acm.org">digital library</a> to the public for free, so that I could get access to quite a few interesting papers without having to pay.<br />
<br />
In addition to writing, reading in academic research is also very important, for the following reasons:<br />
<br />
<ul>
<li>To <strong>expand</strong> your <strong>knowledge</strong> about your research domain.</li>
<li>To put your own work into <strong>context</strong>. If you want to publish a paper about your work, having a cool idea is not enough -- you have to explain what your research contributes: what the innovation is. As a result, you need to relate to earlier work and (optionally) to studies that motivate the relevance of your work. Furthermore, you cannot just take credit for work that has already been done by others. As a consequence, you need to very carefully investigate what is out there.</li>
<li>You may have to <strong>peer review</strong> papers for acceptance in conference proceedings and journals.</li>
</ul>
<br />
Reading papers is not an easy job -- it often takes me quite a bit of time and dedication to fully grasp a paper.<br />
<br />
Moreover, when studying a research paper, you may also have to dive into related work (by examining a paper's references, and the references of those references) to fully understand it. You may have to dive several levels deep to gain enough understanding, which is not a straightforward job.<br />
<br />
In this blog post, I want to share my personal experiences with reading papers and maintaining knowledge.<br />
<br />
<h2>My personal history with reading</h2>
<br />
I have an interesting relationship with reading. Already at a young age, I used to study programming books to expand my programming knowledge.<br />
<br />
With only limited education and knowledge, I was very practically minded -- I would relentlessly study books and magazines to figure out how to get things done, but as soon as I had figured out enough to get something done, I stopped reading. I consider this a huge shortcoming of my younger self.<br />
<br />
For example, I still vividly remember how I used to program 2D side-scroller games for the <a href="https://sandervanderburg.blogspot.com/2011/07/second-computer.html">Commodore Amiga 500 using the AMOS BASIC programming language</a>. I figured out many basic concepts by reading books, magazines and the help pages, such as loading IFF/ILBM pictures as backgrounds, loading ProTracker modules as background music, using blitter objects for moving actors, using a double buffer to draw graphics smoothly, side scrolling, and responding to joystick input.<br />
<br />
Although I managed to make somewhat playable games by figuring out these concepts, the games were often plagued by bugs and very slow performance. A big reason for these failures is that I stopped reading after mastering the basics.<br />
<br />
For example, to improve performance, I should have disabled the autoback feature, which automatically swaps the physical and logical screens on every drawing command, and done the screen swap manually after all drawing instructions were completed. I knew that using a double screen buffer would take the graphics glitches away, but I never bothered to study the concepts behind it.<br />
<br />
As I grew older and entered middle school, I became more critical of myself. For example, I learned that it is essential to properly cite where you get your knowledge from rather than "pretending" that you are pulling something out of your own hat. :)<br />
<br />
Fast-forwarding to my studies at the university: reading papers from the academic literature became something that I commonly had to do. For example, I still remember the real-time systems and software architecture courses.<br />
<br />
The end goal of the former course was to write your own research paper about a subject in the real-time systems domain. In this course, I learned, in addition to real-time system concepts, how academic research works: writing a research paper is not just about writing down a cool idea (with references that you used as an inspiration), but you also need to put your work into <strong>context</strong> -- a research paper is typically based on work already done by others, and your paper typically serves as an ingredient that can be picked up by other researchers.<br />
<br />
In the latter course, I had to read quite a few papers in the software architecture domain, write summaries and discuss my findings with other students. Here, I learned that reading papers is anything but a trivial job:<br />
<br />
<ul>
<li>Papers are often <strong>densely written</strong>. As a result, I get overwhelmed with information, and it requires quite a bit of energy on my part to consume all of it.</li>
<li>The <strong>formatting</strong> of many papers is not always helpful. Papers are typically written for print, not for reading from a screen. Also, their layout is not always suitable for displaying code fragments or diagrams.</li>
<li>There is often quite a bit of <strong>unexplained jargon</strong> in a paper. To get a better understanding you need to dive deeper into the literature, such as also studying the references of the papers or books that are related to the subject.</li>
<li>Some authors frequently use <strong>multi-syllable words</strong>.</li>
<li>It is also not uncommon for authors to use <strong>logic and formulas</strong> to formalize concepts and mathematically prove their contributions. Although formalization helps with this, reading formulas is often a tough job for me -- there is typically a huge load of information and Greek symbols, and IMO these symbols are not always very helpful for relating to the concepts they represent.</li>
<li>Authors often tend to elaborately stress the <strong>caveats</strong> of their contributions, making things hard to read.</li>
</ul>
<br />
Despite having read many papers in the last sixteen years and having become better at it, reading still remains a tough job for the above reasons.<br />
<br />
In the final year of my master's, I had to do a literature survey before starting the work on my master's thesis. The first time I heard about this, I felt scared, because of my past experiences with reading papers.<br />
<br />
Fortunately, my former supervisor, Eelco Visser, was very practically minded about the process -- he wanted us to first work on the practical aspects of the department's research projects, such as <a href="https://sandervanderburg.blogspot.com/2011/05/deployment-abstractions-for-webdsl.html">WebDSL</a>: a domain-specific language for developing web applications with a rich data model, and related tools, such as <a href="https://strategoxt.org/">Stratego/XT</a> and the <a href="https://sandervanderburg.blogspot.com/2012/11/an-alternative-explaination-of-nix.html">Nix package manager</a>.<br />
<br />
After mastering the practical concepts of these projects, doing a literature survey felt much easier -- instinctively, while using these tools in practice, I became more interested in learning about the concepts behind them. Many of their underlying concepts were described in research papers published by my colleagues in the same research department. While studying these papers, I also became more motivated to dive deeper into the academic literature by studying the papers' references and searching for related subjects in the digital libraries of the ACM, IEEE, USENIX, Springer, Elsevier etc.<br />
<br />
During my PhD, reading research papers became even more important. In the first six months of my PhD, I had a very good start. I published a paper about an important aspect of my master's thesis: atomic upgrading of the static parts of a distributed system, and a paper about the overall objective of the research project that I was part of. I have to admit that, despite having these papers instantly accepted, I still had the wrong mindset -- I was basically just "selling my cool ideas" and finding support in the academic literature, rather than critically studying what was out there.<br />
<br />
For my third paper, which covers a new implementation of <a href="https://sandervanderburg.blogspot.com/2011/02/disnix-toolset-for-distributed.html">Disnix</a> (<a href="https://sandervanderburg.blogspot.com/2016/01/disnix-05-release-announcement-and-some.html">the third major revision to be precise</a>), I learned an important and hard lesson. The first version of the paper got badly rejected by the program committee because of the "advertising cool ideas" mindset that I used to have -- I had failed to study the academic literature well enough to explain what the innovation of my paper was in comparison to other deployment solutions. As a consequence, I received some very hard criticisms from the reviewers.<br />
<br />
Fortunately, they gave me good feedback. For example, I had to study papers from the Working Conference on Component Deployment. I addressed their criticisms and the revised paper got accepted. I also learned what I had to do in the future -- it is a requirement to study the academic literature well enough to explain what your contribution is and demonstrate its relevance.<br />
<br />
This rejection also changed my attitude towards how I deal with research papers. Previously, after my work for a paper was done, I would typically discard the artifacts that I no longer needed, including the papers that I used as a reference. After this rejection, I learned that I need to build my own personal knowledge base, so that for future work I can always relate to the things that I have read previously.<br />
<br />
<h2>Reading research papers</h2>
<br />
I have already explained that, for various reasons, reading research papers is anything but an easy job. For some papers, in particular the ones in my former research domain: <a href="https://sandervanderburg.blogspot.com/2011/10/software-deployment-complexity.html">software deployment</a>, I got better at it as I grew more familiar with the research domain.<br />
<br />
Nonetheless, I still sometimes find reading papers challenging. For example, studying algorithmic papers is extremely hard IMO. In 2021, I had to revise my implementations of approximation solutions for the multi-way cut and graph coloring problems in the Dynamic Disnix framework. I had to re-read the corresponding papers again. Because they were so hard to grasp, <a href="https://sandervanderburg.blogspot.com/2020/10/transforming-disnix-models-to-graphs.html">I wrote a blog post that explains how I practically applied them</a>.<br />
<br />
To fully grasp a paper, reading it a single time is often not enough. The algorithmic papers that I mentioned earlier, in particular, I had to read many times.<br />
<br />
Interestingly enough, I learned that reading papers is also a subject of study. A couple of years ago I discovered a paper titled: "<a href="http://ccr.sigcomm.org/online/files/p83-keshavA.pdf">How to Read a Paper</a>" that explains a strategy for reading research papers using a three-pass approach:<br />
<br />
<ul>
<li>First pass: bird's eye view. Study the title, abstract, introduction, headings, conclusions. A single pass is often already enough to decide whether a paper is relevant to read or not.</li>
<li>Second pass: study the paper in greater detail, but ignore the finer details, such as mathematical proofs.</li>
<li>Third pass: read everything in detail by attempting to virtually re-implement the paper.</li>
</ul>
<br />
After discovering this paper, I have been using the three-pass approach as well. I have studied most of the papers in my collection in two passes, and some of them in detail in three passes.<br />
<br />
Another thing that I discovered by accident is that, to extensively study literature, a <strong>continuous approach</strong> works better for me (e.g. reserving certain timeslots in a week) than reserving longer periods of time that consist of nothing but reading papers.<br />
<br />
Also, regularly discussing papers with your colleagues helps. During my PhD days, I did not do it that often (we had no formal "process" for it), but there were several good sessions, such as a program committee simulation organized by Arie van Deursen, the head of our research group.<br />
<br />
In this simulation, we organized a program committee meeting of the ICSE conference in which the members of the department acted as program committee members. We discussed submitted papers and voted for acceptance or rejection. Moreover, we also had to leave the room if there was a conflict of interest.<br />
<br />
I also learned that Edsger Dijkstra, a famous Dutch computer scientist, organized the ETAC (Eindhoven Tuesday Afternoon Club) and <a href="https://www.cs.utexas.edu/~EWD/ewd09xx/EWD927.PDF">ATAC (Austin Tuesday Afternoon Club)</a>, in which reading and discussing research papers was a recurring activity, among other things.<br />
<br />
<h2>Building up your personal knowledge base</h2>
<br />
As I have explained earlier, I used to throw away my downloaded papers when the work for a paper was done, but I changed that habit after that hard paper rejection.<br />
<br />
There are many good reasons to keep and organize the papers that you have read, even if they do not seem to be directly relevant to your work:<br />
<br />
<ul>
<li>As I have already explained, in addition to reading individual papers and writing your own research papers, you need to maintain a knowledge base so that you can put papers into context.</li>
<li>It is not always easy to obtain papers. Many of them are behind a paywall. Without a subscription you cannot access them, so once you have obtained them it is better to think twice before you discard them. Fortunately, open access is becoming more common, but it still remains a challenge. Arie van Deursen has written a variety of blog posts about <a href="https://avandeursen.com/tag/open-access/">open access</a>.</li>
<li>Although many papers are challenging to read, I have also come to appreciate certain research papers.</li>
</ul>
<br />
My own personal paper collection has evolved in an interesting way. In the beginning, I just used to put any paper that I obtained into a single folder called: <i>papers</i>, until it grew so large that I had to start classifying the papers.<br />
<br />
Initially, there was a one-level folder structure, consisting of categories such as: deployment, operating systems, programming languages, DSL engineering etc. At some point, the content of some of these folders grew large enough that I introduced a second-level directory structure.<br />
<br />
For example, the sub folder for my former research domain: software deployment (the process that consists of all activities to make a software system available for use) contains the largest number of papers. Currently, I have collected 168 deployment papers that I have divided over the following sub categories (a sketch of the directory layout follows the list):<br />
<br />
<ul>
<li><strong>Deployment models</strong>. Papers whose main contribution is a means to model various deployment aspects of a system, such as the structure of a system and deployment activities.</li>
<li><strong>Deployment planning</strong>. Papers whose main contribution is algorithms that decide on a suitable/optimal deployment architecture based on the functional and non-functional requirements of a system.</li>
<li><strong>Empirical studies</strong>. Papers containing empirical studies about deployment in practice.</li>
<li><strong>Execution</strong>. Papers in which the main contribution is executing deployment activities. I have also sub-categorized this folder into technology-specific solutions (e.g. solutions specific to a programming language, such as Java, or a component technology, such as CORBA) and generic solutions.</li>
<li><strong>Practice reports</strong>. Papers that report on the use of deployment technologies in practice.</li>
<li><strong>Surveys</strong>. Papers that analyse the literature and draw conclusions from it.</li>
</ul>
<br />
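To give an impression of what this looks like on disk, the sketch below shows a simplified, hypothetical excerpt of the directory layout -- the actual folder names in my collection may differ:<br />
<br />
<pre>
papers/
    deployment/
        deployment-models/
        deployment-planning/
        empirical-studies/
        execution/
            generic/
            technology-specific/
        practice-reports/
        surveys/
    dsl-engineering/
    operating-systems/
    programming-languages/
    ...
</pre>
<br />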
A hierarchical directory structure is not perfect for organizing papers -- for many papers there is an overlap between multiple sub domains in the software engineering domain. For example, deployment may also be related to a certain component technology, in service of optimizing the architecture of a system, related to other configuration management activities (versioning, status accounting, monitoring etc.) or an ingredient in integration testing. If there is an overlap, I typically look at the strongest kind of contribution that the paper makes.<br />
<br />
For example, in the deployment domain, Eelco Dolstra wrote a paper about <a href="https://research.tudelft.nl/en/publications/maximal-laziness-an-efficient-interpretation-technique-for-purely">maximal laziness</a>, an important implementation aspect of the Nix expression language. The Nix package manager is a deployment solution, but the contribution of the paper is not deployment, but making the implementation of a purely functional DSL efficient. As a result, I have categorized the paper under DSL engineering rather than deployment.<br />
<br />
The organization of my paper collection is always in motion. Sometimes I gain new insights that cause me to adjust the classifications, or, when the collection of papers for a sub domain grows, I may introduce a second-level classification.<br />
<br />
<h2>Some practical tips to get familiar with a certain research subject</h2>
<br />
So what is my recommended way to get familiar with a certain research subject in the software engineering domain?<br />
<br />
I would start by doing something practical first. <a href="https://sandervanderburg.blogspot.com/2012/01/engineering-versus-science.html">In the software engineering research domain, the goal is often to develop or examine tools</a>. Start by using these tools first and see if you can contribute to them from a practical point of view -- for example, by improving features, fixing bugs etc.<br />
<br />
Once I have mastered the practical aspects, I typically become motivated to dive into the underlying concepts by studying the papers that cover them. Then I apply the three-pass reading strategy and eventually study the references of the papers to get a better understanding.<br />
<br />
After my orientation phase has finished, the next thing I would typically look at is the conferences/venues that are directly related to the subject. For software deployment, for example, there used to be only one subject-related conference: the Working Conference on Component Deployment (which unfortunately has not been organized since 2005). It is typically a good thing to have examined all the papers of the related conferences/venues, at least with a first pass.<br />
<br />
Then a potential next step is to search for "early defining papers" in that research area. In my experience, many research papers improve on concepts pioneered by these papers, so it is IMO a good thing to know where it all started.<br />
<br />
For example, in the software deployment domain the paper: "<a href="https://ics.uci.edu/~andre/papers/T3.pdf">A Characterization Framework for Software Deployment Technologies</a>" is such an early defining paper, covering a deployment solution called "The Software Dock". The paper comes with a definition for the term: "software deployment" that is considered the canonical definition in academic research.<br />
<br />
Alternatively, the paper: "<a href="https://dl.acm.org/doi/10.1109/FOSE.2007.20">Software Deployment, Past, Present and Future</a>" is a more recent yet defining paper covering newer deployment technologies and also offers its own definition of the term software deployment.<br />
<br />
For unknown reasons, I always seem to like early defining papers in various software engineering domains. These are some of my recommendations of early defining papers in other software engineering domains:<br />
<br />
<ul>
<li>General purpose programming languages: <a href="https://dl.acm.org/doi/10.1145/320764.320766">The IBM 701 Speedcoding System</a>, <a href="https://dl.acm.org/doi/10.1145/1455567.1455599">The FORTRAN Automatic Coding System</a>, <a href="https://dl.acm.org/doi/10.1145/367177.367199">Recursive Functions of Symbolic Expressions and Their Computation by Machine, Part I</a>, <a href="https://academic.oup.com/comjnl/article/19/4/364/326695">Modified Report on the Algorithmic Language ALGOL 60</a>.</li>
<li>Domain-specific languages: <a href="https://dl.acm.org/doi/10.1145/6424.315691">Programming Pearls: Little Languages</a>, <a href="https://dl.acm.org/doi/10.1145/352029.352035">Domain-Specific Languages: An Annotated Bibliography</a></li>
<li>Programming: <a href="https://www.cs.utexas.edu/~EWD/ewd02xx/EWD215.PDF">A case against the GO TO statement</a>, <a href="https://www.cs.utexas.edu/~EWD/ewd02xx/EWD249.PDF">Notes on Structured Programming</a></li>
<li>Operating systems: <a href="https://dl.acm.org/doi/10.1145/357980.357999">The Structure of the "THE"-multiprogramming System</a>, <a href="https://dl.acm.org/doi/10.1145/800212.806497">The Multics Input/Output System</a>, <a href="https://dl.acm.org/doi/10.1145/1463891.1463915">A General-Purpose File System for Secondary Storage</a>, <a href="https://dl.acm.org/doi/10.1145/357980.358014">The UNIX Time-Sharing System</a></li>
<li>Software engineering in general: <a href="https://dl.acm.org/doi/10.5555/800253.807694">Research paradigms in computer science</a></li>
</ul>
<br />
After studying all these kinds of papers, your knowledge level should already be decent enough to find your way to study the remaining papers that are out there.<br />
<br />
<h2>Literature surveys</h2>
<br />
In addition to research papers, which need to put themselves into context, extensive literature surveys can also be quite valuable to the research community. During my PhD, I learned that it is also possible to publish a paper about a literature survey.<br />
<br />
For example, some of my former colleagues did an <a href="https://repository.tudelft.nl/islandora/object/uuid%3Afd04993a-2d0c-4eaa-920d-e135e6a3645d?collection=research">extensive and systematic literature survey in the dynamic analysis domain</a>. In addition to the results, the authors also explain their methodology, which consists of searching on keywords, looking for appropriate conferences and journals, and following the papers' references. From these results they derived an attribute framework and classified all the papers according to it.<br />
<br />
Although I am not so interested in dynamic analysis or program comprehension from a research perspective, I have kept the paper as a reference for myself, because I like the methodology.<br />
<br />
Literature surveys also exist in my former research domain, such as a <a href="https://oatao.univ-toulouse.fr/14816/1/Arcangeli_14816.pdf">survey of deployment solutions for distributed systems</a>.<br />
<br />
<h2>Conclusions</h2>
<br />
In this blog post, I have shared my experiences with reading papers and maintaining knowledge. Both are quite important in research, and you need to take them seriously.<br />
<br />
Fortunately, during my PhD I have learned a lot. In summary, my recommendations are:<br />
<br />
<ul>
<li>Archive your papers and build up a personal knowledge base</li>
<li>Start with something practical</li>
<li>Follow paper references</li>
<li>Study early defining papers</li>
<li>Find people to discuss with</li>
<li>Study continuously in small steps</li>
</ul>
<br />
Although I never did an extensive literature survey in the software deployment domain (it is not needed for submitting papers that contribute new techniques), I could probably even write a paper about the software deployment literature myself. The only problem is that I am not quite up to date with the work that has been published in the last few years, because I no longer have access to these digital libraries.<br />
<br />
Moreover, I also need to find the time and energy to do it, if I really want to :)<br />
Sander van der Burghttp://www.blogger.com/profile/12718166966821611609noreply@blogger.com0tag:blogger.com,1999:blog-1397115249631682228.post-61723485039039473082023-07-04T22:09:00.000+02:002023-07-04T22:09:57.291+02:00Using a site map for generating dynamic menus in web applicationsIn the last few weeks, I have been playing around with a couple of old and obsolete web applications that I have developed in the past with <a href="https://sandervanderburg.blogspot.com/2017/07/some-reflections-on-my-experiences-with.html">my own web framework</a>. Much of the functionality that these custom web applications offer is facilitated by my framework, but sometimes these web applications also contain significant chunks of custom code.<br />
<br />
One of the more interesting features provided by custom code is <strong>folding menus</strong> (also known as dropdown and dropright menus etc.) that provide a similar experience to the <a href="https://en.wikipedia.org/wiki/Start_menu">Windows start menu</a>. My guess is that, because the start menu experience is so familiar to many users, it remains a frequently used feature in many web applications today.<br />
<br />
When I still used to actively develop my web framework and many custom web applications (over ten years ago), implementing such a feature heavily relied on JavaScript code. For example, I used the <i>onmouseover</i> attribute on a hyperlink to invoke a JavaScript function that unfolds a panel and the <i>onmouseout</i> attribute to fold a panel again. The <i>onmouseover</i> event handler injects a menu section into the DOM using CSS absolute positioning to put it in the right position on the screen.<br />
<br />
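To give an impression of this old approach, the fragment below is a minimal sketch of such an event-handler-based menu -- the element IDs, function names and menu contents are made up for illustration and are not the actual code of my framework:<br />
<br />
<pre style="font-size: 90%; overflow-x: auto;">
/* Invoked by: <li id="page1" onmouseover="unfoldMenu('page1')" onmouseout="foldMenu('page1')"> */

function unfoldMenu(id)
{
    /* Inject a panel with the sub menu links into the DOM */
    var panel = document.createElement("div");
    panel.id = "submenu-" + id;

    /* Use absolute positioning to put the panel right under the link */
    panel.style.position = "absolute";
    panel.style.left = "0";
    panel.style.top = "2.5em";

    panel.innerHTML = '<a href="/page1/page11">Subpage 1.1</a>';
    document.getElementById(id).appendChild(panel);
}

function foldMenu(id)
{
    /* Remove the injected panel from the DOM again */
    var panel = document.getElementById("submenu-" + id);
    panel.parentNode.removeChild(panel);
}
</pre>
<br />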
I could not use the standard menu rendering functionality of my <a href="https://sandervanderburg.blogspot.com/2014/03/implementing-consistent-layouts-for.html">layout framework</a>, because it deliberately does not rely on the usage of JavaScript. As a consequence, I had to write a custom menu renderer for a web application that requires dynamic menu functionality.<br />
<br />
Despite the fact that folding menus are popular and I have implemented them as custom code, I never made them a feature of my layout framework, for the following two reasons:<br />
<br />
<ul>
<li>I want web applications built around my framework to be as <a href="https://sandervanderburg.blogspot.com/2016/03/the-nixos-project-and-deploying-systems.html"><strong>declarative</strong></a> as possible -- this means that I want to concisely express as much as possible <strong>what</strong> I want to render (a paragraph, an image, a button etc. -- this is something HTML mostly does), rather than specifying in detail <strong>how</strong> to do it (in JavaScript code). As a result, the usage of JavaScript code in my framework is minimized and non-essential.<br />
<br />
All functionality of the web applications that I developed with my framework must be accessible without JavaScript as much as possible.</li>
<li>Another property that I appreciate of web technology is the ability to <strong>degrade gracefully</strong>: the most basic and primary purpose of web applications is to provide information as <strong>text</strong>.<br />
<br />
Because this property is so important, many non-textual elements, such as an image (<i>img</i> element), provide fallbacks (such as an <i>alt</i> attribute) that simply renders alternative text when graphics capabilities are absent. As a result, it is possible to use more primitive browsers (such as text-oriented browsers) or alternative applications to consume information, such as a text-to-speech system.<br />
<br />
When essential functionality is only exposed as JavaScript code (which more primitive browsers cannot interpret), this property is lost.</li>
</ul>
<br />
Recently, I have discovered that there is a way to implement folding menus that does not rely on the usage of JavaScript.<br />
<br />
Moreover, there is also another kind of dynamic menu that has become universally accepted -- the <strong>mobile navigation</strong> menu (or <a href="https://en.wikipedia.org/wiki/Hamburger_button">hamburger menu</a>) making navigation convenient on smaller screens, such as mobile devices.<br />
<br />
Because these two types of dynamic menus have become so common, I want to facilitate the implementation of such dynamic menus in my layout framework.<br />
<br />
I have found an interesting way to make such use cases possible while retaining the ability to render text and degrade gracefully -- we can use an HTML representation of a <strong>site map</strong> consisting of a root hyperlink and a nested unordered list as a basis ingredient.<br />
<br />
In this blog post, I will explain how these use cases can be implemented.<br />
<br />
<h2>The site map feature</h2>
<br />
As already explained, the basis for implementing these dynamic menus is a textual representation of a site map. Generating site maps is a feature that is already supported by the layout framework:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjNZKAFc_QGgTEKlB3BQIi1FdUDihIsAgad636lhdGTrBYDqtRn51oMWtU21RYZ_c28zDqO7PBOscVBlRaYEBxD8fKHh86m18n6f0XKLoeQD7r9JO5BbLX_hKsmlo-IjWiX2EbkalKd8mB0CzawwMQqRmbWgi-ToRn4_vuF1Y4PH-Y_yBIaEDoXlWUCgWHp/s1578/sitemap-feature.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="1256" data-original-width="1578" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjNZKAFc_QGgTEKlB3BQIi1FdUDihIsAgad636lhdGTrBYDqtRn51oMWtU21RYZ_c28zDqO7PBOscVBlRaYEBxD8fKHh86m18n6f0XKLoeQD7r9JO5BbLX_hKsmlo-IjWiX2EbkalKd8mB0CzawwMQqRmbWgi-ToRn4_vuF1Y4PH-Y_yBIaEDoXlWUCgWHp/s600/sitemap-feature.png"/></a></div>
<br />
The above screenshot shows an example page that renders a site map of the entire example web application. In HTML, the site map portion has the following structure:<br />
<br />
<pre style="font-size: 90%; overflow-x: auto;">
<a href="/examples/simple/index.php">Home</a>
<ul>
<li>
<a href="/examples/simple/index.php/home">Home</a>
</li>
<li>
<a href="/examples/simple/index.php/page1">Page 1</a>
<ul>
<li>
<a href="/examples/simple/index.php/page1/page11">Subpage 1.1</a>
</li>
<li>
<a href="/examples/simple/index.php/page1/page12">Subpage 1.2</a>
</li>
</ul>
</li>
<li>
<a href="/examples/simple/index.php/page2">Page 2</a>
...
</li>
...
</ul>
</pre>
<br />
The site map, shown in the screenshot and code fragment above, consists of three kinds of links:<br />
<br />
<ul>
<li>On top, the <strong>root link</strong> is displayed that brings the user to the entry page of the web application.</li>
<li>The <strong>unordered list</strong> displays links to all the pages visible in the main menu section that are reachable from the entry page.</li>
<li>The <strong>nested unordered list</strong> displays links to all the pages visible in the sub menu section that are reachable from the selected sub page in the main menu.</li>
</ul>
<br />
With a few simple modifications to my layout framework, I can use a site map as an alternative type of menu section:<br />
<br />
<ul>
<li>I have extended the site map generator with the ability to mark selected sub pages as <strong>active</strong>, similar to links in menu sections. By adding the <i>active</i> CSS class as an attribute to a hyperlink, a link gets marked as active.</li>
<li>I have introduced a <i>SiteMapSection</i> to the layout framework that can be used as a replacement for a <i>MenuSection</i>. A <i>MenuSection</i> displays reachable pages as hyperlinks from a selected page on one level in the page hierarchy, whereas a <i>SiteMapSection</i> renders the selected page as a root link together with all its visible sub pages and transitive sub pages.</li>
</ul>
<br />
With the following model of a layout:<br />
<br />
<pre>
$application = new Application(
    /* Title */
    "Site map menu website",

    /* CSS stylesheets */
    array("default.css"),

    /* Sections */
    array(
        "header" => new StaticSection("header.php"),
        "menu" => new SiteMapSection(0),
        "contents" => new ContentsSection(true)
    ),
    ...
);
</pre>
<br />
We may render an application with pages that have the following look:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQFTXCTfxjBXfiDTT0Nlol8WYSRN8l3eEpPEvAHVNWwuVR2TZAM_M5EmI-sPp3Bby3BoEuCPzDkrWPOZUNnQ8adlJzQkQr-OQdljIcnHuKAXCnONIuNxE6m5K7jxrM0zgNfzHWCaJvi3hvPB4lXnuxWfu1ttwaRL2vQ2kbuH5N9wpj8idLIyWVXzF3YNZu/s1578/rawsitemapsection.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="1257" data-original-width="1578" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQFTXCTfxjBXfiDTT0Nlol8WYSRN8l3eEpPEvAHVNWwuVR2TZAM_M5EmI-sPp3Bby3BoEuCPzDkrWPOZUNnQ8adlJzQkQr-OQdljIcnHuKAXCnONIuNxE6m5K7jxrM0zgNfzHWCaJvi3hvPB4lXnuxWfu1ttwaRL2vQ2kbuH5N9wpj8idLIyWVXzF3YNZu/s600/rawsitemapsection.png"/></a></div>
<br />
As can be seen in the above screenshot and code fragment, the application layout defines three kinds of sections: a <strong>header</strong> (a static section displaying a logo), a <strong>menu</strong> (displaying links to sub pages) and a <strong>contents</strong> section that displays the content based on the sub page that was selected by the user (in the menu or by opening a URL).<br />
<br />
The menu section is displayed as a site map. This site map will be used as the basis for the implementation of the dynamic menus that I have described earlier in this blog post.<br />
<br />
<h2>Implementing a folding menu</h2>
<br />
Turning a site map into a folding menu, by only using HTML and CSS, is a relatively straightforward process. To explain the concepts, I can use the following trivial HTML page as a template:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEix75daVI1_Tboxq5VvO5GStv-SPGDnUPwkQGPifLDxT0wEmHh2CE0UU8QBcnsUSi41OL_HopZfRkwoIgEXRUhjO9RPaVYPuRkftw1ZdCJBRfl1nDSDs6_wLLc8C5IY16ZLzqQxplbhXZ7qEBwk7tUTSm9TK1XglyXtDcFt22YVrsJTPmbs3sSE5PwYDldD/s715/foldingmenu-step1.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="695" data-original-width="715" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEix75daVI1_Tboxq5VvO5GStv-SPGDnUPwkQGPifLDxT0wEmHh2CE0UU8QBcnsUSi41OL_HopZfRkwoIgEXRUhjO9RPaVYPuRkftw1ZdCJBRfl1nDSDs6_wLLc8C5IY16ZLzqQxplbhXZ7qEBwk7tUTSm9TK1XglyXtDcFt22YVrsJTPmbs3sSE5PwYDldD/s600/foldingmenu-step1.png"/></a></div>
<br />
The above page only contains a root link and nested unordered list representing a site map.<br />
<br />
In CSS, we can hide the root link and the nested unordered lists by default with the following rules:<br />
<br />
<pre>
/* This rule hides the root link */
body > a
{
    display: none;
}

/* This rule hides nested unordered lists */
ul li ul
{
    display: none;
}
</pre>
<br />
resulting in the following page:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi6uhCgFYtJ6D_jtgvsHsCLcrK6q-Egz29xMgGToeRkjZADzQZZAW9VBDzVqrxz49hVvtKaJVyWMFul1JfSOmu_7GUCnWhR2R6bTAzo9Y0pczxCj4t-GlEDS4_zArOKwvHns8dlfJnkYubXfsa-sznwGq9PpnMze3OU-1kcIwTuUp7oyIpzce91pNy1lG2H/s715/foldingmenu-step2.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="695" data-original-width="715" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi6uhCgFYtJ6D_jtgvsHsCLcrK6q-Egz29xMgGToeRkjZADzQZZAW9VBDzVqrxz49hVvtKaJVyWMFul1JfSOmu_7GUCnWhR2R6bTAzo9Y0pczxCj4t-GlEDS4_zArOKwvHns8dlfJnkYubXfsa-sznwGq9PpnMze3OU-1kcIwTuUp7oyIpzce91pNy1lG2H/s600/foldingmenu-step2.png"/></a></div>
<br />
With the following rule, we can make a nested unordered list visible when a user hovers over the surrounding list item:<br />
<br />
<pre>
ul li:hover ul
{
    display: block;
}
</pre>
<br />
Resulting in a web page that behaves as follows:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiEWaXVGMysc9UGs7NZLLO4TejNqYP1uxfiAMuVWm9cSvP1EnbRH6MTNBa4mZozxA-fEyHR5klrW0BqFRV4oXxy6Of0EKB2umkWvJEs0XDfVuztVh6OQ-OGRzJGhmWsKkX-2c-mbVYfRAKTHZR7pbTE69_2mJz3GnPXXblCr6b1758_fRQUnQ20Uap5QO0W/s715/foldingmenu-step3.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="695" data-original-width="715" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiEWaXVGMysc9UGs7NZLLO4TejNqYP1uxfiAMuVWm9cSvP1EnbRH6MTNBa4mZozxA-fEyHR5klrW0BqFRV4oXxy6Of0EKB2umkWvJEs0XDfVuztVh6OQ-OGRzJGhmWsKkX-2c-mbVYfRAKTHZR7pbTE69_2mJz3GnPXXblCr6b1758_fRQUnQ20Uap5QO0W/s600/foldingmenu-step3.png"/></a></div>
<br />
As can be seen, the unordered list that is placed under the <i>Page 2</i> link became visible because the user hovers over the surrounding list item.<br />
<br />
I can make the menu a bit more fancy if I want to. For example, I can remove the bullet points with the following CSS rule:<br />
<br />
<pre>
ul
{
    list-style-type: none;
    margin: 0;
    padding: 0;
}
</pre>
<br />
I can add borders around the list items to make them appear as buttons:<br />
<br />
<pre>
ul li
{
    border-style: solid;
    border-width: 1px;
    padding: 0.5em;
}
</pre>
<br />
I can horizontally align the buttons by adopting a <a href="https://css-tricks.com/snippets/css/a-guide-to-flexbox">flexbox layout</a> using the row direction property:<br />
<br />
<pre>
ul
{
    display: flex;
    flex-direction: row;
}
</pre>
<br />
I can position the sub menus right under the buttons of the main menu by using a combination of relative and absolute positioning:<br />
<br />
<pre>
ul li
{
    position: relative;
}

ul li ul
{
    position: absolute;
    top: 2.5em;
    left: 0;
}
</pre>
<br />
Resulting in a menu with the following behaviour:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi2kP0V6rhixj1j_cWNaiJ5PRUx9-u0X6PYqY-_aovTyCBljiw6GcBnUkSgI8vh-oHeGVki5u-6LQvhOv9DWr0cSu0u0alUxlVYmxqlU5Op2oIhashM6-uZe33OOxvnkCOh1wnfaiV_B-zXL6YZkXS-WmCCmXKd6_16FQjuWemJ0KHTaa_DinH8RvVVFHWm/s715/foldingmenu-step4.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="695" data-original-width="715" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi2kP0V6rhixj1j_cWNaiJ5PRUx9-u0X6PYqY-_aovTyCBljiw6GcBnUkSgI8vh-oHeGVki5u-6LQvhOv9DWr0cSu0u0alUxlVYmxqlU5Op2oIhashM6-uZe33OOxvnkCOh1wnfaiV_B-zXL6YZkXS-WmCCmXKd6_16FQjuWemJ0KHTaa_DinH8RvVVFHWm/s600/foldingmenu-step4.png"/></a></div>
<br />
As can be seen, the trivial example application provides a usable folding menu thanks to the CSS rules that I have described.<br />
<br />
In my example application bundled with the layout framework, I have applied all the rules shown above and combined them with the already existing CSS rules, resulting in a web application that behaves as follows:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhwDE0ipameFT8X7LqagQfvX0gV6IFaRyG51B8ZLJGLfEFWUzZawfKRJo5pf3RGl5A1Yq99DmpgLxHP5muRgutikrhWSmvM6bD8gDKU1KBti5NSD1Ouaw3YpiW4N47y0oWR15cE9OQwxw6wSgixNk65fc-wKcdYkvioSn5R1RT4zgoW_r00BxgJfdmljdWM/s1578/sitemapapp-unfold.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="1257" data-original-width="1578" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhwDE0ipameFT8X7LqagQfvX0gV6IFaRyG51B8ZLJGLfEFWUzZawfKRJo5pf3RGl5A1Yq99DmpgLxHP5muRgutikrhWSmvM6bD8gDKU1KBti5NSD1Ouaw3YpiW4N47y0oWR15cE9OQwxw6wSgixNk65fc-wKcdYkvioSn5R1RT4zgoW_r00BxgJfdmljdWM/s600/sitemapapp-unfold.png"/></a></div>
<br />
<h2>Displaying a mobile navigation menu</h2>
<br />
As explained in the introduction, another type of dynamic menu that has been universally accepted is the mobile navigation menu (also known as a hamburger menu). Implementing such a menu, despite its popularity, is challenging IMHO.<br />
<br />
Although there seem to be ways to implement such a menu without JavaScript (such as this <a href="https://blog.logrocket.com/create-responsive-mobile-menu-with-css-no-javascript/">example using a checkbox</a>), the only proper way to do it IMO is still to use JavaScript. Some browsers have trouble with such HTML+CSS-only implementations, and they require the use of an HTML element (an <i>input</i> element) that was not designed for that purpose.<br />
<br />
In my example web application, I have implemented a custom JavaScript module that dynamically transforms a site map (that may already have been displayed as a folding menu) into a mobile navigation menu by performing the following steps (a simplified sketch follows the list):<br />
<br />
<ul>
<li>We query the root link of the site map and transform it into a mobile navigation menu button by replacing the text of the root link by an icon image. Clicking on the menu button makes the navigation menu visible or invisible.</li>
<li>The first level sub menu becomes visible by adding the CSS class: <i>navmenu_active</i> to the unordered list.</li>
<li>The menu button becomes active by adding the CSS class: <i>navmenu_icon_active</i> to the image of the root link.</li>
<li>Nested menus can be unfolded or folded. The JavaScript code adds fold icons to each list item of the unordered lists that embed a nested unordered list.</li>
<li>Clicking on the fold icon makes the nested unordered list visible or invisible.</li>
<li>A nested unordered list becomes visible by adding the CSS class: <i>navsubmenu_active</i> to the unordered list.</li>
<li>A fold button becomes active by adding the CSS class: <i>navmenu_unfold_active</i> to the fold icon image.</li>
</ul>
<br />
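The following fragment is a simplified sketch of how the first steps could be implemented -- it assumes that the site map resides in a div with id: <i>menu</i>, and the icon path is fictional; the actual module in my framework is more elaborate:<br />
<br />
<pre style="font-size: 90%; overflow-x: auto;">
function initNavMenu()
{
    /* Query the root link of the site map and replace its text by an icon image */
    var rootLink = document.querySelector("#menu > a");
    var icon = document.createElement("img");
    icon.src = "image/menu-icon.png"; /* fictional icon path */
    icon.alt = "Menu";
    rootLink.textContent = "";
    rootLink.appendChild(icon);

    /* Clicking the menu button toggles the visibility of the first level sub menu */
    rootLink.addEventListener("click", function(event)
    {
        event.preventDefault();
        document.querySelector("#menu > ul").classList.toggle("navmenu_active");
        icon.classList.toggle("navmenu_icon_active");
    });
}
</pre>
<br />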
It was quite a challenge to implement this JavaScript module, but it does the trick. Moreover, the basis remains a simple HTML-rendered site map that can still be used in text-oriented browsers.<br />
<br />
The result of using this JavaScript module is the following navigation menu that has unfoldable sub menus:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi3tKiqK2tnVJO9c7aEieIr1IL9P2Or1yahjHQILTDm1UJI_72BgBQpy2O3dzk7jj9HEOOWyOIq1PNk4PekRjPKOHgR0h7cU11Eta_HLE7YgigQhDGO56f-UnBYXD3ip3eOx3FQQZPIOKqiXRRyiyEvnTd2PqhqyQRDaME-URICiXRNkR-F-1Yuci2ETEVi/s1097/sitemapapp-navmenu.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" height="600" data-original-height="1097" data-original-width="715" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi3tKiqK2tnVJO9c7aEieIr1IL9P2Or1yahjHQILTDm1UJI_72BgBQpy2O3dzk7jj9HEOOWyOIq1PNk4PekRjPKOHgR0h7cU11Eta_HLE7YgigQhDGO56f-UnBYXD3ip3eOx3FQQZPIOKqiXRRyiyEvnTd2PqhqyQRDaME-URICiXRNkR-F-1Yuci2ETEVi/s600/sitemapapp-navmenu.png"/></a></div>
<br />
<h2>Concluding remarks</h2>
<br />
In this blog post, I have explained a new feature addition to my layout framework: the <i>SiteMapSection</i> that can be used to render menu sections as site maps. Site maps can be used as a basis to implement dynamic menus, such as folding menus and mobile navigation menus.<br />
<br />
The benefit of using a site map as a basis ingredient is that a web page still remains useful in its most primitive form: text. As a result, I retain two important requirements of my web framework: declarativity (because a nested unordered list describes concisely what I want) and the ability to degrade gracefully (because it stays useful when it is rendered as text).<br />
<br />
Developing folding/navigation menus in the way I described is not something new. There are plenty of examples on the web that show how such features can be developed, such as these W3Schools <a href="https://www.w3schools.com/howto/howto_css_dropdown.asp">dropdown menu</a> and <a href="https://www.w3schools.com/howto/howto_js_mobile_navbar.asp">mobile navigation menu</a> examples.<br />
<br />
Compared to many existing solutions, my approach is somewhat puristic -- I do not abuse HTML elements (such as a check box), I do not rely on using helper elements (such as <i>div</i>s and <i>span</i>s) or helper CSS classes/ids. The only exception is to support dynamic features that are not part of HTML, such as "active links" and the folding/unfolding buttons of the mobile navigation menu.<br />
<br />
Although it has become possible to use my framework to implement mobile navigation menus, I still find it sad that I have to rely on JavaScript code to do it properly.<br />
<br />
Folding menus, despite their popularity, are nice, but basic one-level menus (that only display a collection of links/buttons to sub pages) are in my opinion fine too, and much simpler -- the same implementation is usable on desktops, mobile devices and text-oriented browsers.<br />
<br />
With folding menus, I have to test multiple resolutions and devices to check whether they provide the right user experience. Folding menus are useless on mobile devices -- you cannot separately trigger a hover event without generating a click event, making it impossible to unfold a sub menu and peek at what is inside.<br />
<br />
When it is also desired to provide an optimal mobile device experience, you also need to implement an alternative menu. This requirement makes the implementation of a web application significantly more complex.<br />
<br />
<h2>Availability</h2>
<br />
The <i>SiteMapSection</i> has become a new feature of the Java, PHP and JavaScript implementations of my layout framework and can be obtained from <a href="https://github.com/svanderburg">my GitHub page</a>.<br />
<br />
In addition, I have added a <i>sitemapmenu</i> example web application that displays a site map section in multiple ways:<br />
<br />
<ul>
<li>In text mode, it is just displayed as a (textual) site map.</li>
<li>In graphics mode, when the screen width is 1024 pixels or greater, it displays a horizontal folding menu.</li>
<li>In graphics mode, when the screen width is smaller than 1024 pixels and JavaScript is disabled, it displays a vertical folding menu.</li>
<li>In graphics mode, when the screen width is smaller than 1024 pixels and JavaScript is enabled, it displays a mobile navigation menu.</li>
</ul>
<br />
Sander van der Burghttp://www.blogger.com/profile/12718166966821611609noreply@blogger.com0tag:blogger.com,1999:blog-1397115249631682228.post-91141083014531663742022-12-30T20:37:00.000+01:002022-12-30T20:37:37.824+01:00Blog reflection over 2022Today, it is my blog's anniversary. As usual, this is a nice opportunity to reflect over the last year.<br />
<br />
<h2>Eelco Visser</h2>
<br />
The most shocking event of this year is the unfortunate passing of my former PhD supervisor: <a href="https://eelcovisser.org">Eelco Visser</a>. I still find it hard to believe that he is gone.<br />
<br />
Although I left the university for quite some time now, the things I learned while I was employed at the university (such as having all these nice technical discussions with him) still have a profound impact on me today. Moreover, without his suggestion this blog would probably not exist.<br />
<br />
Because the original purpose of my blog was to augment my research with extra details and practical information, <a href="https://sandervanderburg.blogspot.com/2022/04/in-memoriam-eelco-visser-1966-2022.html">I wrote a blog post with some personal anecdotes about him</a>.<br />
<br />
<h2>COVID-19 pandemic</h2>
<br />
In <a href="https://sandervanderburg.blogspot.com/2021/12/11th-annual-blog-reflection.html">my previous blog reflection</a>, I have explained that we were in the <a href="https://www.rivm.nl/nieuws/omikron-subvariant-BA.2-dominant-daling-ziekenhuisopnames">third-wave of the COVID pandemic caused by the even more contagious Omicron variant of the COVID-19 virus</a>. Fortunately, it turned out that, despite being more contagious, this variant is less hostile than the previous Delta variant.<br />
<br />
Several weeks later, the situation got under control and things were opened up again. The situation remained pretty stable afterwards. This year, it was possible for me to travel again and to go to physical concerts, which feels a bit weird after staying home for two whole years.<br />
<br />
The COVID-19 virus is not gone, but the situation is under control in Western Europe and the United States. There have not been any lockdowns or serious capacity problems in the hospitals.<br />
<br />
When the COVID-19 pandemic started, my employer: Mendix adopted a work-from-home-first culture. By default, people work from home and if they need to go to the office (for example, to collaborate) they need to make a desk reservation.<br />
<br />
As of today, I am still working from home most of the time. I typically visit the office only once a week, and I use that time to collaborate with people. On the remaining days, I focus on development work as much as possible.<br />
<br />
I have to admit that I like the quietness at home -- not everything can be done at home, but for programming tasks I need to think, and for thinking I need silence. Before the COVID-19 pandemic started, the office was typically very noisy, which sometimes made it difficult for me to focus.<br />
<br />
<h2>Learning modern JavaScript features</h2>
<br />
I used to work intensively with JavaScript at my previous employer: Conference Compass, but since I joined Mendix I have mostly been using different kinds of technologies. During my CC days, I was still mostly writing old-fashioned (ES5) JavaScript code, and I still wanted to familiarise myself with modern ES6 features.<br />
<br />
One of the challenging aspects of using JavaScript is asynchronous programming -- making sure that the main thread of your JavaScript application never blocks too long (so that it can handle multiple connections or input events) and keeping your code structured.<br />
<br />
With old-fashioned ES5 JavaScript code, I had to rely on software abstractions to keep my code structured, but with the addition of Promises/A+ and the async/await concepts to the core of the JavaScript language, this can be done in a much cleaner way, without using any custom software abstractions.<br />
<br />
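For example, the following trivial sketch (with made-up file names) shows how the async/await concepts keep sequential, non-blocking I/O code structured without any custom abstractions:<br />
<br />
<pre>
const fs = require("fs").promises;

/* Reads two files in sequence without blocking the main thread */
async function concatFiles(path1, path2)
{
    const first = await fs.readFile(path1, "utf8");
    const second = await fs.readFile(path2, "utf8");
    return first + second;
}

concatFiles("chapter1.txt", "chapter2.txt")
    .then(text => console.log(text))
    .catch(err => console.error(err));
</pre>
<br />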
In 2014, I wrote a blog post about the problematic synchronous programming concepts in JavaScript and their equivalent asynchronous function abstractions. This year, <a href="https://sandervanderburg.blogspot.com/2022/01/structured-asynchronous-programming.html">I wrote a follow-up blog post about the ES6 concepts that I should use</a> (rather than software abstractions).<br />
<br />
To motivate myself to learn about ES6 concepts, I needed a practical use case -- <a href="https://sandervanderburg.blogspot.com/2022/02/a-layout-framework-experiment-in.html">I have ported the layout component of my web framework</a> (for which a Java and PHP version already exist) to JavaScript using modern ES6 features, such as <i>async/await</i>, classes and modules.<br />
<br />
An interesting property of the JavaScript version is that it can be used both on the server-side (as a Node.js application) and client-side (directly in the browser by dynamically updating the DOM). The Java and PHP versions only work server-side.<br />
<br />
<h2>Fun projects</h2>
<br />
In earlier blog reflections I have also decided to spend more time on useless fun projects.<br />
<br />
In the summer of 2021, when I decided not to do any traveling, I had lots of time left to tinker with all kinds of weird things. One of my hobby projects was to play around with my custom maps for Duke3D and Shadow Warrior that I created while I was still a teenager.<br />
<br />
While playing with these maps, I noticed a number of interesting commonalities and differences between Duke3D and Shadow Warrior.<br />
<br />
Although both games use the same game engine: the BUILD-engine, their game mechanics are completely different. As an exercise, <a href="https://sandervanderburg.blogspot.com/2022/08/porting-duke3d-map-to-shadow-warrior.html">I have ported one of my Duke3D maps to Shadow Warrior and wrote a blog post about the process</a>, including a description of some of their different game mechanics.<br />
<br />
Although I did the majority of the work already back in 2021, I have found some remaining free time in 2022 to finally finish the project.<br />
<br />
<h2>Web framework improvements</h2>
<br />
This year, I have also intensively worked on improving several aspects of my own web framework. My custom web framework is an old project that I started in 2004 and many parts of it have been rewritten several times.<br />
<br />
I am not actively working on it anymore, but once in a while I still do some development work, because it is still in use by a couple of web sites, including the web site of my musical society.<br />
<br />
One of my goals is to improve the user experience of the musical society web site on mobile devices, such as phones and tablets. This particular area had already been problematic for years. Despite promising all kinds of people to fix this, it took me several years to actually take that step. :-)<br />
<br />
To improve the user experience for mobile devices, I wanted to convert the layout to a flexbox layout, for which I needed to extend my layout framework, because it did not generate nested <i>div</i>s.<br />
<br />
I have managed to improve my layout framework to support flexbox layouts. In addition, I have also made many additional improvements. I wrote a blog post with <a href="https://sandervanderburg.blogspot.com/2022/12/a-summary-of-my-layout-framework.html">a summary of all my feature changes</a>.<br />
<br />
<h2>Nix-related work</h2>
<br />
In 2022, I also did Nix-related work, but I have not written any Nix-related blog posts this year. Moreover, 2022 was also the first time since the pandemic that a physical <a href="http://2022.nixcon.org">NixCon</a> was held -- unfortunately, I decided not to attend it.<br />
<br />
The fact that I did not write any Nix-related blog posts is quite exceptional. Since 2010, the majority of my blog posts have been Nix-related and about software deployment challenges in general. So far, it has never happened that an entire year passed without any Nix-related blog posts. I think I need to explain a thing or two about what has happened.<br />
<br />
This year, it was very difficult for me to find the energy to undertake any major Nix developments. Several things have contributed to that, but the biggest take-away is that I have to find the right balance.<br />
<br />
The reason why I got so extremely out of balance is that I do most of my Nix-related work in my spare time. Moreover, my primary motivation to do Nix-related work is because of idealistic reasons -- I still genuinely believe that we can automate the deployment of complex systems in a much better way than the conventional tools that people currently use.<br />
<br />
Some of the work for Nix and NixOS is relatively straightforward -- sometimes we need to package new software, sometimes a package or NixOS service needs to be updated, or sometimes broken features need to be fixed or improved. This process is often challenging, but still relatively straightforward.<br />
<br />
There are also quite a few major challenges in the Nix project for which no trivial solutions exist. These are problem areas that cannot be solved with quick fixes and require fundamental redesigns. Solving these fundamental problems is quite challenging and typically requires me to dedicate a significant amount of my free time.<br />
<br />
Unfortunately, due to the fact that most of my work is done in my spare time, and I cannot multi-task, I can only work on one major problem area at a time.<br />
<br />
For example, I am quite happy with my last major development project: the <a href="https://github.com/svanderburg/nix-processmgmt">Nix process management framework</a>. It has all the features implemented that I want/need to consistently eat my own dogfood. It is IMHO a pretty decent solution for use cases for which most conventional developers would normally use Docker/docker-compose.<br />
<br />
Unfortunately, to reach all my objectives I had to pay a huge price -- I published the first implementation of the process management framework already in 2019, and all my major objectives were reached in the middle of 2021. As a consequence, I spent nearly two years of my spare time only working on the implementation of this framework, without having the option to switch to something else. For the first six months, I remained motivated, but slowly I ran into motivational problems.<br />
<br />
In this two-year time period, lots of problems appeared in other projects that I used to be involved in. I could not get these projects fixed, because they also ran into fundamental problems requiring major redesigns/revisions. This resulted in a number of problems with members of the Nix community.<br />
<br />
As a result, I got the feeling that I had lost control. Moreover, doing any Nix-related work also gave (and to some extent still gives) me a lot of negative energy.<br />
<br />
Next year, I intend to return and I will look into addressing my issues. I am thinking about the following steps:<br />
<br />
<ul>
<li><strong>Leaving the solution of some major problem areas to others</strong>. One such area is NPM package deployment with Nix. node2nix was probably a great tool in combination with older versions of NPM, but its design already reached the boundaries of what is possible years ago.<br />
<br />
As a result, node2nix does not support the new features of NPM and does not solve the package scalability issues in Nixpkgs. It is also not possible to properly support these use cases by implementing "quick fixes". To cope with these major challenges and keep the solution maintainable, a new design is needed.<br />
<br />
I have already explained my ideas on the Discourse mailing list and outlined what such a new design could look like. Fortunately, there are already some good initiatives started to address these challenges.</li>
<li><strong>Building prototypes and integrate the ideas into Nixpkgs</strong> rather than starting an independent project/tool that attracts a sub community.<br />
<br />
I have implemented the Nix process management framework as a prototype with the idea to show how certain concepts work, rather than advertising the project as a new solution.<br />
<br />
My goal is to write an RFC to make sure that these ideas get integrated into the upstream Nixpkgs, so that it can be maintained by the community and everybody can benefit from it.<br />
<br />
The only thing I still need to do is write that RFC. This should probably be one of my top priorities next year.</li>
<li><strong>Move certain things out of Nixpkgs</strong>. The Nixpkgs project is a huge project with several thousands of packages and services, making it quite a challenge to maintain and implement fundamental changes.<br />
<br />
One of the side effects of its scale is that the Nixpkgs issue tracker is as good as useless. There are thousands of open issues and it is impossible to properly track the status of individual aspects in the Nixpkgs repository.<br />
<br />
Thanks to <a href="https://nixos.wiki/wiki/Flakes">Nix flakes</a>, which unfortunately is still an experimental feature, we should be able to move certain non-essential things out of Nixpkgs and conveniently deploy them from external repositories. I have some things that I could move out of the Nixpkgs repository when flakes have become a mainstream feature.</li>
<li><strong>Better communication about the context in which something is developed</strong>. When I was younger, I always used to advertise a new project as the next great thing that everybody should use -- these days, I am more conservative about the state of my projects and I typically try to warn people upfront that something is just a prototype and not yet ready for production use.</li>
</ul>
<br />
<h2>Blog posts</h2>
<br />
In my previous reflection blog posts, I always used to reflect over my overall top 10 of most popular blog posts. There are no serious changes compared to last year, so I will not elaborate on them. The fact that I have not been very active on my blog this year has probably contributed to that.<br />
<br />
<h2>Concluding remarks</h2>
<br />
Next year, I will look into addressing my issues with Nix development. I hope to return to my software deployment/Nix-related work!<br />
<br />
The final thing I would like to say is:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhf_7mqysBYTZlQrPdVjn8dUqwQxcOUJ5CPRjtr81UHM3mhBn-Hr9wl9BbsMPGwvvkXy7QhgLn7f1okrIxo1dlps-U_WekueJJPcBTmKmuedwl82xwiSSJqp9Rr6KkiMvvpeh_OTv4XvH584xHpa8MVdz4hth9DISGQRDLaSkFDFHNNLG06_ibhBtRObQ/s640/p0b977q7.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="520" data-original-height="360" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhf_7mqysBYTZlQrPdVjn8dUqwQxcOUJ5CPRjtr81UHM3mhBn-Hr9wl9BbsMPGwvvkXy7QhgLn7f1okrIxo1dlps-U_WekueJJPcBTmKmuedwl82xwiSSJqp9Rr6KkiMvvpeh_OTv4XvH584xHpa8MVdz4hth9DISGQRDLaSkFDFHNNLG06_ibhBtRObQ/s600/p0b977q7.jpg"/></a></div>
<br />
HAPPY NEW YEAR!!!<br />
<br />Sander van der Burghttp://www.blogger.com/profile/12718166966821611609noreply@blogger.com0tag:blogger.com,1999:blog-1397115249631682228.post-92026966379542323162022-12-30T20:17:00.003+01:002023-07-03T18:37:30.421+02:00A summary of my layout framework improvementsIt has been quiet for a while on my blog. In the last couple of months, I have been improving <a href="https://sandervanderburg.blogspot.com/2017/07/some-reflections-on-my-experiences-with.html">my personal web application framework</a>, after several years of inactivity.<br />
<br />
The reason why I became motivated to work on it again, is because I wanted to improve the website of the musical society that I am a member of. This website is still one of the few consumers of my personal web framework.<br />
<br />
One of the areas for improvement is the user experience on mobile devices, such as phones and tablets.<br />
<br />
To make these improvements possible, I wanted to get rid of complex legacy functionality, such as the "<a href="https://css-tricks.com/fluid-width-equal-height-columns/">One True Layout</a>" method, which heavily relies on all kinds of interesting hacks that are no longer required in modern browsers. Instead, I wanted to use a <a href="https://css-tricks.com/snippets/css/a-guide-to-flexbox">flexbox layout</a> that is much more suitable for implementing the layout aspects that I need.<br />
<br />
As I have already explained in previous blog posts, my web application framework is not monolithic -- it consists of multiple components each addressing a specific concern. These components can be used and deployed independently.<br />
<br />
The most well-explored component is the <a href="https://sandervanderburg.blogspot.com/2014/03/implementing-consistent-layouts-for.html">layout framework</a> that addresses the layout concern. It generates pages from a high-level <strong>application model</strong> that defines common layout aspects of an application and the pages of which an application consists including their unique content parts.<br />
<br />
I have created multiple implementations of this framework in three different programming languages: Java, PHP, and JavaScript.<br />
<br />
In this blog post, I will give a summary of all the recent improvements that I made to the layout framework.<br />
<br />
<h2>Background</h2>
<br />
As I have already explained in previous blog posts, the layout framework is very straightforward to use. As a developer, you need to specify a high-level <strong>application model</strong> and invoke a <strong>view function</strong> to render a sub page belonging to the application. The layout framework uses the path components in a URL to determine which sub page has been selected.<br />
<br />
The following code fragment shows an application model for a trivial test web application:<br />
<br />
<pre style="overflow: auto;">
use SBLayout\Model\Application;
use SBLayout\Model\Page\StaticContentPage;
use SBLayout\Model\Page\Content\Contents;
use SBLayout\Model\Section\ContentsSection;
use SBLayout\Model\Section\MenuSection;
use SBLayout\Model\Section\StaticSection;

$application = new Application(
    /* Title */
    "Simple test website",

    /* CSS stylesheets */
    array("default.css"),

    /* Sections */
    array(
        "header" => new StaticSection("header.php"),
        "menu" => new MenuSection(0),
        "contents" => new ContentsSection(true),
    ),

    /* Pages */
    new StaticContentPage("Home", new Contents("home.php"), array(
        "page1" => new StaticContentPage("Page 1", new Contents("page1.php")),
        "page2" => new StaticContentPage("Page 2", new Contents("page2.php")),
        "page3" => new StaticContentPage("Page 3", new Contents("page3.php"))
    ))
);
</pre>
<br />
The above application model captures the following application layout properties:<br />
<br />
<ul>
<li>The <strong>title</strong> of the web application is: "Simple test website" and is displayed as part of the title of every sub page.</li>
<li>Every page references the same external <strong>CSS stylesheet</strong> file: <i>default.css</i> that is responsible for styling all pages.</li>
<li>Every page in the web application consists of the same kinds of <strong>sections</strong>:
<ul>
<li>The <i>header</i> element refers to a static header section whose purpose is to display a logo. This section is the same for every sub page.</li>
<li>The <i>menu</i> element refers to a <i>MenuSection</i> whose purpose is to display menu links to sub pages that can be reached from the entry page.</li>
<li>The <i>contents</i> element refers to a <i>ContentsSection</i> whose purpose is to display contents (text, images, tables, itemized lists etc.). The content is different for each selected page.</li>
</ul>
</li>
<li>The application consists of a number of <strong>pages</strong>:
<ul>
<li>The entry page is a page called: 'Home' and can be reached by opening the root URL of the web application: <i>http://localhost</i></li>
<li>The entry page refers to three sub pages: <i>page1</i>, <i>page2</i> and <i>page3</i> that can be reached from the entry page.<br />
<br />
The array keys refer to the path component in the URL that can be used as a selector to open the sub page. For example, <i>http://localhost/page1</i> will open the <i>page1</i> sub page and <i>http://localhost/page2</i> will open the <i>page2</i> sub page.</li>
</ul>
</li>
</ul>
<br />
The currently selected page can be rendered with the following function invocation:<br />
<br />
<pre>
\SBLayout\View\HTML\displayRequestedPage($application);
</pre>
<br />
By default, the above function generates a simple HTML page in which each section gets translated to an HTML <i>div</i> element:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj5rlL_ZJo9ArBQF9r5EmCz-Gs9vDwXuK-gOo410Ht32FDNHPG01hR1vo1eyIb4QtPVfcv7_j2R0S8E2Uz5pm1r5flqhzyEO24dxHT5teXkWERzAZscwcAhQTJ6FbApII9EcskWmbL8bWduqekyTBD5gOUABZTaFuTnLdDhqfeFtbSRcBxexXzMWmkH-A/s1636/simplelayout.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="520" data-original-height="1077" data-original-width="1636" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj5rlL_ZJo9ArBQF9r5EmCz-Gs9vDwXuK-gOo410Ht32FDNHPG01hR1vo1eyIb4QtPVfcv7_j2R0S8E2Uz5pm1r5flqhzyEO24dxHT5teXkWERzAZscwcAhQTJ6FbApII9EcskWmbL8bWduqekyTBD5gOUABZTaFuTnLdDhqfeFtbSRcBxexXzMWmkH-A/s600/simplelayout.png"/></a></div>
<br />
The above screenshot shows what a page in the application could look like. The grey panel on top is the <i>header</i> that displays the logo, the blue bar is the <i>menu</i> section (that displays links to sub pages that are reachable from the entry page), and the black area is the <i>contents</i> section that displays the selected content.<br />
<br />
One link in the menu section is marked as <strong>active</strong> to show the user which page in the page hierarchy (<i>page1</i>) has been selected.<br />
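<br />
To give an impression of the result, the following sketch shows roughly what kind of markup the above function could emit for the example application. The ids and the exact structure are an assumption on my part -- the markup that the framework really generates may differ:<br />
<br />
<pre style="overflow: auto;">
<!-- hypothetical sketch: the section divs are assumed to get ids matching their keys -->
<html>
    <head>
        <title>Page 1 - Simple test website</title>
        <link rel="stylesheet" type="text/css" href="default.css">
    </head>
    <body>
        <div id="header">...</div>
        <div id="menu">
            <a href="/">Home</a>
            <a class="active" href="/page1">Page 1</a>
            <a href="/page2">Page 2</a>
            <a href="/page3">Page 3</a>
        </div>
        <div id="contents">...</div>
    </body>
</html>
</pre>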
<br />
<h2>Compound sections</h2>
<br />
Although the framework's functionality works quite well for most of my old use cases, I learned that in order to support flexbox layouts, I need to nest <i>div</i>s, which is something the default HTML code generator: <i>displayRequestedPage()</i> cannot do (as a sidenote: it is possible to create nestings by developing a custom generator).<br />
<br />
For example, I may want to introduce another level of pages and add a <i>submenu</i> section to the layout, that is displayed on the left side of the screen.<br />
<br />
To make it possible to position the menu bar on the left, I need to horizontally position the <i>submenu</i> and <i>contents</i> sections, while the remaining sections: <i>header</i> and <i>menu</i> must be vertically positioned. To make this possible with flexbox layouts, I need to nest the <i>submenu</i> and <i>contents</i> in a <i>container</i> div.<br />
<br />
Since flexbox layouts have become so common nowadays, I have introduced a <i>CompoundSection</i> object, that acts as a generic container element.<br />
<br />
With a <i>CompoundSection</i>, I can nest <i>div</i>s:<br />
<br />
<pre>
/* Sections */
array(
    "header" => new StaticSection("header.php"),
    "menu" => new MenuSection(0),
    "container" => new CompoundSection(array(
        "submenu" => new MenuSection(1),
        "contents" => new ContentsSection(true)
    ))
),
</pre>
<br />
In the above code fragment, the <i>container</i> section will be rendered as a container <i>div</i> element containing two sub div elements: <i>submenu</i> and <i>contents</i>. I can use this nested div structure to vertically and horizontally position the sections in the way that I described earlier.<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhK-dmIgeUtn83ECi41QR9WeSnFmBesNnhAax0xSamEGsvQSBoHcRH1ywIdA-Qtc9_eqjhLEeO6TXaiy1GV_jIK3z0P5Zk2X2KrkmD3kBW_swPphRUOC1lgfR_Wc9ZGBsG7wR20B0kPfXF07BJfUTw5p-_k-KLWXk7FOsPaSXmapOwt1-DPFXibYyUiYQ/s1636/nesteddivs.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="520" data-original-height="1077" data-original-width="1636" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhK-dmIgeUtn83ECi41QR9WeSnFmBesNnhAax0xSamEGsvQSBoHcRH1ywIdA-Qtc9_eqjhLEeO6TXaiy1GV_jIK3z0P5Zk2X2KrkmD3kBW_swPphRUOC1lgfR_Wc9ZGBsG7wR20B0kPfXF07BJfUTw5p-_k-KLWXk7FOsPaSXmapOwt1-DPFXibYyUiYQ/s600/nesteddivs.png"/></a></div>
<br />
The above screenshot shows the result of introducing a secondary page hierarchy and a <i>submenu</i> section (that has a red background).<br />
<br />
By introducing a <i>container</i> element (through a <i>CompoundSection</i>) it has become possible to horizontally position the <i>submenu</i> next to the <i>contents</i> section.<br />
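<br />
As an illustration, the following CSS sketch shows how such a nested div structure could be positioned with a flexbox layout. The selectors are an assumption on my part -- they presume that each section div gets an id that matches its key in the application model:<br />
<br />
<pre>
/* hypothetical selectors: the framework's real markup may differ */

body {
    display: flex;
    flex-direction: column; /* header, menu and container are stacked vertically */
}

#container {
    display: flex;
    flex-direction: row; /* submenu and contents are positioned horizontally */
}

#submenu {
    flex: 0 0 12em; /* fixed-width menu column on the left */
}

#contents {
    flex: 1; /* the contents section takes up the remaining space */
}
</pre>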
<br />
<h2>Easier error handling</h2>
<br />
Another recurring issue is that most of my applications have to validate user input. When user input is incorrect, a page needs to be shown that displays an error message.<br />
<br />
Previously, error handling and error page redirection was entirely the responsibility of the programmer -- it had to be implemented in every controller, which is quite a repetitive process.<br />
<br />
In one of my test applications of the layout framework, I have created a page with a form that asks for the user's first and last name:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiTq7Qk5ZrrgLXjB_xg2ZFlx2pDNSi8gF8pTfRCzMC3rpr-KCd_z7si5PdcAZXGn7GP0gDy_coHX-Lkty6ow3XjswMocSy8TyU5dhSEOyJSDlzPro0mIzZ7VlyxdIXqg_EmCbN6cxU4FEmyqCcSnzgW-r-ofurXfqTGlCDGUlNdoEd6sr83UjVCSF4YIA/s1636/form.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="520" data-original-height="1077" data-original-width="1636" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiTq7Qk5ZrrgLXjB_xg2ZFlx2pDNSi8gF8pTfRCzMC3rpr-KCd_z7si5PdcAZXGn7GP0gDy_coHX-Lkty6ow3XjswMocSy8TyU5dhSEOyJSDlzPro0mIzZ7VlyxdIXqg_EmCbN6cxU4FEmyqCcSnzgW-r-ofurXfqTGlCDGUlNdoEd6sr83UjVCSF4YIA/s600/form.png"/></a></div>
<br />
I wanted to change the example application to return an error message when any of these mandatory attributes were not provided.<br />
<br />
To ease that burden, I have made the framework's error handling mechanism more generic. Previously, the layout manager only took care of two kinds of errors: when an invalid sub page is requested, a <i>PageNotFoundException</i> is thrown, redirecting the user to the 404 error page, and when the accessibility criteria have not been met (e.g. a user is not authenticated), a <i>PageForbiddenException</i> is thrown, directing the user to the 403 error page.<br />
<br />
In the revised version of the layout framework, the <i>PageNotFoundException</i> and <i>PageForbiddenException</i> classes have become sub classes of the generic <i>PageException</i> class. This generic error class makes it possible for the error handler to redirect users to error pages for any HTTP status code.<br />
<br />
Error pages should be added as sub pages to the entry page. The numeric keys should match the corresponding HTTP status codes:<br />
<br />
<pre style="overflow: auto;">
/* Pages */
new StaticContentPage("Home", new Contents("home.php"), array(
"400" => new HiddenStaticContentPage("Bad request", new Contents("error/400.php")),
"403" => new HiddenStaticContentPage("Forbidden", new Contents("error/403.php")),
"404" => new HiddenStaticContentPage("Page not found", new Contents("error/404.php"))
...
))
</pre>
I have also introduced a <i>BadRequestException</i> class (that is also a sub class of <i>PageException</i>) that can be used for handling input validation errors.<br />
<br />
<i>PageException</i>s can be thrown from controllers with a custom error message as a parameter. I can use the following controller implementation to check whether the first and last names were provided:<br />
<br />
<pre style="overflow: auto;">
use SBLayout\Model\BadRequestException;
if($_SERVER["REQUEST_METHOD"] == "POST") // This is a POST request
{
if(array_key_exists("firstname", $_POST) && $_POST["firstname"] != ""
&& array_key_exists("lastname", $_POST) && $_POST["lastname"] != "")
$GLOBALS["fullname"] = $_POST["firstname"]." ".$_POST["lastname"];
else
throw new BadRequestException("This page requires a firstname and lastname parameter!");
}
</pre>
<br />
As a result, if the user forgets to specify any of these mandatory attributes, he is automatically redirected to the bad request error page:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhMOklBkhRbq81GjQsvqPbeBzxj1pNR_DTyeAS20-k1ZPHbrJ341lEJulBVDt8BOOwjB-a9ZUOn4rIT6G18C8V9-Hn1444UTtltepQLWwhexQGOlxS7VEh0GW2eoWfe35PMvjpDjENA8BbMBp8t10YAap8QI2YxYWCKUci1ndF0B0XdBBfJ_ZjTVPSKvg/s1636/badrequest.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="520" data-original-height="1077" data-original-width="1636" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhMOklBkhRbq81GjQsvqPbeBzxj1pNR_DTyeAS20-k1ZPHbrJ341lEJulBVDt8BOOwjB-a9ZUOn4rIT6G18C8V9-Hn1444UTtltepQLWwhexQGOlxS7VEh0GW2eoWfe35PMvjpDjENA8BbMBp8t10YAap8QI2YxYWCKUci1ndF0B0XdBBfJ_ZjTVPSKvg/s600/badrequest.png"/></a></div>
<br />
This improved error handling mechanism significantly reduces the amount of boilerplate code that I need to write in applications that use my layout framework.<br />
<br />
<h2>Using the iterator protocol for sub pages</h2>
<br />
As can be seen in the application model examples, some pages in the example applications have sub pages, such as the entry page.<br />
<br />
In the layout framework, there are three kinds of pages that may provide sub pages:<br />
<br />
<ul>
<li>A <i>StaticContentPage</i> object is a page that may refer to a fixed/static number of sub pages (as an array object).</li>
<li>A <i>PageAlias</i> object, that redirects the user to another sub page in the application, also offers the ability to refer users to a fixed/static number of sub pages (as an array object).</li>
<li>There is also a <i>DynamicContentPage</i> object in which a sub page can interpret the path component as a dynamic value. That dynamic value can, for example, be used as a parameter for a query that retrieves a record from a database.</li>
</ul>
<br />
In the old implementation of my framework, the code that renders the menu sections always has to treat these objects in a special way to render links to their available sub pages. As a result, I had to use the <i>instanceof</i> operator a lot, which is a bad <a href="https://eprints.lancs.ac.uk/id/eprint/127419/1/Tosem_code_smells.pdf">code smell</a>.<br />
<br />
I have changed the framework to use a different mechanism for stepping over sub pages: <a href="https://en.wikipedia.org/wiki/Iterator">iterators or iterables</a> (depending on the implementation language).<br />
<br />
The generic <i>Page</i> class (that is the parent class of all page objects) provides a method called: <i>subPageIterator()</i> that returns an iterator/iterable that yields no elements. The <i>StaticContentPage</i> and <i>PageAlias</i> classes override this method to return an iterator/iterable that steps over the elements in the array of sub pages.<br />
<br />
Using iterators/iterables has a number of nice consequences -- I have eliminated two special cases and a bad code smell (the intensive use of <i>instanceof</i>), significantly improving the quality and readability of my code.<br />
<br />
Another nice property is that it is also possible to override this method with a custom iterator, that for example, fetches sub page configurations from a database.<br />
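<br />
The following code fragment sketches that idea in PHP. It is only an illustration under assumptions of mine: the <i>DatabasePage</i> class, the <i>$dbh</i> database handle (a PDO connection) and the <i>pages</i> table are hypothetical, whereas <i>subPageIterator()</i>, <i>StaticContentPage</i> and <i>Contents</i> come from the layout framework:<br />
<br />
<pre style="overflow: auto;">
use SBLayout\Model\Page\StaticContentPage;
use SBLayout\Model\Page\Content\Contents;

class DatabasePage extends StaticContentPage
{
    private $dbh; // hypothetical PDO database handle

    public function subPageIterator()
    {
        // Yield a sub page for every record in a hypothetical pages table
        $stmt = $this->dbh->query("SELECT id, title, contents FROM pages");

        while($row = $stmt->fetch(PDO::FETCH_ASSOC))
            yield $row["id"] => new StaticContentPage($row["title"], new Contents($row["contents"]));
    }
}
</pre>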
<br />
The pagemanager framework (another component in my web framework) offers a content management system giving end-users the ability to change the page structure and page contents. The configuration of the pages is stored in a database.<br />
<br />
Although the pagemanager framework uses the layout framework for the construction of pages, it used to rely on custom code to render the menu sections.<br />
<br />
By using the iterator protocol, it has become possible to re-use the menu section functionality from the layout framework eliminating the need for custom code. Moreover, it has also become much easier to integrate the pagemanager framework into an application because no additional configuration work is required.<br />
<br />
I have also created a gallery application that makes it possible to expose the albums as items in the menu sections. Rendering the menu sections also used to rely on custom code, but thanks to using the iterator protocol that custom code was completely eliminated.<br />
<br />
<h2>Flexible presentation of menu items</h2>
<br />
As I have already explained, an application layout can be divided into three kinds of sections. A <i>StaticSection</i> remains the same for any requested sub page, and a <i>ContentsSection</i> is filled with content that is unique for the selected page.<br />
<br />
In most of my use-cases, it is only required to have a single dynamic content section.<br />
<br />
However, the framework is flexible enough to support multiple content sections as well. For example, the following screenshot shows the advanced example application (included with the web framework) in which both the header and the content sections change for each sub page:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiTXnPT7Q4ZNOtH1_7ARSZKq6WPJGDXLpEkwO6FQOOVfxDKVTYThzzr4_AKvD45KQ_ykYFTwAg6G_pBXMnvchlyDuNUnXo-tv3p2citL3tDQUo-ptLBLXx9dBkah_9DvCHX0VV3Esgaj7imp19zkVSbbrh0b3nfIM7r8EAaNSQfRfV-Yprh0OrHGJFg5A/s1636/advancedlayout.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="520" data-original-height="1077" data-original-width="1636" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiTXnPT7Q4ZNOtH1_7ARSZKq6WPJGDXLpEkwO6FQOOVfxDKVTYThzzr4_AKvD45KQ_ykYFTwAg6G_pBXMnvchlyDuNUnXo-tv3p2citL3tDQUo-ptLBLXx9dBkah_9DvCHX0VV3Esgaj7imp19zkVSbbrh0b3nfIM7r8EAaNSQfRfV-Yprh0OrHGJFg5A/s600/advancedlayout.png"/></a></div>
<br />
The presentation of the third kind of section, the <i>MenuSection</i>, used to remain fairly static -- menu sections are rendered as <i>div</i> elements containing hyperlinks. The page that is currently selected is marked as active by using the <i>active</i> class property.<br />
<br />
For most of my use-cases, just rendering hyperlinks suffices -- with CSS you can still present them in all kinds of interesting ways, e.g. by changing their colors, adding borders, and changing some of their aspects when the user hovers over them with the mouse cursor.<br />
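<br />
For example, a minimal CSS sketch could look as follows (assuming the menu section is rendered as a div with id: <i>menu</i>, which may differ from the framework's real markup):<br />
<br />
<pre>
#menu a {
    color: white;
    text-decoration: none;
}

#menu a:hover {
    text-decoration: underline; /* change an aspect when the user hovers over a link */
}

#menu a.active {
    font-weight: bold; /* highlight the currently selected page */
}
</pre>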
<br />
In some rare cases, it may also be desired to present links to sub pages in a completely different way. For example, you may want to display an icon or add extra styling properties to an individual button.<br />
<br />
To allow custom presentations of hyperlinks, I have added a new parameter: <i>menuItem</i> to the constructors of page objects. The <i>menuItem</i> parameter refers to a code snippet that decides how to render the link in a menu section:<br />
<br />
<pre>
new StaticContentPage("Icon", new Contents("icon.php"), "icon.php")
</pre>
<br />
In the above example, the last parameter to the constructor refers to an external file: <i>menuitem/icon.php</i>:<br />
<br />
<pre style="font-size: 90%; overflow: auto;">
<span>
<?php
if($active)
{
?>
<a class="active" href="<?= $url ?>">
<img src="<?= $GLOBALS["baseURL"] ?>/image/menu/go-home.png" alt="Home icon">
<strong><?= $subPage->title ?></strong>
</a>
<?php
}
else
{
?>
<a href="<?= $url ?>">
<img src="<?= $GLOBALS["baseURL"] ?>/image/menu/go-home.png" alt="Home icon">
<?= $subPage->title ?>
</a>
<?php
}
?>
</span>
</pre>
<br />
The above code fragment specifies how a link in the menu section should be displayed when the page is active or not active. We use the custom rendering code to display a home icon before showing the hyperlink.<br />
<br />
In the advanced test application, I have added an example page in which every sub menu item is rendered in a custom way:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgjlUu07UJLULy0oVxL1uxOPJ70KBleu1YitOb2zqSAYiYH4Jt1uX0P1MWKyljjD_libObXr-EYWe1SHbex8P-c5puFJ9hlwFX3Zkg4uexzkUOIdl133YxJAc7IR49R3ZCAKRz9hvlvZP23i7PJzS_EXt_yROw6S2NbzN1j42TCsxbP0edVs0Egcc0USA/s1636/custommenuitems.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="520" data-original-height="1077" data-original-width="1636" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgjlUu07UJLULy0oVxL1uxOPJ70KBleu1YitOb2zqSAYiYH4Jt1uX0P1MWKyljjD_libObXr-EYWe1SHbex8P-c5puFJ9hlwFX3Zkg4uexzkUOIdl133YxJAc7IR49R3ZCAKRz9hvlvZP23i7PJzS_EXt_yROw6S2NbzN1j42TCsxbP0edVs0Egcc0USA/s600/custommenuitems.png"/></a></div>
<br />
In the above screenshot, we can see two custom-presented menu items in the <i>submenu</i> section on the left. The first has the home icon added and the second uses a custom style that deviates from the normal page style.<br />
<br />
If no <i>menuItem</i> parameter is provided, the framework just renders a menu item as a normal hyperlink.<br />
<br />
<h2>Other functionality</h2>
<br />
In addition to the new functionality explained earlier, I also made a number of nice small feature additions:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjLQEiEbPu_GENGTQA-ZL6p2JgazuATpQxmj8WWzMghQX9rl9WdLPZGfBAHXKvLFDhpT7bK4WGt0wCB6Gm4meEJoSazR2E_y26rd3p5wKV5u_ubVRzWmTaB9BUjp2ih4Fl0qwv13wkoiRH9JDYSSqkPzzTT9iYGXOmKh1GJsPLoe7dYMVG7PUA42nGW6A/s1636/sitemap.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="520" data-original-height="1077" data-original-width="1636" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjLQEiEbPu_GENGTQA-ZL6p2JgazuATpQxmj8WWzMghQX9rl9WdLPZGfBAHXKvLFDhpT7bK4WGt0wCB6Gm4meEJoSazR2E_y26rd3p5wKV5u_ubVRzWmTaB9BUjp2ih4Fl0qwv13wkoiRH9JDYSSqkPzzTT9iYGXOmKh1GJsPLoe7dYMVG7PUA42nGW6A/s600/sitemap.png"/></a></div>
<br />
<ul>
<li>A function that displays <strong>breadcrumbs</strong> (the route from the entry page to the currently opened page). The route is derived automatically from the requested URL and application model.</li>
<li>A function that displays a <strong>site map</strong> that shows the hierarchy of pages.</li>
<li>A function that makes it possible to embed a <strong>menu section</strong> in arbitrary sections of a page.</li>
</ul>
<br />
<h2>Conclusion</h2>
<br />
I am quite happy with the recent feature changes that I made to the layout framework. Although I have not done any web front-end development for quite some time, I had quite a bit of fun doing it.<br />
<br />
In addition to the fact that useful new features were added, I have also simplified the codebase and improved its quality.<br />
<br />
<h2>Availability</h2>
<br />
The <a href="https://github.com/svanderburg/java-sblayout">Java</a>, <a href="https://github.com/svanderburg/php-sblayout">PHP</a> and <a href="https://github.com/svanderburg/js-sblayout">JavaScript</a> implementations of my layout framework can be obtained from <a href="https://github.com/svanderburg">my GitHub page</a>. Use them at your own risk!<br />
<br />
Sander van der Burghttp://www.blogger.com/profile/12718166966821611609noreply@blogger.com0tag:blogger.com,1999:blog-1397115249631682228.post-19415182811978571522022-08-20T00:49:00.001+02:002022-08-20T00:49:49.114+02:00Porting a Duke3D map to Shadow Warrior<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhHtJUK6oq6szB-UxOkSIFKpjw5odolJPPfPY7EWb8nS6lkrsh21XeK2b0x14E7m6SKXitcGHLIVMTt2N3bZuec5Woyf7_PqV2zWpH_On4BL9kCfLJqHJMBFrJpEveV4EKZMS7LJXlqGU0eeC-uuAx6fNl4J5PJpqyZMSFabp7EKB5Gc-Ps0LGzuNxo9A/s640/originalmap.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="480" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhHtJUK6oq6szB-UxOkSIFKpjw5odolJPPfPY7EWb8nS6lkrsh21XeK2b0x14E7m6SKXitcGHLIVMTt2N3bZuec5Woyf7_PqV2zWpH_On4BL9kCfLJqHJMBFrJpEveV4EKZMS7LJXlqGU0eeC-uuAx6fNl4J5PJpqyZMSFabp7EKB5Gc-Ps0LGzuNxo9A/s600/originalmap.png"/></a></div>
<br />
Almost six years ago, I wrote <a href="https://sandervanderburg.blogspot.com/2016/12/creating-total-conversion-for-duke3d.html">a blog post about Duke Nukem 3D, the underlying BUILD engine and my own total conversion</a> that consists of 22 maps and a variety of interesting customizations.<br />
<br />
Between 1997 and 2000, while I was still in middle school, I spent a considerable amount of time developing my own maps and customizations, such as modified monsters. In the process, I learned a great deal about the technical details of the <a href="http://advsys.net/ken/build.htm">BUILD engine</a>.<br />
<br />
In addition to <a href="https://en.wikipedia.org/wiki/Duke_Nukem_3D">Duke Nukem 3D</a>, the BUILD engine is also used as a basis for many additional games, such as Tekwar, Witchaven, Blood, and <a href="https://en.wikipedia.org/wiki/Shadow_Warrior_(1997_video_game)">Shadow Warrior</a>.<br />
<br />
In my earlier blog post, I also briefly mentioned that in addition to the 22 maps that I created for Duke Nukem 3D, I have also developed one map for Shadow Warrior.<br />
<br />
Last year, in my summer holiday, which was still largely a matter of improvisation because of the COVID-19 pandemic, I did many interesting retro-computing things, such as fixing my old computers. I also played a bit with some of my old BUILD engine game experiments, after many years of inactivity.<br />
<br />
I discovered an interesting <a href="https://www.moddb.com/games/shadow-warrior/addons/the-wang-light-district-v10">Shadow Warrior map that attempts to convert the E1L2 map from Duke Nukem 3D</a>. Since both games use the BUILD engine with mostly the same features (Shadow Warrior uses a slightly more advanced version of the BUILD engine), this map inspired me to also port one of my own Duke Nukem 3D maps, as an interesting deep dive to compare both games' internal concepts.<br />
<br />
Although most of the BUILD engine and editor concepts are the same in both games, their game mechanics are totally different. As a consequence, the porting process turned out to be very challenging.<br />
<br />
Another reason that it took me a while to complete the project is that I had to put it on hold on several occasions due to all kinds of obligations. Fortunately, I have managed to finally finish it.<br />
<br />
In this blog post, I will describe some of the things that both games have in common and the differences that I had to overcome in the porting process.<br />
<br />
<h2>BUILD engine concepts</h2>
<br />
As explained in my previous blog post, the BUILD engine is considered a 2.5D engine rather than a true 3D engine, because it had to cope with all kinds of technical limitations of home computers commonly used at that time.<br />
<br />
In fact, most of the BUILD engine concepts are two-dimensional -- maps are made out of two-dimensional surfaces called <strong>sectors</strong>:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhgkeopaz5bHFbz55tyLBd-vr4YxwVFaja9Q94nQ_xnTIx8pZo3dU4lr1_dWNDYZSWC1FSQ-ZXB1JQRoVmZA0jlVf18b3_aYMC93B-yAj_kiV1Jvlx2ajTaZniHVyEWG43KQjXb2ViIHAlaEZ_-LJ4Dy5M2z8ONeW07RfrqpZ7XPzOgx07LJTMo7W5aRA/s640/2dview.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="480" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhgkeopaz5bHFbz55tyLBd-vr4YxwVFaja9Q94nQ_xnTIx8pZo3dU4lr1_dWNDYZSWC1FSQ-ZXB1JQRoVmZA0jlVf18b3_aYMC93B-yAj_kiV1Jvlx2ajTaZniHVyEWG43KQjXb2ViIHAlaEZ_-LJ4Dy5M2z8ONeW07RfrqpZ7XPzOgx07LJTMo7W5aRA/s600/2dview.png"/></a></div>
<br />
The above picture shows a 2-dimensional top-down view of my ported Shadow Warrior map. Sectors are two-dimensional areas surrounded by <strong>walls</strong> -- the white lines denote solid walls and red lines the walls between adjacent sectors. Red walls are invisible in 3D mode.<br />
<br />
The purple and cyan colored objects are <strong>sprites</strong> (objects that typically provide some form of interactivity with the player, such as monsters, weapons, items or switches). The "sticks" that are attached to the sprites indicate in which direction the sprites are facing. When a sprite is purple, it will block the player. Cyan colored sprites allow a player to move through them.<br />
<br />
You can switch between 2D and 3D mode in the editor by pressing the Enter-key on the numeric key pad.<br />
<br />
In 3D mode, each sector's ceiling and floor can be given their own heights, and we can configure textures for the walls, floors and ceilings (by pointing to any of these objects and pressing the 'V' key), giving the player the illusion of walking around in a 3D world:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjt2MIo1WWKvjcsS4nemxAVOKo60s-N85qN_xtj1_98VIyZYKLXGD7ns6DanYpSAyMddQ6uGZudmthgPypAu-Do2n4EzaJUHu-ZsXDgazyFy9oclXiq4JS7YoSVnOxI1ZaawllXWw8mg77e9dfxUqUKpkXs9KiuUh0crERAHEkhjYXsg9NnzjHtPI-9bA/s640/sectors.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="480" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjt2MIo1WWKvjcsS4nemxAVOKo60s-N85qN_xtj1_98VIyZYKLXGD7ns6DanYpSAyMddQ6uGZudmthgPypAu-Do2n4EzaJUHu-ZsXDgazyFy9oclXiq4JS7YoSVnOxI1ZaawllXWw8mg77e9dfxUqUKpkXs9KiuUh0crERAHEkhjYXsg9NnzjHtPI-9bA/s600/sectors.png"/></a></div>
<br />
In the above screenshot, we can see the corresponding 3D view of the 2D grid shown earlier. It consists of an outdoor area, grass, a lane, and the interior of the building. Each of these areas is a separate 2D sector with its own custom floor and ceiling heights, and its own textures.<br />
<br />
The BUILD engine has all kinds of limitations. Although a world may appear to be (somewhat) 3-dimensional, it is not possible to stack multiple sectors on top of each other and simultaneously see them in 3D mode, although there are some tricks to cope with that limitation.<br />
<br />
(As a sidenote: Shadow Warrior has a hacky feature that makes it possible for a player to observe multiple rooms stacked on top of each other, by using specialized wall/ceiling textures, special purpose sprites and a certain positioning of the sectors themselves. Sectors in the map are still separated, but thanks to the hack they can be visualized in such a way that they appear to be stacked on top of each other).<br />
<br />
Moreover, the BUILD engine cannot change the perspective when a player looks up or down, although there is the possibility to give a player that illusion by stretching the walls. (As a sidenote: modern source ports of the BUILD engine have been adjusted to use Polymost, an OpenGL rendering extension, which actually makes it possible to provide a true 3D look).<br />
<br />
Monsters, weapons, items, and most breakable/movable objects are sprites. Sprites are not really "true 3D" objects. Normally, sprites will always face the player from the same side, regardless of the position or the perspective of the player:
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiHuK5AOUWVQAAytdP7bZrKmqM5FvcJ0UbGWnsz9rYPTPlDs1NffOKRzzPvCrLpZ5WYrf53Lcku4MNrWrwkrfN5lWhtItP1Ix3bNhfsLctWMkG76aqck8n0Gyz9pdeIqH3gSOrzmvUEr2E2FAoDOEdv2Vy_fE2NQWWxUdELdMJM_QdYCOgdLajn3UkgdA/s640/spritefront1.png" style="display: block; padding: 1em 0; text-align: center; clear: left; float: left;"><img alt="" border="0" width="250" data-original-height="480" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiHuK5AOUWVQAAytdP7bZrKmqM5FvcJ0UbGWnsz9rYPTPlDs1NffOKRzzPvCrLpZ5WYrf53Lcku4MNrWrwkrfN5lWhtItP1Ix3bNhfsLctWMkG76aqck8n0Gyz9pdeIqH3gSOrzmvUEr2E2FAoDOEdv2Vy_fE2NQWWxUdELdMJM_QdYCOgdLajn3UkgdA/s320/spritefront1.png"/></a></div>
<div class="separator"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjx8y4crLG1gLY8vfCb7RD32e9fb85J4oAh41jfLt5puxPJxKYzDiVumC_6nlRpjNWca5L4C0I7LQthOyXh6l2pX69Z36lDWoWApM4LwkVHivQnOOXdc3cySLgIXGFLUUguVex--xvH2ERGV-MdKHaGFd30FFwS0CN7jreHyGlDD8PGkkVjtXYzt02UDg/s640/spritefront2.png" style="display: block; padding: 1em 0; text-align: center; clear: right; float: right;"><img alt="" border="0" width="250" data-original-height="480" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjx8y4crLG1gLY8vfCb7RD32e9fb85J4oAh41jfLt5puxPJxKYzDiVumC_6nlRpjNWca5L4C0I7LQthOyXh6l2pX69Z36lDWoWApM4LwkVHivQnOOXdc3cySLgIXGFLUUguVex--xvH2ERGV-MdKHaGFd30FFwS0CN7jreHyGlDD8PGkkVjtXYzt02UDg/s320/spritefront2.png"/></a></div>
<div style="clear: both;"></div>
<br />
As can be seen, the guardian sprite always faces the player from the front, regardless of the angle of the camera.<br />
<br />
Sprites can also be flattened and rotated, if desired. Then they will appear as a flat surface to the player:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjrugtybEQcgAJZabMxM_QiT6YPzAwprkRW9ESpJ4ZOudUJjvzBFjLxYvI7pjThtwMemLesQ3SdZfeqru1-94Hy3TDWmZPG3dUHNJ2mYEZK2rT4X_TGE08sKGg8SmVqFpmBmtYeZOEbQQg3IHAtLf1Zf7UsUfg8yFe94tYMBK0J8EXmE8KZvugkQcPHJA/s640/poster.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="480" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjrugtybEQcgAJZabMxM_QiT6YPzAwprkRW9ESpJ4ZOudUJjvzBFjLxYvI7pjThtwMemLesQ3SdZfeqru1-94Hy3TDWmZPG3dUHNJ2mYEZK2rT4X_TGE08sKGg8SmVqFpmBmtYeZOEbQQg3IHAtLf1Zf7UsUfg8yFe94tYMBK0J8EXmE8KZvugkQcPHJA/s600/poster.png"/></a></div>
<br />
For example, the wall posters in the screenshot above are flattened and rotated sprites.<br />
<br />
Shadow Warrior uses a slightly upgraded BUILD engine that can provide a true 3D experience for certain objects (such as weapons, items, buttons and switches) by displaying them as voxels (3D pixels):<br />
<br /><div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhmXQCcxVcKK4NAjE64qKjbirizO6aHQUf6V32wRyWOeB80OdAF_uI2hMsFB6xn-uwyiftUzoBV2Bdh5WXugqWyQlREHzVnkRLa3PprjXqjrhXbNVGILuYU8BcMVxzW97hy1ZVnwgJ_-9dB_kPrPTTJQ8lu0phQb51ANSqCerjVGvpZVFQFTrZ8mwi6Gg/s640/voxels.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="480" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhmXQCcxVcKK4NAjE64qKjbirizO6aHQUf6V32wRyWOeB80OdAF_uI2hMsFB6xn-uwyiftUzoBV2Bdh5WXugqWyQlREHzVnkRLa3PprjXqjrhXbNVGILuYU8BcMVxzW97hy1ZVnwgJ_-9dB_kPrPTTJQ8lu0phQb51ANSqCerjVGvpZVFQFTrZ8mwi6Gg/s600/voxels.png"/></a></div>
<br />
The BUILD engine that comes with Duke Nukem 3D lacks the ability to display voxels.<br />
<br />
<h2>Porting my Duke Nukem 3D map to Shadow Warrior</h2>
<br />
The map format that Duke Nukem 3D and Shadow Warrior use is exactly the same. To be precise: they both use version 7 of the map format.<br />
<br />
At first, porting a map from one game to the other seemed relatively straightforward.<br />
<br />
The first step in my porting process was to simply make a copy of the Duke Nukem 3D map and open it in the Shadow Warrior BUILD editor. What I immediately noticed is that all the textures and sprites look weird. The textures still have the same indexes and refer to textures in the Shadow Warrior catalog that are completely different:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgnu2hxFeAoGLxOIKTVFYN7mnB4u_tw5UyKkjnU4MtX-g8bqkVUcMua081uV_7qroLPxwA2hA3KZozJA_Kn1d46cgWVDQ0hiUiPT7gcsHNDHUVhYTDUFeQd-ff3nHYeDhmmlkt_XBC793jR3Ug5_tDufMAl86nZBY0yfx36IHqydi-VHkQzLOFkHcFv1A/s1600/unported.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" data-original-height="200" data-original-width="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgnu2hxFeAoGLxOIKTVFYN7mnB4u_tw5UyKkjnU4MtX-g8bqkVUcMua081uV_7qroLPxwA2hA3KZozJA_Kn1d46cgWVDQ0hiUiPT7gcsHNDHUVhYTDUFeQd-ff3nHYeDhmmlkt_XBC793jR3Ug5_tDufMAl86nZBY0yfx36IHqydi-VHkQzLOFkHcFv1A/s1600/unported.png"/></a></div>
<br />
Quite a bit of my time was spent on fixing all textures and sprites by looking for suitable replacements. I ended up replacing textures for the rocks, sky, buildings, water, etc. I also had to replace the monsters, weapons, items and other dynamic objects, and overcome some limitations for the player in the map, such as the absence of a jet pack. The process was laborious, but straightforward.<br />
<br />
For example, this is how I have fixed the beach area:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEirKIwgJMfOSzQrsJNj00-ArJ8V7dvseIc0OKatDYP4ZaVZSsR_8KIrgUD8jJXx08mnXan7KC9G4u8bKZjIxQXLIgmXZvqA2RcsdVKIig_WTefE9xr__ICnlWhrPETmarwQd_vYK69N9s5CU0aG-lpeXvjZBe-sPXz6fzissCuVYTPQeW1-4VYFeHsY8Q/s640/d3d_beach.png" style="display: block; padding: 1em 0; text-align: center; clear: left; float: left;"><img alt="" border="0" width="250" data-original-height="480" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEirKIwgJMfOSzQrsJNj00-ArJ8V7dvseIc0OKatDYP4ZaVZSsR_8KIrgUD8jJXx08mnXan7KC9G4u8bKZjIxQXLIgmXZvqA2RcsdVKIig_WTefE9xr__ICnlWhrPETmarwQd_vYK69N9s5CU0aG-lpeXvjZBe-sPXz6fzissCuVYTPQeW1-4VYFeHsY8Q/s320/d3d_beach.png"/></a></div>
<div class="separator"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiBAK9iha3I4XKfBZSZOiKW-2Fy9P58P5SEdKbRoTor8qFTOsK6mkHFqYKLTeVeSLY6OY8yvqVWet837LVKEOWd6ceu7uGZUaE1lhAOLPswkuMrwdObDqzs8LHVjHfkJdKitdonj7Lk6s9bTGX3ni26bvI1itHAH6CS6AdAm2KpmPCm3Hzzjjelr7QqZA/s640/sw_beach.png" style="display: block; padding: 1em 0; text-align: center; clear: right; float: right;"><img alt="" border="0" width="250" data-original-height="480" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiBAK9iha3I4XKfBZSZOiKW-2Fy9P58P5SEdKbRoTor8qFTOsK6mkHFqYKLTeVeSLY6OY8yvqVWet837LVKEOWd6ceu7uGZUaE1lhAOLPswkuMrwdObDqzs8LHVjHfkJdKitdonj7Lk6s9bTGX3ni26bvI1itHAH6CS6AdAm2KpmPCm3Hzzjjelr7QqZA/s320/sw_beach.png"/></a></div>
<div class="clear: both;"></div>
<br />
I have changed the interior of the office building as follows:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjnKOd-7DxIfJmp4OWIjN894xduIb8JMU0s8rJv8UIkNcHi1BB89m5eEVi95qFb_pWmFMWcV2b1ZhiCmEGeIoIoA24IsgfOykFspBQySevhfFQNXn6ckiWd2NmpKdSTY3bST0LqIZUF29ALDm43v0saMtkk_C0t_qM34Rq4_zqcVSq1NShjK_P2jFKhKA/s640/d3d_office.png" style="display: block; padding: 1em 0; text-align: center; clear: left; float: left;"><img alt="" border="0" width="250" data-original-height="480" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjnKOd-7DxIfJmp4OWIjN894xduIb8JMU0s8rJv8UIkNcHi1BB89m5eEVi95qFb_pWmFMWcV2b1ZhiCmEGeIoIoA24IsgfOykFspBQySevhfFQNXn6ckiWd2NmpKdSTY3bST0LqIZUF29ALDm43v0saMtkk_C0t_qM34Rq4_zqcVSq1NShjK_P2jFKhKA/s320/d3d_office.png"/></a></div>
<div class="separator" style=""><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjzZmi6wphlgZgP2-G4ra_mEKWZK-d6-mzqtQohHmGvmA74gjjIH6tlof5tCriX5PjJpMWmj5e02WUBVm-jMvoVFOMsfIqMIzOAFpEmFvCBwNcSlkm8wwxWjz9vUr_D7VksmEOBKXjOe94FsGJ_E8n9iahMPG_nxWm86GL0ptV4Xdj7OEQtvkBhi2bDog/s640/sw_office.png" style="display: block; padding: 1em 0; text-align: center; clear: right; float: right;"><img alt="" border="0" width="250" data-original-height="480" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjzZmi6wphlgZgP2-G4ra_mEKWZK-d6-mzqtQohHmGvmA74gjjIH6tlof5tCriX5PjJpMWmj5e02WUBVm-jMvoVFOMsfIqMIzOAFpEmFvCBwNcSlkm8wwxWjz9vUr_D7VksmEOBKXjOe94FsGJ_E8n9iahMPG_nxWm86GL0ptV4Xdj7OEQtvkBhi2bDog/s320/sw_office.png"/></a></div>
<div style="clear: both;"></div>
<br />
And the back garden as follows:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjEE2nl5VYNwlEDkyroJ_OAg2pM-QQIeXlcSKdV8mNJMQPOkh42bvAJ1PEWa_DoKO20cAqAdqDjB6Vdc0t_YapDgkGStdHSeNX5bXOxymXrolasb0MnZ4OrMOL8kUcB181T2tQn5gLbS4O8R5DUwNEc-CCl_ICjHO-GndNR58hM4HrLHsa8gQYp6TXLCQ/s640/d3d_garden.png" style="display: block; padding: 1em 0; text-align: center; clear: left; float: left;"><img alt="" border="0" width="250" data-original-height="480" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjEE2nl5VYNwlEDkyroJ_OAg2pM-QQIeXlcSKdV8mNJMQPOkh42bvAJ1PEWa_DoKO20cAqAdqDjB6Vdc0t_YapDgkGStdHSeNX5bXOxymXrolasb0MnZ4OrMOL8kUcB181T2tQn5gLbS4O8R5DUwNEc-CCl_ICjHO-GndNR58hM4HrLHsa8gQYp6TXLCQ/s320/d3d_garden.png"/></a></div>
<div class="separator"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiha_XMoz81_dJjuoC-Y1ULl1yXcjFDe4fXIrTGAYwQZXzYjlQaozLKEP-_6CmbOKh_d-FJPifi7LqCsqHYW03Ho5yvF8x0TRlshKL2umk_wxW0VCbHtyZain2uOKW25R2m0hUATYgqkSNxf_RkpvyUmRXh4qxbyrRGHrOw91D4sCeJ1X042jTzkR05xA/s640/sw_garden.png" style="display: block; padding: 1em 0; text-align: center; clear: right; float: right;"><img alt="" border="0" width="250" data-original-height="480" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiha_XMoz81_dJjuoC-Y1ULl1yXcjFDe4fXIrTGAYwQZXzYjlQaozLKEP-_6CmbOKh_d-FJPifi7LqCsqHYW03Ho5yvF8x0TRlshKL2umk_wxW0VCbHtyZain2uOKW25R2m0hUATYgqkSNxf_RkpvyUmRXh4qxbyrRGHrOw91D4sCeJ1X042jTzkR05xA/s320/sw_garden.png"/></a></div>
<div style="clear: both;"></div>
<br />
The nice thing about the garden area is that Shadow Warrior has a more diverse set of vegetation sprites. Duke Nukem 3D only has palm trees.<br />
<br />
<h2>Game engine differences</h2>
<br />
The biggest challenge for me was porting the interactive parts of the game. As explained earlier, game mechanics are not implemented by the engine or the editor. BUILD engine games are separated into an engine part and a game part, of which only the former is generalized.<br />
<br />
This diagram (that I borrowed from a <a href="https://fabiensanglard.net/duke3d/index.php">Duke Nukem 3D code review article written by Fabien Sanglard</a>) describes the high-level architecture of Duke Nukem 3D:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiO9rb-TlPqT31TT8HFmPsASp6P8cX1mEBZ50XGOY5xvuBK6We5wY0mgnE0krjnQP_ea5ETeokVmIIy-wc3WD38WElntLo2CHlrjoNj0AhomJ8QyePxmmjySzwVYW7rOr0Otj_gCS0n7S0tadEDKkOovGlRxiIEDj_95yfydwGeL4ZI-A-4umTOJUwhEg/s629/duke_nukem_dev_team.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="349" data-original-width="629" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiO9rb-TlPqT31TT8HFmPsASp6P8cX1mEBZ50XGOY5xvuBK6We5wY0mgnE0krjnQP_ea5ETeokVmIIy-wc3WD38WElntLo2CHlrjoNj0AhomJ8QyePxmmjySzwVYW7rOr0Otj_gCS0n7S0tadEDKkOovGlRxiIEDj_95yfydwGeL4ZI-A-4umTOJUwhEg/s600/duke_nukem_dev_team.png"/></a></div>
<br />
In the above diagram, the BUILD engine (on the right) is a general-purpose component developed by <a href="http://advsys.net/ken/">Ken Silverman</a> (the author of the BUILD engine and editor) and shipped as a header and object code file to 3D Realms. 3D Realms combines the engine with the game artifacts on the left to construct a game executable (<i>DUKE3D.EXE</i>).<br />
<br />
To configure game effects in the BUILD editor, you need to annotate objects (walls, sprites and sectors) with tags and add special purpose sprites to the map. To the editor these objects are just meta-data, but the game engine treats them as parameters to create special effects.<br />
<br />
Every object in a map can be annotated with meta data properties called <strong>Lotags</strong> and <strong>Hitags</strong> storing a 16-bit numeric value (by using the Alt+T and Alt+H key combinations in 2D mode).<br />
<br />
In Shadow Warrior, the tag system was extended even further -- in addition to Lotags and Hitags, objects can potentially have 15 numerical tags (TAG1 corresponds to the Hitag, and TAG2 to the Lotag) and 11 boolean tags (BOOL1-BOOL11). In 2D mode, these can be configured with the ' and ; keys in combination with a numeric key (0-9).<br />
<br />
We can also use special purpose sprites that are visible in the editor, but hidden in the game:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhkCZsvNjkrnrq1tLvdfO14O2jQHt5k2kA-6mmM_EvfxVQ81qOZ3yVFEYzeCg-ET5W8ziR-yLXGHLpFVRQJxKwvf5lNKKaug-MI7fF53ax92AmK440SRH68xfy68VxiMHOSr_Am8xvDs1w_GBs5bt3ANdmCnLrQMWxVnNBq9uJgGT0fkNiEGYPHo9Mwfw/s640/st1.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="480" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhkCZsvNjkrnrq1tLvdfO14O2jQHt5k2kA-6mmM_EvfxVQ81qOZ3yVFEYzeCg-ET5W8ziR-yLXGHLpFVRQJxKwvf5lNKKaug-MI7fF53ax92AmK440SRH68xfy68VxiMHOSr_Am8xvDs1w_GBs5bt3ANdmCnLrQMWxVnNBq9uJgGT0fkNiEGYPHo9Mwfw/s600/st1.png"/></a></div>
<br />
In the above screenshot of my Shadow Warrior map, there are multiple special purpose sprites visible: the ST1 sprites (that can be used to control all kinds of effects, such as moving a door). ST1 sprites are visible in the editor, but not in the game.<br />
<br />
Although both games use the same principles for configuring game effects, their game mechanics are completely different.<br />
<br />
In the next sections, I will show all the relevant game effects in my Duke Nukem 3D map and explain how I translated them to Shadow Warrior.<br />
<br />
<h3>Differences in conventions</h3>
<br />
As explained earlier, both games frequently use Lotags and Hitags to create effects.<br />
<br />
In Duke Nukem 3D, a Lotag value typically determines the kind of effect, while a Hitag value is used as a <strong>match tag</strong> to group certain events together. For example, multiple doors can be triggered by the same switch by using the same match tag.<br />
<br />
Shadow Warrior uses the opposite convention -- a Hitag value typically determines the effect, while a Lotag value is often used as a match tag.<br />
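<br />
In short:<br />
<br />
<pre>
Duke Nukem 3D:   Lotag = kind of effect,  Hitag = match tag
Shadow Warrior:  Hitag = kind of effect,  Lotag = match tag
</pre>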
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgScvRZjTBoyQgW40LSN175bu8pE9xqDvoCXI9J7DFY6fQHyHrJKH0t2kGtEXIBBXkyLpB0irKDBnkUWVJFiXLgofAYiNKvYaIodOWxuIDTSfDg4seEt98RJ-DSjsXnYU9YbdS7S4zkgq639FHDYdk40KdjDvWZ97Aubra10FUjI-avMAE7ym94po0mTw/s680/specialpurposesprites.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="520" data-original-width="680" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgScvRZjTBoyQgW40LSN175bu8pE9xqDvoCXI9J7DFY6fQHyHrJKH0t2kGtEXIBBXkyLpB0irKDBnkUWVJFiXLgofAYiNKvYaIodOWxuIDTSfDg4seEt98RJ-DSjsXnYU9YbdS7S4zkgq639FHDYdk40KdjDvWZ97Aubra10FUjI-avMAE7ym94po0mTw/s600/specialpurposesprites.png"/></a></div>
<br />
Furthermore, in Duke Nukem 3D there are many kinds of special purpose sprites, as shown in the screenshot above. The S-symbol sprite is called a Sector Effector, which determines the kind of effect that a sector has; the M-symbol is a MUSIC&SFX sprite, used to configure a sound for a certain event; and a GPSPEED sprite determines the speed of an effect.<br />
<br />
Shadow Warrior has fewer special purpose sprites. In almost all cases, we end up using the ST1 sprite (with index 2307) for the configuration of an effect.<br />
<br />
ST1 sprites typically combine multiple interactivity properties. For example, to turn a sector into a door that opens slowly, produces a sound effect and closes automatically, we need to use three Sector Effector sprites and one GPSPEED sprite in Duke Nukem 3D. In Shadow Warrior, the same is accomplished by using only two ST1 sprites.<br />
<br />
Because the upgraded BUILD engine in Shadow Warrior supports more than two numerical tags (and boolean values) per object, several kinds of functionality can be combined into a single sprite.<br />
<br />
<h3>Co-op respawn points</h3>
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjBPOHFoShM82QiLtFIrRn4M5IlNfE765FEUx8__THu40CvchgPWpJTeurEJqcZoC0Qv47ACInR2v2KnfjetS87YMIhBSxfrNrAgcEkduROoVdhfGKlb_7_DJdezsi86SNvZuGASyun85_AkBwv9WMu0-MX8ppzalE19ngP9G5BbXxmXbjVax28DNf58A/s640/coop.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="480" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjBPOHFoShM82QiLtFIrRn4M5IlNfE765FEUx8__THu40CvchgPWpJTeurEJqcZoC0Qv47ACInR2v2KnfjetS87YMIhBSxfrNrAgcEkduROoVdhfGKlb_7_DJdezsi86SNvZuGASyun85_AkBwv9WMu0-MX8ppzalE19ngP9G5BbXxmXbjVax28DNf58A/s600/coop.png"/></a></div>
<br />
To make it possible to play a multiplayer cooperative game, you need to add co-op respawn points to your map. In Duke Nukem 3D, this can be done by adding seven sprites with texture 1405 and setting the Lotag value of the sprites to 1. Furthermore, the player's respawn point is also automatically a co-op respawn point.<br />
<br />
In Shadow Warrior, co-op respawn points can be configured by adding ST1 sprites with Hitag 48. You need eight of them, because the player's starting point is not a co-op start point. Each respawn point requires a unique Lotag value (a value between 0 and 7).<br />
<br />
<h3>Duke match/Wang Bang respawn points</h3>
<br />
For the other multiplayer game mode: Duke match/Wang Bang, we also need respawn points. In both games, the process is similar to their co-op counterparts -- in Duke Nukem 3D, you need to add seven sprites with texture 1405, and set the Lotag value to 0. Moreover, the player's respawn point is also a Duke match respawn point.<br />
<br />
In Shadow Warrior, we need to use ST1 sprites with a Hitag value of 42. You need eight of them, each with a unique Lotag value between 0 and 7 -- the player's respawn point is not a Wang Bang respawn point.<br />
<br />
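To summarize the respawn point conventions of both games:<br />
<br />
<pre>
Duke Nukem 3D co-op:       7 sprites with texture 1405, Lotag 1 (+ the player start)
Duke Nukem 3D Duke match:  7 sprites with texture 1405, Lotag 0 (+ the player start)
Shadow Warrior co-op:      8 ST1 sprites, Hitag 48, unique Lotags 0-7
Shadow Warrior Wang Bang:  8 ST1 sprites, Hitag 42, unique Lotags 0-7
</pre>
<br />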
<h3>Underwater areas</h3>
<br />
As explained earlier, the BUILD engine makes it possible to have overlapping sectors, but they cannot be observed simultaneously in 3D mode -- as such, it is not possible to natively provide a room over room experience, although there are some tricks to cope with that limitation.<br />
<br />
In both games it is possible to dive into the water and swim in underwater areas, giving the player some form of a room over room experience. The trick is that the BUILD engine never renders both sectors at the same time -- when you dive into the water or surface again, you get teleported from one sector to another sector in the map.<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEib2fSzqFvCygDKWBFXY5w_aWMGVJpzSvo15jaMgpKWU-c8JmaG5k3caDimMNEALIc5xOlVzJ5UY5BBUBi_HPD_3ktjqMPJvlSNkHhp0EKsNxFQ25DPN0w11kmFZjE6z5Mec4Fn2igpyvZ3uGycF4_yvzgMkikWirKyNpyz5NiSeZaZNJd7QSuCOGeCLA/s640/d3d_upperarea.png" style="display: block; padding: 1em 0; text-align: center; clear: left; float: left;"><img alt="" border="0" width="250" data-original-height="480" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEib2fSzqFvCygDKWBFXY5w_aWMGVJpzSvo15jaMgpKWU-c8JmaG5k3caDimMNEALIc5xOlVzJ5UY5BBUBi_HPD_3ktjqMPJvlSNkHhp0EKsNxFQ25DPN0w11kmFZjE6z5Mec4Fn2igpyvZ3uGycF4_yvzgMkikWirKyNpyz5NiSeZaZNJd7QSuCOGeCLA/s400/d3d_upperarea.png"/></a></div>
<div class="separator"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhk70qf1y8OZejddiujCIrBkhuBn8stSMXO-TGu6VmpVavwjJnt8DWVgCrEcSSCtLmSHxAs4DRo0LPCW_uGESLWmArEJV0Z4cCtPCSEjEy1wliQn6VGPVy4bOx3vqvrfjHjwaohR4_jX3ZQWtXMhefXkepDQS6ZsEV4vWYfQ5IohwNBo6tbDssgnF5dew/s640/d3d_lowerarea.png" style="display: block; padding: 1em 0; text-align: center; clear: right; float: right;"><img alt="" border="0" width="250" data-original-height="480" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhk70qf1y8OZejddiujCIrBkhuBn8stSMXO-TGu6VmpVavwjJnt8DWVgCrEcSSCtLmSHxAs4DRo0LPCW_uGESLWmArEJV0Z4cCtPCSEjEy1wliQn6VGPVy4bOx3vqvrfjHjwaohR4_jX3ZQWtXMhefXkepDQS6ZsEV4vWYfQ5IohwNBo6tbDssgnF5dew/s400/d3d_lowerarea.png"/></a></div>
<div style="clear: both;"></div>
<br />
Although both games use a similar kind of teleportation concept for underwater areas, they are configured in a slightly different way.<br />
<br />
In both games, you need the ability to sink into the water in the upper area. In Duke Nukem 3D, the player automatically sinks by giving the sector a Lotag value of 1. In Shadow Warrior, you need to add an ST1 sprite with a Hitag value of 0, and a Lotag value that determines how much the player will sink. 40 is typically a good value for water areas.<br />
<br />
The underwater sector in Duke Nukem 3D needs a Lotag value of 2. In the game, the player will automatically swim when it enters the sector and the colors will be turned blue-ish.<br />
<br />
We also need to determine from what position in a sector a player will teleport. Both the upper and lower sectors should have the same 2-dimensional shape. In Duke Nukem 3D, teleportation can be specified by two Sector Effector sprites with a Lotag of 7. These sprites need to be in exactly the same position in the upper and lower sectors. The Hitag value (match tag) needs to be the same:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhixUIACDQrU_izYdVkcRI2B9pcfRJjih3UVzMJix7fo5pLKAIRdclnBxSrrMKrGGzEciLJcaFh3PRI5e79vs6_5WgeQzbTzSiReTviJtH3ULtfzCxsQGJLMkoFs4c6WndSt1Vk6PMEdCfZoxL5bIaL86hI6e20HxWYEPvj7kw0UrWsNCzQWLv2R8WLQw/s680/watersurface.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="520" data-original-width="680" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhixUIACDQrU_izYdVkcRI2B9pcfRJjih3UVzMJix7fo5pLKAIRdclnBxSrrMKrGGzEciLJcaFh3PRI5e79vs6_5WgeQzbTzSiReTviJtH3ULtfzCxsQGJLMkoFs4c6WndSt1Vk6PMEdCfZoxL5bIaL86hI6e20HxWYEPvj7kw0UrWsNCzQWLv2R8WLQw/s600/watersurface.png"/></a></div>
<br />
In the screenshot above, we can see a 2D grid with two Sector Effector sprites having a Lotag of 7 (teleporter) and unique match tags (110 and 111). Both the upper and underwater sectors have exactly the same 2-dimensional shape.<br />
<br />
In Shadow Warrior, teleportation is also controlled by sprites that should be in exactly the same position in the upper and lower sectors.<br />
<br />
In the upper area, we need an ST1 sprite with a Hitag value of 7 and a unique Lotag value. In the underwater area, we need an ST1 sprite with a Hitag value of 8 and the same match Lotag. The latter ST1 sprite (with Hitag 8) automatically lets the player swim. If the player is in an underwater area where he cannot surface, the match Lotag value should be 0.<br />
<br />
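Putting the Shadow Warrior side together, the underwater configuration looks as follows (the match Lotag of 42 is just an example value):<br />
<br />
<pre>
Upper (water surface) sector:
  ST1 sprite, Hitag 0, Lotag 40   (the player sinks; 40 works well for water)
  ST1 sprite, Hitag 7, Lotag 42   (teleport source)

Underwater sector (same 2D shape, sprite in the same relative position):
  ST1 sprite, Hitag 8, Lotag 42   (teleport destination; lets the player swim)
</pre>
<br />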
In Duke Nukem 3D the landscape will automatically look blue-ish in an underwater area. To make the landscape look blue-ish in Shadow Warrior, we need to adjust the palette of the walls, floors and ceilings from 0 to 9.<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEikbrmpj94XRkH8RMMXh0wuc5ebPWEyFf3TBpxQ15v_PeIz0dbVX9cj_OCx-PyrlQns5mJwn869mOgm-WguzCvFlLJM_igI9IDLHOgtuifjGXnEYftPmeWHPTGzNx-YEKpuWIvdsn0W4Pv1DXvmpZ6w50XeSBfouA7r57DcP1KuSYglTXeHQnNtgmH9rQ/s640/sw_lowerarea.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="480" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEikbrmpj94XRkH8RMMXh0wuc5ebPWEyFf3TBpxQ15v_PeIz0dbVX9cj_OCx-PyrlQns5mJwn869mOgm-WguzCvFlLJM_igI9IDLHOgtuifjGXnEYftPmeWHPTGzNx-YEKpuWIvdsn0W4Pv1DXvmpZ6w50XeSBfouA7r57DcP1KuSYglTXeHQnNtgmH9rQ/s600/sw_lowerarea.png"/></a></div>
<br />
<h3>Garage doors</h3>
<br />
In my map, I commonly use garage/DOOM-style doors that move up when you touch them.<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhEpLnmjf-NdOEPsgzUElgbLCfvgDHhajgT4DXX8oeWjPDwDVrtqOmCR_iuXFJbFuMorjMKwoPlQInrgTKr4yioI2a66N8yyJLetjp7DGoUYirTBW_u3PUd4yDoOcoHTwkxnWjTTunGZUAr1xQMgAsxP13praW76W9YSkhQG4hAw7gfNnlIYM16M1uxeA/s640/d3d_garagedoor.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="480" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhEpLnmjf-NdOEPsgzUElgbLCfvgDHhajgT4DXX8oeWjPDwDVrtqOmCR_iuXFJbFuMorjMKwoPlQInrgTKr4yioI2a66N8yyJLetjp7DGoUYirTBW_u3PUd4yDoOcoHTwkxnWjTTunGZUAr1xQMgAsxP13praW76W9YSkhQG4hAw7gfNnlIYM16M1uxeA/s600/d3d_garagedoor.png"/></a></div>
<br />
In Duke Nukem 3D, we can turn a sector into a garage door by giving it a Lotag value of 20 and lowering the ceiling in such a way that it touches the floor. By default, opening a door does not produce any sound. Moreover, a door will not close automatically.<br />
<br />
We can adjust that behaviour by placing two special purpose sprites in the door sector:<br />
<br />
<ul>
<li>By adding a MUSIC&SFX sprite we can play a sound. The Lotag value indicates the sound number. 166 is typically a good sound.</li>
<li>To automatically close the door after a certain time interval, we need to add a Sector Effector sprite with Lotag 10. The Hitag indicates the time interval. For many doors, 100 is a good value.</li>
</ul>
<br />
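In summary, an auto-closing garage door with a sound effect in Duke Nukem 3D consists of the following ingredients:<br />
<br />
<pre>
Door sector:  Lotag 20, ceiling lowered so that it touches the floor
  MUSIC&SFX sprite:        Lotag 166            (sound number)
  Sector Effector sprite:  Lotag 10, Hitag 100  (auto close time interval)
</pre>
<br />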
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiCcngm2SmiWa8KCx9_p2nOgszlwKH05uyPhr7rjxS2dt_Q2GZnVOzH80-mTOWKd33YVKizV0S_RCizpOYEvdgigjOxMZt3DiyzK3siCs_G5toJ0Fv70huR18kEzC7IXvWD4Gs7dLxuuvjN8AHLM1RKYYqq7eYcLEGnoEpeDnwMHDqm8KERdJNylujRmQ/s640/d3d_garagedoor_open.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="480" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiCcngm2SmiWa8KCx9_p2nOgszlwKH05uyPhr7rjxS2dt_Q2GZnVOzH80-mTOWKd33YVKizV0S_RCizpOYEvdgigjOxMZt3DiyzK3siCs_G5toJ0Fv70huR18kEzC7IXvWD4Gs7dLxuuvjN8AHLM1RKYYqq7eYcLEGnoEpeDnwMHDqm8KERdJNylujRmQ/s600/d3d_garagedoor_open.png"/></a></div>
<br />
In the above screenshot, we can see what the garage door looks like if I slightly move the ceiling up (normally, the ceiling should touch the floor). The door sector contains both a MUSIC&SFX sprite (to give the door a sound effect) and a Sector Effector sprite (to ensure that the door gets closed automatically).<br />
<br />
In Shadow Warrior, we can accomplish the same thing by adding an ST1 sprite to the door sector with Hitag 92 (Vator). A vator is a multifunctional concept that can be used to move sectors up and down in all kinds of interesting ways.<br />
<br />
An auto-closing garage door can be configured by giving the ST1 sprite the following tag and boolean values:<br />
<br />
<ul>
<li>TAG2 (Lotag) is a match tag that should refer to a unique numeric value.</li>
<li>TAG3 specifies the type of vator. 0 indicates that it is operated manually or by a switch/trigger.</li>
<li>TAG4 (angle) specifies the speed of the vator. 350 is a reasonable value.</li>
<li>TAG9 specifies the auto return time. 35 is a reasonable value.</li>
<li>BOOL1 specifies whether the door should be opened by default. Setting it to 1 (true) allows us to keep the door open in the editor, rather than moving the ceiling down so that it touches the floor.</li>
<li>BOOL3 specifies whether the door can crush the player. We set it to 1 to prevent this from happening.</li>
</ul>
<br />
By default, a vator moves a sector down on first use. To make the door move up, we must rotate the ST1 sprite twice in 3D mode (by pressing the F key twice).<br />
<br />
We can configure a sound effect by placing another ST1 sprite near the door sector with a Hitag value of 134. We can use TAG4 (angle) to specify the sound number. 473 is a good value for many doors.<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjbubuz7QbPeO8M0Q0Ng2ex-aHULkUyj2wsnaeZBzk3008i_VMe-x_x8QIKG3wFNZrbNEg4HUjyrryDlnVKtSFeq3XaEFcIrBeXsS22jl-FhD-2tV5nZHdmTkanFD6unm0XDJz7f335ZhzGtkumG5vJL85DUBFs_smRpH2mcfND7s75d6e-kHWEAwhXgQ/s640/sw_garagedoor.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="480" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjbubuz7QbPeO8M0Q0Ng2ex-aHULkUyj2wsnaeZBzk3008i_VMe-x_x8QIKG3wFNZrbNEg4HUjyrryDlnVKtSFeq3XaEFcIrBeXsS22jl-FhD-2tV5nZHdmTkanFD6unm0XDJz7f335ZhzGtkumG5vJL85DUBFs_smRpH2mcfND7s75d6e-kHWEAwhXgQ/s600/sw_garagedoor.png"/></a></div>
<br />
In the above screenshot, we can see what a garage door looks like in Shadow Warrior. The rotated ST1 sprite defines the Vator, whereas the regular ST1 sprite provides the sound effect.<br />
<br />
<h3>Lifts</h3>
<br />
Another prominent feature of my Duke Nukem 3D map is the lifts that allow the player to reach the tops or roofs of the buildings.<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjRHrEn1agTaHDQpuE9Fmd0nEsG41qDRXY7ffsHxTQ1yaHywN8xiQmExaDpz7TL5hQfEnnGZ8t0yw3sHVB1iKRldf1RmMQOTPfeaXNDIqYJr2hteBbs8lKCszBr3e1Iynh4uHtbdG5YicYbLw0rwmLZ3H6_3ybLZ82P-zp4Eb-JPna5_F9MooJGtu34XA/s640/d3d_lift.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="480" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjRHrEn1agTaHDQpuE9Fmd0nEsG41qDRXY7ffsHxTQ1yaHywN8xiQmExaDpz7TL5hQfEnnGZ8t0yw3sHVB1iKRldf1RmMQOTPfeaXNDIqYJr2hteBbs8lKCszBr3e1Iynh4uHtbdG5YicYbLw0rwmLZ3H6_3ybLZ82P-zp4Eb-JPna5_F9MooJGtu34XA/s600/d3d_lift.png"/></a></div>
<br />
In Duke Nukem 3D, lift mechanics are a fairly simple concept -- we should give a sector a Lotag value of 17 and the sector will automatically move up or down when the player presses the use key while standing in the sector. The Hitag of a MUSIC&SFX sprite determines the stop sound and its Lotag value the start sound.<br />
<br />
In Shadow Warrior, there is no direct equivalent of the same lift concept, but we can create a switch-operated lift by using the Vator concept (the same ST1 sprite with Hitag 92 used for garage doors) with the following properties:<br />
<br />
<ul>
<li>TAG2 (Lotag) should refer to a unique match tag value. The switches should use the exact same value.</li>
<li>TAG3 determines the type of vator. 1 is used to indicate that it can only be operated by switches.</li>
<li>TAG4 (Angle) determines the speed of the vator. 325 is a reasonable value.</li>
</ul>
<br />
We have to move the ST1 sprite to the same height where the lift should arrive after it has moved up.<br />
<br />
Since it is not possible to respond to the use key while the player is standing in the sector, we have to add switches to control the lift. A possible switch is sprite number 575. The Hitag should match the Lotag value of the ST1 sprite. The switch sprite should have a Lotag value of 206 to indicate that it controls a Vator.<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQa_8gi0ClJJPQDxEZ8iPgm7A8g3CpEEUtlo_Sr20GAt9kglsWQ1hzPeRCfTZeSxZvB16kB3BF7LuE2JoRUnkornylk-KSX9nbeXa14HmI_lMnae1Y7aBTosAHpUDACFPBeQfuuwmVkygECk978XgLJICtKpb7I64_qc7CENPR4u0fcjYo_mGjCk9cSg/s640/sw_lift.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="480" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQa_8gi0ClJJPQDxEZ8iPgm7A8g3CpEEUtlo_Sr20GAt9kglsWQ1hzPeRCfTZeSxZvB16kB3BF7LuE2JoRUnkornylk-KSX9nbeXa14HmI_lMnae1Y7aBTosAHpUDACFPBeQfuuwmVkygECk978XgLJICtKpb7I64_qc7CENPR4u0fcjYo_mGjCk9cSg/s600/sw_lift.png"/></a></div>
<br />
The above screenshot shows the result of my porting effort -- switches have been added and the MUSIC&SFX sprite was replaced by an equivalent ST1 sprite. The ST1 sprite that controls the movement is not visible, because it was moved up to the same height as the adjacent upper floor.<br />
<br />
<h3>Swinging doors</h3>
<br />
In addition to garage doors, my level also contains a number of swinging doors.<br />
<br />
In Duke Nukem 3D, a sector can be turned into a swinging door by giving it a Lotag of 23 and moving the floor up a bit. We also need to add a Sector Effector with Lotag 11 and a unique Hitag value that acts as the door's pivot.<br />
<br />
As with garage doors, they will not produce any sound effects or close automatically by default, unless we add a MUSIC&SFX and a Sector Effector sprite (with Lotag 10) to the door sector.<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjm8gpGcVPyrW09-Th6JQo45QBxobm380MAR-EWa0F9gWT0MU8BD7MLAy1W6TbN6t_9e7Yf8-ihUhatNAzgcovVaRJ8M_BEyxnolmm3LAdMWMxPrZWX79Q0vpmyR5ZUK4RQyzSH_tX5ODEws-lmlFUoTfuWi1jfNuexIFttOzkXZ9AngbaNvVK0oFLWIQ/s640/d3d_swingdoor.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="480" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjm8gpGcVPyrW09-Th6JQo45QBxobm380MAR-EWa0F9gWT0MU8BD7MLAy1W6TbN6t_9e7Yf8-ihUhatNAzgcovVaRJ8M_BEyxnolmm3LAdMWMxPrZWX79Q0vpmyR5ZUK4RQyzSH_tX5ODEws-lmlFUoTfuWi1jfNuexIFttOzkXZ9AngbaNvVK0oFLWIQ/s600/d3d_swingdoor.png"/></a></div>
<br />
In Shadow Warrior, the rotating door concept is almost the same. We need to add an ST1 sprite with Hitag 144 and a unique Lotag value to the sector that acts as the door's pivot.<br />
<br />
In addition, we need to add an ST1 sprite to the sector that configures a rotator:<br />
<br />
<ul>
<li>TAG2/Lotag determines a unique match tag value that should be identical to that of the door's pivot ST1 sprite.</li>
<li>TAG3 determines the type of rotator. 0 indicates that it can be triggered manually or by a switch.</li>
<li>TAG5 determines the angle move amount. 512 specifies that the door should move 90 degrees to the right; -512 moves it 90 degrees to the left.</li>
<li>TAG7 specifies the angle increment. 50 is a good value.</li>
<li>TAG9 specifies the auto return time. 35 is a good value.</li>
</ul>
<br />
As with garage doors, we also need to add an ST1 sprite (with Hitag 134) to produce a sound. TAG4 (the angle) can be used to specify the sound number. 170 is a good value for rotating doors.<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhuN3APhO2GhJbGnjqIvi2dkWVU81NXWM3ltWkrJmdAoARbddZdnHzkWB1iD89IM7k9QD8QgMXWYgBjyU7RQukqLIKdKxEK4aa9pjry5Lb3g9ENGebAtPUIY5Mo3h3AZ2g58YG0kjeFNleqSvbZ2nE-AFG8Pzz7PEsEVffjsiGf_ZMbA1YFe183P9DQVw/s640/sw_swingdoor.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="480" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhuN3APhO2GhJbGnjqIvi2dkWVU81NXWM3ltWkrJmdAoARbddZdnHzkWB1iD89IM7k9QD8QgMXWYgBjyU7RQukqLIKdKxEK4aa9pjry5Lb3g9ENGebAtPUIY5Mo3h3AZ2g58YG0kjeFNleqSvbZ2nE-AFG8Pzz7PEsEVffjsiGf_ZMbA1YFe183P9DQVw/s600/sw_swingdoor.png"/></a></div>
<br />
<h3>Secret places</h3>
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiRHjbV1FPh8Rk30hHeMsB9ldcJJ143ozkfnjc0z6oldUPbFWt1Af0P3R8jG51C6vD_g7x-N3LNkwp2swybXGrI0FoG23YIp3-OwzB7uvlusc8HZywuZ16-UUinBPq1X_n_RaWV23EuBsIyPF0N42_C7xrDSObPIZDNr-NJo6fXc_8gfMtFlshXotra3A/s640/secretplace.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="480" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiRHjbV1FPh8Rk30hHeMsB9ldcJJ143ozkfnjc0z6oldUPbFWt1Af0P3R8jG51C6vD_g7x-N3LNkwp2swybXGrI0FoG23YIp3-OwzB7uvlusc8HZywuZ16-UUinBPq1X_n_RaWV23EuBsIyPF0N42_C7xrDSObPIZDNr-NJo6fXc_8gfMtFlshXotra3A/s600/secretplace.png"/></a></div>
<br />
My map also has a number of secret places (please do not tell anyone :-) ). In Duke Nukem 3D, any sector that has a Lotag value of <i>32767</i> is considered a secret place. In Shadow Warrior the idea is the same -- any sector with a Lotag of 217 is considered a secret place.<br />
<br />
<h3>Puzzle switches</h3>
<br />
Some Duke Nukem 3D maps also have so-called puzzle switches requiring the player to find the correct on-and-off combination to unlock something. In my map they are scattered all over the level to unlock the final key. The E2L1 map in Duke Nukem 3D shows a better example:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiYCGBToCI0LMiuuy5v47VJNwGY2ma5VVXOo40GyUqcZPdPEltuGtflcnq3DivV6B6X12h-SMHK51nWQ_YdAg3f7xIVVIVQH1yA4STBhZB6NrEjj2OKYY9IegRw5aZHuY82hSrAKcDz6R6g1d0Z_2G8QIFq4iPVmfQPHy1Ex0ZFvvmdkh2BBg5jdt_dxQ/s640/puzzle.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="480" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiYCGBToCI0LMiuuy5v47VJNwGY2ma5VVXOo40GyUqcZPdPEltuGtflcnq3DivV6B6X12h-SMHK51nWQ_YdAg3f7xIVVIVQH1yA4STBhZB6NrEjj2OKYY9IegRw5aZHuY82hSrAKcDz6R6g1d0Z_2G8QIFq4iPVmfQPHy1Ex0ZFvvmdkh2BBg5jdt_dxQ/s600/puzzle.png"/></a></div>
<br />
We can use the Hitag value to determine whether the switch needs to be switched off (0) or on (1). We can use the Lotag as a match tag to group multiple switches.<br />
<br />
In Shadow Warrior, each switch uses its Hitag as a match tag and its Lotag value to configure the switch type. Giving a switch a Lotag value of 213 makes it a combo switch. TAG3 can be set to 0 to indicate that the switch needs to be turned off and to 1 that it needs to be turned on.<br />
<br />
<h3>Skill settings</h3>
<br />
Both games have four skill levels. The idea is that the higher the skill level is, the more monsters you will have to face.<br />
<br />
In Duke Nukem 3D, you can specify the minimum skill level of a monster by giving the sprite a corresponding Lotag value. For example, giving a monster a Lotag value of 2 means that it will only show up when the skill level is two or higher (skill level 2 corresponds to: Let's rock). 0 (the default value) means that it will show up at any skill level:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEimTINDybsPAAZ4AKZC3iJesrLQzecUykOwhTl95whcbWTrBa7MWxa4aBKR5TX3q2srpboX_tcOqoyZ0ashF3OkFkVqecGS-at3rVvREGbJo1bw0o101fgm2d1qWFDz0g48rJwqkZWVHQU0e2E87ANUEU_Tt9nSkHrIBl2mUZy1n41EMY25s2BlhMUkDQ/s680/d3d_skills.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="520" data-original-width="680" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEimTINDybsPAAZ4AKZC3iJesrLQzecUykOwhTl95whcbWTrBa7MWxa4aBKR5TX3q2srpboX_tcOqoyZ0ashF3OkFkVqecGS-at3rVvREGbJo1bw0o101fgm2d1qWFDz0g48rJwqkZWVHQU0e2E87ANUEU_Tt9nSkHrIBl2mUZy1n41EMY25s2BlhMUkDQ/s600/d3d_skills.png"/></a></div>
<br />
In Shadow Warrior, each sprite has its own dedicated skill attribute that can be set by using the key combination ' + K. The skill level is displayed as one of the sprite's attributes.<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgzuLqSlUDbzeXojtoQ_qV6DvCogHh1EtULCyI7z1kgnV8O1SvVV9g6N-xxTZMEkHWd9j1oiQTpF1CFeO0YqPsKZ4ODgTRGVzP_HM0P9XOVFt1W2oQb2GnjE4p3dp2AsBkutmTJ1R-xY0yC9eOisSzQwJN10UhQvDUefKL0ZrjKRZb-K6bsuesHVokGSg/s640/sw_skills.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="480" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgzuLqSlUDbzeXojtoQ_qV6DvCogHh1EtULCyI7z1kgnV8O1SvVV9g6N-xxTZMEkHWd9j1oiQTpF1CFeO0YqPsKZ4ODgTRGVzP_HM0P9XOVFt1W2oQb2GnjE4p3dp2AsBkutmTJ1R-xY0yC9eOisSzQwJN10UhQvDUefKL0ZrjKRZb-K6bsuesHVokGSg/s600/sw_skills.png"/></a></div>
<br />
In the above screenshot, the sprite on the left has an <i>S:0</i> prefix, meaning that it will be visible at skill level 0 or higher. The sprite on the right (with prefix: <i>S:2</i>) appears at skill level 2 or higher.<br />
<br />
<h3>End switch</h3>
<br />
In both games, you typically complete a level by touching a so-called end switch. In Duke Nukem 3D an ending switch can be created by using sprite 142 and giving it a Lotag of 65535. In Shadow Warrior the idea is the same -- we can create an end switch by using sprite 2470 and giving it a Lotag of 116.<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg8mKCzKB0Z0l9WGc91V2IyOfJmlULAimSJgj-2cNg1qv-7_kL5HobNJ7kWI7f5jeDwRMOin3cSYO6AkZEM_SadcXF0eKGuFtDYELEOYFMcxq0xOwSA1RCm0ezElyowECZTo4Obo_3y1UBq_MSLaIP-jAxqsGd3326swTbdBtvoEzVnt-FoUZn85nfkaA/s640/d3d_end.png" style="display: block; padding: 1em 0; text-align: center; clear: left; float: left;"><img alt="" border="0" width="250" data-original-height="480" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg8mKCzKB0Z0l9WGc91V2IyOfJmlULAimSJgj-2cNg1qv-7_kL5HobNJ7kWI7f5jeDwRMOin3cSYO6AkZEM_SadcXF0eKGuFtDYELEOYFMcxq0xOwSA1RCm0ezElyowECZTo4Obo_3y1UBq_MSLaIP-jAxqsGd3326swTbdBtvoEzVnt-FoUZn85nfkaA/s400/d3d_end.png"/></a></div>
<div class="separator"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEghQvlOTm9PYOiU7yGKmZ4oxYizil_moaXY0yl9giDqPbJhhBpUC9dmJYTh9Bxi2ivo2VR3uZfHLrrmEI4z6IKrdnLUx2KmNrb1k_NAwtdiLNM-PLpWva2sqCNU0Dd7LQMPScPDu8nb0lPyjl0xHM8yvP_foQr8rQh4qSuc--ZLUj2XRj56SBJvuOuTjw/s640/sw_end.png" style="display: block; padding: 1em 0; text-align: center; clear: right; float: right;"><img alt="" border="0" width="250" data-original-height="480" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEghQvlOTm9PYOiU7yGKmZ4oxYizil_moaXY0yl9giDqPbJhhBpUC9dmJYTh9Bxi2ivo2VR3uZfHLrrmEI4z6IKrdnLUx2KmNrb1k_NAwtdiLNM-PLpWva2sqCNU0Dd7LQMPScPDu8nb0lPyjl0xHM8yvP_foQr8rQh4qSuc--ZLUj2XRj56SBJvuOuTjw/s400/sw_end.png"/></a></div>
<div style="clear: both;"></div>
<br />
<h2>Conclusion</h2>
<br />
In this blog post, I have described the process of porting a Duke Nukem 3D map to Shadow Warrior and explained some of the concepts that the two games have in common, as well as their differences.<br />
<br />
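As a quick reference, the following overview summarizes the rough equivalences covered in this blog post (see the corresponding sections above for the details and the remaining tag values):<br />
<br />
<pre style="overflow: auto;">
Concept          Duke Nukem 3D                        Shadow Warrior
---------------  -----------------------------------  --------------------------------------------
Garage door      sector Lotag 20                      ST1 sprite with Hitag 92 (Vator)
Door/lift sound  MUSIC&SFX sprite (Lotag = sound)     ST1 sprite with Hitag 134 (TAG4 = sound)
Auto close       Sector Effector with Lotag 10        TAG9 (auto return time) on the Vator ST1
Lift             sector Lotag 17                      Vator ST1 (TAG3 = 1) + switch with Lotag 206
Swinging door    sector Lotag 23 + SE with Lotag 11   pivot ST1 (Hitag 144) + rotator ST1
Secret place     sector Lotag 32767                   sector Lotag 217
Puzzle switch    Hitag = on/off, Lotag = match tag    Lotag 213, Hitag = match tag, TAG3 = on/off
Skill level      sprite Lotag                         sprite skill attribute (' + K)
End switch       sprite 142 with Lotag 65535          sprite 2470 with Lotag 116
</pre>
<br />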
Although this was a pretty useless project (the games are quite old, from the late 90s), I had a lot of fun doing it after not having touched this kind of technology for over a decade. I am quite happy with the result:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhb-mK-bG_h79kJF1C6mIUDfoG0N08UiMseHL5E8DX54dRdQ269oKfMuIZoNe6cp0fG-gyQ_1y3FpQb7qxewQ0LNL-YtY_hW1hZ_XGXTrHAc4wJRy7HNI1qJuZ5Vlb1UI8NVcqUgGSeZnymEvFDhEQrZx9snco6wssGNIwCy5uviA61zlPiLEmD2xg0Kw/s640/portedmap.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="480" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhb-mK-bG_h79kJF1C6mIUDfoG0N08UiMseHL5E8DX54dRdQ269oKfMuIZoNe6cp0fG-gyQ_1y3FpQb7qxewQ0LNL-YtY_hW1hZ_XGXTrHAc4wJRy7HNI1qJuZ5Vlb1UI8NVcqUgGSeZnymEvFDhEQrZx9snco6wssGNIwCy5uviA61zlPiLEmD2xg0Kw/s600/portedmap.png"/></a></div>
<br />
Despite the fact that this technology is old, I am still quite surprised to see how many maps and customizations are still being developed for <a href="https://www.moddb.com/games/duke-nukem-3d">these</a> ancient <a href="https://www.moddb.com/games/shadow-warrior/">games</a>. I think this can be attributed to the fact that these engines and game mechanics are highly customizable and still relatively simple to use due to the technical limitations at the time they were developed.<br />
<br />
Since I did most of my mapping/customization work many years before I started this blog, I thought that sharing my current experiences could be useful for others who intend to look at these games and create their own customizations.<br />
Sander van der Burghttp://www.blogger.com/profile/12718166966821611609noreply@blogger.com0tag:blogger.com,1999:blog-1397115249631682228.post-88435317210058564242022-04-20T12:10:00.000+02:002022-04-20T12:10:18.643+02:00In memoriam: Eelco Visser (1966-2022)On Tuesday 5 April 2022 I received the unfortunate news that my former master's and PhD thesis supervisor: <a href="https://eelcovisser.org">Eelco Visser</a> has unexpectedly passed away.<br />
<br />
Although I made <a href="https://sandervanderburg.blogspot.com/2012/10/my-post-phd-carreer-aka-leaving-academia.html">my transition from academia to industry almost 10 years ago</a> and I have not been actively doing academic research anymore (I published my last paper in 2014, almost two years after completing my PhD), I have always remained (somewhat) connected to the research world and the work carried out by my former supervisor, who started his own <a href="https://pl.ewi.tudelft.nl">programming languages research group</a> in 2013.<br />
<br />
He was very instrumental in domain-specific programming language research, but also in software deployment research, an essential part of almost any software development process.<br />
<br />
Without him and his ideas, his former PhD student: Eelco Dolstra would probably never have started the work that resulted in the <a href="https://sandervanderburg.blogspot.com/2012/11/an-alternative-explaination-of-nix.html">Nix package manager</a>. As a consequence, my research on software deployment (resulting in <a href="https://sandervanderburg.blogspot.com/2011/02/disnix-toolset-for-distributed.html">Disnix</a> and many other Nix-related tools and articles) and this blog would also not exist.<br />
<br />
In this blog post, I will share some memories of my time working with Eelco Visser.<br />
<br />
<h2>How it all started</h2>
<br />
The first time I met Eelco was in 2007, when I was still an MSc student. I had just completed the first year of the TU Delft master's programme and I was looking for an assignment for my master's thesis.<br />
<br />
Earlier that year, I was introduced to a concept called <strong>model-driven development</strong> (also ambiguously called model-driven engineering/architecture; the right terminology is open to interpretation) in a guest lecture by Jos Warmer in the software architecture course.<br />
<br />
Modeling software systems and automatically generating code (as much as possible) was one of the aspects that really fascinated me. Back then, I was already convinced that working from a higher abstraction level, with more "accessible" building blocks, could be quite useful to hide complexity, reduce the chances of errors and make developers more productive.<br />
<br />
In my first conversation with Eelco, he asked me why I was looking for a model-driven development assignment and he asked me various questions about my past experience.<br />
<br />
I told him about my experiences with Jos Warmer's lecture. Although he seemed to understand my enthusiasm, he also explained to me that his work was mostly about creating textual languages, not visual languages such as UML profiles, which are commonly used in MDA development processes.<br />
<br />
He also specifically asked me about the compiler construction course (also part of the master's programme), which provides essential basic knowledge about textual languages.<br />
<br />
The compiler construction course (as it was taught in 2007) was considered to be very complex by many students, in particular the practical assignment. For this assignment, you had to rewrite a parser from <a href="https://www.gnu.org/software/bison">GNU Bison</a> (a LALR(1) parser) to <a href="https://os.ghalkes.nl/LLnextgen">LLnextgen</a> (an LL(1) parser) and extend the reference compiler with additional object-oriented programming features. Moreover, the compiler was implemented in C and relied on advanced concepts, such as function pointers and proper alignment of members in a struct.<br />
<br />
I explained to Eelco that, despite the negative image of the course because of its complexity, I actually liked it very much. Already at a young age I had the idea to develop my own programming language, but I had no idea how to do it. When I was exposed to all these tools and concepts, I finally learned about all the missing bits and pieces.<br />
<br />
I was also trying to convince him that I am always motivated to dive deep into technical details. As an example, I explained to him that one of my personal projects was creating customized Linux distributions by following the <a href="https://linuxfromscratch.org">Linux from Scratch</a> book. Manually following all the instructions in the book is time consuming and difficult to repeat. To make deploying a customized Linux distribution doable, I developed my own automated solution.<br />
<br />
After elaborating about my (somewhat crazy) personal project, he told me that there was an ongoing research project that I would probably like very much. A former PhD student of his: Eelco Dolstra had developed the Nix package manager, and this package manager was used as the foundation for an entire Linux distribution: <a href="https://sandervanderburg.blogspot.com/2011/01/nixos-purely-functional-linux.html">NixOS</a>.<br />
<br />
He gave me a printed copy of Eelco Dolstra's thesis and convinced me that I should give NixOS a try.<br />
<br />
<h2>Research assignment</h2>
<br />
After reading Eelco Dolstra's PhD thesis and trying out NixOS (that was much more primitive in terms of features compared to today's version), Eelco Visser gave me my first research assignment.<br />
<br />
When he joined Delft University of Technology in 2006 (a year before I met him) as an associate professor, he started working on a new project called: <a href="https://webdsl.org">WebDSL</a>. Previously, most of his work was focused on the development of various kinds of meta-tools for creating domain specific languages, such as:<br />
<br />
<ul>
<li><strong>SDF2</strong> is a formalism used to write a lexical and context-free syntax. It has many interesting features, such as a module system and scannerless parsing, making it possible to embed a guest language in a host language (that may share the same keywords). SDF2 was originally developed by Eelco Visser for his PhD thesis, for the <a href="https://github.com/cwi-swat/meta-environment">ASF+SDF Meta Environment</a>.</li>
<li><strong>ATerm library</strong>. A library that implements the annotated terms format to exchange structured data between tools. SDF2 uses it to encode parse/abstract syntax trees.</li>
<li><strong><a href="http://strategoxt.org">Stratego/XT</a></strong>. A language and toolset for program transformation.</li>
</ul>
<br />
WebDSL was a new step for him, because it is an <strong>application language</strong> (built with the above tools) rather than a meta language.<br />
<br />
With WebDSL, in addition to just building an application language, he also had all kinds of interesting ideas about web application development and how to improve it, such as:<br />
<br />
<ul>
<li><strong>Reducing/eliminating boilerplate code</strong>. Originally WebDSL was implemented with <a href="https://jboss.org">JBoss</a> and the <a href="https://www.seamframework.org">Seam</a> framework using Java as an implementation language, requiring you to write a lot of boilerplate code, such as getters/setters, deployment descriptors etc.<br />
<br />
WebDSL is <a href="https://sandervanderburg.blogspot.com/2016/03/the-nixos-project-and-deploying-systems.html"><strong>declarative</strong></a> in the sense that you could more concisely describe <strong>what</strong> you want in a rich web application: a data model, and pages that should render content and data.</li>
<li><strong>Improving static consistency checking</strong>. Java (the implementation language used for the web applications) is statically typed, but not every concern of a web application can be statically checked. For example, for interacting with a database, embedded SQL queries (in strings) are often not checked. In JSF templates, page references are not checked.<br />
<br />
With WebDSL, all these concerns are checked before the deployment of a web application.</li>
</ul>
<br />
By the time I joined, he already assembled several PhD and master's students to work on a variety of aspects of WebDSL and the underlying tooling, such as Stratego/XT.<br />
<br />
Obviously, in a development process of a WebDSL application, like any application, you will also eventually face a <strong>deployment problem</strong> -- you need to perform activities to make the application available for use.<br />
<br />
For solving deployment problems in our department, Nix was already quite intensively used. For example, we had a Nix-based continuous integration service called the Nix buildfarm (several years later, its implementation was re-branded into <a href="https://sandervanderburg.blogspot.com/2013/04/setting-up-hydra-build-cluster-for.html">Hydra</a>), that built all bleeding edge versions of WebDSL, Stratego/XT and all other relevant packages. The Nix package manager was used by all kinds of people in the department to install bleeding edge versions of these tools.<br />
<br />
My research project was to automate the deployment of WebDSL applications using tooling from the Nix project. In my first few months, I have packaged all the infrastructure components that a WebDSL application requires in NixOS (JBoss, MySQL and later Apache Tomcat). I changed WebDSL to use GNU Autotools as build infrastructure (which was a common practice for all Stratego/XT related projects at that time) and made subtle modifications to prevent unnecessary recompilations of WebDSL applications (such as making the root folder dynamically configurable) and wrote an abstraction function to automatically build WAR files.<br />
<br />
Thanks to Eelco, I ended up in a really friendly and collaborative atmosphere. I came in touch with his PhD and master's students and we frequently had very good discussions and collaborations.<br />
<br />
Eelco was also quite helpful in the early stages of my research. For example, whenever I was stuck with a challenge he was always quite helpful in discussing the underlying problem and bringing me in touch with people that could help me.<br />
<br />
<h2>My master's thesis project</h2>
<br />
After completing the initial version of the <a href="https://sandervanderburg.blogspot.com/2011/05/deployment-abstractions-for-webdsl.html">WebDSL deployment tool</a>, which got me familiarised with the basics of Nix and NixOS, I started working on my master's thesis, which was a collaboration project between Delft University of Technology and Philips Research.<br />
<br />
Thanks to Eelco I came in contact with a former master's thesis student and postdoc of his: Merijn de Jonge who was employed by Philips Research. He was an early contributor to the Nix project and collaborated on the first two research papers about Nix.<br />
<br />
While working on my master's thesis I developed the first prototype version of Disnix.<br />
<br />
During my master's thesis project, Eelco Dolstra, who was formerly a postdoc at Utrecht University, joined our research group in Delft. Eelco Visser made sure that I got all the help from Eelco Dolstra with all my technical questions about Nix.<br />
<br />
<h2>Becoming a PhD student</h2>
<br />
My master's thesis project was a pilot for a bigger research project. Eelco Visser, Eelco Dolstra and Merijn de Jonge (I was already working quite intensively with them for my master's thesis) were working on a research project proposal. When the proposal got accepted by <a href="https://www.nwo.nl/en/researchprogrammes/jacquard">NWO/Jacquard</a> for funding, Eelco Visser was the first to inform me about the project to ask me what I thought about it.<br />
<br />
At that moment, I was quite surprised to even consider doing a PhD. A year before, I attended somebody else's PhD defence (someone who I really considered smart and talented) and thought that doing such a thing myself was way out of my reach.<br />
<br />
I also felt a bit like an impostor because I had interesting ideas about deployment, but I was still in the process of finishing up/proving some of my points.<br />
<br />
Fortunately, thanks to Eelco, my attitude completely changed in that year -- during my master's thesis project he convinced me that the work I was doing was relevant. What I also liked was the attitude in our group: to actively build tools, have the time and space to explore things, and eat our own dogfood by using them to solve relevant practical problems. Moreover, much of the work we did was also publicly available as <a href="https://sandervanderburg.blogspot.com/2011/02/free-and-open-source-software.html">free and open source software</a>.<br />
<br />
As a result, I easily grew accustomed to the research process and the group's atmosphere and it did not take long to make the decision to do a PhD.<br />
<br />
<h2>My PhD</h2>
<br />
Although Eelco Visser only co-authored one of my published papers, he was heavily involved in many aspects of my PhD. There are way too many things to talk about, but there are some nice anecdotes that I really find worth sharing.<br />
<br />
<h3>OOPSLA 2008</h3>
<br />
I still remember the first research conference that I attended: <a href="https://sandervanderburg.blogspot.com/2012/07/a-review-of-conferences-in-2008-2009.html">OOPSLA 2008</a>. I had a very quick publication start, with a paper covering an important aspect of my master's thesis: the upgrade aspect for distributed systems. I had to present my work at HotSWUp, an event co-located with OOPSLA 2008.<br />
<br />
(As a sidenote: because we had to put all our efforts into making the deadline, I had to postpone the completion of my master's thesis a bit, so it started to overlap with my PhD.)<br />
<br />
It was quite an interesting experience, because in addition to being my first conference, it was also the first time that I traveled to the United States and stepped into an airplane.<br />
<br />
The trip was basically a group outing -- I was joined by Eelco and many of his PhD students. In addition to my HotSWUp 2008 paper, we also had an OOPSLA paper (about the Dryad compiler), a WebDSL poster, and another paper about the implementation of WebDSL (the paper titled: "When frameworks let you down") to present.<br />
<br />
I was surprised to see how many people Eelco knew at the conference. He was also actively encouraging us to meet up with people and bringing us in touch with people he knew could be relevant.<br />
<br />
We were having a good time together, but I also remember him saying that it is actually much better to visit a conference alone, rather than in a group. Being alone makes it much easier and more encouraging to meet new people. That lesson stuck, and at many future events I took advantage of being alone as an opportunity to meet up with others.<br />
<br />
<h3>Working on practical things</h3>
<br />
Once in a while, I had casual discussions with him about ongoing things in my daily work. For my second paper, I had to travel to ICSE 2009 in Vancouver, Canada all by myself (there were some colleagues traveling to co-located events, but they took different flights).<br />
<br />
Despite the fact that I was doing research on Nix-related things, NixOS at that time was not yet the main operating system on my laptop, because it was missing features that I considered a must-have in a Linux distribution.<br />
<br />
In the weeks before the planned travel date, I was intensively working on getting all the software packaged that I considered important. One major packaging effort was getting KDE 4.2 to work, because I was dissatisfied with only having the KDE 3.5 base package available in NixOS. VirtualBox was another package that I considered critical, so that I could still run a conventional Linux distribution and Microsoft Windows.<br />
<br />
Nothing about this work could be considered scientific "research" that might result in a publishable paper. Nonetheless, Eelco recognized the value of making NixOS more usable and encouraged me to get all that software packaged. He even asked me: "Are you sure that you have packaged enough software in NixOS so that you can survive that week?"<br />
<br />
<h3>Starting my blog</h3>
<br />
Another particularly helpful piece of advice that he gave me was that I should start a blog. Although I had a very good start of my PhD, having a paper accepted in my first month and another several months later, I slowly ran into numerous paper rejections, with reviews that were not helpful at all.<br />
<br />
I talked to him about my frustrations and explained that software deployment research is generally a neglected subject. There is no research-related conference that is specifically about software deployment (there used to be a <a href="https://dblp.org/db/conf/cd/index.html">working conference on component deployment</a>, but by the time I became a PhD student it was no longer organized), so we always had to "package" our ideas into subjects for different kinds of conferences.<br />
<br />
He gave me the advice to start a blog to increase my interaction with the research community. As a matter of fact, many people in our research group, <a href="https://eelcovisser.org/blog">including Eelco</a>, had their own blogs.<br />
<br />
<a href="https://sandervanderburg.blogspot.com/2010/12/first-blog-post.html">It took me some time to take that step</a>. First, I had to "catch up" on my blog with relevant background materials. Eventually, it paid off -- I wrote a blog post titled: <a href="https://sandervanderburg.blogspot.com/2011/10/software-deployment-complexity.html">Software deployment complexity</a> to emphasize software deployment as an important research subject, and thanks to Eelco's Twitter network I came in touch with all kinds of people.<br />
<br />
<h3>Lifecycle management</h3>
<br />
For most of my publication work, I intensively worked with Eelco Dolstra. Eelco Visser left most of the practical supervision to him. The only published paper that we co-authored was: "Software Deployment in a Dynamic Cloud: From Device to Service Orientation in a Hospital Environment".<br />
<br />
There was also a WebDSL-related subject that we intensively worked on for a while, that unfortunately never fully materialized.<br />
<br />
Although I already had the static aspects of a WebDSL application deployment automated -- the infrastructure components (Apache Tomcat, MySQL) as well as a function to compile a Java Web application Archive (WAR) with the WebDSL compiler -- we also had to cope with the data that a WebDSL application stores: WebDSL data models can evolve, and when this happens, the data needs to be migrated from an old to a new table structure.<br />
<br />
Sander Vermolen, a colleague of mine, worked on a solution to make automated data migrations of WebDSL applications possible.<br />
<br />
At some point, we came up with the idea to make this all work together -- deployment automation and data migration from a high-level point of view hiding unimportant implementation details. Due to the lack of a better name we called this solution: "lifecycle management".<br />
<br />
Although the project seemed straightforward to me in the beginning, I (and probably all of us) heavily underestimated how complex it was to bring Nix's functional deployment properties to data management.<br />
<br />
For example, Nix makes it possible to store multiple variants of the same packages (e.g. old and new versions) simultaneously on a machine without conflicts and makes it possible to cheaply switch between versions. Databases, on the other hand, are modified imperatively. We could manage multiple versions of a database by making snapshots, but doing this atomically and in a portable way is very expensive, in particular when databases are big.<br />
<br />
Fortunately, the project was not a complete failure. I managed to publish <a href="https://sandervanderburg.blogspot.com/2015/07/deploying-state-with-disnix.html">a paper about a subset of the problem</a> (automatic data migrations when databases move from one machine to another and a snapshotting plugin system), but the entire solution was never fully implemented.<br />
<br />
During my PhD defence he asked me a couple of questions about this subject, from which (of course!) I understood that it was a bummer that we never fully realized the vision that we initially came up with.<br />
<br />
Retrospectively, we should have divided the problem into smaller chunks and solved each problem one by one, rather than working on the entire integration right from the start. The integrated solution would probably still consist of many trade-offs, but it still would have been interesting to come up with at least a solution.<br />
<br />
<h3>PhD thesis</h3>
<br />
When I was about to write my PhD thesis, I made the bold decision to not compose the chapters directly out of papers, but to write a coherent story using my papers as ingredients, similar to Eelco Dolstra's thesis. Although there are plenty of reasons to not do such a thing (e.g. it takes much more time for a reading committee to review such a thesis), he was actually quite supportive of it.<br />
<br />
On the other hand, I was not completely surprised by it, considering the fact that his PhD thesis was several orders of magnitude bigger than mine (over 380 pages!).<br />
<br />
<h2>Spoofax</h2>
<br />
After I completed my PhD, and made my transition to industry, he and his research group relentlessly kept working on the solution ecosystem that I just described.<br />
<br />
Already during my PhD, many improvements and additions were developed that resulted in the <a href="https://www.spoofax.dev">Spoofax language workbench</a>, an Eclipse plugin in which all these technologies come together to make the construction of Domain Specific Languages as convenient as possible. For a (somewhat :-) ) <a href="https://eelcovisser.org/blog/2021/02/08/spoofax-mip/">brief history of the Spoofax language workbench</a> I recommend you to read this blog post written by him.<br />
<br />
Moreover, he also kept dogfooding solutions to his own practical problems. During my PhD, three serious applications were created with WebDSL: <a href="https://researchr.org">researchr</a> (a social network for researchers sharing publications), <a href="https://yellowgrass.org">Yellowgrass</a> (an issue tracker) and <a href="https://weblab.tudelft.nl">Weblab</a> (a system to facilitate programming exams). These applications are still maintained and used by the university as of today.<br />
<br />
A couple of months after <a href="https://sandervanderburg.blogspot.com/2013/06/dr-sander.html">my PhD defence</a> in 2013 (I had to wait for several months to get feedback and a date for my defence), he was awarded the prestigious Vici grant and became a full professor, starting his own programming language research group.<br />
<br />
In 2014, when I had already been in industry for two years, I was invited to his inauguration ceremony and was given another demonstration of what Spoofax had become. I was really impressed by all the new meta languages that were developed and what Spoofax looked like. For example, SDF2 evolved into SDF3, a new meta-language for defining name bindings (NaBL) was developed, etc.<br />
<br />
Moreover, I liked his inauguration speech very much, in which he briefly demonstrated the complexities of computers and programming, and what value domain specific languages can provide.<br />
<br />
<h2>Concluding remarks</h2>
<br />
In this blog post, I have written down some of my memories working with Eelco Visser. I did this in the spirit of my blog, whose original purpose was to augment my research papers with practical information and other research aspects that you normally never read about.<br />
<br />
I am grateful for the five years that we worked together: that he gave me the opportunity to do a PhD with him, for all the support, the things he taught me, and the people he brought me in touch with. People that I still consider friends as of today.<br />
<br />
My thoughts are with his family, friends, the research community and the entire programming languages group (students, PhD students, Postdocs, and other staff).<br />
<br />
Sander van der Burghttp://www.blogger.com/profile/12718166966821611609noreply@blogger.com0tag:blogger.com,1999:blog-1397115249631682228.post-10284971178179474232022-02-14T22:23:00.003+01:002022-02-14T22:27:22.600+01:00A layout framework experiment in JavaScriptIt has been a while since I wrote a blog post about front-end web technology. The main reason is that I am not extensively doing front-end development anymore, but once in a while I still tinker with it.<br />
<br />
In my Christmas break, I wanted to expand my knowledge about modern JavaScript programming practices. To make the learning process more motivating, I dug up <a href="https://sandervanderburg.blogspot.com/2014/03/implementing-consistent-layouts-for.html">my old web layout framework project</a> and ported it to JavaScript.<br />
<br />
In this blog post, I will explain the rationale of the framework and describe the features of the JavaScript version.<br />
<br />
<h2>Background</h2>
<br />
Several years ago, I elaborated on some of the challenges that I faced while creating layouts for web applications. Although front-end web technologies (<a href="https://www.w3.org/html/">HTML</a> and <a href="https://www.w3.org/Style/CSS/Overview.en.html">CSS</a>) were originally created for pages (not graphical user interfaces), most web applications nowadays are <strong>complex information systems</strong> that typically have to present collections of data to end-users in a consistent manner.<br />
<br />
Although some concepts of web technology are powerful and straightforward, a native way to isolate layout from a page's content and style is still virtually non-existent (with the exception of <a href="https://www.w3.org/TR/html4/present/frames.html">frames</a>, which were deprecated a long time ago). As a consequence, it has become quite common to rely on custom <strong>abstractions</strong> and <strong>frameworks</strong> to organize layouts.<br />
<br />
Many years ago, I also found myself repeating the same patterns to implement consistent layouts. To make my life easier, I developed my own layout framework that allows you to define a <strong>model</strong> of your application layout, which captures common layout properties and all available sub pages and their dynamic content.<br />
<br />
A <strong>view</strong> function can render a requested sub page, using the path in the provided URL as a selector.<br />
<br />
I have created two implementations of the framework: one in <a href="https://openjdk.java.net">Java</a> and another in <a href="https://php.net">PHP</a>. The Java version was the original implementation but I ended up using the PHP version the most, because nearly all of the web applications I developed were hosted at shared web hosting providers only offering PHP as a scripting language.<br />
<br />
Something that I consider both an advantage and disadvantage of my framework is that it has to <strong>generate</strong> pages on the server-side. The advantage of this approach is that pages rendered by the framework will work in many browsers, even primitive text-oriented browsers that lack JavaScript support.<br />
<br />
A disadvantage is that server-side scripting requires a more complex server installation. Although PHP is relatively simple to set up, a Java Servlet container install (such as <a href="https://tomcat.apache.org">Apache Tomcat</a>) is typically more complex. For example, you typically want to put it behind a reverse proxy that serves static content more efficiently.<br />
<br />
Furthermore, executing server-side code for each request is also significantly more expensive (in terms of processing power) than serving static files.<br />
<br />
The interesting aspect of using JavaScript as an implementation language is that we can use the framework both on the client-side (in the browser) and on the server-side (with <a href="http://nodejs.org">Node.js</a>). The former makes it possible to host applications on a web server that only serves static content, making web hosting considerably easier and cheaper.<br />
<br />
<h2>Writing an application model</h2>
<br />
As explained earlier, my layout framework separates the model from a view. An application layout model can be implemented in JavaScript as follows:<br />
<br />
<pre style="font-size: 80%; overflow: auto;">
import { Application } from "js-sblayout/model/Application.mjs";
import { StaticSection } from "js-sblayout/model/section/StaticSection.mjs";
import { MenuSection } from "js-sblayout/model/section/MenuSection.mjs";
import { ContentsSection } from "js-sblayout/model/section/ContentsSection.mjs";
import { StaticContentPage } from "js-sblayout/model/page/StaticContentPage.mjs";
import { HiddenStaticContentPage } from "js-sblayout/model/page/HiddenStaticContentPage.mjs";
import { PageAlias } from "js-sblayout/model/page/PageAlias.mjs";
import { Contents } from "js-sblayout/model/page/content/Contents.mjs";
/* Create an application model */
export const application = new Application(
/* Title */
"My application",
/* Styles */
[ "default.css" ],
/* Sections */
{
header: new StaticSection("header.html"),
menu: new MenuSection(0),
submenu: new MenuSection(1),
contents: new ContentsSection(true)
},
/* Pages */
new StaticContentPage("Home", new Contents("home.html"), {
"404": new HiddenStaticContentPage("Page not found", new Contents("error/404.html")),
home: new PageAlias("Home", ""),
page1: new StaticContentPage("Page 1", new Contents("page1.html"), {
page11: new StaticContentPage("Subpage 1.1", new Contents("page1/subpage11.html")),
page12: new StaticContentPage("Subpage 1.2", new Contents("page1/subpage12.html")),
page13: new StaticContentPage("Subpage 1.3", new Contents("page1/subpage13.html"))
}),
page2: new StaticContentPage("Page 2", new Contents("page2.html"), {
page21: new StaticContentPage("Subpage 2.1", new Contents("page2/subpage21.html")),
page22: new StaticContentPage("Subpage 2.2", new Contents("page2/subpage22.html")),
page23: new StaticContentPage("Subpage 2.3", new Contents("page2/subpage23.html"))
}),
}),
/* Favorite icon */
"favicon.ico"
);
</pre>
<br />
The above source code file (<i>appmodel.mjs</i>) defines an <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Modules">ECMAScript module</a> exporting an <i>application</i> object. The <i>application</i> object defines the layout of a web application with the following properties:<br />
<br />
<ul>
<li>The <strong>title</strong> of the web application is: "My application".</li>
<li>All pages use: <i>default.css</i> as a <strong>common stylesheet</strong>.</li>
<li>Every page consists of a number of <strong>sections</strong> that have a specific purpose:<br />
<ul>
<li>A static section (<i>header</i>) provides content that is the same for every page.</li>
<li>A menu section (<i>menu</i>, <i>submenu</i>) displays links to sub pages that are part of the web application.</li>
<li>A content section (<i>contents</i>) displays variable content, such as text and images.</li>
</ul>
</li>
<li>An application consists of multiple <strong>pages</strong> that display the same sections. Every page object refers to a file with static HTML code providing the content that needs to be displayed in the content section.</li>
<li>The last parameter refers to a <strong>favorite icon</strong> that is the same for every page.</li>
</ul>
<br />
Pages in the application model are organized in a tree-like data structure. The application constructor only accepts a single page parameter that refers to the <strong>entry page</strong> of the web application. The entry page can be reached by opening the web application from the root URL or by clicking on the logo displayed in the <i>header</i> section.<br />
<br />
The entry page refers to two sub pages: <i>page1</i>, <i>page2</i>. The <i>menu</i> section displays links to the sub pages that are reachable from the entry page.<br />
<br />
Every sub page can also refer to its own sub pages. The <i>submenu</i> section will display links to the sub pages that are reachable from the selected sub page. For example, when <i>page1</i> is selected, the <i>submenu</i> section will display links to: <i>page11</i>, <i>page12</i> and <i>page13</i>.<br />
<br />
In addition to pages that are reachable from the menu sections, the application model also has hidden error pages and a <i>home</i> link that is an alias for the entry page. In many web applications, it is a common habit that in addition to clicking on the logo, a home button can also be used to redirect a user to the entry page.<br />
<br />
Besides using the links in the menu sections, any sub page in the web application can be reached by using the URL as a selector. A common convention is to use the path components in the URL to determine which page and sub page need to be displayed.<br />
<br />
For example, opening the following URL in a web browser:<br />
<br />
<pre>
http://localhost/page1/page12
</pre>
<br />
brings the user to the second sub page of the first sub page.<br />
<br />
When providing an invalid selector in the URL, such as <i>http://localhost/page4</i>, the framework automatically redirects the user to the <i>404</i> error page, because the page cannot be found.<br />
<br />
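Conceptually, this page selection process boils down to a walk over the page tree. The following sketch illustrates the idea (a hypothetical helper, not the framework's actual API, assuming that every page object exposes its sub pages through a <i>subPages</i> property):<br />
<br />
<pre style="overflow: auto;">
// Minimal sketch: resolve a page from a URL path by walking the page tree
function lookupPage(entryPage, pathname) {
    // Split the path into components, ignoring empty strings
    // caused by leading or trailing slashes
    const components = pathname.split("/").filter(component => component !== "");

    let page = entryPage;

    for (const component of components) {
        if (page.subPages && component in page.subPages) {
            page = page.subPages[component]; // Descend into the selected sub page
        } else {
            return entryPage.subPages["404"]; // Invalid selector: fall back to the error page
        }
    }

    return page;
}

// For example, lookupPage(entryPage, "/page1/page12") selects: Subpage 1.2
</pre>
<br />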
<h2>Displaying sub pages in the application model</h2>
<br />
As explained earlier, to display any of the sub pages that the application model defines, we must invoke a view function.<br />
<br />
A reasonable strategy (that should suit most needs) is to generate an HTML page with a title tag composed of the application's and the page's titles, globally include the application and page-level stylesheets, and translate every section to a <i>div</i> that uses the section identifier as its <i>id</i>. The framework provides a view function that automatically performs this translation.<br />
<br />
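For example, for the application model shown earlier, the generated page for a sub page may roughly look as follows (a sketch only -- the exact markup and title composition produced by the view function may differ):<br />
<br />
<pre style="overflow: auto;">
<!DOCTYPE html>

<html>
    <head>
        <title>Subpage 1.2 - My application</title>
        <link rel="stylesheet" type="text/css" href="default.css">
        <link rel="shortcut icon" href="favicon.ico">
    </head>
    <body>
        <div id="header">...</div>
        <div id="menu">...</div>
        <div id="submenu">...</div>
        <div id="contents">...</div>
    </body>
</html>
</pre>
<br />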
As a sidenote: for pages that require a more complex structure (for example, to construct a layout with more advanced visualizations), it is also possible to develop a custom view function.<br />
<br />
We can create a custom style sheet: <i>default.css</i> to position the <i>divs</i> and give each section a unique color. By using such a stylesheet, the application model shown earlier may be presented as follows:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEjwJN25GzKUqFFP2XaNIOky2fRkkPOHP67vKycRy-zdRf086oZNAJCHbXkSW9xyumeSo3blAiq4hkpgWZ1EukjZh84CXHR48Zzw8teEaToGjDGSAq_LneOd7RXjtmAJ4HODuy3n2TnbSjzweXw7L5c8mWRK-RuMkUMfD8HmRLvXe1jWXv4Y-aWJCdIWAA=s1070" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="680" data-original-width="1070" src="https://blogger.googleusercontent.com/img/a/AVvXsEjwJN25GzKUqFFP2XaNIOky2fRkkPOHP67vKycRy-zdRf086oZNAJCHbXkSW9xyumeSo3blAiq4hkpgWZ1EukjZh84CXHR48Zzw8teEaToGjDGSAq_LneOd7RXjtmAJ4HODuy3n2TnbSjzweXw7L5c8mWRK-RuMkUMfD8HmRLvXe1jWXv4Y-aWJCdIWAA=s600"/></a></div>
<br />
As can be seen in the screenshot above, the <i>header</i> section has a gray color and displays a logo, the <i>menu</i> section is blue, the <i>submenu</i> is red and the <i>contents</i> section is black.<br />
<br />
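To give an impression, a fragment of such a stylesheet could look as follows (a simplified sketch: the section ids follow from the application model and the colors match the screenshot above, but the actual positioning rules, which the stylesheet also needs to provide, are omitted):<br />
<br />
<pre style="overflow: auto;">
/* Each section translates to a div whose id is the section identifier */

#header {
    background-color: gray;
}

#menu {
    background-color: blue;
}

#submenu {
    background-color: red;
}

#contents {
    background-color: black;
    color: white;
}
</pre>
<br />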
The second sub page from the first sub page was selected (as can be seen in the URL as well as the selected buttons in the menu sections). The view functions that generate the menu sections automatically mark the selected sub pages as active.<br />
<br />
With the Java and PHP versions (described in my previous blog post), it is a common practice to generate all requested pages server-side. With the JavaScript port, we can also use it on the client-side in addition to server-side.<br />
<br />
<h2>Constructing an application that generates pages server-side</h2>
<br />
For creating web applications with Node.js, it is a common practice to create an application that runs its own web server.<br />
<br />
(As a sidenote: for production environments it is typically recommended to put a more mature HTTP reverse proxy in front of the Node.js application, such as <a href="https://nginx.com">nginx</a>. A reverse proxy is often more efficient for serving static content and has more features with regards to security etc.).<br />
<br />
We can construct an application that runs a simple embedded HTTP server:<br />
<br />
<pre style="overflow: auto;">
import { application } from "./appmodel.mjs";
import { displayRequestedPage } from "js-sblayout/view/server/index.mjs";
import { createTestServer } from "js-sblayout/testhttpserver.mjs";
const server = createTestServer(function(req, res, url) {
displayRequestedPage(req, res, application, url.pathname);
});
server.listen(process.env.PORT || 8080);
</pre>
<br />
The above Node.js application (<i>app.mjs</i>) performs the following steps:<br />
<br />
<ul>
<li>It <strong>includes</strong> the <strong>application model</strong> shown in the code fragment in the previous section.</li>
<li>It constructs a simple test <strong>HTTP server</strong> that serves well-known static files by looking at common file extensions (e.g. images, stylesheets, JavaScript source files) and treats any other URL pattern as a dynamic request.</li>
<li>The embedded HTTP server listens to port 8080 unless a <i>PORT</i> environment variable with a different value was provided.</li>
<li>Dynamic URLs are handled by a callback function (last parameter). The callback invokes a <strong>view</strong> function from the framework that generates an HTML page with all properties and sections declared in the application layout model.</li>
</ul>
<br />
We can start the application as follows:<br />
<br />
<pre>
$ node app.mjs
</pre>
<br />
and then use the web browser to open the root page:<br />
<br />
<pre>
http://localhost:8080
</pre>
<br />
or any sub page of the application, such as the second sub page of the first sub page:<br />
<br />
<pre>
http://localhost:8080/page1/page12
</pre>
<br />
Although Node.js includes a library and JavaScript interface to run an embedded HTTP server, it is very low-level. Its only purpose is to map HTTP requests (e.g. <i>GET</i>, <i>POST</i>, <i>PUT</i>, <i>DELETE</i> requests) to callback functions.<br />
<br />
My framework contains an abstraction to construct a test HTTP server with a reasonable set of features for testing web applications built with the framework, including serving commonly used static files (such as images, stylesheets and JavaScript files).<br />
<br />
For production deployments, there is much more to consider, which is beyond the scope of my HTTP server abstraction.<br />
<br />
It is also possible to use the de-facto web server framework for Node.js: <a href="https://expressjs.com">express</a> in combination with the layout framework:<br />
<br />
<pre style="overflow: auto;">
import { application } from "./appmodel.mjs";
import { displayRequestedPage } from "js-sblayout/view/server/index.mjs";
import express from "express";
const app = express();
const port = process.env.PORT || 8080;
// Configure static file directories
app.use("/styles", express.static("styles"));
app.use("/image", express.static("image"));
// Make it possible to parse form data
app.use(express.json());
app.use(express.urlencoded({ extended: true }));
// Map all URLs to the SB layout manager
app.get('*', (req, res) => {
displayRequestedPage(req, res, application, req.url);
});
app.post('*', (req, res) => {
displayRequestedPage(req, res, application, req.url);
});
// Configure listening port
app.listen(port, () => {
console.log("Application listening on port " + port);
});
</pre>
<br />
The above application invokes <i>express</i> to construct an HTTP web server that listens to port 8080 by default.<br />
<br />
In addition, express has been configured to serve static files from the <i>styles</i> and <i>image</i> folders, and maps all dynamic GET and POST requests to the <i>displayRequestedPage</i> view function of the layout framework.<br />
<br />
<h2>Using the model client-side and dynamically updating the DOM</h2>
<br />
As already explained, using JavaScript as an implementation language also makes it possible to directly consume the application model in the browser and dynamically generate pages from it.<br />
<br />
To make this possible, we only have to write a very minimal static HTML page:<br />
<br />
<pre style="overflow: auto;">
<!DOCTYPE html>
<html>
<head>
<title>My page</title>
<script type="module">
import { application } from "./appmodel.mjs";
import { initRequestedPage, updateRequestedPage } from "js-sblayout/view/client/index.mjs";
document.body.onload = function() {
initRequestedPage(application);
};
document.body.onpopstate = function() {
updateRequestedPage(application);
};
</script>
</head>
<body>
</body>
</html>
</pre>
<br />
The above HTML page has the following properties:<br />
<br />
<ul>
<li>It contains the <strong>bare minimum</strong> of HTML code to construct a page that is still valid HTML5.</li>
<li>We include the <strong>application model</strong> (shown earlier) that is identical to the application model that we have been using to generate pages server-side.</li>
<li>We configure two <strong>event handlers</strong>. When the page is loaded (<i>onload</i>) we initially render all required page elements in the DOM (including the sections that translate to <i>div</i>s). Whenever the user clicks on a link (<i>onpopstate</i>), we update the affected sections in the DOM.</li>
</ul>
<br />
To make the links in the menu sections work, we have to compose them in a slightly different way -- rather than using the path to derive the selected sub page, we have to use hashes.<br />
<br />
For example, the second sub page of the first page can be reached by opening the following URL:<br />
<br />
<pre>
http://localhost/index.html#/page1/page12
</pre>
<br />
The <i>popstate</i> event <a href="https://stackoverflow.com/questions/25806608/how-to-detect-browser-back-button-event-cross-browser">triggers whenever the browser's history changes, and makes it possible for the user to use the back and forward navigation buttons.</a><br />
<br />
<h2>Generating dynamic content</h2>
<br />
In the example application model shown earlier, all sections are made out of static HTML code fragments. Sometimes it may also be desired to generate the sections' content dynamically, for example, to respond to user input.<br />
<br />
In addition to providing a string that refers to a static HTML file as a parameter, it is also possible to provide a function that generates a section's content dynamically.<br />
<br />
<pre style="font-size: 90%; overflow: auto;">
new StaticContentPage("Home", new Contents("home.html"), {
...
hello: new StaticContentPage("Hello 10 times", new Contents(displayHello10Times))
})
</pre>
<br />
In the above code fragment, we have added a new sub page to the entry page that refers to the function: <i>displayHello10Times</i> to dynamically generate its content. The purpose of this function is to display the string: "Hello" 10 times:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEg-UegerlOVi63kfqqW158mx2l6PUScJGZwLLoj8Yy1LdezpKs6BSx0RXkB_bo-OpERPNfBAY3H_Zj4-NtZJRSAVNRcV8DcFe5aLb0ee2-AepS7TMQghfYtioM8ZFSnymRZWPJZ7qV6nSYZk426595IElDzj0HGnJawRh09a6uVGBS4ZYaCJGu_L4Eg3Q=s1070" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="680" data-original-width="1070" src="https://blogger.googleusercontent.com/img/a/AVvXsEg-UegerlOVi63kfqqW158mx2l6PUScJGZwLLoj8Yy1LdezpKs6BSx0RXkB_bo-OpERPNfBAY3H_Zj4-NtZJRSAVNRcV8DcFe5aLb0ee2-AepS7TMQghfYtioM8ZFSnymRZWPJZ7qV6nSYZk426595IElDzj0HGnJawRh09a6uVGBS4ZYaCJGu_L4Eg3Q=s600"/></a></div>
<br />
When writing an application that generates pages server-side, we could implement this function as follows:<br />
<br />
<pre>
function displayHello10Times(req, res) {
for(let i = 0; i < 10; i++) {
res.write("<p>Hello!</p>\n");
}
}
</pre>
<br />
The above function follows a convention that is commonly used by applications that use Node.js' internal HTTP server:<br />
<br />
<ul>
<li>The <i>req</i> parameter refers to the Node.js internal HTTP server's <i>http.IncomingMessage</i> object and can be used to retrieve HTTP headers and other request parameters.</li>
<li>The <i>req.sbLayout</i> property provides parameters that are related to the layout framework.</li>
<li>The <i>res</i> parameter refers to the Node.js internal HTTP server's <i>http.ServerResponse</i> object and can be used to generate a response message.</li>
</ul>
<br />
It is also allowed to declare the function above <i>async</i> or let it return a <i>Promise</i> so that asynchronous APIs can be used.<br />
<br />
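For example, a hypothetical <i>async</i> variant of <i>displayHello10Times</i> could look as follows (the <i>setTimeout</i>-based delay is just an illustrative stand-in for a real asynchronous API, such as a database query):<br />
<br />
<pre>
async function displayHello10Times(req, res) {
    for(let i = 0; i < 10; i++) {
        // Simulate an asynchronous operation, e.g. a database query
        await new Promise(resolve => setTimeout(resolve, 10));
        res.write("<p>Hello!</p>\n");
    }
}
</pre>
<br />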
When developing a client-side application (that dynamically updates the browser DOM), this function should have a different signature:<br />
<br />
<pre>
function displayHello10Times(div, params) {
let response = "";
for(let i = 0; i < 10; i++) {
response += "<p>Hello!</p>\n";
}
div.innerHTML = response;
}
</pre>
<br />
In the browser, a dynamic content generation function accepts two parameters:<br />
<br />
<ul>
<li><i>div</i> refers to an <i>HTMLDivElement</i> in the DOM that contains the content of the section.</li>
<li><i>params</i> provides layout framework specific properties (identical to <i>req.sbLayout</i> in the server-side example).</li>
</ul>
<br />
<h2>Using a templating engine</h2>
<br />
Writing functions that compose HTML code in strings may not always be the most intuitive way to generate dynamic content. It is also possible to configure <strong>template handlers</strong>: the framework can invoke a template handler function for files with a certain extension.<br />
<br />
In the following server-side example, we define a template handler for files with an <i>.ejs</i> extension to use the <a href="https://ejs.co">EJS</a> templating engine:<br />
<br />
<pre style="overflow: auto;">
import { application } from "./appmodel.mjs";
import { displayRequestedPage } from "js-sblayout/view/server/index.mjs";
import { createTestServer } from "js-sblayout/testhttpserver.mjs";
import * as ejs from "ejs";
function renderEJSTemplate(req, res, sectionFile) {
return new Promise((resolve, reject) => {
ejs.renderFile(sectionFile, { req: req, res: res }, {}, function(err, str) {
if(err) {
reject(err);
} else {
res.write(str);
resolve();
}
});
});
}
const server = createTestServer(function(req, res, url) {
displayRequestedPage(req, res, application, url.pathname, {
ejs: renderEJSTemplate
});
});
server.listen(process.env.PORT || 8080);
</pre>
<br />
In the above code fragment, the <i>renderEJSTemplate</i> function opens an <i>.ejs</i> template file and uses the <i>ejs.renderFile</i> function to render it. The resulting string is propagated as a response to the user.<br />
<br />
To use the template handlers, we invoke <i>displayRequestedPage</i> with an additional parameter that maps the <i>ejs</i> file extension to the template handler function.<br />
<br />
In a client-side/browser application, we can define a template handler as follows:<br />
<br />
<pre style="overflow: auto;">
<!DOCTYPE html>
<html>
<head>
<title>My page</title>
<script type="text/javascript" src="ejs.js"></script>
<script type="module">
import { application } from "./appmodel.mjs";
import { initRequestedPage, updateRequestedPage } from "js-sblayout/view/client/index.mjs";
const templateHandlers = {
ejs: function(div, response) {
return ejs.render(response, {});
}
}
document.body.onload = function() {
initRequestedPage(application, templateHandlers);
};
document.body.onpopstate = function() {
updateRequestedPage(application, templateHandlers);
};
</script>
</head>
<body>
</body>
</html>
</pre>
<br />
In the above code fragment, we define a <i>templateHandlers</i> object that gets propagated to the view function that initially renders the page (<i>initRequestedPage</i>) and dynamically updates the page (<i>updateRequestedPage</i>).<br />
<br />
By adding the following sub page to the entry page, we can use an <i>ejs</i> template file to dynamically generate a page rather than a static HTML file or function:<br />
<br />
<pre>
new StaticContentPage("Home", new Contents("home.html"), {
...
stats: new StaticContentPage("Stats", new Contents("stats.ejs"))
})
</pre>
<br />
In a server-side application, we can use <i>stats.ejs</i> to display request variables:<br />
<br />
<pre>
<h2>Request parameters</h2>
<table>
<tr>
<th>HTTP version</th>
<td><%= req.httpVersion %></td>
</tr>
<tr>
<th>Method</th>
<td><%= req.method %></td>
</tr>
<tr>
<th>URL</th>
<td><%= req.url %></td>
</tr>
</table>
</pre>
<br />
resulting in a page that may have the following look:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEgSmQBQLxPf3QcNm0JyPqnDxnbEcNom0fS_4cCDjj_P-3mDibF4qQ-d3vQz7kJ-XlQjetDFF4jabz4tTSsHp6SjkOG9qAcy7LicVjgVd5EC8UwlJljF6uDduyhiOIb9MM5preF0xNfJqZGzJudt_7wXsBSTt7C97s3XloVHJ1GWSjoDszKYoQR5RbcViA=s1070" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="680" data-original-width="1070" src="https://blogger.googleusercontent.com/img/a/AVvXsEgSmQBQLxPf3QcNm0JyPqnDxnbEcNom0fS_4cCDjj_P-3mDibF4qQ-d3vQz7kJ-XlQjetDFF4jabz4tTSsHp6SjkOG9qAcy7LicVjgVd5EC8UwlJljF6uDduyhiOIb9MM5preF0xNfJqZGzJudt_7wXsBSTt7C97s3XloVHJ1GWSjoDszKYoQR5RbcViA=s600"/></a></div>
<br />
In a client-side application, we can use <i>stats.ejs</i> to display browser variables:<br />
<br />
<pre>
<h2>Some parameters</h2>
<table>
<tr>
<th>Location URL</th>
<td><%= window.location.href %></td>
</tr>
<tr>
<th>Browser languages</th>
<td>
<%
navigator.languages.forEach(language => {
%>
<%= language %><br>
<%
});
%>
</td>
</tr>
<tr>
<th>Browser code name</th>
<td><%= navigator.appCodeName %></td>
</tr>
</table>
</pre>
<br />
displaying the following page:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEhzJEa0vJuiIWyhVB58PkGFIfI5xjU9kizQF-97vDNLUAJz9m4eQB1_727o0uwW7Sms2lyTXqslknTPcpP8xY6kWVXZeO7YORQKQhB9Y1WQfs3LQ0K32ZaWVYKjNOf5s7uIEHAGRlZLNbKDYZPP6D61LOIbk28n7shYbcezWy-InetfCeg_sgZ-u53LdQ=s1070" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="680" data-original-width="1070" src="https://blogger.googleusercontent.com/img/a/AVvXsEhzJEa0vJuiIWyhVB58PkGFIfI5xjU9kizQF-97vDNLUAJz9m4eQB1_727o0uwW7Sms2lyTXqslknTPcpP8xY6kWVXZeO7YORQKQhB9Y1WQfs3LQ0K32ZaWVYKjNOf5s7uIEHAGRlZLNbKDYZPP6D61LOIbk28n7shYbcezWy-InetfCeg_sgZ-u53LdQ=s600"/></a></div>
<br />
<h2>Strict section and page key ordering</h2>
<br />
In all the examples shown previously, we have used an <i>Object</i> to define sections and sub pages. In JavaScript, the order of an object's keys does not always correspond to the insertion order -- integer-like keys are enumerated first (in ascending order), before keys that are arbitrary strings.<br />
<br />
As a consequence, the order of the pages and sections may not be the same as the order in which the keys are declared.<br />
<br />
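The following sketch (plain JavaScript, not specific to the framework) demonstrates this behavior:<br />
<br />
<pre>
const pages = { page1: "a", 404: "b", home: "c" };

// Integer-like keys are enumerated first, in ascending order
console.log(Object.keys(pages)); // [ '404', 'page1', 'home' ]
</pre>
<br />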
When the object key ordering is a problem, it is also possible to use iterable objects, such as a nested array, to ensure strict key ordering:<br />
<br />
<pre style="font-size: 80%; overflow: auto;">
import { Application } from "js-sblayout/model/Application.mjs";
import { StaticSection } from "js-sblayout/model/section/StaticSection.mjs";
import { MenuSection } from "js-sblayout/model/section/MenuSection.mjs";
import { ContentsSection } from "js-sblayout/model/section/ContentsSection.mjs";
import { StaticContentPage } from "js-sblayout/model/page/StaticContentPage.mjs";
import { HiddenStaticContentPage } from "js-sblayout/model/page/HiddenStaticContentPage.mjs";
import { PageAlias } from "js-sblayout/model/page/PageAlias.mjs";
import { Contents } from "js-sblayout/model/page/content/Contents.mjs";
/* Create an application model */
export const application = new Application(
/* Title */
"My application",
/* Styles */
[ "default.css" ],
/* Sections */
[
[ "header", new StaticSection("header.html") ],
[ "menu", new MenuSection(0) ],
[ "submenu", new MenuSection(1) ],
[ "contents", new ContentsSection(true) ],
[ 1, new StaticSection("footer.html") ]
],
/* Pages */
new StaticContentPage("Home", new Contents("home.html"), [
[ 404, new HiddenStaticContentPage("Page not found", new Contents("error/404.html")) ],
[ "home", new PageAlias("Home", "") ],
[ "page1", new StaticContentPage("Page 1", new Contents("page1.html"), [
[ "page11", new StaticContentPage("Subpage 1.1", new Contents("page1/subpage11.html")) ],
[ "page12", new StaticContentPage("Subpage 1.2", new Contents("page1/subpage12.html")) ],
[ "page13", new StaticContentPage("Subpage 1.3", new Contents("page1/subpage13.html")) ]
])],
[ "page2", new StaticContentPage("Page 2", new Contents("page2.html"), [
[ "page21", new StaticContentPage("Subpage 2.1", new Contents("page2/subpage21.html")) ],
[ "page22", new StaticContentPage("Subpage 2.2", new Contents("page2/subpage22.html")) ],
[ "page23", new StaticContentPage("Subpage 2.3", new Contents("page2/subpage23.html")) ]
])],
[ 0, new StaticContentPage("Last page", new Contents("lastpage.html")) ]
]),
/* Favorite icon */
"favicon.ico"
);
</pre>
<br />
In the above example, we have rewritten the application model example to use strict key ordering. We have added a section with numeric key: <i>1</i> and a sub page with key: <i>0</i>. Because we have defined a nested array (instead of an object), this section and page come last (had we used an object, they would have appeared first, which is undesired).<br />
<br />
Internally, the <i>Application</i> and <i>Page</i> objects use a <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Map">Map</a> to ensure strict ordering.<br />
<br />
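Contrary to an object, a <i>Map</i> retains the insertion order of all keys, including numeric ones, as the following small sketch illustrates:<br />
<br />
<pre>
const sections = new Map([
    [ "header", "..." ],
    [ 1, "..." ]
]);

// The insertion order is retained, even for numeric keys
console.log([ ...sections.keys() ]); // [ 'header', 1 ]
</pre>
<br />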
<h2>More features</h2>
<br />
The framework has full feature parity with the PHP and Java implementations of the layout framework. In addition to the features described in the previous sections, it can also do the following:<br />
<br />
<ul>
<li><strong>Work with multiple content sections</strong>. In our examples, we only have one content section that changes when picking a menu item, but it is also possible to have multiple content sections.</li>
<li><strong>Page specific stylesheets and JavaScript includes</strong>. Besides including CSS stylesheets and JavaScript files globally, they can also be included on page level.</li>
<li><strong>Using path components as parameters</strong>. Instead of selecting a sub page, it is also possible to treat a path component as a parameter and dynamically generate a response.</li>
<li><strong>Internationalized pages</strong>. Each sub page uses an ISO localization code and the framework picks the most suitable language in which the page should be displayed by default.</li>
<li><strong>Security handlers</strong>. Every page can implement its own method that checks whether it should be accessible or not, according to a custom security policy.</li>
<li><strong>Controllers</strong>. It is also possible to process GET or POST parameters before the page gets rendered, for example, to validate them.</li>
</ul>
<br />
<h2>Conclusion</h2>
<br />
In this blog post, I have described the features of the JavaScript port of my layout framework. In addition to rendering pages server-side, it can also be directly used in the web browser to dynamically update the DOM. For the latter aspect, it is not required to run any server-side scripting language, making application deployments considerably easier.<br />
<br />
One of the things I liked about this experiment is that the layout model is sufficiently high-level so that it can be used in a variety of application domains. To make client-side rendering possible, I only had to develop another view function. The implementation of the model aspect is exactly the same for server-side and client-side rendering.<br />
<br />
Moreover, the newer features of the JavaScript language (most notably ECMAScript modules) make it much easier to reuse code between Node.js and web browsers. Before ECMAScript modules were adopted by browser vendors, there was no module system in the browser at all (Node.js has CommonJS), forcing me to implement all kinds of tricks to make a reusable implementation between Node.js and browsers possible.<br />
<br />
As explained in the introduction of this blog post, web front-end technologies do not have a separated layout concern. A possible solution to cope with this limitation is to generate pages server-side. With the JavaScript implementation this is no longer required, because it can also be directly done in the browser.<br />
<br />
However, this still does not fully solve my layout frustrations. For example, dynamically generated pages are poorly visible to search engines. Moreover, a dynamically rendered web application is useless to users that have JavaScript disabled, or use a web browser that does not support JavaScript, such as text browsers.<br />
<br />
Using JavaScript also breaks the declarative nature of web applications -- HTML and CSS allow you to declare what the structure and style of a page are, without specifying how to render them. This has all kinds of advantages, such as the ability to degrade gracefully when certain features, such as graphics, cannot be used. With JavaScript, some of these properties are lost.<br />
<br />
Still, this project was a nice distraction -- I already had the idea to explore this for several years. During the COVID-19 pandemic, I have read quite a few technical books, such as <a href="https://www.oreilly.com/library/view/javascript-the-definitive/9781491952016">JavaScript: The Definitive Guide</a>, and learned that with the introduction of new JavaScript language features, such as ECMAScript modules, it would be possible to use exactly the same implementation of the model both server-side and client-side.<br />
<br />
As explained in <a href="https://sandervanderburg.blogspot.com/2021/12/11th-annual-blog-reflection.html">my blog reflection over 2021</a>, I have been overly focused on a single goal for almost two years and it started to negatively affect my energy level. This project was a nice short distraction.<br />
<br />
<h2>Future work</h2>
<br />
I have also been investigating whether I could use my framework to create offline web applications with a consistent layout. Unfortunately, that does not seem to be very straightforward.<br />
<br />
It seems that browsers do not allow module imports from local files for security reasons. In theory, this restriction can be bypassed by packing up all the modules into a single JavaScript include with <a href="https://webpack.js.org">webpack</a>.<br />
<br />
However, it turns out that there is another problem -- it is also not possible to open any files from the local drive for security reasons. There is a <a href="https://wicg.github.io/file-system-access/">file system access API</a> in development, but it is not finished or mature yet.<br />
<br />
Some day, when these APIs have become more mature, I may revisit this problem and revise my framework to also make offline web applications possible.<br />
<br />
<h2>Availability</h2>
<br />
The JavaScript port of my layout framework can be obtained from <a href="https://github.com/svanderburg/js-sblayout">my GitHub page</a>. To use this framework client-side, a modern web browser is required, such as Mozilla Firefox or Google Chrome.<br />
<br />
Sander van der Burghttp://www.blogger.com/profile/12718166966821611609noreply@blogger.com0tag:blogger.com,1999:blog-1397115249631682228.post-86371707677843415812022-01-11T19:14:00.000+01:002022-01-11T19:14:15.499+01:00Structured asynchronous programming revisited (Asynchronous programming with JavaScript part 5)It has been a while since I wrote a JavaScript related blog post. In my previous job, I was using it on a daily basis, but in the last few years I have been using it much less frequently.<br />
<br />
One of the reasons that I wrote so many JavaScript-related blog posts is because the language used to have many catches, such as:<br />
<br />
<ul>
<li><strong>Scoping</strong>. Contrary to many other mainstream programming languages, JavaScript uses function-level scoping as opposed to block-level scoping. Syntactically, function-level scoping looks very similar to block-level scoping.<br />
<br />
Function-level scoping has a number of implications that could have severe consequences. For example, you may unintentionally re-assign values.</li>
<li><a href="https://sandervanderburg.blogspot.com/2013/02/yet-another-blog-post-about-object.html"><strong>Simulating class-based inheritance</strong></a>. JavaScript supports Object Oriented programming with prototypes rather than classes (that most mainstream Object Oriented programming languages use). It is possible to use prototypes to simulate classes and class-based inheritance.<br />
<br />
Although I consider prototypes to be conceptually simple, using them in JavaScript used to be quite confusing. As a consequence, simulating class inheritance also used to be quite difficult.</li>
<li><strong>Asynchronous programming</strong>. JavaScript is originally designed for use in web browsers, but has also become quite popular outside the browser, such as <a href="http://nodejs.org">Node.js</a>, to write server/command-line applications. For both browser usage as well as server applications, it is often desired to do multiple tasks concurrently.<br />
<br />
Unfortunately, most of JavaScript's language constructs are synchronous, making such tasks quite difficult without using any software abstractions.</li>
</ul>
<br />
In particular about the last topic: asynchronous programming, I wrote many blog posts. I have elaborated about <a href="https://sandervanderburg.blogspot.com/2013/07/asynchronous-programming-with-javascript.html">callbacks in Node.js and abstraction libraries to do coordination</a> and another popular abstraction: <a href="https://sandervanderburg.blogspot.com/2013/12/asynchronous-programming-with.html">promises</a>.<br />
<br />
I have also argued that most of JavaScript's language constructs, that implement structured programming concepts, are synchronous and cannot be used with asynchronous functions that return (almost) immediately, and resume their executions with callbacks.<br />
<br />
I have built my own library: <a href="https://github.com/svanderburg/slasp">slasp</a>, that implements <a href="https://sandervanderburg.blogspot.com/2014/03/structured-asynchronous-programming.html">asynchronous equivalents for the synchronous language constructs that you should not use</a>.<br />
<br />
Fortunately, much has happened since I wrote that blog post. The JavaScript language has many new features (part of the <a href="http://es6-features.org">ECMAScript 6 standard</a>) that have become a standard practice nowadays, such as:<br />
<br />
<ul>
<li><strong>Block level scoping</strong>. Block scoped immutable values can be declared with: <i>const</i> and mutable values with: <i>let</i>.</li>
<li>An object with a custom <strong>prototype</strong> can now be directly created with: <i>Object.create</i>.</li>
<li>A <strong><i>class</i> construct</strong> was added that makes it possible to define classes (that are simulated with prototypes).</li>
</ul>
<br />
Moreover, modern JavaScript also has new constructs and APIs for asynchronous programming, making most of the software abstractions that I have elaborated about obsolete for the most part.<br />
<br />
Recently, I have been using these new constructs quite intensively and learned that my previous blog posts (that I wrote several years ago) are still mostly about old practices.<br />
<br />
In this blog post, I will revisit the structured programming topic and explain how modern language constructs can be used to implement these concepts.<br />
<br />
<h2>Asynchronous programming in JavaScript</h2>
<br />
As explained in my previous blog posts, asynchronous programming in JavaScript is important for a variety of reasons. Some examples are:<br />
<br />
<ul>
<li>Animating objects and providing other visual effects in a web browser, while keeping the browser responsive so that it can respond to user events, such as mouse clicks.</li>
<li>The ability to serve multiple clients concurrently in a Node.js server application.</li>
</ul>
<br />
Multi-tasking in JavaScript is (mainly) cooperative. The idea is that JavaScript code runs in a hidden main loop that responds to events in a timely manner, such as user input (e.g. mouse clicks) or incoming connections.<br />
<br />
To keep your application responsive, it is required that the execution of a code block does <strong>not take long</strong> (to allow the application to respond to other events), and that an <strong>event</strong> is generated to allow the execution to be resumed at a later point in time.<br />
<br />
Not meeting this requirement may cause the web browser or your server application to block, which is often undesirable.<br />
<br />
Because writing non-blocking code is so important, many functions in the Node.js API are <strong>asynchronous</strong> by default: they return (almost) immediately and use a callback function parameter that gets invoked when the work is done.<br />
<br />
For example, reading a file from disk while keeping the application responsive can be done as follows:<br />
<br />
<pre>
const fs = require('fs');
fs.readFile("hello.txt", function(err, contents) {
if(err) {
console.error(err);
process.exit(1);
} else {
console.log(contents);
}
});
</pre>
<br />
Note that in the above code fragment, instead of relying on the return value of the <i>fs.readFile</i> call, we provide a callback function as a parameter that gets invoked when the operation has finished. The callback is responsible for displaying the file's contents or the resulting error message.<br />
<br />
While the file is being read (that happens in the background), the event loop is still able to process other events making it possible for the application to work on other tasks concurrently.<br />
<br />
To ensure that an application is responsive and scalable, I/O related functionality in the Node.js API is asynchronous by default. For some functions there are also synchronous equivalents for convenience, but as a rule of thumb they should be avoided as much as possible.<br />
<br />
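For example, the synchronous counterpart of the previous example (shown here only to illustrate what should be avoided) blocks the event loop until the entire file has been read:<br />
<br />
<pre>
const fs = require('fs');

// Blocking: no other events can be processed while the file is being read
const contents = fs.readFileSync("hello.txt", "utf8");
console.log(contents);
</pre>
<br />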
Asynchronous I/O is an important ingredient in making Node.js applications scalable -- because I/O operations are typically several orders of magnitude slower than CPU operations, the application should remain responsive as long as no callback takes long to complete. Furthermore, because there is no thread-per-connection model, there is no context-switching or memory overhead for each concurrent task.<br />
<br />
However, asynchronous I/O operations and a callback convention do not guarantee that the main loop never gets blocked.<br />
<br />
When implementing tasks that are heavily CPU-bound (such as <a href="https://www.semitwist.com/mirror/node-js-is-cancer.html">recursively computing a Fibonacci number</a>), the programmer has to make sure that the execution does not block the main loop too long (for example, by dividing it into smaller tasks that generate events, or using threads).<br />
<br />
<h2>Code structuring issues</h2>
<br />
Another challenge that comes with asynchronous functions is that it becomes much harder to keep your code structured and maintainable.<br />
<br />
For example, if we want to create a directory, then write a text file to it, and then read the text file, and only use non-blocking functions to keep the application responsive, we may end up writing:<br />
<br />
<pre>
const fs = require('fs');
fs.mkdir("test", function(err) {
if(err) {
console.error(err);
process.exit(1);
} else {
fs.writeFile("hello.txt", "Hello world!", function(err) {
if(err) {
console.error(err);
process.exit(1);
} else {
fs.readFile("hello.txt", function(err, contents) {
if(err) {
console.error(err);
process.exit(1);
} else {
// Displays: "Hello world!"
console.log(contents);
}
});
}
});
}
});
</pre>
<br />
As may be observed, for each function call, we define a callback responsible for checking the status of the call and executing the next step. For each step, we have to nest another callback function, resulting in pyramid code.<br />
<br />
The code above is difficult to read and maintain. If we want to add another step in the middle, we are forced to refactor the callback nesting structure, which is laborious and tedious.<br />
<br />
Because code structuring issues are so common, all kinds of software abstractions have been developed to coordinate the execution of tasks. For example, we can use the <a href="https://github.com/caolan/async">async library</a> to rewrite the above code fragment as follows:<br />
<br />
<pre>
const async = require('async');
async.waterfall([
function(callback) {
fs.mkdir("test", callback);
},
function(callback) {
fs.writeFile("hello.txt", "Hello world!", callback);
},
function(callback) {
fs.readFile("hello.txt", callback);
},
function(contents, callback) {
// Displays: "Hello world!"
console.log(contents);
callback();
}
], function(err) {
if(err) {
console.error(err);
process.exit(1);
}
});
</pre>
<br />
The <i>async.waterfall</i> abstraction flattens the code, allows us to conveniently add additional asynchronous steps and change the order, if desired.<br />
<br />
<h2>Promises</h2>
<br />
In addition to Node.js-style callback functions and abstraction libraries for coordination, a more powerful software abstraction was developed: <strong>promises</strong> (to be precise: there are several kinds of promise abstractions developed, but I am referring to the <a href="https://promisesaplus.com">Promises/A+ specification</a>).<br />
<br />
Promises have become very popular, in particular for APIs that are used in the browser. As a result, they have been accepted into the core JavaScript API.<br />
<br />
With promises the idea is that every asynchronous function quickly returns a promise object that can be used as a reference to a value that will be delivered at some point in the future.<br />
<br />
For example, we can wrap the function invocation that reads a text file into a function that returns a promise:<br />
<br />
<pre>
const fs = require('fs');
function readHelloFile() {
return new Promise((resolve, reject) => {
fs.readFile("hello.txt", function(err, contents) {
if(err) {
reject(err);
} else {
resolve(contents);
}
});
});
}
</pre>
<br />
The above function: <i>readHelloFile</i> invokes <i>fs.readFile</i> from the Node.js API to read the <i>hello.txt</i> file and returns a promise. In case the file was successfully read, the promise is <strong>resolved</strong> and the file's contents are propagated as a result. In case of an error, the promise is <strong>rejected</strong> with the resulting error message.<br />
<br />
To retrieve and display the result, we can invoke the above function as follows:<br />
<br />
<pre>
readHelloFile().then(function(contents) {
console.log(contents);
}, function(err) {
console.error(err);
process.exit(1);
});
</pre>
<br />
Invoking the <i>then</i> method causes the main event loop to invoke either the resolve (first parameter) or reject callback function (second parameter) when the result is available.<br />
<br />
Because promises have become part of the ECMAScript standard, Node.js has introduced alternative APIs that are promise-based, instead of callback based (such as for filesystem operations: <i>fs.promises</i>).<br />
<br />
By using the promises-based API for filesystem operations, we can simplify the previous example to:<br />
<br />
<pre>
const fs = require('fs').promises;
fs.readFile("hello.txt").then(function(contents) {
console.log(contents);
}, function(err) {
console.error(err);
process.exit(1);
});
</pre>
<br />
As described in my old blog post about promises -- they are considered more powerful than callbacks. A promise provides you a reference to a value that is in the process of being delivered. Callbacks can only give you insights in the status of a background task as soon as it completes.<br />
<br />
Although promises have an advantage over callbacks, both approaches still share the same drawback -- we are forced to avoid most JavaScript language constructs and use alternative function abstractions.<br />
<br />
In modern JavaScript, it is also no longer necessary to always explicitly create promises. Instead, we can also declare a function as <i>async</i>. Simply returning a value or throwing an exception in such a function automatically ensures that a promise is returned:<br />
<br />
<pre>
async function computeSum(a, b) {
return a + b;
}
</pre>
<br />
The function above returns the sum of the provided input parameters. The JavaScript runtime automatically wraps its execution into a promise that can be used to retrieve the return value at some point in the future.<br />
<br />
(As a sidenote: the function above returns a promise, but is still blocking. It does not generate an event that can be picked up by the event loop and a callback that can resume its execution at a later point in time.)<br />
<br />
The result of executing the following code:<br />
<br />
<pre>
const result = computeSum(1, 1);
console.log("The result is: " + result);
</pre>
<br />
is a promise object, not a numeric value:<br />
<br />
<pre>
The result is: [object Promise]
</pre>
<br />
When a function returns a promise, we also no longer have to invoke <i>.then()</i> and provide callbacks as parameters to retrieve the result or any thrown errors. The <i>await</i> keyword can be used to automatically wait for a promise to yield its return value or an exception, and then move to the next statement:<br />
<br />
<pre>
(async function() {
const result = await computeSum(1, 1);
console.log("The result is: " + result); // The result is: 2
})();
</pre>
<br />
The only catch is that you can only use <i>await</i> in the scope of an asynchronous function. The default scope of a Node.js program is synchronous. As a result, we have to wrap the code into an asynchronous function block.<br />
<br />
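As a sidenote: in an ECMAScript module (e.g. a <i>.mjs</i> file), modern Node.js versions also permit top-level <i>await</i>, making such a wrapper function unnecessary:<br />
<br />
<pre>
// main.mjs -- top-level await is allowed in ECMAScript modules,
// assuming computeSum from the earlier example is in scope
const result = await computeSum(1, 1);
console.log("The result is: " + result); // The result is: 2
</pre>
<br />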
By using the promise-based <i>fs</i> API and the new language features, we can rewrite our earlier callback-based example (that creates a directory, writes and reads a file) as follows:<br />
<br />
<pre>
const fs = require('fs').promises;
(async function() {
try {
await fs.mkdir("test");
await fs.writeFile("test/hello.txt", "Hello world!");
const contents = await fs.readFile("test/hello.txt", "utf8");
console.log(contents); // Displays: "Hello world!"
} catch(err) {
console.error(err);
process.exit(1);
}
})();
</pre>
<br />
As may be observed, the code is much simpler than manually orchestrating promises.<br />
<br />
<h2>Structured programming concepts</h2>
<br />
In my previous blog post, I have argued that most of JavaScript's language constructs (that implement structured programming concepts) cannot be directly used in combination with non-blocking functions (that return almost immediately and require callback functions as parameters).<br />
<br />
As a personal exercise, I have created function abstractions that are direct asynchronous equivalents for all these structured programming concepts that should be avoided and added them to my own library: <i>slasp</i>.<br />
<br />
By combining promises, <i>async</i> functions and <i>await</i> statements, these function abstractions have mostly become obsolete.<br />
<br />
In this section, I will go over the structured programming concepts I covered in my old blog post and show their direct asynchronous programming equivalents using modern JavaScript language constructs.<br />
<br />
<h3>Function definitions</h3>
<br />
As I have already explained in my previous blog post, the most basic thing one can do in JavaScript is executing statements, such as variable assignments or function invocations. Even this already becomes quite different when moving from a synchronous to an asynchronous programming world.<br />
<br />
As a trivial example, I used a synchronous function whose only purpose is to print text on the console:<br />
<br />
<pre>
function printOnConsole(value) {
console.log(value);
}
</pre>
<br />
The above example is probably too trivial, but it is still possible to make it non-blocking -- we can generate a tick event so that the function returns immediately and use a callback parameter so that the task will be resumed at a later point in time:<br />
<br />
<pre>
function printOnConsole(value) {
return new Promise((resolve, reject) => {
process.nextTick(function() {
console.log(value);
resolve();
});
});
}
</pre>
<br />
To follow modern JavaScript practices, the above function's body is wrapped into a <i>Promise</i> constructor, so that the function immediately returns a promise that can be used as a reference to determine when the task has completed.<br />
<br />
(As a sidenote: we compose a regular function that returns a promise. We cannot define an <i>async</i> function, because <i>process.nextTick</i> is an asynchronous function that requires a callback function parameter. The callback is responsible for propagating the end result. Using a <i>return</i> only causes the callback function to return, and not the enclosing function.)<br />
<br />
I have also shown that for functions that return a value, the same principle can be applied. As an example, I have used a function that translates a numeric digit into a word:<br />
<br />
<pre>
function generateWord(digit) {
const words = [ "zero", "one", "two", "three", "four",
"five", "six", "seven", "eight", "nine" ];
return words[digit];
}
</pre>
<br />
We can also make this function non-blocking by generating a tick event and wrapping it into a promise:<br />
<br />
<pre style="font-size: 90%;">
function generateWord(digit) {
return new Promise((resolve, reject) => {
process.nextTick(function() {
const words = [ "zero", "one", "two", "three", "four", "five",
"six", "seven", "eight", "nine" ];
resolve(words[digit]);
});
});
}
</pre>
<br />
<h3>Sequential decomposition</h3>
<br />
The first structured programming concept I elaborated about was <strong>sequential decomposition</strong> in which a number of statements are executed in sequential order.<br />
<br />
I have shown a trivial example that adds 1 to a number, then converts the resulting digit into a word, and finally prints the word on the console:<br />
<br />
<pre>
const a = 1;
const b = a + 1;
const number = generateWord(b);
printOnConsole(number); // two
</pre>
<br />
With the introduction of the <i>await</i> keyword, converting the above code to use the asynchronous implementations of all required functions has become straightforward:<br />
<br />
<pre>
(async function() {
const a = 1;
const b = a + 1;
const number = await generateWord(b);
await printOnConsole(number); // two
})();
</pre>
<br />
The above example is a one-to-one port of its synchronous counterpart -- we just have to use the <i>await</i> keyword in combination with our asynchronous function invocations (that return promises).<br />
<br />
The only unconventional aspect is that we need to wrap the code inside an asynchronous function block to allow the <i>await</i> keyword to be used.<br />
<br />
<h3>Alteration</h3>
<br />
The second programming concept that I covered is <strong>alteration</strong> that is used to specify conditional statements.<br />
<br />
I gave a simple example that checks whether a given name matches my own name:<br />
<br />
<pre>
function checkMe(name) {
return (name == "Sander");
}
const name = "Sander";
if(checkMe(name)) {
printOnConsole("It's me!");
printOnConsole("Isn't it awesome?");
} else {
printOnConsole("It's someone else!");
}
</pre>
<br />
It is also possible to make the <i>checkMe</i> function non-blocking by generating a tick event and wrapping it into a promise:<br />
<br />
<pre>
function checkMe(name) {
return new Promise((resolve, reject) => {
process.nextTick(function() {
resolve(name == "Sander");
});
});
}
</pre>
<br />
To invoke the asynchronous function shown above inside the if-statement, we only have to write:<br />
<br />
<pre>
(async function() {
const name = "Sander";
if(await checkMe(name)) {
await printOnConsole("It's me!");
await printOnConsole("Isn't it awesome?");
} else {
await printOnConsole("It's someone else!");
}
})();
</pre>
<br />
In my previous blog post, I was forced to abolish the regular if-statement and use an abstraction (<i>slasp.when</i>) that invokes the non-blocking function first, then uses the callback to retrieve the result for use inside an if-statement. In the above example, the only subtle change I need to make is to use <i>await</i> inside the if-statement.<br />
<br />
I can also do the same thing for the other alteration construct: the <i>switch</i> -- just using <i>await</i> in the conditional expression and the body should suffice.<br />
<br />
<h3>Repetition</h3>
<br />
For the <strong>repetition</strong> concept, I have shown an example program that implements the <a href="http://en.wikipedia.org/wiki/Leibniz_formula_for_%CF%80">Gregory-Leibniz formula</a> to approximate PI up to 6 digits:<br />
<br />
<pre>
function checkTreshold(approx) {
return (approx.toString().substring(0, 7) != "3.14159");
}
let approx = 0;
let denominator = 1;
let sign = 1;
while(checkTreshold(approx)) {
approx += 4 * sign / denominator;
printOnConsole("Current approximation is: "+approx);
denominator += 2;
sign *= -1;
}
</pre>
<br />
As with the previous example, we can also make the <i>checkTreshold</i> function non-blocking:<br />
<br />
<pre>
function checkTreshold(approx) {
return new Promise((resolve, reject) => {
process.nextTick(function() {
resolve(approx.toString().substring(0, 7) != "3.14159");
});
});
}
</pre>
<br />
In my previous blog post, I have explained that the <i>while</i> statement is unfit for executing non-blocking functions in sequential order, because they return immediately and have to resume their execution at a later point in time.<br />
<br />
As with the alteration language constructs, I have developed a function abstraction that is equivalent to the <i>while</i> statement (<i>slasp.whilst</i>), making it possible to have a non-blocking conditional check and body.<br />
<br />
With the introduction of the <i>await</i> statement, this abstraction function also has become obsolete. We can rewrite the code as follows:<br />
<br />
<pre>
(async function() {
let approx = 0;
let denominator = 1;
let sign = 1;
while(await checkTreshold(approx)) {
approx += 4 * sign / denominator;
await printOnConsole("Current approximation is: "+approx);
denominator += 2;
sign *= -1;
}
})();
</pre>
<br />
As can be seen, the above code is a one-to-one port of its synchronous counterpart.<br />
<br />
The function abstractions for the other repetition concepts: <i>doWhile</i>, <i>for</i>, <i>for-in</i> have also become obsolete by using <i>await</i> for evaluating the non-blocking conditional expressions and bodies.<br />
<br />
Implementing non-blocking recursive algorithms still remains tricky, such as the following (somewhat inefficient) recursive algorithm to compute a Fibonacci number:<br />
<br />
<pre>
function fibonacci(n) {
if (n < 2) {
return 1;
} else {
return fibonacci(n - 2) + fibonacci(n - 1);
}
}
const result = fibonacci(20);
printOnConsole("20th element in the fibonacci series is: "+result);
</pre>
<br />
The above algorithm is mostly CPU-bound and takes some time to complete. As long as it is computing, the event loop remains blocked, causing the entire application to become unresponsive.<br />
<br />
To make sure that the execution does not block for too long by using cooperative multi-tasking principles, we should regularly generate events, suspend its execution (so that the event loop can process other events) and use callbacks to allow it to resume at a later point in time:<br />
<br />
<pre style="font-size: 90%;">
function fibonacci(n) {
return new Promise((resolve, reject) => {
if (n < 2) {
setImmediate(function() {
resolve(1);
});
} else {
let first;
let second;
fibonacci(n - 2)
.then(function(result) {
first = result;
return fibonacci(n - 1);
})
.then(function(result) {
second = result;
resolve(first + second);
});
}
});
}
(async function() {
const result = await fibonacci(20);
await printOnConsole("20th element in the fibonacci series is: "+result);
})();
</pre>
<br />
In the above example, I made the algorithm non-blocking by generating a macrotask with <i>setImmediate</i> for the base step. Because the function explicitly constructs a promise and cannot simply be declared <i>async</i>, I have to use the promises' <i>then</i> methods to retrieve the return values of the computations in the induction step.<br />
<br />
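As a sidenote: when the event generation is isolated into a small helper function that returns a promise (named <i>tick</i> here, purely for illustration), the same algorithm can also be expressed as an <i>async</i> function. The following sketch yields to the event loop on every call:<br />
<br />
<pre>
function tick() {
    // Generate a macrotask so that the event loop can process other events
    return new Promise(resolve => setImmediate(resolve));
}

async function fibonacci(n) {
    await tick();

    if (n < 2) {
        return 1;
    } else {
        return await fibonacci(n - 2) + await fibonacci(n - 1);
    }
}
</pre>
<br />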
<h2>Extensions</h2>
<br />
In my previous blog post, I have also covered the extensions to structured programming that JavaScript provides.<br />
<br />
<h3>Exceptions</h3>
<br />
I have also explained that with asynchronous functions, we cannot use JavaScript's <i>throw</i> and <i>try-catch-finally</i> language constructs, because <strong>exceptions</strong> are typically not thrown instantly, but at a later point in time.<br />
<br />
With <i>await</i>, using these constructs is also no longer a problem.<br />
<br />
For example, I can modify our <i>generateWord</i> example to throw an exception when the provided number is not between 0 and 9:<br />
<br />
<pre>
function generateWord(num) {
if(num < 0 || num > 9) {
throw "Cannot convert "+num+" into a word";
} else {
const words = [ "zero", "one", "two", "three", "four", "five",
"six", "seven", "eight", "nine" ];
return words[num];
}
}
try {
let word = generateWord(1);
printOnConsole("We have a: "+word);
word = generateWord(10);
printOnConsole("We have a: "+word);
} catch(err) {
printOnConsole("Some exception occurred: "+err);
} finally {
printOnConsole("Bye bye!");
}
</pre>
<br />
We can make <i>generateWord</i> an asynchronous function by converting it in the usual way:<br />
<br />
<pre style="font-size: 90%;">
function generateWord(num) {
return new Promise((resolve, reject) => {
process.nextTick(function() {
if(num < 0 || num > 9) {
reject("Cannot convert "+num+" into a word");
} else {
const words = [ "zero", "one", "two", "three", "four", "five",
"six", "seven", "eight", "nine" ];
resolve(words[num]);
}
});
});
}
(async function() {
try {
let word = await generateWord(1);
await printOnConsole("We have a: "+word);
word = await generateWord(10);
await printOnConsole("We have a: "+word);
} catch(err) {
await printOnConsole("Some exception occurred: "+err);
} finally {
await printOnConsole("Bye bye!");
}
})();
</pre>
<br />
As can be seen in the example above, thanks to the <i>await</i> construct, we have not lost our ability to use <i>try/catch/finally</i>.<br />
<br />
<h3>Objects</h3>
<br />
Another major extension is <strong>object-oriented</strong> programming. As explained in an old blog post about object oriented programming in JavaScript, JavaScript uses prototypes rather than classes, but prototypes can still be used to simulate classes and class-based inheritance.<br />
<br />
Because simulating classes is such a common use-case, a <i>class</i> construct was added to the language that uses prototypes to simulate it.<br />
<br />
The following example defines a <i>Rectangle</i> class with a method that can compute a rectangle's area:<br />
<br />
<pre>
class Rectangle {
constructor(width, height) {
this.width = width;
this.height = height;
}
calculateArea() {
return this.width * this.height;
}
}
const r = new Rectangle(2, 2);
printOnConsole("Area is: "+r.calculateArea());
</pre>
<br />
In theory, it is also possible that the construction of an object takes a long time and should be made non-blocking.<br />
<br />
Although JavaScript does not have a language concept to do asynchronous object construction, we can still do it by making a couple of small changes:<br />
<br />
<pre>
class Rectangle {
asyncConstructor(width, height) {
return new Promise((resolve, reject) => {
process.nextTick(() => {
this.width = width;
this.height = height;
resolve();
});
});
}
calculateArea() {
return this.width * this.height;
}
}
(async function() {
const r = new Rectangle();
await r.asyncConstructor(2, 2);
await printOnConsole("Area is: "+r.calculateArea());
})();
</pre>
<br />
As can be seen in the above example, the <i>constructor</i> function has been replaced with an <i>asyncConstructor</i> method that implements the usual strategy to make it non-blocking.<br />
<br />
To asynchronously construct the rectangle, we first construct an empty object whose prototype is <i>Rectangle.prototype</i> (by invoking <i>new Rectangle()</i> without any arguments). Then we invoke the asynchronous constructor to initialize the object in a non-blocking way.<br />
<br />
In my previous blog post, I have developed an abstraction function that could be used as an asynchronous replacement for JavaScript's <i>new</i> operator (<i>slasp.novel</i>) that performs the initialization of an empty object and then invokes the asynchronous constructor.<br />
<br />
Due to the fact that JavaScript introduced a <i>class</i> construct (that replaces all the obscure instructions that I had to perform to simulate an empty object instance with the correct class object as prototype) my abstraction function has mostly lost its value.<br />
<br />
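As a sidenote: a commonly used alternative pattern (not specific to my library; the static <i>create</i> method is just an illustrative name) is an <i>async</i> factory method that performs the non-blocking initialization and returns the constructed object:<br />
<br />
<pre>
class Rectangle {
    constructor(width, height) {
        this.width = width;
        this.height = height;
    }

    static async create(width, height) {
        // Simulate non-blocking initialization work
        await new Promise(resolve => setImmediate(resolve));
        return new Rectangle(width, height);
    }

    calculateArea() {
        return this.width * this.height;
    }
}

(async function() {
    const r = await Rectangle.create(2, 2);
    await printOnConsole("Area is: " + r.calculateArea());
})();
</pre>
<br />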
<h2>Summary of concepts</h2>
<br />
In my previous blog post, I have given an overview of all covered synchronous programming language concepts and corresponding replacement function abstractions that should be used with non-blocking asynchronous functions.<br />
<br />
In this blog post, I will do the same with the concepts covered:<br />
<br />
<div style="font-size: 90%;">
<table style="border-style: solid; border-width: 1px;">
<tr>
<th style="border-style: solid; border-width: 1px;">Concept</th>
<th style="border-style: solid; border-width: 1px;">Synchronous</th>
<th style="border-style: solid; border-width: 1px;">Asynchronous</th>
</tr>
<tr>
<td style="border-style: solid; border-width: 1px;">Function interface</td>
<td style="border-style: solid; border-width: 1px;"><pre>function f(a) { ... }</pre></td>
<td style="border-style: solid; border-width: 1px;"><pre>
async function f(a) { ... }
function f(a) {
return new Promise((resolve, reject) => {...});
}
</pre></td>
</tr>
<tr>
<td style="border-style: solid; border-width: 1px;">Return statement</td>
<td style="border-style: solid; border-width: 1px;"><pre>return val;</pre></td>
<td style="border-style: solid; border-width: 1px;"><pre>
return val;
resolve(val);
</pre></td>
</tr>
<tr>
<td style="border-style: solid; border-width: 1px;">Sequence</td>
<td style="border-style: solid; border-width: 1px;"><pre>a(); b(); ...</pre></td>
<td style="border-style: solid; border-width: 1px;"><pre>await a(); await b(); ...</pre></td>
</tr>
<tr>
<td style="border-style: solid; border-width: 1px;">if-then-else</td>
<td style="border-style: solid; border-width: 1px;"><pre>if(condFun()) {
thenFun();
} else {
elseFun();
}</pre></td>
<td style="border-style: solid; border-width: 1px;"><pre>
if(await condFun()) {
await thenFun();
} else {
await elseFun();
}
</pre></td>
</tr>
<tr>
<td style="border-style: solid; border-width: 1px;">switch</td>
<td style="border-style: solid; border-width: 1px;"><pre>switch(condFun()) {
case "a":
funA();
break;
case "b":
funB();
break;
...
}</pre></td>
<td style="border-style: solid; border-width: 1px;"><pre>switch(await condFun()) {
case "a":
await funA();
break;
case "b":
await funB();
break;
...
}</pre></td>
</tr>
<tr>
<td style="border-style: solid; border-width: 1px;">Recursion</td>
<td style="border-style: solid; border-width: 1px;"><pre>function fun() {
fun();
}</pre></td>
<td style="border-style: solid; border-width: 1px;"><pre>function fun(callback) {
return new Promise((res, rej) => {
setImmediate(function() {
return fun();
});
});
}</pre></td>
</tr>
<tr>
<td style="border-style: solid; border-width: 1px;">while</td>
<td style="border-style: solid; border-width: 1px;"><pre>while(condFun()) {
stmtFun();
}</pre></td>
<td style="border-style: solid; border-width: 1px;"><pre>while(await condFun()) {
await stmtFun();
}</pre></td>
</tr>
<tr>
<td style="border-style: solid; border-width: 1px;">doWhile</td>
<td style="border-style: solid; border-width: 1px;"><pre>do {
stmtFun();
} while(condFun());</pre></td>
<td style="border-style: solid; border-width: 1px;"><pre>do {
await stmtFun();
} while(await condFun());</pre></td>
</tr>
<tr>
<td style="border-style: solid; border-width: 1px;">for</td>
<td style="border-style: solid; border-width: 1px;"><pre>for(startFun();
condFun();
stepFun()
) {
stmtFun();
}</pre></td>
<td style="border-style: solid; border-width: 1px;"><pre>for(await startFun();
await condFun();
await stepFun()
) {
await stmtFun();
}</pre></td>
</tr>
<tr>
<td style="border-style: solid; border-width: 1px;">for-in</td>
<td style="border-style: solid; border-width: 1px;"><pre>for(const a in arrFun()) {
stmtFun();
}</pre></td>
<td style="border-style: solid; border-width: 1px;"><pre>for(const a in (await arrFun())) {
await stmtFun();
}</pre></td>
</tr>
<tr>
<td style="border-style: solid; border-width: 1px;">throw</td>
<td style="border-style: solid; border-width: 1px;"><pre>throw err;</pre></td>
<td style="border-style: solid; border-width: 1px;"><pre>throw err;
reject(err);</pre></td>
</tr>
<tr>
<td style="border-style: solid; border-width: 1px;">try-catch-finally</td>
<td style="border-style: solid; border-width: 1px;"><pre>try {
funA();
} catch(err) {
funErr();
} finally {
funFinally();
}</pre></td>
<td style="border-style: solid; border-width: 1px;"><pre>try {
await funA();
} catch(err) {
await funErr();
} finally {
await funFinally();
}</pre></td>
</tr>
<tr>
<td style="border-style: solid; border-width: 1px;">constructor</td>
<td style="border-style: solid; border-width: 1px;"><pre>
class C {
constructor(a) {
this.a = a;
}
}</pre></td>
<td style="border-style: solid; border-width: 1px;"><pre>class C {
asyncConstructor(a) {
return new Promise((res, rej) => {
this.a = a;
res();
}
}
}</pre></td>
</tr>
<tr>
<td style="border-style: solid; border-width: 1px;">new</td>
<td style="border-style: solid; border-width: 1px;"><pre>const obj = new C(a);</pre></td>
<td style="border-style: solid; border-width: 1px;"><pre>const obj = new C();
await obj.asyncConstructor(a);</pre></td>
</tr>
</table>
</div>
<br />
The left column in the table shows all language constructs that are synchronous by default and the right column shows their equivalent asynchronous implementations.<br />
<br />
Note that compared to the overview given in my previous blog post, the JavaScript language constructs are not avoided, but used.<br />
<br />
With the exception of wrapping callback-based function invocations in promises and implementing recursive algorithms, using <i>await</i> usually suffices to retrieve the results of all required sub expressions.<br />
<br />
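For example, the asynchronous <i>while</i> pattern in the table can be applied as follows (a minimal sketch of my own; the <i>delay</i> helper and the countdown workload are hypothetical illustrations):<br />
<br />
<pre>
// Hypothetical helper that wraps setTimeout in a promise
function delay(ms) {
    return new Promise(resolve => setTimeout(resolve, ms));
}

// Asynchronous variant of a synchronous countdown loop:
// every iteration awaits a promise, keeping the event
// loop free in between
async function countDown(n) {
    while(n > 0) {
        console.log(n);
        await delay(100);
        n--;
    }
}

countDown(5);
</pre>
<br />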
<h2>Discussion</h2>
<br />
In all my previous blog posts that I wrote about asynchronous programming, I was always struck by the fact that most of JavaScript's language constructs were unfit for asynchronous programming. The abstractions that were developed to cope with this problem (e.g. callbacks, coordination libraries, promises etc.) make it possible to get the job done in a reasonable manner, but IMO they always remain somewhat tedious to use and do not prevent you from making many common mistakes.<br />
<br />
Using these abstractions remained a common habit for years. With the introduction of the <i>async</i> and <i>await</i> concepts, we finally have a solution that is decent IMO.<br />
<br />
Not all problems have been solved with the introduction of these new language features. Callback-based APIs have been a common practice for a very long time, as can be seen in some of my examples. Not all APIs have been converted to promise-based solutions and there are still many APIs and third-party libraries that keep following old practices. Most likely, these old-fashioned APIs will never completely go away.<br />
<br />
As a result, we sometimes still have to manually compose promises and do the appropriate conversions from callback APIs. There are also <a href="https://sandervanderburg.blogspot.com/2016/01/integrating-callback-and-promise-based.html">nice facilities that make it possible to conveniently convert callback invocations into promises</a>, but it still remains a responsibility of the programmer.<br />
<br />
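For example, in Node.js, the <i>util.promisify</i> function can convert a function that follows the error-first callback convention into a promise-returning function (a minimal sketch; the file path is a hypothetical example):<br />
<br />
<pre>
const fs = require('fs');
const { promisify } = require('util');

// fs.readFile follows the error-first callback convention,
// so it can be converted into a promise-returning function
const readFile = promisify(fs.readFile);

(async function() {
    const data = await readFile('/etc/hostname', 'utf8');
    console.log(data);
})();
</pre>
<br />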
Another problem, which I often see with programmers who are new to JavaScript, is the belief that using the <i>async</i> keyword automatically makes their functions non-blocking.<br />
<br />
Ensuring that a function does not block remains the responsibility of the programmer -- for example, by making sure that only non-blocking I/O functions are called and that CPU-bound operations are broken up into smaller pieces.<br />
<br />
The <i>async</i> keyword is only an interface solution -- making sure that a function returns a promise and that the <i>await</i> keyword can be used so that a function can stop (by returning) and be resumed at a later point in time.<br />
<br />
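For example, a long-running CPU-bound computation can be broken up by periodically yielding back to the event loop (a minimal sketch for Node.js; the summation workload is a hypothetical example):<br />
<br />
<pre>
// Awaiting setImmediate after each chunk of work returns control
// to the event loop, so that other pending events can be processed
function yieldToEventLoop() {
    return new Promise(resolve => setImmediate(resolve));
}

// Hypothetical CPU-bound task: summing a large range in chunks
async function sumRange(n, chunkSize) {
    let sum = 0;
    for(let i = n; i > 0; i--) {
        sum += i;
        if(i % chunkSize === 0) {
            await yieldToEventLoop();
        }
    }
    return sum;
}

sumRange(100000000, 1000000).then(sum => console.log(sum));
</pre>
<br />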
The JavaScript language does not natively support threads or processes, but APIs have been added to Node.js (<a href="https://nodejs.org/api/worker_threads.html">worker threads</a>) and web browsers (<a href="https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Using_web_workers">web workers</a>) to allow code to be executed in a thread. Using these facilities somewhat relieves programmers of the burden of dividing long-running CPU-bound operations into smaller tasks. Moreover, context-switching is typically much more efficient than cooperative multi-tasking (by generating events and invoking callbacks).<br />
<br />
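A minimal sketch of the worker threads API in Node.js (the summation workload is again a hypothetical example) -- the script spawns itself as a worker that carries out the CPU-bound computation, while the main thread's event loop remains free:<br />
<br />
<pre>
const { Worker, isMainThread, parentPort, workerData } =
    require('worker_threads');

if(isMainThread) {
    // Spawn this same file as a worker and receive its result
    const worker = new Worker(__filename, { workerData: 100000000 });
    worker.on('message', sum => console.log('sum: ' + sum));
    worker.on('error', err => console.error(err));
} else {
    // The worker performs the computation and posts the result back
    let sum = 0;
    for(let i = workerData; i > 0; i--) {
        sum += i;
    }
    parentPort.postMessage(sum);
}
</pre>
<br />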
Another problem that remains is calling asynchronous functions from synchronous functions -- there is still no facility in JavaScript that makes it possible to wait for the completion of an asynchronous function in a synchronous function context.<br />
<br />
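The following sketch (a hypothetical example of my own) illustrates this limitation -- a synchronous function can start an asynchronous function and attach a callback, but it cannot wait for the result before it returns:<br />
<br />
<pre>
async function fetchValue() {
    return 42;
}

function syncCaller() {
    // We can only schedule a reaction to the result...
    fetchValue().then(value => console.log('value: ' + value));
    // ...but we cannot block here until the promise resolves:
    // this line always runs first
    console.log('syncCaller returns before the value is known');
}

syncCaller();
</pre>
<br />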
I also want to make a general remark about structured programming -- although the patterns shown in this blog post can be used to prevent the main loop from blocking, structured programming is centered around the idea that a step is executed after the completion of another. For long-running tasks that do not depend on each other, this is not always the most efficient way of doing things -- you may end up waiting for an unnecessary amount of time.<br />
<br />
The fact that a promise gives you a reference to a value that will be delivered in the future also opens up many interesting new abilities. For example, in an application that consists of a separate model and view, you could already start composing a view and provide the promises for the model objects as parameters to the views. Then it is no longer necessary to wait for all model objects to be available before the views can be constructed -- instead, the views can already be rendered and the data can be updated dynamically.<br />
<br />
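A minimal sketch of this idea (the <i>fetchUserName</i> and <i>renderUserView</i> functions are hypothetical): the view is rendered immediately with a placeholder and updates itself when the promised model data arrives:<br />
<br />
<pre>
// Hypothetical model: the data arrives at some point in the future
function fetchUserName() {
    return new Promise(resolve => setTimeout(() => resolve('Alice'), 1000));
}

// Hypothetical view: rendered immediately with a placeholder and
// updated dynamically as soon as the promised model data arrives
function renderUserView(userNamePromise) {
    console.log('user: loading...');
    userNamePromise.then(name => console.log('user: ' + name));
}

// The view is composed without waiting for the model
renderUserView(fetchUserName());
</pre>
<br />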
Structured programming patterns also limit the ability to efficiently process collections of data -- the repetition patterns in this blog post expect that all data is retrieved before we can iterate over the resulting collection. It may be more efficient to work with <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/for-await...of">asynchronous iterators</a> that can retrieve data on an element-by-element basis, in which the element that comes first is processed first.<br />
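<br />
For example, with an asynchronous generator, each element can be processed as soon as it becomes available (a minimal sketch; the generator is a hypothetical stand-in for a remote data source):<br />
<br />
<pre>
// Hypothetical asynchronous generator that yields elements one
// by one, as if they were retrieved from a remote source
async function* generateItems() {
    for(let i = 3; i > 0; i--) {
        await new Promise(resolve => setTimeout(resolve, 100));
        yield i;
    }
}

(async function() {
    // Each element is processed as soon as it arrives,
    // instead of waiting for the whole collection
    for await (const item of generateItems()) {
        console.log(item);
    }
})();
</pre>
<br />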
Sander van der Burghttp://www.blogger.com/profile/12718166966821611609noreply@blogger.com0tag:blogger.com,1999:blog-1397115249631682228.post-57011066431562857482021-12-30T21:19:00.000+01:002021-12-30T21:19:26.881+01:0011th annual blog reflectionToday it is my blog's 11th anniversary. As with previous years, this is a nice opportunity to reflect over last year's writings.<br />
<br />
<h2>Nix process management framework</h2>
<br />
In the first few months of the year, I have dedicated quite a bit of time on the development of the experimental Nix process framework that I started in 2019.<br />
<br />
As explained in my <a href="https://sandervanderburg.blogspot.com/2020/12/annual-blog-reflection-over-2020.html">blog reflection over 2020</a>, I have reached all of my original objectives. However, while developing these features and exploring their underlying concepts, I discovered that there were still a number of side issues that I needed to address to make the framework usable.<br />
<br />
<h3>s6-rc backend</h3>
<br />
The first thing I did was <a href="https://sandervanderburg.blogspot.com/2021/02/developing-s6-rc-backend-for-nix.html">developing an s6-rc backend</a>. Last year, I did not know anything about <a href="https://skarnet.org/software/s6/">s6</a> or <a href="https://skarnet.org/software/s6-rc/">s6-rc</a>, but it was provided to me as a feature suggestion by people from the Nix community. Aside from the fact that it is a nice experiment to evaluate how portable the framework is, I also learned a great deal about s6, its related tools, and its ancestor: <a href="https://cr.yp.to/daemontools.html">daemontools</a>, from which many of s6's ideas are derived.<br />
<br />
<h3>Mutable multi-process containers</h3>
<br />
I also worked on a <a href="https://sandervanderburg.blogspot.com/2021/02/deploying-mutable-multi-process-docker.html">mutable multi-process container deployment</a> approach. Last year, I have developed a Docker backend for the Nix process management framework making it possible to expose each running process instance as a container. Furthermore, I also made it possible to conveniently construct multi-process containers in which any capable process manager that the Nix process management framework supports can be used as a root process.<br />
<br />
Unfortunately, multi-process containers have a big drawback: they are immutable, and when any of the processes need to be changed or upgraded, the container as a whole needs to be discarded and redeployed from a new image, causing all processes to be terminated.<br />
<br />
To cope with this limitation, I have developed an extension that makes it possible to deploy mutable multi-process containers, in which most of the software in containers can be upgraded by the Nix package manager.<br />
<br />
As long as the root process is not affected, a container does not have to be discarded when a process is upgraded. This extension also makes it possible to run Hydra: the Nix-based continuous integration service in a Docker container.<br />
<br />
<h3>Using the Nix process management framework as an infrastructure deployment solution</h3>
<br />
I was also able to use the Nix process management framework to solve the bootstrap problem for Disnix on non-NixOS systems -- in order to use Disnix, every target machine needs to run the Disnix service and a number of container provider services.<br />
<br />
For a NixOS machine this process is automated, but on non-NixOS systems a manual installation is still required, which is quite cumbersome. <a href="https://sandervanderburg.blogspot.com/2021/03/using-nix-process-management-framework.html">The Nix process management framework can automatically deploy Disnix and all required container provider services</a> on any system capable of running the Nix package manager and the Nix process management framework.<br />
<br />
<h3>Test framework</h3>
<br />
Finally, I have developed <a href="https://sandervanderburg.blogspot.com/2021/04/a-test-framework-for-nix-process.html">a test framework for the Nix process management framework</a>. As I have already explained, the framework makes it possible to use multiple process managers, multiple operating systems, multiple instances of all kinds of services, and run services as an unprivileged user, if desired.<br />
<br />
Although the framework facilitates all these features, it cannot guarantee that a service will work under all possible conditions. The framework makes it possible to conveniently reproduce all these conditions so that a service can be validated.<br />
<br />
With the completion of the test framework, I consider the Nix process management framework to be quite practical. I have managed to <a href="https://github.com/svanderburg/nix-processmgmt-services">automate the deployments of all services that I frequently use</a> (e.g. web servers, databases, application services etc.) and they seem to work quite well. Even commonly used Nix projects are packaged, such as the Nix daemon for multi-user installations and Hydra: the Nix-based continuous integration server.<br />
<br />
<h3>Future work</h3>
<br />
There are still some open framework issues that I intend to address at some point in the future. We still cannot test any services on non-Linux systems such as FreeBSD, which requires a more generalized test driver.<br />
<br />
I also still need to start writing an RFC that identifies the concepts of the framework so that these ideas can be integrated into Nixpkgs. The Nix process management framework is basically a prototype to explore ideas, and it has always been my intention to push the good parts upstream.<br />
<br />
<h2>Home computing</h2>
<br />
After the completion of the test framework, I have shifted my priorities and worked on improving my home computing experience.<br />
<br />
For many years, I have been using a custom rsync-based script that implements a Git-like workflow to make backups of my personal files and to exchange files between multiple machines, such as my desktop machine and laptop. I have decided to polish the script, release it, and write <a href="https://sandervanderburg.blogspot.com/2021/06/an-unconventional-method-for-creating.html">a blog post that explains how it came about</a>.<br />
<br />
Last summer, I visited the <a href="https://www.homecomputermuseum.nl">Home Computer Museum</a>, which gave me the inspiration to check whether both of my vintage computers, the Commodore 128 and the Commodore Amiga 500, still work. I had not touched the Amiga since 2011 (the year that I wrote a blog post about it) and it had been lying dormant in a box in the attic ever since.<br />
<br />
Unfortunately, a few peripherals were broken or in bad condition (such as the hard drive). I decided to bring it to the museum for repairs and to order replacement peripherals. It turned out to be quite a challenge to figure everything out, in particular the installation process of the operating system.<br />
<br />
Because not all information that I needed is available on the Internet, I have decided to write <a href="https://sandervanderburg.blogspot.com/2021/10/using-my-commodore-amiga-500-in-2021.html">a blog post about my experiences</a>.<br />
<br />
I am still in the process of figuring out all the details for my Commodore 128 and I hope I can publish about it soon.<br />
<br />
<h2>Revising the NPM package deployment infrastructure for Nix</h2>
<br />
Aside from doing a nice "home computing detour", I have also been shifting my priorities to a new major development area: improving the NPM deployment infrastructure for Nix. Although node2nix is doing its job pretty well in most cases, its design is heavily dated and gives me more and more problems in correctly supporting NPM's newer features.<br />
<br />
As a first step, I have revised what I consider the second most complicated part of node2nix: the process that populates the <i>node_modules/</i> folder and makes all necessary modifications so that <i>npm install</i> will not attempt to download source artifacts from their original locations.<br />
<br />
This is an important requirement -- NPM and Nix do not play well together because their dependency management approaches conflict: Nix's purity principles are more strict. As a result, NPM's dependency management needs to be bypassed.<br />
<br />
<a href="https://sandervanderburg.blogspot.com/2021/08/a-more-elaborate-approach-for-bypassing.html">The result is a companion tool that I call: placebo-npm</a> that will replace most of the complicated shell code in the <i>node-env.nix</i> module.<br />
<br />
I am still working on revising many other parts of node2nix. This should eventually lead to a new and more modular design, that will support NPM's newer features and should be much easier to maintain.<br />
<br />
Next year, I hope to report more about my progress.<br />
<br />
<h2>Some thoughts</h2>
<br />
As with 2020, I consider 2021 an exceptional year for the record books.<br />
<br />
<h3>Development</h3>
<br />
Compared to last year, I have been much less productive from a blogging perspective. Partially, this is caused by the fact that there are still many things I have worked on that I could not properly finish.<br />
<br />
I have also noticed that there was a considerable drop in my energy level after I completed the test framework for the Nix process management framework. I think this can be attributed to the fact that the process management framework has basically been my only spare time project for over two years.<br />
<br />
For a small part, this kind of work is about exploring ideas, but it is even more about the <strong>execution</strong> of those ideas -- unfortunately, being in execution mode for such a long time (while ignoring the exploration of ideas you come up with in other areas) gradually made it more difficult to keep enjoying the work.<br />
<br />
Despite struggling with my energy levels, I remained motivated to complete all of it, because I know that I am also a very bad multi-tasker. Switching to something else makes it even more difficult to complete it in a reasonable time span.<br />
<br />
After I reached all my goals, for a while, it became extremely difficult to get myself focused on any technical challenge.<br />
<br />
Next year, I have another big project that I am planning to take on (node2nix), but at the same time I will try to schedule proper "breaks" in between to keep myself in balance.<br />
<br />
<h3>The COVID-19 pandemic</h3>
<br />
In my annual reflection from last year, I also elaborated on the COVID-19 pandemic that reached my home country (The Netherlands) in March 2020. Many things happened that year, and at the time of writing my reflection blog post over 2020, we were in our second lockdown.<br />
<br />
The second lockdown felt much worse than the first, but I was still mildly optimistic because of the availability of the first vaccine: <a href="https://www.ema.europa.eu/en/medicines/human/EPAR/comirnaty">Pfizer/BioNTech</a> that looked like our "way out". Furthermore, I was also quite fascinated by the <a href="https://www.cdc.gov/coronavirus/2019-ncov/vaccines/different-vaccines/mrna.html">mRNA technique</a> making it possible to produce a vaccine so quickly.<br />
<br />
Last year, I was hoping that next year I could report that the problem was under control or at least partially solved, but sadly that does not seem to be the case.<br />
<br />
Many things have happened: <a href="https://www.bbc.com/news/health-55388846">the variant that appeared in England</a> eventually became dominant (Alpha variant) and was considerably more contagious than the original variant, causing the second lockdown to last another three months.<br />
<br />
In addition, two more contagious mutations appeared around the same time in <a href="https://en.wikipedia.org/wiki/SARS-CoV-2_Beta_variant">South Africa</a> and <a href="https://www.nature.com/articles/d41586-021-01480-3">Brazil</a> (the Beta and Gamma variants), although the Alpha variant remained the dominant one.<br />
<br />
Several months later, <a href="https://www.nytimes.com/2021/06/22/health/delta-variant-covid.html">there was another huge outbreak in India introducing the Delta variant</a>, which was even more contagious than the Alpha variant. Eventually, that variant became the most dominant in the world. Fortunately, the mRNA vaccines that were developed were still effective enough, but the Delta variant was so contagious that it was considered impossible to build up herd immunity to eradicate the virus.<br />
<br />
In the summer, the problem seemed to be mostly under control because many people have been vaccinated in my home country. Nonetheless, we have learned in a painful way that <a href="https://www.rtlnieuws.nl/nieuws/nederland/artikel/5241201/veel-besmettingen-groningen-toename-besmettingen">relaxing restrictions too much could still lead to very big rapid outbreaks</a>.<br />
<br />
We have also learned in a painful way that <a href="https://www.nature.com/articles/s41598-021-91798-9">the virus spreads more easily in the winter</a>. As a result, we also observed that, despite the fact that <a href="https://coronadashboard.rijksoverheid.nl/landelijk/vaccinaties">over 85% of all adults are fully vaccinated</a>, there are still enough clusters of people that have not built up any resistance against the virus (either by vaccination or by contracting the virus), again leading to <a href="https://www.rivm.nl/nieuws/ruim-30-procent-meer-covid-19-ziekenhuisopnames">significant problems in the hospitals and ICUs</a>.<br />
<br />
Furthermore, the effectiveness of vaccines also drops over time, causing vaccinated people with health problems to still end up in hospitals.<br />
<br />
As a result of all these hospitalizations and low ICU capacity, we are now in yet another lockdown.<br />
<br />
What has always worried me the most is the fact that in so many areas of the world, people hardly have any access to vaccines and medicines, allowing the virus to easily spread on a massive scale and mutate. I knew, because of these four variants, that it would only be a matter of time before a new dominant mutation would appear.<br />
<br />
And that fear eventually became reality -- in November, in Botswana and South Africa, a new and even more contagious mutation appeared: <a href="https://www.aljazeera.com/podcasts/2021/12/8/how-south-africa-discovered-covid-variant-omicron">the Omicron variant</a>, with a spike protein that is much different from that of the Delta variant, reducing the effectiveness of our vaccines.<br />
<br />
At the moment, we are not sure what the implications are. In the Netherlands, as well as many other countries in the world, <a href="https://www.theguardian.com/world/2021/nov/26/omicron-covid-variant-spreads-europe">the Omicron variant has now become the most dominant virus mutation</a>.<br />
<br />
The only potential good news is that the Omicron variant may also cause less severe sickness, but so far we have no clarity on that.<br />
<br />
I think next year we still have much to learn -- to me it has become very clear that this problem will not go away any time soon and that we have to find more innovative/smarter ways to cope with it.<br />
<br />
Furthermore, the mutations have also demonstrated that we should probably do more about inequality in the world -- as a consequence of that inequality, the virus can still spread and mutate significantly, becoming a problem for everybody in the world.<br />
<br />
<h2>Blog posts</h2>
<br />
Last year I forgot about it, but every year I typically also reflect on my top 10 most frequently read blog posts:<br />
<br />
<ol>
<li><a href="https://sandervanderburg.blogspot.com/2014/07/managing-private-nix-packages-outside.html">Managing private Nix packages outside the Nixpkgs tree</a>. As with previous years, this blog post remains the most popular because it is very practical and covers a question that remains unanswered in the official Nix documentation.</li>
<li><a href="https://sandervanderburg.blogspot.com/2012/11/on-nix-and-gnu-guix.html">On Nix and GNU Guix</a>. This blog post used to be my most popular blog post for a long time, and still remains the second most popular. I believe this can be attributed to the fact that this comparison is still very relevant.</li>
<li><a href="https://sandervanderburg.blogspot.com/2015/04/an-evaluation-and-comparison-of-snappy.html">An evaluation and comparison of Snappy Ubuntu</a>. Also a very popular blog post since 2015. It seems that the comparison with Snappy and Flatpak (a tool with similar objectives) remains relevant.</li>
<li><a href="https://sandervanderburg.blogspot.com/2016/01/disnix-05-release-announcement-and-some.html">Disnix 0.5 release announcement and some reflection</a>. This is a blog post that I wrote in 2016 and suddenly appeared in the overall top 10 this year. I am not sure why this has become so relevant all of a sudden.</li>
<li><a href="https://sandervanderburg.blogspot.com/2020/07/on-using-nix-and-docker-as-deployment.html">On using Nix and Docker as deployment solutions: similarities and differences</a>. This is a blog post that I wrote last year to compare Nix and Docker and explain in what ways they are similar and different. It seems to be very popular despite the fact that it was not posted on discussion sites such as Reddit and Hackernews.</li>
<li><a href="https://sandervanderburg.blogspot.com/2013/02/yet-another-blog-post-about-object.html">Yet another blog post about Object Oriented Programming and JavaScript</a>. This explanation blog post is pretty old but seems to stay relevant, despite the fact that modern JavaScript has a <i>class</i> construct.</li>
<li><a href="https://sandervanderburg.blogspot.com/2013/06/setting-up-multi-user-nix-installation.html">Setting up a multi-user Nix installation on non-NixOS systems</a>. Setting up multi-user Nix installations on non-NixOS machines used to be very cumbersome, but fortunately that has been improved in the recent versions. Still, the discussion seems to remain relevant.</li>
<li><a href="https://sandervanderburg.blogspot.com/2012/11/an-alternative-explaination-of-nix.html">An alternative explanation of the Nix package manager</a>. The better of the two explanations that I wrote, in my opinion. It seems to remain popular because I refer to it a lot.</li>
<li><a href="https://sandervanderburg.blogspot.com/2015/03/on-nixops-disnix-service-deployment-and.html">On NixOps, Disnix, service deployment and infrastructure deployment</a>. A very popular blog post that has dropped somewhat in popularity. It still seems that the tools and the discussion are relevant.</li>
<li><a href="https://sandervanderburg.blogspot.com/2013/09/composing-fhs-compatible-chroot.html">Composing FHS-compatible chroot environments with Nix (or deploying steam in NixOS)</a>. An old blog post, but it remains relevant because it addresses a very important compatibility concern with binary-only software and a common criticism of Nix: that it is not FHS-compatible.</li>
</ol>
<br />
<h2>Conclusion</h2>
<br />
As with 2020, 2021 has been quite a year. I hope everybody stays safe and healthy.<br />
<br />
The remaining thing I'd like to say is: HAPPY NEW YEAR!!!<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEhF42Djk2fkKuAO5cgshh468ykDbbKTOgekIu0fU0Ph3Cxo-hX_lx_doqiphrkQt4pCPqhz5ed1ypNxRYKcKi8me4k35QARdlyA3N5dYsWnXUIc__YRX3UvZWaPscIhl8v8kud805J5nKprrhHxD0je5nYBxRf0_jEJWKUy3CUG6jD07-Lv0k8mg_Nqdg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" data-original-height="253" data-original-width="506" src="https://blogger.googleusercontent.com/img/a/AVvXsEhF42Djk2fkKuAO5cgshh468ykDbbKTOgekIu0fU0Ph3Cxo-hX_lx_doqiphrkQt4pCPqhz5ed1ypNxRYKcKi8me4k35QARdlyA3N5dYsWnXUIc__YRX3UvZWaPscIhl8v8kud805J5nKprrhHxD0je5nYBxRf0_jEJWKUy3CUG6jD07-Lv0k8mg_Nqdg"/></a></div>
Sander van der Burghttp://www.blogger.com/profile/12718166966821611609noreply@blogger.com0tag:blogger.com,1999:blog-1397115249631682228.post-51678605473524951092021-10-19T22:41:00.000+02:002021-10-19T22:41:25.587+02:00Using my Commodore Amiga 500 in 2021<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg6sc0H2vzMI-f6rXSUptI0haWmOqjsT96yF-q0eNvoBI5Nu-31SJnLQJ_foSzF-WP6YunjrJvXHnKRYmHUm6H6B87CSkfxY6cJuZRYGmOMNTWI4i8r8VyKQXc0iDKljervl-s4qfZUiDhC/s2048/playgame.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="1536" data-original-width="2048" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg6sc0H2vzMI-f6rXSUptI0haWmOqjsT96yF-q0eNvoBI5Nu-31SJnLQJ_foSzF-WP6YunjrJvXHnKRYmHUm6H6B87CSkfxY6cJuZRYGmOMNTWI4i8r8VyKQXc0iDKljervl-s4qfZUiDhC/s600/playgame.jpg"/></a></div>
<br />
Due to the <a href="https://www.nu.nl/coronavirus/6146047/reproductiegetal-coronavirus-nadert-3-hoogste-niveau-ooit.html">high number of new COVID-19 infections in my home country last summer</a>, I had to "improvise" yet another summer holiday. As a result, I finally found the time to tinker with my old computers again after a very long time of inactivity.<br />
<br />
As I have explained in two blog posts that I wrote over ten years ago, the <a href="https://sandervanderburg.blogspot.com/2011/03/first-computer.html">first computer</a> (a Commodore 128 bought by my parents in 1985) and <a href="https://sandervanderburg.blogspot.com/2011/07/second-computer.html">second computer</a> (Commodore Amiga 500 bought by my parents in 1992) that I ever used, are still in my possession.<br />
<br />
In the last few years, I have used the Commodore 128 a couple of times, but I have not touched the Commodore Amiga 500 since I wrote my blog post about it ten years ago.<br />
<br />
It turns out that the Commodore Amiga 500 still works, but I ran into a number of problems:<br />
<br />
<ul>
<li><strong>A black and white display</strong>. I used to have a monitor, but it broke down in 1997. Since then, I have been using a <a href="https://en.wikipedia.org/wiki/Genlock">Genlock</a> device to attach the Amiga to a TV screen. Unfortunately, in 2021 the Genlock device no longer seems to work.<br />
<br />
The only display option I had left is to attach the Amiga to a TV with an RCA to SCART cable by using the monochrome video output. The downside is that it is only capable of displaying a black and white screen.</li>
<li><strong>No secondary disk drive</strong>. I used to have two 3.5-inch <a href="https://en.wikipedia.org/wiki/Floppy_disk_format">double density disk drives</a>: an internal disk drive (inside the case) and an external disk drive that you can attach to the disk drive port.<br />
<br />
The external disk drive still seems to respond when I insert a floppy disk (the led blinks), but it no longer seems to be capable of reading any disks.</li>
<li><strong>Bad hard drive and expansion slot problems</strong>. The expansion board (that contains the hard drive) seems to give me all kinds of problems.<br />
<br />
Sometimes the Amiga completely fails to detect it. On other occasions, I ran into crashes causing the filesystem to return write errors. Attempting to repair them typically results in new errors.<br />
<br />
After thoroughly examining the disk with <a href="https://aminet.net/package/disk/salv/DiskSalv">DiskSalv</a>, I learned that the drive has physical damage and needs to be replaced.</li>
</ul>
<br />
I also ran into an interesting problem from a user point of view -- exchanging data to and from my Amiga (such as software downloaded from the Internet and programs that I used to write) is quite a challenge. In late 1996, when my parents switched to the PC, I used floppy disks to exchange data.<br />
<br />
In 2021, floppy drives have completely disappeared from all modern computers. In the rare occasion that I still need to read a floppy disk, I have an external USB floppy drive at my disposal, but it is only capable of reading high density 3.5-inch floppy disks. A Commodore Amiga's standard floppy drive (with the exception of the Amiga 4000) is only capable of reading double density disks.<br />
<br />
Fortunately, I have discovered that there are still many things possible with old machines. I brought both my Commodore 128 and Commodore Amiga 500 to the <a href="https://www.homecomputermuseum.nl">Home Computer Museum in Helmond</a> for repairs. Furthermore, I have ordered all kinds of replacement peripherals.<br />
<br />
Getting it all to work turned out to be quite a challenge. Eventually, I managed to overcome all my problems and the machine works like a charm again.<br />
<br />
In this blog post, I will describe what problems I faced and how I solved them.<br />
<br />
<h2>Some interesting properties of the Amiga</h2>
<br />
I often receive many questions from all kinds of people who want to know why it is so interesting to use such an old machine. Aside from nostalgic reasons, I think the machine is an interesting piece of computer history. At the time the first model was launched: the Amiga 1000 in 1985, the machine was far ahead of its time and provided unique multimedia capabilities.<br />
<br />
Back in the late 80s, system resources were very limited (such as CPU, RAM and storage) compared to modern machines, but there were all kinds of interesting facilities to overcome their design limitations.<br />
<br />
For example, the original Amiga 500 model only had 512 KiB of RAM and 32 configurable color registers. Colors can be picked out of a range of 4096 possible colors.<br />
<br />
Despite only having the ability to configure a maximum of 32 distinct colors, it could still display photo-realistic images:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjTZbeM2of4l52tOzW0YjfCE6W2G-gc3oPNy-66U5lqlzVapLz2nC5haDdgw7bMZnmKiSQHZQNA9Gr7ZqoUDwhkew8ftIUWu5W9Nmg4yvAw0TbZS4WYpdUma5qwoPIMGpED0fqfJrM9cszw/s752/photorealistic.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="460" data-original-height="572" data-original-width="752" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjTZbeM2of4l52tOzW0YjfCE6W2G-gc3oPNy-66U5lqlzVapLz2nC5haDdgw7bMZnmKiSQHZQNA9Gr7ZqoUDwhkew8ftIUWu5W9Nmg4yvAw0TbZS4WYpdUma5qwoPIMGpED0fqfJrM9cszw/s400/photorealistic.png"/></a></div>
<br />
As can be seen, the screen shot above clearly has more than 32 distinct colors. This is made possible by using a special screen mode called Hold-and-Modify (HAM).<br />
<br />
In HAM mode, a pixel's color can be picked from a palette of 16 base colors, or a color component (red, green or blue) of the adjacent pixel can be changed. The HAM screen mode makes it possible to use all possible 4096 colors, albeit with some restrictions on the adjacent color values.<br />
<br />
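To make these rules concrete, the following decoder sketch of my own (in JavaScript; the palette and pixel data are hypothetical illustrations, not software from the original machine) shows how one scanline of 6-bit HAM pixel values translates into colors:<br />
<br />
<pre>
// Decodes a scanline of 6-bit HAM pixels into [r, g, b] triples
// (4 bits per component). The upper two bits of each pixel select
// the operation, the lower four bits provide the value.
function decodeHamScanline(pixels, palette) {
    let [r, g, b] = palette[0]; // each line starts from color 0
    return pixels.map(pixel => {
        const control = Math.floor(pixel / 16); // upper two bits
        const value = pixel % 16; // lower four bits
        switch(control) {
            case 0: [r, g, b] = palette[value]; break; // base color
            case 1: b = value; break; // hold red/green, modify blue
            case 2: r = value; break; // hold green/blue, modify red
            case 3: g = value; break; // hold red/blue, modify green
        }
        return [r, g, b];
    });
}

// Hypothetical base palette (first 4 of 16 entries) and pixel data
const palette = [[0, 0, 0], [15, 15, 15], [15, 0, 0], [0, 15, 0]];
console.log(decodeHamScanline([0x01, 0x1f, 0x2f, 0x3f], palette));
</pre>
<br />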
Another unique selling point of the Amiga was its sound capabilities. It could mix 4 audio channels in hardware, which could easily be combined with graphics, animations and games. The Amiga has all kinds of interesting music productivity software, such as <a href="https://en.wikipedia.org/wiki/ProTracker">ProTracker</a>, that I used a lot.<br />
<br />
To make all these multimedia features possible, the Amiga has its own unique hardware architecture:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgIy2Y-beRLFRD4p3Wr2nsLKLMdB0f16dD_IsYTqFvnq7CMJ72Owu5yOuyRnh-KCAf76fNM_dJLRPfs4l6IPnUgJ3iciCjkMgo5S55Ku06WBIDOwBdPJXjM9b8G9kaYETrl_aA1oDMJb07W/s427/chips.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="207" data-original-width="427" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgIy2Y-beRLFRD4p3Wr2nsLKLMdB0f16dD_IsYTqFvnq7CMJ72Owu5yOuyRnh-KCAf76fNM_dJLRPfs4l6IPnUgJ3iciCjkMgo5S55Ku06WBIDOwBdPJXjM9b8G9kaYETrl_aA1oDMJb07W/s600/chips.png"/></a></div>
<br />
The above diagram provides a simplified view of the most important chips in the Amiga 500 and how they are connected:<br />
<br />
<ul>
<li>On the left, the CPU is shown: a <strong>Motorola 68000</strong> that runs at approximately 7 MHz (the actual clock speeds differ somewhat on a PAL and NTSC display). The CPU is responsible for doing calculations and executing programs.</li>
<li>On the right, the unique Amiga chips are shown. Each of them has a specific purpose:<br />
<ul>
<li><a href="http://theamigamuseum.com/the-hardware/the-ocs-chipset/denise/"><strong>Denise</strong></a> (Display ENabler) is responsible for producing the RGB signal for the display, provides bitplane registers for storing graphics data, and is responsible for displaying sprites.</li>
<li><a href="http://theamigamuseum.com/the-hardware/the-ocs-chipset/agnus/"><strong>Agnus</strong></a> (Address GeNerator UnitS) provides a <strong>blitter</strong> (that is responsible for quick transfers of data in chip memory, typically graphics data), and a <strong>copper</strong>: a programmable co-processor that is aligned with the video beam.<br />
<br />
The copper makes all kinds of interesting graphical features possible, while keeping the CPU free for work. For example, the following screenshot of the game <a href="https://www.lemonamiga.com/games/details.php?id=2266">Trolls</a>:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi3HXtBIg8f13fZus2Jp9ZGh97wXFDK49ygzkPt3WnUgSCBTqYa5wofedPKuWjaC7e6T4f5pdNSu290vED4AhF3AmwKzFB_FO2jLqtvfufrJCrTcdeL3AO9CnR3rBOHH9xkimpJ_20tGCSe/s960/trolls.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="460" data-original-height="540" data-original-width="960" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi3HXtBIg8f13fZus2Jp9ZGh97wXFDK49ygzkPt3WnUgSCBTqYa5wofedPKuWjaC7e6T4f5pdNSu290vED4AhF3AmwKzFB_FO2jLqtvfufrJCrTcdeL3AO9CnR3rBOHH9xkimpJ_20tGCSe/s600/trolls.png"/></a></div>
<br />
clearly contains more than 32 distinct colors. For example, the rainbow-like background provides a unique color on each scanline. The copper is used in such a way that the value of the background color register is changed on each scanline, while the screen is drawn.<br />
<br />
The copper also makes it possible to switch between screen modes (low resolution, high resolution) on the same physical display, such as in the Workbench:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiDdQwMQIQC6J_c29ZzT4uryxVZ6BTXh5Ii0JBt97P1HUUWateFwAFtHKEfCZeR0OGkjvNkxA7ZUHcPNluulqCc9RSX7Bii0UoSrp-Du1p19afK4NcOJfvCwD8SJVA-Ri_BxJmA1hEw_FDQ/s752/dpaint.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="460" data-original-height="572" data-original-width="752" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiDdQwMQIQC6J_c29ZzT4uryxVZ6BTXh5Ii0JBt97P1HUUWateFwAFtHKEfCZeR0OGkjvNkxA7ZUHcPNluulqCc9RSX7Bii0UoSrp-Du1p19afK4NcOJfvCwD8SJVA-Ri_BxJmA1hEw_FDQ/s600/dpaint.png"/></a></div>
<br />
As can be seen in the above screenshot, the upper part of the screen shows Deluxe Paint in low-res mode with its own unique set of colors, while the lower part shows the workbench in high resolution mode (with a different color palette). The copper can change the display properties while the screen is rendered, while keeping the CPU free to do work.<br />
</li>
<li>
<a href="http://theamigamuseum.com/the-hardware/the-ocs-chipset/paula/"><strong>Paula</strong></a> is a multi-functional chip that provides sound support, such as processing sample data from memory and mixing 4 audio channels. Because it does mixing in hardware, the CPU is still free to do work.<br />
<br />
It also controls the disk drive, serial port, mouse and joysticks.
</li>
</ul>
</li>
<li>All the chips in the above diagram require access to memory. <a href="http://theamigamuseum.com/the-hardware/chip-and-fast-ram/"><strong>Chip RAM</strong></a> is memory that is shared between all chips. As a consequence, they share the same memory bus.<br />
<br />
A shared bus imposes speed restrictions -- on even clock cycles the CPU can access chip memory, while on the odd cycles the chips have memory access.<br />
<br />
Many Amiga programs are optimized in such a way that all of the CPU's memory accesses happen on even clock cycles as much as possible. When the CPU needs to access memory on odd clock cycles, it is forced to wait, losing execution speed.</li>
<li>An Amiga can also be extended with <strong>Fast RAM</strong> that does not suffer from any speed limitations. Fast RAM is on a different memory bus that can only be accessed by the CPU and not by any of the chips.<br />
<br />
(As a sidenote: there is also Slow RAM that is not shown in the diagram. It falls in between chip and fast RAM. Slow RAM is memory that is exclusive to the CPU, but cannot be used on odd clock cycles).</li>
</ul>
<br />
Compared to other computer architectures used at the same time, such as the PC, 7 MHz of CPU clock speed does not sound all that impressive, but the combination of all these autonomous chips working together is what makes many incredible multimedia properties possible.<br />
<br />
<h2>My Amiga 500 specs</h2>
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh078kZ_mDNMz9EJ7R1USFLZ1aQ_X7zYciUGKLpWhW1l0zArMRUqDsWJ8ai8XrMAVSdyUZ7EEhI_T0WIK44h5n5_TQmV2UK_IUoiSXBSUsLN2QWMAVj8PngaL-NeCBUlCdXbxJpOHC0J5fF/s1290/amigainside.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="968" data-original-width="1290" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh078kZ_mDNMz9EJ7R1USFLZ1aQ_X7zYciUGKLpWhW1l0zArMRUqDsWJ8ai8XrMAVSdyUZ7EEhI_T0WIK44h5n5_TQmV2UK_IUoiSXBSUsLN2QWMAVj8PngaL-NeCBUlCdXbxJpOHC0J5fF/s600/amigainside.jpg"/></a></div>
<br />
When my parents bought my Commodore Amiga 500 machine in 1992, it still had the original chipset and 512 KiB of Chip RAM. The only peripherals were an external 3.5-inch floppy drive and a kickstart switcher allowing me to switch between Kickstart 1.3 and 2.0. (The Kickstart is the portion of the Amiga operating system residing in ROM).<br />
<br />
Some time later, the Agnus and Denise chips were upgraded (we moved from the Original Chipset to the Enhanced Chipset), extending the amount of chip RAM to 1 MiB and making it possible to use super high resolution screen modes.<br />
<br />
At some point, we bought a <a href="http://amiga.resource.cx/exp/powerpc">KCS PowerPC board</a> making it possible to emulate a PC and run MS-DOS applications. Although the product calls itself an emulator, it also provides a board that extends the hardware with a number of interesting features:<br />
<br />
<ul>
<li>A 10 MHz <a href="https://en.wikipedia.org/wiki/NEC_V20#V30">NEC V30 CPU</a> that is pin and instruction-compatible with an Intel 8086/8088 CPU. Moreover, it implements some 80186 instructions, some of its own instructions, and is between 10-30% faster.</li>
<li>1 MiB of RAM that can be used by the NEC V30 CPU for <a href="https://en.wikipedia.org/wiki/Conventional_memory">conventional</a> and <a href="https://en.wikipedia.org/wiki/Upper_memory_area">upper memory</a>. In addition, the board's memory can also be used by the Amiga as additional chip RAM, fast RAM and as a RAM disk.</li>
<li>A clock (powered by a battery) so that you do not have to reconfigure the date and time on startup. This PC clock can also be used in Amiga mode.</li>
</ul>
<br />
Eventually, we also obtained a hard drive. The Amiga 500 does not include any hard drive, nor does it have an internal hard drive connector.<br />
<br />
Nonetheless, it can be extended through the Zorro expansion slot with an extension board. We obtained this extension board: <a href="http://amiga.resource.cx/exp/evolution500">MacroSystem evolution</a> providing a SCSI connector, a whopping 8 MiB of fast RAM and an additional floppy drive connector. To the SCSI connector, a 120 MiB Maxtor 7120SR hard-drive was attached.<br />
<br />
<h2>Installing new and replacement peripherals</h2>
<br />
In this section, I will describe my replacement peripherals and what I did to make them work.<br />
<br />
<h2>RGB to SCART cable</h2>
<br />
As explained in the introduction, I no longer have a monitor and the Genlock device is broken, only making it possible to have a black and white display.<br />
<br />
Fortunately, all kinds of replacement options seem to be available to connect an Amiga to a more modern display.<br />
<br />
I have ordered an <a href="https://amigastore.eu/en/208-scart-cable-amiga-rgb-to-scart-tv-sound-modified.html">RGB to SCART cable</a>. It can be attached to the RGB and audio output of the Amiga and to the SCART input on my LCD TV.<br />
<br />
<h2>GoTek floppy emulator</h2>
<br />
Another problem is that the secondary floppy drive is broken and could not be repaired.<br />
<br />
Even if I could find a suitable replacement drive, floppy disks are very difficult media to use for data exchange these days.<br />
<br />
Even with an old PC that still has an internal floppy drive (capable of reading both high and double density floppy disks), exchanging information remains difficult -- due to limitations of a PC floppy controller, a PC is incapable of reading Amiga disks, whereas an Amiga can read and write PC floppy disks. A PC-formatted floppy disk has less storage capacity than an Amiga-formatted disk.<br />
<br />
There is also an interesting alternative to a real floppy drive: <a href="http://www.gotekemulator.com">the GoTek floppy emulator</a>.<br />
<br />
The GoTek floppy emulator works with disk image files stored on a USB memory stick. The numeric digit on the display indicates which disk image is currently inserted into the drive. With the rotating switch you can switch between disk images. It operates at the same speed as a real disk drive and produces similar sounds.<br />
<br />
Booting from floppy disk 0 starts a program that allows you to configure disk images for the remaining numeric entries:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgPSIebAKfRHlZfbb6zxwNEoxCFr8cpXkWnvIYtWOIE_5khMRsXsRecruXX0ETCc3BKFlpXjWhBUz_I1IxjC50E_7wdUv9o1-FHkDuA6oCY8zVx_AAGB4iDnfDGKkDMHGUOpLso5Fj6joFr/s2048/imageselect.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="1536" data-original-width="2048" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgPSIebAKfRHlZfbb6zxwNEoxCFr8cpXkWnvIYtWOIE_5khMRsXsRecruXX0ETCc3BKFlpXjWhBUz_I1IxjC50E_7wdUv9o1-FHkDuA6oCY8zVx_AAGB4iDnfDGKkDMHGUOpLso5Fj6joFr/s600/imageselect.jpg"/></a></div>
<br />
The GoTek floppy emulator can act both as a replacement for the internal floppy drive as well as an external floppy drive and uses the same connectors.<br />
<br />
I have decided to buy <a href="https://amigastore.eu/en/323-usb-floppy-emulator-gotek.html">an external model</a>, because the internal floppy drive still works and I want to keep the machine as close to the original as possible. I can turn the GoTek floppy drive into the primary disk drive by using the DF0 switch on the right side of the Amiga case.<br />
<br />
Because all disk images are stored on a FAT-formatted USB stick, exchanging information with a PC becomes much easier. I can transfer the same disk image files that I use in the Amiga emulator to the USB memory stick on my PC and then natively use them on a real Amiga.<br />
<br />
<h2>SCSI2SD</h2>
<br />
As explained earlier, the 29-year old SCSI hard drive connected to the expansion board is showing all kinds of age-related problems. Although I could search for a compatible second-hand hard drive that was built in the same era, it is probably not going to last very long either.<br />
<br />
Fortunately, for retro-computing purposes, an interesting replacement device was developed: the <a href="http://www.codesrc.com/mediawiki/index.php/SCSI2SD">SCSI2SD</a>, that can be used as drop-in replacement for a SCSI hard drive and other kinds of SCSI devices.<br />
<br />
This device can be attached to the same SCSI and power connector cables that the old hard drive uses. As the name implies, its major difference is that it uses a (modern) SD card for storage.<br />
<br />
<div class="separator"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjjjx8bbMVKTltOQcU4Br_rd1AIpLAnUHkvv7prT4t_OLjaftYc9dlG3PHN8zuLPtKNmNhemqDTfxGzus6-g7PXGIdunGhmHyUUDT4tM71RNvQ0V7nnqdtvdQ4R82mYkbs8VprDXGV2Yf_O/s2048/oldhd.jpg" style="display: block; padding: 1em 0; text-align: center; clear: left; float: left;"><img alt="" border="0" height="320" data-original-height="2048" data-original-width="1536" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjjjx8bbMVKTltOQcU4Br_rd1AIpLAnUHkvv7prT4t_OLjaftYc9dlG3PHN8zuLPtKNmNhemqDTfxGzus6-g7PXGIdunGhmHyUUDT4tM71RNvQ0V7nnqdtvdQ4R82mYkbs8VprDXGV2Yf_O/s320/oldhd.jpg"/></a></div>
<div class="separator"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjktCtEzcpb7Tyf7lfxEjTjs3DNbFbOmt_-ouZwT4yOwwayGeMfMsh7qgrA3ogxQnm7kSo_QYkOz3jmOy-6SvjS5A2IXrV5dWeS_fXJ_qxTk3LeEVUNQad82GagFj9AdYOwltVfMuF4-i2g/s2048/scsi2sd.jpg" style="display: block; padding: 1em 0; text-align: center; clear: right; float: right;"><img alt="" border="0" height="320" data-original-height="2048" data-original-width="1536" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjktCtEzcpb7Tyf7lfxEjTjs3DNbFbOmt_-ouZwT4yOwwayGeMfMsh7qgrA3ogxQnm7kSo_QYkOz3jmOy-6SvjS5A2IXrV5dWeS_fXJ_qxTk3LeEVUNQad82GagFj9AdYOwltVfMuF4-i2g/s320/scsi2sd.jpg"/></a></div>
<div style="clear: both;"></div>
<br />
The left picture (shown above) shows the interior of the MacroSystem evolution board's case with the original Maxtor hard drive attached. On the right, I have replaced the hard drive with a SCSI2SD board (that uses a 16 GiB SD-card for storage).<br />
<br />
Another nice property of the SCSI2SD is that an SD card offers much more storage capacity. The smallest SD card that I could buy offers 16 GiB of storage, which is substantially more than the 120 MiB that the old Maxtor hard drive from 1992 used to offer.<br />
<br />
Unfortunately, the designers of the original Amiga operating system did not foresee that people would use devices with so much storage capacity. From a technical point of view, AmigaOS versions 3.1 and older are incapable of addressing more than 4 GiB of storage per device.<br />
<br />
In addition to the operating system's storage addressing limit, I discovered that there is another limit -- the SCSI controller on the MacroSystem evolution extension board is unable to address more than 1 GiB of storage space per SCSI device. Trying to format a partition beyond this 1 GiB boundary results in a "DOS disk not found" error. This limit does not seem to be documented anywhere in the MacroSystem evolution manual.<br />
<br />
To cope with these limitations, the SCSI2SD device can be configured in such a way that it stays within the boundaries of the operating system. To do this, it needs to be connected to a PC with a micro USB cable and configured with the <i>scsi2sd-util</i> tool.<br />
<br />
After many rounds of trial and error, I ended up using the following settings:<br />
<br />
<ul>
<li>Enable SCSI terminator (V5.1 only): on</li>
<li>SCSI Host Speed: Normal</li>
<li>Startup Delay (seconds): 0</li>
<li>SCSI Selection Delay: 255</li>
<li>Enable Parity: on</li>
<li>Enable Unit Attention: off</li>
<li>Enable SCSI2 Mode: on</li>
<li>Disable glitch filter: off</li>
<li>Enable disk cache (experimental): off</li>
<li>Enable SCSI Disconnect: off</li>
<li>Respond to short SCSI selection pulses: on</li>
<li>Map LUNS to SCSI IDs: off</li>
</ul>
<br />
Furthermore, the SCSI2SD allows you to configure multiple SCSI devices and put restrictions on how much storage from the SD card can be used per device.<br />
<br />
I have configured one SCSI device (representing a 1 GiB hard drive) with the following settings:<br />
<br />
<ul>
<li>Enable SCSI Target: on</li>
<li>SCSI ID: 0</li>
<li>Device Type: Hard Drive</li>
<li>Quirks Mode: None</li>
<li>SD card start sector: 0</li>
<li>Sector size (bytes): 512</li>
<li>Sector count: leave it alone</li>
<li>Device size: 1 GB</li>
</ul>
<br />
I left the <i>Vendor</i>, <i>ProductID</i>, <i>Revision</i> and <i>Serial Number</i> values untouched. The <i>Sector count</i> is derived automatically from the start sector and device size.<br />
<br />
Before using the SD card, I recommend erasing it first. Strictly speaking, this is not required, but I have learned in a very painful way that <i>DiskSalv</i>, a tool that is frequently used to fix corrupted Amiga file systems, may get confused if there are traces of a previous filesystem left behind. As a result, it may incorrectly treat files as invalid file references, causing further corruption.<br />
<br />
On Linux, I can clear the memory of the SD card with the following command (<i>/dev/sdb</i> refers to the device file of my SD-card reader):<br />
<br />
<pre>
$ dd if=/dev/zero of=/dev/sdb bs=1M status=progress
</pre>
<br />
After clearing the SD card, I can insert it into the SCSI2SD device, do the partitioning and perform the installation of the Workbench. This process turned out to be trickier than I thought -- the MacroSystem evolution board only includes a manual in German, requiring me to brush up my German reading skills.<br />
<br />
The first step is to use the HDToolBox tool (included with the Amiga Workbench 2.1 installation disk) to detect the hard disk.<br />
<br />
(As a sidenote: check whether the SCSI cable is properly attached to both the SCSI2SD device and the board. In my first attempt, the firmware was able to detect that there was a SCSI device with LUN 0, but it could not detect that it was a hard drive. After many rounds of trial and error, I discovered that the SCSI cable was not properly attached to the extension board!).<br />
<br />
By default, HDToolBox works with the standard SCSI driver bundled with the Amiga operating system (<i>scsi.device</i>) which is not compatible with the SCSI controller on the MacroSystem Evolution board.<br />
<br />
To use the correct driver, I had to configure HDToolBox to use a different driver, by opening a shell session and running the following command-line instructions:<br />
<br />
<pre>
Install2.1:HDTools
HDToolBox evolution.device
</pre>
<br />
In the above code fragment, I pass the driver name: <i>evolution.device</i> as a command-line parameter to <i>HDToolBox</i>.<br />
<br />
With the above configuration setting, the SCSI2SD device gets detected by HDToolBox:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiihHP8DpzYFpy4sQuq5YjbFb7sTsm2FtGMYhIZ1ecP2Bz7lenSvc6Jznpwu_ASP8GlhSQn4YNG35ikRBRrFKZsT0COkEjUg_nX8KSLg7Ij_QSXeDLhAmjxdg8RvZgcyL2NyMtvXYdZenV5/s2048/hdtoolbox.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="1536" data-original-width="2048" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiihHP8DpzYFpy4sQuq5YjbFb7sTsm2FtGMYhIZ1ecP2Bz7lenSvc6Jznpwu_ASP8GlhSQn4YNG35ikRBRrFKZsT0COkEjUg_nX8KSLg7Ij_QSXeDLhAmjxdg8RvZgcyL2NyMtvXYdZenV5/s600/hdtoolbox.jpg"/></a></div>
<br />
I did the partitioning of my SD-card hard drive as follows:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgBqWlP2sAluvlWKZxxTQv_4eib6CM-dj5UbOlCIO85TNO5DWOQSDIK4BMNrSuEVAuBIB2OUZdDnUMblw6NggjFHeQK0D3uoK77yBBHqhtFkSiD957amb-yF6bndyGZZzwGy9yTZdfRMdZ4/s2048/partitions.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="1536" data-original-width="2048" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgBqWlP2sAluvlWKZxxTQv_4eib6CM-dj5UbOlCIO85TNO5DWOQSDIK4BMNrSuEVAuBIB2OUZdDnUMblw6NggjFHeQK0D3uoK77yBBHqhtFkSiD957amb-yF6bndyGZZzwGy9yTZdfRMdZ4/s600/partitions.jpg"/></a></div>
<br />
<table>
<tr>
<th>Partition Device Name</th>
<th>Capacity</th>
<th>Bootable</th>
</tr>
<tr>
<td>DH0</td>
<td>100 MiB</td>
<td>yes</td>
</tr>
<tr>
<td>KCS</td>
<td>100 MiB</td>
<td>no</td>
</tr>
<tr>
<td>DH1</td>
<td>400 MiB</td>
<td>no</td>
</tr>
<tr>
<td>DH2</td>
<td>400 MiB</td>
<td>no</td>
</tr>
</table>
<br />
I did not change any advanced file system settings. I have configured all partitions to use mask: <i>0xfffffe</i> and max transfer: <i>0xffffff</i>.<br />
<br />
Beyond creating partitions, there was another tricky configuration aspect I had to take into account -- I had to reserve the second partition (the <i>KCS</i> partition) as a hard drive for the KCS PowerPC emulator.<br />
<br />
In my first partitioning attempt, I configured the KCS partition as the last partition, but that seems to cause problems when I start the KCS PowerPC emulator, typically resulting in a very slow startup followed by a system crash.<br />
<br />
It appears that this is caused by a memory addressing problem. Putting the KCS partition under the 200 MiB limit seems to fix it. Since most addressing boundaries are powers of 2, my guess is that the KCS PowerPC emulator expects a hard drive partition to reside below the 256 MiB limit.<br />
<br />
After creating the partitions and rebooting the machine, I can format them. For some unknown reason, a regular format does not seem to work, so I ended up doing a quick format instead.<br />
<br />
Finally, I can install the workbench on the <i>DH0:</i> partition by running the Workbench installer (that resides in the: <i>Install2.1</i> folder on the installation disk):<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhectxFku_Yh_q4WIha_PnZ36oL25V-eboAag09SfU9QeENRxwHhW8S8rj6lUsyzmorEoIdFJV1Xa5IQO8VwrlmEy1YmWk-7AxvQpWIe2fWgb0-sQUYlUSq1ZseLoVP46u-I6fpCV7Hfd1F/s2048/installation.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="1536" data-original-width="2048" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhectxFku_Yh_q4WIha_PnZ36oL25V-eboAag09SfU9QeENRxwHhW8S8rj6lUsyzmorEoIdFJV1Xa5IQO8VwrlmEy1YmWk-7AxvQpWIe2fWgb0-sQUYlUSq1ZseLoVP46u-I6fpCV7Hfd1F/s600/installation.jpg"/></a></div>
<br />
<h2>Null modem cable</h2>
<br />
The GoTek floppy drive and SCSI2SD already make it much easier to exchange data with my Amiga, but they are still somewhat impractical for exchanging small files, such as Protracker modules or software packages (in LhA format) downloaded from <a href="https://aminet.net">Aminet</a>.<br />
<br />
I have also bought a good old-fashioned <a href="https://amigastore.eu/en/275-cable-null-modem-amiga.html">null modem cable</a> that can be used to link two computers through their serial ports. Modern computers no longer have an <a href="https://en.wikipedia.org/wiki/RS-232">RS-232</a> serial port, but you can still use a USB to RS-232 converter, which makes it possible to link up over a USB connection.<br />
<br />
To link up, the serial port settings on both ends need to be the same and the baud rate should not be too high. I have configured <a href="http://darioportfolio.com/tutorial/amiga_adf.html">the following settings on my Amiga</a> (configured with the <i>SYS:Prefs/Serial</i> preferences program):<br />
<br />
<ul>
<li>Baud rate: 19,200</li>
<li>Input buffer size: 512</li>
<li>Handshaking: RTS/CTS</li>
<li>Parity: None</li>
<li>Bits/Char: 8</li>
<li>Stop Bits: 1</li>
</ul>
<br />
With a terminal client, such as <a href="https://www.amigawiki.org/doku.php?id=en:software:communication:ncomm">NComm</a>, I can make a terminal connection to my Linux machine. By installing <a href="https://www.ohse.de/uwe/software/lrzsz.html">lrzsz</a> on my Linux machine, I can exchange files by using the <a href="https://en.wikipedia.org/wiki/ZMODEM">Zmodem</a> protocol.<br />
<br />
There are <a href="https://www.wicksall.net/Amiga/TransferFilesBetweenLinuxAndAmiga">a variety of ways to link my Amiga with a Linux PC</a>. A quick and easy way to exchange files is to start <a href="https://github.com/npat-efault/picocom">picocom</a> on the Linux machine with the following parameters:<br />
<br />
<pre>
$ picocom --baud 19200 \
--flow h \
--parity n \
--databits 8 \
--stopbits 1 \
/dev/ttyUSB0
</pre>
<br />
After starting Picocom, I can download files from my Linux PC by selecting: <i>Transfer -> Download</i> in the NComm menu. This action opens a file dialog on my Linux machine that allows me to pick the files that I want to download.<br />
<br />
Similarly, I can upload files to my Linux machine by selecting <i>Transfer -> Upload</i>. On my Linux machine, a file dialog appears that allows me to pick the target directory where the uploaded files need to be stored.<br />
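<br />
Under the hood, picocom delegates Zmodem transfers to external helper programs (by default: <i>sz</i> and <i>rz</i> from the lrzsz package). If desired, these commands can also be configured explicitly -- an optional refinement using picocom's standard options:<br />
<br />
<pre>
$ picocom --baud 19200 \
  --send-cmd "sz -vv" \
  --receive-cmd "rz -vv" \
  /dev/ttyUSB0
</pre>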
<br />
In addition to simple file exchange, I can also expose a Linux terminal over a serial port and use my Amiga to remotely provide command-line instructions:<br />
<br />
<pre>
$ agetty --flow-control ttyUSB0 19200
</pre>
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgGRi9_TkmLGabKc3jMouY2tLido7CsaxhdmhB2jHI29wDAeS0dVG6ZgrBg0ZEBgdRuxHOO6Fc0V9pE25xfx5PeFpo21leoeLX1vtm4wfAv9s8BgufvE4Dzke90ay1e-eNqcrRAFsyArP__/s2048/ncomm2.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="1182" data-original-width="2048" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgGRi9_TkmLGabKc3jMouY2tLido7CsaxhdmhB2jHI29wDAeS0dVG6ZgrBg0ZEBgdRuxHOO6Fc0V9pE25xfx5PeFpo21leoeLX1vtm4wfAv9s8BgufvE4Dzke90ay1e-eNqcrRAFsyArP__/s600/ncomm2.jpg"/></a></div>
<br />
To keep the terminal screen formatted nicely (e.g. a fixed number of rows and columns), I should run the following command in the terminal session:<br />
<br />
<pre>
stty rows 48 cols 80
</pre>
<br />
By using NComm's upload function, I can transfer files to the current working directory. This works because a Zmodem sender first transmits the literal string: <i>rz</i> followed by a carriage return, which the remote shell interprets as a command that starts the receiver (provided that lrzsz is installed).<br />
<br />
Downloading a file from my Linux PC can be done by running the <i>sz</i> command:<br />
<br />
<pre>
$ sz mod.cool
</pre>
<br />
The above command allows me to download the ProTracker module file: <i>mod.cool</i> from the current working directory.<br />
<br />
It is also possible to remotely administer an Amiga machine from my Linux machine. Running the following command starts a shell session exposed over the serial port:<br />
<br />
<pre>
> NewShell AUX:
</pre>
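<br />
Note that on a stock Workbench installation, the <i>AUX:</i> device may first have to be mounted. My assumption (based on standard Workbench conventions) is that the AUX DOSDriver has been copied from <i>SYS:Storage/DOSDrivers</i> to <i>DEVS:DOSDrivers</i>, after which it can be mounted as follows:<br />
<br />
<pre>
> Mount AUX:
</pre>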
<br />
With a terminal client on my Linux machine, such as <a href="https://salsa.debian.org/minicom-team/minicom">Minicom</a>, I can run Amiga shell instructions remotely:<br />
<br />
<pre>
$ minicom -b 19200 -D /dev/ttyUSB0
</pre>
<br />
showing me the following output:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhvTQkwftWw0POSWfH7PQMWZvQFbes0sFUkJBCHM3TidBidwu57quizkKfCK4lL6lkQgfsEm5HbuS8Z4iLZ9jDsUnHTAbqi0sbkj8hiAHNosZK7cJCPSxOqxSUorPjSgKTmzjAELyrCZRz8/s848/minicom-amiga.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="614" data-original-width="848" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhvTQkwftWw0POSWfH7PQMWZvQFbes0sFUkJBCHM3TidBidwu57quizkKfCK4lL6lkQgfsEm5HbuS8Z4iLZ9jDsUnHTAbqi0sbkj8hiAHNosZK7cJCPSxOqxSUorPjSgKTmzjAELyrCZRz8/s600/minicom-amiga.png"/></a></div>
<br />
<h2>Usage</h2>
<br />
All these new hardware peripherals open up all kinds of new interesting possibilities.<br />
<br />
<h3>Using the SD card in FS-UAE</h3>
<br />
For example, I can detach the SD card from the SCSI2SD device, put it in my PC, and then use the hard drive in the emulator (both <a href="https://fs-uae.net/">FS-UAE</a> and <a href="https://www.winuae.net/">WinUAE</a> seem to work).<br />
<br />
By giving the card reader's device file public permissions:<br />
<br />
<pre>
$ chmod 666 /dev/sdb
</pre>
<br />
FS-UAE, which runs as an ordinary user, should be able to access it. By configuring a hard drive that refers to the device file:<br />
<br />
<pre>
hard_drive_0 = /dev/sdb
</pre>
<br />
we have configured FS-UAE to use the SD card as a virtual hard drive (allowing me to use the exact same installation):<br />
<br />
<div class="separator"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhBJskgESxuVxl5ft5tC5BU9aK8U6xsFpToFC41bNEGfRNjKrk1xzC-DBK0j7DUYouzeh5unzN7JeDJerfwKLiaC14stJWhRokVIzFfsWaFa_yxiJlboPH6OkUe0mHloC6JsP_rXxaZJGJs/s2048/workbench.jpg" style="display: block; padding: 1em 0; text-align: center; float: left; clear: left;"><img alt="" border="0" width="240" data-original-height="1536" data-original-width="2048" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhBJskgESxuVxl5ft5tC5BU9aK8U6xsFpToFC41bNEGfRNjKrk1xzC-DBK0j7DUYouzeh5unzN7JeDJerfwKLiaC14stJWhRokVIzFfsWaFa_yxiJlboPH6OkUe0mHloC6JsP_rXxaZJGJs/s320/workbench.jpg"/></a></div>
<div class="separator"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgS841esPeHwUQZGIz2tcqLt2yDDra8t28qcTtGUyn9lzP-I-hJPksYftWMkfou3kyhhooGKVaOwabtrcvmudZ6MDE4pqP-vLYkovksw8kYu1j1eYaD0qtAovt-Q_fuPj30chB-p3n5bZgp/s1000/hd-in-fsuae.png" style="display: block; padding: 1em 0; text-align: center; float: right; clear: right;"><img alt="" border="0" width="240" data-original-height="609" data-original-width="1000" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgS841esPeHwUQZGIz2tcqLt2yDDra8t28qcTtGUyn9lzP-I-hJPksYftWMkfou3kyhhooGKVaOwabtrcvmudZ6MDE4pqP-vLYkovksw8kYu1j1eYaD0qtAovt-Q_fuPj30chB-p3n5bZgp/s600/hd-in-fsuae.png"/></a></div>
<div style="clear: both;"></div>
<br />
An advantage of using the SD card in the emulator is that we can perform installations of software packages much faster. I can temporarily boost the emulator's execution and disk drive speed, saving me quite a bit of installation time.<br />
<br />
I can also more conveniently transfer large files from my host system to the SD card. For example, I can create a <i>temp</i> folder and expose it in FS-UAE as a secondary virtual hard drive:<br />
<br />
<pre>
hard_drive_1 = /home/sander/temp
hard_drive_1_label = temp
</pre>
<br />
and then copy all files from the <i>temp:</i> drive to the SD card:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhVxGybzd6q9OL5a-LZCQX0phkSOQaYGuPY2KrLnL3wRVzxOIGlIzYeZWxH0PImTcX8WBSB733Fm3FeSztDJTRmXKcJRLrOVUrHC-nxf39GsTnU795N4qsDZvrQt3_7LXVxA5wUQW2Um1xD/s960/copytosdcard.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="540" data-original-width="960" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhVxGybzd6q9OL5a-LZCQX0phkSOQaYGuPY2KrLnL3wRVzxOIGlIzYeZWxH0PImTcX8WBSB733Fm3FeSztDJTRmXKcJRLrOVUrHC-nxf39GsTnU795N4qsDZvrQt3_7LXVxA5wUQW2Um1xD/s600/copytosdcard.png"/></a></div>
<br />
<h3>Using the KCS PowerPC board with the new peripherals</h3>
<br />
The GoTek floppy emulator and the SCSI2SD device can also be used in the KCS PowerPC board emulator.<br />
<br />
In addition to Amiga floppy disks, the GoTek floppy emulator can also be used for emulating double density PC disks. The only inconvenience is that it is impossible to format an empty PC disk on the Amiga with CrossDOS.<br />
<br />
However, on my Linux machine, it is possible to create an empty 720 KiB disk image, format it as a DOS disk, and put the image file on the USB stick:<br />
<br />
<pre>
$ dd if=/dev/zero of=./mypcdisk.img bs=1k count=720
$ mkdosfs -n mydisk ./mypcdisk.img
</pre>
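<br />
Files can also be copied into the disk image from Linux before transferring it, for example with <i>mcopy</i> (this assumes that the mtools package is installed):<br />
<br />
<pre>
$ mcopy -i mypcdisk.img mydocument.txt ::
</pre>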
<br />
The KCS PowerPC emulator also makes it possible to use the Amiga's serial and parallel ports. As a result, I can also transfer files from my Linux PC by using a PC terminal client, such as <a href="https://en.wikipedia.org/wiki/Telix">Telix</a>:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi7KjE3O0YLlO6L4fsZpyfnp9P8LHIlf-5hpMCkJhbEpVyZSckw9nSZOP6f9f_F5Or8ipuVZy5d6m0j17QwMTC0U6VYwYGFs2sFjVrNH6sG_dj-DpEljjCHUfARsiwyv-aQF-P_18bADFZ9/s2048/telix2.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="1234" data-original-width="2048" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi7KjE3O0YLlO6L4fsZpyfnp9P8LHIlf-5hpMCkJhbEpVyZSckw9nSZOP6f9f_F5Or8ipuVZy5d6m0j17QwMTC0U6VYwYGFs2sFjVrNH6sG_dj-DpEljjCHUfARsiwyv-aQF-P_18bADFZ9/s600/telix2.jpg"/></a></div>
<br />
To connect to my Linux PC, I am using almost the same serial port settings as in the Workbench preferences. The only limitation is that I need to lower my baud rate -- it seems that Telix no longer works reliably for baud rates higher than 9600 bits per second.<br />
<br />
The KCS PowerPC board is a very capable PC emulator. Some PC aspects are handled by real hardware, so that there is no speed loss -- the board provides a real 8086/8088 compatible CPU and 1 MiB of memory.<br />
<br />
It also provides its own implementation of a system BIOS and VGA BIOS. As a result, text-mode DOS applications work as well as their native XT-PC counterparts, sometimes even slightly better.<br />
<br />
One particular aspect that is fully emulated in software is CGA/EGA/VGA graphics. As I have explained in a blog post written several years ago, <a href="https://sandervanderburg.blogspot.com/2013/11/emulating-amiga-display-modes.html">the Amiga uses bitplane encoding for graphics whereas PC hardware uses chunky graphics</a>. To allow PC graphics to be displayed, the data needs to be translated into planar graphics format, making graphics rendering very slow.<br />
<br />
For example, it is possible to run Microsoft Windows 3.0 (in <a href="https://en.wikipedia.org/wiki/Real_mode">real mode</a>) in the emulator, but the graphics are rendered very, very slowly:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiGzsZ1qdoEjzUa8zI9qa8Z4XEb1yVPZLo0NToZGOsk6KOn-qOiH7t-T4VrmyDzpTxUHc_d1h4GB8QMzMvS7UaFYbMc0L_GmW9wNBeyq4trh0mcTs8LCJbNrSo-wzRHmq3sRC67fRPOFy7G/s2048/windows30.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="1536" data-original-width="2048" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiGzsZ1qdoEjzUa8zI9qa8Z4XEb1yVPZLo0NToZGOsk6KOn-qOiH7t-T4VrmyDzpTxUHc_d1h4GB8QMzMvS7UaFYbMc0L_GmW9wNBeyq4trh0mcTs8LCJbNrSo-wzRHmq3sRC67fRPOFy7G/s600/windows30.jpg"/></a></div>
<br />
Interestingly enough, the game: <a href="http://legacy.3drealms.com/keenhistory/keenhistory4.html">Commander Keen</a> seems to work at an acceptable speed:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjwH8wUEPXFJH89-M5idZfS6zOk5mbX2Sr1L9lr_9IQ9mVBm_IVuQI2wVZUf8_DT_6xw_SPZMIhSCQxjfaYsW0lJEhCDQhh0rKm5sB9RpKdmgQhSSe1CS_KAPoLxhpym1unXk8zyo8HrC1A/s2048/keen.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="1536" data-original-width="2048" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjwH8wUEPXFJH89-M5idZfS6zOk5mbX2Sr1L9lr_9IQ9mVBm_IVuQI2wVZUf8_DT_6xw_SPZMIhSCQxjfaYsW0lJEhCDQhh0rKm5sB9RpKdmgQhSSe1CS_KAPoLxhpym1unXk8zyo8HrC1A/s600/keen.jpg"/></a></div>
<br />
I think Commander Keen runs so fast in the emulator (despite its slow graphics emulation), because of the <a href="https://en.wikipedia.org/wiki/Adaptive_tile_refresh">adaptive tile refresh</a> technique (updating the screen by only redrawing the necessary parts).<br />
<br />
<h2>File reading problems and crashes</h2>
<br />
Although all these replacement peripherals, such as the SCSI2SD, are nice, I was also running into a very annoying recurring problem.<br />
<br />
I have noticed that after using the SCSI2SD for a while, sometimes a file may get incorrectly read.<br />
<br />
Incorrectly read files lead to all kinds of interesting problems. For example, unpacking an LhA or Zip archive from the hard drive may sometimes result in one or more CRC errors. I have also noticed subtle screen and audio glitches while playing games stored on the SD card.<br />
<br />
A really annoying problem is when an executable is incorrectly read -- this typically results in program crashes with error codes 8000 0003 or 8000 0004. The former error is caused by executing a wrong CPU instruction.<br />
<br />
These read errors do not seem to happen all the time. For example, a file that was previously read incorrectly may open successfully on a later attempt, so it appears that files are correctly written to disk.<br />
<br />
After some investigation and comparing my SD card configuration with the old SCSI hard drive, I have noticed that the read speeds were a bit poor. <a href="https://sysinfo.d0.se/">SysInfo</a> shows me a read speed of roughly 698 KiB per second:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiE40DZUHX_rkBti6ZZe76lohnCYpZ1VisDlP9ylB0-hR1Pv6IZTufZZ1bG29V3Rlhj1f5WuKfP12h3N-aKgNx8Pdrsx_GNmonPmIymQ0TvCiu4w8P8eFSlfHB7cskTxMKXJBSWFum1Fa84/s2048/drivespeed-slow.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="1536" data-original-width="2048" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiE40DZUHX_rkBti6ZZe76lohnCYpZ1VisDlP9ylB0-hR1Pv6IZTufZZ1bG29V3Rlhj1f5WuKfP12h3N-aKgNx8Pdrsx_GNmonPmIymQ0TvCiu4w8P8eFSlfHB7cskTxMKXJBSWFum1Fa84/s600/drivespeed-slow.jpg"/></a></div>
<br />
By studying the MacroSystem Evolution manual (in German) and comparing the configuration with the Workbench installation on the old hard drive, I discovered that there is a burst mode option that can boost read performance.<br />
<br />
To enable burst mode, I need to copy the Evolution utilities from the MacroSystem Evolution driver disk to my hard drive (e.g. by copying <i>DF0:Evolution3</i> to <i>DH0:Programs/Evolution3</i>) and add the following command-line instruction to <i>S:User-Startup</i>:<br />
<br />
<pre style="font-size: 90%;">
DH0:Programs/Evolution3/Utilities/HDParms 0 NOCHANGE NOFORMAT NOCACHE BURST
</pre>
<br />
This results in read speeds that are roughly 30% faster:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjEpDdMrqKAM3twTDexO8zBv8cDSzwTjl6NOLT4DD2vHOrkMCfzES3ll-vbYk68QChFtuu8WJHKUl9yJqvvuXrkxjojUzoMTDq0iPICxleFfawvwgblXWUOtq_JC4dt8y-27-fc_SjdZGpy/s2048/drivespeed-fast.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="1536" data-original-width="2048" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjEpDdMrqKAM3twTDexO8zBv8cDSzwTjl6NOLT4DD2vHOrkMCfzES3ll-vbYk68QChFtuu8WJHKUl9yJqvvuXrkxjojUzoMTDq0iPICxleFfawvwgblXWUOtq_JC4dt8y-27-fc_SjdZGpy/s600/drivespeed-fast.jpg"/></a></div>
<br />
Unfortunately, faster read speeds also seem to dramatically increase the likelihood of read errors, making my system quite unreliable.<br />
<br />
I am still not completely sure what is causing these incorrect reads, but from my experiments I know that read speeds definitely have something to do with it. Restoring the configuration to no longer use burst mode (accepting slower reads) seems to make my system much more stable.<br />
<br />
I also learned that these read problems are very similar to problems reported about a wrong <i>MaxTransfer</i> value. <a href="http://eab.abime.net/showthread.php?t=68426">According to this page</a>, setting it to <i>0x1fe00</i> should be a safe value. I tried adjusting the <i>MaxTransfer</i> value, but it does not seem to change anything.<br />
<br />
Although my system seems to be stable enough after making these modifications, I would still like to expand my knowledge about this subject so that I can fully explain what is going on.<br />
<br />
<h2>Conclusion</h2>
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjx0a6ErzlEM2RzdWfxgMtvdNmrogyqe-gD5oMJkAqkbws8NDAtAzQ_JkOvFMRvpsD53H-fO-Y8SK_S2UCTM4wrWaV4XtdhGAflAYhioWaCow0yLinBxsGV-Vv0Bl4iAxrnhDi24kR3mUbx/s2048/protracker.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="1536" data-original-width="2048" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjx0a6ErzlEM2RzdWfxgMtvdNmrogyqe-gD5oMJkAqkbws8NDAtAzQ_JkOvFMRvpsD53H-fO-Y8SK_S2UCTM4wrWaV4XtdhGAflAYhioWaCow0yLinBxsGV-Vv0Bl4iAxrnhDi24kR3mUbx/s600/protracker.jpg"/></a></div>
<br />
It took me several months to figure out all these details, but with my replacement peripherals, my Commodore Amiga 500 works great again. The machine is more than 29 years old and I can still run all applications and games that I used to work with in the mid 1990s and more. Furthermore, data exchange with my Linux PC has become much easier.<br />
<br />
Back in the early 90s, I did not have the luxury of downloading software and information from the Internet.<br />
<br />
I also learned many new things about terminal connections. It seems that Linux (because of its UNIX heritage) has all kinds of nice facilities to expose itself as a terminal server.<br />
<br />
After visiting the home computer museum, I became more motivated to preserve my Amiga 500 in the best possible way. It seems that as of today, there are still replacement parts for sale and many things can be repaired.<br />
<br />
My recommendation is that if you still own a classic machine, do not just throw it away. You may regret it later.<br />
<br />
<h2>Future work</h2>
<br />
Aside from finding a proper explanation for the file reading problems, I am still searching for a real replacement floppy drive. Moreover, I still need to investigate whether the Genlock device can be repaired.<br />
Sander van der Burghttp://www.blogger.com/profile/12718166966821611609noreply@blogger.com0tag:blogger.com,1999:blog-1397115249631682228.post-44654118512353624692021-08-31T20:47:00.000+02:002021-08-31T20:47:33.189+02:00A more elaborate approach for bypassing NPM's dependency management features in Nix builds<a href="https://sandervanderburg.blogspot.com/2012/11/an-alternative-explaination-of-nix.html">Nix</a> is a <strong>general purpose</strong> package manager that can be used to automate the deployments of a variety of systems -- it can deploy components written in a variety of programming languages (e.g. C, C++, Java, Go, Rust, Perl, Python, JavaScript) using various kinds of technologies and frameworks, such as Django, Android, and Node.js.<br />
<br />
Another unique selling point of Nix is that it provides <strong>strong reproducibility guarantees</strong>. If a build succeeds on one machine, then performing the same build on another should result in a build that is (nearly) bit-identical.<br />
<br />
Nix improves build reproducibility by complementing build processes with features, such as:<br />
<br />
<ul>
<li>Storing all artifacts in isolation in a so-called <strong>Nix store</strong>: <i>/nix/store</i> (e.g. packages, configuration files), in which every path is made unique by prefixing it with an SHA256 hash code derived from all build inputs (e.g. dependencies, build scripts etc.). Isolated paths make it possible for multiple variants and versions of the same packages to safely co-exist.</li>
<li><strong>Clearing environment variables</strong> or setting them to dummy values. In combination with unique and isolated Nix store paths, search environment variables must be configured in such a way that the build script can find its dependencies in the Nix store, or it will fail.<br />
<br />
Having to specify all search environment variables may sound inconvenient, but it prevents undeclared dependencies from accidentally making a build succeed -- deployment of such a package is very likely to fail on a machine that misses an unknown dependency.</li>
<li>Running builds as an <strong>unprivileged user</strong> that does not have any rights to make modifications to the host system -- a build can only write in its designated temp folder or output paths.</li>
<li>Optionally running builds in a <strong>chroot environment</strong>, so that a build cannot possibly find any undeclared host system dependencies through hard-coded absolute paths.</li>
<li><strong>Restricting network access</strong> to prevent a build from obtaining unknown dependencies that may influence the build outcome.</li>
</ul>
<br />
For many build tools, the Nixpkgs repository provides abstraction functions that allow you to easily construct a package from source code (e.g. GNU Make, GNU Autotools, Apache Ant, Perl's MakeMaker, SCons etc.).<br />
<br />
However, certain tools are difficult to use in combination with Nix -- for example, <a href="http://npmjs.com">NPM</a> that is used to deploy <a href="http://nodejs.org">Node.js</a> projects.<br />
<br />
NPM is both a dependency and build manager and the former aspect conflicts with Nix -- builds in Nix are typically prevented from downloading files from remote network locations, with the exception of so-called fixed-output derivations in which the output hash is known in advance.<br />
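<br />
As an illustration, a fixed-output derivation declares the expected hash of its output up front, so that a network download can still be verified (a minimal sketch -- the hash shown is a placeholder that must match the actual tarball):<br />
<br />
<pre>
{ fetchurl }:

fetchurl {
  url = "https://registry.npmjs.org/underscore/-/underscore-1.13.1.tgz";
  hash = "sha512-..."; # placeholder: the real output hash must be known in advance
}
</pre>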
<br />
If network connections were allowed in regular builds, then Nix could no longer ensure that a build is reproducible (i.e. that the hash code in the Nix store path reflects the same build output derived from all inputs).<br />
<br />
To cope with the conflicting dependency management feature of NPM, various kinds of integrations have been developed. <a href="https://github.com/NixOS/npm2nix"><i>npm2nix</i></a> was the first, and <a href="https://sandervanderburg.blogspot.com/2014/10/deploying-npm-packages-with-nix-package.html">several years ago I have started <i>node2nix</i></a> to provide a solution that aims for accuracy.<br />
<br />
Basically, the build process of an NPM package in Nix boils down to performing the following steps in a Nix derivation:<br />
<br />
<pre>
# populate the node_modules/ folder
npm install --offline
</pre>
<br />
We must first obtain the required dependencies of a project through the Nix package manager and install them in the correct locations in the <i>node_modules/</i> directory tree.<br />
<br />
Finally, we should run NPM in offline mode, forcing it not to re-obtain or re-install any dependencies, but still perform build management tasks, such as running build scripts.<br />
<br />
From a high-level point of view, this principle may look simple, but in practice it is not:<br />
<br />
<ul>
<li>With earlier versions of NPM, we were forced to imitate its dependency resolution algorithm. At first sight, it looked simple, but getting it right (such as coping with circular dependencies and <a href="https://sandervanderburg.blogspot.com/2016/02/managing-npm-flat-module-installations.html">dependency de-duplication</a>) is much more difficult than expected.</li>
<li>NPM 5.x introduced lock files. For NPM development projects, they provide exact version specifiers of all dependencies and transitive dependencies, making it much easier to know which dependencies need to be installed.<br />
<br />
Unfortunately, NPM also introduced an offline cache, that prevents us from simply copying packages into the <i>node_modules/</i> tree. As a result, <a href="https://sandervanderburg.blogspot.com/2017/12/bypassing-npms-content-addressable.html">we need to make additional complex modifications to the <i>package.json</i></a> configuration files of all dependencies.<br />
<br />
Furthermore, end user package installations do not work with lock files, requiring us to still keep our custom implementation of the dependency resolution algorithm.</li>
<li>NPM's behaviour with dependencies on directories on the local file system has changed. In old versions of NPM, such dependencies were copied, but in newer versions, they are symlinked. Furthermore, each directory dependency maintains its own <i>node_modules/</i> directory for transitive dependencies.</li>
</ul>
<br />
Because we need to take many kinds of installation scenarios into account and work around the directory dependency challenges, the implementation of the build environment: <i>node-env.nix</i> in <i>node2nix</i> has become very complicated.<br />
<br />
It has become so complicated that I consider it a major impediment in making any significant changes to the build environment.<br />
<br />
In the last few weeks, I have been working on a companion tool named: <i>placebo-npm</i> that should simplify the installation process. Moreover, it should also fix a number of frequently reported issues.<br />
<br />
In this blog post, I will explain how the tool works.<br />
<br />
<h2>Lock-driven deployments</h2>
<br />
In NPM 5.x, <i>package-lock.json</i> files were introduced. The fact that they capture the exact versions of all dependencies and make all transitive dependencies known, makes certain aspects of an NPM deployment in a Nix build environment easier.<br />
<br />
For lock-driven projects, we no longer have to run our own implementation of the dependency resolution algorithm to figure out what the exact versions of all dependencies and transitive dependencies are.<br />
<br />
For example, a project with the following <i>package.json</i>:<br />
<br />
<pre>
{
  "name": "simpleproject",
  "version": "0.0.1",
  "dependencies": {
    "underscore": "*",
    "prom2cb": "github:svanderburg/prom2cb",
    "async": "https://mylocalserver/async-3.2.1.tgz"
  }
}
</pre>
<br />
may have the following <i>package-lock.json</i> file:<br />
<br />
<pre style="overflow: auto;">
{
  "name": "simpleproject",
  "version": "0.0.1",
  "lockfileVersion": 1,
  "requires": true,
  "dependencies": {
    "async": {
      "version": "https://mylocalserver/async-3.2.1.tgz",
      "integrity": "sha512-XdD5lRO/87udXCMC9meWdYiR+Nq6ZjUfXidViUZGu2F1MO4T3XwZ1et0hb2++BgLfhyJwy44BGB/yx80ABx8hg=="
    },
    "prom2cb": {
      "version": "github:svanderburg/prom2cb#fab277adce1af3bc685f06fa1e43d889362a0e34",
      "from": "github:svanderburg/prom2cb"
    },
    "underscore": {
      "version": "1.13.1",
      "resolved": "https://registry.npmjs.org/underscore/-/underscore-1.13.1.tgz",
      "integrity": "sha512-hzSoAVtJF+3ZtiFX0VgfFPHEDRm7Y/QPjGyNo4TVdnDTdft3tr8hEkD25a1jC+TjTuE7tkHGKkhwCgs9dgBB2g=="
    }
  }
}
</pre>
<br />
As you may notice, the <i>package.json</i> file declares three dependencies:<br />
<br />
<ul>
<li>The first dependency is <i>underscore</i> that refers to the latest version in the NPM registry. In the <i>package-lock.json</i> file, the dependency is frozen to version 1.13.1. The <i>resolved</i> property provides the URL where the tarball should be obtained from. Its <i>integrity</i> can be verified with the given SHA512 hash.</li>
<li>The second dependency: <i>prom2cb</i> refers to the latest revision of the <i>main</i> branch of the <i>prom2cb</i> Git repository on GitHub. In the <i>package-lock.json</i> file, it is pinpointed to the <i>fab277...</i> revision.</li>
<li>The third dependency: <i>async</i> refers to a tarball that is downloaded from an arbitrary HTTP URL. The <i>package-lock.json</i> records its SHA512 integrity hash to make sure that we can only deploy with the version that we have used previously.</li>
</ul>
<br />
As explained earlier, to ensure purity, in a Nix build environment, we cannot allow NPM to obtain the required dependencies of a project. Instead, we must let Nix obtain all the dependencies.<br />
<br />
When all dependencies have been obtained, we should populate the <i>node_modules/</i> folder of the project. In the above example, it is simply a matter of unpacking the tarballs or copying the Git clones into the <i>node_modules/</i> folder of the project. No transitive dependencies need to be deployed.<br />
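<br />
Done manually, this boils down to steps roughly like the following (a sketch; tarballs from the NPM registry unpack into a <i>package/</i> prefix that must be renamed to the package name):<br />
<br />
<pre>
$ mkdir -p node_modules
$ tar xzf underscore-1.13.1.tgz -C node_modules
$ mv node_modules/package node_modules/underscore
$ git clone https://github.com/svanderburg/prom2cb node_modules/prom2cb
</pre>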
<br />
For projects that do not rely on build scripts (which perform tasks such as linting, or compiling code written in languages such as TypeScript), this typically suffices to make a project work.<br />
<br />
However, when we also need build management, we need to run the full installation process:<br />
<br />
<pre style="overflow: auto;">
$ npm install --offline
npm ERR! code ENOTCACHED
npm ERR! request to https://registry.npmjs.org/async/-/async-3.2.1.tgz failed: cache mode is 'only-if-cached' but no cached response available.
npm ERR! A complete log of this run can be found in:
npm ERR! /home/sander/.npm/_logs/2021-08-29T12_56_13_978Z-debug.log
</pre>
<br />
Unfortunately, NPM still tries to obtain the dependencies, despite the fact that they have already been copied to the right locations in the <i>node_modules/</i> folder.<br />
<br />
<h2>Bypassing the offline cache</h2>
<br />
To cope with the problem that manually obtained dependencies cannot be detected, my initial idea was to use the NPM offline cache in a specific way.<br />
<br />
The offline cache claims to be <a href="https://en.wikipedia.org/wiki/Content-addressable_storage">content-addressable</a>, meaning that every item can be looked up by using a hash code that represents its contents, regardless of its origins. Unfortunately, it turns out that this property cannot be fully exploited.<br />
<br />
For example, when we obtain the <i>underscore</i> tarball (with the exact same contents) from a different URL:<br />
<br />
<pre>
$ npm cache add http://mylocalcache/underscore-1.13.1.tgz
</pre>
<br />
and run the installation in offline mode:<br />
<br />
<pre style="overflow: auto;">
$ npm install --offline
npm ERR! code ENOTCACHED
npm ERR! request to https://registry.npmjs.org/underscore/-/underscore-1.13.1.tgz failed: cache mode is 'only-if-cached' but no cached response available.
npm ERR! A complete log of this run can be found in:
npm ERR! /home/sander/.npm/_logs/2021-08-26T13_50_15_137Z-debug.log
</pre>
<br />
The installation still fails, despite the fact that we already have a tarball (with the exact same SHA512 hash) in our cache.<br />
<br />
However, downloading <i>underscore</i> from its original location (the NPM registry):<br />
<br />
<pre>
$ npm cache add underscore@1.13.1
</pre>
<br />
makes the installation succeed.<br />
<br />
The reason why downloading the same tarball from an arbitrary HTTP URL does not work is that NPM only computes a SHA1 hash for it. Obtaining a tarball from the NPM registry causes NPM to compute a SHA512 hash. Because the tarball was downloaded from a different source, NPM fails to match it against the SHA512 hash in the <i>package-lock.json</i> file.<br />
<br />
We also run into similar issues when we obtain an old package from the NPM registry that only has an SHA1 hash. Importing the same file from a local file path causes NPM to compute a SHA512 hash. As a result, <i>npm install</i> tries to re-obtain the same tarball from the remote location, because the hash was not recognized.<br />
<br />
To cope with these problems, <i>placebo-npm</i> will completely bypass the cache. After all dependencies have been copied to the <i>node_modules</i> folder, it modifies their <i>package.json</i> configuration files with hidden metadata properties to trick NPM that they came from their original locations.<br />
<br />
For example, to make the <i>underscore</i> dependency work (that is normally obtained from the NPM registry), we must add the following properties to the <i>package.json</i> file:<br />
<br />
<pre style="overflow: auto;">
{
  ...
  "_from": "underscore@https://registry.npmjs.org/underscore/-/underscore-1.13.1.tgz",
  "_integrity": "sha512-hzSoAVtJF+3ZtiFX0VgfFPHEDRm7Y/QPjGyNo4TVdnDTdft3tr8hEkD25a1jC+TjTuE7tkHGKkhwCgs9dgBB2g==",
  "_resolved": "https://registry.npmjs.org/underscore/-/underscore-1.13.1.tgz"
}
</pre>
<br />
For <i>prom2cb</i> (that is a Git dependency), we should add:<br />
<br />
<pre style="overflow: auto;">
{
  ...
  "_from": "github:svanderburg/prom2cb",
  "_integrity": "",
  "_resolved": "github:svanderburg/prom2cb#fab277adce1af3bc685f06fa1e43d889362a0e34"
}
</pre>
<br />
and for HTTP/HTTPS dependencies and local files we should do something similar (adding <i>_from</i> and <i>_integrity</i> fields).<br />
<br />
With these modifications, NPM will no longer attempt to consult the local cache, making the dependency installation step succeed.<br />
<br />
<h2>Handling directory dependencies</h2>
<br />
Another challenge is dependencies on local directories, which are frequently used for local development projects:<br />
<br />
<pre>
{
  "name": "simpleproject",
  "version": "0.0.1",
  "dependencies": {
    "underscore": "*",
    "prom2cb": "github:svanderburg/prom2cb",
    "async": "https://mylocalserver/async-3.2.1.tgz",
    "mydep": "../../mydep"
  }
}
</pre>
<br />
In the <i>package.json</i> file shown above, a new dependency has been added: <i>mydep</i> that refers to a relative local directory dependency: <i>../../mydep</i>.<br />
<br />
If we run <i>npm install</i>, then NPM creates a symlink to the folder in the project's <i>node_modules/</i> folder and installs the transitive dependencies in the <i>node_modules/</i> folder of the target dependency.<br />
<br />
If we want to deploy the same project to a different machine, then it is required to put <i>mydep</i> in the exact same relative location, or the deployment will fail.<br />
<br />
Deploying such an NPM project with Nix introduces a new problem -- all packages deployed by Nix are stored in the Nix store (typically <i>/nix/store</i>). After deploying the project, the relative path to the project (from the Nix store) will no longer be correct. Moreover, we also want Nix to automatically deploy the directory dependency as part of the deployment of the entire project.<br />
<br />
To cope with these inconveniences, we are required to implement a tricky solution -- we must rewrite directory dependencies in such a way that they refer to a folder that is automatically deployed by Nix. Furthermore, the dependency should still end up being a symlink to satisfy NPM -- copying directory dependencies into the <i>node_modules/</i> folder is not accepted by NPM.<br />
<br />
<h2>Usage</h2>
<br />
To conveniently install NPM dependencies from a local source (and satisfy <i>npm</i> in such a way that it believes the dependencies came from their original locations), I have created a tool called: <i>placebo-npm</i>.<br />
<br />
We can, for example, obtain all required dependencies ourselves and put them in a local cache folder:<br />
<br />
<pre>
$ mkdir /home/sander/mycache
$ wget https://mylocalserver/async-3.2.1.tgz
$ wget https://registry.npmjs.org/underscore/-/underscore-1.13.1.tgz
$ git clone https://github.com/svanderburg/prom2cb
</pre>
<br />
The deployment process that <i>placebo-npm</i> executes is driven by a <i>package-placebo.json</i> configuration file that has the following structure:<br />
<br />
<pre style="overflow: auto;">
{
  "integrityHashToFile": {
    "sha512-hzSoAVtJF+3ZtiFX0VgfFPHEDRm7Y/QPjGyNo4TVdnDTdft3tr8hEkD25a1jC+TjTuE7tkHGKkhwCgs9dgBB2g==": "/home/sander/mycache/underscore-1.13.1.tgz",
    "sha512-XdD5lRO/87udXCMC9meWdYiR+Nq6ZjUfXidViUZGu2F1MO4T3XwZ1et0hb2++BgLfhyJwy44BGB/yx80ABx8hg==": "/home/sander/mycache/async-3.2.1.tgz"
  },
  "versionToFile": {
    "github:svanderburg/prom2cb#fab277adce1af3bc685f06fa1e43d889362a0e34": "/home/sander/mycache/prom2cb"
  },
  "versionToDirectoryCopyLink": {
    "file:../dep": "/home/sander/alternatedir/dep"
  }
}
</pre>
<br />
The placebo config maps dependencies in a <i>package-lock.json</i> file to local file references:<br />
<br />
<ul>
<li><i>integrityHashToFile</i> maps dependencies with an <i>integrity</i> hash to local files, which is useful for HTTP/HTTPS dependencies, registry dependencies, and local file dependencies.</li>
<li><i>versionToFile</i>: maps dependencies with a <i>version</i> property to local directories. This is useful for Git dependencies.</li>
<li><i>versionToDirectoryCopyLink</i>: specifies directories that need to be copied into a shadow directory named: <i>placebo_node_dirs</i>, with symlinks to these shadow directories created in the <i>node_modules/</i> folder. This is useful for installing directory dependencies from arbitrary locations, as illustrated after this list.</li>
</ul>
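<br />
For the directory dependency in the example above, the resulting structure could look roughly as follows (an illustration based on the description above, not actual tool output; the exact location of the shadow directory is an assumption):<br />
<br />
<pre>
placebo_node_dirs/
  dep/                           # copy of /home/sander/alternatedir/dep
node_modules/
  dep -> ../placebo_node_dirs/dep
</pre>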
<br />
With the following command, we can install all required dependencies from the local cache directory and make all necessary modifications to let NPM accept the dependencies:<br />
<br />
<pre>
$ placebo-npm package-placebo.json
</pre>
<br />
Finally, we can run:<br />
<br />
<pre>
$ npm install --offline
</pre>
<br />
The above command does not attempt to re-obtain or re-install the dependencies, but still performs all required build management tasks.<br />
<br />
<h2>Integration with Nix</h2>
<br />
All the functionality that <i>placebo-npm</i> provides has already been implemented in the <i>node-env.nix</i> module, but over the years it has evolved into a very complex beast -- it is implemented as a series of Nix functions that generates shell code.<br />
<br />
As a consequence, it suffers from recursion problems and makes it extremely difficult to tweak/adjust build processes, such as modifying environment variables or injecting arbitrary build steps to work around Nix integration problems.<br />
<br />
With <i>placebo-npm</i> we can reduce the Nix expression that builds projects (<i>buildNPMProject</i>) to an implementation that roughly has the following structure:<br />
<br />
<pre style="overflow: auto;">
{stdenv, nodejs, placebo-npm}:
{pname, packagePlacebo, buildInputs ? [], extraArgs ? {}}:

stdenv.mkDerivation ({
  pname = builtins.replaceStrings [ "@" "/" ] [ "_at_" "_slash_" ] pname; # Escape characters that aren't allowed in a store path
  placeboJSON = builtins.toJSON packagePlacebo;
  passAsFile = [ "placeboJSON" ];
  buildInputs = [ nodejs placebo-npm ] ++ buildInputs;

  buildPhase = ''
    runHook preBuild
    true
    runHook postBuild
  '';
  installPhase = ''
    runHook preInstall

    mkdir -p $out/lib/node_modules/${pname}
    mv * $out/lib/node_modules/${pname}
    cd $out/lib/node_modules/${pname}

    placebo-npm --placebo $placeboJSONPath
    npm install --offline

    runHook postInstall
  '';
} // extraArgs)
</pre>
<br />
As may be observed, the implementation is much more compact and fits easily on one screen. The function accepts a <i>packagePlacebo</i> attribute set as a parameter (that gets translated into a JSON file by the Nix package manager).<br />
<br />
Aside from some simple housekeeping work, most of the complex work has been delegated to executing <i>placebo-npm</i> inside the build environment, before we run <i>npm install</i>.<br />
<br />
The function above is also tweakable -- it is possible to inject arbitrary environment variables and adjust the build process through build hooks (e.g. <i>preInstall</i> and <i>postInstall</i>).<br />
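<br />
Assuming the first argument set has been applied (e.g. with callPackage), a call to this function could then look roughly as follows (a sketch that follows the structure above; the exact calling convention may differ):<br />
<br />
<pre>
buildNPMProject {
  pname = "simpleproject";
  packagePlacebo = builtins.fromJSON (builtins.readFile ./package-placebo.json);
}
</pre>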
<br />
Another bonus feature of delegating all dependency installation functionality to the <i>placebo-npm</i> tool is that we can also use this tool as a build input for other kinds of projects -- we can use it in the construction process of systems that are built from monolithic repositories, in which NPM is invoked from the build process of the encapsulating project.<br />
<br />
The only requirement is to run <i>placebo-npm</i> before <i>npm install</i> is invoked.<br />
<br />
<h2>Other use cases</h2>
<br />
In addition to using <i>placebo-npm</i> as a companion tool for <i>node2nix</i> and setting up a simple local cache, it can also be useful to facilitate offline installations from external media, such as USB flash drives.<br />
<br />
<h2>Discussion</h2>
<br />
With <i>placebo-npm</i> we can considerably simplify the implementation of <i>node-env.nix</i> (part of <i>node2nix</i>) making it much easier to maintain. I consider the <i>node-env.nix</i> module the second most complicated aspect of <i>node2nix</i>.<br />
<br />
As a side effect, it has also become quite easy to provide tweakable build environments -- this should solve a large number of reported issues. Many reported issues are caused by the fact that it is difficult or sometimes impossible to make changes to a project so that it will cleanly deploy.<br />
<br />
Moreover, <i>placebo-npm</i> can also be used as a build input for projects built from monolithic repositories, in which a subset needs to be deployed by NPM.<br />
<br />
The integration of the new <i>node-env.nix</i> implementation into <i>node2nix</i> is not completely done yet. I have reworked it, but the part that generates the <i>package-placebo.json</i> file and lets Nix obtain all required dependencies is still a work-in-progress.<br />
<br />
I am experimenting with two implementations: a static approach that generates Nix expressions, and a dynamic implementation that directly consumes a <i>package-lock.json</i> file in the Nix expression language. Both approaches have pros and cons. As a result, <i>node2nix</i> needs to combine both of them into a hybrid approach.<br />
<br />
In a next blog post, I will explain more about them.<br />
<br />
<h2>Availability</h2>
<br />
The initial version of <i>placebo-npm</i> can be obtained from <a href="https://github.com/svanderburg/placebo-npm">my GitHub page</a>.<br />
<br />
Sander van der Burghttp://www.blogger.com/profile/12718166966821611609noreply@blogger.com1tag:blogger.com,1999:blog-1397115249631682228.post-19316116288508085982021-06-01T22:07:00.000+02:002021-06-01T22:07:44.874+02:00An unconventional method for creating backups and exchanging files<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhernn4pnX-77VLIdRZOcJlapEJ_d9LT59SLjdHsyBEDwejWYXPZ12B1xS4ZHiJfc9eOr874pTL9iXDGJsNo3rakPEJlNkHdyrKUqnK4AAUkrKKVNpvFSnr2-Q8uS9fEPwW08braHso4P6H/s600/floppydisk.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="480" data-original-width="600" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhernn4pnX-77VLIdRZOcJlapEJ_d9LT59SLjdHsyBEDwejWYXPZ12B1xS4ZHiJfc9eOr874pTL9iXDGJsNo3rakPEJlNkHdyrKUqnK4AAUkrKKVNpvFSnr2-Q8uS9fEPwW08braHso4P6H/s600/floppydisk.jpg"/></a></div>
<br />
I have written many blog posts about <a href="https://sandervanderburg.blogspot.com/2011/10/software-deployment-complexity.html">software deployment</a> and configuration management. For example, a couple of years ago, I have discussed <a href="https://sandervanderburg.blogspot.com/2015/10/setting-up-basic-software-configuration.html">a very basic configuration management process for small organizations</a>, in which I explained that one of the worst things that could happen is that a machine breaks down and everything that it provides gets lost.<br />
<br />
Fortunately, good configuration management practices and deployment tools (such as <a href="https://sandervanderburg.blogspot.com/2012/11/an-alternative-explaination-of-nix.html">Nix</a>) can help you to restore a machine's configuration with relative ease.<br />
<br />
<a href="https://sandervanderburg.blogspot.com/2012/03/deployment-of-mutable-components.html">Another problem is managing a machine's <strong>data</strong></a>, which in many ways is even more important and complicated -- software packages can be typically obtained from a variety of sources, but data is typically <strong>unique</strong> (and therefore more valuable).<br />
<br />
Even if a machine stays operational, the data that it stores can still be at risk -- it may get deleted by accident, or corrupted (for example, by the user, or a hardware problem).<br />
<br />
It also does not matter whether a machine is used for business (for example, storing data for information systems) or personal use (for example, documents, pictures, and audio files). In both cases, data is valuable, and as a result, needs to be protected from loss and corruption.<br />
<br />
In addition to recovery, the <strong>availability</strong> of data is often also very important -- many users (including me) typically own multiple devices (e.g. a desktop PC, laptop and phone) and typically want access to the same data from multiple places.<br />
<br />
Because of the importance of data, I sometimes get questions from non-technical users that want to know how I manage my personal data (such as documents, images and audio files) and what tools I would recommend.<br />
<br />
Similar to most computer users, I too have faced my own share of reliability problems -- of all the desktop computers I owned, I ended up with a completely broken hard drive three times, and a completely broken laptop once. Furthermore, I have also worked with all kinds of external media (e.g. floppy disks, CD-ROMs etc.) each having their own share of reliability problems.<br />
<br />
To cope with data availability and loss, I came up with a custom script that I have been conveniently using to create backups and synchronize my data between the machines that I use.<br />
<br />
In this blog post, I will explain how this script works.<br />
<br />
<h2>About storage media</h2>
<br />
To cope with the potential loss of data, I have always made it a habit to transfer data to external media. I have worked with a variety of them, each having their advantages and disadvantages:<br />
<br />
<ul>
<li>In the old days, I used <strong>floppy disks</strong>. Most people who are (at the time of reading this blog post) in their early twenties or younger probably have no clue what I am talking about (for those people, perhaps the 'Save icon' used in many desktop applications looks familiar).<br />
<br />
Roughly 25 years ago, floppy disks were a common means to exchange data between computers.<br />
<br />
Although they were common, they had many drawbacks. Probably the biggest drawback was their limited storage capacity -- I used to own 5.25 inch disks that (on PCs) were capable of storing ~360 KiB (if both sides are used), and the more sturdy 3.5 inch disks providing double density (720 KiB) and high density capacity (1.44 MiB).<br />
<br />
Furthermore, floppy disks were also quite slow and could be easily damaged, for example, by touching the magnetic surface.</li>
<li>When I switched from the <a href="https://sandervanderburg.blogspot.com/2011/07/second-computer.html">Commodore Amiga</a> to the PC, I also used <strong>tapes</strong> for a while in addition to floppy disks. They provided a substantial amount of storage capacity (~500 MiB in 1996). As of 2019 (and this probably still applies to today), <a href="https://searchdatabackup.techtarget.com/news/252459656/How-is-tape-backup-doing-in-2019">tapes are still considered very cheap and reliable media for archival of data</a>.<br />
<br />
What I found impractical about tapes is that they are difficult to use for random access -- data on a tape is stored sequentially. As a consequence, it is typically very slow to find files or to "update" existing files. Typically, a backup tool needs to scan the tape from the beginning to the end or maintain a database with known storage locations.<br />
<br />
Many of my personal files (such as documents) are regularly updated and older versions do not have to be retained. Instead, they should be removed to clear up storage space. With tapes this is very difficult to do.</li>
<li>
When <strong>writable CD/DVDs</strong> became affordable, I used them as a backup media for a while. Similar to tapes, they also have substantial storage capacity. Furthermore, they are very fast and convenient to read.<br />
<br />
A similar disadvantage is that they are not a very convenient medium for updating files. Although it is possible to write multi-session discs, in which files can be added, overwritten, or made invisible (essentially a "soft delete"), it remained inconvenient because you cannot clear up the storage space that a deleted file used to occupy.<br />
<br />
I also learned the hard way that writable discs (and in particular rewritable discs) are not very reliable for long term storage -- I have discarded many old writable discs (10 years or older) that can no longer be read.</li>
</ul>
<br />
Nowadays, I use a variety of USB storage devices (such as memory sticks and hard drives) as backup media. They are relatively cheap, fast, have more than enough storage capacity, and I can use them as random access storage -- it is no problem at all to update and delete existing data.<br />
<br />
To cope with the potential breakage of USB storage media, I always make sure that I have at least two copies of my important data.<br />
<br />
<h2>About data availability</h2>
<br />
As already explained in the introduction, I have multiple devices for which I want the same data to be available. For example, on both my desktop PC and company laptop, I want to have access to my music and research papers collection.<br />
<br />
A possible solution is to use a <strong>shared storage</strong> medium, such as a network drive. The advantage of this approach is that there is a single source of truth and I only need to maintain a single data collection -- when I add a new document it will immediately be available to both devices.<br />
<br />
Although a network drive may be a possible solution, it is not a good fit for my use cases -- I typically use laptops for traveling. When I am not at home, I can no longer access my data stored on the network drive.<br />
<br />
Another solution is to <strong>transfer</strong> all required files to the hard drive on my laptop. Doing a bulk transfer for the first time is typically not a big problem (in particular, if you use <a href="https://sandervanderburg.blogspot.com/2018/01/syntax-highlighting-nix-expressions-in.html">orthodox file managers</a>), but keeping collections of files up-to-date between machines is in my experience quite tedious to do by hand.<br />
<br />
<h2>Automating data synchronization</h2>
<br />
For both backing up and synchronizing files to other machines I need to regularly compare and update files in directories. In the former case, I need to sync data between local directories, and for the latter I need to sync data between directories on remote machines.<br />
<br />
Each time I want to make updates to my files, I want to inspect what has changed and see which files require updating before actually doing it, so that I do not end up wasting time or risk modifying the wrong files.<br />
<br />
Initially, I started to investigate how to implement a synchronization tool myself, but quite quickly I realized that there is already a tool available that is quite suitable for the job: <a href="https://rsync.samba.org/"><i>rsync</i></a>.<br />
<br />
rsync is designed to efficiently transfer and synchronize files between drives and machines across networks by comparing the modification times and sizes of files.<br />
<br />
The only thing that I consider a drawback is that it is not fully optimized to conveniently automate my personal workflow -- to accomplish what I want, I need to memorize all the relevant <i>rsync</i> command-line options and run multiple command-line instructions.<br />
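<br />
For reference, the kind of rsync invocations that this workflow involves look roughly as follows (an illustrative approximation, not the exact commands the script runs -- a dry run for inspection, followed by the actual transfer):<br />
<br />
<pre>
$ rsync -avi --dry-run /home/sander/Documents/ /media/MyBackupDrive/Documents/
$ rsync -avi /home/sander/Documents/ /media/MyBackupDrive/Documents/
</pre>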
<br />
To alleviate this problem, I have created a custom script, that evolved into a tool that I have named: <i>gitlike-rsync</i>.<br />
<br />
<h2>Usage</h2>
<br />
<i>gitlike-rsync</i> is a tool that facilitates synchronisation of file collections between directories on local or remote machines using <i>rsync</i> and a workflow that is similar to managing <a href="https://git-scm.com/">Git</a> projects.<br />
<br />
<h3>Making backups</h3>
<br />
For example, if we have a data directory that we want to back up to another partition (for example, that refers to an external USB drive), we can open the directory:<br />
<br />
<pre>
$ cd /home/sander/Documents
</pre>
<br />
and configure a destination directory, such as a directory on a backup drive (<i>/media/MyBackupDrive/Documents</i>):<br />
<br />
<pre>
$ gitlike-rsync destination-add /media/MyBackupDrive/Documents
</pre>
<br />
By running the following command-line instruction, we can create a backup of the <i>Documents</i> folder:<br />
<br />
<pre>
$ gitlike-rsync push
sending incremental file list
.d..tp..... ./
>f+++++++++ bye.txt
>f+++++++++ hello.txt
sent 112 bytes received 25 bytes 274.00 bytes/sec
total size is 10 speedup is 0.07 (DRY RUN)
Do you want to proceed (y/N)? y
sending incremental file list
.d..tp..... ./
>f+++++++++ bye.txt
4 100% 0.00kB/s 0:00:00 (xfr#1, to-chk=1/3)
>f+++++++++ hello.txt
6 100% 5.86kB/s 0:00:00 (xfr#2, to-chk=0/3)
sent 202 bytes received 57 bytes 518.00 bytes/sec
total size is 10 speedup is 0.04
</pre>
<br />
The output above shows me the following:<br />
<br />
<ul>
<li>When no additional command-line parameters have been provided, the script will first do a dry run and show the user what it intends to do. In the above example, it shows me that it wants to transfer the contents of the <i>Documents</i> folder that consists of only two files: <i>hello.txt</i> and <i>bye.txt</i>.</li>
<li>After providing my confirmation, the files in the destination directory will be updated -- the backup drive that is mounted on <i>/media/MyBackupDrive</i>.</li>
</ul>
<br />
I can conveniently make updates in my documents folder and update my backups.<br />
<br />
For example, I can add a new file to the <i>Documents</i> folder named: <i>greeting.txt</i>, and run the <i>push</i> command again:<br />
<br />
<pre>
$ gitlike-rsync push
sending incremental file list
.d..t...... ./
>f+++++++++ greeting.txt
sent 129 bytes received 22 bytes 302.00 bytes/sec
total size is 19 speedup is 0.13 (DRY RUN)
Do you want to proceed (y/N)? y
sending incremental file list
.d..t...... ./
>f+++++++++ greeting.txt
9 100% 0.00kB/s 0:00:00 (xfr#1, to-chk=1/4)
sent 182 bytes received 38 bytes 440.00 bytes/sec
total size is 19 speedup is 0.09
</pre>
<br />
In the above output, only the <i>greeting.txt</i> file is transferred to the backup partition, leaving the other files untouched, because they have not changed.<br />
<br />
<h3>Restoring files from a backup</h3>
<br />
In addition to the <i>push</i> command, <i>gitlike-rsync</i> also supports <i>pull</i> that can be used to sync data from the configured destination folders. The <i>pull</i> command can be used as a means to restore data from a backup partition.<br />
<br />
For example, if I accidentally delete a file from the <i>Documents</i> folder:<br />
<br />
<pre>
$ rm hello.txt
</pre>
<br />
and run the <i>pull</i> command:<br />
<br />
<pre>
$ gitlike-rsync pull
sending incremental file list
.d..t...... ./
>f+++++++++ hello.txt
sent 137 bytes received 22 bytes 318.00 bytes/sec
total size is 19 speedup is 0.12 (DRY RUN)
Do you want to proceed (y/N)? y
sending incremental file list
.d..t...... ./
>f+++++++++ hello.txt
6 100% 0.00kB/s 0:00:00 (xfr#1, to-chk=0/4)
sent 183 bytes received 38 bytes 442.00 bytes/sec
total size is 19 speedup is 0.09
</pre>
<br />
the script is able to detect that <i>hello.txt</i> was removed and restore it from the backup partition.<br />
<br />
<h2>Synchronizing files between machines in a network</h2>
<br />
In addition to local directories, which are useful for backups, the <i>gitlike-rsync</i> script can also be used in a similar way to exchange files between machines, such as my desktop PC and office laptop.<br />
<br />
With the following command-line instruction, I can automatically clone the <i>Documents</i> folder from my desktop PC to the <i>Documents</i> folder on my office laptop:<br />
<br />
<pre>
$ gitlike-rsync clone sander@desktop-pc:/home/sander/Documents
</pre>
<br />
The above command connects to my desktop PC over SSH and retrieves the content of the <i>Documents/</i> folder. It will also automatically configure the destination directory to synchronize with the <i>Documents</i> folder on the desktop PC.<br />
<br />
When new documents have been added on the desktop PC, I just have to run the following command on my office laptop to update it:<br />
<br />
<pre>
$ gitlike-rsync pull
</pre>
<br />
I can also modify the contents of the <i>Documents</i> folder on my office laptop and synchronize the changed files to my desktop PC with a <i>push</i>:<br />
<br />
<pre>
$ gitlike-rsync push
</pre>
<br />
<h2>About versioning</h2>
<br />
As explained in the beginning of this blog post, in addition to the recovery of failing machines and equipment, another important reason to create backups is to protect yourself against accidental modifications.<br />
<br />
Although <i>gitlike-rsync</i> can detect and display file changes, it does <strong>not</strong> do any <strong>versioning</strong> of any kind. This feature is deliberately left unimplemented, for very good reasons.<br />
<br />
For most of my personal files (e.g. images, audio, video) I do not need any versioning. As soon as they are organized, they are not supposed to be changed.<br />
<br />
However, for certain kinds of files I do need versioning, such as software development projects. Whenever I need versioning, my answer is very simple: I use the "ordinary" Git, even for projects that are private and not supposed to be shared on a public hosting service, such as <a href="https://github.com">GitHub</a>.<br />
<br />
As seasoned Git users probably already know, you can turn any local directory into a Git repository, by running:<br />
<br />
<pre>
$ git init
</pre>
<br />
The above command creates a local <i>.git</i> folder that tracks and stores changes locally.<br />
<br />
When cloning a repository from a public hosting service, such as GitHub, a remote named <i>origin</i> is automatically configured, making it possible to push changes to and pull changes from GitHub.<br />
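<br />
For example, cloning my publicly hosted custom-scripts repository (introduced at the end of this blog post) automatically sets up such a remote, as <i>git remote -v</i> shows:<br />
<br />
<pre>
$ git clone https://github.com/svanderburg/custom-scripts.git
$ cd custom-scripts
$ git remote -v
origin  https://github.com/svanderburg/custom-scripts.git (fetch)
origin  https://github.com/svanderburg/custom-scripts.git (push)
</pre>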
<br />
It is also possible to synchronize Git changes between arbitrary computers using a private SSH connection. I can, for example, configure a remote for a private repository, as follows:<br />
<br />
<pre style="font-size: 90%; overflow: auto;">
$ git remote add origin sander@desktop-pc:/home/sander/Development/private-project
</pre>
<br />
The above command configures the Git project that is stored in the <i>/home/sander/Development/private-project</i> directory on my desktop PC as a remote.<br />
<br />
I can pull changes from the remote repository, by running:<br />
<br />
<pre>
$ git pull origin
</pre>
<br />
and push locally stored changes, by running:<br />
<br />
<pre>
$ git push origin
</pre>
<br />
As you may have noticed, the above workflow is very similar to the document exchange workflow shown earlier in this blog post.<br />
<br />
What about backing up private Git repositories? To do this, I typically create tarballs of the Git project directories and sync them to my backup media with <i>gitlike-rsync</i>. <a href="https://superuser.com/questions/219668/how-do-i-backup-a-git-repo">The presence of the <i>.git</i> folder suffices to retain a project's history</a>.<br />
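<br />
For example, a sketch of this backup routine could look as follows (assuming the tarball is stored in the <i>Documents</i> folder that is already configured for synchronization):<br />
<br />
<pre>
$ tar -czf ~/Documents/private-project.tar.gz \
    -C ~/Development private-project
$ cd ~/Documents
$ gitlike-rsync push
</pre>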
<br />
<h2>Conclusion</h2>
<br />
In this blog post, I have described <i>gitlike-rsync</i>, a simple opinionated wrapper script for exchanging files between local directories (for backups) and remote directories (for data exchange between machines).<br />
<br />
As its name implies, it heavily builds on top of <i>rsync</i> for efficient data exchange, and the concepts of <i>git</i> as an inspiration for the workflow.<br />
<br />
I have been conveniently using this script for over ten years, and it works extremely well for my own use cases and a variety of operating systems (Linux, Windows, macOS and FreeBSD).<br />
<br />
My solution is obviously not rocket science -- my contribution is only the workflow automation. The "true credits" should go to the developers of rsync and Git.<br />
<br />
I also have to thank the COVID-19 crisis that allowed me to finally find the time to polish the script, document it and give it a name. In the Netherlands, as of today, there are still many restrictions, but the situation is slowly getting better.<br />
<br />
<h2>Availability</h2>
<br />
I have added the <i>gitlike-rsync</i> script described in this blog post to my <a href="https://github.com/svanderburg/custom-scripts">custom-scripts repository</a> that can be obtained from my GitHub page.<br />
<br />
Sander van der Burghttp://www.blogger.com/profile/12718166966821611609noreply@blogger.com0tag:blogger.com,1999:blog-1397115249631682228.post-27355059361632784182021-04-26T21:32:00.000+02:002021-04-26T21:32:43.300+02:00A test framework for the Nix process management frameworkAs already explained in many previous blog posts, the <a href="https://sandervanderburg.blogspot.com/2020/02/a-declarative-process-manager-agnostic.html">Nix process management framework</a> adds new ideas to earlier service management concepts explored in Nixpkgs and <a href="https://sandervanderburg.blogspot.com/2011/01/nixos-purely-functional-linux.html">NixOS</a>:<br />
<br />
<ul>
<li>It makes it possible to deploy services on <strong>any operating system</strong> that can work with the Nix package manager, including conventional Linux distributions, macOS and FreeBSD. It also works on NixOS, but NixOS is not a requirement.</li>
<li>It allows you to construct <strong>multiple instances</strong> of the same service, by using constructor functions that identify conflicting configuration parameters. These constructor functions can be invoked in such a way that these configuration properties no longer conflict.</li>
<li>We can target <strong>multiple process managers</strong> from the same high-level deployment specifications. These high-level specifications are automatically translated to parameters for a target-specific configuration function for a specific process manager.<br />
<br />
It is also possible to override or augment the generated parameters, to work with configuration properties that are not universally supported.</li>
<li>There is a configuration option that conveniently allows you to disable user changes, making it possible to deploy services as an <strong>unprivileged user</strong>.</li>
</ul>
<br />
Although the above features are interesting, one particular challenge is that the framework <strong>cannot guarantee</strong> that all possible variations will work after writing a high-level process configuration. The framework facilitates code reuse, but it is not a write once, run anywhere approach.<br />
<br />
To make it possible to validate multiple service variants, I have developed a test framework that is built on top of the <a href="https://sandervanderburg.blogspot.com/2011/02/using-nixos-for-declarative-deployment.html">NixOS test driver</a> that makes it possible to deploy and test a network of NixOS QEMU virtual machines with very minimal storage and RAM overhead.<br />
<br />
In this blog post, I will describe how the test framework can be used.<br />
<br />
<h2>Automating tests</h2>
<br />
Before developing the test framework, I was mostly testing all my packaged services manually. Because a manual test process is tedious and time consuming, I did not have any test coverage for anything but the most trivial example services. As a result, I frequently ran into many configuration breakages.<br />
<br />
Typically, when I want to test a process instance, or a system that is composed of multiple collaborative processes, I perform the following steps:<br />
<br />
<ul>
<li>First, I need to <strong>deploy</strong> the system for a specific process manager and configuration profile, e.g. for a privileged or unprivileged user, in an isolated environment, such as a virtual machine or container.</li>
<li>Then I need to <strong>wait</strong> for all process instances to become <strong>available</strong>. Readiness checks are critical and typically more complicated than expected -- for most services, there is a time window between a successful invocation of a process and its availability to carry out its primary task, such as accepting network connections. Executing tests before a service is ready typically results in errors.<br />
<br />
Although there are process managers that can generally deal with this problem (e.g. <a href="https://www.freedesktop.org/wiki/Software/systemd">systemd</a> has the <a href="https://www.freedesktop.org/software/systemd/man/sd_notify.html"><i>sd_notify</i></a> protocol and <a href="https://skarnet.org/software/s6">s6</a> its <a href="https://skarnet.org/software/s6/notifywhenup.html">own protocol</a> and a <a href="https://skarnet.org/software/misc/sdnotify-wrapper.c"><i>sd_notify</i> wrapper</a>), the lack of a standardized and universally adopted protocol still requires me to implement readiness checks manually (a sketch of such a check follows after this list).<br />
<br />
(As a sidenote: the only readiness check protocol that is standardized is for <a href="https://www.freedesktop.org/software/systemd/man/daemon.html">traditional System V services that daemonize on their own</a>. The calling parent process should terminate almost immediately, but only after the spawned daemon child process notifies it that it is ready.<br />
<br />
As described in <a href="https://sandervanderburg.blogspot.com/2020/01/writing-well-behaving-daemon-in-c.html">an earlier blog post</a>, this notification aspect is more complicated to implement than I thought. Moreover, not all traditional System V daemons follow this protocol.)</li>
<li>When all process instances are ready, I can <strong>check</strong> whether they properly carry out their tasks, and whether the integration of these processes work as expected.</li>
</ul>
<br />
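For services that implement no readiness notification protocol, such a manual readiness check usually boils down to polling. A minimal shell sketch (assuming an HTTP service listening on port 5000, one of the ports used in the example below) could look as follows:<br />
<br />
<pre>
# Poll until the service accepts HTTP connections
$ while ! curl --fail --silent http://localhost:5000 >/dev/null
do
    sleep 1
done
</pre>
<br />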
<h3>An example</h3>
<br />
I have developed a Nix function: <i>testService</i> that automates the above process using the NixOS test driver -- I can use this function to create a test suite for systems that are made out of running processes, such as the webapps example described in my previous blog posts about the Nix process management framework.<br />
<br />
The example system consists of a number of <i>webapp</i> processes with an embedded HTTP server returning HTML pages displaying their identities. Nginx reverse proxies forward incoming connections to the appropriate <i>webapp</i> processes by using their corresponding virtual host header values:<br />
<br />
<pre style="overflow: auto;">
{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, libDir ? "${stateDir}/lib"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager
}:
let
sharedConstructors = import ../../../examples/services-agnostic/constructors/constructors.nix {
inherit pkgs stateDir runtimeDir logDir cacheDir libDir tmpDir forceDisableUserChange processManager;
};
constructors = import ../../../examples/webapps-agnostic/constructors/constructors.nix {
inherit pkgs stateDir runtimeDir logDir tmpDir forceDisableUserChange processManager;
webappMode = null;
};
in
rec {
webapp1 = rec {
port = 5000;
dnsName = "webapp1.local";
pkg = constructors.webapp {
inherit port;
instanceSuffix = "1";
};
};
webapp2 = rec {
port = 5001;
dnsName = "webapp2.local";
pkg = constructors.webapp {
inherit port;
instanceSuffix = "2";
};
};
webapp3 = rec {
port = 5002;
dnsName = "webapp3.local";
pkg = constructors.webapp {
inherit port;
instanceSuffix = "3";
};
};
webapp4 = rec {
port = 5003;
dnsName = "webapp4.local";
pkg = constructors.webapp {
inherit port;
instanceSuffix = "4";
};
};
nginx = rec {
port = if forceDisableUserChange then 8080 else 80;
webapps = [ webapp1 webapp2 webapp3 webapp4 ];
pkg = sharedConstructors.nginxReverseProxyHostBased {
inherit port webapps;
} {};
};
webapp5 = rec {
port = 5004;
dnsName = "webapp5.local";
pkg = constructors.webapp {
inherit port;
instanceSuffix = "5";
};
};
webapp6 = rec {
port = 5005;
dnsName = "webapp6.local";
pkg = constructors.webapp {
inherit port;
instanceSuffix = "6";
};
};
nginx2 = rec {
port = if forceDisableUserChange then 8081 else 81;
webapps = [ webapp5 webapp6 ];
pkg = sharedConstructors.nginxReverseProxyHostBased {
inherit port webapps;
instanceSuffix = "2";
} {};
};
}
</pre>
<br />
The processes model shown above (<i>processes-advanced.nix</i>) defines the following process instances:<br />
<br />
<ul>
<li>There are six <i>webapp</i> process instances, each running an embedded HTTP service, returning HTML pages with their identities. The <i>dnsName</i> property specifies the DNS domain name value that should be used as a virtual host header to make the forwarding from the reverse proxies work.</li>
<li>There are two <i>nginx</i> reverse proxy instances. The former: <i>nginx</i> forwards incoming connections to the first four <i>webapp</i> instances. The latter: <i>nginx2</i> forwards incoming connections to <i>webapp5</i> and <i>webapp6</i>.</li>
</ul>
<br />
With the following command, I can connect to <i>webapp2</i> through the first <i>nginx</i> reverse proxy:<br />
<br />
<pre>
$ curl -H 'Host: webapp2.local' http://localhost:8080
<!DOCTYPE html>
<html>
<head>
<title>Simple test webapp</title>
</head>
<body>
Simple test webapp listening on port: 5001
</body>
</html>
</pre>
<br />
<h3>Creating a test suite</h3>
<br />
I can create a test suite for the web application system as follows:<br />
<br />
<pre style="overflow: auto;">
{ pkgs, testService, processManagers, profiles }:
testService {
  exprFile = ./processes-advanced.nix;
readiness = {instanceName, instance, ...}:
''
machine.wait_for_open_port(${toString instance.port})
'';
tests = {instanceName, instance, ...}:
pkgs.lib.optionalString (instanceName == "nginx" || instanceName == "nginx2")
(pkgs.lib.concatMapStrings (webapp: ''
machine.succeed(
"curl --fail -H 'Host: ${webapp.dnsName}' http://localhost:${toString instance.port} | grep ': ${toString webapp.port}'"
)
'') instance.webapps);
inherit processManagers profiles;
}
</pre>
<br />
The Nix expression above invokes <i>testService</i> with the following parameters:<br />
<br />
<ul>
<li><i>processManagers</i> refers to a list of names of all the process managers that should be tested.</li>
<li><i>profiles</i> refers to a list of configuration profiles that should be tested. Currently, it supports <i>privileged</i> for privileged deployments, and <i>unprivileged</i> for unprivileged deployments in an unprivileged user's home directory, without changing user permissions.</li>
<li>The <i>exprFile</i> parameter refers to the processes model of the system: <i>processes-advanced.nix</i> shown earlier.</li>
<li>The <i>readiness</i> parameter refers to a function that does a readiness check for each process instance. In the above example, it checks whether each service is actually listening on the required TCP port.</li>
<li>The <i>tests</i> parameter refers to a function that executes tests for each process instance. In the above example, it ignores all but the <i>nginx</i> instances, because explicitly testing a <i>webapp</i> instance is a redundant operation.<br />
<br />
For each <i>nginx</i> instance, it checks whether all <i>webapp</i> instances can be reached from it, by running the <i>curl</i> command.</li>
</ul>
<br />
The <i>readiness</i> and <i>tests</i> functions take the following parameters: <i>instanceName</i> identifies the process instance in the processes model, and <i>instance</i> refers to the attribute set containing its configuration.<br />
<br />
Furthermore, they can refer to global process model configuration parameters:<br />
<br />
<ul>
<li><i>stateDir</i>: The directory in which state files are stored (typically <i>/var</i> for privileged deployments)</li>
<li><i>runtimeDir</i>: The directory in which runtime files are stored (typically <i>/var/run</i> for privileged deployments).</li>
<li><i>forceDisableUserChange</i>: Indicates whether to disable user changes (for unprivileged deployments) or not.</li>
</ul>
<br />
In addition to writing tests that work at the instance level, it is also possible to write tests at the system level, with the following parameters (not shown in the example):<br />
<br />
<ul>
<li><i>initialTests</i>: instructions that run right after deploying the system, but before the readiness checks, and instance-level tests.</li>
<li><i>postTests</i>: instructions that run after the instance-level tests.</li>
</ul>
<br />
The above functions also accept the same global configuration parameters, and a <i>processes</i> parameter that refers to the entire processes model.<br />
<br />
We can also configure other properties useful for testing:<br />
<br />
<ul>
<li><i>systemPackages</i>: installs additional packages into the system profile of the test virtual machine.</li>
<li><i>nixosConfig</i> defines a NixOS module with configuration properties that will be added to the NixOS configuration of the test machine.</li>
<li><i>extraParams</i> propagates additional parameters to the processes model.</li>
</ul>
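<br />
To illustrate these system-level hooks and properties, the following sketch extends the earlier test suite (the check commands and configuration values are made up for illustration purposes):<br />
<br />
<pre style="overflow: auto;">
{ pkgs, testService, processManagers, profiles }:

testService {
  exprFile = ./processes-advanced.nix;

  # Runs right after deployment, before the readiness checks
  initialTests = {processes, ...}:
    ''
      machine.succeed("test -d /var")
    '';

  # Runs after all instance-level tests have completed
  postTests = {processes, ...}:
    ''
      machine.succeed("echo instance-level tests completed")
    '';

  # Install curl into the system profile of the test machine
  systemPackages = [ pkgs.curl ];

  # Give the test virtual machine more memory
  nixosConfig = {
    virtualisation.memorySize = 2048;
  };

  # Propagate an additional parameter to the processes model
  extraParams = {
    webappMode = null;
  };

  inherit processManagers profiles;
}
</pre>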
<br />
<h3>Composing test functions</h3>
<br />
The Nix expression above is not self-contained. It is a function definition that needs to be invoked with all required parameters including all the process managers and profiles that we want to test for.<br />
<br />
We can compose tests in the following Nix expression:<br />
<br />
<pre style="overflow: auto;">
{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, processManagers ? [ "supervisord" "sysvinit" "systemd" "disnix" "s6-rc" ]
, profiles ? [ "privileged" "unprivileged" ]
}:
let
testService = import ../../nixproc/test-driver/universal.nix {
inherit system;
};
in
{
nginx-reverse-proxy-hostbased = import ./nginx-reverse-proxy-hostbased {
inherit pkgs processManagers profiles testService;
};
docker = import ./docker {
inherit pkgs processManagers profiles testService;
};
...
}
</pre>
<br />
The above partial Nix expression (<i>default.nix</i>) invokes the function defined in the previous Nix expression that resides in the <i>nginx-reverse-proxy-hostbased</i> directory and propagates all required parameters. It also composes other test cases, such as <i>docker</i>.<br />
<br />
The parameters of the composition expression allow you to globally configure all the desired service variants:<br />
<br />
<ul>
<li><i>processManagers</i> allows you to select the process managers you want to test for.</li>
<li><i>profiles</i> allows you to select the configuration profiles.</li>
</ul>
<br />
With the following command, we can test our system as a privileged user, using <i>systemd</i> as a process manager:<br />
<br />
<pre>
$ nix-build -A nginx-reverse-proxy-hostbased.privileged.systemd
</pre>
<br />
We can also run the same test as an unprivileged user:<br />
<br />
<pre>
$ nix-build -A nginx-reverse-proxy-hostbased.unprivileged.systemd
</pre>
<br />
In addition to <i>systemd</i>, any configured process manager can be used that works in NixOS. The following command runs a privileged test of the same service for <i>sysvinit</i>:<br />
<br />
<pre>
$ nix-build -A nginx-reverse-proxy-hostbased.privileged.sysvinit
</pre>
<br />
<h2>Results</h2>
<br />
With the test driver in place, I have managed to expand my repository of example services, provided test coverage for them and fixed quite a few bugs in the framework caused by regressions.<br />
<br />
Below is a screenshot of <a href="https://sandervanderburg.blogspot.com/2013/04/setting-up-hydra-build-cluster-for.html">Hydra</a>: the Nix-based continuous integration service showing an overview of test results for all kinds of variants of a service:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiWq12hv377-anK1ZrM9qK31ZKqVpIhkEZ3XQP59gixkd31DTTky7rqXiG5rtlDrxKoJusWY27lthfMhrxo2q4AsvuOfmrQhV0_5Pw6eGaftwpsUUgtPr1Z-jFAH53Zm4DEXTrUeuiJHVlG/s1444/variantsonhydra.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" height="600" data-original-height="1444" data-original-width="1251" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiWq12hv377-anK1ZrM9qK31ZKqVpIhkEZ3XQP59gixkd31DTTky7rqXiG5rtlDrxKoJusWY27lthfMhrxo2q4AsvuOfmrQhV0_5Pw6eGaftwpsUUgtPr1Z-jFAH53Zm4DEXTrUeuiJHVlG/s600/variantsonhydra.png"/></a></div>
<br />
So far, the following services work multi-instance, with multiple process managers, and (optionally) as an unprivileged user:<br />
<br />
<ul>
<li><strong>Apache HTTP server</strong>. In the services repository, there are multiple constructors for deploying an Apache HTTP server: to deploy static web applications or dynamic web applications with PHP, and to use it as a reverse proxy (via HTTP and AJP) with HTTP basic authentication optionally enabled.</li>
<li><strong>Apache Tomcat</strong>.</li>
<li><strong>Nginx</strong>. For Nginx, we also have multiple constructors: one to deploy a configuration for serving static web apps, and two for setting up reverse proxies that use paths or virtual hosts to forward incoming requests to the appropriate services.<br />
<br />
The reverse proxy constructors can also generate configurations that cache the responses to incoming requests.</li>
<li><strong>MySQL/MariaDB</strong>.</li>
<li><strong>PostgreSQL</strong>.</li>
<li><strong>InfluxDB</strong>.</li>
<li><strong>MongoDB</strong>.</li>
<li><strong>OpenSSH</strong>.</li>
<li><strong>svnserve</strong>.</li>
<li><strong>xinetd</strong>.</li>
<li><strong>fcron</strong>. By default, the <i>fcron</i> user and group are hardwired into the executable. To facilitate unprivileged user deployments, we automatically create a package build override that propagates the <i>--with-run-non-privileged</i> configure flag, so that fcron can run as an unprivileged user (see the sketch after this list). Similarly, for multiple instances, we create an override that uses a different user and group that do not conflict with the primary instance.</li>
<li><strong>supervisord</strong></li>
<li><strong>s6-svscan</strong></li>
</ul>
<br />
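As an illustration of the fcron override mentioned above, a minimal sketch (assuming fcron's standard autoconf-based build, with <i>overrideAttrs</i> appending the configure flag) could look as follows:<br />
<br />
<pre style="overflow: auto;">
{ pkgs ? import &lt;nixpkgs&gt; {} }:

# Hypothetical override: build fcron so that it can run as an unprivileged user
pkgs.fcron.overrideAttrs (oldAttrs: {
  configureFlags = (oldAttrs.configureFlags or []) ++ [ "--with-run-non-privileged" ];
})
</pre>
<br />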
The following service also works with multiple instances and multiple process managers, but not as an unprivileged user:<br />
<br />
<ul>
<li><strong>Docker</strong>. In theory, <a href="https://docs.docker.com/engine/security/rootless">Docker supports rootless deployments</a>, but it is still highly experimental and I find it very cumbersome to set up.</li>
</ul>
<br />
The following services work with multiple process managers, but not multi-instance or as an unprivileged user:<br />
<br />
<ul>
<li><strong>D-Bus</strong></li>
<li><strong>Disnix</strong></li>
<li><strong>nix-daemon</strong></li>
<li><strong>Hydra</strong></li>
</ul>
<br />
In theory, the above services could be adjusted to work as an unprivileged user, but doing so is not very useful -- for example, the <i>nix-daemon</i>'s purpose is to facilitate multi-user package deployments. As an unprivileged user, you only want to facilitate package deployments for yourself.<br />
<br />
Moreover, the multi-instance aspect is IMO also not very useful to explore for these services. For example, I cannot think of a useful scenario to have two Hydra instances running next to each other.<br />
<br />
<h2>Discussion</h2>
<br />
The test framework described in this blog post is an important feature addition to the Nix process management framework -- it allowed me to package more services and fix quite a few bugs caused by regressions.<br />
<br />
I can now finally show that it is doable to package services and make them work under nearly all possible conditions that the framework supports (e.g. multiple instances, multiple process managers, and unprivileged user installations).<br />
<br />
The only limitation of the test framework is that it is <strong>not operating system agnostic</strong> -- the NixOS test driver (that serves as its foundation) only works (as its name implies) with NixOS, which itself is a Linux distribution. As a result, we cannot automatically test <i>bsdrc</i> scripts, <i>launchd</i> daemons, and <i>cygrunsrv</i> services.<br />
<br />
In theory, it is also possible to make a more generalized test driver that works with multiple operating systems. The NixOS test driver is a combination of ideas (e.g. a shared Nix store between the host and guest system, an API to control QEMU, and an API to manage services). We could also dissect these ideas and run them on conventional QEMU VMs running different operating systems (with the Nix package manager).<br />
<br />
Although making a more generalized test driver is interesting, it is beyond the scope of the Nix process management framework (which is about managing process instances, not entire systems).<br />
<br />
Another drawback is that while it is possible to test all possible service variants on Linux, it may be very expensive to do so.<br />
<br />
However, full process manager coverage is often not required to get a reasonable level of confidence. For many services, it typically suffices to implement the following strategy:<br />
<br />
<ul>
<li>Pick <strong>two process managers</strong>: one that prefers foreground processes (e.g. <i>supervisord</i>) and one that prefers daemons (e.g. <i>sysvinit</i>). This is the most significant difference (from a configuration perspective) between all these different process managers.</li>
<li>If a service supports multiple configuration variants, and multiple instances, then create a processes model that <strong>concurrently deploys all these variants</strong>.</li>
</ul>
<br />
Implementing the above strategy only requires you to test four variants, providing a high degree of certainty that it will work with all other process managers as well.<br />
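<br />
With the composition expression shown earlier, such a reduced test matrix does not require any code changes -- the function parameters can be overridden on the command-line. For example (a sketch, assuming the test attributes are generated from the <i>processManagers</i> and <i>profiles</i> lists):<br />
<br />
<pre style="overflow: auto;">
$ nix-build --arg processManagers '[ "supervisord" "sysvinit" ]' \
    -A nginx-reverse-proxy-hostbased.privileged.supervisord \
    -A nginx-reverse-proxy-hostbased.privileged.sysvinit
</pre>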
<br />
<h2>Future work</h2>
<br />
Most of the interesting functionality required to work with the Nix process management framework is now implemented. I still need to implement more changes to make it more robust and "dog food" more of my own problems as much as possible.<br />
<br />
Moreover, the <i>docker</i> backend still requires a bit more work to make it more usable.<br />
<br />
Eventually, I will be thinking of an RFC that will upstream the interesting bits of the framework into Nixpkgs.<br />
<br />
<h2>Availability</h2>
<br />
The <a href="https://github.com/svanderburg/nix-processmgmt">Nix process management framework</a> repository as well as the <a href="https://github.com/svanderburg/nix-processmgmt-services">example services repository</a> can be obtained from my GitHub page.<br />
Sander van der Burghttp://www.blogger.com/profile/12718166966821611609noreply@blogger.com0tag:blogger.com,1999:blog-1397115249631682228.post-14766929827603467432021-03-12T23:28:00.000+01:002021-03-12T23:28:18.884+01:00Using the Nix process management framework as an infrastructure deployment solution for DisnixAs explained in many previous blog posts, I have developed <a href="https://sandervanderburg.blogspot.com/2011/02/disnix-toolset-for-distributed.html">Disnix</a> as a solution for automating the deployment of service-oriented systems -- it deploys <strong>heterogeneous systems</strong>, that consist of many different kinds of components (such as web applications, web services, databases and processes) to <strong>networks</strong> of machines.<br />
<br />
The deployment models for Disnix are typically not fully self-contained. Foremost, a precondition that must be met before a service-oriented system can be deployed is that all target machines in the network require the presence of the <a href="https://sandervanderburg.blogspot.com/2012/11/an-alternative-explaination-of-nix.html">Nix package manager</a>, Disnix, and a remote connectivity service (e.g. SSH).<br />
<br />
For multi-user Disnix installations, in which the user does not have super-user privileges, the Disnix service is required to carry out deployment operations on behalf of a user.<br />
<br />
Moreover, the services in the services model typically need to be managed by other services, called <strong>containers</strong> in Disnix terminology (not to be confused with <a href="https://en.wikipedia.org/wiki/List_of_Linux_containers">Linux containers</a>).<br />
<br />
Examples of container services are:<br />
<br />
<ul>
<li>The MySQL DBMS container can manage multiple databases deployed by Disnix.</li>
<li>The Apache Tomcat servlet container can manage multiple Java web applications deployed by Disnix.</li>
<li>systemd can act as a container that manages multiple systemd units deployed by Disnix.</li>
</ul>
<br />
Managing the life-cycles of services in containers (such as activating or deactivating them) is done by a companion tool called <a href="https://sandervanderburg.blogspot.com/2015/07/deploying-state-with-disnix.html">Dysnomia</a>.<br />
<br />
In addition to Disnix, these container services also typically need to be deployed in advance to the target machines in the network.<br />
<br />
The problem domain that Disnix works in is called <strong>service deployment</strong>, whereas the deployment of machines (bare metal or virtual machines) and the container services is called <strong>infrastructure deployment</strong>.<br />
<br />
Disnix can be complemented with a variety of infrastructure deployment solutions:<br />
<br />
<ul>
<li><a href="https://sandervanderburg.blogspot.com/2015/03/on-nixops-disnix-service-deployment-and.html">NixOps</a> can deploy networks of <a href="https://sandervanderburg.blogspot.com/2011/01/nixos-purely-functional-linux.html">NixOS</a> machines, both physical and virtual machines (in the cloud), such as <a href="http://aws.amazon.com/ec2">Amazon EC2</a>.<br />
<br />
As part of a NixOS configuration, the Disnix service can be deployed that facilitates multi-user installations. The Dysnomia NixOS module can expose all relevant container services installed by NixOS as container deployment targets.</li>
<li><i>disnixos-deploy-network</i> is a tool that is included with the DisnixOS extension toolset. Since services in Disnix can be any kind of deployment unit, it is also possible to deploy an entire NixOS configuration as a service. This tool is mostly developed for demonstration purposes.<br />
<br />
A limitation of this tool is that it cannot instantiate virtual machines and bootstrap Disnix.</li>
<li><a href="https://sandervanderburg.blogspot.com/2016/06/deploying-containers-with-disnix-as.html">Disnix itself</a>. The above solutions are all NixOS-based, a software distribution that is Linux-based and fully managed by the Nix package manager.<br />
<br />
Although NixOS is very powerful, it has two drawbacks for Disnix:<br />
<br />
<ul>
<li>NixOS uses the NixOS module system for configuring system aspects. It is very powerful but you can only deploy one instance of a system service -- Disnix can also work with multiple container instances of the same type on a machine.</li>
<li>Services in NixOS cannot be deployed to other kinds software distributions: conventional Linux distributions, and other operating systems, such as macOS and FreeBSD.</li>
</ul>
<br />
To overcome these limitations, Disnix can also be used as a container deployment solution on any operating system that is capable of running Nix and Disnix. Services deployed by Disnix can automatically be exposed as container providers.<br />
<br />
Similar to <i>disnixos-deploy-network</i>, a limitation of this approach is that it cannot be used to bootstrap Disnix.</li>
</ul>
<br />
Last year, I have also added a new major feature to Disnix making it possible to <a href="https://sandervanderburg.blogspot.com/2020/04/deploying-container-and-application.html">deploy both application and container services in the same Disnix deployment models</a>, minimizing the infrastructure deployment problem -- the only requirement is to have machines with Nix, Disnix, and a remote connectivity service (such as SSH) pre-installed on them.<br />
<br />
Although this integrated feature is quite convenient, in particular for test setups, a separated infrastructure deployment process (that includes container services) still makes sense in many scenarios:<br />
<br />
<ul>
<li>The infrastructure parts and service parts can be managed by different people with <strong>different specializations</strong>. For example, configuring and tuning an application server is a different responsibility than developing a Java web application.</li>
<li>The service parts typically change more frequently than the infrastructure parts. As a result, they typically have <strong>different</strong> kinds of <strong>update cycles</strong>.</li>
<li>The infrastructure components can typically be <strong>reused</strong> between projects (e.g. many systems use a database backend such as PostgreSQL or MySQL), whereas the service components are typically very project specific.</li>
</ul>
<br />
I also realized that my other project: <a href="https://sandervanderburg.blogspot.com/2020/02/a-declarative-process-manager-agnostic.html">the Nix process management framework</a> can serve as a partial infrastructure deployment solution -- it can be used to bootstrap Disnix and deploy container services.<br />
<br />
Moreover, it can also deploy multiple instances of container services and can be used on any operating system that the Nix process management framework supports, including conventional Linux distributions and other operating systems, such as macOS and FreeBSD.<br />
<br />
<h2>Deploying and exposing the Disnix service with the Nix process management framework</h2>
<br />
As explained earlier, to allow Disnix to deploy services to a remote machine, a machine needs to have Disnix installed (and run the Disnix service for a multi-user installation), and be remotely connectible, e.g. through SSH.<br />
<br />
I have packaged all required services as constructor functions for the Nix process management framework.<br />
<br />
The following process model captures the configuration of a basic multi-user Disnix installation:<br />
<br />
<pre style="overflow: auto;">
{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, spoolDir ? "${stateDir}/spool"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager
}:
let
ids = if builtins.pathExists ./ids-bare.nix then (import ./ids-bare.nix).ids else {};
constructors = import ../../services-agnostic/constructors.nix {
inherit pkgs stateDir runtimeDir logDir tmpDir cacheDir spoolDir forceDisableUserChange processManager ids;
};
in
rec {
sshd = {
pkg = constructors.sshd {
extraSSHDConfig = ''
UsePAM yes
'';
};
requiresUniqueIdsFor = [ "uids" "gids" ];
};
dbus-daemon = {
pkg = constructors.dbus-daemon {
services = [ disnix-service ];
};
requiresUniqueIdsFor = [ "uids" "gids" ];
};
disnix-service = {
pkg = constructors.disnix-service {
inherit dbus-daemon;
};
requiresUniqueIdsFor = [ "gids" ];
};
}
</pre>
<br />
The above processes model (<i>processes.nix</i>) captures three process instances:<br />
<br />
<ul>
<li><i>sshd</i> is the OpenSSH server that makes it possible to remotely connect to the machine by using the SSH protocol.</li>
<li><i>dbus-daemon</i> runs a D-Bus system daemon, that is a requirement for the Disnix service. The <i>disnix-service</i> is propagated as a parameter, so that its service directory gets added to the D-Bus system daemon configuration.</li>
<li><i>disnix-service</i> is a service that executes deployment operations on behalf of an authorized unprivileged user. The <i>disnix-service</i> has a dependency on the <i>dbus-daemon</i> service, making sure that the latter gets activated first.</li>
</ul>
<br />
We can deploy the above configuration on a machine that has the Nix process management framework already installed.<br />
<br />
For example, to deploy the configuration on a machine that uses supervisord, we can run:<br />
<br />
<pre>
$ nixproc-supervisord-switch processes.nix
</pre>
<br />
Resulting in a system that consists of the following running processes:<br />
<br />
<pre>
$ supervisorctl
dbus-daemon RUNNING pid 2374, uptime 0:00:34
disnix-service RUNNING pid 2397, uptime 0:00:33
sshd RUNNING pid 2375, uptime 0:00:34
</pre>
<br />
As may be noticed, the above supervised services correspond to the processes in the processes model.<br />
<br />
On the coordinator machine, we can write a <strong>bootstrap</strong> infrastructure model (<i>infra-bootstrap.nix</i>) that only contains connectivity settings:<br />
<br />
<pre>
{
test1.properties.hostname = "192.168.2.1";
}
</pre>
<br />
and use the bootstrap model to capture the full infrastructure model of the system:<br />
<br />
<pre>
$ disnix-capture-infra infra-bootstrap.nix
</pre>
<br />
resulting in the following configuration:<br />
<br />
<pre>
{
"test1" = {
properties = {
"hostname" = "192.168.2.1";
"system" = "x86_64-linux";
};
containers = {
echo = {
};
fileset = {
};
process = {
};
supervisord-program = {
"supervisordTargetDir" = "/etc/supervisor/conf.d";
};
wrapper = {
};
};
"system" = "x86_64-linux";
};
}
</pre>
<br />
Despite the fact that we have not configured any containers explicitly, the above configuration (<i>infrastructure.nix</i>) already exposes a number of container services:<br />
<br />
<ul>
<li>The <i>echo</i>, <i>fileset</i> and <i>process</i> container services are built-in container providers that any Dysnomia installation includes.<br />
<br />
The <i>process</i> container can be used to automatically deploy services that daemonize. Services that daemonize themselves do not require the presence of any external service.</li>
<li>The <i>supervisord-program</i> container refers to the process supervisor that manages the services deployed by the Nix process management framework. It can also be used as a container for processes deployed by Disnix.</li>
</ul>
<br />
With the above infrastructure model, we can deploy any system that depends on the above container services, such as the trivial <a href="https://github.com/svanderburg/disnix-proxy-example">Disnix proxy example</a>:<br />
<br />
<pre style="overflow: auto;">
{ system, distribution, invDistribution, pkgs
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager ? "supervisord"
, nix-processmgmt ? ../../../nix-processmgmt
}:
let
customPkgs = import ../top-level/all-packages.nix {
inherit system pkgs stateDir logDir runtimeDir tmpDir forceDisableUserChange processManager nix-processmgmt;
};
ids = if builtins.pathExists ./ids.nix then (import ./ids.nix).ids else {};
processType = import "${nix-processmgmt}/nixproc/derive-dysnomia-process-type.nix" {
inherit processManager;
};
in
rec {
hello_world_server = rec {
name = "hello_world_server";
port = ids.ports.hello_world_server or 0;
pkg = customPkgs.hello_world_server { inherit port; };
type = processType;
requiresUniqueIdsFor = [ "ports" ];
};
hello_world_client = {
name = "hello_world_client";
pkg = customPkgs.hello_world_client;
dependsOn = {
inherit hello_world_server;
};
type = "package";
};
}
</pre>
<br />
The services model shown above (<i>services.nix</i>) captures two services:<br />
<br />
<ul>
<li>The <i>hello_world_server</i> service is a simple service that listens on a TCP port for a "hello" message and responds with a "Hello world!" message.</li>
<li>The <i>hello_world_client</i> service is a package providing a client executable that automatically connects to the <i>hello_world_server</i>.</li>
</ul>
<br />
With the following distribution model (<i>distribution.nix</i>), we can map all the services to our deployment machine (that runs the Disnix service managed by the Nix process management framework):<br />
<br />
<pre>
{infrastructure}:
{
hello_world_client = [ infrastructure.test1 ];
hello_world_server = [ infrastructure.test1 ];
}
</pre>
<br />
and deploy the system by running the following command:<br />
<br />
<pre style="overflow: auto;">
$ disnix-env -s services-without-proxy.nix \
-i infrastructure.nix \
-d distribution.nix \
--extra-params '{ processManager = "supervisord"; }'
</pre>
<br />
The last parameter: <i>--extra-params</i> configures the services model (that indirectly invokes the <i>createManagedProcess</i> abstraction function from the Nix process management framework) in such a way that supervisord configuration files are generated.<br />
<br />
(As a sidenote: without the <i>--extra-params</i> parameter, the process instances will be built for the <a href="https://sandervanderburg.blogspot.com/2020/06/using-disnix-as-simple-and-minimalistic.html"><i>disnix</i> process manager</a>, generating configuration files that can be deployed to the <i>process</i> container, expecting programs to daemonize on their own and leave a PID file behind with the daemon's process ID. Although this approach is convenient for experiments, because no external service is required, it is not as reliable as managing supervised processes.)<br />
<br />
The result of the above deployment operation is that the <i>hello-world-server</i> service is deployed as a service that is also managed by supervisord:<br />
<br />
<pre>
$ supervisorctl
dbus-daemon RUNNING pid 2374, uptime 0:09:39
disnix-service RUNNING pid 2397, uptime 0:09:38
hello-world-server RUNNING pid 2574, uptime 0:00:06
sshd RUNNING pid 2375, uptime 0:09:39
</pre>
<br />
and we can use the <i>hello-world-client</i> executable on the target machine to connect to the service:<br />
<br />
<pre>
$ /nix/var/nix/profiles/disnix/default/bin/hello-world-client
Trying 192.168.2.1...
Connected to 192.168.2.1.
Escape character is '^]'.
hello
Hello world!
</pre>
<br />
<h2>Deploying container providers and exposing them</h2>
<br />
With Disnix, it is also possible to deploy systems that are composed of different kinds of components, such as web services and databases.<br />
<br />
For example, the <a href="https://github.com/svanderburg/disnix-stafftracker-java-example">Java variant of the ridiculous Staff Tracker example</a> consists of the following services:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgbXv9Dy4IDF9spWF5ism7utetQ8s48_xSnSnYVY7EQF140iQhyphenhyphenEsQL46NnOEN007FDjvgxzWSafoy49awoRZEME2jITyGsl6Zx9T9trQrRUgnPm72vlrPby4ohyphenhyphen-hC9QnnmFV2hmz0a-rB/s1480/services.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="322" data-original-width="1480" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgbXv9Dy4IDF9spWF5ism7utetQ8s48_xSnSnYVY7EQF140iQhyphenhyphenEsQL46NnOEN007FDjvgxzWSafoy49awoRZEME2jITyGsl6Zx9T9trQrRUgnPm72vlrPby4ohyphenhyphen-hC9QnnmFV2hmz0a-rB/s600/services.png"/></a></div>
<br />
The services in the diagram above have the following purpose:<br />
<br />
<ul>
<li>The <i>StaffTracker</i> service is the front-end web application that shows an overview of staff members and their locations.</li>
<li>The <i>StaffService</i> service is a web service with a SOAP interface that provides read and write access to the staff records. The staff records are stored in the <i>staff</i> database.</li>
<li>The <i>RoomService</i> service provides read access to the room records, which are stored in a separate <i>rooms</i> database.</li>
<li>The <i>ZipcodeService</i> service provides read access to zip codes, which are stored in a separate <i>zipcodes</i> database.</li>
<li>The <i>GeolocationService</i> infers the location of a staff member from its IP address using the GeoIP service.</li>
</ul>
<br />
To deploy the system shown above, we need a target machine that provides Apache Tomcat (for managing the web application front-end and web services) and MySQL (for managing the databases) as container provider services:<br />
<br />
<pre style="overflow: auto;">
{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, spoolDir ? "${stateDir}/spool"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager
}:
let
ids = if builtins.pathExists ./ids-tomcat-mysql.nix then (import ./ids-tomcat-mysql.nix).ids else {};
constructors = import ../../services-agnostic/constructors.nix {
inherit pkgs stateDir runtimeDir logDir tmpDir cacheDir spoolDir forceDisableUserChange processManager ids;
};
containerProviderConstructors = import ../../service-containers-agnostic/constructors.nix {
inherit pkgs stateDir runtimeDir logDir tmpDir cacheDir spoolDir forceDisableUserChange processManager ids;
};
in
rec {
sshd = {
pkg = constructors.sshd {
extraSSHDConfig = ''
UsePAM yes
'';
};
requiresUniqueIdsFor = [ "uids" "gids" ];
};
dbus-daemon = {
pkg = constructors.dbus-daemon {
services = [ disnix-service ];
};
requiresUniqueIdsFor = [ "uids" "gids" ];
};
tomcat = containerProviderConstructors.simpleAppservingTomcat {
commonLibs = [ "${pkgs.mysql_jdbc}/share/java/mysql-connector-java.jar" ];
webapps = [
pkgs.tomcat9.webapps # Include the Tomcat example and management applications
];
properties.requiresUniqueIdsFor = [ "uids" "gids" ];
};
mysql = containerProviderConstructors.mysql {
properties.requiresUniqueIdsFor = [ "uids" "gids" ];
};
disnix-service = {
pkg = constructors.disnix-service {
inherit dbus-daemon;
containerProviders = [ tomcat mysql ];
};
requiresUniqueIdsFor = [ "gids" ];
};
}
</pre>
<br />
The processes model above is an extension of the previous processes model, adding two container provider services:<br />
<br />
<ul>
<li><i>tomcat</i> is the Apache Tomcat server. The constructor function: <i>simpleAppservingTomcat</i> composes a configuration for a supported process manager, such as supervisord.<br />
<br />
Moreover, it bundles a Dysnomia container configuration file, and a Dysnomia module: <i>tomcat-webapplication</i> that can be used to manage the life-cycles of Java web applications embedded in the servlet container.</li>
<li><i>mysql</i> is the MySQL DBMS server. The constructor function also creates a process manager configuration file, and bundles a Dysnomia container configuration file and module that manages the life-cycles of databases.</li>
<li>The container services above are propagated as <i>containerProviders</i> to the <i>disnix-service</i>. This function parameter is used to update the search paths for container configuration and modules, so that services can be deployed to these containers by Disnix.</li>
</ul>
<br />
After deploying the above processes model, we should see the following infrastructure model after capturing it:<br />
<br />
<pre style="overflow: auto;">
$ disnix-capture-infra infra-bootstrap.nix
{
"test1" = {
properties = {
"hostname" = "192.168.2.1";
"system" = "x86_64-linux";
};
containers = {
echo = {
};
fileset = {
};
process = {
};
supervisord-program = {
"supervisordTargetDir" = "/etc/supervisor/conf.d";
};
wrapper = {
};
tomcat-webapplication = {
"tomcatPort" = "8080";
"catalinaBaseDir" = "/var/tomcat";
};
mysql-database = {
"mysqlPort" = "3306";
"mysqlUsername" = "root";
"mysqlPassword" = "";
"mysqlSocket" = "/var/run/mysqld/mysqld.sock";
};
};
"system" = "x86_64-linux";
};
}
</pre>
<br />
As may be observed, the <i>tomcat-webapplication</i> and <i>mysql-database</i> containers (with their relevant configuration properties) were added to the infrastructure model.<br />
<br />
With the following command we can deploy the example system's services to the containers in the network:<br />
<br />
<pre style="overflow: auto;">
$ disnix-env -s services.nix -i infrastructure.nix -d distribution.nix
</pre>
<br />
resulting in a fully functional system:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg_hpzjQu_BmVmn4Eqnm6AorCejW-eWTbGe49fHCgTflEA5D8fpX0zk216RV7jbcjxiF7s9sJsZnQzuF8-4s_QA-s3udcoMOty-QfqLT2rwoJTF6VXmGfcWrvP9lfEUErTGUXBmtz4PKy5t/s816/stafftracker.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="706" data-original-width="816" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg_hpzjQu_BmVmn4Eqnm6AorCejW-eWTbGe49fHCgTflEA5D8fpX0zk216RV7jbcjxiF7s9sJsZnQzuF8-4s_QA-s3udcoMOty-QfqLT2rwoJTF6VXmGfcWrvP9lfEUErTGUXBmtz4PKy5t/s600/stafftracker.png"/></a></div>
<br />
<h2>Deploying multiple container provider instances</h2>
<br />
As explained in the introduction, a limitation of the NixOS module system is that it is only possible to construct one instance of a service on a machine.<br />
<br />
Process instances in a processes model deployed by the Nix process management framework as well as services in a Disnix services model are instantiated from functions that make it possible to deploy <strong>multiple instances</strong> of the same service to the same machine, by making conflicting properties configurable.<br />
<br />
The following processes model was modified from the previous example to deploy two MySQL servers and two Apache Tomcat servers to the same machine:<br />
<br />
<pre style="overflow: auto;">
{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, spoolDir ? "${stateDir}/spool"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager
}:
let
ids = if builtins.pathExists ./ids-tomcat-mysql-multi-instance.nix then (import ./ids-tomcat-mysql-multi-instance.nix).ids else {};
constructors = import ../../services-agnostic/constructors.nix {
inherit pkgs stateDir runtimeDir logDir tmpDir cacheDir spoolDir forceDisableUserChange processManager ids;
};
containerProviderConstructors = import ../../service-containers-agnostic/constructors.nix {
inherit pkgs stateDir runtimeDir logDir tmpDir cacheDir spoolDir forceDisableUserChange processManager ids;
};
in
rec {
sshd = {
pkg = constructors.sshd {
extraSSHDConfig = ''
UsePAM yes
'';
};
requiresUniqueIdsFor = [ "uids" "gids" ];
};
dbus-daemon = {
pkg = constructors.dbus-daemon {
services = [ disnix-service ];
};
requiresUniqueIdsFor = [ "uids" "gids" ];
};
tomcat-primary = containerProviderConstructors.simpleAppservingTomcat {
instanceSuffix = "-primary";
httpPort = 8080;
httpsPort = 8443;
serverPort = 8005;
ajpPort = 8009;
commonLibs = [ "${pkgs.mysql_jdbc}/share/java/mysql-connector-java.jar" ];
webapps = [
pkgs.tomcat9.webapps # Include the Tomcat example and management applications
];
properties.requiresUniqueIdsFor = [ "uids" "gids" ];
};
tomcat-secondary = containerProviderConstructors.simpleAppservingTomcat {
instanceSuffix = "-secondary";
httpPort = 8081;
httpsPort = 8444;
serverPort = 8006;
ajpPort = 8010;
commonLibs = [ "${pkgs.mysql_jdbc}/share/java/mysql-connector-java.jar" ];
webapps = [
pkgs.tomcat9.webapps # Include the Tomcat example and management applications
];
properties.requiresUniqueIdsFor = [ "uids" "gids" ];
};
mysql-primary = containerProviderConstructors.mysql {
instanceSuffix = "-primary";
port = 3306;
properties.requiresUniqueIdsFor = [ "uids" "gids" ];
};
mysql-secondary = containerProviderConstructors.mysql {
instanceSuffix = "-secondary";
port = 3307;
properties.requiresUniqueIdsFor = [ "uids" "gids" ];
};
disnix-service = {
pkg = constructors.disnix-service {
inherit dbus-daemon;
containerProviders = [ tomcat-primary tomcat-secondary mysql-primary mysql-secondary ];
};
requiresUniqueIdsFor = [ "gids" ];
};
}
</pre>
<br />
In the above processes model, we made the following changes:<br />
<br />
<ul>
<li>We have configured two Apache Tomcat instances: <i>tomcat-primary</i> and <i>tomcat-secondary</i>. Both instances can co-exist because they have been configured in such a way that they listen to unique TCP ports and have a unique instance name composed from the <i>instanceSuffix</i>.</li>
<li>We have configured two MySQL instances: <i>mysql-primary</i> and <i>mysql-secondary</i>. Similar to Apache Tomcat, they can both co-exist because they listen to unique TCP ports (e.g. <i>3306</i> and <i>3307</i>) and have a unique instance name.</li>
<li>Both the primary and secondary instances of the above services are propagated to the <i>disnix-service</i> (with the <i>containerProviders</i> parameter) making it possible for a client to discover them.</li>
</ul>
<br />
After deploying the above processes model, we can run the following command to discover the machine's configuration:<br />
<br />
<pre>
$ disnix-capture-infra infra-bootstrap.nix
{
"test1" = {
properties = {
"hostname" = "192.168.2.1";
"system" = "x86_64-linux";
};
containers = {
echo = {
};
fileset = {
};
process = {
};
supervisord-program = {
"supervisordTargetDir" = "/etc/supervisor/conf.d";
};
wrapper = {
};
tomcat-webapplication-primary = {
"tomcatPort" = "8080";
"catalinaBaseDir" = "/var/tomcat-primary";
};
tomcat-webapplication-secondary = {
"tomcatPort" = "8081";
"catalinaBaseDir" = "/var/tomcat-secondary";
};
mysql-database-primary = {
"mysqlPort" = "3306";
"mysqlUsername" = "root";
"mysqlPassword" = "";
"mysqlSocket" = "/var/run/mysqld-primary/mysqld.sock";
};
mysql-database-secondary = {
"mysqlPort" = "3307";
"mysqlUsername" = "root";
"mysqlPassword" = "";
"mysqlSocket" = "/var/run/mysqld-secondary/mysqld.sock";
};
};
"system" = "x86_64-linux";
};
}
</pre>
<br />
As may be observed, the infrastructure model contains two Apache Tomcat instances and two MySQL instances.<br />
<br />
With the following distribution model (<i>distribution.nix</i>), we can divide each database and web application over the two container instances:<br />
<br />
<pre>
{infrastructure}:
{
GeolocationService = {
targets = [
{ target = infrastructure.test1;
container = "tomcat-webapplication-primary";
}
];
};
RoomService = {
targets = [
{ target = infrastructure.test1;
container = "tomcat-webapplication-secondary";
}
];
};
StaffService = {
targets = [
{ target = infrastructure.test1;
container = "tomcat-webapplication-primary";
}
];
};
StaffTracker = {
targets = [
{ target = infrastructure.test1;
container = "tomcat-webapplication-secondary";
}
];
};
ZipcodeService = {
targets = [
{ target = infrastructure.test1;
container = "tomcat-webapplication-primary";
}
];
};
rooms = {
targets = [
{ target = infrastructure.test1;
container = "mysql-database-primary";
}
];
};
staff = {
targets = [
{ target = infrastructure.test1;
container = "mysql-database-secondary";
}
];
};
zipcodes = {
targets = [
{ target = infrastructure.test1;
container = "mysql-database-primary";
}
];
};
}
</pre>
<br />
Compared to the previous distribution model, the above model uses a more verbose notation for mapping services.<br />
<br />
As explained in <a href="https://sandervanderburg.blogspot.com/2016/05/mapping-services-to-containers-with.html">an earlier blog post</a>, in deployments in which only a single container instance of each type is deployed, services are <strong>automapped</strong> to the container that has the same name as the service's type. When multiple instances exist, we need to manually specify the container to which the service needs to be deployed.<br />
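<br />
For comparison: on a machine that provides only a single MySQL container, the following two mappings of the <i>staff</i> database would be equivalent, because a service of type <i>mysql-database</i> automaps to the container with the same name:<br />
<br />
<pre>
staff = [ infrastructure.test1 ];
</pre>
<br />
<pre>
staff = {
  targets = [
    { target = infrastructure.test1;
      container = "mysql-database";
    }
  ];
};
</pre>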
<br />
After deploying the system with the following command:<br />
<br />
<pre>
$ disnix-env -s services.nix -i infrastructure.nix -d distribution.nix
</pre>
<br />
we will get a running system with the following deployment architecture:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhiLJvQ3-uWU1w2cisqMmmVlCER0l-rNfdoNDRGW95vmYeuiem-lW0IPUsxXqWhQLxCpk4lHkqc5x3v2FLj3iJc5FxjadAUDWmhUygUThwC3emHhYwgC4jELpbpNV_rtZTlJe6K_8z77dRj/s1113/multicontainerarch.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="500" data-original-height="395" data-original-width="1113" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhiLJvQ3-uWU1w2cisqMmmVlCER0l-rNfdoNDRGW95vmYeuiem-lW0IPUsxXqWhQLxCpk4lHkqc5x3v2FLj3iJc5FxjadAUDWmhUygUThwC3emHhYwgC4jELpbpNV_rtZTlJe6K_8z77dRj/s600/multicontainerarch.png"/></a></div>
<br />
<h2>Using the Disnix web service for executing remote deployment operations</h2>
<br />
By default, Disnix uses SSH to communicate with target machines in the network. Disnix has a modular architecture and is also capable of communicating with target machines by other means, for example via NixOps, the backdoor client, D-Bus, or by directly executing tasks on a local machine.<br />
<br />
There is also an external package: <i>DisnixWebService</i> that remotely exposes all deployment operations from a web service with a SOAP API.<br />
<br />
To use the <i>DisnixWebService</i>, we must deploy a Java servlet container (such as Apache Tomcat) with the <i>DisnixWebService</i> application, configured in such a way that it can connect to the <i>disnix-service</i> over the D-Bus system bus.<br />
<br />
The following processes model is an extension of the non-multi containers Staff Tracker example, with an Apache Tomcat service that bundles the <i>DisnixWebService</i>:<br />
<br />
<pre style="overflow: auto;">
{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, spoolDir ? "${stateDir}/spool"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager
}:
let
ids = if builtins.pathExists ./ids-tomcat-mysql.nix then (import ./ids-tomcat-mysql.nix).ids else {};
constructors = import ../../services-agnostic/constructors.nix {
inherit pkgs stateDir runtimeDir logDir tmpDir cacheDir spoolDir forceDisableUserChange processManager ids;
};
containerProviderConstructors = import ../../service-containers-agnostic/constructors.nix {
inherit pkgs stateDir runtimeDir logDir tmpDir cacheDir spoolDir forceDisableUserChange processManager ids;
};
in
rec {
sshd = {
pkg = constructors.sshd {
extraSSHDConfig = ''
UsePAM yes
'';
};
requiresUniqueIdsFor = [ "uids" "gids" ];
};
dbus-daemon = {
pkg = constructors.dbus-daemon {
services = [ disnix-service ];
};
requiresUniqueIdsFor = [ "uids" "gids" ];
};
tomcat = containerProviderConstructors.disnixAppservingTomcat {
commonLibs = [ "${pkgs.mysql_jdbc}/share/java/mysql-connector-java.jar" ];
webapps = [
pkgs.tomcat9.webapps # Include the Tomcat example and management applications
];
enableAJP = true;
inherit dbus-daemon;
properties.requiresUniqueIdsFor = [ "uids" "gids" ];
};
apache = {
pkg = constructors.basicAuthReverseProxyApache {
dependency = tomcat;
serverAdmin = "admin@localhost";
targetProtocol = "ajp";
portPropertyName = "ajpPort";
authName = "DisnixWebService";
authUserFile = pkgs.stdenv.mkDerivation {
name = "htpasswd";
buildInputs = [ pkgs.apacheHttpd ];
buildCommand = ''
htpasswd -cb ./htpasswd admin secret
mv htpasswd $out
'';
};
requireUser = "admin";
};
requiresUniqueIdsFor = [ "uids" "gids" ];
};
mysql = containerProviderConstructors.mysql {
properties.requiresUniqueIdsFor = [ "uids" "gids" ];
};
disnix-service = {
pkg = constructors.disnix-service {
inherit dbus-daemon;
containerProviders = [ tomcat mysql ];
authorizedUsers = [ tomcat.name ];
dysnomiaProperties = {
targetEPR = "http://$(hostname)/DisnixWebService/services/DisnixWebService";
};
};
requiresUniqueIdsFor = [ "gids" ];
};
}
</pre>
<br />
The above processes model contains the following changes:<br />
<br />
<ul>
<li>The Apache Tomcat process instance is constructed with the <i>containerProviderConstructors.disnixAppservingTomcat</i> constructor function automatically deploying the <i>DisnixWebService</i> and providing the required configuration settings so that it can communicate with the <i>disnix-service</i> over the D-Bus system bus.<br />
<br />
Because the <i>DisnixWebService</i> requires the presence of the D-Bus system daemon, the <i>dbus-daemon</i> is configured as a <i>dependency</i> of Apache Tomcat, ensuring that it is started before Apache Tomcat.</li>
<li>By default, connecting to the Apache Tomcat server (including the <i>DisnixWebService</i>) requires no authentication. To secure the web applications and the <i>DisnixWebService</i>, I have configured an <i>apache</i> reverse proxy that forwards connections to Apache Tomcat using the AJP protocol.<br />
<br />
Moreover, the reverse proxy protects incoming requests by using HTTP basic authentication, requiring a username and password (a quick verification sketch follows after this list).</li>
</ul>
<br />
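Once this processes model has been deployed to the target machine, we can do a quick sanity check on the reverse proxy with <i>curl</i> -- the first request below should be rejected with an authentication error, whereas the second one (using the credentials from the <i>htpasswd</i> file shown earlier) should be accepted:<br />
<br />
<pre>
$ curl http://192.168.2.1/DisnixWebService/services/DisnixWebService
$ curl --user admin:secret http://192.168.2.1/DisnixWebService/services/DisnixWebService
</pre>
<br />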
We can use the following bootstrap infrastructure model to discover the machine's configuration:<br />
<br />
<pre style="overflow: auto;">
{
test1.properties.targetEPR = "http://192.168.2.1/DisnixWebService/services/DisnixWebService";
}
</pre>
<br />
The difference between this bootstrap infrastructure model and the previous one is that it uses a different connection property (<i>targetEPR</i>) that refers to the URL of the <i>DisnixWebService</i>.<br />
<br />
By default, Disnix uses the <i>disnix-ssh-client</i> to communicate with target machines. To use a different client, we must set the following environment variables:<br />
<br />
<pre>
$ export DISNIX_CLIENT_INTERFACE=disnix-soap-client
$ export DISNIX_TARGET_PROPERTY=targetEPR
</pre>
<br />
The above environment variables instruct Disnix to use the <i>disnix-soap-client</i> executable and the <i>targetEPR</i> property from the infrastructure model as a connection string.<br />
<br />
To authenticate ourselves, we must set the following environment variables with a username and password:<br />
<br />
<pre>
$ export DISNIX_SOAP_CLIENT_USERNAME=admin
$ export DISNIX_SOAP_CLIENT_PASSWORD=secret
</pre>
<br />
The following command makes it possible to discover the machine's configuration using the <i>disnix-soap-client</i> and <i>DisnixWebService</i>:<br />
<br />
<pre style="overflow: auto;">
$ disnix-capture-infra infra-bootstrap.nix
{
"test1" = {
properties = {
"hostname" = "192.168.2.1";
"system" = "x86_64-linux";
"targetEPR" = "http://192.168.2.1/DisnixWebService/services/DisnixWebService";
};
containers = {
echo = {
};
fileset = {
};
process = {
};
supervisord-program = {
"supervisordTargetDir" = "/etc/supervisor/conf.d";
};
wrapper = {
};
tomcat-webapplication = {
"tomcatPort" = "8080";
"catalinaBaseDir" = "/var/tomcat";
"ajpPort" = "8009";
};
mysql-database = {
"mysqlPort" = "3306";
"mysqlUsername" = "root";
"mysqlPassword" = "";
"mysqlSocket" = "/var/run/mysqld/mysqld.sock";
};
};
"system" = "x86_64-linux";
}
;
}
</pre>
<br />
After capturing the full infrastructure model, we can deploy the system with <i>disnix-env</i> if desired, using the <i>disnix-soap-client</i> to carry out all necessary remote deployment operations.<br />
<br />
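For example, with the connection settings above still exported, we can capture the full infrastructure model to a file and run the same deployment command as before, now carried out over SOAP (a sketch that reuses the models shown earlier):<br />
<br />
<pre>
$ disnix-capture-infra infra-bootstrap.nix > infrastructure.nix
$ disnix-env -s services.nix -i infrastructure.nix -d distribution.nix
</pre>
<br />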
<h2>Miscellaneous: using Docker containers as light-weight virtual machines</h2>
<br />
As explained earlier in this blog post, the Nix process management framework is only a partial infrastructure deployment solution -- you still need to somehow obtain physical or virtual machines with a software distribution running the Nix package manager.<br />
<br />
<a href="https://sandervanderburg.blogspot.com/2020/07/on-using-nix-and-docker-as-deployment.html">In a blog post written some time ago</a>, I have explained that Docker containers are not virtual machines or even light-weight virtual machines.<br />
<br />
<a href="https://sandervanderburg.blogspot.com/2021/02/deploying-mutable-multi-process-docker.html">In my previous blog post</a>, I have shown that we can also deploy mutable Docker multi-process containers in which process instances can be upgraded without stopping the container.<br />
<br />
The deployment workflow for upgrading mutable containers is very machine-like -- NixOS has a similar workflow that consists of updating the machine configuration (<i>/etc/nixos/configuration.nix</i>) and running a single command-line instruction to upgrade the machine (<i>nixos-rebuild switch</i>).<br />
<br />
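For comparison, the NixOS counterpart of this workflow boils down to:<br />
<br />
<pre>
$ $EDITOR /etc/nixos/configuration.nix
$ nixos-rebuild switch
</pre>
<br />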
We can actually start using containers as VMs by adding another ingredient in the mix -- <a href="https://stackoverflow.com/questions/27937185/assign-static-ip-to-docker-container">we can also assign static IP addresses to Docker containers</a>.<br />
<br />
With the following Nix expression, we can create a Docker image for a mutable container, using any of the processes models shown previously as the "machine's configuration":<br />
<br />
<pre style="overflow: auto;">
let
pkgs = import <nixpkgs> {};
createMutableMultiProcessImage = import ../nix-processmgmt/nixproc/create-image-from-steps/create-mutable-multi-process-image-universal.nix {
inherit pkgs;
};
in
createMutableMultiProcessImage {
name = "disnix";
tag = "test";
contents = [ pkgs.mc pkgs.disnix ];
exprFile = ./processes.nix;
interactive = true;
manpages = true;
processManager = "supervisord";
}
</pre>
<br />
The <i>exprFile</i> parameter in the above Nix expression refers to a previously shown processes model, and the <i>processManager</i> parameter specifies the desired process manager to use, such as supervisord.<br />
<br />
With the following command, we can build the image with Nix and load it into Docker:<br />
<br />
<pre>
$ nix-build
$ docker load -i result
</pre>
<br />
With the following command, we can create a network to which our containers (with IP addresses) should belong:<br />
<br />
<pre>
$ docker network create --subnet=192.168.2.0/8 disnixnetwork
</pre>
<br />
The above command creates a Docker network with an /8 subnet that contains the prefix <i>192.168.2.0</i>, leaving a 24-bit block available for host IP addresses.<br />
<br />
We can create and start a Docker container named: <i>containervm</i> using our previously built image, and assign it an IP address:<br />
<br />
<pre>
$ docker run --network disnixnetwork --ip 192.168.2.1 \
--name containervm disnix:test
</pre>
<br />
By default, Disnix uses SSH to connect to remote machines. With the following commands we can create a public-private key pair and copy the public key to the container:<br />
<br />
<pre>
$ ssh-keygen -t ed25519 -f id_test -N ""
$ docker exec containervm mkdir -m0700 -p /root/.ssh
$ docker cp id_test.pub containervm:/root/.ssh/authorized_keys
$ docker exec containervm chmod 600 /root/.ssh/authorized_keys
$ docker exec containervm chown root:root /root/.ssh/authorized_keys
</pre>
<br />
On the coordinator machine that carries out the deployment, we must add the private key to the SSH agent and configure the <i>disnix-ssh-client</i> to connect to the <i>disnix-service</i>:<br />
<br />
<pre>
$ ssh-add id_test
$ export DISNIX_REMOTE_CLIENT=disnix-client
</pre>
<br />
By executing all these steps, <i>containervm</i> can be (mostly) used as if it were a virtual machine, including connecting to it with an IP address over SSH.<br />
<br />
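For example, because the private key was added to the SSH agent, we should be able to connect to the container as if it were a remote machine (assuming the <i>sshd</i> service from the processes model shown earlier is running):<br />
<br />
<pre>
$ ssh root@192.168.2.1
</pre>
<br />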
<h2>Conclusion</h2>
<br />
In this blog post, I have described how the Nix process management framework can be used as a partial infrastructure deployment solution for Disnix. It can be used both for deploying the <i>disnix-service</i> (to facilitate multi-user installations) as well as deploying container providers: services that manage the life-cycles of services deployed by Disnix.<br />
<br />
Moreover, the Nix process management framework makes it possible to do these deployments on all kinds of software distributions that can use the Nix package manager, including NixOS, conventional Linux distributions and other operating systems, such as macOS and FreeBSD.<br />
<br />
If I had developed this solution a couple of years ago, it would probably have saved me many hours of preparation work for my first demo in <a href="https://sandervanderburg.blogspot.com/2015/11/deploying-services-to-heterogeneous.html">my NixCon 2015 talk</a>, in which I wanted to demonstrate that it is possible to deploy services to a heterogeneous network that consists of a NixOS, Ubuntu and Windows machine. Back then, I had to do all the infrastructure deployment tasks manually.<br />
<br />
I also have to admit (though this statement is mostly based on my personal preferences, not facts) that I find the functional style that the framework uses far more intuitive than the NixOS module system for certain service configuration aspects, especially for configuring container services and exposing them with Disnix and Dysnomia:<br />
<br />
<ul>
<li>Because every process instance is constructed from a constructor function that makes all instance parameters explicit, you are guarded against common configuration errors such as undeclared dependencies.<br />
<br />
For example, the <i>DisnixWebService</i>-enabled Apache Tomcat service requires access to the <i>dbus-service</i> providing the system bus. Not having this service in the processes model, causes a missing function parameter error.</li>
<li>Function parameters in the processes model make it clearer that a process depends on another process and what that relationship may be. For example, the <i>containerProviders</i> parameter makes it really clear that the <i>disnix-service</i> uses the listed services as potential deployment targets for services deployed by Disnix.<br />
<br />
In comparison, the implementations of the Disnix and Dysnomia NixOS modules are far more complicated and monolithic -- the Dysnomia module has to figure out all potential container services deployed as part of a NixOS configuration and their properties, convert them to Dysnomia configuration files, and adjust the systemd configuration of the <i>disnix-service</i> for proper activation ordering.<br />
<br />
The <i>wants</i> parameter (used for activation ordering) is just a list of strings, with no guarantee that it contains valid references to services that have already been deployed.</li>
</ul>
<br />
<h2>Availability</h2>
<br />
The constructor functions for the services as well as the deployment examples described in this blog post can be found in the <a href="https://github.com/svanderburg/nix-processmgmt-services">Nix process management services repository</a>.<br />
<br />
<h2>Future work</h2>
<br />
Slowly more and more of my personal use cases are getting supported by the Nix process management framework.<br />
<br />
Moreover, the services repository is steadily growing. To ensure that all the services that I have packaged so far do not break, I really need to focus my work on a service test solution.<br />
Sander van der Burghttp://www.blogger.com/profile/12718166966821611609noreply@blogger.com0tag:blogger.com,1999:blog-1397115249631682228.post-4155698244385190892021-02-24T22:46:00.001+01:002021-02-25T10:41:52.221+01:00Deploying mutable multi-process Docker containers with the Nix process management framework (or running Hydra in a Docker container)In a blog post written several months ago, I have shown that the <a href="https://sandervanderburg.blogspot.com/2020/02/a-declarative-process-manager-agnostic.html">Nix process management framework</a> can also be used to conveniently <a href="https://sandervanderburg.blogspot.com/2020/10/building-multi-process-docker-images.html">construct multi-process Docker images</a>.<br />
<br />
Although Docker is primarily used for managing single root application process containers, multi-process containers can sometimes be useful to deploy systems that consist of multiple, tightly coupled, processes.<br />
<br />
The Docker manual has a section that describes <a href="https://docs.docker.com/config/containers/multi-service_container">how to construct images for multi-process containers</a>, but IMO the configuration process is a bit tedious and cumbersome.<br />
<br />
To make this process more convenient, I have built a <strong>wrapper</strong> function: <i>createMultiProcessImage</i> around the <i>dockerTools.buildImage</i> function (provided by Nixpkgs) that does the following:<br />
<br />
<ul>
<li>It constructs an image that runs a Linux and Docker compatible process manager as an entry point. Currently, it supports <i>supervisord</i>, <i>sysvinit</i>, <i>disnix</i> and <i>s6-rc</i>.</li>
<li>The Nix process management framework is used to build a configuration for a system that consists of multiple processes, that will be managed by any of the supported process managers.</li>
</ul>
<br />
Although the framework makes the construction of multi-process images convenient, a big drawback of multi-process Docker containers is <strong>upgrading</strong> them -- for example, for Debian-based containers you can imperatively upgrade packages by connecting to the container:<br />
<br />
<pre>
$ docker exec -it mycontainer /bin/bash
</pre>
<br />
and upgrade the desired packages, such as <i>file</i>:<br />
<br />
<pre>
$ apt install file
</pre>
<br />
The upgrade instruction above is <strong>not reproducible</strong> -- <i>apt</i> may install <i>file</i> version 5.38 today, and 5.39 tomorrow.<br />
<br />
To cope with these kinds of side-effects, Docker works with <strong>images</strong> that snapshot the outcomes of all the installation steps. Constructing a container from the same image will always provide the same versions of all dependencies.<br />
<br />
As a consequence, to perform a <strong>reproducible</strong> container <strong>upgrade</strong>, it is required to construct a new image, <strong>discard</strong> the container and <strong>reconstruct</strong> the container from the new image version, causing the system as a whole to be terminated, including the processes that have not changed.<br />
<br />
For a while, I have been thinking about this limitation and developed a solution that makes it possible to upgrade multi-process containers without stopping and discarding them. The only exception is the process manager.<br />
<br />
To make deployments reproducible, it combines the reproducibility properties of Docker and Nix.<br />
<br />
In this blog post, I will describe how this solution works and how it can be used.<br />
<br />
<h2>Creating a function for building mutable Docker images</h2>
<br />
As explained in <a href="https://sandervanderburg.blogspot.com/2020/07/on-using-nix-and-docker-as-deployment.html">an earlier blog post that compares the deployment properties of Nix and Docker</a>, both solutions support reproducible deployment, albeit for different application domains.<br />
<br />
Moreover, their reproducibility properties are built around different concepts:<br />
<br />
<ul>
<li>Docker containers are reproducible, because they are constructed from <strong>images</strong> that consist of immutable layers identified by hash codes derived from their contents.</li>
<li><a href="https://sandervanderburg.blogspot.com/2012/11/an-alternative-explaination-of-nix.html">Nix package builds are reproducible</a>, because they are stored in <strong>isolation</strong> in a Nix store and made immutable (the files' permissions are set read-only). In the construction process of the packages, many <strong>side effects</strong> are <strong>mitigated</strong>.<br />
<br />
As a result, when the hash code prefix of a package (derived from all build inputs) is the same, then the build output is also (nearly) bit-identical, regardless of the machine on which the package was built.</li>
</ul>
<br />
By taking these reproducibility properties into account, we can create a reproducible deployment process for upgradable containers by using a specific separation of responsibilities.<br />
<br />
<h3>Deploying the base system</h3>
<br />
For the deployment of the <strong>base system</strong> that includes the <strong>process manager</strong>, we can stick to the traditional Docker deployment workflow based on images (the only unconventional aspect is that we use Nix to build a Docker image, instead of <i>Dockerfile</i>s).<br />
<br />
The process manager that the image provides deploys its configuration from a dynamic configuration directory.<br />
<br />
To support supervisord, we can invoke the following command as the container's entry point:<br />
<br />
<pre>
supervisord --nodaemon \
--configuration /etc/supervisor/supervisord.conf \
--logfile /var/log/supervisord.log \
--pidfile /var/run/supervisord.pid
</pre>
<br />
The above command starts the supervisord service (in foreground mode), using the <i>supervisord.conf</i> configuration file stored in <i>/etc/supervisor</i>.<br />
<br />
The <i>supervisord.conf</i> configuration file has the following structure:<br />
<br />
<pre>
[supervisord]
[include]
files=conf.d/*
</pre>
<br />
The above configuration automatically loads all program definitions stored in the <i>conf.d</i> directory. This directory is writable and initially empty. It can be populated with configuration files generated by the Nix process management framework.<br />
<br />
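For example, a program definition that ends up in the <i>conf.d</i> directory may look roughly as follows (a sketch with a hypothetical Nix store path -- the real definitions are generated by the framework):<br />
<br />
<pre>
[program:webapp]
command=/nix/store/...-webapp/bin/webapp
autostart=true
autorestart=true
</pre>
<br />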
For the other process managers that the framework supports (<i>sysvinit</i>, <i>disnix</i> and <i>s6-rc</i>), we follow a similar strategy -- we configure the process manager in such a way that the configuration is loaded from a source that can be dynamically updated.<br />
<br />
<h3>Deploying process instances</h3>
<br />
Deployment of the <strong>process instances</strong> is not done during the construction of the image, but by the Nix process management framework and the Nix package manager running in the container.<br />
<br />
To allow a processes model deployment to refer to packages in the Nixpkgs collection and install binary substitutes, we must configure a Nix channel, such as the unstable Nixpkgs channel:<br />
<br />
<pre>
$ nix-channel --add https://nixos.org/channels/nixpkgs-unstable
$ nix-channel --update
</pre>
<br />
(As a sidenote: it is also possible to subscribe to a stable Nixpkgs channel or a specific Git revision of Nixpkgs).<br />
<br />
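For example, subscribing to a stable channel may look as follows (assuming the 20.09 release):<br />
<br />
<pre>
$ nix-channel --add https://nixos.org/channels/nixos-20.09 nixpkgs
$ nix-channel --update
</pre>
<br />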
The processes model (and relevant sub models, such as <i>ids.nix</i> that contains <a href="https://sandervanderburg.blogspot.com/2020/09/assigning-unique-ids-to-services-in.html">numeric ID assignments</a>) is copied into the Docker image.<br />
<br />
We can deploy the processes model for supervisord as follows:<br />
<br />
<pre>
$ nixproc-supervisord-switch
</pre>
<br />
The above command will deploy the processes model referred to by the <i>NIXPROC_PROCESSES</i> environment variable, which defaults to: <i>/etc/nixproc/processes.nix</i>:<br />
<br />
<ul>
<li>First, it builds supervisord configuration files from the processes model (this step also includes deploying all required packages and service configuration files)</li>
<li>It creates symlinks for each configuration file belonging to a process instance in the writable <i>conf.d</i> directory</li>
<li>It instructs <i>supervisord</i> to reload the configuration so that only obsolete processes get deactivated and new services activated, causing unchanged processes to remain untouched.</li>
</ul>
<br />
(For the other process managers, we have equivalent tools: <i>nixproc-sysvinit-switch</i>, <i>nixproc-disnix-switch</i> and <i>nixproc-s6-rc-switch</i>).<br />
<br />
<h3>Initial deployment of the system</h3>
<br />
Because only the process manager is deployed as part of the image (with an initially empty configuration), the system is not yet usable when we start a container.<br />
<br />
To solve this problem, we must perform an <strong>initial deployment</strong> of the system on first startup.<br />
<br />
I used my lessons learned from the chainloading techniques in <a href="https://skarnet.org/software/s6/">s6</a> (in the <a href="https://sandervanderburg.blogspot.com/2021/02/developing-s6-rc-backend-for-nix.html">previous blog post</a>) and developed a hacky generated <strong>bootstrap script</strong> (<i>/bin/bootstrap</i>) that serves as the container's entry point:<br />
<br />
<pre style="overflow: auto;">
cat > /bin/bootstrap <<EOF
#! ${pkgs.stdenv.shell} -e
# Configure Nix channels
nix-channel --add ${channelURL}
nix-channel --update
# Deploy the processes model (in a child process)
nixproc-${input.processManager}-switch &
# Overwrite the bootstrap script, so that it simply just
# starts the process manager the next time we start the
# container
cat > /bin/bootstrap <<EOR
#! ${pkgs.stdenv.shell} -e
exec ${cmd}
EOR
# Chain load the actual process manager
exec ${cmd}
EOF
chmod 755 /bin/bootstrap
</pre>
<br />
The generated bootstrap script does the following:<br />
<br />
<ul>
<li>First, a Nix channel is configured and updated so that we can install packages from the Nixpkgs collection and obtain substitutes.</li>
<li>The next step is deploying the processes model by running the <i>nixproc-*-switch</i> tool for a supported process manager. This process is started in the background (as a child process) -- we can use this trick to force the managing bash shell to load our desired process supervisor as soon as possible.<br />
<br />
Ultimately, we want the process manager to become responsible for supervising any other process running in the container.</li>
<li>After the deployment process is started in the background, the bootstrap script is overridden by a bootstrap script that becomes our real entry point -- the process manager that we want to use, such as <i>supervisord</i>.<br />
<br />
Overriding the bootstrap script makes sure that the next time we start the container, it will start instantly without attempting to deploy the system again.</li>
<li>Finally, the bootstrap script "execs" into the real process manager, becoming the new PID 1 process. When the deployment of the system is done (the <i>nixproc-*-switch</i> process that still runs in the background), the process manager becomes responsible for reaping it.</li>
</ul>
<br />
With the above script, the workflow of deploying an upgradable/mutable multi-process container is the same as deploying an ordinary container from a Docker image -- the only (minor) difference is that the first time that we start the container, it may take some time before the services become available, because the multi-process system needs to be deployed by Nix and the Nix process management framework.<br />
<br />
<h2>A simple usage scenario</h2>
<br />
Similar to my previous blog posts about the Nix process management framework, I will use the trivial web application system to demonstrate how the functionality of the framework can be used.<br />
<br />
The web application system consists of one or more <i>webapp</i> processes (with an embedded HTTP server) that only return static HTML pages displaying their identities.<br />
<br />
An Nginx reverse proxy forwards incoming requests to the appropriate <i>webapp</i> instance -- each <i>webapp</i> service can be reached by using its unique virtual host value.<br />
<br />
To construct a mutable multi-process Docker image with Nix, we can write the following Nix expression (<i>default.nix</i>):<br />
<br />
<pre style="overflow: auto;">
let
pkgs = import <nixpkgs> {};
nix-processmgmt = builtins.fetchGit {
url = https://github.com/svanderburg/nix-processmgmt.git;
ref = "master";
};
createMutableMultiProcessImage = import "${nix-processmgmt}/nixproc/create-image-from-steps/create-mutable-multi-process-image-universal.nix" {
inherit pkgs;
};
in
createMutableMultiProcessImage {
name = "multiprocess";
tag = "test";
contents = [ pkgs.mc ];
exprFile = ./processes.nix;
idResourcesFile = ./idresources.nix;
idsFile = ./ids.nix;
processManager = "supervisord"; # sysvinit, disnix, s6-rc are also valid options
}
</pre>
<br />
The above Nix expression invokes the <i>createMutableMultiProcessImage</i> function that constructs a Docker image that provides a base system with a process manager, and a bootstrap script that deploys the multi-process system:<br />
<br />
<ul>
<li>The <i>name</i>, <i>tag</i>, and <i>contents</i> parameters specify the image name, tag and the packages that need to be included in the image.</li>
<li>The <i>exprFile</i> parameter refers to a <strong>processes model</strong> that captures the configurations of the process instances that need to be deployed.</li>
<li>The <i>idResourcesFile</i> parameter refers to an <strong>ID resources</strong> model that specifies from which resource pools unique IDs need to be selected.</li>
<li>The <i>idsFile</i> parameter refers to an <strong>IDs model</strong> that contains the unique ID assignments for each process instance (a sketch follows after this list). Unique IDs resemble TCP/UDP port assignments, user IDs (UIDs) and group IDs (GIDs).</li>
<li>We can use the <i>processManager</i> parameter to select the process manager we want to use. In the above example it is <i>supervisord</i>, but other options are also possible.</li>
</ul>
<br />
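The IDs model may look roughly as follows (a sketch with hypothetical values -- normally, this file is generated by the framework's ID assignment tool):<br />
<br />
<pre>
{
  ids = {
    webappPorts = {
      webapp = 5000;
    };
    uids = {
      webapp = 5001;
    };
    gids = {
      webapp = 5001;
    };
  };
}
</pre>
<br />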
We can use the following processes model (<i>processes.nix</i>) to deploy a small version of our example system:<br />
<br />
<pre style="overflow: auto;">
{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager
}:
let
nix-processmgmt = builtins.fetchGit {
url = https://github.com/svanderburg/nix-processmgmt.git;
ref = "master";
};
ids = if builtins.pathExists ./ids.nix then (import ./ids.nix).ids else {};
sharedConstructors = import "${nix-processmgmt}/examples/services-agnostic/constructors/constructors.nix" {
inherit pkgs stateDir runtimeDir logDir cacheDir tmpDir forceDisableUserChange processManager ids;
};
constructors = import "${nix-processmgmt}/examples/webapps-agnostic/constructors/constructors.nix" {
inherit pkgs stateDir runtimeDir logDir tmpDir forceDisableUserChange processManager ids;
};
in
rec {
webapp = rec {
port = ids.webappPorts.webapp or 0;
dnsName = "webapp.local";
pkg = constructors.webapp {
inherit port;
};
requiresUniqueIdsFor = [ "webappPorts" "uids" "gids" ];
};
nginx = rec {
port = ids.nginxPorts.nginx or 0;
pkg = sharedConstructors.nginxReverseProxyHostBased {
webapps = [ webapp ];
inherit port;
} {};
requiresUniqueIdsFor = [ "nginxPorts" "uids" "gids" ];
};
}
</pre>
<br />
The above Nix expression configures two process instances, one <i>webapp</i> process that returns a static HTML page with its identity and an Nginx reverse proxy that forwards connections to it.<br />
<br />
A notable difference between the expression shown above and the processes models of the same system shown in my previous blog posts, is that this expression does not contain any references to files on the local filesystem, with the exception of the ID assignments expression (<i>ids.nix</i>).<br />
<br />
We obtain all required functionality from the Nix process management framework by invoking <i>builtins.fetchGit</i>. Eliminating local references is required to allow the processes model to be copied into the container and deployed from within the container.<br />
<br />
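As a sidenote: <i>builtins.fetchGit</i> can also pin the framework to an exact revision, making the container's deployment fully deterministic (the commit hash below is hypothetical):<br />
<br />
<pre>
nix-processmgmt = builtins.fetchGit {
  url = https://github.com/svanderburg/nix-processmgmt.git;
  rev = "0000000000000000000000000000000000000000"; # hypothetical commit hash
};
</pre>
<br />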
We can build a Docker image as follows:<br />
<br />
<pre>
$ nix-build
</pre>
<br />
load the image into Docker:<br />
<br />
<pre>
$ docker load -i result
</pre>
<br />
and create and start a Docker container:<br />
<br />
<pre style="overflow: auto;">
$ docker run -it --name webapps --network host multiprocess:test
unpacking channels...
warning: Nix search path entry '/nix/var/nix/profiles/per-user/root/channels' does not exist, ignoring
created 1 symlinks in user environment
2021-02-21 15:29:29,878 CRIT Supervisor is running as root. Privileges were not dropped because no user is specified in the config file. If you intend to run as root, you can set user=root in the config file to avoid this message.
2021-02-21 15:29:29,878 WARN No file matches via include "/etc/supervisor/conf.d/*"
2021-02-21 15:29:29,897 INFO RPC interface 'supervisor' initialized
2021-02-21 15:29:29,897 CRIT Server 'inet_http_server' running without any HTTP authentication checking
2021-02-21 15:29:29,898 INFO supervisord started with pid 1
these derivations will be built:
/nix/store/011g52sj25k5k04zx9zdszdxfv6wy1dw-credentials.drv
/nix/store/1i9g728k7lda0z3mn1d4bfw07v5gzkrv-credentials.drv
/nix/store/fs8fwfhalmgxf8y1c47d0zzq4f89fz0g-nginx.conf.drv
/nix/store/vxpm2m6444fcy9r2p06dmpw2zxlfw0v4-nginx-foregroundproxy.sh.drv
/nix/store/4v3lxnpapf5f8297gdjz6kdra8g7k4sc-nginx.conf.drv
/nix/store/mdldv8gwvcd5fkchncp90hmz3p9rcd99-builder.pl.drv
/nix/store/r7qjyr8vr3kh1lydrnzx6nwh62spksx5-nginx.drv
/nix/store/h69khss5dqvx4svsc39l363wilcf2jjm-webapp.drv
/nix/store/kcqbrhkc5gva3r8r0fnqjcfhcw4w5il5-webapp.conf.drv
/nix/store/xfc1zbr92pyisf8lw35qybbn0g4f46sc-webapp.drv
/nix/store/fjx5kndv24pia1yi2b7b2bznamfm8q0k-supervisord.d.drv
these paths will be fetched (78.80 MiB download, 347.06 MiB unpacked):
...
</pre>
<br />
As may be noticed by looking at the output, on first startup the Nix process management framework is invoked to deploy the system with Nix.<br />
<br />
After the system has been deployed, we should be able to connect to the <i>webapp</i> process via the Nginx reverse proxy:<br />
<br />
<pre>
$ curl -H 'Host: webapp.local' http://localhost:8080
<!DOCTYPE html>
<html>
<head>
<title>Simple test webapp</title>
</head>
<body>
Simple test webapp listening on port: 5000
</body>
</html>
</pre>
<br />
When it is desired to upgrade the system, we can change the system's configuration by connecting to the container instance:<br />
<br />
<pre>
$ docker exec -it webapps /bin/bash
</pre>
<br />
In the container, we can edit the <i>processes.nix</i> configuration file:<br />
<br />
<pre>
$ mcedit /etc/nixproc/processes.nix
</pre>
<br />
and make changes to the configuration of the system. For example, we can change the processes model to include a second <i>webapp</i> process:<br />
<br />
<pre style="overflow: auto;">
{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager
}:
let
nix-processmgmt = builtins.fetchGit {
url = https://github.com/svanderburg/nix-processmgmt.git;
ref = "master";
};
ids = if builtins.pathExists ./ids.nix then (import ./ids.nix).ids else {};
sharedConstructors = import "${nix-processmgmt}/examples/services-agnostic/constructors/constructors.nix" {
inherit pkgs stateDir runtimeDir logDir cacheDir tmpDir forceDisableUserChange processManager ids;
};
constructors = import "${nix-processmgmt}/examples/webapps-agnostic/constructors/constructors.nix" {
inherit pkgs stateDir runtimeDir logDir tmpDir forceDisableUserChange processManager ids;
};
in
rec {
webapp = rec {
port = ids.webappPorts.webapp or 0;
dnsName = "webapp.local";
pkg = constructors.webapp {
inherit port;
};
requiresUniqueIdsFor = [ "webappPorts" "uids" "gids" ];
};
webapp2 = rec {
port = ids.webappPorts.webapp2 or 0;
dnsName = "webapp2.local";
pkg = constructors.webapp {
inherit port;
instanceSuffix = "2";
};
requiresUniqueIdsFor = [ "webappPorts" "uids" "gids" ];
};
nginx = rec {
port = ids.nginxPorts.nginx or 0;
pkg = sharedConstructors.nginxReverseProxyHostBased {
webapps = [ webapp webapp2 ];
inherit port;
} {};
requiresUniqueIdsFor = [ "nginxPorts" "uids" "gids" ];
};
}
</pre>
<br />
In the above processes model, a new process instance named: <i>webapp2</i> was added that listens on a unique port and can be reached with the <i>webapp2.local</i> virtual host value.<br />
<br />
By running the following command, the system in the container gets upgraded:<br />
<br />
<pre>
$ nixproc-supervisord-switch
</pre>
<br />
resulting in two <i>webapp</i> process instances running in the container:<br />
<br />
<pre>
$ supervisorctl
nginx RUNNING pid 847, uptime 0:00:08
webapp RUNNING pid 459, uptime 0:05:54
webapp2 RUNNING pid 846, uptime 0:00:08
supervisor>
</pre>
<br />
The first instance: <i>webapp</i> was left untouched, because its configuration was not changed.<br />
<br />
The second instance: <i>webapp2</i> can be reached as follows:<br />
<br />
<pre>
$ curl -H 'Host: webapp2.local' http://localhost:8080
<!DOCTYPE html>
<html>
<head>
<title>Simple test webapp</title>
</head>
<body>
Simple test webapp listening on port: 5001
</body>
</html>
</pre>
<br />
After upgrading the system, the new configuration should also get reactivated after a container restart.<br />
<br />
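We can verify this by restarting the container and querying the process manager again:<br />
<br />
<pre>
$ docker restart webapps
$ docker exec -it webapps supervisorctl status
</pre>
<br />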
<h2>A more interesting example: Hydra</h2>
<br />
As explained earlier, to create upgradable containers we require a fully functional Nix installation in a container. This observation made me think about a more interesting example than the trivial web application system.<br />
<br />
A prominent example of a system that requires Nix and is composed of multiple tightly integrated processes is <a href="https://sandervanderburg.blogspot.com/2013/04/setting-up-hydra-build-cluster-for.html">Hydra: the Nix-based continuous integration service</a>.<br />
<br />
To make it possible to deploy a minimal Hydra service in a container, I have packaged all its relevant components for the Nix process management framework.<br />
<br />
The processes model looks as follows:<br />
<br />
<pre style="overflow: auto;">
{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager
}:
let
nix-processmgmt = builtins.fetchGit {
url = https://github.com/svanderburg/nix-processmgmt.git;
ref = "master";
};
nix-processmgmt-services = builtins.fetchGit {
url = https://github.com/svanderburg/nix-processmgmt-services.git;
ref = "master";
};
constructors = import "${nix-processmgmt-services}/services-agnostic/constructors.nix" {
inherit nix-processmgmt pkgs stateDir runtimeDir logDir tmpDir cacheDir forceDisableUserChange processManager;
};
instanceSuffix = "";
hydraUser = hydraInstanceName;
hydraInstanceName = "hydra${instanceSuffix}";
hydraQueueRunnerUser = "hydra-queue-runner${instanceSuffix}";
hydraServerUser = "hydra-www${instanceSuffix}";
in
rec {
nix-daemon = {
pkg = constructors.nix-daemon;
};
postgresql = rec {
port = 5432;
postgresqlUsername = "postgresql";
postgresqlPassword = "postgresql";
socketFile = "${runtimeDir}/postgresql/.s.PGSQL.${toString port}";
pkg = constructors.simplePostgresql {
inherit port;
authentication = ''
# TYPE DATABASE USER ADDRESS METHOD
local hydra all ident map=hydra-users
'';
identMap = ''
# MAPNAME SYSTEM-USERNAME PG-USERNAME
hydra-users ${hydraUser} ${hydraUser}
hydra-users ${hydraQueueRunnerUser} ${hydraUser}
hydra-users ${hydraServerUser} ${hydraUser}
hydra-users root ${hydraUser}
# The postgres user is used to create the pg_trgm extension for the hydra database
hydra-users postgresql postgresql
'';
};
};
hydra-server = rec {
port = 3000;
hydraDatabase = hydraInstanceName;
hydraGroup = hydraInstanceName;
baseDir = "${stateDir}/lib/${hydraInstanceName}";
inherit hydraUser instanceSuffix;
pkg = constructors.hydra-server {
postgresqlDBMS = postgresql;
user = hydraServerUser;
inherit nix-daemon port instanceSuffix hydraInstanceName hydraDatabase hydraUser hydraGroup baseDir;
};
};
hydra-evaluator = {
pkg = constructors.hydra-evaluator {
inherit nix-daemon hydra-server;
};
};
hydra-queue-runner = {
pkg = constructors.hydra-queue-runner {
inherit nix-daemon hydra-server;
user = hydraQueueRunnerUser;
};
};
apache = {
pkg = constructors.reverseProxyApache {
dependency = hydra-server;
serverAdmin = "admin@localhost";
};
};
}
</pre>
<br />
In the above processes model, each process instance represents a component of a Hydra installation:<br />
<br />
<ul>
<li>The <i>nix-daemon</i> process is a service that comes with the Nix package manager to facilitate multi-user package installations. The <i>nix-daemon</i> carries out builds on behalf of a user.<br />
<br />
Hydra requires it to perform builds as an unprivileged Hydra user and uses the Nix protocol to more efficiently orchestrate large builds.</li>
<li>Hydra uses a PostgreSQL database backend to store data about projects and builds.<br />
<br />
The <i>postgresql</i> process refers to the PostgreSQL database management system (DBMS) that is configured in such a way that the Hydra components are authorized to manage and modify the Hydra database.</li>
<li><i>hydra-server</i> is the front-end of the Hydra service that provides a web user interface. The initialization procedure of this service is responsible for initializing the Hydra database.</li>
<li>The <i>hydra-evaluator</i> regularly updates the repository checkouts and evaluates the Nix expressions to decide which packages need to be built.</li>
<li>The <i>hydra-queue-runner</i> builds all jobs that were evaluated by the <i>hydra-evaluator</i>.</li>
<li>The <i>apache</i> server is used as a reverse proxy server forwarding requests to the <i>hydra-server</i>.</li>
</ul>
<br />
With the following commands, we can build the image, load it into Docker, and deploy a container that runs Hydra:<br />
<br />
<pre>
$ nix-build hydra-image.nix
$ docker load -i result
$ docker run -it --name hydra-test --network host hydra:test
</pre>
<br />
After deploying the system, we can connect to the container:<br />
<br />
<pre>
$ docker exec -it hydra-test /bin/bash
</pre>
<br />
and observe that all processes are running and managed by supervisord:<br />
<br />
<pre>
$ supervisorctl
apache RUNNING pid 1192, uptime 0:00:42
hydra-evaluator RUNNING pid 1297, uptime 0:00:38
hydra-queue-runner RUNNING pid 1296, uptime 0:00:38
hydra-server RUNNING pid 1188, uptime 0:00:42
nix-daemon RUNNING pid 1186, uptime 0:00:42
postgresql RUNNING pid 1187, uptime 0:00:42
supervisor>
</pre>
<br />
With the following commands, we can create our initial admin user:<br />
<br />
<pre>
$ su - hydra
$ hydra-create-user sander --password secret --role admin
creating new user `sander'
</pre>
<br />
We can connect to the Hydra front-end in a web browser by opening <i>http://localhost</i> (this works because the container uses host networking):<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgpuhkSMlLOhy1AcbHG7RogIbLKaRJdAqHUgRuQ5aqApYowc62xof6ILfqCVKnjOTTdlOZzmzl-A9uYuq3qaLOnceJYfnFi2q2DpnT8Sp9ToU3HPSoTXsFGsKD_h27khpsx3VAonYpG325E/s1145/hydraoverview.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="520" data-original-height="813" data-original-width="1145" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgpuhkSMlLOhy1AcbHG7RogIbLKaRJdAqHUgRuQ5aqApYowc62xof6ILfqCVKnjOTTdlOZzmzl-A9uYuq3qaLOnceJYfnFi2q2DpnT8Sp9ToU3HPSoTXsFGsKD_h27khpsx3VAonYpG325E/s600/hydraoverview.png"/></a></div>
<br />
and configure a jobset to build a project, such as <a href="https://sandervanderburg.blogspot.com/2017/01/some-programming-patterns-for-multi.html">libprocreact</a>:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgtJ4MIWORvF7UPB-mb3jbWuP0S5ozmMiD_q3-pxsp1wfe1is_kY237U2y2Phcarxxv_-gaXkFy6imFzkXm2u7BUqOWjbWThob29n-ZFGLEFFUNeLCNl3BTfv3jlvPgDVAL7OEElLzonwoy/s1145/hydrajobset.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="520" data-original-height="813" data-original-width="1145" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgtJ4MIWORvF7UPB-mb3jbWuP0S5ozmMiD_q3-pxsp1wfe1is_kY237U2y2Phcarxxv_-gaXkFy6imFzkXm2u7BUqOWjbWThob29n-ZFGLEFFUNeLCNl3BTfv3jlvPgDVAL7OEElLzonwoy/s600/hydrajobset.png"/></a></div>
<br />
Another nice bonus feature of having multiple process managers supported is that if we build Hydra's <a href="https://sandervanderburg.blogspot.com/2020/06/using-disnix-as-simple-and-minimalistic.html">Nix process management configuration for Disnix</a>, we can also visualize the deployment architecture of the system with <i>disnix-visualize</i>:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgPtseJq_wT5JC2j7Hu5eg8Boc4jqubuad3e_oAYnnpq6XPvwWPByAzaKCHmfu7jYE40pIsihuDWibnibM30NXar5BmLAsyOqOsuFBbs63wAW0qj7vBKCIwLyRFq6i2vOTdmfBI9Y523JCl/s0/deploymentarch.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="520" data-original-height="376" data-original-width="736" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgPtseJq_wT5JC2j7Hu5eg8Boc4jqubuad3e_oAYnnpq6XPvwWPByAzaKCHmfu7jYE40pIsihuDWibnibM30NXar5BmLAsyOqOsuFBbs63wAW0qj7vBKCIwLyRFq6i2vOTdmfBI9Y523JCl/s0/deploymentarch.png"/></a></div>
<br />
The above diagram displays the following properties:<br />
<br />
<ul>
<li>The outer box indicates that we are deploying to a single machine: <i>localhost</i></li>
<li>The inner box indicates that all components are managed as processes</li>
<li>The ovals correspond to process instances in the processes model and the arrows denote dependency relationships.<br />
<br />
For example, the <i>apache</i> reverse proxy has a dependency on <i>hydra-server</i>, meaning that the latter process instance should be deployed first, otherwise the reverse proxy is not able to forward requests to it.</li>
</ul>
<br />
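As a sketch (the manifest file parameter is hypothetical), such a visualization can be produced by feeding the output of <i>disnix-visualize</i> to the Graphviz <i>dot</i> tool:<br />
<br />
<pre>
$ disnix-visualize manifest > deployment.dot
$ dot -Tpng deployment.dot > deployment.png
</pre>
<br />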
<h2>Building a Nix-enabled container image</h2>
<br />
As explained in the previous section, mutable Docker images require a fully functional Nix package manager in the container.<br />
<br />
Since this may also be an interesting sub use case, I have created a convenience function: <i>createNixImage</i> that can be used to build an image whose only purpose is to provide a working Nix installation:<br />
<br />
<pre style="overflow: auto;">
let
pkgs = import <nixpkgs> {};
nix-processmgmt = builtins.fetchGit {
url = https://github.com/svanderburg/nix-processmgmt.git;
ref = "master";
};
createNixImage = import "${nix-processmgmt}/nixproc/create-image-from-steps/create-nix-image.nix" {
inherit pkgs;
};
in
createNixImage {
name = "foobar";
tag = "test";
contents = [ pkgs.mc ];
}
</pre>
<br />
The above Nix expression builds a Docker image with a working Nix setup and a custom package: the <a href="http://midnight-commander.org">Midnight Commander</a>.<br />
<br />
<h2>Conclusions</h2>
<br />
In this blog post, I have described a new function in the Nix process management framework: <i>createMutableMultiProcessImage</i> that creates reproducible mutable multi-process container images, by combining the reproducibility properties of Docker and Nix. With the exception of the process manager, process instances in a container can be upgraded without bringing the entire container down.<br />
<br />
With this new functionality, the deployment workflow of a multi-process container configuration has become very similar to how physical and virtual machines are managed with <a href="https://sandervanderburg.blogspot.com/2011/01/nixos-purely-functional-linux.html">NixOS</a> -- you can edit a declarative specification of a system and run a single command-line instruction to deploy the new configuration.<br />
<br />
Moreover, this new functionality allows us to deploy a complex, tightly coupled multi-process system, such as Hydra: the Nix-based continuous integration service. In the Hydra example case, we are using Nix for three deployment aspects: constructing the Docker image, deploying the multi-process system configuration and building the projects that are configured in Hydra.<br />
<br />
A big drawback of mutable multi-process images is that no sharing is possible between multiple multi-process containers. Since the images are not built from common layers, the Nix store is private to each container and all packages are deployed in the writable custom layer, which may lead to substantial disk and RAM overhead per container instance.<br />
<br />
Deploying the processes model to a container instance can probably be made more convenient by using <a href="https://nixos.wiki/wiki/Flakes">Nix flakes</a> -- a new Nix feature that is still experimental. With flakes we can easily deploy an arbitrary number of Nix expressions to a container and pin the deployment to a specific version of Nixpkgs.<br />
<br />
Another interesting observation is the word: mutable. I am not completely sure if it is appropriate -- both the layers of a Docker image, as well as the Nix store paths are immutable and never change after they have been built. For both solutions, immutability is an important ingredient in making sure that a deployment is reproducible.<br />
<br />
I have decided to still call these deployments mutable, because I am looking at the problem from a Docker perspective -- the writable layer of the container (that is mounted on top of the immutable layers of an image) is modified each time that we upgrade a system.<br />
<br />
<h2>Future work</h2>
<br />
Although I am quite happy with the ability to create mutable multi-process containers, there is still quite a bit of work that needs to be done to make the Nix process management framework more usable.<br />
<br />
Most importantly, trying to deploy Hydra revealed all kinds of regressions in the framework. To cope with all these breaking changes, a structured testing approach is required. Currently, such an approach is completely absent.<br />
<br />
I could also (in theory) automate the still missing parts of Hydra. For example, I have not automated the process that updates the garbage collector roots, which needs to run in a timely manner. To solve this, I need to use a <i>cron</i> service or systemd timer units, which is beyond the scope of my experiment.<br />
<br />
<h2>Availability</h2>
<br />
The <i>createMutableMultiProcessImage</i> function is part of the experimental <a href="https://github.com/svanderburg/nix-processmgmt">Nix process management framework GitHub repository</a> that is still under heavy development.<br />
<br />
Because the number of services that can be deployed with the framework has grown considerably, I have moved all non-essential services (not required for testing) into a <a href="https://github.com/svanderburg/nix-processmgmt-services">separate repository</a>. The Hydra constructor functions can be found in this repository as well.<br />Sander van der Burghttp://www.blogger.com/profile/12718166966821611609noreply@blogger.com0tag:blogger.com,1999:blog-1397115249631682228.post-39705574089553686212021-02-01T22:29:00.001+01:002021-02-24T11:44:51.566+01:00Developing an s6-rc backend for the Nix process management frameworkOne of my major blog topics last year was <a href="https://sandervanderburg.blogspot.com/2020/02/a-declarative-process-manager-agnostic.html">my experimental Nix process management framework</a>, which is still under heavy development.<br />
<br />
As explained in many of my earlier blog posts, one of its major objectives is to facilitate <strong>high-level deployment specifications</strong> of running processes that can be translated to configurations for all kinds of process managers and deployment solutions.<br />
<br />
The backends that I have implemented so far were picked for the following reasons:<br />
<br />
<ul>
<li><strong>Multiple operating systems support</strong>. The most common process management service was chosen for each operating system: On Linux, <i>sysvinit</i> (because this used to be the most common solution) and <i>systemd</i> (because it is used by many conventional Linux distributions today), <i>bsdrc</i> on FreeBSD, <i>launchd</i> for macOS, and <i>cygrunsrv</i> for Cygwin.</li>
<li><strong>Supporting unprivileged user deployments</strong>. To supervise processes without requiring a service that runs on PID 1, that also works for unprivileged users, <i>supervisord</i> is very convenient because it was specifically designed for this purpose.</li>
<li><a href="https://sandervanderburg.blogspot.com/2020/08/experimenting-with-nix-and-service.html"><strong>Docker</strong> was selected</a> because it is a very popular solution for managing services, and process management is one of its sub responsibilities.</li>
<li><strong>Universal process management</strong>. <a href="https://sandervanderburg.blogspot.com/2020/06/using-disnix-as-simple-and-minimalistic.html">Disnix was selected</a> because it can be used as a primitive process management solution that works on any operating system supported by the Nix package manager. Moreover, the Disnix services model is a super set of the processes model used by the process management framework.</li>
</ul>
<br />
Not long after writing my blog post about the process manager-agnostic abstraction layer, somebody opened <a href="https://github.com/svanderburg/nix-processmgmt/issues/1">an issue on GitHub</a> with the suggestion to also support <i>s6-rc</i>. Although I was already aware that more process/service management solutions exist, <i>s6-rc</i> was a solution that I did not know about.<br />
<br />
Recently, I have implemented the suggested <i>s6-rc</i> backend. Although deploying <i>s6-rc</i> services now works quite conveniently, getting to know <i>s6-rc</i> and its companion tools was somewhat challenging for me.<br />
<br />
In this blog post, I will elaborate about my learning experiences and explain how the <i>s6-rc</i> backend was implemented.<br />
<br />
<h2>The s6 tool suite</h2>
<br />
<a href="https://skarnet.org/software/s6-rc"><i>s6-rc</i></a> is a software projected published on <a href="https://skarnet.org">skarnet</a> and part of a bigger <a href="https://skarnet.org/software">tool ecosystem</a>. <i>s6-rc</i> is a companion tool of <a href="https://skarnet.org/software/s6">s6</a>: skarnet.org's small & secure supervision software suite.<br />
<br />
On Linux and many other UNIX-like systems, the initialization process (typically <i>/sbin/init</i>) is a <strong>highly critical</strong> program:<br />
<br />
<ul>
<li>It is the first program loaded by the kernel and responsible for setting the remainder of the boot procedure in motion. This procedure is responsible for mounting additional file systems, loading device drivers, and starting essential system services, such as SSH and logging services.</li>
<li>The PID 1 process supervises all processes that were directly loaded by it, as well as indirect child processes that get orphaned -- when this happens they get automatically adopted by the process that runs as PID 1.<br />
<br />
As explained in <a href="https://sandervanderburg.blogspot.com/2020/01/writing-well-behaving-daemon-in-c.html">an earlier blog post</a>, traditional UNIX services that daemonize on their own, deliberately orphan themselves so that they remain running in the background.</li>
<li>When a child process terminates, the parent process must take notice or the terminated process will stay behind as a zombie process.<br />
<br />
Because the PID 1 process is the common ancestor of all other processes, it is required to automatically reap all relevant zombie processes that become a child of it.</li>
<li>The PID 1 process runs with root privileges and, as a result, has full access to the system. When the security of the PID 1 process gets compromised, the entire system is at risk.</li>
<li>If the PID 1 process crashes, the kernel crashes (and hence the entire system) with a kernel panic.</li>
</ul>
<br />
There are many kinds of programs that you can use as a system's PID 1. For example, you can directly use a shell, such as <i>bash</i>, but it is far more common to use an init system, such as <a href="https://savannah.nongnu.org/projects/sysvinit"><i>sysvinit</i></a> or <a href="https://www.freedesktop.org/wiki/Software/systemd"><i>systemd</i></a>.<br />
<br />
According to the author of <i>s6</i>, <a href="https://skarnet.org/software/s6-linux-init/why.html">an init system is made out of four parts</a>:<br />
<br />
<blockquote>
<ol>
<li><i>/sbin/init</i>: the first userspace program that is run by the kernel at boot time (not counting an initramfs).</li>
<li>pid 1: the program that will run as process 1 for most of the lifetime of the machine. This is not necessarily the same executable as <i>/sbin/init</i>, because <i>/sbin/init</i> can exec into something else.</li>
<li>a process supervisor.</li>
<li>a service manager.</li>
</ol>
</blockquote>
<br />
In the <i>s6</i> tool eco-system, most of these parts are implemented by separate tools:<br />
<br />
<ul>
<li>The first userspace program: <i>s6-linux-init</i> takes care of the coordination of the initialization process. It does a variety of one-time boot things: for example, it traps the ctrl-alt-del keyboard combination, it starts the shutdown daemon (that is responsible for eventually shutting down the system), and runs the initial boot script (<i>rc.init</i>).<br />
<br />
(As a sidenote: this is almost true -- the <i>/sbin/init</i> process is a wrapper script that "execs" into <i>s6-linux-init</i> with the appropriate parameters).</li>
<li>When the initialization is done, <i>s6-linux-init</i> execs into a process called <i>s6-svscan</i> provided by the <i>s6</i> toolset. <i>s6-svscan</i>'s task is to supervise an entire process supervision tree, which I will explain later.</li>
<li>Starting and stopping services is done by a separate service manager started from the <i>rc.init</i> script. <i>s6-rc</i> is the most prominent option (that we will use in this blog post), but also other tools can be used.</li>
</ul>
<br />
Many conventional init systems, implement most (or sometimes all) of these aspects in a single executable.<br />
<br />
In particular, the <i>s6</i> author is highly critical of systemd: the init system that is widely used by many conventional Linux distributions today -- he dedicated <a href="https://skarnet.org/software/systemd.html">an entire page with criticisms about it</a>.<br />
<br />
The author of <i>s6</i> advocates a number of design principles for his tool eco-system (that systemd violates in many ways):<br />
<br />
<ul>
<li>The Unix philosophy: do one job and do it well.</li>
<li>Doing less instead of more (preventing feature creep).</li>
<li>Keeping tight quality control over every tool by opening up repository access only to small teams (or rather a single person).</li>
<li>Integration support: he is against the <a href="http://www.catb.org/esr/writings/cathedral-bazaar">bazaar</a> approach on project level, but in favor of the bazaar approach on an eco-system level in which everybody can write their own tools that integrate with existing tools.</li>
</ul>
<br />
The concepts implemented by the <i>s6</i> tool suite were not completely "invented" from scratch. <a href="http://cr.yp.to/daemontools.html">daemontools</a> is what the author considers the ancestor of s6 (if you look at the web page then you will notice that the concept of a "supervision tree" was pioneered there and that some of the tools listed resemble the same tools in the <i>s6</i> tool suite), and <a href="http://smarden.org/runit">runit</a> its cousin (that is also heavily inspired by daemontools).<br />
<br />
<h2>A basic usage scenario of s6 and s6-rc</h2>
<br />
Although it is possible to use Linux distributions in which the init system, supervisor and service manager are all provided by skarnet tools, a subset of <i>s6</i> and <i>s6-rc</i> can also be used on any Linux distribution and other supported operating systems, such as the BSDs.<br />
<br />
Root privileges are not required to experiment with these tools.<br />
<br />
For example, with the following command we can use the Nix package manager to deploy the <i>s6</i> supervision toolset in a development shell session:<br />
<br />
<pre>
$ nix-shell -p s6
</pre>
<br />
In this development shell session, we can start the <i>s6-svscan</i> service as follows:<br />
<br />
<pre>
$ mkdir -p $HOME/var/run/service
$ s6-svscan $HOME/var/run/service
</pre>
<br />
<i>s6-svscan</i> is a service that supervises an entire process supervision tree, including processes that may accidentally become its children, such as orphaned processes.<br />
<br />
The directory parameter is a <strong>scan directory</strong> that maintains the configurations of the processes that are currently supervised. So far, no supervised processes have been deployed yet.<br />
<br />
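Before deploying anything with <i>s6-rc</i>, we can briefly illustrate what <i>s6-svscan</i> does with plain <i>s6</i> only. The following commands are a minimal sketch (the <i>sleeper</i> service is just an arbitrary example, not part of the example system in this blog post) -- they create a service directory with an executable <i>run</i> script inside the scan directory, instruct <i>s6-svscan</i> to rescan, and inspect the supervised process:<br />
<br />
<pre>
$ mkdir -p $HOME/var/run/service/sleeper
$ cat > $HOME/var/run/service/sleeper/run <<EOF
#!/bin/sh
exec sleep 3600
EOF
$ chmod +x $HOME/var/run/service/sleeper/run
$ s6-svscanctl -a $HOME/var/run/service
$ s6-svstat $HOME/var/run/service/sleeper
</pre>
<br />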
We can actually deploy services by using the <i>s6-rc</i> toolset.<br />
<br />
For example, I can easily configure <a href="https://sandervanderburg.blogspot.com/2020/02/a-declarative-process-manager-agnostic.html">my trivial example system</a> used in previous blog posts that consists of one or multiple web application processes (with an embedded HTTP server) returning static HTML pages and an Nginx reverse proxy that forwards requests to one of the web application processes based on the appropriate virtual host header.<br />
<br />
Contrary to the other process management solutions that I have investigated earlier, <i>s6-rc</i> does not have an elaborate configuration language. It does not implement a parser (<a href="https://skarnet.org/software/s6-rc/faq.html">for very good reasons as explained by the author</a>, because it introduces extra complexity and bugs).<br />
<br />
Instead, you have to create directories with text files, in which each file represents a configuration property.<br />
<br />
With the following command, I can spawn a development shell with all the required utilities to work with <i>s6-rc</i>:<br />
<br />
<pre>
$ nix-shell -p s6 s6-rc execline
</pre>
<br />
The following shell commands create an <i>s6-rc</i> service configuration directory and a configuration for a single <i>webapp</i> process instance:<br />
<br />
<pre>
$ mkdir -p sv/webapp
$ cd sv/webapp
$ echo "longrun" > type
$ cat > run <<EOF
#!$(type -p execlineb) -P
envfile $HOME/envfile
exec $HOME/webapp/bin/webapp
EOF
</pre>
<br />
The above shell script creates a configuration directory for a service named: <i>webapp</i> with the following properties:<br />
<br />
<ul>
<li>It creates a service with <strong>type</strong>: <i>longrun</i>. A <i>longrun</i> service deploys a process that runs in the foreground and that will get supervised by <i>s6</i>.</li>
<li>The <i>run</i> file refers to an <strong>executable</strong> that <strong>starts</strong> the service. For <i>s6-rc</i> services it is common practice to implement wrapper scripts using <a href="https://skarnet.org/software/execline/"><i>execline</i></a>: a non-interactive scripting language.<br />
<br />
The execline script shown above loads an environment variable config file with the following content: <i>PORT=5000</i>. This environment variable configures the TCP port number to which the service should bind. The script then "execs" into a new process that runs the <i>webapp</i> process. (A sketch showing how to create this environment file follows after this list.)<br />
<br />
(As a sidenote: although it is a common habit to use <i>execline</i> for writing wrapper scripts, this is not a hard requirement -- any executable implemented in any language can be used. For example, we could also write the above <i>run</i> wrapper script as a bash script).</li>
</ul>
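<br />
The following command creates the environment variable config file that the <i>run</i> script shown above consumes (a minimal sketch; the <i>PORT=5000</i> value matches the port number used throughout this blog post):<br />
<br />
<pre>
$ cat > $HOME/envfile <<EOF
PORT=5000
EOF
</pre>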
<br />
We can also configure the Nginx reverse proxy service in a similar way:<br />
<br />
<pre style="overflow: auto;">
$ mkdir -p ../nginx
$ cd ../nginx
$ echo "longrun" > type
$ echo "webapp" > dependencies
$ cat > run <<EOF
#!$(type -p execlineb) -P
foreground { mkdir -p $HOME/var/nginx/logs $HOME/var/cache/nginx }
exec $(type -p nginx) "-p" "$HOME/var/nginx" "-c" "$HOME/nginx/nginx.conf" "-g" "daemon off;"
EOF
</pre>
<br />
The above shell script creates a configuration directory for a service named: <i>nginx</i> with the following properties:<br />
<br />
<ul>
<li>It again creates a service of <strong>type</strong>: <i>longrun</i> because Nginx should be started as a foreground process.</li>
<li>It declares the <i>webapp</i> service (that we have configured earlier) a <strong>dependency</strong>, ensuring that <i>webapp</i> is started before <i>nginx</i>. This dependency relationship is important to prevent Nginx from forwarding requests to a service that is not available yet.</li>
<li>The <i>run</i> script first creates all mandatory state directories and finally execs into the Nginx process, passing it a configuration file that uses the above state directories and turning off daemon mode so that it runs in the foreground. (A sketch of such a configuration file is shown after this list.)</li>
</ul>
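<br />
The <i>run</i> script shown above assumes the presence of a configuration file: <i>$HOME/nginx/nginx.conf</i>. The fragment below is a hypothetical minimal sketch of such a virtual host-based configuration (in my actual example system, this file is generated by the Nix process management framework):<br />
<br />
<pre>
events {}

http {
  server {
    listen 8080;
    server_name webapp.local;

    location / {
      # forward requests for webapp.local to the webapp instance
      proxy_pass http://localhost:5000;
    }
  }
}
</pre>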
<br />
In addition to configuring the above services, we also want to deploy the system as a whole. This can be done by creating <strong>bundles</strong> that encapsulate collections of services:<br />
<br />
<pre>
$ mkdir -p ../default
$ cd ../default
$ echo "bundle" > type
$ cat > contents <<EOF
webapp
nginx
EOF
</pre>
<br />
The above shell instructions create a bundle named: <strong>default</strong>, referring to both the <i>webapp</i> service and the <i>nginx</i> reverse proxy service that we have configured earlier.<br />
<br />
Our <i>s6-rc</i> configuration directory structure looks as follows:<br />
<br />
<pre>
$ find ./sv
./sv
./sv/default
./sv/default/contents
./sv/default/type
./sv/nginx
./sv/nginx/dependencies
./sv/nginx/run
./sv/nginx/type
./sv/webapp
./sv/webapp/run
./sv/webapp/type
</pre>
<br />
If we want to deploy the service directory structure shown above, we first need to <strong>compile</strong> it into a <strong>configuration database</strong>. This can be done with the following command:<br />
<br />
<pre>
$ mkdir -p $HOME/etc/s6/rc
$ s6-rc-compile $HOME/etc/s6/rc/compiled-1 $HOME/sv
</pre>
<br />
The above command compiles the service configurations in: <i>$HOME/sv</i> into a configuration database stored in: <i>$HOME/etc/s6/rc/compiled-1</i>.<br />
<br />
With the following command we can <strong>initialize</strong> the <i>s6-rc</i> system with our compiled configuration database:<br />
<br />
<pre>
$ s6-rc-init -c $HOME/etc/s6/rc/compiled-1 -l $HOME/var/run/s6-rc \
$HOME/var/run/service
</pre>
<br />
The above command generates a "live directory" in: <i>$HOME/var/run/s6-rc</i> containing the state of <i>s6-rc</i>.<br />
<br />
With the following command, we can start all services in the: <i>default</i> bundle:<br />
<br />
<pre>
$ s6-rc -l $HOME/var/run/s6-rc -u change default
</pre>
<br />
The above command deploys a running system with the following process tree:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhbsi4UW5ur5iZsFUat6v0nhC4HA5FQVk-evrugeJ74i9U3wASJ_kdsTBLAFTVmWO6fvU4hT2zdnSCGbGGMk676MrIrmWGdYBuNWzbQywBAb4nDqjhg19WbMLfNIb_A_H2pe4V0QuJ-dgmC/s0/processtree.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" data-original-height="258" data-original-width="302" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhbsi4UW5ur5iZsFUat6v0nhC4HA5FQVk-evrugeJ74i9U3wASJ_kdsTBLAFTVmWO6fvU4hT2zdnSCGbGGMk676MrIrmWGdYBuNWzbQywBAb4nDqjhg19WbMLfNIb_A_H2pe4V0QuJ-dgmC/s0/processtree.png"/></a></div>
<br />
As can be seen in the diagram above, the entire process tree is supervised by <i>s6-svscan</i> (the program that we have started first). Every <i>longrun</i> service deployed by <i>s6-rc</i> is supervised by a process named: <i>s6-supervise</i>.<br />
<br />
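The same <i>s6-rc</i> tool can also be used to inspect and revert the deployment. For example, the following commands list all services that are currently active and bring the entire <i>default</i> bundle down again:<br />
<br />
<pre>
$ s6-rc -l $HOME/var/run/s6-rc -a list
$ s6-rc -l $HOME/var/run/s6-rc -d change default
</pre>
<br />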
<h2>Managing service logging</h2>
<br />
Another important property of <i>s6</i> and <i>s6-rc</i> is the way they handle logging. By default, all output that the supervised processes produce on the standard output and standard error is captured by <i>s6-svscan</i> and written to a single log stream (in our case, it will be redirected to the terminal).<br />
<br />
When it is desired to capture the output of a service into its own dedicated log file, you need to configure the service in such a way that it writes all relevant information to a pipe. A companion <strong>logging service</strong> is required to capture the data that is sent over the pipe.<br />
<br />
The following command-line instructions modify the <i>webapp</i> service (that we have created earlier) to let it send its output to another service:<br />
<br />
<pre>
$ cd sv
$ mv webapp webapp-srv
$ cd webapp-srv
$ echo "webapp-log" > producer-for
$ cat > run <<EOF
#!$(type -p execlineb) -P
envfile $HOME/envfile
fdmove -c 2 1
exec $HOME/webapp/bin/webapp
EOF
</pre>
<br />
In the script above, we have changed the <i>webapp</i> service configuration as follows:<br />
<br />
<ul>
<li>We <strong>rename</strong> the service from: <i>webapp</i> to <i>webapp-srv</i>. Using such suffixes is a common convention for <i>s6-rc</i> services that also have a log companion service.</li>
<li>With the <i>producer-for</i> property, we specify that <i>webapp-srv</i> is a service that <strong>produces</strong> output for another service named: <i>webapp-log</i>. We will configure this service later.</li>
<li>We create a new <i>run</i> script that <strong>adds</strong> the following command: <i>fdmove -c 2 1</i>.<br />
<br />
The purpose of this added instruction is to redirect all output that is sent over the standard error (file descriptor: 2) to the standard output (file descriptor: 1). This redirection makes it possible for the log companion service to capture all data.</li>
</ul>
<br />
We can configure the log companion service: <i>webapp-log</i> with the following command-line instructions:<br />
<br />
<pre>
$ mkdir ../webapp-log
$ cd ../webapp-log
$ echo "longrun" > type
$ echo "webapp-srv" > consumer-for
$ echo "webapp" > pipeline-name
$ echo 3 > notification-fd
$ cat > run <<EOF
#!$(type -p execlineb) -P
foreground { mkdir -p $HOME/var/log/s6-log/webapp }
exec -c s6-log -d3 $HOME/var/log/s6-log/webapp
EOF
</pre>
<br />
The service configuration created above does the following:<br />
<br />
<ul>
<li>We create a service named: <i>webapp-log</i> that is a <strong>long running</strong> service.</li>
<li>We declare the service to be a <strong>consumer</strong> for the <i>webapp-srv</i> (earlier, we have already declared the companion service: <i>webapp-srv</i> to be a producer for this logging service).</li>
<li>We configure a <strong>pipeline name</strong>: <i>webapp</i>, causing <i>s6-rc</i> to automatically generate a bundle named: <i>webapp</i> that has all involved services as its contents.<br />
<br />
This generated bundle allows us to always manage the service and logging companion as a single deployment unit.</li>
<li>The <i>s6-log</i> service supports <strong>readiness notifications</strong>. The <i>notification-fd</i> file configures file descriptor: <i>3</i> as the channel over which this notification is sent.</li>
<li>The <i>run</i> script creates the log directory in which the output should be stored and starts the <i>s6-log</i> service to capture the output and store the data in the corresponding log directory.<br />
<br />
The <i>-d3</i> parameter instructs it to send a readiness notification over file descriptor 3.</li>
</ul>
<br />
After modifying the configuration files in such a way that each <i>longrun</i> service has a logging companion, we need to compile a new database that provides <i>s6-rc</i> with our new configuration:<br />
<br />
<pre>
$ s6-rc-compile $HOME/etc/s6/rc/compiled-2 $HOME/sv
</pre>
<br />
The above command creates a database with a new filename in: <i>$HOME/etc/s6/rc/compiled-2</i>. We are required to give it a new name -- the old configuration database (<i>compiled-1</i>) must be retained to make the upgrade process work.<br />
<br />
With the following command, we can upgrade our running configuration:<br />
<br />
<pre>
$ s6-rc-update -l $HOME/var/run/s6-rc $HOME/etc/s6/rc/compiled-2
</pre>
<br />
The result is the following process supervision tree:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi28k1GST-YA4GCCb2Za-N2obS-gVvTAdx-VV7FEnztnXecrKlSPJl5yUqkU4H-cvJkJYMMB8tePImIQDb43IEi1Ak8Iusk9nauAsro3mIOSxUgbEX1uUOAmyf-I9dgf4vrE4EnzHBWfbUL/s0/processtreewithloggers.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" data-original-height="271" data-original-width="536" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi28k1GST-YA4GCCb2Za-N2obS-gVvTAdx-VV7FEnztnXecrKlSPJl5yUqkU4H-cvJkJYMMB8tePImIQDb43IEi1Ak8Iusk9nauAsro3mIOSxUgbEX1uUOAmyf-I9dgf4vrE4EnzHBWfbUL/s0/processtreewithloggers.png"/></a></div>
<br />
As you may observe by looking at the diagram above, every service has a companion <i>s6-log</i> service that is responsible for capturing and storing its output.<br />
<br />
The log files of the services can be found in <i>$HOME/var/log/s6-log/webapp</i> and <i>$HOME/var/log/s6-log/nginx</i>.<br />
<br />
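<i>s6-log</i> appends the most recent output to a file named: <i>current</i> in its log directory (and automatically rotates it). For example, we can follow the log of the <i>webapp</i> service as follows:<br />
<br />
<pre>
$ tail -f $HOME/var/log/s6-log/webapp/current
</pre>
<br />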
<h2>One shot services</h2>
<br />
In addition to <i>longrun</i> services, that are useful for managing system services, a boot process involves more aspects that need to be automated, such as mounting file systems.<br />
<br />
These kinds of tasks can be automated with <i>oneshot</i> services, which execute an <i>up</i> script on startup, and optionally, a <i>down</i> script on shutdown.<br />
<br />
The following service configuration can be used to mount the kernel's <i>/proc</i> filesystem:<br />
<br />
<pre>
$ mkdir -p ../mount-proc
$ cd ../mount-proc
$ echo "oneshot" > type
$ cat > up <<EOF
foreground { mount -t proc proc /proc }
EOF
</pre>
<br />
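Optionally, we can also provide a <i>down</i> script that reverts the work on shutdown. A minimal sketch that unmounts the filesystem again looks as follows:<br />
<br />
<pre>
$ cat > down <<EOF
foreground { umount /proc }
EOF
</pre>
<br />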
<h2>Chain loading</h2>
<br />
The <i>execline</i> scripts shown in this blog post resemble shell scripts in many ways. One particular aspect that sets execline scripts apart from shell scripts is that all commands make intensive use of a concept called <strong><a href="https://en.wikipedia.org/wiki/Chain_loading">chain loading</a></strong>.<br />
<br />
Every instruction in an execline script executes a task, may imperatively modify the environment (e.g. by changing environment variables or the current working directory) and then "execs" into a new chain loading task.<br />
<br />
The last parameter of each command-line instruction refers to the command-line instruction that it needs to "exec" into -- typically, this next instruction is put on the next line.<br />
<br />
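To make this idea more tangible, consider the following standalone execline script (a minimal sketch for illustration purposes only; the shebang path may differ per system -- on NixOS, for example, it would point into the Nix store):<br />
<br />
<pre>
#!/usr/bin/execlineb -P

# cd changes the current working directory, then execs into the rest
cd /tmp

# export adds the GREETING variable to the environment, then execs into the rest
export GREETING hello

# the chain ends by exec-ing into an ordinary program
sh -c "echo $GREETING from directory: $(pwd)"
</pre>
<br />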
The <i>execline</i> package, as well as many packages in the <i>s6</i> ecosystem, contains many programs that support chain loading.<br />
<br />
It is also possible to implement custom chain loaders that follow the same protocol.<br />
<br />
<h2>Developing s6-rc function abstractions for the Nix process management framework</h2>
<br />
In the Nix process management framework, I have added function abstractions for each <i>s6-rc</i> service type: <i>longrun</i>, <i>oneshot</i> and <i>bundle</i>.<br />
<br />
For example, with the following Nix expression we can generate an <i>s6-rc</i> <i>longrun</i> configuration for the <i>webapp</i> process:<br />
<br />
<pre>
{createLongRunService, writeTextFile, execline, webapp}:

let
  envFile = writeTextFile {
    name = "envfile";
    text = ''
      PORT=5000
    '';
  };
in
createLongRunService {
  name = "webapp";
  run = writeTextFile {
    name = "run";
    executable = true;
    text = ''
      #!${execline}/bin/execlineb -P
      envfile ${envFile}
      fdmove -c 2 1
      exec ${webapp}/bin/webapp
    '';
  };
  autoGenerateLogService = true;
}
</pre>
<br />
Evaluating the Nix expression above does the following:<br />
<br />
<ul>
<li>It generates a service directory that corresponds to the: <i>name</i> parameter with a <i>longrun</i> <i>type</i> property file.</li>
<li>It generates a <i>run</i> execline script, that uses a generated <i>envFile</i> for configuring the service's port number, redirects the standard error to the standard output and starts the <i>webapp</i> process (that runs in the foreground).</li>
<li>The <i>autoGenerateLogService</i> parameter is a concept I introduced myself, to conveniently configure a companion log service, because this is a very common operation -- I cannot think of any scenario in which you do not want to have a dedicated log file for a long running service.<br />
<br />
Enabling this option causes the service to automatically become a producer for the log companion service (having the same name with a <i>-log</i> suffix) and automatically configures a logging companion service that consumes from it.</li>
</ul>
<br />
In addition to constructing long run services from Nix expressions, there are also abstraction functions to create one shots: <i>createOneShotService</i> and bundles: <i>createServiceBundle</i>.<br />
<br />
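For example, a bundle equivalent to the <i>default</i> bundle that we have created manually earlier could be composed as follows (a hypothetical sketch based on the <i>longrun</i> example above -- the exact parameter names of these functions may differ):<br />
<br />
<pre>
{createServiceBundle, webapp, nginx}:

createServiceBundle {
  name = "default";
  contents = [ webapp nginx ];
}
</pre>
<br />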
The function that generates a log companion service can also be directly invoked with: <i>createLogServiceForLongRunService</i>, if desired.<br />
<br />
<h2>Generating an s6-rc service configuration from a process manager-agnostic configuration</h2>
<br />
The following Nix expression is a process manager-agnostic configuration for the <i>webapp</i> service, that can be translated to a configuration for any supported process manager in the Nix process management framework:<br />
<br />
<pre>
{createManagedProcess, tmpDir}:
{port, instanceSuffix ? "", instanceName ? "webapp${instanceSuffix}"}:

let
  webapp = import ../../webapp;
in
createManagedProcess {
  name = instanceName;
  description = "Simple web application";
  inherit instanceName;

  process = "${webapp}/bin/webapp";
  daemonArgs = [ "-D" ];

  environment = {
    PORT = port;
  };

  overrides = {
    sysvinit = {
      runlevels = [ 3 4 5 ];
    };
  };
}
</pre>
<br />
The Nix expression above specifies the following high-level configuration concepts:<br />
<br />
<ul>
<li>The <i>name</i> and <i>description</i> attributes are just metadata. The <i>description</i> property is ignored by the <i>s6-rc</i> generator, because <i>s6-rc</i> has no equivalent configuration property for capturing a description.</li>
<li>A process manager-agnostic configuration can specify both how the service can be started as a <strong>foreground process</strong> or as a process that <strong>daemonizes</strong> itself.<br />
<br />
In the above example, the <i>process</i> attribute specifies that the same executable needs to be invoked for both a <i>foregroundProcess</i> and a <i>daemon</i>. The <i>daemonArgs</i> parameter specifies the command-line arguments that need to be propagated to the executable to let it daemonize itself.<br />
<br />
<i>s6-rc</i> has a preference for managing foreground processes, because these can be more reliably managed. When a <i>foregroundProcess</i> executable can be inferred, the generator will automatically compose a <i>longrun</i> service making it possible for <i>s6</i> to supervise it.<br />
<br />
If only a <i>daemon</i> can be inferred, the generator will compose a <i>oneshot</i> service that starts the daemon with the <i>up</i> script, and on shutdown, terminates the daemon by dereferencing the PID file in the <i>down</i> script.
</li>
<li>The <i>environment</i> attribute set parameter is automatically translated to an <i>envfile</i> that the generated <i>run</i> script consumes.</li>
<li>Similar to the <i>sysvinit</i> backend, it is also possible to override the generated arguments for the <i>s6-rc</i> backend, if desired.</li>
</ul>
<br />
As already explained in the blog post that covers the framework's concepts, the Nix expression above needs to be complemented with a <strong>constructors</strong> expression that composes the common parameters of every process configuration and a <strong>processes</strong> model that constructs process instances that need to be deployed.<br />
<br />
The following processes model can be used to deploy a <i>webapp</i> process and an <i>nginx</i> reverse proxy instance that connects to it:<br />
<br />
<pre>
{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager
}:

let
  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir;
    inherit forceDisableUserChange processManager;
  };
in
rec {
  webapp = rec {
    port = 5000;
    dnsName = "webapp.local";

    pkg = constructors.webapp {
      inherit port;
    };
  };

  nginx = rec {
    port = 8080;

    pkg = constructors.nginxReverseProxyHostBased {
      webapps = [ webapp ];
      inherit port;
    } {};
  };
}
</pre>
<br />
With the following command-line instruction, we can automatically create a scan directory and start <i>s6-svscan</i>:<br />
<br />
<pre>
$ nixproc-s6-svscan --state-dir $HOME/var
</pre>
<br />
The <i>--state-dir</i> parameter causes the scan directory to be created in the user's home directory, making unprivileged deployments possible.<br />
<br />
With the following command, we can deploy the entire system that will get supervised by the <i>s6-svscan</i> service that we just started:<br />
<br />
<pre>
$ nixproc-s6-rc-switch --state-dir $HOME/var \
--force-disable-user-change processes.nix
</pre>
<br />
The <i>--force-disable-user-change</i> parameter prevents the deployment system from creating users and groups and changing user privileges, allowing the deployment as an unprivileged user to succeed.<br />
<br />
The result is a running system that allows us to connect to the <i>webapp</i> service via the Nginx reverse proxy:<br />
<br />
<pre>
$ curl -H 'Host: webapp.local' http://localhost:8080
<!DOCTYPE html>
<html>
<head>
<title>Simple test webapp</title>
</head>
<body>
Simple test webapp listening on port: 5000
</body>
</html>
</pre>
<br />
<h2>Constructing multi-process Docker images supervised by s6</h2>
<br />
Another feature of the Nix process management framework is constructing <strong>multi-process Docker images</strong> in which multiple process instances are supervised by a process manager of choice.<br />
<br />
<i>s6</i> can also be used as a supervisor in a container. To accomplish this, we can use <i>s6-linux-init</i> as an entry point.<br />
<br />
The following attribute generates a skeleton configuration directory:<br />
<br />
<pre>
let
  skelDir = pkgs.stdenv.mkDerivation {
    name = "s6-skel-dir";
    buildCommand = ''
      mkdir -p $out
      cd $out

      cat > rc.init <<EOF
      #! ${pkgs.stdenv.shell} -e
      rl="\$1"
      shift

      # Stage 1
      s6-rc-init -c /etc/s6/rc/compiled /run/service

      # Stage 2
      s6-rc -v2 -up change default
      EOF
      chmod 755 rc.init

      cat > rc.shutdown <<EOF
      #! ${pkgs.stdenv.shell} -e
      exec s6-rc -v2 -bDa change
      EOF
      chmod 755 rc.shutdown

      cat > rc.shutdown.final <<EOF
      #! ${pkgs.stdenv.shell} -e
      # Empty
      EOF
      chmod 755 rc.shutdown.final
    '';
  };
</pre>
<br />
The skeleton directory generated by the above sub expression contains three configuration files:<br />
<br />
<ul>
<li><i>rc.init</i> is the script that the init system starts, right after starting the supervisor: <i>s6-svscan</i>. It is responsible for initializing the <i>s6-rc</i> system and starting all services in the <i>default</i> bundle.</li>
<li>The <i>rc.shutdown</i> script is executed on shutdown and stops all services previously started by <i>s6-rc</i>.</li>
<li><i>rc.shutdown.final</i> runs at the very end of the shutdown procedure, after all processes have been killed and all file systems have been unmounted. In the above expression, it does nothing.</li>
</ul>
<br />
In the initialization process of the image (the <i>runAsRoot</i> parameter of <i>dockerTools.buildImage</i>), we need to execute a number of dynamic initialization steps.<br />
<br />
First, we must initialize <i>s6-linux-init</i> to read its configuration files from <i>/etc/s6/current</i> using the skeleton directory (that we have configured in the sub expression shown earlier) as its initial contents (the <i>-f</i> parameter) and run the init system in container mode (the <i>-C</i> parameter):<br />
<br />
<pre style="overflow: auto;">
mkdir -p /etc/s6
s6-linux-init-maker -c /etc/s6/current -p /bin -m 0022 -f ${skelDir} -N -C -B /etc/s6/current
mv /etc/s6/current/bin/* /bin
rmdir /etc/s6/current/bin
</pre>
<br />
<i>s6-linux-init-maker</i> generates a <i>/bin/init</i> script, that we can use as the container's entry point.<br />
<br />
I want the logging services to run as an unprivileged user (<i>s6-log</i>), requiring me to create the user and corresponding group first:<br />
<br />
<pre>
groupadd -g 2 s6-log
useradd -u 2 -d /dev/null -g s6-log s6-log
</pre>
<br />
We must also compile a database from the <i>s6-rc</i> configuration files, by running the following command-line instructions:<br />
<br />
<pre>
mkdir -p /etc/s6/rc
s6-rc-compile /etc/s6/rc/compiled ${profile}/etc/s6/sv
</pre>
<br />
As can be seen in the <i>rc.init</i> script that we have generated earlier, the compiled database: <i>/etc/s6/rc/compiled</i> is propagated to <i>s6-rc-init</i> as a command-line parameter.<br />
<br />
With the following Nix expression, we can build an <i>s6-rc</i> managed multi-process Docker image that deploys all the process instances in the processes model that we have written earlier:<br />
<br />
<pre style="overflow: auto;">
let
  pkgs = import <nixpkgs> {};

  createMultiProcessImage = import ../../nixproc/create-multi-process-image/create-multi-process-image-universal.nix {
    inherit pkgs system;
    inherit (pkgs) dockerTools stdenv;
  };
in
createMultiProcessImage {
  name = "multiprocess";
  tag = "test";
  exprFile = ./processes.nix;
  stateDir = "/var";
  processManager = "s6-rc";
}
</pre>
<br />
With the following command, we can build the image:<br />
<br />
<pre>
$ nix-build
</pre>
<br />
and load the image into Docker with the following command:<br />
<br />
<pre>
$ docker load -i result
</pre>
<br />
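After loading the image, we can start a container from it in the usual way (a sketch; the image name and tag correspond to the <i>name</i> and <i>tag</i> attributes in the Nix expression shown earlier):<br />
<br />
<pre>
$ docker run -it --rm multiprocess:test
</pre>
<br />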
<h2>Discussion</h2>
<br />
With the addition of the <i>s6-rc</i> backend in the Nix process management framework, we have a modern alternative to systemd at our disposal.<br />
<br />
We can easily let services be managed by <i>s6-rc</i> using the same agnostic high-level deployment configurations that can also be used to target other process management backends, including systemd.<br />
<br />
What I particularly like about the <i>s6</i> tool ecosystem (and this also applies to some extent to its ancestor: <i>daemontools</i> and cousin project: <i>runit</i>) is the idea of constructing the entire system's initialization process and its sub concerns (process supervision, logging and service management) from separate tools, each having a clear/fixed scope.<br />
<br />
This kind of design reminds me of <a href="https://en.wikipedia.org/wiki/Microkernel">microkernels</a> -- in a microkernel design, the kernel is basically split into multiple collaborating processes each having their own responsibilities (e.g. file systems, drivers).<br />
<br />
The microkernel is the only process that has full access to the system and typically only has very few responsibilities (e.g. memory management, task scheduling, interrupt handling).<br />
<br />
When a process, such as a driver, crashes, this failure should not tear the entire system down. Systems can even recover from problems by restarting crashed processes.<br />
<br />
Furthermore, these non-kernel processes typically have very few privileges. If a process' security gets compromised (such as a leaky driver), the system as a whole will not be affected.<br />
<br />
Aside from a number of functional differences compared to systemd, there are also some non-functional differences.<br />
<br />
systemd can only be used on Linux with glibc as the system's libc, whereas <i>s6</i> can also be used on other operating systems (e.g. the BSDs) with different libc implementations, such as <a href="https://musl.libc.org/">musl</a>.<br />
<br />
Moreover, the supervisor service (<i>s6-svscan</i>) <a href="https://skarnet.org/software/s6/s6-svscan-not-1.html">can also be used as a user-level supervisor that does not need to run as PID 1</a>. Although systemd supports user sessions (allowing service deployments from unprivileged users), it still has the requirement to have systemd as an init system that needs to run as the system's PID 1.<br />
<br />
<h2>Improvement suggestions</h2>
<br />
Although the <i>s6</i> ecosystem provides useful tools and has all kinds of powerful features, I also have a number of improvement suggestions. They are mostly usability related:<br />
<br />
<ul>
<li>I have noticed that the command-line tools have very <strong>brief help pages</strong> -- they only enumerate the available options, but they do not provide any additional information explaining what these options do.<br />
<br />
I have also noticed that there are no official manpages, but there is a <a href="https://github.com/flexibeast/s6-man-pages">third-party initiative</a> that seems to provide them.<br />
<br />
The "official" source of reference are the HTML pages. For me personally, it is not always convenient to access HTML pages on limited machines with no Internet connection and/or only terminal access.</li>
<li>Although each individual tool is well documented (albeit in HTML), I had quite a few difficulties figuring out <strong>how to use them together</strong> -- because every tool has a very specific purpose, you typically need to combine them in interesting ways to do something meaningful.<br />
<br />
For example, I could not find any clear documentation on skarnet describing typical combined usage scenarios, such as how to use <i>s6-rc</i> on a conventional Linux distribution that already has a different service management solution.<br />
<br />
Fortunately, I discovered a Linux distribution that turned out to be immensely helpful: <a href="https://artixlinux.org">Artix Linux</a>. Artix Linux provides <i>s6</i> as one of its supported process management solutions. I ended up installing Artix Linux in a virtual machine and reading <a href="https://wiki.artixlinux.org/Main/S6">their documentation</a>.<br />
<br />
This lack of clarity seems to be somewhat analogous to common criticisms of microkernels: <a href="https://yarchive.net/comp/microkernels.html">one of Linus Torvalds' criticisms</a> is that in microkernel designs, the pieces are simplified, but the coordination of the entire system is more difficult.</li>
<li><strong>Updating</strong> existing service configurations is <strong>difficult</strong> and <strong>cumbersome</strong>. Each time I want to change something (e.g. adding a new service), I need to compile a new database, make sure that the newly compiled database co-exists with the previous database, and then run <i>s6-rc-update</i>.<br />
<br />
It is very easy to make mistakes. For example, I ended up overwriting the previous database several times. When this happens, the upgrade process gets stuck.<br />
<br />
systemd, on the other hand, allows you to put a new service configuration file in the configuration directory, such as: <i>/etc/systemd/system</i>. We can conveniently reload the configuration with a single command-line instruction:<br />
<br />
<pre>
$ systemctl daemon-reload
</pre>
I believe that the updating process can still be somewhat simplified in <i>s6-rc</i>. Fortunately, I have managed to hide that complexity in the <i>nixproc-s6-rc-deploy</i> tool.</li>
<li>It was also difficult to find out all the available configuration properties for <i>s6-rc</i> services -- I ended up looking at the <a href="https://github.com/skarnet/s6-rc/tree/master/examples">examples</a> and studying the documentation pages for <a href="https://skarnet.org/software/s6-rc/s6-rc-compile.html"><i>s6-rc-compile</i></a>, <a href="https://skarnet.org/software/s6/s6-supervise.html"><i>s6-supervise</i></a> and <a href="https://skarnet.org/software/s6/servicedir.html">service directories</a>.<br />
<br />
I think that it could be very helpful to write a dedicated documentation page that describes all configurable properties of <i>s6-rc</i> services.</li>
<li>I believe it is also very common that, for each <i>longrun</i> service (with a <i>-srv</i> suffix), you want a companion logging service (with a <i>-log</i> suffix).<br />
<br />
As a matter of fact, I can hardly think of a situation in which you do not want this. Maybe it helps to introduce a convenience property to automatically facilitate the generation of log companion services.</li>
</ul>
<br />
<h2>Availability</h2>
<br />
The <i>s6-rc</i> backend described in this blog post is part of the current development version of the Nix process management framework, that is still under heavy development.<br />
<br />
The framework can be obtained from <a href="https://github.com/svanderburg/nix-processmgmt">my GitHub page</a>.<br />
<br />
Sander van der Burghttp://www.blogger.com/profile/12718166966821611609noreply@blogger.com1tag:blogger.com,1999:blog-1397115249631682228.post-26579064844530045012020-12-31T17:13:00.003+01:002020-12-31T17:21:41.028+01:00Annual blog reflection over 2020In <a href="https://sandervanderburg.blogspot.com/2020/12/blog-reflection-over-last-decade.html">my previous blog post</a> that I wrote yesterday, I celebrated my blog's 10th anniversary and did a reflection over the last decade. However, I did not elaborate much about 2020.<br />
<br />
Because 2020 is a year for the history books, I have decided to also do an annual reflection over the last year (similar to my previous annual blog reflections).<br />
<br />
<h2>A summary of blog posts written in 2020</h2>
<br />
Nearly all of the blog posts that I have written this year were in service of only two major goals: developing the <strong>Nix process management framework</strong> and implementing <strong>service container</strong> support in Disnix.<br />
<br />
Both of them took a substantial amount of development effort. Much more than I initially anticipated.<br />
<br />
<h3>Investigating process management</h3>
<br />
I already started working on this topic last year. In November 2019, <a href="https://sandervanderburg.blogspot.com/2019/11/a-nix-based-functional-organization-for.html">I wrote a blog post</a> about packaging sysvinit scripts and a Nix-based functional organization for configuring process instances, that could potentially also be applied to other process management solutions, such as systemd and supervisord.<br />
<br />
After building my first version of a framework (which was already a substantial leap towards reaching my full objective), I thought it would not take me that much time to finish all the details that I originally planned. It turns out that I heavily underestimated the complexity.<br />
<br />
To test my framework, I needed a simple test program that could daemonize on its own, which was (and still is) a common practice for running services on Linux and many other UNIX-like operating systems.<br />
<br />
I thought writing such a tool that daemonizes would be easy, but after some research, I discovered that it is actually quite complicated to do it properly. <a href="https://sandervanderburg.blogspot.com/2020/01/writing-well-behaving-daemon-in-c.html">I wrote a blog post about my findings</a>.<br />
<br />
It took me roughly three months to finish <a href="https://sandervanderburg.blogspot.com/2020/02/a-declarative-process-manager-agnostic.html">the first implementation of the process manager-agnostic abstraction layer</a> that makes it possible to write a high-level specification of a running process, that could universally target all kinds of process managers, such as sysvinit, systemd, supervisord and launchd.<br />
<br />
After completing the abstraction layer, I also discovered that a sufficiently high-level deployment specification of running processes could also target other kinds of deployment solutions.<br />
<br />
I have developed two additional backends for the Nix process management framework: one that uses <a href="https://sandervanderburg.blogspot.com/2020/08/experimenting-with-nix-and-service.html">Docker</a> and another using <a href="https://sandervanderburg.blogspot.com/2020/06/using-disnix-as-simple-and-minimalistic.html">Disnix</a>. Both solutions are technically not qualified as process managers, but they can still be used as such by only using a limited set of features of these tools.<br />
<br />
To be able to develop the Docker backend, I needed to dive deep into the underlying concepts of Docker. <a href="https://sandervanderburg.blogspot.com/2020/07/on-using-nix-and-docker-as-deployment.html">I wrote a blog post about the relevant Docker deployment concepts</a>, and also gave a presentation about it at Mendix.<br />
<br />
While implementing more examples, I also realized that to more securely run long-running services, they typically need to run as unprivileged users. To get predictable results, these unprivileged users require stable user IDs and group IDs.<br />
<br />
Several years ago, I already worked on a port assigner tool that could assign unique TCP/UDP port numbers to services, so that multiple instances can co-exist.<br />
<br />
I have extended the port assigner tool to <a href="https://sandervanderburg.blogspot.com/2020/09/assigning-unique-ids-to-services-in.html">assign arbitrary numeric IDs</a> to generically solve this problem. It turns out that implementing this tool was much more difficult than expected -- the Dynamic Disnix toolset was originally developed under very high time pressure and had a substantial amount of technical debt.<br />
<br />
In order to implement the numeric ID assigner tool, I needed to revise the model parsing libraries, that broke the implementations of some of the deployment planner algorithms.<br />
<br />
To fix them, I was forced to study how these algorithms worked again. <a href="https://sandervanderburg.blogspot.com/2020/10/transforming-disnix-models-to-graphs.html">I wrote a blog post</a> about the graph-based deployment planning algorithms and a new implementation that should be better maintainable. Retrospectively, I wish I did my homework better at the time when I wrote the original implementation.<br />
<br />
In September, I gave a talk about the Nix process management framework at <a href="https://2020.nixcon.org">NixCon 2020</a>, that was held online.<br />
<br />
I pretty much reached all my objectives that I initially set for the Nix process management framework, but there is still some leftover work to bring it to an acceptable usability level -- to be able to more easily add new backends (somebody gave me <a href="https://skarnet.org/software/s6">s6</a> as an option), the transformation layer needs to be standardized.<br />
<br />
Moreover, I still need to develop a test strategy for services so that you can be (reasonably) sure that they work with a variety of process managers and under a variety of conditions (e.g. unprivileged user deployments).<br />
<br />
<h3>Exposing services as containers in Disnix</h3>
<br />
Disnix is a Nix-based distributed service deployment tool. Services can basically be any kind of deployment unit whose life-cycle can be managed by a companion tool called Dysnomia.<br />
<br />
There is one practical problem though: in order to deploy a service-oriented system with Disnix, it typically requires the presence of already deployed <strong>containers</strong> (not to be confused with Linux containers), which are environments in which services are managed by another service.<br />
<br />
Some examples of container providers and corresponding services are:<br />
<br />
<ul>
<li>The MySQL DBMS (as a container) and multiple hosted MySQL databases (as services)</li>
<li>Apache Tomcat (as a container) and multiple hosted Java web applications (as services)</li>
<li>systemd (as a container) and multiple hosted systemd unit configuration files (as services)</li>
</ul>
<br />
Disnix deploys the services (as described above), but not the containers. These need to be deployed by other means first.<br />
<br />
In the past, I have been working on solutions that manage the underlying infrastructure of services as well (I typically used to call this problem domain: <strong>infrastructure deployment</strong>). For example, NixOps can deploy a network of NixOS machines that also expose container services that can be used by Disnix. It is also possible to deploy the containers as services, in a separate deployment layer managed by Disnix.<br />
<br />
When the Nix process management framework became more usable, I wanted to make the deployment of container providers also a more accessible feature. I heavily revised Disnix with a new feature that makes it possible to <a href="https://sandervanderburg.blogspot.com/2020/04/deploying-container-and-application.html">expose services as container providers</a>, making it possible to deploy both the container services and application services from a single deployment model.<br />
<br />
To make this feature work reliably, I was again forced to <a href="https://sandervanderburg.blogspot.com/2020/07/a-new-input-model-transformation.html">revise the model transformation pipeline</a>. This time I concluded that the lack of references in the Nix expression language was an impediment.<br />
<br />
Another nice feature by combining the Nix process management framework and Disnix is that you can <a href="https://sandervanderburg.blogspot.com/2020/05/deploying-heterogeneous-service.html">more easily deploy a heterogeneous system locally</a>.<br />
<br />
I have released a new version of Disnix: version 0.10, that provides all these new features.<br />
<br />
<h3>The Monitoring playground</h3>
<br />
Besides working on the two major topics shown above, the only other thing I did was a Mendix crafting project in which I developed a <a href="https://sandervanderburg.blogspot.com/2020/11/constructing-simple-alerting-system.html">monitoring playground</a>, allowing me to locally experiment with alerting scripts, including the visualization and testing.<br />
<br />
<h2>Some thoughts</h2>
<br />
From a blogging perspective, I am happy what I have accomplished this year -- not only have I managed to reach my usual level of productivity again (<a href="https://sandervanderburg.blogspot.com/2019/12/9th-annual-blog-reflection.html">last year was somewhat disappointing</a>), I also managed to both develop a working Nix process management framework (making it possible to use all kinds of process managers), and use Disnix to deploy both container and application services. Both of these features are on my wish list for many years.<br />
<br />
In the Nix community, having the ability to also use other process managers than systemd is something we have been discussing since late 2014.<br />
<br />
However, there are also two major things that kept me mentally occupied in the last year.<br />
<br />
<h3>Open source work</h3>
<br />
Many blog posts are about the open source work I do. Some of my open source work is done as part of my daytime job as a software engineer -- sometimes we can write a new useful feature, make an extension to an existing project that may come in handy, or do something as part of an investigation/learning project.<br />
<br />
However, the majority of my open source work is done in my spare time -- in many cases, my motivation is not as <a href="https://www.merriam-webster.com/dictionary/altruistic">altruistic</a> as people may think: typically I need something to solve my own problems or there is some technical concept that I would like to explore. However, I still do a substantial amount of work to help other people or for the "greater good".<br />
<br />
Open source projects are typically quite satisfying to work on, but they also have negative aspects (typically the negative aspects are negligible in the early stages of a project). Sadly, as projects get more popular and gain more exposure, the negativity attached to them also grows.<br />
<br />
For example, although I got quite a few positive reactions on my work on the Nix process management framework (especially at NixCon 2020), I know that not everybody is or will be happy about it.<br />
<br />
I have worked with people in the past, who consider this kind of work a complete waste of time -- in their opinion, we already have <a href="https://kubernetes.io/">Kubernetes</a> that has already solved all relevant service management problems (<a href="https://www.reddit.com/r/kubernetes/comments/dtsg4z/dilbert_on_kubernetes/">some people even think it is a solution to all problems</a>).<br />
<br />
I have to admit that, while Kubernetes can be used to solve similar kind of problems (and what is not supported as a first-class feature, can still be scripted in ad-hoc ways), there is still much to think about:<br />
<br />
<ul>
<li>As explained in my blog post about Docker concepts, the Nix store typically supports much more <strong>efficient sharing</strong> of common dependencies between packages than layered Docker images, resulting in much lower disk space consumption and RAM consumption of processes that have common dependencies.</li>
<li>Docker containers support the deployment of so-called microservices because of a common denominator: <strong>processes</strong>. Almost all modern operating systems and programming-languages have a notion of processes.<br />
<br />
As a consequence, lots of systems nowadays typically get constructed in such a way that they can be easily decomposed into processes (translating to container instances), imposing substantial overhead on each process instance (because these containers typically need to embed common sub services).<br />
<br />
Services can also be more efficiently deployed (in terms of storage and RAM usage) as units managed by a common runtime (e.g. multiple Java web applications managed by Apache Tomcat or multiple PHP applications managed by the Apache HTTP server).<br />
<br />
The latter form of reuse is now slowly disappearing, because it does not fit nicely in a container model. In Disnix, this form of reuse is a first-class concept.</li>
<li>Microservices managed by Docker (somewhat) support <strong>technology diversity</strong>, because of the fact that all major programming languages support the concept of processes.<br />
<br />
However, one particular kind of technology that you cannot choose is the <strong>operating system</strong> -- Docker/Kubernetes relies on non-standardized Linux-only concepts.<br />
<br />
I have also been thinking about the option to pick your operating system as well: if you need security, pick OpenBSD; if you want performance, pick Linux, etc. The Nix process management framework allows you to also target process managers on different operating systems than Linux, such as BSD rc scripts and Apple's launchd.</li>
</ul>
<br />
I personally believe that these goals are still important, and that keeps me motivated to work on it.<br />
<br />
Furthermore, I also believe that it is important to have <strong>multiple implementations</strong> of tools that solve the <strong>same</strong> or similar kinds of <strong>problems</strong> -- in the open source world, there are lots of "battles" between communities about which technology should be the standard for a certain problem.<br />
<br />
My favourite example of such a battle is the system's process manager -- many Linux distributions nowadays have adopted <a href="https://www.freedesktop.org/wiki/Software/systemd/">systemd</a>, but this is not without any controversy, such as in the <a href="https://debian.org">Debian project</a>.<br />
<br />
It took them many years to come to the decision to adopt it, and still there are people who want to discuss "init system diversity". Likewise, there are people who find the systemd-adoption decision unacceptable, and have forked Debian into <a href="https://www.devuan.org">Devuan</a>, providing a Debian-based distribution without systemd.<br />
<br />
With the Nix process management framework the fact that systemd exists (and may not be everybody's first choice) is not a big deal -- you can actually switch to other solutions, if desired. A battle between service managers is not required. A sufficiently high-level specification of a well understood problem allows you to target multiple solutions.<br />
<br />
Another problem I face is that these two projects are not the only projects that I have been working on or maintain. There are many other projects that I have worked on in the past.<br />
<br />
Sadly, I am also a very bad multitasker. If there are problems reported with my other projects, and the fix is straightforward, or there is a straightforward pull request, then it is typically no big deal to respond.<br />
<br />
However, I also learned that for some of the problems other people face, there is no quick fix. Sometimes I get pull requests that partially solve a problem, or in other cases: fix a specific problem, but break other features. These pull requests cannot always be directly accepted and also require a substantial amount of my time to review.<br />
<br />
For certain kinds of reported problems, I need to work on a <strong>fundamental revision</strong> that requires a substantial amount of development effort -- however, it is impossible to pick up such a task while working on another "major project".<br />
<br />
Alternatively, I need to make the decision to abandon what I am currently working on and make the switch. However, this option also does not have my preference because I know it will significantly delay my original goal.<br />
<br />
I have noticed that lots of people get dissatisfied and frustrated, including myself. Moreover, I also consider it a bad thing to feel pressure on the things I am working on in my spare time.<br />
<br />
So what to do about it? Maybe I can write a separate blog post on this subject.<br />
<br />
Anyway, I was not planning to abandon or stop anything. Eventually, I will pick up these other problems as well -- my strategy for now, is to do it when I am ready. People simply have to wait (so if you are reading this and waiting for something: yes, I will pick it up eventually, just be patient).<br />
<br />
<h3>The COVID-19 crisis</h3>
<br />
The other subject that kept me (and pretty much everybody in the world) busy is the COVID-19 crisis.<br />
<br />
I still remember the beginning of 2020 -- for me personally, it started out very well. I visited some friends that I have not seen in a long time, and then <a href="http://fosdem.org">FOSDEM</a> came, the Free and Open Source Developers European Meeting.<br />
<br />
Already in January, I heard about this new virus that was rapidly spreading in the Wuhan region on the news. At that time, nobody in the Netherlands (or in Europe) was really worried yet. Even to questions, such as: "what will happen when it reaches Europe?", people typically responded with: "ah yes, well influenza has an impact on people too, it will not be worse!".<br />
<br />
A few weeks later, it started to spread to countries close to Europe. The first problematic country I heard about was Iran, and a couple of weeks later it reached Italy. In Italy, it spread so rapidly that within only a few weeks, the intensive care capacity was completely drained, forcing medical personnel to make choices who could be helped and who could not.<br />
<br />
By then, it sounded pretty serious to me. Furthermore, I was already quite sure that it was only a matter of time before it would reach the Netherlands. And indeed, at the end of February, the first COVID-19 case was reported. Apparently this person contracted the virus in Italy.<br />
<br />
Then the spreading went quickly -- every day, more and more COVID-19 cases were reported and this amount grew exponentially. Similar to other countries, we also slowly ran into capacity problems in hospitals (materials, equipment, personnel, intensive care capacity etc.). In particular, the intensive care capacity reached a very critical level. Fortunately, there were hospitals in Germany willing to help us out.<br />
<br />
In March, a country-wide lockdown was announced -- basically all group activities were forbidden, schools and non-essential shops were closed, and everybody who is capable of working from home should work from home. As a consequence, since March, I have been permanently working from home.<br />
<br />
As with pretty much everybody in the world, COVID-19 has negative consequences for me as well. Fortunately, I have not much to complain about -- I did not get unemployed, I did not get sick, and also nobody in my direct neighbourhood ran into any serious problems.<br />
<br />
The biggest implication of the COVID-19 pandemic for me is social contacts -- despite the lockdown I still regularly meet up with family and frequent acquaintances, but I have barely met any new people. For example, at Mendix, I typically came in contact with all kinds of people in the company, especially those that do not work in the same team.<br />
<br />
Moreover, I also learned that quite a few of my contacts got isolated because of all group activities that were canceled -- for example I did not have any music rehearsals in a while, causing me not to see or speak to any of my friends there.<br />
<br />
Same thing with conferences and meet ups -- because most of them were canceled or turned into online events, it is very difficult to have good interactions with new people.<br />
<br />
I also did not do any traveling -- my summer holiday was basically a staycation. Fortunately, in the summer, we have managed to minimize the amount of infections, making it possible to open up public places. I visited some touristic places in the Netherlands, that are normally crowded by people from abroad. That by itself was quite interesting -- I normally tend to neglect national touristic sites.<br />
<br />
Although the COVID-19 pandemic brought all kinds of negative things, there were also a couple of things that I consider a good thing:<br />
<br />
<ul>
<li>At Mendix, we have an open office space that typically tends to be very crowded and noisy. It is not that I cannot work in such an environment, but I also realize that I do appreciate silence, especially for programming tasks that require concentration. At home, it is quiet, I have much fewer distractions and I also typically feel much less tired after a busy work day.</li>
<li>I also typically used to neglect home improvements a lot. The COVID-19 crisis helped me to finally prioritize some non-urgent home improvements tasks -- for example, on the attic, where my musical instruments are stored, I finally took the time to organize everything in such a way that I can rehearse conveniently.</li>
<li>Besides the fact that rehearsals and concerts were cancelled, I actually practiced a lot -- I even studied many advanced solo pieces that I have not looked at in years. Playing music became a standard activity between my programming tasks, to clear my mind. Normally, I would use this time to talk to people at the coffee machine in the office.</li>
<li>During busy times, I also tended to neglect housekeeping tasks a lot. I still remember (many years ago) when I just moved into my first house, doing the dishes was already a problem (I had no dish washer at that time). When working from home, it is not a problem to keep everything tidy.</li>
<li>It is also much easier to maintain healthy daily habits. In the first lockdown (that was in spring), cycling/walking/running was a daily routine that I could maintain with ease.</li>
</ul>
<br />
In the Netherlands, we have managed to overcome the first lockdown in just a matter of weeks by social distancing. Sadly, after the restrictions were relaxed, we got sloppy and at the end of the summer the infection rate started to grow. We also ran into all kinds of problems to mitigate the infections -- limited test capacity, people who got tired of all the countermeasures and stopped following the rules, illegal parties etc.<br />
<br />
Since a couple of weeks, we are in our second lockdown with a comparable level of strictness -- again, the schools and non-essential shops are closed etc. The second lockdown feels a lot worse than the first -- now it is in the winter, people are no longer motivated (the number of people that revolt in the Netherlands has grown substantially, including people spreading news that everything is a hoax and/or one big scheme organized by left-wing politicians) and it is already taking much longer than the first.<br />
<br />
Fortunately, there is a tiny light at the end of the tunnel. In Europe, one vaccine (the <a href="https://www.nhg.org/actueel/nieuws/biontechpfizer-vaccin-relevante-informatie-voor-huisartsen">Pfizer vaccine</a>) has been approved and more submissions are pending (with good results). On Monday, the authorities will start vaccinating people in the Netherlands.<br />
<br />
If we can keep the infection rates and the mutations under control (<a href="https://www.bbc.com/news/health-55388846">such as the mutation that appeared in England</a>), then we will eventually build up the herd immunity required to finally get the situation under control (this will probably take many more months, but at least it is a start).<br />
<br />
<h2>Conclusion</h2>
<br />
This elaborate reflection blog post (which is considerably longer than all my previous yearly reflections combined) looks back on 2020, a year that will probably not go unnoticed in the history books.<br />
<br />
I hope everybody remains in good health and stays motivated to do what is needed to get the virus under control.<br />
<br />
Moreover, when the crisis is over, I also hope we can retain the positive things learned in this crisis, such as making it more of a habit to allow people to work (at least partially) from home. The open-source complaint in this blog post is just a minor inconvenience compared to the COVID-19 crisis and the impact that it has on many people in the world.<br />
<br />
The final thing I would like to say is:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjYJj0TPnPQUmcKqToUiXVvUeIFriQlA-UNKUHzy3ezqNknq4qKoCHS4E533Mj0K_ulKOEE3qJ8QwC3ZAsHLRS3bozaRTT8T6mISWxQf7pEP1CMLqfMVem3__pEgeFmdRyiRXbwjgzR74I8/s1000/fireworks_rotterdam.jpeg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="520" data-original-height="667" data-original-width="1000" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjYJj0TPnPQUmcKqToUiXVvUeIFriQlA-UNKUHzy3ezqNknq4qKoCHS4E533Mj0K_ulKOEE3qJ8QwC3ZAsHLRS3bozaRTT8T6mISWxQf7pEP1CMLqfMVem3__pEgeFmdRyiRXbwjgzR74I8/s600/fireworks_rotterdam.jpeg"/></a></div>
<br />
HAPPY NEW YEAR!!!!!!!!!!!!<br />
Sander van der Burghttp://www.blogger.com/profile/12718166966821611609noreply@blogger.com0tag:blogger.com,1999:blog-1397115249631682228.post-79455348932125318422020-12-30T14:33:00.001+01:002020-12-30T14:33:25.736+01:00Blog reflection over the last decade<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh-Z54tsfshoBnOakb0_wgPQTas29ph4d0UY-0lYVEBjCA1BjyxYxlEvttdTmOFcf3Dd9Tujb4Wk-BHOyRttdm9CAzBiI5vGwFGRc5rjX30snCNSU-UuqhpvK_k6UofmYkl2GJE3IXBe4tg/s0/notebook.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="350" data-original-height="500" data-original-width="500" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh-Z54tsfshoBnOakb0_wgPQTas29ph4d0UY-0lYVEBjCA1BjyxYxlEvttdTmOFcf3Dd9Tujb4Wk-BHOyRttdm9CAzBiI5vGwFGRc5rjX30snCNSU-UuqhpvK_k6UofmYkl2GJE3IXBe4tg/s0/notebook.jpg"/></a></div>
<br />
Today it is exactly <strong>ten years</strong> ago that I started this blog. As with previous years, I will do a reflection, but this time it will be over the <strong>last decade</strong>.<br />
<br />
<h2>What was holding me back</h2>
<br />
The idea to have my own blog was already there for a long time. I always thought it was an interesting medium. For example, I considered it a good instrument to express my thoughts on the technical work I do, and in particular, I liked having the ability to get feedback.<br />
<br />
The main reason why it still took me so long to start was that I never considered it "the right time". For example, I came close to starting a blog 15 years ago (while I was still early in my studies), when web development was still one of my main technical interests, but I still refrained from doing so.<br />
<br />
At that time, I made some interesting "discoveries" and had some random ideas I could elaborate on, but these ideas never materialized enough for me to write a story about them.<br />
<br />
Moreover, I also did not feel comfortable enough yet to express myself, because I did not have much experience writing in English. In retrospect, I learned that there is never a right time to start a blog -- I should just have started.<br />
<br />
<h2>Entering the research domain</h2>
<br />
A couple of years later, while I was working on my master's thesis, I made the decision to go for a PhD degree, because I was genuinely interested in the research domain of my master's thesis: <strong>software deployment</strong>, mostly because of my prior experience in industry and building <a href="https://linuxfromscratch.org">Linux distributions from scratch</a>.<br />
<br />
Even before starting my PhD, I already knew that writing is an important component in research -- as a researcher, you have to regularly report about your work by means of <strong>research papers</strong> that typically need to be anonymously peer reviewed.<br />
<br />
In most scientific disciplines, academic papers are published in journals. In the computer science domain, it is more common to publish papers in conference proceedings.<br />
<br />
Only a certain percentage of paper submissions that are considered good quality (as judged by the peer reviews) are accepted for publication. Rejection of a paper typically means making revisions and submitting the paper to a different conference.<br />
<br />
For top general conferences in the software engineering domain, the <a href="https://taoxie.cs.illinois.edu/seconferences.htm">acceptance rate</a> is typically lower than 20% (this ratio used to be even lower, close to 15%).<br />
<br />
In my PhD, I had a very quick publication start -- in the first month, a paper about atomic upgrades for distributed systems, covering an important topic of my master's thesis, was accepted.<br />
<br />
Roughly half a year later, my co-workers and I published a research paper about the objectives of the Pull Deployment of Services (PDS) research project (of which my research was one of the sub topics), funded by <a href="http://jacquard.nl">NWO/Jacquard</a>.<br />
<br />
Although I had a very good start, I slowly started to learn (the hard way) that you cannot simply publish research papers about all the work you do -- as a matter of fact, papers only represent a modest subset of your daily work.<br />
<br />
Writing a good research paper takes quite a bit of time and effort: you have to decide on the topic (including the paper's title) and get all the details right. I had all kinds of interesting ideas, but many of these ideas were not considered <strong>novel</strong> -- they were interesting engineering efforts, but they did not add significant new scientific knowledge.<br />
<br />
Moreover, in a research paper, you also need to put your contribution in <strong>context</strong> (e.g. explain/show how it compares to similar work and how it expands existing knowledge), and provide <strong>validation</strong> (this can be a proof, but in most cases you <strong>evaluate</strong> to what degree your contribution meets its claims, for example, by providing empirical data).<br />
<br />
After the instant acceptance of the first two papers, things did not work out that smoothly anymore. I had several paper rejections in a row -- one paper was badly rejected because I did not put it into the right context (for example, I ignored some important related work) and I did not make my contribution very clear (I basically left it open to the interpretation of the reader, which is a bad thing).<br />
<br />
Fortunately, I learned a lot from this rejection. The reviewers even suggested an alternative conference to which I could submit my revised paper. After addressing the reviewers' criticisms, the paper got accepted.<br />
<br />
Another paper was rejected twice in a row for, in my opinion, very weak reasons. Most notably, it turned out that many reviewers believed that the subject was not really software engineering related (which is strange, because software deployment is explicitly listed as one of the subjects in the conference's <a href="http://www.sbs.co.za/icse2010/">call for papers</a>).<br />
<br />
When I explained this peculiarity to Eelco Visser (one of my supervisors and co-promotor), he suggested that I should have more frequent interaction with the scientific community and write about the subject on my blog. Software deployment is generally a neglected subject in the software engineering research community.<br />
<br />
Eventually, we managed to publish the problematic papers (one about Disnix, the tool implementation of my research, and the other about the testing aspects of the previously rejected paper).<br />
<br />
After that problematic period, I managed to publish two more papers that got instantly accepted, bringing me to all kinds of interesting conferences.<br />
<br />
<h2>The decision to start my blog</h2>
<br />
Although having a three-paper acceptance streak and traveling to conferences to present them felt nice for a while, I still was not too happy.<br />
<br />
In late 2010, one day before New Year's Eve (when I typically reflect on the past year), I realized that research papers alone are just a very narrow representation of the work that I do as a researcher (although the number of papers and their impact are typically used as the only metrics to judge the performance of a researcher).<br />
<br />
In addition to getting research papers accepted and doing the required writing, the work of an academic researcher (in particular in the software engineering domain) involves much more:<br />
<br />
<ul>
<li>Research in software engineering is about <strong>constructing tools</strong>. For example, the paper '<a href="http://dl.acm.org/citation.cfm?id=807694">Research Paradigms in Computer Science</a>' by <a href="http://www.cs.brown.edu/~pw">Peter Wegner</a> from Brown University says:<br />
<br />
<blockquote>
Research in engineering is directed towards the efficient accomplishment of specific tasks and towards the development of tools that will enable classes of tasks to be accomplished more efficiently.
</blockquote>
<br />
In addition to the problems that tools try to solve or optimize, the construction of these tools is typically also very challenging, similar to conventional software development projects.<br />
<br />
Although the construction aspects of tools may not always be novel, and may be too detailed for a research paper (which typically has a page limit), it is definitely useful to work towards a good and stable design and implementation. Writing about these aspects can be very useful for yourself, your colleagues and peers in the field.<br />
<br />
Moreover, having a tool that is usable and works also mattered to me and to the people in my research group. For example, my deployment research was built on top of the Nix package manager, which, in addition to research, was also used to solve our internal deployment problems.<br />
<br />
</li>
<li>
I did not start all the development work of my tooling completely from scratch -- I was building my deployment tooling on top of the Nix package manager, which was both a research project and an open source project (more accurately called a community project) with a small group of external contributors.<br />
<br />
(As a sidenote: the Nix package manager was started by Eelco Dolstra, who was a postdoc in the same research project and one of my university supervisors).<br />
<br />
I considered my blog a good instrument to communicate with the Nix community about ideas and implementation aspects.
</li>
<li>
Research is also about having <strong>frequent interaction</strong> with your peers that work for different universities, companies and/or research institutes.<br />
<br />
A research paper is useful to get feedback, but at the same time, it is also quite an inaccessible medium -- people can obtain a copy from publishers (typically behind a paywall) or from your personal homepage and communicate by e-mail, but the barrier is typically high.
</li>
<li>
I was also frequently in touch with software engineering practitioners, such as former study friends, open source communities and people from our research project's industry partner: <a href="https://www.usa.philips.com/healthcare">Philips Healthcare</a>.<br />
<br />
I regularly received all kinds of interesting questions related to the practical aspects of my work. For example, how to apply our research tools to industry problems or how our research tools compare to conventional tools.<br />
<br />
Not all of these questions could be transformed into research papers, but they were definitely useful to investigate and write about.
</li>
<li>
Being in academia is <strong>more than</strong> just working on <strong>publications</strong>. You also travel to conferences, get involved in all kinds of different (and sometimes related) research subjects of your colleagues and peers, and you may also help with teaching. These subjects are also worth writing about.
</li>
</ul>
<br />
Because of the above reasons, I was finally convinced that the time was right to start my blog.<br />
<br />
<h2>The beginning: catching up with my research papers</h2>
<br />
Since I had already been working on my PhD research for more than two years, there was still a lot of catching up to do. It did not make sense to just randomly start writing about something technical or research related. Basically, I wanted all information on my blog "to fit together".<br />
<br />
For the first half year, my blog was basically about writing down things I had already done and published about.<br />
<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEho7swTYEBJDpJGXvRgPPfztxkamtKHN_NAtYV2Rn1_YJHaQEaoF4jstHiA5pkPCIS49sPDThUqAoSZ2ZnO-1f0zy4nZGdRh5alW8Cbvi0kl0uO3-mkPXZpU2ixwT89k-0Uft20AQv_Zdvp/s1600/doctor.png" imageanchor="1" style="clear:left; float:left;margin-right:1em; margin-bottom:1em"><img border="0" height="182" width="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEho7swTYEBJDpJGXvRgPPfztxkamtKHN_NAtYV2Rn1_YJHaQEaoF4jstHiA5pkPCIS49sPDThUqAoSZ2ZnO-1f0zy4nZGdRh5alW8Cbvi0kl0uO3-mkPXZpU2ixwT89k-0Uft20AQv_Zdvp/s200/doctor.png" /></a></div>
<br />
After <a href="https://sandervanderburg.blogspot.com/2010/12/first-blog-post.html">my blog announcement</a>, I started explaining what the <a href="https://sandervanderburg.blogspot.com/2010/12/pull-deployment-of-services.html">Pull Deployment of Services research project</a> is about, then explaining the <a href="https://sandervanderburg.blogspot.com/2011/01/nix-package-manager.html">Nix package manager</a>, which serves as the fundamental basis of all the deployment tooling that I was developing, followed by <a href="https://sandervanderburg.blogspot.com/2011/01/nixos-purely-functional-linux.html">NixOS</a>: a Linux distribution that is entirely managed by the Nix package manager and can be deployed from a single declarative specification.<br />
<br />
The next blog post was about <a href="https://sandervanderburg.blogspot.com/2011/02/using-nixos-for-declarative-deployment.html">declarative deployment and testing with NixOS</a>. It was used as an ingredient for a research paper that had already been published, and for a talk with the same title at FOSDEM: the Free and Open source Software Developers' European Meeting in Brussels. Writing about the subject on my blog was a useful preparation session for my talk.<br />
<br />
After giving my talk at FOSDEM, there was more catching up work to do. After explaining the basic Nix concepts, I could finally elaborate on <a href="https://sandervanderburg.blogspot.com/2011/02/disnix-toolset-for-distributed.html">Disnix</a>: the tool that I had been developing as part of my research, which uses Nix to extend deployment to the domain of service-oriented systems.<br />
<br />
After writing about the <a href="https://sandervanderburg.blogspot.com/2011/03/self-adaptive-deployment-with-disnix.html">self-adaptive deployment framework</a> built on top of Disnix (I had submitted the corresponding paper at the beginning of that year, and it got accepted shortly before writing the blog post), I was basically up-to-date with all research aspects.<br />
<br />
<h2>Using my blog for research</h2>
<br />
After my catch-up phase was completed, I could finally start writing about things that were not directly related to any research papers written in the past.<br />
<br />
One of the things I had been struggling with for a while was making our tools work with .NET technology. The Nix package manager (and sister projects, such as Disnix) were primarily developed for UNIX-like operating systems (most notably Linux) and technologies that run on these operating systems.<br />
<br />
Our industry partner, Philips Healthcare, mostly uses Microsoft technologies in their development stack, ranging from .NET as a runtime and C# for coding to SQL Server for storage and IIS as a web server.<br />
<br />
At that time, .NET was heavily tied to the Windows ecosystem (<a href="https://www.mono-project.com">Mono</a> already existed, providing a somewhat compatible runtime for operating systems other than Windows, but it did not provide compatible implementations of all the libraries needed to work with the Philips platform).<br />
<br />
With some small modifications, I could use Nix on Cygwin to build .NET projects. However, running .NET applications that rely on shared libraries (called library assemblies in .NET terminology) was still a challenge. I could only provide a number of very suboptimal solutions, none of which was ideal.<br />
<br />
<a href="https://sandervanderburg.blogspot.com/2011/09/deploying-net-applications-with-nix.html">I wrote about it on my blog</a>, and during my trip to ICSE 2011 in Hawaii I learned from a discussion with a co-attendee that you could also use an event listener that triggers when a library assembly is missing. The reflection API can be used in this event handler to load these missing assemblies, making it possible to <a href="https://sandervanderburg.blogspot.com/2011/09/deploying-net-applications-with-nix_14.html">efficiently solve my dependency problem</a> making it possible to use both Nix and <a href="https://sandervanderburg.blogspot.com/2011/10/deploying-net-services-with-disnix.html">Disnix to deploy .NET services</a> on Windows without any serious obstacles.<br />
<br />
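To give an impression of how this works, the sketch below (a minimal illustration, not the exact code I used back then; the search directories are hypothetical) registers a handler for the <i>AppDomain.AssemblyResolve</i> event, which the .NET runtime raises when it cannot locate a library assembly, and uses the reflection API to load the missing assembly from an alternative location:<br />
<br />
<pre>
using System;
using System.IO;
using System.Reflection;

public static class AssemblyResolver
{
    // Hypothetical directories in which the library assemblies of the
    // dependencies reside (e.g. paths in the Nix store)
    private static readonly string[] searchDirs = {
        @"C:\deps\lib1", @"C:\deps\lib2"
    };

    public static void Register()
    {
        // Invoked by the runtime whenever an assembly cannot be found
        AppDomain.CurrentDomain.AssemblyResolve += ResolveAssembly;
    }

    private static Assembly ResolveAssembly(object sender, ResolveEventArgs args)
    {
        // args.Name is the full assembly name; we only need the simple name
        string fileName = new AssemblyName(args.Name).Name + ".dll";

        foreach (string dir in searchDirs)
        {
            string path = Path.Combine(dir, fileName);
            if (File.Exists(path))
                return Assembly.LoadFrom(path); // load it with the reflection API
        }

        return null; // give up: the runtime reports the usual load failure
    }
}
</pre>
<br />
By calling such a registration method early in an application's startup code, library assemblies can reside in isolated locations (such as Nix store paths) without having to copy them into the application's base directory or install them into the Global Assembly Cache.<br />
<br />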
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiWTy4_EFXZWzBpHQB1fDUg332qXKEYtPIh0hFdf4LLOtzgXbY6OJk0m0ImKon77WxZaQuo-ZtPR9QPs-1eqnsQrGQ-U2ilIHm6UgkGU2mbf6dSQOsizYibKq464mREoTIR7YVZZGSh9iYd/s1600/building.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="300" width="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiWTy4_EFXZWzBpHQB1fDUg332qXKEYtPIh0hFdf4LLOtzgXbY6OJk0m0ImKon77WxZaQuo-ZtPR9QPs-1eqnsQrGQ-U2ilIHm6UgkGU2mbf6dSQOsizYibKq464mREoTIR7YVZZGSh9iYd/s400/building.jpg" /></a></div>
<br />
I also managed to discuss one of my biggest frustrations in the research community: the fact that <a href="https://sandervanderburg.blogspot.com/2011/10/software-deployment-complexity.html">software deployment is a neglected subject</a>. Thanks to spreading the blog post on <a href="https://twitter.com">Twitter</a> (where it got retweeted by all kinds of people in the research community), it attracted quite a few visitors and a large number of helpful comments. I even got in touch with a company that develops a software deployment automation solution as their main product.<br />
<br />
Another investigation that I did as part of my blog (without publishing in mind) was addressing a common criticism from various communities, <a href="https://lists.debian.org/debian-devel/2008/12/msg01007.html">such as the Debian community</a>, that Nix would not qualify as a viable package management solution because <a href="https://sandervanderburg.blogspot.com/2011/11/on-nix-nixos-and-filesystem-hierarchy.html">it does not comply with the Filesystem Hierarchy Standard (FHS)</a>.<br />
<br />
I also did <a href="https://sandervanderburg.blogspot.com/2011/12/evaluation-and-comparison-of-gobolinux.html">a comparison with the deployment properties of GoboLinux</a>, another Linux distribution that deliberately deviates from the FHS to show that a different filesystem organisation has clear benefits for making deployments more reliable and reproducible. The GoboLinux blog post appeared on <a href="http://reddit.com">Reddit</a> (both the NixOS and Linux channels) and attracted quite a few visitors.<br />
<br />
From these practical investigations I wrote a blog post that draws some <a href="https://sandervanderburg.blogspot.com/2011/12/techniques-and-lessons-for-improvement.html">general conclusions</a>.<br />
<br />
<h2>Reaching the end of my PhD research</h2>
<br />
After an interesting year, both from a research and a blogging perspective, I was reaching the final year of my PhD research (in the Netherlands, a PhD student's contract is typically only valid for four years).<br />
<br />
I had already slowly started writing my PhD thesis, but there was still some unfinished business. There were four (!!!) more research ideas that I wanted to publish about (which was, in retrospect, a very overambitious goal).<br />
<br />
One of these papers was a collaboration project in which we combined our knowledge about software deployment and construction with license compliance engineering to determine which source files are actually used in a binary so that we could detect whether it meets the terms and conditions of <a href="https://sandervanderburg.blogspot.com/2011/02/free-and-open-source-software.html">free and open-source licenses</a>.<br />
<br />
Although our contribution looked great and we were able to detect a compliance issue in <a href="https://ffmpeg.org/">FFmpeg</a>, a widely used open source project, the paper was rejected twice in a row. The second time, the reviews were really vague and not helpful at all. One of my co-authors called the reviewers extremely incompetent.<br />
<br />
After the second rejection, I was (sort of) done with it and extremely disappointed. I did not even want to revise it and submit it anywhere else. Nonetheless, I published the paper as a technical report, <a href="https://sandervanderburg.blogspot.com/2012/04/dynamic-analysis-of-build-processes-to.html">reported about it on my blog</a>, and added it as a chapter to my PhD thesis.<br />
<br />
(As a sidenote: more than two years later, we made another attempt to resurrect the paper. The revisions were quite a bit of work, but the third version finally got accepted at ASE 2014: one of the top general conferences in the software engineering domain.<br />
<br />
This was a happy moment for me -- I was so disappointed about the process, and I was happy to see that there were people who could motivate and convince me that we should not give up).<br />
<br />
Another research idea was formalizing infrastructure deployment. Sadly, the idea was not really considered novel -- it was mostly just an incremental improvement over our earlier work. As a result, I got two paper rejections in a row. After the second rejection, I abandoned the idea of publishing about it, but I still wrote a chapter about it in my PhD thesis.<br />
<br />
All the above rejections (and the corresponding reviews) really started to negatively impact my motivation. I wrote two blog posts about my observations: one blog post was about a common reason for rejecting a paper: <a href="https://sandervanderburg.blogspot.com/2012/01/engineering-versus-science.html">the complaint that a contribution is engineering, but not science</a> (which is quite weird for research in software engineering). Another blog post was about <a href="https://sandervanderburg.blogspot.com/2012/04/software-engineering-fractions.html">the difficulties in connecting academic research with software engineering practice</a>. From my experiences thus far, I concluded that there is a huge gap between the two.<br />
<br />
Fortunately, I still managed to gather enough energy to finish my third idea. I already had a proof-of-concept implementation for <a href="https://sandervanderburg.blogspot.com/2012/03/deployment-of-mutable-components.html">managing state of services deployed by Disnix</a> for a while. By pulling a few all-nighters, I managed to write a research paper (all by myself) and submitted it to HotSWUp 2012. That paper got instantly accepted, which was a good boost for my motivation.<br />
<br />
In the last few months, basically the only thing I could do was finish up my PhD thesis. To still keep my blog somewhat active, I wrote a number of posts about <a href="https://sandervanderburg.blogspot.com/2012/07/a-review-of-conferences-in-2008-2009.html">my</a> <a href="https://sandervanderburg.blogspot.com/2012/08/a-review-of-conferences-in-2010.html">conference</a> <a href="https://sandervanderburg.blogspot.com/2012/09/a-review-of-conferences-in-2011-2012.html">experiences</a>.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgOsG9Mw5LCORnQ9q628EXK970XwtAORAQ6C702jYYyy_Am2UpWIw8ko5UtxiZ_aWjoj52c-x5ewi4b6p2PQ68vQJP5QDvoksiDvAHNcuA7sVQ4WPFO2Ar4TdvtxOwLTa0VGAAqTKvIzl9l/s1600/P1050706.JPG" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="213" width="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgOsG9Mw5LCORnQ9q628EXK970XwtAORAQ6C702jYYyy_Am2UpWIw8ko5UtxiZ_aWjoj52c-x5ewi4b6p2PQ68vQJP5QDvoksiDvAHNcuA7sVQ4WPFO2Ar4TdvtxOwLTa0VGAAqTKvIzl9l/s320/P1050706.JPG" /></a></div>
<br />
Although I already had a very early proof-of-concept implementation, I never managed to finish my fourth research paper idea. This was not a problem for finishing my PhD thesis, as I already had enough material to complete it, but I still consider it one of the more interesting research ideas that I never got to finish. As of today, I still have not finished it or published about it (neither on my blog nor in a research paper).<br />
<br />
<h2>Leaving academia, working for industry</h2>
<br />
A couple of weeks before my contract with the university was about to expire, I finished the first draft of my PhD thesis and submitted it to the reading committee for review.<br />
<br />
Although the idea of having an academic research career crossed my mind several times, I ultimately decided that this was not something I wanted to pursue, for a variety of reasons. Most notably, the discrepancy between topics suitable for publishing and things that could be applied in practice was one of the major reasons.<br />
<br />
All that was left was looking for a new job. After <a href="https://sandervanderburg.blogspot.com/2012/10/my-post-phd-carreer-aka-leaving-academia.html">an interesting month of job searching</a>, I joined Conference Compass, a startup company that consisted of fewer than 10 people when I joined.<br />
<br />
One of the interesting technical challenges they were facing was setting up a product-line for their mobile conference apps. My past experience with deployment technologies turned out to come in quite handy.<br />
<br />
The Nix project did not disappear after all the people involved in the PDS project left the university (besides me, Eelco Dolstra (the author of the Nix package manager) and Rob Vermaas also joined an industrial company) -- the project moved to GitHub, increasing its popularity and the number of contributors.<br />
<br />
Because the Nix project continued and blogging had so many advantages for me personally, I decided to resume my blog. The only thing that changed is that my blog was no longer in service of a research project, but just a personal means to dive into technical subjects.<br />
<br />
<h2>Reintroducing Nix to different audiences</h2>
<br />
Almost at the same time that the Nix project moved to GitHub, the <a href="https://guix.gnu.org">GNU Guix project</a> was announced. GNU Guix is a package manager with objectives similar to those of the Nix package manager, but with some notable differences: instead of the Nix expression language, it uses Scheme as its configuration language.<br />
<br />
Moreover, the corresponding software distribution, GuixSD, exclusively provides free software.<br />
<br />
GNU Guix reuses the Nix daemon and related components, such as the Nix store, from the Nix package manager to organize and isolate software packages.<br />
<br />
I wrote a <a href="https://sandervanderburg.blogspot.com/2012/11/on-nix-and-gnu-guix.html">comparison blog post</a> that was posted on Reddit and <a href="http://news.ycombinator.com">Hacker News</a>, attracting a huge number of visitors. The number of visitors was several orders of magnitude higher than for all the blog posts I had written before. As of today, this blog post is still in my overall top 10.<br />
<br />
One of the things I did in the first month at Conference Compass was explaining the Nix package manager to my colleagues, who did not have much system administration experience or knowledge about package managers.<br />
<br />
I decided to use <a href="https://sandervanderburg.blogspot.com/2012/11/an-alternative-explaination-of-nix.html">a programming language-centered Nix explanation recipe</a>, as opposed to a system administration-centered explanation. In many ways, I consider this explanation recipe the best of the three that I wrote.<br />
<br />
This blog post also got posted on Reddit and Hacker News, attracting a huge number of visitors. In only one month, with two blog posts, I attracted more visitors to my blog than with all my previous blog posts combined.<br />
<br />
<h2>Developing an app building infrastructure</h2>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjtfjmT617foVmHJj3A2kxVaJItoZLIPpbhRLwFG5GKuchvH3_dU9lk1MUqDOicHCNS3xRf7DKzkpCovdsaKtVvTr704RFVPfvfxsj4lr5Mf2kvIqow5UVjY3A6nQn5u3l1ASib_lIRm23J/s1600/scr.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="400" width="312" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjtfjmT617foVmHJj3A2kxVaJItoZLIPpbhRLwFG5GKuchvH3_dU9lk1MUqDOicHCNS3xRf7DKzkpCovdsaKtVvTr704RFVPfvfxsj4lr5Mf2kvIqow5UVjY3A6nQn5u3l1ASib_lIRm23J/s400/scr.jpg" /></a></div>
<br />
As explained earlier, Conference Compass was looking into developing a product-line for mobile conference apps.<br />
<br />
I did some of the work in the open, using a variety of tools from the Nix project and contributing back to it.<br />
<br />
<a href="https://sandervanderburg.blogspot.com/2012/11/building-android-applications-with-nix.html">I have packaged many components of the Android SDK and developed a function abstraction that automatically builds Android APKs</a>. Similarly, I also built <a href="https://sandervanderburg.blogspot.com/2012/12/deploying-ios-applications-with-nix.html">a function for iOS apps</a> (that works both with the simulator and real devices), and for <a href="https://sandervanderburg.blogspot.com/2014/01/building-appcelerator-titanium-apps.html">Appcelerator Titanium</a>: a JavaScript-based cross platform framework allowing you target a variety of mobile platforms including Android and iOS.<br />
<br />
In addition to the Nix-based app building infrastructure, I have also described how you can <a href="https://sandervanderburg.blogspot.com/2013/04/setting-up-hydra-build-cluster-for.html">set up Hydra: a Nix-based continuous integration service</a> to automatically build mobile apps and other software projects.<br />
<br />
It turns out that, in addition to ordinary software projects, Hydra also works well for distributing bleeding-edge builds of mobile apps -- for example, you can use your phone or tablet's web browser to automatically download and install any bleeding-edge build that you want.<br />
<br />
The only thing that was a bit of a challenge was <a href="https://sandervanderburg.blogspot.com/2014/08/wireless-ad-hoc-distributions-of-ios.html">distributing apps to iOS devices with Hydra</a>, but with some workarounds that was also possible.<br />
<br />
I have also developed <a href="https://sandervanderburg.blogspot.com/2017/12/controlling-hydra-server-from-nodejs.html">a Node.js package to conveniently integrate custom applications with Hydra</a>.<br />
<br />
<h2>Finishing up my PhD and defending my thesis</h2>
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEidWejka2ZeDXteVKmcxTDQ2kPFHYpEMZm_bFipInJlB1GQQV05u4RR2pb-v5HT16oL1AeM88zX_YAvrG_BHGw2uqUtcqpO7fNw6N-BOe4ohPtnxBYa6J32vkXiaekQTcmv-k2MPiusw395/s1600/pds.png" imageanchor="1"><img border="0" height="420" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEidWejka2ZeDXteVKmcxTDQ2kPFHYpEMZm_bFipInJlB1GQQV05u4RR2pb-v5HT16oL1AeM88zX_YAvrG_BHGw2uqUtcqpO7fNw6N-BOe4ohPtnxBYa6J32vkXiaekQTcmv-k2MPiusw395/s640/pds.png" width="520" /></a>
<br />
Although I left academia, the transition to industry was actually very gradual -- as explained earlier, while being employed at Conference Compass, I still had to finish and defend my PhD thesis.<br />
<br />
Several weeks before my planned defence date, I received feedback from my reading committee about the draft that I had finished in my last month at the university. This was a very stressful period -- in addition to making revisions to my PhD thesis, I also had to arrange the printing and the logistics of the ceremony.<br />
<br />
I also wrote three more blog posts about my thesis and the defence process: I provided a <a href="https://sandervanderburg.blogspot.com/2013/05/a-reference-architecture-for.html">summary of my PhD thesis as a blog post</a>, I wrote about the <a href="https://sandervanderburg.blogspot.com/2013/06/dr-sander.html">defence ceremony</a>, and about <a href="https://sandervanderburg.blogspot.com/2013/06/my-phd-thesis-propositions-and-some.html">my PhD thesis propositions</a>.<br />
<br />
Writing thesis propositions is also a tradition in the Netherlands. Earlier that year, my former colleague <a href="http://felienne.com">Felienne Hermans</a> decided to blog and tweet about her PhD thesis propositions, and I did the same thing.<br />
<br />
PhD thesis propositions are typically not supposed to have a direct relationship to your PhD thesis, but they should be defendable. In addition to your thesis, the committee members are also allowed to ask you questions about your propositions.<br />
<br />
The blog post about my PhD thesis propositions (as of today) still regularly attracts visitors. The number of visitors to this blog post heavily outnumbers that of the summary blog post about my PhD thesis.<br />
<br />
In addition to my PhD thesis, there were more interesting post-academia research events: a journal paper submission finally got officially published (4 years after submitting the first draft!) and we managed to get our paper about discovering license compliance inconsistencies, which was previously rejected twice, accepted at ASE 2014.<br />
<br />
<h2>Learning Node.js and more about JavaScript</h2>
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjtnD7AG1YjcmVad6Wn5jt9OfaUXsT9FD3FIHSZV-6TmLwkz5tEjUy9EtQW1zJUXZmridWq19RIgjEjkZTnMSFaM8D2bGKyk3IXqTdmxjNKa_CF8HZrIISlGM4MuFyvnr-47453yINj7sEn/s1600/shapeprototypes.png" imageanchor="1"><img border="0" height="318" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjtnD7AG1YjcmVad6Wn5jt9OfaUXsT9FD3FIHSZV-6TmLwkz5tEjUy9EtQW1zJUXZmridWq19RIgjEjkZTnMSFaM8D2bGKyk3IXqTdmxjNKa_CF8HZrIISlGM4MuFyvnr-47453yINj7sEn/s400/shapeprototypes.png" width="520" /></a>
<br />
In addition to the app building infrastructure at Conference Compass, I have also spent a considerable amount of time learning about <a href="http://nodejs.org">Node.js</a> and its underlying concept: the asynchronous event loop. Although I already had some JavaScript programming experience, all my knowledge thus far was limited to the web browser.<br />
<br />
I learned about all kinds of new concepts, such as <a href="https://sandervanderburg.blogspot.com/2013/07/asynchronous-programming-with-javascript.html">callbacks</a> (and function-level scoping), <a href="https://sandervanderburg.blogspot.com/2013/12/asynchronous-programming-with.html">promises</a>, <a href="https://sandervanderburg.blogspot.com/2014/03/structured-asynchronous-programming.html">asynchronous programming (in general)</a> and <a href="https://sandervanderburg.blogspot.com/2016/01/integrating-callback-and-promise-based.html">mixing callbacks with promises</a>. Moreover, I also learned that (despite my earlier experiences in the <a href="https://sandervanderburg.blogspot.com/2011/06/concepts-of-programming-languages.html">concepts of programming languages course</a>) working with prototypes in JavaScript was more difficult than expected. I decided to address my earlier teaching shortcomings with <a href="https://sandervanderburg.blogspot.com/2013/02/yet-another-blog-post-about-object.html">a blog post</a>.<br />
<br />
With Titanium (the cross-platform mobile app development framework that uses JavaScript as an implementation language), beyond regular development work, I investigated how we can <a href="https://sandervanderburg.blogspot.com/2016/08/porting-node-simple-xmpp-from-nodejs.html">port a Node.js-based XMPP library to Titanium</a> and how we can <a href="https://sandervanderburg.blogspot.com/2017/02/mvc-lessons-in-titaniumalloy.html">separate concerns well enough to make a simple, reliable chat application</a>.<br />
<br />
<h2>Building a service management platform and implementing major Disnix improvements</h2>
<br />
At Conference Compass, somewhere in the middle of 2013, we decided to shift away from a single monolithic backend application for all our apps to a more modular approach in which each app has its own backend and its own storage.<br />
<br />
After a couple of brief experiments with Heroku, we shifted to a Nix-based approach in mid 2014. NixOps was used to automatically deploy virtual machines in the cloud (using Amazon's EC2 service), and Disnix became responsible for deploying all services to these virtual machines.<br />
<br />
In the Nix community, there was quite a bit of confusion about these two tools, because both use the Nix package manager and are designed for distributed deployment. <a href="https://sandervanderburg.blogspot.com/2015/03/on-nixops-disnix-service-deployment-and.html">I wrote a blog post to explain in what ways they are similar and different</a>.<br />
<br />
Over the course of 2015, most of my company work was concentrated on the service management platform. In addition to automating the deployment of all machines and services, I also implemented the following functionality:<br />
<br />
<ul>
<li>Backup support (<a href="https://sandervanderburg.blogspot.com/2015/07/deploying-state-with-disnix.html">using the experimental state management facilities of Dysnomia</a>)</li>
<li>Monitoring support with Datadog</li>
<li><a href="https://sandervanderburg.blogspot.com/2015/10/setting-up-basic-software-configuration.html">A general configuration management framework to organize and document all relevant configuration items</a></li>
<li>Various optimizations: <a href="https://sandervanderburg.blogspot.com/2015/08/deploying-target-specific-services-with.html">target-specific services that do not require unnecessary reconfigurations</a> and <a href="https://sandervanderburg.blogspot.com/2015/12/on-demand-service-activation-and-self.html">on-demand activation and self-termination of services</a>, to save RAM.</li>
</ul>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjVGVZlUHRw9JUsqDfPFI2x1u4VHd0KtxqYbf7KqRFAyXNn5zs_i8b1FvddN1LebkAnqnPwsbU1TLT2MuIN035Y15cqPKpCBALztmCHACRczHk_BjHHOTMgXoLgW5usW-k42w9ORCi8g-xB/s1600/visualize.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjVGVZlUHRw9JUsqDfPFI2x1u4VHd0KtxqYbf7KqRFAyXNn5zs_i8b1FvddN1LebkAnqnPwsbU1TLT2MuIN035Y15cqPKpCBALztmCHACRczHk_BjHHOTMgXoLgW5usW-k42w9ORCi8g-xB/s1600/visualize.png" width="510" /></a></div>
<br />
In late 2015, the first NixCon conference was organized, at which <a href="https://sandervanderburg.blogspot.com/2015/11/deploying-services-to-heterogeneous.html">I gave a presentation about Disnix and explained how it can be used for the deployment of microservices</a>. I received all kinds of useful feedback that I implemented in the first half of 2016:<br />
<br />
<ul>
<li>Most notably, <a href="https://sandervanderburg.blogspot.com/2016/05/mapping-services-to-containers-with.html">I changed the internal model of Disnix to also work with the notion of <strong>containers</strong></a> (environments that manage services), a feature that Dysnomia already supported but that could not be directly controlled from Disnix.</li>
<li><a href="https://sandervanderburg.blogspot.com/2016/06/deploying-containers-with-disnix-as.html">You can also manage multiple instances of container services on a single machine</a>.</li>
</ul>
<br />
Over time, I did many more interesting Disnix developments:<br />
<br />
<ul>
<li>I made modifications to use it as a <a href="https://sandervanderburg.blogspot.com/2016/06/using-disnix-as-remote-package-deployer.html">remote package deployer</a></li>
<li>I worked on <a href="https://sandervanderburg.blogspot.com/2017/01/some-programming-patterns-for-multi.html">an abstraction layer to more easily deal with concurrency</a></li>
<li>I built a tool that can <a href="https://sandervanderburg.blogspot.com/2017/03/reconstructing-disnix-deployment.html">reconstruct the Disnix deployment models from a network of already deployed services</a>.</li>
<li>I built <a href="https://sandervanderburg.blogspot.com/2018/01/diagnosing-problems-and-running.html">a tool that helps diagnosing problems</a></li>
<li>I created more public examples (based on <a href="https://sandervanderburg.blogspot.com/2018/02/deploying-systems-with-circular.html">Chord</a> and <a href="https://sandervanderburg.blogspot.com/2018/02/a-more-realistic-public-disnix-example.html">my own web framework</a>).</li>
</ul>
<br />
Furthermore, the Dynamic Disnix framework (an extension toolset that I developed for a research paper many years ago) also got all kinds of updates. For example, it was extended to <a href="https://sandervanderburg.blogspot.com/2015/07/assigning-port-numbers-to-microservices.html">automatically assign TCP/UDP port numbers</a> and to <a href="https://sandervanderburg.blogspot.com/2016/08/an-extended-self-adaptive-deployment.html">work with state migrations</a>.<br />
<br />
While I was working on the service management platform, five new Disnix versions were released (<a href="https://sandervanderburg.blogspot.com/2015/03/disnix-03-release-announcement.html">the first was 0.3</a>, the last 0.8). I wrote <a href="https://sandervanderburg.blogspot.com/2016/01/disnix-05-release-announcement-and-some.html">a blog post for the 0.5 release that explains all previously released versions, including the first two prototype iterations</a>.<br />
<br />
<h2>Brief return to web technology</h2>
<br />
As explained in the introduction, I already had the idea to start my blog while I was still actively doing web development.<br />
<br />
At some point, I needed to make some updates to web applications that I had developed for my voluntary work, which still use pieces of my old custom web framework.<br />
<br />
I had already released some pieces of it (most notably <a href="https://sandervanderburg.blogspot.com/2014/03/implementing-consistent-layouts-for.html">the layout manager</a>) on my GitHub page as a side project, but at some point I decided to release <a href="https://sandervanderburg.blogspot.com/2017/07/some-reflections-on-my-experiences-with.html">the remainder of the components</a> as well.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjUT6AwR_xqtFKNTMidy6Piu8nW8dDqWfeBwy2s_E9k18F1T3SU75i1uhmDAKwmRoB3M7OpdxaMhku6nET5TuLBlNAsdJ_1zYcprlKVp6ZIn7w7CZi-PoqhMEiCMnh5XGQUXhbJEC3qPyjB/s1600/desktop.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="718" data-original-width="1298" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjUT6AwR_xqtFKNTMidy6Piu8nW8dDqWfeBwy2s_E9k18F1T3SU75i1uhmDAKwmRoB3M7OpdxaMhku6nET5TuLBlNAsdJ_1zYcprlKVp6ZIn7w7CZi-PoqhMEiCMnh5XGQUXhbJEC3qPyjB/s640/desktop.png" width="500" /></a></div>
<br />
I also wrote a blog post about <a href="https://sandervanderburg.blogspot.com/2017/08/a-checklist-of-minimalistic-layout.html">my struggles composing a decent layout</a> and some pointers on "rational" layout decisions.<br />
<br />
<h2>Working on Nix generators</h2>
<br />
In addition to JavaScript development at Conference Compass, I was also using Nix-related tools for automating deployments of Node.js projects.<br />
<br />
Eventually, <a href="https://sandervanderburg.blogspot.com/2014/10/deploying-npm-packages-with-nix-package.html">I created node2nix</a> to make it possible to deploy NPM packages with the Nix package manager (at the time, this was already possible with npm2nix, but node2nix was developed to address important shortcomings of npm2nix, such as handling circular dependencies).<br />
<br />
Over time, I faced many more challenges that were node2nix/NPM related:<br />
<br />
<ul>
<li>To make NPM more compatible on Windows, the NPM authors introduced <a href="https://sandervanderburg.blogspot.com/2016/02/managing-npm-flat-module-installations.html">a "flattening strategy" that required a substantial rewrite of node2nix</a>.</li>
<li><a href="https://sandervanderburg.blogspot.com/2016/09/simulating-npm-global-package.html">Simulating global NPM package installations</a> in Nix expressions.</li>
<li><a href="https://sandervanderburg.blogspot.com/2017/03/subsituting-impure-version-specifiers.html">Substituting impure version specifiers</a> that may trigger accidental remote network requests.</li>
<li><a href="https://sandervanderburg.blogspot.com/2017/12/bypassing-npms-content-addressable.html">Bypassing NPM's content-addressable cache for local installations</a> (that is fundamentally incompatible with how NPM installations are done in Nix expressions).</li>
</ul>
<br />
When I released my custom web framework, I did the same for PHP: <a href="https://sandervanderburg.blogspot.com/2017/10/deploying-php-composer-packages-with.html">I created composer2nix</a> to allow PHP Composer projects to be deployed with the Nix package manager.<br />
<br />
In addition to building these generators, I also invested heavily in identifying common concepts and implementation decisions for both node2nix and composer2nix.<br />
<br />
Both tools use an internal DSL to generate Nix expressions (<a href="https://sandervanderburg.blogspot.com/2013/01/nijs-internal-dsl-for-nix-in-javascript.html">NiJS for JavaScript</a>, and <a href="https://sandervanderburg.blogspot.com/2017/09/pndp-internal-dsl-for-nix-in-php.html">PNDP for PHP</a>) as opposed to using strings.<br />
<br />
Both tools implement a domain model (that is close to NPM and Composer concepts) that gets <a href="https://sandervanderburg.blogspot.com/2017/11/creating-custom-object-transformations.html">translated to an object structure in the Nix expression language</a> with a generic translation strategy.<br />
<br />
<h2>Joining Mendix, working on Nix concepts and improvements</h2>
<br />
Slightly over 2 years ago I <a href="https://sandervanderburg.blogspot.com/2018/05/a-new-challenge.html">joined</a> <a href="https://mendix.com">Mendix</a>, a company that develops a low-code application development platform and related services.<br />
<br />
While I was learning about Mendix, I wrote <a href="https://sandervanderburg.blogspot.com/2018/06/my-introduction-to-mendix-and-low-code.html">a blog post that explains its basic concepts</a>.<br />
<br />
In addition, as a crafting project, I also <a href="https://sandervanderburg.blogspot.com/2018/08/automating-mendix-application.html">automated the deployment of Mendix applications with Nix</a> technologies (and even <a href="https://sandervanderburg.blogspot.com/2018/07/automating-mendix-application.html">wrote about it</a> on the <a href="https://www.mendix.com/blog/">public Mendix blog</a>).<br />
<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjYU00VF1K-iDJxhFehgkiopgJ_u79ry_E_WaULmSCHQ-2f9JMs5kP4nLKcKwMusH0mwXyf3ONMkW7GDOu0Ege-Dj1Jx5MKflOB8fdby_97re8q3mXwn2wXeCkJXSyal_hFBBt6-wLv5S4K/s1600/blog-nix-deployment-header%25402x-2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjYU00VF1K-iDJxhFehgkiopgJ_u79ry_E_WaULmSCHQ-2f9JMs5kP4nLKcKwMusH0mwXyf3ONMkW7GDOu0Ege-Dj1Jx5MKflOB8fdby_97re8q3mXwn2wXeCkJXSyal_hFBBt6-wLv5S4K/s640/blog-nix-deployment-header%25402x-2.png" width="500" data-original-width="1600" data-original-height="805" /></a></div>
<br />
While learning about the Mendix cloud deployment platform, I also got heavily involved in documenting its architecture. I wrote <a href="https://sandervanderburg.blogspot.com/2019/01/a-minimalistic-discovery-and.html">a blog post about my practices</a> (the notation that I used was inspired by the diagrams that I generate with Disnix). <a href="https://sandervanderburg.blogspot.com/2019/02/generating-functional-architecture.html">I even implemented some of these ideas in the Dynamic Disnix toolset</a>.<br />
<br />
When I had just joined Mendix, I was mostly learning about the company and their development stack. In my spare time, I made quite a few random Nix contributions:<br />
<br />
<ul>
<li>As a personal learning exercise and attempt to make the <i>stdenv.mkDerivation</i> function abstraction in Nix more understandable, I wrote <a href="https://sandervanderburg.blogspot.com/2018/07/layered-build-function-abstractions-for.html">layered build function abstractions</a>.</li>
<li>I also extended the lessons for building these abstractions to <a href="https://sandervanderburg.blogspot.com/2018/09/creating-nix-build-function.html">automate the deployment of SDKs with Nix</a>, most notably the Android SDK.</li>
<li>I also worked on <a href="https://sandervanderburg.blogspot.com/2018/10/auto-patching-prebuilt-binary-software.html">automating the process of patching prebuilt ELF binaries</a> so that these programs can be conveniently deployed by Nix. Most notably, it came in handy for the Android SDK.</li>
</ul>
<br />
Furthermore, I made some structural Disnix improvements as well:<br />
<br />
<ul>
<li><a href="https://sandervanderburg.blogspot.com/2019/05/a-nix-friendly-xml-based-data-exchange.html">I wrote a data exchange library</a> to more reliably consume Disnix deployment models (that are generated by Nix expressions), that also provides much better error reporting.</li>
<li><a href="https://sandervanderburg.blogspot.com/2019/08/a-new-input-model-transformation.html">I revised the Disnix model transformation pipeline</a> to improve maintainability. In addition, it also provides more configuration properties and a new model: the packages model.</li>
</ul>
<br />
<h2>Side projects</h2>
<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgGJmawIzZI07BZ4Iv1npG1J1rDb6XKLRFseWDw0a1RrkFPjXV8wcE9YsEGdvYhKh7Os59E920_KfqyyLlz8UosQwBZ-9rcEZw87Mi8nIBl_XDP7lxf2fkWY5lb9ibTyTDxKhoS56yXyBeU/s1600/nightflight.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgGJmawIzZI07BZ4Iv1npG1J1rDb6XKLRFseWDw0a1RrkFPjXV8wcE9YsEGdvYhKh7Os59E920_KfqyyLlz8UosQwBZ-9rcEZw87Mi8nIBl_XDP7lxf2fkWY5lb9ibTyTDxKhoS56yXyBeU/s1600/nightflight.png" /></a></div>
<br />
In addition to all the major themes above, there are also many in-between projects and blog posts about all kinds of random subjects.<br />
<br />
For example, one of my long-running side projects is the <a href="https://sandervanderburg.blogspot.com/2012/06/iff-file-format-experiments.html">IFF file format experiments project</a> (a container file format commonly used on the <a href="https://sandervanderburg.blogspot.com/2011/07/second-computer.html">Commodore Amiga</a>), which I started in the middle of my PhD.<br />
<br />
In addition to the viewer, I also developed a hacky <a href="https://sandervanderburg.blogspot.com/2012/01/porting-software-to-amigaos.html">Nix function to build software projects on AmigaOS</a>, <a href="https://sandervanderburg.blogspot.com/2013/11/emulating-amiga-display-modes.html">explained how to emulate the Amiga graphics modes</a>, <a href="https://sandervanderburg.blogspot.com/2014/05/rendering-8-bit-palettized-surfaces-in.html">ported the project to SDL 2.0</a>, and to <a href="https://sandervanderburg.blogspot.com/2014/06/porting-gnu-autotools-projects-to.html">Visual Studio so that it could run on Windows</a>.<br />
<br />
I also wrote many general Nix-related blog posts between major projects, such as:<br />
<br />
<ul>
<li><a href="https://sandervanderburg.blogspot.com/2013/06/setting-up-multi-user-nix-installation.html">How to deploy a multi-user Nix installation on conventional Linux distributions</a>.</li>
<li><a href="https://sandervanderburg.blogspot.com/2013/09/managing-user-environments-with-nix.html">Managing user environments with Nix</a>
<li><a href="https://sandervanderburg.blogspot.com/2013/09/composing-fhs-compatible-chroot.html">How to deploy games with Steam in NixOS</a></li>
<li><a href="https://sandervanderburg.blogspot.com/2013/12/using-nix-while-doing-development.html">Using Nix while doing development.</a></li>
<li><a href="https://sandervanderburg.blogspot.com/2014/07/managing-private-nix-packages-outside.html">How to easily build packages with Nix outside the Nixpkgs repository</a></li>
<li><a href="https://sandervanderburg.blogspot.com/2015/10/deploying-prebuilt-binary-software-with.html">How to patch prebuilt ELF binaries so that they can be deployed with Nix.</a></li>
<li><a href="https://sandervanderburg.blogspot.com/2015/02/a-sales-pitch-explanation-of-nixos.html">A sales pitch explanation of NixOS</a></li>
<li><a href="https://sandervanderburg.blogspot.com/2015/04/an-evaluation-and-comparison-of-snappy.html">Comparing the deployment aspects of Snappy Ubuntu with Nix</a></li>
<li><a href="https://sandervanderburg.blogspot.com/2016/10/push-and-pull-deployment-of-nix-packages.html">Push and pull deployment of Nix packages</a>
<li><a href="https://sandervanderburg.blogspot.com/2018/01/syntax-highlighting-nix-expressions-in.html">Syntax highlighting Nix expressions in mcedit</a></li>
</ul>
<br />
I also covered cultural aspects of being a developer, such as <a href="https://sandervanderburg.blogspot.com/2015/01/agile-software-development-my.html">my experiences with Agile software development and Scrum</a>, and <a href="https://sandervanderburg.blogspot.com/2019/10/on-motivation-and-purpose.html">developer motivation</a>.<br />
<br />
<h2>Some thoughts</h2>
<br />
In this blog post, I have explained my motivation for starting my blog 10 years ago and covered all the major projects I have been working on, including most of the blog posts that I have written.<br />
<br />
If you are a PhD student or a more seasoned researcher, then I would definitely encourage you to start a blog -- it gave me the following benefits:<br />
<br />
<ul>
<li>It makes your job much <strong>more interesting</strong>. All aspects of your research and teaching get attention, not just the papers, which typically only reflect a modest subset of your work.</li>
<li>It is a good and <strong>more accessible</strong> means to get frequent interaction with peers, practitioners, and outsiders who might have an interest in your work.</li>
<li>It <strong>improves</strong> your <strong>writing skills</strong>, which is also useful for writing papers.</li>
<li>It helps you to <strong>structure</strong> your <strong>work</strong>, by working on focused goals one at a time. You can use some of these pieces as ingredients for a research paper and/or your PhD thesis.</li>
<li>It may attract <strong>more visitors</strong> than research papers.</li>
</ul>
<br />
About the last benefit: in academia, there are all kinds of metrics to measure the impact of a researcher, such as the <a href="https://en.wikipedia.org/wiki/G-index">G-index</a> and <a href="https://en.wikipedia.org/wiki/H-index">H-index</a>. These metrics are sometimes taken very seriously, for example, by organizations that decide whether you can get a research grant or not.<br />
<br />
To give you a comparison: my most "popular" research paper titled: "Software deployment in a dynamic cloud: From device to service orientation in a hospital environment" was only downloaded (at the time of writing this blog post) 625 times from the <a href="https://dl.acm.org/doi/10.1109/CLOUD.2009.5071534">ACM digital library</a> and 240 times from <a href="https://ieeexplore.ieee.org/abstract/document/5071534">IEEE Xplore</a>. According to <a href="https://scholar.google.com/citations?user=E4JzksUAAAAJ&hl=en#d=gs_md_cita-d&u=%2Fcitations%3Fview_op%3Dview_citation%26hl%3Den%26user%3DE4JzksUAAAAJ%26citation_for_view%3DE4JzksUAAAAJ%3AzYLM7Y9cAGgC%26tzom%3D-60">Google Scholar</a>, it got 28 citations.<br />
<br />
My most popular blog post (that I wrote as an ingredient for my PhD research) is: <a href="https://sandervanderburg.blogspot.com/2011/11/on-nix-nixos-and-filesystem-hierarchy.html">On Nix, NixOS and the Filesystem Hierarchy Standard (FHS)</a>, which attracted 5654 views -- roughly an order of magnitude more than my most popular research paper. In addition, I wrote several more research-related blog posts that got a comparable number of views, such as the blog post about <a href="https://sandervanderburg.blogspot.com/2013/06/my-phd-thesis-propositions-and-some.html">my PhD thesis propositions</a>.<br />
<br />
After completing my PhD research, I wrote blog posts that attracted several orders of magnitude more visitors than the two blog posts mentioned above.<br />
<br />
(As a sidenote: I personally am not a big believer in the relevance of these numbers. What matters to me is the quality of my work, not quantity).<br />
<br />
The habit of regularly writing for yourself as part of your job is not unique to me. For example, the famous computer scientist <a href="https://www.cs.utexas.edu/users/EWD/">Edsger Dijkstra</a> wrote more than 1300 manuscripts (called EWDs) about topics that he considered important, without publication in mind.<br />
<br />
In <a href="https://www.cs.utexas.edu/users/EWD/ewd10xx/EWD1000.PDF">EWD 1000</a>, he says:<br />
<br />
<blockquote>
If there is one "scientific" discovery I am proud of, it is the discovery of the habit of writing without publication in mind. I experience it as a liberating habit: without it, doing the work becomes one thing and writing it down becomes another one, which is often viewed as an unpleasant burden. When working and writing have merged, that burden has been taken away.
</blockquote>
<br />
If you feel hesitant to start your own blog, he says the following about writer's block:<br />
<br />
<blockquote>
I only freed myself from that writer's block by the conscious decision not to write for a target audience but to write primarily to please myself.
</blockquote>
<br />
For software engineering practitioners (which I effectively became after leaving academia), a blog has benefits too:<br />
<br />
<ul>
<li>I consider good writing skills important for practitioners as well, for example to write specifications, API documentation, other technical documentation and end-user documentation. A blog helps you develop them.</li>
<li>Structuring your thoughts and work is also useful for software development projects, in particular free and open source projects.</li>
<li>It is also a good instrument to get in touch with development and open source communities. In addition to the Nix community, I also got a bit of attention in the Titanium community (with <a href="https://sandervanderburg.blogspot.com/2016/08/porting-node-simple-xmpp-from-nodejs.html">my XMPP library porting project</a>), the JavaScript community (for example, with <a href="https://sandervanderburg.blogspot.com/2013/02/yet-another-blog-post-about-object.html">my blog post about prototypes</a>) and more recently: the InfluxData community (with <a href="https://sandervanderburg.blogspot.com/2020/11/constructing-simple-alerting-system.html">my monitoring playground project</a>).</li>
</ul>
<br />
<h2>Concluding remarks</h2>
<br />
In this blog post, I covered most of my blog posts written in the last decade, but I did not elaborate much about 2020. Since 2020 is a year that will definitely not go unnoticed in the history books, I will write (as an exception) an annual reflection over 2020 tomorrow.<br />
<br />
Moreover, after browsing through all my blog posts since the beginning of my blog, I also realized that it is a bit hard to find relevant old information.<br />
<br />
To alleviate that problem, I have reorganized/standardized all my labels so that you can more easily search on subjects. <a href="http://sandervanderburg.nl/index.php/blog">On my homepage</a>, I have added an overview of all labels that I am currently using.<br />
Sander van der Burghttp://www.blogger.com/profile/12718166966821611609noreply@blogger.com0tag:blogger.com,1999:blog-1397115249631682228.post-44941197530371707352020-11-29T21:57:00.017+01:002020-12-27T17:25:51.173+01:00Constructing a simple alerting system with well-known open source projects<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjWxW6g1UXgRO_FY2UDPlZ5Yt9norTP20YlaqTlDrWYbieB9dCxiF3vpaMs3gftyCyphXDPzYlS6-nvnzYLoncprWBMmGIipHYgq40D2U2ZBmnaVG_i3DQooCjTNYSreL48W9O7trs7cE6B/s0/alertingexperiment.png" style="display: block; padding: 1em 0; text-align: center; " imageanchor="1"><img alt="" border="0" width="520" data-original-height="1152" data-original-width="2048" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjWxW6g1UXgRO_FY2UDPlZ5Yt9norTP20YlaqTlDrWYbieB9dCxiF3vpaMs3gftyCyphXDPzYlS6-nvnzYLoncprWBMmGIipHYgq40D2U2ZBmnaVG_i3DQooCjTNYSreL48W9O7trs7cE6B/s0/alertingexperiment.png"/></a></div>
<br />
Some time ago, I started experimenting with all kinds of monitoring and alerting technologies. For example, with the following technologies, I can develop a simple alerting system with relative ease:<br />
<br />
<ul>
<li><a href="https://www.influxdata.com/time-series-platform/telegraf/">Telegraf</a> is an agent that can be used to gather measurements and transfer the corresponding data to all kinds of storage solutions.</li>
<li><a href="https://www.influxdata.com/">InfluxDB</a> is a <a href="https://en.wikipedia.org/wiki/Time_series_database">time series</a> database platform that can store, manage and analyze timestamped data.</li>
<li><a href="https://www.influxdata.com/time-series-platform/kapacitor/">Kapacitor</a> is a real-time streaming data process engine, that can be used for a variety of purposes. I can use Kapacitor to analyze measurements and see if a threshold has been exceeded so that an alert can be triggered.</li>
<li><a href="https://alerta.io">Alerta</a> is a monitoring system that can store and de-duplicate alerts, and arrange blackouts.</li>
<li><a href="https://grafana.com">Grafana</a> is a multi-platform open source analytics and interactive visualization web application.</li>
</ul>
<br />
These technologies appear to be quite straightforward to use. However, as I was learning more about them, I discovered a number of oddities that may have big implications.<br />
<br />
Furthermore, testing and making incremental changes also turn out to be much more challenging than expected, making it very hard to diagnose and fix problems.<br />
<br />
In this blog post, I will describe how I built a simple monitoring and alerting system, and elaborate about my learning experiences.<br />
<br />
<h2>Building the alerting system</h2>
<br />
As described in the introduction, I can combine several technologies to create an alerting system. I will explain them more in detail in the upcoming sections.<br />
<br />
<h3>Telegraf</h3>
<br />
Telegraf is a pluggable agent that gathers measurements from a variety of <strong>inputs</strong> (such as system metrics, platform metrics, database metrics etc.) and sends them to a variety of <strong>outputs</strong>, typically storage solutions (database management systems such as InfluxDB, PostgreSQL or MongoDB). Telegraf has a large <a href="https://docs.influxdata.com/telegraf/v1.14/plugins/plugin-list/">plugin eco-system</a> that provides all kinds of integrations.<br />
<br />
In this blog post, I will use InfluxDB as an output storage backend. For the inputs, I will restrict myself to capturing only a subset of system metrics.<br />
<br />
With the following <i>telegraf.conf</i> configuration file, I can capture a variety of system metrics every 10 seconds:<br />
<br />
<pre>
[agent]
  interval = "10s"

[[outputs.influxdb]]
  urls = [ "http://test1:8086" ]
  database = "sysmetricsdb"
  username = "sysmetricsdb"
  password = "sysmetricsdb"

[[inputs.system]]
  # no configuration

[[inputs.cpu]]
  ## Whether to report per-cpu stats or not
  percpu = true
  ## Whether to report total system cpu stats or not
  totalcpu = true
  ## If true, collect raw CPU time metrics.
  collect_cpu_time = false
  ## If true, compute and report the sum of all non-idle CPU states.
  report_active = true

[[inputs.mem]]
  # no configuration
</pre>
<br />
With the above configuration file, I can collect the following metrics:<br />
<ul>
<li>System metrics, such as the hostname and system load.</li>
<li>CPU metrics, such as how much the CPU cores on a machine are utilized, including the total CPU activity.</li>
<li>Memory (RAM) metrics.</li>
</ul>
<br />
The data will be stored in an InfluxDB database named: <i>sysmetricsdb</i>, hosted on a remote machine with the host name: <i>test1</i>.<br />
<br />
<h3>InfluxDB</h3>
<br />
As explained earlier, InfluxDB is a time series database platform that can store, manage and analyze timestamped data. In many ways, InfluxDB resembles relational databases, but there are also some notable differences.<br />
<br />
The query language that InfluxDB uses is called <a href="https://docs.influxdata.com/influxdb/v1.8/query_language/">InfluxQL</a> (that shares many similarities with SQL).<br />
<br />
For example, with the following query I can retrieve the first three data points from the <i>cpu</i> measurement, which contains the CPU-related measurements collected by Telegraf:<br />
<br />
<pre>
> precision rfc3339
> select * from "cpu" limit 3
</pre>
<br />
providing me the following result set:<br />
<br />
<pre style="overflow: auto;">
name: cpu
time cpu host usage_active usage_guest usage_guest_nice usage_idle usage_iowait usage_irq usage_nice usage_softirq usage_steal usage_system usage_user
---- --- ---- ------------ ----------- ---------------- ---------- ------------ --------- ---------- ------------- ----------- ------------ ----------
2020-11-16T15:36:00Z cpu-total test2 10.665258711721098 0 0 89.3347412882789 0.10559662090813073 0 0 0.10559662090813073 0 8.658922914466714 1.79514255543822
2020-11-16T15:36:00Z cpu0 test2 10.665258711721098 0 0 89.3347412882789 0.10559662090813073 0 0 0.10559662090813073 0 8.658922914466714 1.79514255543822
2020-11-16T15:36:10Z cpu-total test2 0.1055966209080346 0 0 99.89440337909197 0 0 0 0.10559662090813073 0 0 0
</pre>
<br />
As you may notice by looking at the output above, every data point has a timestamp and a number of fields capturing CPU metrics:<br />
<br />
<ul>
<li><i>cpu</i> identifies the CPU core.</li>
<li><i>host</i> contains the host name of the machine.</li>
<li>The remainder of the fields contain all kinds of CPU metrics, e.g. how much CPU time is consumed by the system (<i>usage_system</i>), the user (<i>usage_user</i>), by waiting for IO (<i>usage_iowait</i>) etc.</li>
<li>The <i>usage_active</i> field contains the total CPU activity percentage, which is going to be useful to develop an alert that will warn us if there is too much CPU activity for a long period of time.</li>
</ul>
<br />
Aside from the fact that all data is timestamp based, data in InfluxDB has another notable difference compared to relational databases: an InfluxDB database is <strong>schemaless</strong>. You can add an arbitrary number of fields and tags to a data point without having to adjust the database structure (and migrate existing data to the new database structure).<br />
<br />
Fields and tags can contain arbitrary data, such as numeric values or strings. Tags are also <strong>indexed</strong> so that you can search for these values more efficiently. Furthermore, tags can be used to group data.<br />
<br />
For example, the <i>cpu</i> measurement collection has the following tags:<br />
<br />
<pre>
> SHOW TAG KEYS ON "sysmetricsdb" FROM "cpu";
name: cpu
tagKey
------
cpu
host
</pre>
<br />
As shown in the above output, the <i>cpu</i> and <i>host</i> properties are tags in the <i>cpu</i> measurement.<br />
<br />
We can use these tags to search for all data points related to a CPU core and/or host machine. Moreover, we can use these tags for grouping, allowing us to compute aggregate values, such as the mean value per CPU core and host.<br />
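<br />
For example, the following query (an illustrative sketch based on the measurements shown earlier) computes the mean CPU activity percentage per host and CPU core over the last hour:<br />
<br />
<pre>
> SELECT mean("usage_active") FROM "cpu"
    WHERE time > now() - 1h
    GROUP BY "host", "cpu"
</pre>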
<br />
Beyond storing and retrieving data, InfluxDB has many useful additional features:<br />
<br />
<ul>
<li>You can automatically <a href="https://docs.influxdata.com/influxdb/v1.8/query_language/sample-data/"><strong>sample</strong> data</a> and run <a href="https://docs.influxdata.com/influxdb/v1.8/query_language/continuous_queries/"><strong>continuous queries</strong></a> that generate and store sampled data in the background.</li>
<li>You can configure <a href="https://docs.influxdata.com/influxdb/v1.8/guides/downsample_and_retain/"><strong>retention policies</strong></a> so that data is no longer stored for an indefinite amount of time. For example, you can configure a retention policy to drop raw data after a certain amount of time, but retain the corresponding sampled data (see the example after this list).</li>
</ul>
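<br />
For example, the following statement (a sketch; the policy name and duration are arbitrary choices of mine) creates a default retention policy that keeps data in the <i>sysmetricsdb</i> database for four weeks only:<br />
<br />
<pre>
> CREATE RETENTION POLICY "four_weeks" ON "sysmetricsdb"
    DURATION 4w REPLICATION 1 DEFAULT
</pre>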
<br />
InfluxDB has an "open core" development model. The free and open source (FOSS) edition of the InfluxDB server (that is <a href="https://opensource.org/licenses/MIT">MIT licensed</a>) allows you to host multiple databases on a single server.<br />
<br />
However, if you also want <strong>horizontal scalability</strong> and/or <strong>high assurance</strong>, then you need to switch to the hosted InfluxDB versions -- data in InfluxDB is partitioned into so-called <strong>shards</strong> of a fixed size (the default shard size is 168 hours).<br />
<br />
These shards can be distributed over multiple InfluxDB servers. It is also possible to deploy multiple <strong>read replicas</strong> of the same shard to multiple InfluxDB servers improving read speed.<br />
<br />
<h3>Kapacitor</h3>
<br />
Kapacitor is a real-time streaming data process engine developed by InfluxData -- the same company that also develops InfluxDB and Telegraf.<br />
<br />
It can be used for all kinds of purposes. In my example cases, I will only use it to determine whether some threshold has been exceeded and an alert needs to be triggered.<br />
<br />
Kapacitor works with custom <strong>tasks</strong> that are written in a domain-specific language called the <a href="https://docs.influxdata.com/kapacitor/v1.5/tick/introduction/">TICK script language</a>. There are two kinds of tasks: <strong>stream</strong> and <strong>batch</strong> tasks. <a href="https://www.influxdata.com/blog/batch-processing-vs-stream-processing/">Both task types have advantages and disadvantages</a>.<br />
<br />
We can easily develop an alert that gets triggered if the CPU activity level is high for a relatively long period of time (more than 75% on average over 1 minute).<br />
<br />
To implement this alert as a stream job, we can write the following TICK script:<br />
<br />
<pre style="overflow: auto;">
dbrp "sysmetricsdb"."autogen"

stream
    |from()
        .measurement('cpu')
        .groupBy('host', 'cpu')
        .where(lambda: "cpu" != 'cpu-total')
    |window()
        .period(1m)
        .every(1m)
    |mean('usage_active')
    |alert()
        .message('Host: {{ index .Tags "host" }} has high cpu usage: {{ index .Fields "mean" }}')
        .warn(lambda: "mean" > 75.0)
        .crit(lambda: "mean" > 85.0)
        .alerta()
            .resource('{{ index .Tags "host" }}/{{ index .Tags "cpu" }}')
            .event('cpu overload')
            .value('{{ index .Fields "mean" }}')
</pre>
<br />
A stream job is built around the following principles:<br />
<br />
<ul>
<li>A stream task does not execute queries on an InfluxDB server. Instead, it creates a <strong>subscription</strong> to InfluxDB -- whenever a data point gets inserted into InfluxDB, it gets forwarded to Kapacitor as well.<br />
<br />
To make subscriptions work, both InfluxDB and Kapacitor need to be able to connect to each other with a public IP address.</li>
<li>A stream task defines a <strong>pipeline</strong> consisting of a number of <strong>nodes</strong> (connected with the <i>|</i> operator). Each node can <strong>consume</strong> data points, filter, transform, aggregate, or execute arbitrary operations (such as calling an external service), and <strong>produce</strong> new data points that can be propagated to the next node in the pipeline.</li>
<li>Every node also has <strong>property methods</strong> (such as <i>.measurement('cpu')</i>) making it possible to configure parameters.</li>
</ul>
<br />
The TICK script example shown above does the following:<br />
<br />
<ul>
<li>The <i>from</i> node consumes <i>cpu</i> data points from the InfluxDB subscription, groups them by <i>host</i> and <i>cpu</i>, and filters out data points with the <i>cpu-total</i> label, because we are only interested in the CPU consumption per core, not the total amount.</li>
<li>The <i>window</i> node states that we should aggregate data points over the last 1 minute and pass the resulting (aggregated) data points to the next node after one minute in time has elapsed. To aggregate data, Kapacitor will buffer data points in memory.</li>
<li>The <i>mean</i> node computes the mean value for <i>usage_active</i> for the aggregated data points.</li>
<li>The <i>alert</i> node is used to trigger an alert of a specific severity level: WARNING if the mean activity percentage is higher than 75%, and CRITICAL if it is higher than 85%. In the remaining cases, the status is considered OK. The alert is sent to Alerta.</li>
</ul>
<br />
It is also possible to write a similar kind of alerting script as a batch task:<br />
<br />
<pre style="overflow: auto;">
dbrp "sysmetricsdb"."autogen"

batch
    |query('''
        SELECT mean("usage_active")
        FROM "sysmetricsdb"."autogen"."cpu"
        WHERE "cpu" != 'cpu-total'
    ''')
        .period(1m)
        .every(1m)
        .groupBy('host', 'cpu')
    |alert()
        .message('Host: {{ index .Tags "host" }} has high cpu usage: {{ index .Fields "mean" }}')
        .warn(lambda: "mean" > 75.0)
        .crit(lambda: "mean" > 85.0)
        .alerta()
            .resource('{{ index .Tags "host" }}/{{ index .Tags "cpu" }}')
            .event('cpu overload')
            .value('{{ index .Fields "mean" }}')
</pre>
<br />
The above TICK script looks similar to the stream task shown earlier, but instead of using a subscription, the script queries the InfluxDB database (with an InfluxQL query) for data points over the last minute with a <i>query</i> node.<br />
<br />
Which approach for writing a CPU alert is best, you may wonder? Each of these two approaches has its own pros and cons:<br />
<br />
<ul>
<li>Stream tasks offer <strong>low latency</strong> responses -- when a data point appears, a stream task can immediately respond, whereas a batch task needs to query all the data points every minute to compute the mean percentage over the last minute.</li>
<li>Stream tasks maintain a buffer for aggregating the data points making it possible to only send <strong>incremental</strong> updates to Alerta. Batch tasks are stateless. As a result, they need to update the status of all hosts and CPUs every minute.</li>
<li>Processing data points is done synchronously and in sequential order -- if an update round to Alerta takes too long (which is more likely to happen with a batch task), then the next processing run may overlap with the previous one, causing all kinds of unpredictable results.<br />
<br />
It may also cause Kapacitor to eventually crash due to growing resource consumption.</li>
<li>Batch tasks may also <strong>miss</strong> data points -- while querying data over a certain time window, it may happen that a new data point gets inserted in that time window (that is being queried). This new data point will not be picked up by Kapacitor.<br />
<br />
A subscription made by a stream task, however, will never miss any data points.</li>
<li>Stream tasks can only work with data points that appear from the moment Kapacitor is started -- it <strong>cannot work</strong> with data points in the <strong>past</strong>.<br />
<br />
For example, if Kapacitor is restarted and some important event is triggered in the restart time window, Kapacitor will not notice that event, causing the alert to remain in its previous state.<br />
<br />
To work effectively with stream tasks, a <strong>continuous</strong> data stream is required that frequently reports on the status of a resource. Batch tasks, on the other hand, can work with historical data.</li>
<li>The fact that nodes maintain a buffer may also cause the <strong>RAM consumption</strong> of Kapacitor to grow considerably, if the data volumes are big.<br />
<br />
A batch task on the other hand, does not buffer any data and is more memory efficient.<br />
<br />
Another compelling advantage of batch tasks over stream tasks is that InfluxDB does all the work. The hosted version of InfluxDB can also horizontally scale.</li>
<li>Batch tasks can also <strong>aggregate</strong> data more efficiently (e.g. computing the mean value or sum of values over a certain time period).</li>
</ul>
<br />
I consider neither of these script types the optimal solution. However, for implementing the alerts I tend to have a slight preference for stream tasks, because of their low latency and incremental update properties.<br />
<br />
<h3>Alerta</h3>
<br />
As explained in the introduction, Alerta is a monitoring system that can store and de-duplicate alerts, and arrange blackouts.<br />
<br />
The Alerta server provides a REST API that can be used to query and modify alerting data and uses MongoDB or PostgreSQL as a storage database.<br />
<br />
There are also a variety of Alerta clients: the <i>alerta-cli</i> allows you to control the service from the command-line, and there is a web user interface that I will show later in this blog post.<br />
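<br />
For example, with a plain HTTP request we can query the alerts that are currently stored (a sketch; the endpoint and the API key environment variable are assumptions based on my own test setup):<br />
<br />
<pre>
$ curl -H "Authorization: Key $ALERTA_API_KEY" http://test1/api/alerts
</pre>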
<br />
<h2>Running experiments</h2>
<br />
With all the components described above in place, we can start running experiments to see if the CPU alert will work as expected. To gain better insight into the process, I can install Grafana, which allows me to visualize the measurements that are stored in InfluxDB.<br />
<br />
Configuring a dashboard and panel for visualizing the CPU activity rate was straightforward. I configured a new dashboard with the following variables:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiaT1tw7nTL0PFgaAGjgAfYyZyWJiehm-k7Cyv5H85GwjAM38UeUYub3GnbAO9-gLl2UOuIaoplmHuinnGMIxdllbdsHlza4YwhUKLP9woy0FVGSILY2FwUAY5Fjc0bBXKA-pS-278s1XqR/s0/variables.png" style="display: block; padding: 1em 0; text-align: center; "imageanchor="1"><img alt="" border="0" width="520" data-original-height="969" data-original-width="1387" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiaT1tw7nTL0PFgaAGjgAfYyZyWJiehm-k7Cyv5H85GwjAM38UeUYub3GnbAO9-gLl2UOuIaoplmHuinnGMIxdllbdsHlza4YwhUKLP9woy0FVGSILY2FwUAY5Fjc0bBXKA-pS-278s1XqR/s0/variables.png"/></a></div>
<br />
The above variables allow me to select for each machine in the network, which CPU core's activity percentage I want to visualize.<br />
<br />
I have configured the CPU panel as follows:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjiTA9hyp-JCEEjNtxkyohmR2Nbk_mx6KaElifZ4zJ3gJr95-I26Cg_s_9NbRhZF_KCwUoQtkqQ0xX1m9vOyHp-uCK-WJJ1yJ0XxrYAaxZSsTudTlqGE0sM2GDUQrXph2oNpABcBn0UP49I/s0/panelconfig.png" style="display: block; padding: 1em 0; text-align: center; " imageanchor="1"><img alt="" border="0" width="520" data-original-height="969" data-original-width="1387" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjiTA9hyp-JCEEjNtxkyohmR2Nbk_mx6KaElifZ4zJ3gJr95-I26Cg_s_9NbRhZF_KCwUoQtkqQ0xX1m9vOyHp-uCK-WJJ1yJ0XxrYAaxZSsTudTlqGE0sM2GDUQrXph2oNpABcBn0UP49I/s0/panelconfig.png"/></a></div>
<br />
In the above configuration, I query the <i>usage_active</i> field from the <i>cpu</i> measurement collection, using the dashboard variables: <i>cpu</i> and <i>host</i> to filter for the right target machine and CPU core.<br />
<br />
I have also configured the field unit to be a percentage value (between 0 and 100).<br />
<br />
When running the following command-line instruction on a test machine that runs Telegraf (<i>test2</i>), <a href="https://stackoverflow.com/questions/2925606/how-to-create-a-cpu-spike-with-a-bash-command">I can deliberately hog the CPU</a>:<br />
<br />
<pre>
$ dd if=/dev/zero of=/dev/null
</pre>
<br />
The above command reads zero bytes (one-by-one) and discards them by sending them to <i>/dev/null</i>, causing the CPU to remain utilized at a high level:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiMXjeTxn8eRwJEWfB5znmYCpNdWLMNXR4-WFeIJVO4NtDWQM36YgZvZszlti1ytj14jq5adkcCsFwxniEKK0dLbWQ72dt4W01W1KS9tLev-FRY7yhsb7zdBlCDnCnytVDISeKG_ldLf1cW/s0/hogcpu.png" style="display: block; padding: 1em 0; text-align: center; " imageanchor="1"><img alt="" border="0" width="520" data-original-height="409" data-original-width="852" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiMXjeTxn8eRwJEWfB5znmYCpNdWLMNXR4-WFeIJVO4NtDWQM36YgZvZszlti1ytj14jq5adkcCsFwxniEKK0dLbWQ72dt4W01W1KS9tLev-FRY7yhsb7zdBlCDnCnytVDISeKG_ldLf1cW/s0/hogcpu.png"/></a></div>
<br />
In the graph shown above, it is clearly visible that CPU core 0 on the <i>test2</i> machine remains utilized at 100% for several minutes.<br />
<br />
(As a sidenote, we can also <a href="https://stackoverflow.com/questions/20200982/how-to-generate-a-memory-shortage-using-bash-script/34755981">hog both the CPU and consume RAM at the same time</a> with a simple command line instruction).<br />
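<br />
A variant such as the following (an illustrative sketch of my own, not taken from the linked post) keeps a CPU core busy while <i>tail</i> buffers a gigabyte of data in RAM:<br />
<br />
<pre>
$ head -c 1G /dev/zero | tail
</pre>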
<br />
If we keep hogging the CPU and wait for at least a minute, the Alerta web interface dashboard will show a CRITICAL alert:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEig4lcebuMOo7wiSg381nk-0qPnuLPBJ0EG9w2LaXKnCWTS3AJQKohFoY23h3cqFH0DSJB-Sdrc4rHIGbIypGr0hABWjd0SrVTJw6YAJmJ9P4s9VwEkqobog3LMMe7uajXA0Gk7AN05xK1P/s0/criticalalert.png" style="display: block; padding: 1em 0; text-align: center; " imageanchor="1"><img alt="" width="520" border="0" data-original-height="745" data-original-width="1253" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEig4lcebuMOo7wiSg381nk-0qPnuLPBJ0EG9w2LaXKnCWTS3AJQKohFoY23h3cqFH0DSJB-Sdrc4rHIGbIypGr0hABWjd0SrVTJw6YAJmJ9P4s9VwEkqobog3LMMe7uajXA0Gk7AN05xK1P/s0/criticalalert.png"/></a></div>
<br />
If we stop the <i>dd</i> command, then the TICK script should eventually notice that the mean percentage drops below the WARNING threshold, causing the alert to go back into the <i>OK</i> state and disappear from the Alerta dashboard.<br />
<br />
<h2>Developing test cases</h2>
<br />
Being able to trigger an alert with a simple command-line instruction is useful, but not always convenient or effective -- one of the inconveniences is that we always have to wait at least one minute to get feedback.<br />
<br />
Moreover, when an alert does not work, it is not always easy to find the root cause. I have encountered the following problems that contribute to a failing alert:<br />
<br />
<ul>
<li>Telegraf may not be running and, as a result, not capturing the data points that need to be analyzed by the TICK script.</li>
<li>A subscription cannot be established between InfluxDB and Kapacitor. This may happen when Kapacitor cannot be reached through a public IP address.</li>
<li>Data points are collected, but they contain the wrong kinds of measurements.</li>
<li>The TICK script is functionally incorrect.</li>
</ul>
<br />
Fortunately, for stream tasks it is relatively easy to quickly find out whether an alert is functionally correct or not -- we can generate test cases that almost instantly trigger each possible outcome with a minimal number of data points.<br />
<br />
An interesting property of stream tasks is that they have no notion of time -- the <i>window</i> node (with a period of 1 minute) may suggest that Kapacitor computes the mean value of the data points every minute, but that is not what it actually does. Instead, Kapacitor only looks at the timestamps of the data points that it receives.<br />
<br />
When Kapacitor sees that the timestamps of the data points fit in the 1 minute time window, then it keeps buffering. As soon as a data point appears that is outside this time window, the <i>window</i> node relays an aggregated data point to the next node (that computes the mean value, which in turn is consumed by the alert node that decides whether an alert needs to be raised or not).<br />
<br />
We can exploit that knowledge to create a very minimal bash test script that triggers every possible outcome: OK, WARNING and CRITICAL:<br />
<br />
<pre style="font-size: 90%; overflow: auto;">
influxCmd="influx -database sysmetricsdb -host test1"

export ALERTA_ENDPOINT="http://test1"

### Trigger CRITICAL alert

# Force the average CPU consumption to be 100%
$influxCmd -execute "INSERT cpu,cpu=cpu0,host=test2 usage_active=100 0000000000"
$influxCmd -execute "INSERT cpu,cpu=cpu0,host=test2 usage_active=100 60000000000"
# This data point triggers the alert
$influxCmd -execute "INSERT cpu,cpu=cpu0,host=test2 usage_active=100 120000000000"

sleep 1
actualSeverity=$(alerta --output json query | jq -r '.[0].severity')

if [ "$actualSeverity" != "critical" ]
then
    echo "Expected severity: critical, but we got: $actualSeverity" >&2
    false
fi

### Trigger WARNING alert

# Force the average CPU consumption to be 80%
$influxCmd -execute "INSERT cpu,cpu=cpu0,host=test2 usage_active=80 180000000000"
$influxCmd -execute "INSERT cpu,cpu=cpu0,host=test2 usage_active=80 240000000000"
# This data point triggers the alert
$influxCmd -execute "INSERT cpu,cpu=cpu0,host=test2 usage_active=80 300000000000"

sleep 1
actualSeverity=$(alerta --output json query | jq -r '.[0].severity')

if [ "$actualSeverity" != "warning" ]
then
    echo "Expected severity: warning, but we got: $actualSeverity" >&2
    false
fi

### Trigger OK alert

# Force the average CPU consumption to be 0%
$influxCmd -execute "INSERT cpu,cpu=cpu0,host=test2 usage_active=0 300000000000"
$influxCmd -execute "INSERT cpu,cpu=cpu0,host=test2 usage_active=0 360000000000"
# This data point triggers the alert
$influxCmd -execute "INSERT cpu,cpu=cpu0,host=test2 usage_active=0 420000000000"

sleep 1
actualSeverity=$(alerta --output json query | jq -r '.[0].severity')

if [ "$actualSeverity" != "ok" ]
then
    echo "Expected severity: ok, but we got: $actualSeverity" >&2
    false
fi
</pre>
<br />
The shell script shown above automatically triggers all three possible outcomes of the CPU alert:<br />
<br />
<ul>
<li>CRITICAL is triggered by generating data points that force a mean activity percentage of 100%.</li>
<li>WARNING is triggered by a mean activity percentage of 80%.</li>
<li>OK is triggered by a mean activity percentage of 0%.</li>
</ul>
<br />
It uses the Alerta CLI to connect to the Alerta server to check whether the alert's severity level has the expected value.<br />
<br />
We need three data points to trigger each alert type -- the first two data points are on the boundaries of the 1 minute window (0 seconds and 60 seconds), forcing the mean value to become the specified CPU activity percentage.<br />
<br />
The third data point is deliberately outside the time window (of 1 minute), forcing the alert node to be triggered with a mean value over the previous two data points.<br />
<br />
Although the above test strategy works to quickly validate all possible outcomes, one impractical aspect is that the timestamps in the above example start with 0 (meaning 0 seconds after the epoch: January 1st 1970 00:00 UTC).<br />
<br />
If we also want to observe the data points generated by the above script in Grafana, we need to configure the panel to go back in time 50 years.<br />
<br />
Fortunately, I can also easily adjust the script to start with a base timestamp that is 1 hour in the past:<br />
<br />
<pre>
offset="$(($(date +%s) - 3600))"
</pre>
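<br />
The data points can then be inserted relative to this base timestamp. For example (a sketch; InfluxDB line protocol timestamps are expressed in nanoseconds):<br />
<br />
<pre>
$influxCmd -execute "INSERT cpu,cpu=cpu0,host=test2 usage_active=100 $((offset * 1000000000))"
$influxCmd -execute "INSERT cpu,cpu=cpu0,host=test2 usage_active=100 $(((offset + 60) * 1000000000))"
</pre>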
<br />
With this tiny adjustment, we should see the following CPU graph (displaying data points from the last hour) after running the test script:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiF30n7_ebGzq8EpRgxB0rM4V4z734eevDwVCCwJ5BHSklQpebiyI1mg9driz7Clo5cJk-imXw3CI8a2XIKeqlIXJsiaXN756fKeO9b8wHrS6QiuXSxZlQ_rSOn1JS7WAc9NACVKM4GPxrI/s0/testcpugraph.png" style="display: block; padding: 1em 0; text-align: center; " imageanchor="1"><img alt="" border="0" width="520" data-original-height="412" data-original-width="722" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiF30n7_ebGzq8EpRgxB0rM4V4z734eevDwVCCwJ5BHSklQpebiyI1mg9driz7Clo5cJk-imXw3CI8a2XIKeqlIXJsiaXN756fKeO9b8wHrS6QiuXSxZlQ_rSOn1JS7WAc9NACVKM4GPxrI/s0/testcpugraph.png"/></a></div>
<br />
As you may notice, the CPU activity level quickly goes from 100%, to 80%, to 0%, using only 9 data points.<br />
<br />
Although testing stream tasks (from a functional perspective) is quick and convenient, testing batch tasks in a similar way is difficult. Contrary to the stream task implementation, the <i>query</i> node in the batch task does have a notion of time (because of the <i>WHERE</i> clause that includes the <i>now()</i> expression).<br />
<br />
Moreover, the embedded InfluxQL query evaluates the mean values every minute, but the test script does not exactly know when this event triggers.<br />
<br />
The only way I could think of to (somewhat reliably) validate the outcomes is by creating a test script that continuously inserts data points for at least double the time window size (2 minutes), until Alerta reports the right alert status. If it still does not after a while, I can conclude that the alert is incorrectly implemented. A sketch of such a polling loop is shown below.<br />
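<br />
This sketch assumes the same <i>influxCmd</i> and Alerta CLI setup as in the test script shown earlier:<br />
<br />
<pre>
attempts=30

while [ "$attempts" -gt 0 ]
do
    # Insert a data point with the current timestamp that forces 100% activity
    $influxCmd -execute "INSERT cpu,cpu=cpu0,host=test2 usage_active=100 $(($(date +%s) * 1000000000))"

    actualSeverity=$(alerta --output json query | jq -r '.[0].severity')

    if [ "$actualSeverity" = "critical" ]
    then
        break
    fi

    sleep 10
    attempts=$((attempts - 1))
done

if [ "$attempts" -eq 0 ]
then
    echo "The alert never reached the critical state!" >&2
    false
fi
</pre>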
<br />
<h2>Automating the deployment</h2>
<br />
As you have probably guessed by now, to be able to conveniently experiment with all these services, and to reliably run tests in isolation, some form of <strong>deployment automation</strong> is an absolute must-have.<br />
<br />
Most people who do not know anything about my deployment technology preferences will probably go for <a href="https://docker.com">Docker</a> or <a href="https://docs.docker.com/compose/">docker-compose</a>, but I have decided to use a variety of solutions from the <a href="https://nixos.org">Nix project</a>.<br />
<br />
<a href="https://sandervanderburg.blogspot.com/2015/03/on-nixops-disnix-service-deployment-and.html">NixOps</a> is used to automatically deploy a network of <a href="https://sandervanderburg.blogspot.com/2011/01/nixos-purely-functional-linux.html">NixOS</a> machines -- I have created a logical and physical NixOps configuration that deploys two VirtualBox virtual machines.<br />
<br />
With the following command I can create and deploy the virtual machines:<br />
<br />
<pre>
$ nixops create network.nix network-virtualbox.nix -d test
$ nixops deploy -d test
</pre>
<br />
The first machine: <i>test1</i> is responsible for hosting the entire monitoring infrastructure (InfluxDB, Kapacitor, Alerta, Grafana), and the second machine (<i>test2</i>) runs Telegraf and the load tests.<br />
<br />
<a href="https://sandervanderburg.blogspot.com/2011/02/disnix-toolset-for-distributed.html">Disnix</a> (my own deployment tool) is responsible for deploying all services, such as InfluxDB, Kapacitor, Alerta, and the database storage backends. Contrary to docker-compose, Disnix does not work with containers (or other Docker objects, such as networks or volumes), but with arbitrary deployment units that are managed with a plugin system called <a href="https://sandervanderburg.blogspot.com/2012/03/deployment-of-mutable-components.html">Dysnomia</a>.<br />
<br />
Moreover, Disnix can also be used for distributed deployment in a network of machines.<br />
<br />
I have packaged all the services and captured them in a Disnix services model that specifies all deployable services, their types, and their inter-dependencies.<br />
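<br />
A fragment of such a services model could look as follows (a simplified, hypothetical sketch -- the actual model in the repository contains more services and parameters):<br />
<br />
<pre>
{distribution, invDistribution, system, pkgs}:

let
  customPkgs = import ./pkgs { inherit pkgs system; };
in
rec {
  influxdb = {
    name = "influxdb";
    pkg = customPkgs.influxdb;
    type = "process";
  };

  telegraf = {
    name = "telegraf";
    pkg = customPkgs.telegraf;
    dependsOn = {
      inherit influxdb;
    };
    type = "process";
  };
}
</pre>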
<br />
If I combine the services model with the NixOps network models, and a distribution model (that maps Telegraf and the test scripts to the <i>test2</i> machine and the remainder of the services to the first: <i>test1</i>), I can deploy the entire system:<br />
<br />
<pre>
$ export NIXOPS_DEPLOYMENT=test
$ export NIXOPS_USE_NIXOPS=1
$ disnixos-env -s services.nix \
-n network.nix \
-n network-virtualbox.nix \
-d distribution.nix
</pre>
<br />
The following diagram shows a possible deployment scenario of the system:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhOwqpehzJJWlAEs0OzHRjd8uEt8iI4EiQw3JkxQw-qCL0HxrSymUsWkOyYeT_YVAG1I8DHbARtwTi2cmV9LcgdhUDzKtIJ_RTHDEFc4UYr_Qjyw0Lf6gG9oqJvyxUi9Zb9bwwnfYTdgDBh/s0/deploymentarch.png" style="display: block; padding: 1em 0; text-align: center; " imageanchor="1"><img alt="" border="0" width="520" data-original-height="322" data-original-width="2311" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhOwqpehzJJWlAEs0OzHRjd8uEt8iI4EiQw3JkxQw-qCL0HxrSymUsWkOyYeT_YVAG1I8DHbARtwTi2cmV9LcgdhUDzKtIJ_RTHDEFc4UYr_Qjyw0Lf6gG9oqJvyxUi9Zb9bwwnfYTdgDBh/s0/deploymentarch.png"/></a></div>
<br />
The above diagram describes the following properties:<br />
<br />
<ul>
<li>The light-grey colored boxes denote <strong>machines</strong>. In the above diagram, we have two of them: <i>test1</i> and <i>test2</i> that correspond to the VirtualBox machines deployed by NixOps.</li>
<li>The dark-grey colored boxes denote <a href="https://sandervanderburg.blogspot.com/2016/05/mapping-services-to-containers-with.html"><strong>containers</strong> in a Disnix-context</a> (not to be confused with Linux or Docker containers). These are environments that manage other services.<br />
<br />
For example, a container service could be the PostgreSQL DBMS managing a number of PostgreSQL databases or the Apache HTTP server managing web applications.</li>
<li>The ovals denote <strong>services</strong> that could be any kind of deployment unit. In the above example, we have services that are running processes (managed by systemd), databases and web applications.</li>
<li>The arrows denote <strong>inter-dependencies</strong> between services. When a service has an inter-dependency on another service (i.e. the arrow points from the former to the latter), then the latter service needs to be activated first. Moreover, the former service also needs to know how the latter can be reached.</li>
<li>Services can also be <a href="https://sandervanderburg.blogspot.com/2020/04/deploying-container-and-application.html"><strong>container providers</strong></a> (as denoted by the arrows in the labels), stating that other services can be embedded inside this service.<br />
<br />
As already explained, the PostgreSQL DBMS is an example of such a service, because it can host multiple PostgreSQL databases.</li>
</ul>
<br />
Although the process components in the diagram above can also be conveniently deployed with Docker-based solutions (i.e. as I have explained in an earlier blog post, <a href="https://sandervanderburg.blogspot.com/2020/07/on-using-nix-and-docker-as-deployment.html">containers are somewhat confined and restricted processes</a>), the non-process integrations need to be managed by other means, such as writing extra shell instructions in Dockerfiles.<br />
<br />
In addition to deploying the system to machines managed by NixOps, it is also possible to use the <a href="https://sandervanderburg.blogspot.com/2011/02/using-nixos-for-declarative-deployment.html">NixOS test driver</a> -- the NixOS test driver automatically generates QEMU virtual machines with a shared Nix store, so that no disk images need to be created, making it possible to quickly spawn networks of virtual machines, with very small storage footprints.<br />
<br />
I can also create a minimal distribution model that only deploys the services required to run the test scripts -- Telegraf, Grafana and the front-end applications are not required, resulting in a much smaller deployment:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh3t4Z5JfHl6O0TfPTvVlY34W2ijm-yEiO1StMpzKonfD9t-xZnqurNcJOXHvVzN8baXy5XFQCKpPkLCMPseqyLaziH_hnNNHvtrGczt4lPnaGNpJ8RaTv0-pz4gy4wRJBG32rL-HCQ3lOQ/s0/deploymentarch-minimal.png" style="display: block; padding: 1em 0; text-align: center; " imageanchor="1"><img alt="" border="0" width="520" data-original-height="429" data-original-width="1053" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh3t4Z5JfHl6O0TfPTvVlY34W2ijm-yEiO1StMpzKonfD9t-xZnqurNcJOXHvVzN8baXy5XFQCKpPkLCMPseqyLaziH_hnNNHvtrGczt4lPnaGNpJ8RaTv0-pz4gy4wRJBG32rL-HCQ3lOQ/s0/deploymentarch-minimal.png"/></a></div>
<br />
As can be seen in the above diagram, there are far fewer components required.<br />
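<br />
Under the hood, such a minimal deployment boils down to a smaller distribution model, along the lines of the following sketch (the service names are illustrative):<br />
<br />
<pre>
{infrastructure}:

{
  influxdb = [ infrastructure.test1 ];
  kapacitor = [ infrastructure.test1 ];
  alerta = [ infrastructure.test1 ];
  test-cpu-alerts = [ infrastructure.test2 ];
}
</pre>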
<br />
In this virtual network that runs a minimal system, we can run automated tests for rapid feedback. For example, the following test driver script (implemented in Python) will run my test shell script shown earlier:<br />
<br />
<pre>
test2.succeed("test-cpu-alerts")
</pre>
<br />
With the following command I can automatically run the tests on the terminal:<br />
<br />
<pre>
$ nix-build release.nix -A tests
</pre>
<br />
<h2>Availability</h2>
<br />
The deployment recipes, test scripts and documentation describing the configuration steps are stored in the <a href="https://github.com/svanderburg/monitoring-playground">monitoring playground repository</a> that can be obtained from my GitHub page.<br />
<br />
Besides the CPU activity alert described in this blog post, I have also developed a memory alert that triggers if too much RAM is consumed for a longer period of time.<br />
<br />
In addition to virtual machines and services, there is also deployment automation in place, allowing you to also easily deploy Kapacitor TICK scripts and Grafana dashboards.<br />
<br />
To deploy the system, you need to use the very latest version of Disnix (version 0.10) that was released very recently.<br />
<br />
<h2>Acknowledgements</h2>
<br />
I would like to thank my employer: <a href="https://mendix.com">Mendix</a>, for giving me the opportunity to work on this project and write this blog post. Mendix allows developers to work two days per month on research projects, making projects like these possible.<br />
<br />
<h2>Presentation</h2>
<br />
I have given a presentation about this subject at Mendix. For convenience, I have embedded the slides:<br />
<br />
<iframe src="//www.slideshare.net/slideshow/embed_code/key/10fJKf8ywJ2HKS" width="595" height="485" frameborder="0" marginwidth="0" marginheight="0" scrolling="no" style="border:1px solid #CCC; border-width:1px; margin-bottom:5px; max-width: 100%;" allowfullscreen> </iframe> <div style="margin-bottom:5px"> <strong> <a href="//www.slideshare.net/sandervanderburg/the-monitoring-playground" title="The Monitoring Playground" target="_blank">The Monitoring Playground</a> </strong> from <strong><a href="https://www.slideshare.net/sandervanderburg" target="_blank">Sander van der Burg</a></strong> </div>
Sander van der Burghttp://www.blogger.com/profile/12718166966821611609noreply@blogger.com0tag:blogger.com,1999:blog-1397115249631682228.post-56019576714365362812020-10-31T16:05:00.003+01:002020-12-19T01:15:47.865+01:00Building multi-process Docker images with the Nix process management frameworkSome time ago, I have described <a href="https://sandervanderburg.blogspot.com/2019/11/a-nix-based-functional-organization-for.html">my experimental Nix-based process management framework</a> that makes it possible to automatically <strong>deploy</strong> running <strong>processes</strong> (sometimes also ambiguously called services) from declarative specifications written in the <a href="https://sandervanderburg.blogspot.com/2012/11/an-alternative-explaination-of-nix.html">Nix expression language</a>.<br />
<br />
The framework is built around two concepts. As its name implies, the <a href="https://sandervanderburg.blogspot.com/2011/01/nix-package-manager.html"><strong>Nix package manager</strong></a> is used to deploy all required packages and static artifacts, and a <a href="https://sandervanderburg.blogspot.com/2020/02/a-declarative-process-manager-agnostic.html"><strong>process manager</strong> of choice</a> (e.g. sysvinit, systemd, supervisord and others) is used to manage the life-cycles of the processes.<br />
<br />
Moreover, it is built around <strong>flexible concepts</strong> allowing integration with solutions that are not qualified as process managers (but can still be used as such), such as <a href="https://docker.com">Docker</a> -- <a href="https://sandervanderburg.blogspot.com/2020/08/experimenting-with-nix-and-service.html">each process instance can be deployed as a Docker container</a> with a shared Nix store using the host system's network.<br />
<br />
As explained in <a href="https://sandervanderburg.blogspot.com/2020/07/on-using-nix-and-docker-as-deployment.html">an earlier blog post</a>, Docker has become such a popular solution that it has become a standard for deploying (micro)services (often as a utility in the <a href="https://kubernetes.io">Kubernetes</a> solution stack).<br />
<br />
When deploying a system that consists of multiple services with Docker, a typical strategy (and recommended practice) is to use multiple containers that have <a href="https://runnable.com/docker/rails/run-multiple-processes-in-a-container">only one root application process</a>. Advantages of this approach are that Docker can control the life-cycles of the applications, and that each process is (somewhat) isolated/protected from other processes and the host system.<br />
<br />
By default, containers are isolated, but if they need to interact with other processes, then they can use all kinds of <strong>integration</strong> facilities -- for example, they can share namespaces, or use shared volumes.<br />
<br />
In some situations, it may also be desirable to <strong>deviate</strong> from the one root process per container practice -- for some systems, processes may need to interact quite intensively (e.g. with IPC mechanisms, shared files or shared memory, or a combination of these), in which case the container boundaries introduce more inconveniences than benefits.<br />
<br />
Moreover, when running multiple processes in a single container, common dependencies can also typically be shared more efficiently, leading to lower disk and RAM consumption.<br />
<br />
As explained in my previous blog post (that explores various Docker concepts), sharing dependencies between containers only works if containers are constructed from images that share the same layers with the same shared libraries. In practice, this form of sharing is not always as efficient as we want it to be.<br />
<br />
Configuring a Docker image to run multiple application processes is somewhat cumbersome -- <a href="https://docs.docker.com/config/containers/multi-service_container/">the official Docker documentation</a> describes two solutions: one that relies on a <strong>wrapper</strong> script that starts multiple processes in the background and a loop that waits for the "main process" to terminate, and the other is to use a <strong>process manager</strong>, such as supervisord.<br />
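<br />
The wrapper script approach typically boils down to something like the following (an illustrative sketch along the lines of the official documentation, with made-up process names):<br />
<br />
<pre>
#!/bin/bash

# Start both processes in the background
./my_first_process &
./my_second_process &

# Wait for any process to exit and propagate its exit status,
# so that the container stops when one of the processes terminates
wait -n
exit $?
</pre>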
<br />
I realised that I could solve this problem much more conveniently by combining the <i>dockerTools.buildImage {}</i> function in Nixpkgs (that builds Docker images with the Nix package manager) with the Nix process management abstractions.<br />
<br />
I have created my own abstraction function: <i>createMultiProcessImage</i> that builds multi-process Docker images, managed by any supported process manager that works in a Docker container.<br />
<br />
In this blog post, I will describe how this function is implemented and how it can be used.<br />
<br />
<h2>Creating images for single root process containers</h2>
<br />
As shown in earlier blog posts, creating a Docker image with Nix for a single root application process is very straightforward.<br />
<br />
For example, we can build an image that launches a trivial web application service with an embedded HTTP server (as shown in many of my previous blog posts), as follows:<br />
<br />
<pre>
{dockerTools, webapp}:

dockerTools.buildImage {
  name = "webapp";
  tag = "test";

  runAsRoot = ''
    ${dockerTools.shadowSetup}
    groupadd webapp
    useradd webapp -g webapp -d /dev/null
  '';

  config = {
    Env = [ "PORT=5000" ];
    Cmd = [ "${webapp}/bin/webapp" ];
    Expose = {
      "5000/tcp" = {};
    };
  };
}
</pre>
<br />
The above Nix expression (<i>default.nix</i>) invokes the <i>dockerTools.buildImage</i> function to automatically construct an image with the following properties:<br />
<br />
<ul>
<li>The image has the following name: <i>webapp</i> and the following version tag: <i>test</i>.</li>
<li>The web application service requires some <strong>state</strong> to be initialized before it can be used. To configure state, we can run instructions in a QEMU virtual machine with root privileges (<i>runAsRoot</i>).<br />
<br />
In the above deployment Nix expression, we create an unprivileged user and group named: <i>webapp</i>. For production deployments, it is typically recommended to drop root privileges, for security reasons.</li>
<li>The <i>Env</i> directive is used to configure environment variables. The <i>PORT</i> environment variable is used to configure the TCP port where the service should bind to.</li>
<li>The <i>Cmd</i> directive starts the <i>webapp</i> process in foreground mode. The life-cycle of the container is bound to this application process.</li>
<li><i>Expose</i> exposes TCP port 5000 to the public so that the service can respond to requests made by clients.</li>
</ul>
<br />
We can build the Docker image as follows:<br />
<br />
<pre>
$ nix-build
</pre>
<br />
load it into Docker with the following command:<br />
<br />
<pre>
$ docker load -i result
</pre>
<br />
and launch a container instance using the image as a template:<br />
<br />
<pre>
$ docker run -it -p 5000:5000 webapp:test
</pre>
<br />
If the deployment of the container succeeded, we should get a response from the <i>webapp</i> process, by running:<br />
<br />
<pre>
$ curl http://localhost:5000
<!DOCTYPE html>
<html>
<head>
<title>Simple test webapp</title>
</head>
<body>
Simple test webapp listening on port: 5000
</body>
</html>
</pre>
<br />
<h2>Creating multi-process images</h2>
<br />
As shown in previous blog posts, the <i>webapp</i> process is part of a bigger system, namely: a web application system with an Nginx reverse proxy forwarding requests to multiple <i>webapp</i> instances:<br />
<br />
<pre style="overflow: auto;">
{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager
}:

let
  sharedConstructors = import ../services-agnostic/constructors.nix {
    inherit pkgs stateDir runtimeDir logDir cacheDir tmpDir forceDisableUserChange processManager;
  };

  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir forceDisableUserChange processManager;
  };
in
rec {
  webapp = rec {
    port = 5000;
    dnsName = "webapp.local";

    pkg = constructors.webapp {
      inherit port;
    };
  };

  nginx = rec {
    port = 8080;

    pkg = sharedConstructors.nginxReverseProxyHostBased {
      webapps = [ webapp ];
      inherit port;
    } {};
  };
}
</pre>
<br />
The Nix expression above shows a simple <strong>processes model</strong> variant of that system, that consists of only two process instances:<br />
<br />
<ul>
<li>The <i>webapp</i> process is (as shown earlier) an application that returns a static HTML page.</li>
<li><i>nginx</i> is configured as a reverse proxy to forward incoming connections to multiple <i>webapp</i> instances using the virtual host header property (<i>dnsName</i>).<br />
<br />
If somebody connects to the <i>nginx</i> server with the following host name: <i>webapp.local</i> then the request is forwarded to the <i>webapp</i> service.</li>
</ul>
<br />
<h3>Configuration steps</h3>
<br />
To allow all processes in the process model shown to be deployed to a single container, we need to execute the following steps in the construction of an image:<br />
<br />
<ul>
<li>Instead of deploying a single package, such as <i>webapp</i>, we need to refer to a collection of packages and/or configuration files that can be managed with a process manager, such as sysvinit, systemd or supervisord.<br />
<br />
The Nix process management framework provides all kinds of Nix function abstractions to accomplish this.<br />
<br />
For example, the following function invocation builds a configuration profile for the sysvinit process manager, containing a collection of <i>sysvinit</i> scripts (also known as <a href="https://wiki.debian.org/LSBInitScripts">LSB Init</a> compliant scripts):<br />
<br />
<pre style="overflow: auto;">
profile = import ../create-managed-process/sysvinit/build-sysvinit-env.nix {
  exprFile = ./processes.nix;
  stateDir = "/var";
};
</pre>
<br />
</li>
<li>Similar to single root process containers, we may also need to initialize state. For example, we need to create common FHS state directories (e.g. <i>/tmp</i>, <i>/var</i> etc.) in which services can store their relevant state files (e.g. log files, temp files).<br />
<br />
This can be done by running the following command:<br />
<br />
<pre>
nixproc-init-state --state-dir /var
</pre>
</li>
<li>Another property that multiple process containers have in common is that they may also require the presence of unprivileged users and groups, for security reasons.<br />
<br />
With the following commands, we can automatically generate all required users and groups specified in a deployment profile:<br />
<br />
<pre>
${dysnomia}/bin/dysnomia-addgroups ${profile}
${dysnomia}/bin/dysnomia-addusers ${profile}
</pre>
</li>
<li>Instead of starting a (single root) application process, we need to start a process manager that manages the processes that we want to deploy. As already explained, the framework allows you to pick multiple options.</li>
</ul>
<br />
<h3>Starting a process manager as a root process</h3>
<br />
Of all the process managers that the framework currently supports, the most straightforward option to use in a Docker container is: supervisord.<br />
<br />
To use it, we can create a symlink to the supervisord configuration in the deployment profile:<br />
<br />
<pre>
ln -s ${profile} /etc/supervisor
</pre>
<br />
and then start supervisord as a root process with the following command directive:<br />
<br />
<pre>
Cmd = [
  "${pkgs.pythonPackages.supervisor}/bin/supervisord"
  "--nodaemon"
  "--configuration" "/etc/supervisor/supervisord.conf"
  "--logfile" "/var/log/supervisord.log"
  "--pidfile" "/var/run/supervisord.pid"
];
</pre>
<br />
(As a sidenote: creating a symlink is not strictly required, but makes it possible to control running services with the <i>supervisorctl</i> command-line tool).<br />
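<br />
For example, inside a running container we can inspect and restart the managed services as follows (assuming that supervisorctl picks up the configuration in <i>/etc/supervisor</i>, and that the container runs the <i>webapp</i> process from the example system):<br />
<br />
<pre>
$ supervisorctl status
$ supervisorctl restart webapp
</pre>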
<br />
Supervisord is not the only option. We can also use sysvinit scripts, but doing so is a bit tricky. As explained earlier, the life-cycle of a container is bound to a running root process (in foreground mode).<br />
<br />
sysvinit scripts do not run in the foreground, but start processes that daemonize and terminate immediately, leaving daemon processes behind that remain running in the background.<br />
<br />
As described in an earlier blog post about translating high-level process management concepts, it is also possible to run "daemons in the foreground" by creating a proxy script. We can also make a similar foreground proxy for a collection of daemons:<br />
<br />
<pre>
#!/bin/bash -e

_term()
{
    nixproc-sysvinit-runactivity -r stop ${profile}
    kill "$pid"
    exit 0
}

nixproc-sysvinit-runactivity start ${profile}

# Keep process running, but allow it to respond to the TERM and INT
# signals so that all scripts are stopped properly
trap _term TERM
trap _term INT

tail -f /dev/null & pid=$!
wait "$pid"
</pre>
<br />
The above proxy script does the following:<br />
<br />
<ul>
<li>It first starts all sysvinit scripts by invoking the <i>nixproc-sysvinit-runactivity start</i> command.</li>
<li>Then it registers a signal handler for the <i>TERM</i> and <i>INT</i> signals. The corresponding callback triggers a shutdown procedure.</li>
<li>We invoke a dummy command that keeps running in the foreground without consuming too many system resources (<i>tail -f /dev/null</i>) and we wait for it to terminate.</li>
<li>The signal handler properly deactivates all processes in reverse order (with the <i>nixproc-sysvinit-runactivity -r stop</i> command), and finally terminates the dummy command causing the script (and the container) to stop.</li>
</ul>
<br />
In addition to supervisord and sysvinit, we can also use <a href="https://sandervanderburg.blogspot.com/2020/06/using-disnix-as-simple-and-minimalistic.html">Disnix as a process manager</a> by using a similar strategy with a foreground proxy.<br />
<br />
<h3>Other configuration properties</h3>
<br />
The above configuration properties suffice to get a multi-process container running. However, to make working with such containers more practical from a user perspective, we may also want to:<br />
<br />
<ul>
<li>Add basic shell utilities to the image, so that you can control the processes, investigate log files (in case of errors), and do other maintenance tasks.</li>
<li>Add a <i>.bashrc</i> configuration file to make file coloring work for the <i>ls</i> command, and to provide a decent prompt in a shell session (see the sketch after this list).</li>
</ul>
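<br />
As an illustration, the sketch below shows what such additions could look like in terms of extra image contents. The <i>contents</i> parameter is a standard <i>dockerTools.buildImage</i> parameter and the package names are ordinary Nixpkgs attributes, but whether <i>createMultiProcessImage</i> exposes them in exactly this way is an assumption:<br />
<br />
<pre>
# Extra packages that an interactive image could include (sketch)
contents = [
  pkgs.bashInteractive # shell with a decent prompt
  pkgs.coreutils       # basic utilities, such as ls
  pkgs.less            # pager for investigating log files
];
</pre>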
<br />
<h2>Usage</h2>
<br />
The configuration steps described in the previous section are wrapped into a function named: <i>createMultiProcessImage</i>, which itself is a thin wrapper around the <i>dockerTools.buildImage</i> function in Nixpkgs -- it accepts the same parameters, plus a number of additional parameters that are specific to multi-process configurations.<br />
<br />
The following function invocation builds a multi-process container deploying our example system, using supervisord as a process manager:<br />
<br />
<pre style="overflow: auto;">
let
  pkgs = import <nixpkgs> {};
  system = builtins.currentSystem; # bound here, because it is not provided by the <nixpkgs> import

  createMultiProcessImage = import ../../nixproc/create-multi-process-image/create-multi-process-image.nix {
    inherit pkgs system;
    inherit (pkgs) dockerTools stdenv;
  };
in
createMultiProcessImage {
  name = "multiprocess";
  tag = "test";
  exprFile = ./processes.nix;
  stateDir = "/var";
  processManager = "supervisord";
}
</pre>
<br />
After building the image and deploying a container with the following commands:<br />
<br />
<pre>
$ nix-build
$ docker load -i result
$ docker run -it --name mycontainer --network host multiprocess:test
</pre>
<br />
we should be able to connect to the <i>webapp</i> instance via the <i>nginx</i> reverse proxy:<br />
<br />
<pre>
$ curl -H 'Host: webapp.local' http://localhost:8080
<!DOCTYPE html>
<html>
<head>
<title>Simple test webapp</title>
</head>
<body>
Simple test webapp listening on port: 5000
</body>
</html>
</pre>
<br />
As explained earlier, the constructed image also provides extra command-line utilities to do maintenance tasks, and control the life-cycle of the individual processes.<br />
<br />
For example, we can "connect" to the running container, and check which processes are running:<br />
<br />
<pre>
$ docker exec -it mycontainer /bin/bash
# supervisorctl
nginx RUNNING pid 11, uptime 0:00:38
webapp RUNNING pid 10, uptime 0:00:38
supervisor>
</pre>
<br />
If we change the <i>processManager</i> parameter to <i>sysvinit</i>, we can deploy a multi-process image in which the foreground proxy script is used as a root process (that starts and stops sysvinit scripts).<br />
<br />
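Concretely, this means changing a single parameter in the image expression shown earlier:<br />
<br />
<pre>
createMultiProcessImage {
  name = "multiprocess";
  tag = "test";
  exprFile = ./processes.nix;
  stateDir = "/var";
  processManager = "sysvinit"; # The foreground proxy script becomes the root process
}
</pre>
<br />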
We can control the life-cycle of each individual process by directly invoking the sysvinit scripts in the container:<br />
<br />
<pre>
$ docker exec -it mycontainer /bin/bash
$ /etc/rc.d/init.d/webapp status
webapp is running with Process ID(s) 33.
$ /etc/rc.d/init.d/nginx status
nginx is running with Process ID(s) 51.
</pre>
<br />
Although having extra command-line utilities to do administration tasks is useful, a disadvantage is that they considerably increase the size of the image.<br />
<br />
To save storage costs, it is also possible to disable <i>interactive</i> mode to exclude these packages:<br />
<br />
<pre style="overflow: auto;">
let
  pkgs = import <nixpkgs> {};
  system = builtins.currentSystem; # bound here, because it is not provided by the <nixpkgs> import

  createMultiProcessImage = import ../../nixproc/create-multi-process-image/create-multi-process-image.nix {
    inherit pkgs system;
    inherit (pkgs) dockerTools stdenv;
  };
in
createMultiProcessImage {
  name = "multiprocess";
  tag = "test";
  exprFile = ./processes.nix;
  stateDir = "/var";
  processManager = "supervisord";
  interactive = false; # Do not install any additional shell utilities
}
</pre>
<br />
<h2>Discussion</h2>
<br />
In this blog post, I have described a new utility function in the Nix process management framework: <i>createMultiProcessImage</i> -- a thin wrapper around the <i>dockerTools.buildImage</i> function that can be used to conveniently build multi-process Docker images, using any Docker-capable process manager that the Nix process management framework supports.<br />
<br />
Besides the fact that we can conveniently construct multi-process images, this function also has the advantage (similar to the <i>dockerTools.buildImage</i> function) that Nix is only required for the construction of the image. To deploy containers from a multi-process image, Nix is not a requirement.<br />
<br />
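For example, an image built with Nix can be transferred to and used on any machine that merely runs Docker (a hypothetical session; the host name is made up):<br />
<br />
<pre>
$ nix-build                    # requires Nix
$ scp result user@dockerhost:multiprocess.tar.gz
$ ssh user@dockerhost docker load -i multiprocess.tar.gz   # no Nix required
</pre>
<br />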
There is also a drawback: similar to "ordinary" multi-process container deployments, when we want to upgrade a single process, the entire container needs to be redeployed, which also requires us to terminate all other running processes.<br />
<br />
<h2>Availability</h2>
<br />
The <i>createMultiProcessImage</i> function is part of the current development version of the <a href="https://github.com/svanderburg/nix-processmgmt">Nix process management framework</a> that can be obtained from my GitHub page.<br />
<br />
Sander van der Burghttp://www.blogger.com/profile/12718166966821611609noreply@blogger.com0<br />
<br />
tag:blogger.com,1999:blog-1397115249631682228.post-69085703025015002962020-10-08T23:29:00.003+02:002020-12-19T01:13:35.879+01:00<br />
<br />
Transforming Disnix models to graphs and visualizing them<br />
<br />
In <a href="https://sandervanderburg.blogspot.com/2020/09/assigning-unique-ids-to-services-in.html">my previous blog post</a>, I have described a new tool in the Dynamic Disnix toolset that can be used to automatically assign unique numeric IDs to services in a <a href="https://sandervanderburg.blogspot.com/2011/02/disnix-toolset-for-distributed.html">Disnix</a> service model. Unique numeric IDs can represent all kinds of useful resources, such as TCP/UDP port numbers, user IDs (UIDs), and group IDs (GIDs).<br />
<br />
Although I am quite happy to have this tool at my disposal, implementing it was much more difficult and time consuming than I expected. Aside from the fact that the problem is not as obvious as it may sound, the main reason is that the Dynamic Disnix toolset was originally developed as a proof of concept implementation for a research paper under very high time pressure. As a result, it has accumulated quite a bit of <strong>technical debt</strong> that, as of today, is still at a fairly high level (but much better than it was when I completed the PoC).<br />
<br />
For the ID assigner tool, I needed to make changes to the foundations of the tools, such as the model parsing libraries. As a consequence, all kinds of related aspects in the toolset started to break, such as the deployment planning algorithm implementations.<br />
<br />
Fixing some of these algorithm implementations was much more difficult than I expected -- they were not properly documented, not decomposed into functions, had little to no reuse of common concepts and as a result, were difficult to understand and change. I was forced to re-read the papers that I used as a basis for these algorithms.<br />
<br />
To prevent myself from having to go through such a painful process again, I have decided to <strong>revise</strong> them in such a way that they are better understandable and maintainable.<br />
<br />
<h2>Dynamically distributing services</h2>
<br />
The deployment models in the core Disnix toolset are <strong>static</strong>. For example, the distribution of services to machines in the network is done in a <strong>distribution model</strong> in which the user has to manually map services in the services model to target machines in the infrastructure model (and optionally to <a href="https://sandervanderburg.blogspot.com/2016/06/deploying-containers-with-disnix-as.html">container services</a> hosted on the target machines).<br />
<br />
Each time a condition changes, e.g. the system needs to scale up or a machine crashes and the system needs to recover, a new distribution model must be configured and the system must be redeployed. For big complex systems that need to be reconfigured frequently, manually specifying new distribution models becomes very impractical.<br />
<br />
As I have already explained in older blog posts, to cope with the limitations of static deployment models (and other static configuration aspects), <a href="https://sandervanderburg.blogspot.com/2011/03/self-adaptive-deployment-with-disnix.html">I have developed Dynamic Disnix</a>, in which various configuration aspects can be automated, including the distribution of services to machines.<br />
<br />
A strategy for dynamically distributing services to machines can be specified in a <strong>QoS model</strong>, which typically consists of two phases:<br />
<br />
<ul>
<li>First, a <strong>candidate</strong> target <strong>selection</strong> must be made, in which for each service the appropriate candidate target machines are selected.<br />
<br />
Not all machines are capable of hosting a certain service for functional and non-functional reasons -- for example, an <i>i686-linux</i> machine is not capable of running a binary compiled for an <i>x86_64-linux</i> machine.<br />
<br />
A machine can also be exposed to the public internet, and as a result, may not be suitable to host a service that exposes privacy-sensitive information.</li>
<li>After the suitable candidate target machines are known for each service, we must decide to which candidate machine each service gets <strong>distributed</strong>.<br />
<br />
This can be done in many ways. The strategy that we want to use is typically based on all kinds of non-functional requirements.<br />
<br />
For example, we can optimize a system's reliability by minimizing the amount of network links between services, requiring a strategy in which services that depend on each other are mapped to the same machine, as much as possible.</li>
</ul>
<br />
<h2>Graph-based optimization problems</h2>
<br />
In the Dynamic Disnix toolset, I have implemented various kinds of distribution algorithms/strategies for all kinds of purposes.<br />
<br />
I did not "invent" most of them -- for some, I got inspiration from papers in the academic literature.<br />
<br />
Two of the more advanced deployment planning algorithms are graph-based, to accomplish the following goals:<br />
<br />
<ul>
<li><strong>Reliable deployment</strong>. Network links are a potential source of unreliability in a distributed system -- connections may fail, become slow, or could be interrupted frequently. By minimizing the amount of network links between services (by co-locating them on the same machine), their impact can be reduced. To keep deployments from becoming too expensive, this should be done with a minimal number of machines.<br />
<br />
As described in the paper: "<a href="https://gsd.uwaterloo.ca/publications/view/266">Reliable Deployment of Component-based Applications into Distributed Environments</a>" by A. Heydarnoori and F. Mavaddat, this problem can be transformed into a graph problem: the multiway cut problem (which is <a href="https://en.wikipedia.org/wiki/NP-hardness">NP-hard</a>).<br />
<br />
It can only be solved in polynomial time with an approximation algorithm that comes close to the optimal solution, <a href="https://en.wikipedia.org/wiki/P_versus_NP_problem">unless a proof that <i>P = NP</i> exists</a>.</li>
<li><strong>Fragile deployment</strong>. Inspired by the above deployment problem, I also came up with the opposite problem (as my own "invention") -- how can we make every connection between services a true network link (not local), so that we can test a system for robustness, using a minimal number of machines?<br />
<br />
This problem can be modeled as a graph coloring problem (which is an NP-hard problem as well). I used one of the approximation algorithms described in the paper: "<a href="https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.399.396&rep=rep1&type=pdf">New Methods to Color the Vertices of a Graph</a>" by D. Brélaz to implement a solution.</li>
</ul>
<br />
To work with these graph-based algorithms, I originally did not apply any transformations -- because of time pressure, I directly worked with objects from the Disnix models (e.g. services, target machines) and somewhat "glued" these together with generic data structures, such as lists and hash tables.<br />
<br />
As a result, when looking at the implementation, it is very hard to get an understanding of the process and how an implementation aspect relates to a concept described in the papers shown above.<br />
<br />
In my revised version, I have implemented a <strong>general purpose</strong> graph library that can be used to solve all kinds of general graph related problems.<br />
<br />
Aside from using a general graph library, I have also separated the graph-based generation processes into the following steps:<br />
<br />
<ul>
<li>After opening the Disnix input models (such as the services, infrastructure, and distribution models) I <strong>transform</strong> the models to a graph representing an instance of the problem domain.</li>
<li>After the graph has been generated, I <strong>apply</strong> the approximation <strong>algorithm</strong> to the graph data structure.</li>
<li>Finally, I <strong>transform</strong> the resolved graph <strong>back</strong> to a distribution model that should provide our desired distribution outcome.</li>
</ul>
<br />
This new organization provides better separation of concerns, common concepts can be reused (such as graph operations), and as a result, the implementations are much closer to the approximation algorithms described in the papers.<br />
<br />
<h2>Visualizing the generation process</h2>
<br />
Another advantage of having a reusable graph implementation is that we can easily extend it to <strong>visualize</strong> the problem graphs.<br />
<br />
When I combine these features together with my earlier work that <a href="https://sandervanderburg.blogspot.com/2019/02/generating-functional-architecture.html">visualizes services models</a>, and a new tool that visualizes infrastructure models, I can make the entire generation process transparent.<br />
<br />
For example, the following services model:<br />
<br />
<pre>
{system, pkgs, distribution, invDistribution}:

let
  customPkgs = import ./pkgs { inherit pkgs system; };
in
rec {
  testService1 = {
    name = "testService1";
    pkg = customPkgs.testService1;
    type = "echo";
  };

  testService2 = {
    name = "testService2";
    pkg = customPkgs.testService2;
    dependsOn = {
      inherit testService1;
    };
    type = "echo";
  };

  testService3 = {
    name = "testService3";
    pkg = customPkgs.testService3;
    dependsOn = {
      inherit testService1 testService2;
    };
    type = "echo";
  };
}
</pre>
<br />
can be visualized as follows:<br />
<br />
<pre>
$ dydisnix-visualize-services -s services.nix
</pre>
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi3yicERasasfXo3wf3KuJkAjXUemBp_UUmD7uhPnd6Sw6gogmpWxcCmdGC5gu4Q3FSvxcjvHGwrju7pagQW3sGQ-ymKYpR_YBz_XYF-FcAcb79cEipgXjAkqvCKBrgibghVCZWSIF2WsL0/s0/services.png" style="display: block; padding: 1em 0; text-align: center; " imageanchor="1"><img alt="" border="0" data-original-height="322" data-original-width="277" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi3yicERasasfXo3wf3KuJkAjXUemBp_UUmD7uhPnd6Sw6gogmpWxcCmdGC5gu4Q3FSvxcjvHGwrju7pagQW3sGQ-ymKYpR_YBz_XYF-FcAcb79cEipgXjAkqvCKBrgibghVCZWSIF2WsL0/s0/services.png"/></a></div>
<br />
The above services model and corresponding visualization capture the following properties:<br />
<br />
<ul>
<li>They describe three <strong>services</strong> (as denoted by ovals).</li>
<li>The arrows denote <strong>inter-dependency relationships</strong> (the <i>dependsOn</i> attribute in the services model).<br />
<br />
When a service has an inter-dependency on another service, the latter service has to be activated first, and the dependent service needs to know how to reach it.<br />
<br />
<i>testService2</i> depends on <i>testService1</i> and <i>testService3</i> depends on both the other two services.</li>
</ul>
<br />
We can also visualize the following infrastructure model:<br />
<br />
<pre>
{
  testtarget1 = {
    properties = {
      hostname = "testtarget1";
    };
    containers = {
      mysql-database = {
        mysqlPort = 3306;
      };
      echo = {};
    };
  };

  testtarget2 = {
    properties = {
      hostname = "testtarget2";
    };
    containers = {
      mysql-database = {
        mysqlPort = 3306;
      };
    };
  };

  testtarget3 = {
    properties = {
      hostname = "testtarget3";
    };
  };
}
</pre>
<br />
with the following command:<br />
<br />
<pre>
$ dydisnix-visualize-infra -i infrastructure.nix
</pre>
<br />
resulting in the following visualization:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjo4zLVMHwPKgOlp-AKkeISXO13Wfx2rhjUPGE_6O1-spqCPStPsEgQjW689rVOkWdg5va_-885JTeIqoymhzW6TlZnUe31MkdRKnYLPq5P_qL3JqCqr28xaZW9xdnVRq_sw7oHKgqJWdyv/s0/infrastructure.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" width="520" border="0" data-original-height="185" data-original-width="827" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjo4zLVMHwPKgOlp-AKkeISXO13Wfx2rhjUPGE_6O1-spqCPStPsEgQjW689rVOkWdg5va_-885JTeIqoymhzW6TlZnUe31MkdRKnYLPq5P_qL3JqCqr28xaZW9xdnVRq_sw7oHKgqJWdyv/s0/infrastructure.png"/></a></div>
<br />
The above infrastructure model declares three machines. Each target machine provides a number of container services (such as a MySQL database server, and <i>echo</i> that acts as a testing container).<br />
<br />
With the following command, we can generate a problem instance for the graph coloring problem using the above services and infrastructure models as inputs:<br />
<br />
<pre>
$ dydisnix-graphcol -s services.nix -i infrastructure.nix \
    --output-graph
</pre>
<br />
resulting in the following graph:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhDlWRWdWGF9p9WUPkamzX2v9sYsLwMiLGrLY2trdsN6EMvA3Tnjb1_4g9HBuP6pD0Yg_2T6aV9Y0vdiliR_0Bc9XeKt7CXcUKRcfw-sbfDKd83irkbC_1VJlpd9bwooAz7by3ThVaNukC0/s0/graphcol-instance.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" data-original-height="255" data-original-width="204" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhDlWRWdWGF9p9WUPkamzX2v9sYsLwMiLGrLY2trdsN6EMvA3Tnjb1_4g9HBuP6pD0Yg_2T6aV9Y0vdiliR_0Bc9XeKt7CXcUKRcfw-sbfDKd83irkbC_1VJlpd9bwooAz7by3ThVaNukC0/s0/graphcol-instance.png"/></a></div>
<br />
The graph shown above captures the following properties:<br />
<br />
<ul>
<li>Each service translates to a node.</li>
<li>When an inter-dependency relationship exists between services, it gets translated to a (bi-directional) link representing a network connection (the rationale is that services with an inter-dependency on each other interact through a network connection).</li>
</ul>
<br />
Each target machine translates to a color, which we can represent with a numeric index -- <i>0</i> is <i>testtarget1</i>, <i>1</i> is <i>testtarget2</i> and so on.<br />
<br />
The following command generates the resolved problem instance graph in which each vertex has a color assigned:<br />
<br />
<pre>
$ dydisnix-graphcol -s services.nix -i infrastructure.nix \
    --output-resolved-graph
</pre>
<br />
resulting in the following visualization:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj7NwqMju2MOfPp1B06TXTOq9vk9E9Xr-cMVKhR49NlRssy1BogP3MF3wKSfagNKfmZH47yb4kbwq5oTCJKjPWiipdm348GOj_TT36sQnbIgVeXd4I6r6Fe-uR3QjrlhsuAIYHqz_zhHr7g/s0/graphcol-resolved.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" data-original-height="255" data-original-width="248" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj7NwqMju2MOfPp1B06TXTOq9vk9E9Xr-cMVKhR49NlRssy1BogP3MF3wKSfagNKfmZH47yb4kbwq5oTCJKjPWiipdm348GOj_TT36sQnbIgVeXd4I6r6Fe-uR3QjrlhsuAIYHqz_zhHr7g/s0/graphcol-resolved.png"/></a></div>
<br />
(As a sidenote: in the above graph, colors are represented by numbers. In theory, I could also use real colors, but if I also want the graph to remain visually appealing, I need to solve a color-picking problem, which is beyond the scope of my refactoring objective).<br />
<br />
The resolved graph can be translated back into the following distribution model:<br />
<br />
<pre>
$ dydisnix-graphcol -s services.nix -i infrastructure.nix
{
  "testService2" = [
    "testtarget2"
  ];
  "testService1" = [
    "testtarget1"
  ];
  "testService3" = [
    "testtarget3"
  ];
}
</pre>
<br />
As you may notice, every service is distributed to a separate machine, so that every network link between services is a real network connection between machines.<br />
<br />
We can also visualize the problem instance of the multiway cut problem. For this, we also need a distribution model that declares, for each service, which target machines are candidates.<br />
<br />
The following distribution model makes all three target machines in the infrastructure model a candidate for every service:<br />
<br />
<pre style="overflow: auto;">
{infrastructure}:

{
  testService1 = [ infrastructure.testtarget1 infrastructure.testtarget2 infrastructure.testtarget3 ];
  testService2 = [ infrastructure.testtarget1 infrastructure.testtarget2 infrastructure.testtarget3 ];
  testService3 = [ infrastructure.testtarget1 infrastructure.testtarget2 infrastructure.testtarget3 ];
}
</pre>
<br />
With the following command we can generate a problem instance representing a host-application graph:<br />
<br />
<pre>
$ dydisnix-multiwaycut -s services.nix -i infrastructure.nix \
    -d distribution.nix --output-graph
</pre>
<br />
providing me the following output:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEho0Gy2W6DEo3JUfwvqa7jmNgdpvsIzroQsMPaNNUIM5Wt0F1fNysUKL45dKGu9vxhRQuJWEp6DW9YGnJPWzQaNd-NKL2C2xl9_riMpkAcndlAt8P3T51QL_LvmQHgmeHiD8u2nkanb4oOq/s0/multiwaycut-instance.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="520" data-original-height="407" data-original-width="739" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEho0Gy2W6DEo3JUfwvqa7jmNgdpvsIzroQsMPaNNUIM5Wt0F1fNysUKL45dKGu9vxhRQuJWEp6DW9YGnJPWzQaNd-NKL2C2xl9_riMpkAcndlAt8P3T51QL_LvmQHgmeHiD8u2nkanb4oOq/s0/multiwaycut-instance.png"/></a></div>
<br />
The above problem graph has the following properties:<br />
<br />
<ul>
<li>Each service translates to an <strong>app node</strong> (prefixed with <i>app:</i>) and each candidate target machine to a <strong>host node</strong> (prefixed with <i>host:</i>).</li>
<li>When a network connection between two services exists (implicitly derived from having an inter-dependency relationship), an edge is generated with a weight of <i>1</i>.</li>
<li>When a target machine is a candidate target for a service, an edge is generated with a weight of <i>n<sup>2</sup></i>, representing a very large number (a worked example follows this list).</li>
</ul>
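<br />
To see why this weight works, consider a small worked example (my own illustration, assuming that <i>n</i> denotes the number of app nodes): with three services, each candidate edge weighs 3<sup>2</sup> = 9, whereas all connection edges together weigh at most 3. To disconnect the terminals from each other, each app node must lose all but one of its candidate edges; cutting the last remaining candidate edge (weight 9) is always more expensive than cutting connection edges instead (total weight at most 3), so a minimal cut keeps every service attached to exactly one target machine.<br />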
<br />
The objective of solving the multiway cut problem is to cut edges in the graph in such a way that each terminal (host node) is disconnected from the other terminals (host nodes), while the total weight of the cut edges is minimized.<br />
<br />
When applying the approximation algorithm in the paper to the above graph:<br />
<br />
<pre>
$ dydisnix-multiwaycut -s services.nix -i infrastructure.nix \
    -d distribution.nix --output-resolved-graph
</pre>
<br />
we get the following resolved graph:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgCyAw8iCW5fGAbuCkU87uyY4EaThWPU_eOw0q0uP8n_kddSsikFXupl-C_0ZX_jlSWwbZ7Zl911L02pqL0NSiowymflUs_Xnml9SxJR7wZFr4Px-HJER1stF428XG0w3jDsBe8jTCFWxFR/s0/multiwaycut-resolved.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" width="520" border="0" data-original-height="407" data-original-width="916" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgCyAw8iCW5fGAbuCkU87uyY4EaThWPU_eOw0q0uP8n_kddSsikFXupl-C_0ZX_jlSWwbZ7Zl911L02pqL0NSiowymflUs_Xnml9SxJR7wZFr4Px-HJER1stF428XG0w3jDsBe8jTCFWxFR/s0/multiwaycut-resolved.png"/></a></div>
<br />
that can be transformed back into the following distribution model:<br />
<br />
<pre>
$ dydisnix-multiwaycut -s services.nix -i infrastructure.nix \
    -d distribution.nix
{
  "testService2" = [
    "testtarget1"
  ];
  "testService1" = [
    "testtarget1"
  ];
  "testService3" = [
    "testtarget1"
  ];
}
</pre>
<br />
As you may notice by looking at the resolved graph (in which the terminals: <i>testtarget2</i> and <i>testtarget3</i> are disconnected) and the distribution model output, all services are distributed to the same machine: <i>testtarget1</i>, making all connections between the services local connections.<br />
<br />
In this particular case, the solution is not only close to the optimal solution, but it is the optimal solution.<br />
<br />
<h2>Conclusion</h2>
<br />
In this blog post, I have described how I have revised the deployment planning algorithm implementations in the Dynamic Disnix toolset. Their concerns are now much better separated, and the graph-based algorithms now use a general purpose graph library, that can also be used for generating visualizations of the intermediate steps in the generation process.<br />
<br />
This revision was not on my short-term planned features list, but I am happy that I did the work. Retrospectively, I regret that I never took the time to finish things up properly after the submission of the paper. Although Dynamic Disnix's quality is still not where I want it to be, it is quite a step forward in making the toolset more usable.<br />
<br />
Sadly, it is almost 10 years since I started Dynamic Disnix and there is still no official release -- its technical debt is one of the important reasons that I never made one. Hopefully, with this step I can do it some day. :-)<br />
<br />
The good news is that I made the paper submission deadline and that the paper got accepted for presentation. It brought me to the <a href="https://www.hpi.uni-potsdam.de/giese/public/selfadapt/seams/">SEAMS 2011</a> conference (co-located with <a href="http://2011.icse-conferences.org/">ICSE 2011</a>) in Honolulu, Hawaii, allowing me to take pictures such as this one:<br />
<br />
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgkXrcObq0GDXEN2cn6gMluenRE0a_3uOmnh3OyLvGzZ0QJZ-jbJ5MKI1RAda0wxxeqJO1AbUibVOiM9Ygtvi_kPiHqOS0Ej_77zfe4joxHdMflQ7CZ0t-o1TbUvxDsjv-qWzLfm6XqirHP/s2048/P1040527.JPG" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="520" data-original-height="1365" data-original-width="2048" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgkXrcObq0GDXEN2cn6gMluenRE0a_3uOmnh3OyLvGzZ0QJZ-jbJ5MKI1RAda0wxxeqJO1AbUibVOiM9Ygtvi_kPiHqOS0Ej_77zfe4joxHdMflQ7CZ0t-o1TbUvxDsjv-qWzLfm6XqirHP/s600/P1040527.JPG"/></a></div>
<br />
<h2>Availability</h2>
<br />
The graph library and new implementations of the deployment planning algorithms described in this blog post are part of the current development version of <a href="https://github.com/svanderburg/dydisnix">Dynamic Disnix</a>.<br />
<br />
The paper: "A Self-Adaptive Deployment Framework for Service-Oriented Systems" describes the Dynamic Disnix framework (developed 9 years ago) and can be obtained from <a href="http://sandervanderburg.nl/index.php/publications">my publications</a> page.<br />
<br />
<h2>Acknowledgements</h2>
<br />
To generate the visualizations I used the <a href="https://graphviz.org/">Graphviz</a> toolset.<br />
<br />
Sander van der Burghttp://www.blogger.com/profile/12718166966821611609noreply@blogger.com0<br />
<br />
tag:blogger.com,1999:blog-1397115249631682228.post-12669392321035098592020-09-24T20:24:00.004+02:002020-12-19T01:13:23.168+01:00<br />
<br />
Assigning unique IDs to services in Disnix deployment models<br />
<br />
As described in some of my recent blog posts, one of the more advanced features of <a href="https://sandervanderburg.blogspot.com/2011/02/disnix-toolset-for-distributed.html">Disnix</a> as well as <a href="https://sandervanderburg.blogspot.com/2019/11/a-nix-based-functional-organization-for.html">the experimental Nix process management</a> framework is to deploy <a href="https://sandervanderburg.blogspot.com/2016/06/deploying-containers-with-disnix-as.html"><strong>multiple instances</strong></a> of the same service to the same machine.<br />
<br />
To make running multiple service instances on the same machine possible, these tools rely on <strong>conflict avoidance</strong> rather than isolation (typically used for <a href="https://sandervanderburg.blogspot.com/2020/07/on-using-nix-and-docker-as-deployment.html">containers</a>). To allow multiple service instances to co-exist on the same machine, they need to be configured in such a way that they do not allocate any conflicting resources.<br />
<br />
Although for small systems it is doable to configure multiple instances by hand, this process gets tedious and time consuming for larger and more technologically diverse systems.<br />
<br />
One particular kind of conflicting resource that can be configured automatically is <strong>numeric IDs</strong>, such as TCP/UDP port numbers, user IDs (UIDs), and group IDs (GIDs).<br />
<br />
In this blog post, I will describe how multiple service instances are configured (in Disnix and the process management framework) and how we can automatically assign unique numeric IDs to them.<br />
<br />
<h2>Configuring multiple service instances</h2>
<br />
To facilitate conflict avoidance in Disnix and the Nix process management framework, services are configured as follows:<br />
<br />
<pre style="overflow: auto;">
{createManagedProcess, tmpDir}:
{port, instanceSuffix ? "", instanceName ? "webapp${instanceSuffix}"}:

let
  webapp = import ../../webapp;
in
createManagedProcess {
  name = instanceName;
  description = "Simple web application";
  inherit instanceName;

  # This expression can both run in foreground or daemon mode.
  # The process manager can pick which mode it prefers.
  process = "${webapp}/bin/webapp";
  daemonArgs = [ "-D" ];

  environment = {
    PORT = port;
    PID_FILE = "${tmpDir}/${instanceName}.pid";
  };
  user = instanceName;
  credentials = {
    groups = {
      "${instanceName}" = {};
    };
    users = {
      "${instanceName}" = {
        group = instanceName;
        description = "Webapp";
      };
    };
  };

  overrides = {
    sysvinit = {
      runlevels = [ 3 4 5 ];
    };
  };
}
</pre>
<br />
The Nix expression shown above is a nested function that describes how to deploy a simple self-contained REST web application with an embedded HTTP server:<br />
<br />
<ul>
<li>The <strong>outer function header</strong> (first line) specifies all common build-time dependencies and configuration properties that the service needs:<br />
<br />
<ul>
<li><i>createManagedProcess</i> is a function that can be used to <a href="https://sandervanderburg.blogspot.com/2019/11/a-nix-based-functional-organization-for.html">define process manager agnostic configurations</a> that can be translated to configuration files for a variety of process managers (e.g. <i>systemd</i>, <i>launchd</i>, <i>supervisord</i> etc.).</li>
<li><i>tmpDir</i> refers to the temp directory in which temp files are stored.</li>
</ul>
</li>
<li>The <strong>inner function header</strong> (second line) specifies all instance parameters -- these are the parameters that must be configured in such a way that conflicts with other process instances are avoided:<br />
<br />
<ul>
<li>The <i>instanceName</i> parameter (that can be derived from the <i>instanceSuffix</i>) is a value used by some of the process management backends (e.g. the ones that invoke the <i>daemon</i> command) to derive a unique PID file for the process. When running multiple instances of the same process, each of them requires a unique PID file name.</li>
<li>The <i>port</i> parameter specifies the TCP port to which the service binds. Binding the service to a port that is already taken by another service causes the deployment of this service to fail.</li>
</ul>
</li>
<li>
In the function <strong>body</strong>, we invoke the <i>createManagedProcess</i> function to construct configuration files for all supported process manager backends to run the <i>webapp</i> process:<br />
<br />
<ul>
<li>As explained earlier, the <i>instanceName</i> is used to configure the <i>daemon</i> executable in such a way that it allocates a unique PID file.</li>
<li>The <i>process</i> parameter specifies which executable we need to run, both as a foreground process or daemon.</li>
<li>The <i>daemonArgs</i> parameter specifies which command-line parameters need to be propagated to the executable when the process should <a href="https://sandervanderburg.blogspot.com/2020/01/writing-well-behaving-daemon-in-c.html">daemonize on its own</a>.</li>
<li>The <i>environment</i> parameter specifies all environment variables. The <i>webapp</i> service uses these variables for runtime property configuration.</li>
<li>The <i>user</i> parameter is used to specify that the process should run as an unprivileged user. The <i>credentials</i> parameter is used to configure the creation of the user account and corresponding user group.</li>
<li>The <i>overrides</i> parameter is used to override the process manager-agnostic parameters with process manager-specific parameters. For the <i>sysvinit</i> backend, we configure the runlevels in which the service should run.</li>
</ul>
</li>
</ul>
<br />
Although the convention shown above makes it possible to avoid conflicts (assuming that all potential conflicts have been identified and exposed as function parameters), these parameters are typically configured manually:<br />
<br />
<pre style="overflow: auto;">
{ pkgs, system
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager ? "sysvinit"
, ...
}:

let
  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir forceDisableUserChange processManager;
  };

  processType = import ../../nixproc/derive-dysnomia-process-type.nix {
    inherit processManager;
  };
in
rec {
  webapp1 = rec {
    name = "webapp1";
    port = 5000;
    dnsName = "webapp.local";
    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "1";
    };
    type = processType;
  };

  webapp2 = rec {
    name = "webapp2";
    port = 5001;
    dnsName = "webapp.local";
    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "2";
    };
    type = processType;
  };
}
</pre>
<br />
The above Nix expression is both a valid Disnix <strong>services</strong> model and a valid <strong>processes</strong> model that composes two web application process instances that can run concurrently on the same machine by invoking the nested constructor function shown in the previous example:<br />
<br />
<ul>
<li>Each <i>webapp</i> instance has its own unique instance name, by specifying a unique numeric <i>instanceSuffix</i> that gets appended to the service name.</li>
<li>Every <i>webapp</i> instance binds to a unique TCP port (5000 and 5001) that should not conflict with system services or other process instances (a quick check follows this list).</li>
</ul>
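<br />
Assuming both instances have been deployed, a quick check can verify that they indeed co-exist on the same machine (a hypothetical session; the responses follow the example <i>webapp</i>'s output format, abbreviated here):<br />
<br />
<pre>
$ curl http://localhost:5000
...listening on port: 5000...
$ curl http://localhost:5001
...listening on port: 5001...
</pre>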
<br />
<h2>Previous work: assigning port numbers</h2>
<br />
Although configuring two process instances is still manageable, the configuration process becomes more tedious and time consuming when the number and diversity of processes (each having their own potential conflicts) grows.<br />
<br />
Five years ago, I already identified a resource that could be automatically assigned to services: <strong>port numbers</strong>.<br />
<br />
I have created <a href="https://sandervanderburg.blogspot.com/2015/07/assigning-port-numbers-to-microservices.html">a very simple port assigner tool</a> that allows you to specify a global ports pool and a target-specific pool. The former is used to assign globally unique port numbers to all services in the network, whereas the latter assigns port numbers that are unique to the target machine where the service is deployed (this is to cope with the scarcity of port numbers).<br />
<br />
Although the tool is quite useful for systems that do not consist of too many different kinds of components, I ran into a number of limitations when I wanted to manage a more diverse set of services:<br />
<br />
<ul>
<li>Port numbers are not the only numeric IDs that services may require. When deploying systems that consist of self-contained executables, you typically want to run them as unprivileged users for security reasons. User accounts on most UNIX-like systems require unique <strong>user IDs</strong>, and the corresponding users' groups require unique <strong>group IDs</strong>.</li>
<li>We typically want to manage <strong>multiple</strong> resource <strong>pools</strong>, for a variety of reasons. For example, when we have a number of HTTP server instances and a number of database instances, then we may want to pick port numbers in the 8000-9000 range for the HTTP servers, whereas for the database servers we want to use a different pool, such as 5000-6000.</li>
</ul>
<br />
<h2>Assigning unique numeric IDs</h2>
<br />
To address these shortcomings, I have developed a replacement tool that acts as a generic numeric ID assigner.<br />
<br />
This new ID assigner tool works with ID <strong>resource configuration</strong> files, such as:<br />
<br />
<pre>
rec {
  ports = {
    min = 5000;
    max = 6000;
    scope = "global";
  };

  uids = {
    min = 2000;
    max = 3000;
    scope = "global";
  };

  gids = uids;
}
</pre>
<br />
The above ID resource configuration file (<i>idresources.nix</i>) defines three resource pools: <i>ports</i> is a resource that represents port numbers to be assigned to the webapp processes, <i>uids</i> refers to user IDs and <i>gids</i> to group IDs. The group IDs' resource configuration is identical to the users' IDs configuration.<br />
<br />
Each resource attribute refers to the following configuration properties:<br />
<br />
<ul>
<li>The <i>min</i> value specifies the <strong>minimum</strong> ID to hand out, <i>max</i> the <strong>maximum</strong> ID.</li>
<li>The <i>scope</i> value specifies the <strong>scope</strong> of the resource pool. <i>global</i> (which is the default option) means that the IDs assigned from this resource pool to services are globally unique for the entire system.<br />
<br />
The <i>machine</i> scope can be used to assign IDs that are unique for the machine where a service is distributed to. When the latter option is used, services that are distributed to two separate machines may have the same ID (see the sketch after this list).</li>
</ul>
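<br />
For example, a sketch of a machine-scoped ports pool, following the same structure as the configuration shown above:<br />
<br />
<pre>
ports = {
  min = 5000;
  max = 6000;
  scope = "machine"; # IDs only need to be unique per target machine
};
</pre>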
<br />
We can adjust the services/processes model in such a way that every service will use dynamically assigned IDs and that each service specifies for which resources it requires a unique ID:<br />
<br />
<pre style="overflow: auto;">
{ pkgs, system
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager ? "sysvinit"
, ...
}:

let
  ids = if builtins.pathExists ./ids.nix then (import ./ids.nix).ids else {};

  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir forceDisableUserChange processManager ids;
  };

  processType = import ../../nixproc/derive-dysnomia-process-type.nix {
    inherit processManager;
  };
in
rec {
  webapp1 = rec {
    name = "webapp1";
    port = ids.ports.webapp1 or 0;
    dnsName = "webapp.local";
    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "1";
    };
    type = processType;
    requiresUniqueIdsFor = [ "ports" "uids" "gids" ];
  };

  webapp2 = rec {
    name = "webapp2";
    port = ids.ports.webapp2 or 0;
    dnsName = "webapp.local";
    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "2";
    };
    type = processType;
    requiresUniqueIdsFor = [ "ports" "uids" "gids" ];
  };
}
</pre>
<br />
In the above services/processes model, we have made the following changes:<br />
<br />
<ul>
<li>In the beginning of the expression, we <strong>import</strong> the dynamically generated <i>ids.nix</i> expression that provides ID assignments for each resource. If the <i>ids.nix</i> file does not exist, we generate an empty attribute set. We implement this construction (in which the absence of <i>ids.nix</i> can be tolerated) to allow the ID assigner to bootstrap the ID assignment process.</li>
<li>Every hardcoded <i>port</i> attribute of every service is replaced by a <strong>reference</strong> to the <i>ids</i> attribute set that is dynamically generated by the ID assigner tool. To allow the ID assigner to open the services model in the first run, we provide a fallback port value of 0.</li>
<li>Every service specifies for which resources it <strong>requires</strong> a unique ID through the <i>requiresUniqueIdsFor</i> attribute. In the above example, both service instances require a unique port number, a unique user ID for the user and a unique group ID for the group.</li>
</ul>
<br />
The port assignments are propagated as function parameters to the constructor functions that configure the services (as shown earlier in this blog post).<br />
<br />
We could also implement a similar strategy with the UIDs and GIDs, but a more convenient mechanism is to compose the function that creates the credentials, so that it transparently uses our <i>uids</i> and <i>gids</i> assignments.<br />
<br />
As shown in the expression above, the <i>ids</i> attribute set is also propagated to the constructors expression. The constructors expression indirectly composes the <i>createCredentials</i> function as follows:<br />
<br />
<pre>
{pkgs, ids ? {}, ...}:

{
  createCredentials = import ../../create-credentials {
    inherit (pkgs) stdenv;
    inherit ids;
  };

  ...
}
</pre>
<br />
The <i>ids</i> attribute set is propagated to the function that composes the <i>createCredentials</i> function. As a result, it will automatically assign the UIDs and GIDs in the <i>ids.nix</i> expression when the user configures a user or group with a name that exists in the <i>uids</i> and <i>gids</i> resource pools.<br />
<br />
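To sketch the idea (the comment below illustrates what the composed function injects; the exact mechanism is internal to the framework): when a service declares credentials such as:<br />
<br />
<pre>
credentials = {
  users = {
    webapp1 = {
      group = "webapp1";
      description = "Webapp";
      # Because ids.nix contains: uids.webapp1 = 2000; the composed
      # createCredentials function transparently assigns uid 2000 here
    };
  };
};
</pre>
<br />
then the user account <i>webapp1</i> automatically receives the UID (and its group the GID) recorded in <i>ids.nix</i>.<br />
<br />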
To make these UID and GID assignments go smoothly, it is recommended to give a process instance the same process name, instance name, user name and group name.<br />
<br />
<h2>Using the ID assigner tool</h2>
<br />
By combining the ID resources specification with the three Disnix models: a <strong>services model</strong> (that defines all distributable services, shown above), an <strong>infrastructure model</strong> (that captures all available target machines and their properties) and a <strong>distribution model</strong> (that maps services to target machines in the network), we can automatically generate an ids configuration that contains all ID assignments:<br />
<br />
<pre style="overflow: auto;">
$ dydisnix-id-assign -s services.nix -i infrastructure.nix \
    -d distribution.nix \
    --id-resources idresources.nix --output-file ids.nix
</pre>
<br />
The above command will generate an ids configuration file (<i>ids.nix</i>) that provides, for each resource in the ID resources model, a unique assignment to services that are distributed to a target machine in the network. (Services that are not distributed to any machine in the distribution model will be skipped, to avoid wasting resources).<br />
<br />
The output file (<i>ids.nix</i>) has the following structure:<br />
<br />
<pre>
{
  "ids" = {
    "gids" = {
      "webapp1" = 2000;
      "webapp2" = 2001;
    };
    "uids" = {
      "webapp1" = 2000;
      "webapp2" = 2001;
    };
    "ports" = {
      "webapp1" = 5000;
      "webapp2" = 5001;
    };
  };
  "lastAssignments" = {
    "gids" = 2001;
    "uids" = 2001;
    "ports" = 5001;
  };
}
</pre>
<br />
<ul>
<li>The <i>ids</i> attribute contains for each resource (defined in the ID resources model) the unique ID assignments per service. As shown earlier, both service instances require unique IDs for <i>ports</i>, <i>uids</i> and <i>gids</i>. The above attribute set stores the corresponding ID assignments.</li>
<li>The <i>lastAssignments</i> attribute memorizes the last ID assignment per resource. Once an ID is assigned, it will not be immediately reused. This is to allow roll backs and to prevent data from incorrectly getting owned by the wrong user accounts. Once the maximum ID limit is reached, the ID assigner will start searching for a free assignment from the beginning of the resource pool.</li>
</ul>
<br />
In addition to assigning IDs to services that are distributed to machines in the network, it is also possible to assign IDs to all services (regardless of whether they have been deployed or not):<br />
<br />
<pre style="overflow: auto;">
$ dydisnix-id-assign -s services.nix \
    --id-resources idresources.nix --output-file ids.nix
</pre>
<br />
Since the above command does not know anything about the target machines, it only works with an ID resources configuration that defines global scope resources.<br />
<br />
When you intend to upgrade an existing deployment, you typically want to retain already assigned IDs, while obsolete ID assignments should be removed, and new IDs should be assigned to services that have none yet. This is to prevent unnecessary redeployments.<br />
<br />
When removing the first webapp service and adding a third instance:<br />
<br />
<pre style="overflow: auto;">
{ pkgs, system
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager ? "sysvinit"
, ...
}:

let
  ids = if builtins.pathExists ./ids.nix then (import ./ids.nix).ids else {};

  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir forceDisableUserChange processManager ids;
  };

  processType = import ../../nixproc/derive-dysnomia-process-type.nix {
    inherit processManager;
  };
in
rec {
  webapp2 = rec {
    name = "webapp2";
    port = ids.ports.webapp2 or 0;
    dnsName = "webapp.local";
    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "2";
    };
    type = processType;
    requiresUniqueIdsFor = [ "ports" "uids" "gids" ];
  };

  webapp3 = rec {
    name = "webapp3";
    port = ids.ports.webapp3 or 0;
    dnsName = "webapp.local";
    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "3";
    };
    type = processType;
    requiresUniqueIdsFor = [ "ports" "uids" "gids" ];
  };
}
</pre>
<br />
And running the following command (that provides the current <i>ids.nix</i> as a parameter):<br />
<br />
<pre>
$ dydisnix-id-assign -s services.nix -i infrastructure.nix -d distribution.nix \
    --id-resources idresources.nix --ids ids.nix --output-file ids.nix
</pre>
<br />
we will get the following ID assignment configuration:<br />
<br />
<pre>
{
  "ids" = {
    "gids" = {
      "webapp2" = 2001;
      "webapp3" = 2002;
    };
    "uids" = {
      "webapp2" = 2001;
      "webapp3" = 2002;
    };
    "ports" = {
      "webapp2" = 5001;
      "webapp3" = 5002;
    };
  };
  "lastAssignments" = {
    "gids" = 2002;
    "uids" = 2002;
    "ports" = 5002;
  };
}
</pre>
<br />
As may be observed, since the <i>webapp2</i> process is in both the current and the previous configuration, its ID assignments will be retained. <i>webapp1</i> gets removed because it is no longer in the services model. <i>webapp3</i> gets the next numeric IDs from the resources pools.<br />
<br />
Because the configuration of <i>webapp2</i> stays the same, it does not need to be redeployed.<br />
<br />
The models shown earlier are valid Disnix services models. As a consequence, they can be used with Dynamic Disnix's ID assigner tool: <i>dydisnix-id-assign</i>.<br />
<br />
Although these Disnix services models are also valid processes models (used by the Nix process management framework), not every processes model is guaranteed to be compatible with a Disnix service model.<br />
<br />
For process models that are not compatible, it is possible to use the <i>nixproc-id-assign</i> tool that acts as a wrapper around the <i>dydisnix-id-assign</i> tool:<br />
<br />
<pre>
$ nixproc-id-assign --id-resources idresources.nix processes.nix
</pre>
<br />
Internally, the <i>nixproc-id-assign</i> tool converts a processes model to a Disnix service model (augmenting the process instance objects with missing properties) and propagates it to the <i>dydisnix-id-assign</i> tool.<br />
<br />
<h2>A more advanced example</h2>
<br />
The <i>webapp</i> processes example is fairly trivial and only needs unique IDs for three kinds of resources: port numbers, UIDs, and GIDs.<br />
<br />
I have also developed a more complex example for the Nix process management framework that exposes several commonly used system services on Linux systems, such as the Apache HTTP server, PostgreSQL and InfluxDB:<br />
<br />
<pre style="overflow: auto;">
{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager
}:

let
  ids = if builtins.pathExists ./ids.nix then (import ./ids.nix).ids else {};

  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir cacheDir forceDisableUserChange processManager ids;
  };
in
rec {
  apache = rec {
    port = ids.httpPorts.apache or 0;
    pkg = constructors.simpleWebappApache {
      inherit port;
      serverAdmin = "root@localhost";
    };
    requiresUniqueIdsFor = [ "httpPorts" "uids" "gids" ];
  };

  postgresql = rec {
    port = ids.postgresqlPorts.postgresql or 0;
    pkg = constructors.postgresql {
      inherit port;
    };
    requiresUniqueIdsFor = [ "postgresqlPorts" "uids" "gids" ];
  };

  influxdb = rec {
    httpPort = ids.influxdbPorts.influxdb or 0;
    rpcPort = httpPort + 2;
    pkg = constructors.simpleInfluxdb {
      inherit httpPort rpcPort;
    };
    requiresUniqueIdsFor = [ "influxdbPorts" "uids" "gids" ];
  };
}
</pre>
<br />
The above processes model exposes three service instances: an Apache HTTP server (that works with a simple configuration that serves web applications from a single virtual host), PostgreSQL and InfluxDB. Each service requires a unique user ID and group ID so that their privileges are separated.<br />
<br />
To make these services more accessible/usable, we do not use a shared ports resource pool. Instead, each service type consumes port numbers from their own resource pools.<br />
<br />
The following ID resources configuration can be used to provision the unique IDs to the services above:<br />
<br />
<pre>
rec {
  uids = {
    min = 2000;
    max = 3000;
  };

  gids = uids;

  httpPorts = {
    min = 8080;
    max = 8085;
  };

  postgresqlPorts = {
    min = 5432;
    max = 5532;
  };

  influxdbPorts = {
    min = 8086;
    max = 8096;
    step = 3;
  };
}
</pre>
<br />
The above ID resources configuration defines a shared UIDs and GIDs resource pool, but separate ports resource pools for each service type. This has the following implications if we deploy multiple instances of each service type:<br />
<br />
<ul>
<li>All Apache HTTP server instances get a TCP port assignment between 8080-8085.</li>
<li>All PostgreSQL server instances get a TCP port assignment between 5432-5532.</li>
<li>All InfluxDB server instances get a TCP port assignment between 8086-8096. An InfluxDB instance allocates two port numbers: one for the HTTP server and one for the RPC service (the latter's port number is the base port number + 2), so we use a step count of 3 so that this convention can be retained for each InfluxDB instance (a worked example follows this list).</li>
</ul>
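<br />
As a worked example of the step count (derived from the pool bounds above, not actual output of the tool): the first InfluxDB instance gets <i>httpPort</i> 8086 and thus <i>rpcPort</i> 8088, the second instance gets 8089 and 8091, and the third 8092 and 8094 -- the HTTP and RPC ports of different instances never collide.<br />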
<br />
<h2>Conclusion</h2>
<br />
In this blog post, I have described a new tool: <i>dydisnix-id-assign</i> that can be used to automatically assign unique numeric IDs to services in Disnix service models.<br />
<br />
Moreover, I have described <i>nixproc-id-assign</i>, which acts as a thin wrapper around this tool to automatically assign numeric IDs to services in the Nix process management framework's processes model.<br />
<br />
This tool replaces the old <i>dydisnix-port-assign</i> tool in the <a href="https://sandervanderburg.blogspot.com/2016/08/an-extended-self-adaptive-deployment.html">Dynamic Disnix toolset</a> (described in the blog post written five years ago) that is much more limited in its capabilities.<br />
<br />
<h2>Availability</h2>
<br />
The <i>dydisnix-id-assign</i> tool is available in the current development version of <a href="https://github.com/svanderburg/dydisnix">Dynamic Disnix</a>. The <i>nixproc-id-assign</i> is part of the current implementation of the <a href="https://github.com/svanderburg/nix-processmgmt">Nix process management framework prototype</a>.<br />
<br />
Sander van der Burghttp://www.blogger.com/profile/12718166966821611609noreply@blogger.com2<br />
<br />
tag:blogger.com,1999:blog-1397115249631682228.post-48090080118774471652020-08-11T21:18:00.004+02:002020-12-19T01:15:27.794+01:00<br />
<br />
Experimenting with Nix and the service management properties of Docker<br />
<br />
In <a href="https://sandervanderburg.blogspot.com/2020/07/on-using-nix-and-docker-as-deployment.html">the previous blog post</a>, I have analyzed <a href="https://nixos.org/nix">Nix</a> and <a href="https://www.docker.com">Docker</a> as deployment solutions and described in what ways these solutions are similar and different.<br />
<br />
To summarize my findings:<br />
<br />
<ul>
<li><a href="https://sandervanderburg.blogspot.com/2012/11/an-alternative-explaination-of-nix.html">Nix</a> is a <strong>source-based package manager</strong> responsible for obtaining, installing, configuring and upgrading packages in a reliable and reproducible manner and facilitating the construction of packages from source code and their dependencies.</li>
<li>Docker's purpose is to fully <strong>manage</strong> the life-cycle of <strong>applications</strong> (services and ordinary processes) in a reliable and reproducible manner, including their deployments.</li>
</ul>
<br />
As explained in my previous blog post, two prominent goals that both solutions have in common are facilitating <strong>reliable</strong> and <strong>reproducible</strong> deployment. They both use different kinds of techniques to accomplish these goals.<br />
<br />
Although Nix and Docker can be used for a variety of comparable use cases (such as constructing images, deploying test environments, and constructing packages from source code), one prominent feature that the Nix package manager does not provide is <strong>process</strong> (or service) <strong>management</strong>.<br />
<br />
In a Nix-based workflow you need to augment Nix with another solution that can facilitate process management.<br />
<br />
In this blog post, I will investigate how Docker could fulfill this role -- it is pretty much the opposite of the combined use case scenarios I have shown in the previous blog post, in which Nix can take over the role of a conventional package manager in supplying packages for the construction of an image, or even drive the complete construction process of images.<br />
<br />
<h2>Existing Nix integrations with process management</h2>
<br />
Although Nix does not do any process management, there are sister projects that can, such as:<br />
<br />
<ul>
<li><a href="https://sandervanderburg.blogspot.com/2011/01/nixos-purely-functional-linux.html"><strong>NixOS</strong></a> builds entire machine configurations from a single declarative deployment specification and uses the Nix package manager to deploy and isolate all static artifacts of a system. It will also automatically generate and deploy <a href="https://freedesktop.org/wiki/Software/systemd/">systemd</a> units for services defined in a NixOS configuration.</li>
<li><a href="https://github.com/LnL7/nix-darwin"><strong>nix-darwin</strong></a> can be used to specify a collection of services in a deployment specification and uses the Nix package manager to deploy all services and their corresponding <a href="https://www.launchd.info/">launchd</a> configuration files.</li>
</ul>
<br />
Although both projects do a great job (e.g. they both provide a big collection of deployable services), what I consider a disadvantage is that they are <strong>platform specific</strong> -- both solutions only work on a single operating system (Linux and macOS respectively) and with a single process management solution (systemd and launchd respectively).<br />
<br />
If you are using Nix in a different environment, such as a different operating system, a conventional (non-NixOS) Linux distribution, or a different process manager, then there is no off-the-shelf solution that will help you manage services for packages provided by Nix.<br />
<br />
<h2>Docker functionality</h2>
<br />
Docker could be considered a multi-functional solution for application management. I can categorize its functionality as follows:<br />
<br />
<ul>
<li><strong>Process management</strong>. The life-cycle of a container is bound to the life-cycle of a root process that needs to be started or stopped.</li>
<li><strong>Dependency management</strong>. To ensure that applications have all the dependencies that they need and that no dependency is missing, Docker uses <strong>images</strong> containing a complete root filesystem with all required files to run an application.</li>
<li><strong>Resource isolation</strong> is heavily used for a variety of different reasons:<br />
<ul>
<li>Foremost, to ensure that the root filesystem of the container does not conflict with the host system's root filesystem.</li>
<li>It is also used to prevent conflicts with other kinds of resources. For example, the isolated network interfaces allow services to bind to the same TCP ports that may also be in use by the host system or other containers.</li>
<li>It offers some degree of protection. For example, a malicious process will not be able to see or control a process belonging to the host system or a different container.</li>
</ul>
</li>
<li><strong>Resource restriction</strong> can be used to limit the amount of system resources that a process can consume, such as the amount of RAM.<br />
<br />
Resource restriction can be useful for a variety of reasons, for example, to prevent a service from eating up all the system's resources affecting the stability of the system as a whole.</li>
<li><strong>Integrations</strong> with the host system (e.g. volumes) and other services.</li>
</ul>
<br />
As described in the previous blog post, Docker uses a number of key concepts to implement the functionality shown above, such as layers, <a href="https://man7.org/linux/man-pages/man7/namespaces.7.html"><strong>namespaces</strong></a> and <a href="https://man7.org/linux/man-pages/man7/cgroups.7.html"><strong>cgroups</strong></a>.<br />
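<br />
For example, Docker's resource restriction facilities (implemented with cgroups) can be configured with command-line parameters -- the following command (the image name is just an arbitrary example) limits a container to 512 MiB of RAM and one CPU core:<br />
<br />
<pre>
$ docker run --memory=512m --cpus=1.0 nginx
</pre>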
<br />
<h2>Developing a Nix-based process management solution</h2>
<br />
For quite some time, <a href="https://sandervanderburg.blogspot.com/2019/11/a-nix-based-functional-organization-for.html">I have been investigating the process management domain</a> and worked on a <a href="https://github.com/svanderburg/nix-processmgmt">prototype solution</a> to provide a more generalized infrastructure that complements Nix with process management -- I came up with an experimental Nix-based process manager-agnostic framework that has the following objectives:<br />
<br />
<ul>
<li>It uses Nix to <strong>deploy</strong> all required <strong>packages</strong> and other <strong>static artifacts</strong> (such as configuration files) that a service needs.</li>
<li>It integrates with a <strong>variety</strong> of process managers on a variety of operating systems. So far, it can work with: sysvinit scripts, BSD rc scripts, supervisord, systemd, cygrunsrv and launchd.<br />
<br />
In addition to process managers, it can also <a href="https://sandervanderburg.blogspot.com/2020/05/deploying-heterogeneous-service.html">automatically convert a processes model to deployment specifications that Disnix can consume</a>.</li>
<li>It uses <strong>declarative</strong> specifications to define functions that construct managed processes and process instances.<br />
<br />
Processes can be declared in a process-manager specific and <a href="https://sandervanderburg.blogspot.com/2020/02/a-declarative-process-manager-agnostic.html">process-manager agnostic</a> way. The latter makes it possible to target all six supported process managers with the same declarative specification, albeit with a limited set of features.</li>
<li>It allows you to run <strong>multiple instances</strong> of processes, by introducing a convention to cope with potential resource conflicts between process instances -- instance properties and potential conflicts can be configured with function parameters and can be changed in such a way that they do not conflict.</li>
<li>It can facilitate <strong>unprivileged</strong> user deployments by using Nix's ability to perform unprivileged package deployments and introducing a convention that allows you to disable user switching.</li>
</ul>
<br />
To summarize how the solution works from a user point of view, we can write a process manager-agnostic constructor function as follows:<br />
<br />
<pre>
{createManagedProcess, tmpDir}:
{port, instanceSuffix ? "", instanceName ? "webapp${instanceSuffix}"}:

let
  webapp = import ../../webapp;
in
createManagedProcess {
  name = instanceName;
  description = "Simple web application";
  inherit instanceName;

  process = "${webapp}/bin/webapp";
  daemonArgs = [ "-D" ];

  environment = {
    PORT = port;
    PID_FILE = "${tmpDir}/${instanceName}.pid";
  };
  user = instanceName;
  credentials = {
    groups = {
      "${instanceName}" = {};
    };
    users = {
      "${instanceName}" = {
        group = instanceName;
        description = "Webapp";
      };
    };
  };

  overrides = {
    sysvinit = {
      runlevels = [ 3 4 5 ];
    };
  };
}
</pre>
<br />
The Nix expression above is a nested function that defines, in a process manager-agnostic way, a configuration for a web application process with an embedded web server that serves a static HTML page.<br />
<br />
<ul>
<li>The <strong>outer function header</strong> (first line) refers to parameters that are <strong>common</strong> to all process instances: <i>createManagedProcess</i> is a function that can construct process manager configurations and <i>tmpDir</i> refers to the directory in which temp files are stored (which is <i>/tmp</i> in conventional Linux installations).</li>
<li>The <strong>inner function header</strong> (second line) refers to <strong>instance parameters</strong> -- when it is desired to construct multiple instances of this process, we must make sure that we have configured these parameters in such a way that they do not conflict with other processes.<br />
<br />
For example, when we assign a unique TCP port and a unique instance name (a property used by the <a href="http://www.libslack.org/daemon/"><i>daemon</i></a> tool to create unique PID files), we can safely have multiple instances of this service co-existing on the same system. A sketch that constructs two co-existing instances is shown after this list.</li>
<li>In the body, we invoke the <i>createManagedProcess</i> function to generate configuration files for a process manager.</li>
<li>The <i>process</i> parameter specifies the executable that we need to run to start the process.</li>
<li>The <i>daemonArgs</i> parameter specifies command-line instructions passed to the process executable when the process should daemonize itself (the <i>-D</i> parameter instructs the webapp process to daemonize).</li>
<li>The <i>environment</i> parameter specifies all environment variables. Environment variables are used as a generic configuration facility for the service.</li>
<li>The <i>user</i> parameter specifies the name of the user the process should run as (each process instance has its own user and group with the same name as the instance).</li>
<li>The <i>credentials</i> parameter is used to automatically create the group and user that the process needs.</li>
<li>The <i>overrides</i> parameter makes it possible to override the parameters generated by the <i>createManagedProcess</i> function with process manager-specific overrides, to configure features that are not universally supported.<br />
<br />
In the example above, we use an override to configure the <a href="https://wiki.debian.org/RunLevel">runlevels</a> in which the service should run (runlevels 3-5 are typically used to boot a system that is network capable). Runlevels are a sysvinit-specific concept.</li>
</ul>
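<br />
Although it is not shown in the example above, the instance parameters make it possible to safely construct multiple process instances from the same constructor function. The following sketch (using the same conventions as the processes model shown below; the attribute names <i>webapp1</i> and <i>webapp2</i> are made up for illustration) constructs two co-existing instances by giving each a unique port and instance suffix:<br />
<br />
<pre>
webapp1 = rec {
  port = 5000;

  pkg = constructors.webapp {
    inherit port;
    instanceSuffix = "1";
  };
};

webapp2 = rec {
  port = 5001;

  pkg = constructors.webapp {
    inherit port;
    instanceSuffix = "2";
  };
};
</pre>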
<br />
In addition to defining constructor functions that allow us to construct zero or more process instances, we also need to define the actual process instances. These can be declared in a <strong>processes model</strong>:<br />
<br />
<pre>
{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager
}:

let
  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir;
    inherit forceDisableUserChange processManager;
  };
in
rec {
  webapp = rec {
    port = 5000;
    dnsName = "webapp.local";

    pkg = constructors.webapp {
      inherit port;
    };
  };

  nginxReverseProxy = rec {
    port = 8080;

    pkg = constructors.nginxReverseProxyHostBased {
      webapps = [ webapp ];
      inherit port;
    } {};
  };
}
</pre>
<br />
The above Nix expression defines two process instances and uses the following conventions:<br />
<br />
<ul>
<li>The first line is a function header in which the function parameters correspond to adjustable properties that apply to all process instances:
<ul>
<li><i>stateDir</i> allows you to globally override the base directory in which all state is stored (the default value is: <i>/var</i>).</li>
<li>We can also change the location of each individual state directory (<i>tmpDir</i>, <i>cacheDir</i>, <i>logDir</i>, <i>runtimeDir</i>, etc.) if desired.</li>
<li><i>forceDisableUserChange</i> can be enabled to prevent the process manager from changing user permissions and creating users and groups. This is useful to facilitate unprivileged user deployments in which the user typically has no rights to change user permissions.</li>
<li>The <i>processManager</i> parameter allows you to pick a process manager. All process configurations will be automatically generated for the selected process manager.<br />
<br />
For example, if we pick <i>systemd</i>, then all configurations get translated to systemd units, whereas picking <i>supervisord</i> causes all configurations to be translated to supervisord configuration files.</li>
</ul>
</li>
<li>To get access to constructor functions, we import a <strong>constructors expression</strong> that composes all constructor functions by calling them with their common parameters (not shown in this blog post).<br />
<br />
The constructors expression also contains a reference to the Nix expression that deploys the webapp service, shown in our previous example.</li>
<li>The processes model defines two processes: a <i>webapp</i> instance that listens to TCP port 5000 and Nginx that acts as a reverse proxy forwarding requests to <i>webapp</i> process instances based on the virtual host name.</li>
<li><i>webapp</i> is declared a <strong>dependency</strong> of the <i>nginxReverseProxy</i> service (by passing <i>webapp</i> as a parameter to the constructor function of Nginx). This causes <i>webapp</i> to be activated before the <i>nginxReverseProxy</i>.</li>
</ul>
<br />
To deploy all process instances with a process manager, we can invoke a variety of tools that are bundled with the experimental Nix process management framework.<br />
<br />
The processes model can be deployed as sysvinit scripts for an unprivileged user, with the following command:<br />
<br />
<pre>
$ nixproc-sysvinit-switch --state-dir /home/sander/var \
--force-disable-user-change processes.nix
</pre>
<br />
The above command automatically generates sysvinit scripts, changes the base directory of all state folders to a directory in the user's home directory (<i>/home/sander/var</i>), and disables user changing (and creation) so that an unprivileged user can run it.<br />
<br />
The following command uses systemd as a process manager with the default parameters, for production deployments:<br />
<br />
<pre>
$ nixproc-systemd-switch processes.nix
</pre>
<br />
The above command automatically generates systemd unit files and invokes systemd to deploy the processes.<br />
<br />
In addition to the examples shown above, the framework contains many more tools, such as: <i>nixproc-supervisord-switch</i>, <i>nixproc-launchd-switch</i>, <i>nixproc-bsdrc-switch</i>, <i>nixproc-cygrunsrv-switch</i>, and <i>nixproc-disnix-switch</i> that all work with the same processes model.<br />
<br />
<h2>Integrating Docker into the process management framework</h2>
<br />
Both Docker and the Nix-based process management framework are multi-functional solutions. After comparing the functionality of Docker and the process management framework, I realized that it is possible to integrate Docker into this framework as well, if I use it in an unconventional way, by disabling or substituting some of its conflicting features.<br />
<br />
<h3>Using a shared Nix store</h3>
<br />
As explained in the beginning of this blog post, Docker's primary means to provide dependencies is by using images that are self-contained root file systems containing all necessary files (e.g. packages, configuration files) to allow an application to work.<br />
<br />
In the previous blog post, I have also demonstrated that instead of using traditional <i>Dockerfile</i>s to construct images, we can use the Nix package manager as a replacement. A Docker image built by Nix is typically smaller than a conventional Docker image built from a base Linux distribution, because it only contains the runtime dependencies that an application actually needs.<br />
<br />
A major disadvantage of using Nix constructed Docker images is that they only consist of one layer -- as a result, there is no reuse between container instances running different services that use common libraries. To alleviate this problem, Nix can also build layered images, in which common dependencies are isolated in separate layers as much as possible.<br />
<br />
There is even a more optimal reuse strategy possible -- when running Docker on a machine that also has Nix installed, we do not need to put anything that is in the Nix store in a disk image. Instead, we can <strong>share</strong> the host system's Nix store between Docker containers.<br />
<br />
This may sound scary, but as I have explained in the previous blog post, paths in the Nix store are prefixed with SHA256 hash codes. When two Nix store paths with identical hash codes are built on two different machines, their build results should be (nearly) bit-identical. As a result, it is safe to share the same Nix store path between multiple machines and containers.<br />
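<br />
For example, a Nix store path typically has the following shape (the hash and version shown here are illustrative, not a real store path):<br />
<br />
<pre>
/nix/store/1ivhdvjjdwz3g0fxcdfyhbxsd42gbfnz-nginx-1.19.1
</pre>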
<br />
A hacky way to build a container image, without actually putting any of the Nix-built packages in the image, is to use the following expression:<br />
<br />
<pre>
with import <nixpkgs> {};

let
  cmd = [ "${nginx}/bin/nginx" "-g" "daemon off;" "-c" ./nginx.conf ];
in
dockerTools.buildImage {
  name = "nginxexp";
  tag = "test";

  runAsRoot = ''
    ${dockerTools.shadowSetup}
    groupadd -r nogroup
    useradd -r nobody -g nogroup -d /dev/null
    mkdir -p /var/log/nginx /var/cache/nginx /var/www
    cp ${./index.html} /var/www/index.html
  '';

  config = {
    Cmd = map (arg: builtins.unsafeDiscardStringContext arg) cmd;
    ExposedPorts = {
      "80/tcp" = {};
    };
  };
}
</pre>
<br />
The above expression is quite similar to the Nix-based Docker image example shown in the previous blog post, that deploys Nginx serving a static HTML page.<br />
<br />
The only difference is how I configure the start command (the <i>Cmd</i> parameter). In the Nix expression language, <a href="https://shealevy.com/blog/2018/08/05/understanding-nixs-string-context">strings have <strong>context</strong></a> -- if a string with context (any string containing a value that evaluates to a Nix store path) is passed to a build function, then the corresponding Nix store paths automatically become dependencies of the package that the build function builds.<br />
<br />
By using the unsafe <i>builtins.unsafeDiscardStringContext</i> function, I can discard the context of strings. As a result, the Nix packages that the image requires are still built. However, because their context is discarded, they are no longer considered dependencies of the Docker image. As a consequence, they will not be integrated into the image that the <i>dockerTools.buildImage</i> function creates.<br />
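<br />
To make this more tangible, the following minimal sketch (not part of the original example; the variable names are made up) shows a string with context and its context-free counterpart:<br />
<br />
<pre>
with import <nixpkgs> {};

let
  # This string has context: passing it to a build function
  # makes the nginx package a dependency of the build result
  cmd = "${nginx}/bin/nginx";

  # Same text, but without context: nginx still gets built,
  # yet it is no longer registered as a dependency
  cmdNoContext = builtins.unsafeDiscardStringContext cmd;
in
cmdNoContext
</pre>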
<br />
(As a sidenote: there are still two Nix store paths that end up in the image, namely <i>bash</i> and <i>glibc</i>, which is a runtime dependency of <i>bash</i>. This is caused by the fact that the internals of the <i>dockerTools.buildImage</i> function make a reference to <i>bash</i> without discarding its context. In theory, it is possible to eliminate this dependency as well).<br />
<br />
To run the container and make sure that the required Nix store paths are available, I can mount the host system's Nix store as a shared volume:<br />
<br />
<pre>
$ docker run -p 8080:80 -v /nix/store:/nix/store -it nginxexp:latest
</pre>
<br />
By mounting the host system's Nix store (with the <i>-v</i> parameter), Nginx should still behave as expected -- it is not provided by the image, but referenced from the shared Nix store.<br />
<br />
(As a sidenote: mounting the host system's Nix store for sharing is not a new idea. It has already been intensively used by the <a href="https://sandervanderburg.blogspot.com/2011/02/using-nixos-for-declarative-deployment.html">NixOS test driver</a> for many years to rapidly create QEMU virtual machines for system integration tests).<br />
<br />
<h3>Using the host system's network</h3>
<br />
As explained in the previous blog post, every Docker container by default runs in its own private network namespace making it possible for services to bind to any port without conflicting with the services on the host system or services provided by any other container.<br />
<br />
The Nix process management framework does not work with private networks, because they are not a generalizable concept (i.e. namespaces are a Linux-only feature). Aside from Docker, the only other process manager supported by the framework that can work with namespaces is systemd.<br />
<br />
To prevent ports and other dynamic resources from conflicting with each other, the process management framework makes it possible to configure them through instance function parameters. If the instance parameters have unique values, they will not conflict with other process instances (based on the assumption that the packager has identified all possible conflicts that a process might have).<br />
<br />
Because we already have a framework that prevents conflicts, we can also instruct Docker to use the host system's network with the <i>--network host</i> parameter:<br />
<br />
<pre>
$ docker run -v /nix/store:/nix/store --network host -it nginxexp:latest
</pre>
<br />
The only thing the framework cannot offer you is protection -- when every service runs in its own private network namespace, a malicious service cannot connect to ports used by other containers or the host system. When the host system's network is shared, the framework cannot protect you against this.<br />
<br />
<h3>Mapping a base directory for storing state</h3>
<br />
Services that run in containers are not always stateless -- they may rely on data that should be persistently stored, such as databases. <a href="https://developers.redhat.com/blog/2016/02/24/10-things-to-avoid-in-docker-containers/">The Docker recommendation to handle persistent state</a> is not to store it in a container's writable layer, but on a shared volume on the host system.<br />
<br />
Data stored outside the container makes it possible to reliably upgrade a container -- when it is desired to install a newer version of an application, the container can be discarded and recreated from a new image.<br />
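<br />
For example, an upgrade cycle could look as follows (a hypothetical sketch: the container name and the newer image tag are made up) -- the state in the shared volumes survives, while the container itself is discarded and recreated from a newer image:<br />
<br />
<pre>
$ docker stop nginx-container
$ docker rm nginx-container
$ docker run --name nginx-container -v /nix/store:/nix/store \
  -v /var:/var --network host -d nginxexp:test2
</pre>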
<br />
For the Nix process management framework, integration with a state directory outside the container is also useful. With an extra shared volume, we can mount the host system's state directory:<br />
<br />
<pre>
$ docker run -v /nix/store:/nix/store \
-v /var:/var --network host -it nginxexp:latest
</pre>
<br />
<h3>Orchestrating containers</h3>
<br />
The last piece in the puzzle is to orchestrate the containers: we must create or discard them, start or stop them, and perform all required steps in the right order.<br />
<br />
Moreover, to prevent the Nix packages that a container needs from being garbage collected, we need to make sure that they are a dependency of a package that is registered as in use.<br />
<br />
I came up with my own convention to implement the container deployment process. When building the processes model for the <i>docker</i> process manager, the following files are generated that help me orchestrate the deployment process:<br />
<br />
<pre>
01-webapp-docker-priority
02-nginx-docker-priority
nginx-docker-cmd
nginx-docker-createparams
nginx-docker-settings
webapp-docker-cmd
webapp-docker-createparams
webapp-docker-settings
</pre>
<br />
In the above list, we have the following kinds of files:<br />
<br />
<ul>
<li>The files that have a <i>-docker-settings</i> suffix contain general properties of the container, such as the image that needs to be used as a template.</li>
<li>The files that have a <i>-docker-createparams</i> suffix contain the command line parameters that are propagated to <i>docker create</i> to create the container. If a container with the same name already exists, the container creation is skipped and the existing instance is used instead.</li>
<li>To prevent the Nix packages that a Docker container needs from being garbage collected, the generator creates a file with a <i>-docker-cmd</i> suffix containing the <i>Cmd</i> instruction including the full Nix store paths of the packages that a container needs.<br />
<br />
Because the strings' contexts are not discarded in the generation process, the packages become a dependency of the configuration file. As long as this configuration file is deployed, the packages will not get garbage collected.</li>
<li>To ensure that the containers are activated in the right order, we have two files that are prefixed with two numeric digits and have a <i>-docker-priority</i> suffix. The numeric digits determine in which order the containers should be activated -- in the above example, the webapp process gets activated before Nginx (which acts as a reverse proxy).</li>
</ul>
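<br />
For illustration, a generated <i>nginx-docker-createparams</i> file could roughly look as follows (these contents are hypothetical, but they follow the same one-parameter-per-line convention as the <i>nginxexp-docker-createparams</i> file shown later in this blog post):<br />
<br />
<pre>
-v
/nix/store:/nix/store
--network
host
--name
nixproc-nginx
</pre>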
<br />
With the following command, we can automatically generate the configuration files shown above for all processes in the processes model, and use them to automatically create and start Docker containers for all process instances:<br />
<br />
<pre style="overflow: auto;">
$ nixproc-docker-switch processes.nix
55d833e07428: Loading layer [==================================================>] 46.61MB/46.61MB
Loaded image: webapp:latest
f020f5ecdc6595f029cf46db9cb6f05024892ce6d9b1bbdf9eac78f8a178efd7
nixproc-webapp
95b595c533d4: Loading layer [==================================================>] 46.61MB/46.61MB
Loaded image: nginx:latest
b195cd1fba24d4ec8542c3576b4e3a3889682600f0accc3ba2a195a44bf41846
nixproc-nginx
</pre>
<br />
The result is two running Docker containers that correspond to the process instances shown in the processes model:<br />
<br />
<pre style="overflow: auto;">
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b195cd1fba24 nginx:latest "/nix/store/j3v4fz9h…" 15 seconds ago Up 14 seconds nixproc-nginx
f020f5ecdc65 webapp:latest "/nix/store/b6pz847g…" 16 seconds ago Up 15 seconds nixproc-webapp
</pre>
<br />
and we should be able to access the example HTML page by opening the following URL in a web browser: <i>http://localhost:8080</i>.<br />
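<br />
For example, we can also do a quick smoke test from the command-line:<br />
<br />
<pre>
$ curl http://localhost:8080
</pre>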
<br />
<h2>Deploying Docker containers in a heterogeneous and/or distributed environment</h2>
<br />
As explained in my previous blog posts about the experimental Nix process management framework, the processes model is a subset of a <a href="https://sandervanderburg.blogspot.com/2011/02/disnix-toolset-for-distributed.html">Disnix</a> <strong>services</strong> model. When it is desired to deploy processes to a network of machines or combine processes with other kinds of services, we can easily turn a processes model into a services model.<br />
<br />
For example, I can change the processes model shown earlier into a services model that deploys Docker containers:<br />
<br />
<pre>
{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
}:

let
  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir;
    inherit forceDisableUserChange;
    processManager = "docker";
  };
in
rec {
  webapp = rec {
    name = "webapp";
    port = 5000;
    dnsName = "webapp.local";

    pkg = constructors.webapp {
      inherit port;
    };

    type = "docker-container";
  };

  nginxReverseProxy = rec {
    name = "nginxReverseProxy";
    port = 8080;

    pkg = constructors.nginxReverseProxyHostBased {
      webapps = [ webapp ];
      inherit port;
    } {};

    type = "docker-container";
  };
}
</pre>
<br />
In the above example, I have added a <i>name</i> attribute to each process (a required property for Disnix service models) and a <i>type</i> attribute that refers to <i>docker-container</i>.<br />
<br />
In Disnix, a service could take any form. A plugin system (named <a href="https://sandervanderburg.blogspot.com/2015/07/deploying-state-with-disnix.html">Dysnomia</a>) is responsible for managing the life-cycle of a service, such as activating or deactivating it. The <i>type</i> attribute is used to tell Disnix that we should use the <i>docker-container</i> Dysnomia module. This module will automatically create and start the container on activation, and stop and discard the container on deactivation.<br />
<br />
To deploy the above services to a network of machines, we require an <strong>infrastructure model</strong> (that captures the available machines and their relevant deployment properties):<br />
<br />
<pre>
{
test1.properties.hostname = "test1";
}
</pre>
<br />
The above infrastructure model contains only one target machine: <i>test1</i> with a hostname that is identical to the machine name.<br />
<br />
We also require a <strong>distribution model</strong> that maps services in the services model to machines in the infrastructure model:<br />
<br />
<pre>
{infrastructure}:
{
webapp = [ infrastructure.test1 ];
nginxReverseProxy = [ infrastructure.test1 ];
}
</pre>
<br />
In the above distribution model, we map all the processes in the services model to the <i>test1</i> target machine in the infrastructure model.<br />
<br />
With the following command, we can deploy our Docker containers to the remote <i>test1</i> target machine:<br />
<br />
<pre>
$ disnix-env -s services.nix -i infrastructure.nix -d distribution.nix
</pre>
<br />
When the above command succeeds, the <i>test1</i> target machine provides running <i>webapp</i> and <i>nginxReverseProxy</i> containers.<br />
<br />
(As a sidenote: to make Docker container deployments work with Disnix, the Docker service already needs to be predeployed to the target machines in the infrastructure model, or the Docker daemon needs to be deployed as a <a href="https://sandervanderburg.blogspot.com/2020/04/deploying-container-and-application.html">container provider</a>).<br />
<br />
<h2>Deploying conventional Docker containers with Disnix</h2>
<br />
The nice thing about the <i>docker-container</i> Dysnomia module is that it is generic enough to also work with conventional Docker containers (that work with images, not a shared Nix store).<br />
<br />
For example, we can deploy Nginx as a regular container built with the <i>dockerTools.buildImage</i> function:<br />
<br />
<pre style="font-size: 90%; overflow: auto;">
{dockerTools, stdenv, nginx}:

let
  dockerImage = dockerTools.buildImage {
    name = "nginxexp";
    tag = "test";
    contents = nginx;

    runAsRoot = ''
      ${dockerTools.shadowSetup}
      groupadd -r nogroup
      useradd -r nobody -g nogroup -d /dev/null
      mkdir -p /var/log/nginx /var/cache/nginx /var/www
      cp ${./index.html} /var/www/index.html
    '';

    config = {
      Cmd = [ "${nginx}/bin/nginx" "-g" "daemon off;" "-c" ./nginx.conf ];
      ExposedPorts = {
        "80/tcp" = {};
      };
    };
  };
in
stdenv.mkDerivation {
  name = "nginxexp";

  buildCommand = ''
    mkdir -p $out

    cat > $out/nginxexp-docker-settings <<EOF
    dockerImage=${dockerImage}
    dockerImageTag=nginxexp:test
    EOF

    cat > $out/nginxexp-docker-createparams <<EOF
    -p
    8080:80
    EOF
  '';
}
</pre>
<br />
In the above example, instead of using the process manager-agnostic <i>createManagedProcess</i> function, I directly construct a Docker-based Nginx image (bound to the <i>dockerImage</i> variable) and the container configuration files (in the <i>buildCommand</i> parameter) to make the container deployments work with the <i>docker-container</i> Dysnomia module.<br />
<br />
It is also possible to deploy containers from images that are constructed with <i>Dockerfile</i>s. After we have built an image in the traditional way, we can export it from Docker with the following command:<br />
<br />
<pre>
$ docker save nginx-debian -o nginx-debian.tar.gz
</pre>
<br />
and then we can use the following Nix expression to deploy a container using our exported image:<br />
<br />
<pre>
{dockerTools, stdenv, nginx}:

stdenv.mkDerivation {
  name = "nginxexp";

  buildCommand = ''
    mkdir -p $out

    cat > $out/nginxexp-docker-settings <<EOF
    dockerImage=${./nginx-debian.tar.gz}
    dockerImageTag=nginxexp:test
    EOF

    cat > $out/nginxexp-docker-createparams <<EOF
    -p
    8080:80
    EOF
  '';
}
</pre>
<br />
In the above expression, the <i>dockerImage</i> property refers to our exported image.<br />
<br />
Although Disnix is flexible enough to also orchestrate Docker containers (thanks to its generalized plugin architecture), I did not develop the <i>docker-container</i> Dysnomia module to make Disnix compete with existing container orchestration solutions, such as <a href="https://kubernetes.io/">Kubernetes</a> or <a href="https://docs.docker.com/engine/swarm/">Docker Swarm</a>.<br />
<br />
Disnix is a heterogeneous deployment tool that can be used to integrate units that have all kinds of shapes and forms on all kinds of operating systems -- having a <i>docker-container</i> module makes it possible to mix Docker containers with other service types that Disnix and Dysnomia support.<br />
<br />
<h2>Discussion</h2>
<br />
In this blog post, I have demonstrated that we can integrate Docker as a process management backend option into the experimental Nix process management framework, by substituting some of its conflicting features.<br />
<br />
Moreover, because a Disnix service model is a superset of a processes model, we can also use Disnix as a simple Docker container orchestrator and integrate Docker containers with other kinds of services.<br />
<br />
Compared to Docker, the Nix process management framework supports a number of features that Docker does not:<br />
<br />
<ul>
<li>Docker is heavily developed around Linux-specific concepts, such as namespaces and cgroups. As a result, it can only be used to deploy software built for Linux.<br />
<br />
The Nix process management framework should work on any operating system that is supported by the Nix package manager (e.g. Nix also has first class support for macOS, and can also be used on other UNIX-like operating systems such as FreeBSD). The same also applies to Disnix.</li>
<li>The Nix process management framework can work with <i>sysvinit</i>, <i>BSD rc</i> and Disnix process scripts, which do not require any external service to manage a process' life-cycle. This is convenient for local unprivileged user deployments. To deploy Docker containers, you need to have the Docker daemon installed first.</li>
<li>Docker has an experimental rootless deployment mode, but in the Nix process management framework, facilitating unprivileged user deployments is a first class concept.</li>
</ul>
<br />
On the other hand, the Nix process management framework does not take over all responsibilities of Docker:<br />
<br />
<ul>
<li>Docker heavily relies on namespaces to prevent resource conflicts, such as overlapping TCP ports and global state directories. The Nix process management framework solves conflicts by avoiding them (i.e. configuring properties in such a way that they are unique). The conflict avoidance approach works as long as a service is well-specified. Unfortunately, the tool cannot give you a hard guarantee that all conflicts are prevented.</li>
<li>Docker also provides some degree of protection by using namespaces and cgroups. The Nix process management framework does not support this out of the box, because these concepts are not generalizable over all the process management backends it supports. (As a sidenote: it is still possible to use these concepts by defining process manager-specific overrides).</li>
</ul>
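<br />
For example, a process manager-specific override, in the same spirit as the sysvinit <i>runlevels</i> override shown earlier, could attach a cgroups-based memory limit to a generated systemd unit. The exact attribute structure below is a hypothetical sketch, not a documented feature of the framework:<br />
<br />
<pre>
overrides = {
  systemd = {
    Service = {
      # MemoryMax is a real systemd service option; whether it can be
      # passed through in this exact form is an assumption
      MemoryMax = "512M";
    };
  };
};
</pre>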
<br />
From a functionality perspective, <a href="https://docs.docker.com/compose"><i>docker-compose</i></a> comes close to the features that the experimental Nix process management framework supports. <i>docker-compose</i> allows you to declaratively define container instances and their dependencies, and automatically deploy them.<br />
<br />
However, as its name implies, <i>docker-compose</i> is specifically designed for deploying Docker containers, whereas the Nix process management framework is more general -- it should work with all kinds of process managers, uses Nix as the primary means to provide dependencies, uses the Nix expression language for configuration, and should work on a variety of operating systems.<br />
<br />
I am not the only one to observe that Docker (and containers in general) is a multi-functional solution. For example, <a href="https://iximiuz.com/en/posts/you-dont-need-an-image-to-run-a-container">this blog post</a> also demonstrates that containers can work without images.<br />
<br />
<h2>Availability</h2>
<br />
The Docker backend has been integrated into the <a href="https://github.com/svanderburg/nix-processmgmt">latest development version</a> of the Nix process management framework.<br />
<br />
To use the <i>docker-container</i> Dysnomia module (so that Disnix can deploy Docker containers), you need to install the latest development version of Dysnomia.<br />
Sander van der Burghttp://www.blogger.com/profile/12718166966821611609noreply@blogger.com0