From my perspective, processes are an interesting operating systems concept, in particular in UNIX/POSIX-like operating systems, such as Linux.
The IEEE Std 1003.1 POSIX standard defines a "live process" as:
An address space with one or more threads executing within that address space, and the required system resources for those threads.
Within the boundaries of many (relatively simple) applications, process creation and management is typically not required. Nonetheless, decomposing a system into sub processes can be quite useful for a variety of reasons, such as:
- Improving a system's responsiveness by running slow/long-running tasks in the background, concurrently with other tasks. This is particularly useful to retain the ability to respond to user events, such as mouse clicks or keyboard input while data is being processed or to handle multiple connecting clients at the same time to a server application.
- Increased protection. If an incorrectly implemented or malicious task crashes during execution, it neither tears down the entire system nor affects the state of any other processes.
- More security. Child processes can be run under more restrictive user privileges, making it more difficult for the system to do any harm, such accessing privacy-sensitive filesystem areas.
- Portability. In a child process, we can invoke an external executable implemented in a different programming language.
- Scalability and performance. The execution of a collection of tasks can be parallelized by means of processes and their executions can be divided over multiple CPU cores by the operating system, potentially increasing the execution speed of a program.
Although multi-process applications may provide a number compelling benefits, programming such applications using the C programming language and the POSIX API is IMO not always straight forward -- I have found myself frequently repeating numerous patterns over and over again.
To alleviate the burden of repetition, I have identified a number of patterns, derived abstractions from them and constructed a library package, named libprocreact, providing these abstractions. The APIs that libprocreact provides are loosely inspired by reactive programming.
Programming patterns
When programming multi-process applications, there are many housekeeping tasks that need to be performed in order to properly organize them. Below, I have described a number of recurring patterns I typically implement:
Forking
The first and most prominent house keeping task is process creation. The key ingredient in creating processes is the fork() system call, as shown in the code example below:
#include <stdio.h> #include <unistd.h> pid_t pid = fork(); if(pid == -1) printf("The child process cannot be forked!\n"); else if(pid == 0) { printf("Code executed by the child process\n"); printf("It runs in the background!\n"); _exit(0); } else { printf("Code executed by the parent process!\n"); printf("The pid of the child process is: %d\n", pid); }
Forking is a relatively simple concept -- after successfully invoking the fork() function call (i.e. the return value is not -1), a child process gets created that appears as a nearly identical clone of the parent process. For example, their memory contents and file descriptors are identical.
Furthermore, a forked child process will be executed immediately in parallel to the parent process. Since a parent and child process are almost identical, we can use the return value of fork() to make a distinction between them -- in the parent, the fork() function call returns the PID of the child process so that it can monitor its status and interact with it. In the child process, fork() returns 0.
Although creating a clone of a parent process may sound very expensive, many POSIX-compliant operating systems have optimized this process by using a Copy-On-Write (COW) memory model -- instead of copying a parent process' memory, the memory between the parent and child processes is shared. The operating system maintains a table of shared and private memory pages for each process. When a process attempts to write to a shared memory page, then the corresponding memory page is copied and marked as private to the process.
Executing tasks
After forking a child process, we typically want it to execute a task so that we can use some of the process' positive traits in our advantage. In the if(pid == 0) { ... } block (shown in the previous example), we can put the code that the child process must execute.
For example, we can execute a long running task without blocking the parent process, such as doing an expensive linear search over an array of strings:
#include <string.h> char *strings[] = { "foo", "bar", "baz", ..., NULL }; char *search = "baz"; ... else if(pid == 0) { unsigned int i; for(i = 0; i < strlen(strings); i++) { if(strcmp(strings[i], search) == 0) { printf("%s has been found!\n", search); _exit(0); } } printf("%s cannot be found!\n", search); _exit(1); }
(As may be observed in the example above, the string array is allocated by the parent process, but since the child process manifests itself as a duplicate on spawning, it has a reference to a "logical copy" as well).
We can also change the user permissions of the child process (for example, to restrict a task from having super-user permissions that might do potential harm):
#include <sys/types.h> #include <unistd.h> ... else if(pid == 0) { if(setgid(100) == 0 && setuid(1000) == 0) { /* Execute some code with restrictive user permissions */ ... } else { printf("Cannot change user permissions!\n"); _exit(1); } }
or invoking external (pre-built) executables, such as the cat command to stream the contents of a file to the standard output:
... else if(pid == 0) { char *const args[] = { "cat", "bigtextfile.txt", NULL }; execvp(args[0], args); _exit(1); }
Waiting
In addition to forking and carrying out tasks by child processes, the parent process must also take notice of a child process' status at some point. This is actually an obligation for certain events, for example, when a child process terminates -- a terminated child process remains a zombie until the parent process takes notice of it.
Taking notice of a process' status can be done by invoking a wait function, such as:
#include <sys/types.h> #include <wait.h> pid_t pid; int wstatus; /* Fork and execute */ pid_t ret_pid = waitpid(pid, &status, 0);
The above function call specifically waits for a process with a given PID to change state, and captures its wait status in the wstatus variable.
As a sidenote: besides waiting for a specific child process' state to change, it also possible to wait for any process in a process group to terminate (e.g. by invoking wait()). Furthermore, the wait function invocation blocks the parent process' execution by default, until a child process' state changes. We can also pass the WNOHANG flag to waitpid() to prevent it from blocking.
After a wait function invocation completes, we must interpret the return value and wait status:
if(ret_pid == -1) printf("Cannot obtain wait status of PID: %d\n", pid); else if(!WIFEXITED(wstatus)) printf("The process terminated abnormally!\n"); else if(WEXITSTATUS(wstatus) != 0) printf("The process execution failed!\n"); else printf("The process has completed its tasks successfully!\n");
In the above code fragment, we check for the following properties:
- Whether the wait status could be obtained. Sometimes this may not be possible, for example, if a process with a given PID does not exists or when it is beyond the parent process' control.
- Whether a process has been terminated abnormally or not. For example, abnormal termination happens when a process runs into a segmentation fault. In such cases, it may happen that a process still returns a zero exit status, incorrectly indicating that everything has succeeded.
- Whether a process has succeeded its tasks or not. By convention, a zero exit status indicates success, while any non-zero exit status indicates failure.
Output buffering
Sometimes, it may also be desired to propagate data back to the parent, for example, after the completion of a data transformation task. Since processes operate in their own private address space, we can no longer rely on shared memory, but we must use some means to transfer the data.
One of the possible means is using a pipe, as shown in the following code fragment:
int pipefd[2]; if(pipe(pipefd) == 0) { pid_t pid = fork(); if(pid == -1) fprintf(stderr, "Cannot fork process!\n"); else if(pid == 0) { char *const args[] = { "sort", "words.txt", NULL }; close(pipefd[0]); /* Close read-end of pipe */ dup2(pipefd[1], 1); /* Attach write-end to the stdout */ execvp(args[0], args); _exit(0); } else { close(pipefd[1]); /* Close write-end of pipe */ /* Read from pipefd[0] */ close(pipefd[0]); } } else fprintf(stderr, "Cannot construct a pipe!\n");
Before forking a child process, we construct a pipe consisting of two file descriptors -- a read and write end. In the child process, we close the read-end of the pipe (since it is not needed), and we write data to the write-end. In the parent, we read from the read-end and we discard the unneeded write-end.
When retrieving data from a pipe, we may want to capture its output in a data structure (such as a string, string array or struct). Capturing and transforming data to a specific structure is often not very straight forward, for example:
#define BUFFER_SIZE 1024 ssize_t bytes_read; char *captured_string = NULL; unsigned int captured_string_size = 0; while((bytes_read = read(pipefd[0], buffer, BUFFER_SIZE)) > 0) { char buffer[BUFFER_SIZE]; captured_string = (char*)realloc(captured_string, captured_string_size + bytes_read); memcpy(captured_string + captured_string_size, buffer, bytes_read); captured_string_size += bytes_read; } /* Add NUL-termination */ captured_string = (char*)realloc(captured_string, captured_string_size + 1)); captured_string[captured_string_size] = '\0';
The purpose of above code fragment is to read data from a pipe and constructing a NUL-terminated string. It repeatedly reads chunks of data (of one kilobyte each) from the read-end of the pipe, dynamically extends the size of the buffer collecting the output, appends each chunk to the buffer, and finally appends a NUL-termination to the result.
As may be noticed, there are many concerns that we have to take care of and the resulting code is not trivial at all.
Orchestrating collections of processes
In addition to implementing the above patterns for a single child process running in the background, we may also want to apply them to collections of processes running concurrently.
Orchestrating collections of processes introduce many additional challenges beyond those described in the previous sections. For example, in order to read from the processes' pipes without blocking their executions (which could happen if any of their buffers gets full), we have to multiplex the house keeping operations in such a way that they read from each pipe breadth-first.
Multiplexing any of the previously shown patterns makes developing multi-process applications even more difficult.
A functional programming discipline
Because housekeeping tasks require us to supply many lines of boilerplate code, multi-process applications tend to become quite messy without any proper organization. For example, if I would take the sorting example (shown earlier) and extend it to capture the output of the process invocation into a string, I may end up writing:
#define BUFFER_SIZE 1024 int pipefd[2]; if(pipe(pipefd) == 0) { pid_t pid = fork(); if(pid == -1) fprintf(stderr, "Cannot fork process!\n"); else if(pid == 0) { char *const args[] = { "sort", "words.txt", NULL }; close(pipefd[0]); /* Close read-end of pipe */ dup2(pipefd[1], 1); /* Attach write-end to the stdout */ execvp(args[0], args); _exit(0); } else { ssize_t bytes_read; char *captured_string = NULL; unsigned int captured_string_size = 0; close(pipefd[1]); /* Close write-end of pipe */ while((bytes_read = read(pipefd[0], buffer, BUFFER_SIZE)) > 0) { char buffer[BUFFER_SIZE]; captured_string = (char*)realloc(captured_string, captured_string_size + bytes_read); memcpy(captured_string + captured_string_size, buffer, bytes_read); captured_string_size += bytes_read; } /* Add NUL-termination */ captured_string = (char*)realloc(captured_string, captured_string_size + 1)); captured_string[captured_string_size] = '\0'; close(pipefd[0]); /* Close read-end of pipe */ } } else fprintf(stderr, "Cannot construct a pipe!\n");
I do not expect anyone the study the above code fragment in detail, but just by looking at its structure and length, you could clearly see that this is not very appealing way to construct applications.
In contrast, the same task can be implemented in only one line of bash shell code:
captured_string=$(sort words.txt)
When I look at the previous code fragment from an abstract point of view, then it encapsulates an implementation of a specific concern (i.e. its primary task, such as sorting an array), and a number of general house-keeping concerns, such as constructing a pipe, waiting, output buffering, and closing file descriptors.
One possible way to structure the code in a better way is by function decomposition -- we can separate the specific and general concerns into functions and we can put the general house keeping aspects into a (reusable) library.
Consider the following synchronous function definition and invocation example that checks whether a given string exists in array of strings by doing a linear search:
#include <stdio.h> #include <string.h> int array_contains_string(const char **strings, const char *search) { unsigned int i; for(i = 0; i < strlen(strings); i++) { if(strcmp(strings[i], search) == 0) return TRUE; } return FALSE; } int main(int argc, char *argv[]) { char *strings[] = { "foo", "bar", "baz", NULL }; char *search = "baz"; int result = array_contains_string(strings, search); if(result) printf("The array does contain the string: %s\n", search); else printf("The array does not contain the string: %s\n", search); return 0; }
Since searching (in particular linear searching) may take some time, we may want to change the function it into an asynchronous function running its primary task (the searching) in a child process. I can concisely express this primary concern in one single function:
#include <stdio.h> #include <unistd.h> pid_t array_contains_string_async(const char **strings, const char *search) { pid_t pid = fork(); if(pid == 0) { unsigned int i; for(i = 0; i < strlen(strings); i++) { if(strcmp(strings[i], search) == 0) _exit(0); } _exit(1); } return pid; }
As may be noticed, the above function executes the same linear search procedure shown in the previous code fragment, with the following differences:
- The function forks a child process and carries out the search operation in the child process.
- Instead of returning a boolean value, it exits the child process with an exit status. By convention, a zero exit status indicates success while a non-zero exit status indicates failure.
We can capture the wait and exit status checking in a general utility function (procreact_wait_for_boolean()) that interprets the exit status of a child process as a boolean value (meaning that when a process exits with exit status 0 it returns TRUE and for any non-zero exit status, it returns FALSE):
#include <procreact_pid.h> int main(int argc, char *argv[]) { char *strings[] = { "foo", "bar", "baz", NULL }; char *search = "baz"; ProcReact_Status status; int result = procreact_wait_for_boolean(array_contains_string_async(strings, search), &status); if(status == PROCREACT_STATUS_OK) { if(result) printf("The array does contain the string: %s\n", search); else printf("The array does not contain the string: %s\n", search); } else fprintf(stderr, "The process terminated abnormally!\n"); return 0; }
As may be observed, by separating concerns and putting common operations into a library, we can accomplish the same result as the synchronous code fragment example, with relatively little overhead of boilerplate code.
Managing arbitrary output
The previously shown abstractions work well for functions returning a byte, boolean, or void-functions. However, it may also be desirable to implement asynchronous functions returning more complex data, such as strings or arrays of strings, for example:
#include <stdlib.h> #include <string.h> char *say_hello_to(const char *name) { char *result = (char*)malloc(strlen(name) + 7 + 1); sprintf(result, "Hello %s!", name); return result; }
The above function composes a string that greets a person with a given name. As explained earlier, implementing an asynchronous variant of the above function requires extra facilities to propagate the result back to the parent process, such as constructing a pipe, forcing us to do more housekeeping work.
In the previous examples, we were able to separate a task's primary concern into a function returning a PID and a function waiting and interpreting the exit status by using the PID reference. To manage complex data, we need to memorize more than just the PID -- we also need the pipe's file descriptors, store the buffered data, and the end result.
In some ways, a PID reference resembles another software abstraction -- a future in reactive programming or a promise in JavaScript. A future/promise is an object encapsulating a return value that will be provided at some point in the future.
We can encapsulate the entire housekeeping procedure for transferring and returning complex data in a ProcReact_Future struct:
#include <procreact_future.h> ProcReact_Future say_hello_to_async(const char *name) { ProcReact_Future future = procreact_initialize_future(procreact_create_string_type()); if(future.pid == 0) { dprintf(future.fd, "Hello %s!", name); _exit(0); } return future; }
The above code fragment looks similar to the synchronous function definition shown in the previous example, with the following differences:
- By constructing a ProcReact_Future struct, we no longer have to fork a child process and construct a pipe ourselves.
- The string composition step is carried out by the forked child process.
- Instead of returning a heap-allocated string, we write the resulting string to the write-end of the pipe provided by the future struct and we terminate the process by invoking the exit function call.
The procreact_initialize_future() function takes a parameter: a type, that is responsible for reading the output from the pipe and converting it into a representation of choice -- in this case a NUL-terminated string.
We can collect the return value of the function in the parent process by invoking the procreact_future_get() function:
ProcReact_Status status; ProcReact_Future future = say_hello_to_async(name); char *result = procreact_future_get(&future, &status); if(status == PROCREACT_STATUS_OK && result != NULL) printf("%s\n", result); else fprintf(stderr, "Some error occured!\n");
The procreact_future_get() function (that looks similar to a Future's .get() method or Promise's .then() method) takes care of reading from the pipe, buffering the output, converting the output to a string, waiting for the child process to terminate and closing the obsolete file descriptors.
Furthermore, analogous to a Future or Promise, when the retrieval function gets invoked for a second time, it will return its cached value instead of reading from the pipe again.
Orchestrating collections of processes
With concerns well separated, orchestration of collections of processes also becomes easier. For example, we may want to execute multiple invocations to the following function in parallel:
ProcReact_Future return_count_async(unsigned int count) { ProcReact_Future future = procreact_initialize_future(procreact_create_string_type()); if(future.pid == 0) { dprintf(future.fd, "%u", count); _exit(0); } return future; }
The purpose of the function shown above is to simply return a string presentation of a given numeric counter value.
As with the abstraction facilities shown previously (such as ProcReact_Future), we can also create similar abstractions for orchestrating collections of processes including processes whose output need to be captured:
ProcReact_FutureIterator iterator = procreact_initialize_future_iterator(has_next_count, next_count_process, complete_count_process, &data);
The above function invocation: procreact_initialize_future_iterator() configures an ProcReact_FutureIterator struct. It takes the following parameters:
- A pointer to a function that indicates whether there is a next element in the collection.
- A pointer to a function that invokes the next process in the collection.
- A pointer to a function that gets invoked when a process completes.
- A void-pointer referring to an arbitrary data structure that gets passed to all functions above.
If I would like to invoke this function 5 times to say, count from 1 to 5, I can encapsulate the properties of this iteration process in the following data structure:
typedef struct { unsigned int index; unsigned int amount; int success; char **results; unsigned int results_length; } IteratorData;
and compose the following instance from it:
IteratorData data = { 0, 5, TRUE, NULL, 0 };
The above struct maintains an index value indicating which element it currently processing, the amount value holds the amount of iterations that need to be executed, success is a boolean status flag indicating whether the iterations have all succeeded or not, and the results variable corresponds to an array of strings capturing the output of each function invocation.
The following function can be used to check whether we have completed the iteration or not:
static int has_next_count_process(void *data) { IteratorData *iterator_data = (IteratorData*)data; return iterator_data->index < iterator_data->amount; }
The following function executes each successive iteration step:
static ProcReact_Future next_count_process(void *data) { IteratorData *iterator_data = (IteratorData*)data; ProcReact_Future future = return_count_async(iterator_data->index + 1); iterator_data->index++; return future; }
The above function increases the index, invokes the function with the index as a parameter, and increases the index value.
The following function gets invoked when a process' execution finishes:
static void complete_count_process(void *data, ProcReact_Future *future, ProcReact_Status status) { IteratorData *iterator_data = (IteratorData*)data; if(status == PROCREACT_STATUS_OK && future->result != NULL) { iterator_data->results = (char**)realloc(iterator_data->results, (iterator_data->results_length + 1) * sizeof(char*)); iterator_data->results[iterator_data->results_length] = future->result; iterator_data->results_length++; }the else iterator_data->success = FALSE; }
The above function checks the status of the function invocation and captures the results that each function returns. When a process completes successfully (i.e. it does not terminate abnormally and provides a non-NULL result), it appends the result to the results array. In case of a failure, it sets the overall status flag of the iterator to FALSE.
With all iteration aspects abstracted away into a ProcReact_Future struct, we can execute the following function to do all iteration steps in parallel:
procreact_fork_in_parallel_buffer_and_wait(&iterator);
We can also limit the amount of processes that are allowed to run concurrently to a specific value, e.g. 2:
procreact_fork_buffer_and_wait_in_parallel_limit(&iterator, 2);
After executing all iterations, we can consult the data struct to figure out whether the iterations have succeeded and what their results are.
Asynchronously orchestrating collections of processes
The previous two collection examples are executed synchronously. This means that while the execution of each function that retrieves an element is done asynchronously, the overall iteration task blocks the parent process until it completes, which is not always desirable.
A possible solution to make iterations asynchronous is to fork another process and iterate over the collection in the child process, but this introduces another challenge when the collected data needs to be returned to the parent.
Instead of performing all iteration steps as a whole we can also control each iteration step ourselves. For example, the following command-line invocation executes a single iteration step that composes a ProcReact_Future struct:
if(procreact_spawn_next_future(&iterator)) printf("Spawned a process and we have more of them!\n"); else printf("All processes have been spawned\n");
We can also run a each buffer iteration step ourselves and integrate that function call into a program's main loop:
while(TRUE) { unsigned int running_processes = procreact_buffer(&iterator); if(running_processes == 0) { /* This indicates that there are no running processes anymore */ /* You could do something with the end result here */ } /* Do other stuff in the main loop */ }
The above code fragment allows us to evaluate processes' statuses and buffer their outputs without blocking the main loop.
Summary
In this blog post, I have described a number of common housekeeping tasks I typically need to implement when developing multi-process applications and I have described an API that abstracts over these common house keeping tasks.
To summarize, when we intend to execute task or a collection of tasks, we can use one of the following data structures, depending whether we need to evaluate a single value or a collection of values, and whether the type of the value is simple (e.g. a boolean, byte, or void) or complex (e.g. strings, arrays of strings, etc.):
One | Many | |
---|---|---|
Simple | pid_t | ProcReact_PidIterator |
Complex | ProcReact_Future<ProcReact_Type> | ProcReact_FutureIterator |
Processing collections of tasks for each type can be done either synchronously or asynchronously, by picking the appropriate utility functions:
Synchronous | Asynchronous | |
---|---|---|
Simple |
procreact_fork_in_parallel_and_wait() procreact_fork_and_wait_in_parallel_limit() |
procreact_register_signal_handler() procreact_spawn_next_pid() procreact_complete_all_finished_processes() |
Complex |
procreact_fork_in_parallel_buffer_and_wait() procreact_fork_buffer_and_wait_in_parallel_limit() |
procreact_spawn_next_future() procreact_buffer() |
Motivation: Disnix
After reading this blog post, you may probably wonder why I have developed these abstractions? My main motivation is to organize Disnix's codebase in a better way.
Besides being a distributed deployment tool, all deployment activities that Disnix carries out (package management operations, state management operations, and communications) are carried out by external processes, for exactly the same reasons mentioned in the introduction.
Before deriving the abstractions described in this blog post, all process coordination in Disnix was hand coded -- Disnix's code was cluttered by boilerplate code doing output buffering, concurrency limiting, waiting, and resource allocation. As a result, some areas of the code were quite hard to read and difficult to extend. Moreover, it was also quite hard to ensure that the code is free from some serious issues, such as potential buffer overflows.
After restructuring the code to use these abstractions (between Git revisions 595c1ec and 4ba5e3d), I have managed to significantly reduce the amount of code. For example, the methods.c module (an RPC interface for all deployment operations that Disnix can execute remotely) previously consisted of 1976 LOC. After refactoring, it consists of only 488 LOC.
Moreover, the disnix-build utility's main module (build.c) size has been reduced from 250 LOC to 153 LOC. Each command-line utility that executes a deployment task (13 in total) each have at least 100 fewer lines lines of code.
In addition to reducing the size of many modules, I have also accomplished the following:
- Better separation of concerns. I have managed to clearly separate all asynchronous functions spawning processes into three separate libraries: (libpkgmgmt for package manager, libstatemgmt for state management and libinterface for communication) considerably simplifying the toolset's architecture.
- Performance improvements. Some tools (disnix-capture-infra and disnix-query) were still executing their tasks sequentially and were difficult to optimize. With these new abstractions, parallelizing their tasks became quite simple.
- Better error reporting. In older versions of Disnix, it was difficult to relate error messages to target machines. With the abstractions described in this blog post, implementing a translation process became much easier. As a result, all tools can now clearly report the origins of an error.
Discussion
Although the abstractions described in this blog post have allowed me to structure Disnix's code in a much better way, they are not a solution for everything.
For example, the ProcReact_Future abstraction buffers the process' output, which is inadequate for processes transferring large quantities of data. Moreover, each function invocation implies a fork. On a very large scale, this could become quite expensive, as it takes time to set up a child process and memory for allocating the Copy-on-Write (COW) table. In theory, we could also make it possible to reuse spawned processes among function invocations to optimize this, but currently no such optimizations have been implemented yet.
Furthermore, processes are not the only solution for implementing concurrent applications. When frequent interaction is required between parent and child, it may be better to use threads as their memory is shared (at the same time, this also has disadvantages). A major disadvantage of processes (compared to threads) is that all communication data needs to be serialized, transferred and deserialized, introducing quite a bit of communication overhead.
Apart from processes and threads, it is also possible to use a single threaded event loop, in which the programmer has the obligation to divide bigger tasks into small tasks. For example, this model is prominently used in Node.js. The advantage of this approach is that each additional concurrent task does not require the allocation of additional resources. This works, for example, well for applications that are mostly I/O-bound (I/O is typically thousands of times slower than a CPU).
The disadvantages of such an application organization is that it becomes a programmer's obligation to ensure that the main thread never blocks and that the code remains structured properly. Moreover, "context-switching" between single threaded tasks is more expensive than a context switch on CPU-level, making it quite inadequate for computationally intensive applications. Finally, the operating system cannot automatically divide tasks over multiple CPU cores.
Availability
libprocreact is part of the latest development version of Disnix. Additionally, I have isolated the library code and created a separate GitHub repository for it.
Although libprocreact can be used independently from Disnix, I currently do not have any plans to make it a fully fledged general purpose library. At the moment, it is not supposed to be used for any other use cases than Disnix's.
No comments:
Post a Comment