
The fork() system call creates (“forks”) a child process from a parent process; the two processes can then run independently of each other. This is how nearly all processes are started on Linux: the parent calls fork(), and the child then calls one of the exec() functions to load the new program (the so-called “fork-and-exec” pattern).
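To make the pattern concrete, here is a minimal sketch of fork-and-exec in plain (non-MPI) Python. The helper name run_with_fork_exec is made up for this illustration; os.fork(), os.execvp() and os.waitpid() are the standard library wrappers around the corresponding system calls.

```python
import os
import sys


def run_with_fork_exec(argv):
    """Start argv as a child process via fork-and-exec; return its exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: replace this copy of the process with the requested program.
        try:
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed; leave immediately without cleanup.
            os._exit(127)
    # Parent: wait for the child to finish and decode its exit status.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


if __name__ == "__main__":
    code = run_with_fork_exec([sys.executable, "-c", "print('hello from the child')"])
    print("child exit code:", code)
```

This is roughly what helpers such as os.system() do on your behalf, which is why they count as uses of fork().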

When a program calls fork(), the Linux kernel creates an identical copy of that program; in the case of an MPI program, this includes all of the internal state. However, some of this internal state involves handles for interacting with, for example, the InfiniBand adaptor in the node – while the kernel has created a copy of the handle for the new process, it still refers to the same internal state in the hardware. This means that if you use that handle in the child process, it will affect the parent process as well.
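The same effect can be seen on a small scale with an ordinary file, used here only as an analogy for a hardware handle. In the sketch below (the function name is invented for this example), the child gets its own copy of the file descriptor, but both copies refer to the same open file inside the kernel, so a read in the child moves the parent's file position as well.

```python
import os
import tempfile


def offset_seen_by_parent_after_child_read(nbytes=4):
    """Return the parent's file offset after a forked child reads nbytes."""
    with tempfile.TemporaryFile() as f:
        f.write(b"0123456789")
        f.flush()
        fd = f.fileno()
        os.lseek(fd, 0, os.SEEK_SET)  # rewind before forking

        pid = os.fork()
        if pid == 0:
            # Child: read through its copy of the descriptor, then leave
            # immediately without running any cleanup code.
            os.read(fd, nbytes)
            os._exit(0)

        os.waitpid(pid, 0)
        # Parent: has not read anything itself, yet the offset has moved.
        return os.lseek(fd, 0, os.SEEK_CUR)


if __name__ == "__main__":
    print(offset_seen_by_parent_after_child_read(4))  # prints 4
```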

Perhaps the most severe (but least obvious) use of the handle in the child process is allowing the child process to exit. This will call all of the exit handlers that finalise and close the handles into the hardware, which has the same effect as closing them in the parent process. Suddenly, the parent process will find that its interface with the hardware has been mysteriously removed, and this will almost certainly cause it to crash.
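The following sketch shows that exit handlers registered before fork() are inherited and run when the child exits normally. The finalise() function is a hypothetical stand-in for a library's cleanup code, not a real MPI handler; it only appends a line to a temporary file so that the parent can see which process ran it.

```python
import atexit
import os
import sys
import tempfile


def exit_handler_runs_in_child():
    """Fork a child that exits normally; report which process ran the handler."""
    fd, path = tempfile.mkstemp()
    os.close(fd)

    def finalise():
        # Hypothetical stand-in for cleanup code that closes hardware handles.
        with open(path, "a") as log:
            log.write("finalised in pid %d\n" % os.getpid())

    atexit.register(finalise)  # registered once, in the parent
    parent_pid = os.getpid()

    pid = os.fork()
    if pid == 0:
        sys.exit(0)  # child exits normally, so the inherited handler runs

    os.waitpid(pid, 0)
    atexit.unregister(finalise)  # stop the demo handler firing again in the parent
    with open(path) as log:
        lines = log.read().splitlines()
    os.remove(path)
    return parent_pid, pid, lines


if __name__ == "__main__":
    parent_pid, child_pid, lines = exit_handler_runs_in_child()
    print("parent:", parent_pid, "child:", child_pid, "log:", lines)
```

The log contains a single entry written by the child, even though the handler was only ever registered in the parent.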

Because of this, using fork() from an MPI program is strongly discouraged. For C, C++, and Fortran, this means avoiding system() and fork(). For Python, this means avoiding os.system(), os.popen(), and most things from the subprocess module.