Writing Proper Entrypoints for Docker Containers of Multi-processing Programs

dyx
2021-06-04 +0800

A large program may comprise multiple processes and generate a complicated process tree. The entrypoint of a Docker container runs as PID 1 and is therefore responsible for adopting orphans and forwarding signals.

Adopting Orphans

When a process terminates before reaping its children, the children are adopted by PID 1, which becomes their new parent. [1] PID 1 must then reap these adopted processes after they terminate; otherwise they linger as zombies. The easiest way to achieve this is to use a Bash script as the entrypoint, because Bash automatically reaps adopted processes. [2]
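The adoption step can be observed directly. The following is a minimal sketch, assuming a Linux system with /proc mounted; the variable names are illustrative. An intermediate shell starts a background sleep and exits without waiting for it, after which the kernel reports a different parent for the orphan. Inside a container the new parent is the entrypoint (PID 1); elsewhere it may be init or a subreaper.

```shell
# The intermediate shell prints its own PID and the PID of a background
# sleep, then exits at once, leaving the sleep orphaned.
pids=$(sh -c 'sleep 30 >/dev/null 2>&1 & echo "$$ $!"')
old_parent=${pids% *}
orphan=${pids#* }

# Ask the kernel who the orphan's parent is now (Linux-specific).
new_parent=$(awk '/^PPid:/ {print $2}' "/proc/$orphan/status")
echo "orphan $orphan: parent was $old_parent, is now $new_parent"

# Clean up the orphan; whoever adopted it is responsible for reaping it.
kill "$orphan"
```

If the adopting process never calls wait, the killed sleep stays in the process table as a zombie, which is exactly what a proper entrypoint has to prevent.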

Forwarding SIGTERM

A container can run as a daemon.

    docker run -d <image>

Users usually terminate a daemon container with docker stop. The command sends SIGTERM to the entrypoint and waits for a grace period (10 seconds by default). If the entrypoint has not exited by then, it sends SIGKILL to forcefully terminate the container. [3] Thus, for graceful termination, the entrypoint must, on receiving SIGTERM, send the proper signals to its child processes and wait for them to terminate before it exits.

Most programs treat SIGTERM as the graceful termination signal, so most of the time the entrypoint only needs to forward SIGTERM and wait for its children to terminate before exiting.

For generality, the entrypoint shouldn't send signals to indirect descendants. Let the children worry about their own children.
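As a minimal sketch of this forwarding, assume a single child whose PID is kept in the illustrative variable child, with a helper subshell standing in for docker stop. The names forward_term, child, and status are not from any library; they are chosen here for illustration only.

```shell
#!/bin/bash

# Forward SIGTERM to the direct child only.
forward_term() {
    kill -TERM "$child" 2>/dev/null
}
trap forward_term TERM

sleep 30 &          # stands in for the real workload
child=$!

# Simulate "docker stop" by sending SIGTERM to this script after one second.
( sleep 1; kill -TERM $$ ) &

status=0
# A trapped signal makes wait return early with a status above 128;
# the handler runs immediately afterwards.
wait "$child" || status=$?
# Wait again so the script does not exit before the child has terminated.
wait "$child" || status=$?
echo "child exited with status $status"
```

The second wait matters: without it the script could exit right after the handler runs, while the child is still shutting down.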

Handling SIGINT

A container can also run in the foreground.

    docker run -it <image>

Users usually terminate a foreground process by pressing CTRL + C, which sends SIGINT to every process in the foreground process group.

There are two problems here.

The first problem is that the entrypoint ignores SIGINT by default. Unlike the SIGTERM from docker stop, which comes from the parent PID namespace, the SIGINT is sent from within the entrypoint's own PID namespace. As PID 1 of that namespace, the entrypoint ignores any such signal unless it has installed a handler for it.

The solution is to explicitly install a SIGINT handler in the entrypoint so that it responds to CTRL + C. [4]

The second problem is that children running in the background ignore SIGINT too. When a non-interactive shell launches a process in the background, SIGINT and SIGQUIT are ignored for that process by default. [5] Moreover, if the background process is itself a shell, the POSIX standard forbids it from trapping or resetting signals that were ignored on entry, so it cannot restore the SIGINT and SIGQUIT handlers. [6] Thus the background descendants of the entrypoint won't respond to SIGINT.
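This default can be checked with a small experiment. The sketch below assumes it is run as a script, that is, by a non-interactive shell with job control off; the variable names are illustrative. A background sleep is sent SIGINT first and SIGTERM afterwards, and its exit status shows which of the two actually terminated it.

```shell
# Started in the background by a non-interactive shell, so SIGINT and
# SIGQUIT are set to "ignore" for this process.
sleep 30 &
pid=$!
sleep 1             # give the child time to start

kill -INT "$pid"    # expected to be ignored
sleep 1
kill -TERM "$pid"   # not ignored, so this terminates the sleep

status=0
wait "$pid" || status=$?
# 143 (128 + 15) means SIGTERM ended the process;
# 130 (128 + 2) would mean SIGINT had already done so.
echo "background job exited with status $status"
```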

The solution is to send SIGTERM to the children of the entrypoint from the entrypoint's SIGINT handler. [7]

Conclusion: A Solution Template

The discussion above leads to the following Bash script.

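    #!/bin/bash
    
    # Send SIGTERM to every direct child, wait for all of them to exit,
    # then exit with the status given as the first argument (default 0).
    function cleanup() {
            local exst child
            exst="${1:-0}"
            
            for child in $(pgrep -P $$)
            do
                    kill "$child" 2>/dev/null
            done
            
            wait
            exit "$exst"
    }
    
    # Exit with 128 + signal number, the conventional status for
    # termination by a signal (exit statuses must lie within 0-255).
    trap "cleanup 143" SIGTERM
    trap "cleanup 130" SIGINT
    # any other signals expected to terminate the container could be added here
    
    # some init code
    # start children in background
    
    # Block until all children exit; a trapped signal interrupts this wait.
    wait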

References

  1. Stevens, W. R., & Rago, S. A. (2008). Advanced programming in the UNIX environment. Addison-Wesley. 

  2. Docker and the PID 1 zombie reaping problem 

  3. docker stop | Docker Documentation 

  4. kill(2) - Linux manual page 

  5. Shell Command Language - IEEE Std 1003.1-2008 

  6. trap - IEEE Std 1003.1, 2004 Edition 

  7. Subtle Behaviors of Signals: A Comparison between macOS and Linux, Zsh and Bash