A large program may spawn multiple processes and form a complicated process tree. The entrypoint of a Docker container, running as PID 1, is responsible for adopting orphans and forwarding signals.
When a process terminates before it reaps its children, the children are adopted by PID 1, which becomes their new parent.1 PID 1 must reap these adopted processes after they terminate. The easiest way to implement this is to use a Bash script as the entrypoint, because bash automatically reaps adopted processes.2
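The adoption step can be observed with a small experiment. The following is a minimal sketch, assuming a Linux system with /proc mounted; the variable names (first_parent, orphan_pid, new_parent) are illustrative only. Inside a container the new parent would be PID 1, the entrypoint; on an ordinary host it is init or the nearest subreaper.

```shell
#!/bin/bash
# Sketch (Linux only: it reads /proc). Variable names are illustrative.

# The command substitution runs in a subshell. That subshell starts
# sleep in the background, prints its own PID and the PID of sleep,
# and exits without waiting, so sleep becomes an orphan.
read -r first_parent orphan_pid <<< "$(sleep 5 >/dev/null 2>&1 & echo "$BASHPID $!")"

sleep 1   # the first parent is certainly gone by now

# Look up the orphan's current parent in /proc/<pid>/status.
new_parent=""
while read -r key value; do
    if [ "$key" = "PPid:" ]; then new_parent=$value; fi
done < "/proc/$orphan_pid/status"

echo "orphan $orphan_pid: first parent $first_parent, current parent $new_parent"

kill "$orphan_pid" 2>/dev/null || true   # tidy up the orphan
```

The two parent PIDs printed on the last line should differ, which shows that the orphan has been handed over to another process that is now responsible for reaping it.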
A container can run as a daemon:
docker run -d <image>
Users usually terminate a daemon container by executing docker stop. The command sends SIGTERM to the entrypoint and waits for a grace period (10 seconds by default). If the entrypoint has not exited when the grace period ends, the command sends SIGKILL to forcefully terminate the container.3 Thus, for graceful termination, when the entrypoint receives SIGTERM it must send the proper signals to its child processes and wait for them to terminate before it exits. Most programs treat SIGTERM as the graceful termination signal, so most of the time the entrypoint just needs to forward SIGTERM and wait for its children to terminate before it exits.
To stay general, the entrypoint should signal only its direct children and let each child take care of its own descendants.
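The forward-and-wait behavior described above can be sketched in a few lines of Bash. This is a simplified illustration, not a complete entrypoint: the names children, forward_term and handled are made up for the example, child PIDs are tracked in an array, and the script sends SIGTERM to itself as a stand-in for docker stop.

```shell
#!/bin/bash
# Sketch: forward SIGTERM to direct children, then wait for them.
# children, forward_term and handled are illustrative names.

children=()
handled=no

forward_term() {
    local pid
    for pid in "${children[@]}"; do
        kill -TERM "$pid" 2>/dev/null || true   # child may already be gone
    done
    wait            # reap every child before going on
    handled=yes     # a real entrypoint would call exit here
}
trap forward_term TERM

# Start one direct child in the background and remember its PID.
sleep 30 &
children+=("$!")

# Stand-in for the SIGTERM that docker stop sends to the entrypoint.
kill -TERM "$BASHPID"

# The handler has run by now; check whether the child survived.
if kill -0 "${children[0]}" 2>/dev/null; then child=alive; else child=gone; fi
echo "handled=$handled child=$child"
```

Only the PIDs recorded in children are signaled, so grandchildren are left to their own parents, in line with the rule that the entrypoint signals direct children only.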
A container can also run in the foreground:
docker run -it <image>
Users usually terminate a foreground process by pressing CTRL + C, which sends SIGINT to every process in the foreground process group.
There are two problems here.
The first problem is that the entrypoint ignores SIGINT by default. The entrypoint is PID 1 of its PID namespace, and the kernel delivers a signal to PID 1 only if PID 1 has installed a handler for that signal. The SIGTERM from docker stop comes from the parent PID namespace, whereas this SIGINT is sent within the entrypoint’s own PID namespace, but the rule is the same for both: without a handler the signal is dropped. Only SIGKILL and SIGSTOP sent from an ancestor namespace are delivered unconditionally.
The solution is to explicitly install a SIGINT handler in the entrypoint so that it responds to CTRL + C.4
The second problem is that children running in the background ignore SIGINT too. When background processes are launched from a non-interactive shell, SIGINT and SIGQUIT are ignored by default for those processes.5 Moreover, if the background process is itself a shell, the POSIX standard forbids it from restoring the handlers of SIGINT and SIGQUIT.6 Thus background descendants of the entrypoint will not respond to SIGINT.
The solution is to send SIGTERM to the children of the entrypoint from the entrypoint’s SIGINT handler.7
The discussion above leads to the following Bash script.
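#!/bin/bash

# Send SIGTERM to every direct child, wait for all of them, then exit
# with the status passed as the first argument (0 if none is given).
cleanup() {
    local exst="${1:-0}"
    local child
    for child in $(pgrep -P $$); do
        # kill sends SIGTERM by default; a child may already be gone.
        kill "$child" 2>/dev/null
    done
    wait
    exit "$exst"
}

# Exit with 128 + signal number, the conventional status for a
# signal-induced exit (an exit status must lie in the range 0-255).
trap 'cleanup 143' SIGTERM
trap 'cleanup 130' SIGINT
# any other signals expected to terminate the container could be added here

# some init code
# start children in background

wait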
Stevens, W. R., & Rago, S. A. (2008). Advanced programming in the UNIX environment. Addison-Wesley.
Subtle Behaviors of Signals: A Comparison between macOS and Linux, Zsh and Bash.