Python multiprocessing crashes docker container

I have a simple Python multiprocessing script that works fine when I run it from the console:

# mp.py
import multiprocessing as mp


def do_smth():
    print('something')


if __name__ == '__main__':
    ctx = mp.get_context("spawn")
    p = ctx.Process(target=do_smth, args=tuple())
    p.start()
    p.join()

Result:

> $ python3 mp.py
something

Then I created a simple Docker image from this Dockerfile:

FROM python:3.6

ADD . /app
WORKDIR /app

And docker-compose.yml:

version: '3.6'

services:
  bug:
    build:
      context: .
    environment:
      - PYTHONUNBUFFERED=1
    command: su -c "python3.6 forever.py"

Where forever.py is:

from time import sleep

if __name__ == '__main__':
    i = 0
    while True:
        sleep(1.0)
        i += 1
        print(f'hello {i:3}')

Now I run forever.py with docker-compose:

> $ docker-compose build && docker-compose up 
...
some output
...
Attaching to mpbug_bug_1
bug_1  | hello   1
bug_1  | hello   2
bug_1  | hello   3
bug_1  | hello   4

Up to this point everything works as expected. But when I run mp.py inside the Docker container, the container crashes without any message:

> $ docker exec -it mpbug_bug_1 /bin/bash
root@<container-id>:/app# python mp.py
something
root@<container-id>:/app# %

Gist with the code can be found here: https://gist.github.com/ilalex/83649bf21ef50cb74a2df5db01686f18

Can you explain why the Docker container crashes and how I can run the script without crashing it?

Thank you in advance!

2 answers

1

mp.py is not an equivalent of forever.py. mp.py starts a new worker process that prints something and then exits, so join() in the main process returns as soon as the worker is done and the script ends.

A closer equivalent of forever.py is a worker process that prints the hello message in an infinite loop while the main process waits in join() for the worker to exit. Call it forever-mp.py:

import multiprocessing as mp
from time import sleep

def do_smth():
    i = 0
    while True:
        sleep(1.0)
        i += 1
        print(f'hello {i:3}')

if __name__ == '__main__':
    ctx = mp.get_context("spawn")
    p = ctx.Process(target=do_smth, args=tuple())
    p.start()
    p.join()

Updated docker-compose.yml:

version: '3.6'

services:
  bug:
    build:
      context: .
    environment:
      - PYTHONUNBUFFERED=1
    command: su -c "python3.6 forever-mp.py"

Test:

$ docker-compose build && docker-compose up 
...
some output
...
Attaching to multiprcs_bug_1_72681117a752
bug_1_72681117a752 | hello   1
bug_1_72681117a752 | hello   2
bug_1_72681117a752 | hello   3
bug_1_72681117a752 | hello   4

Check processes in the container:

$ docker top multiprcs_bug_1_72681117a752
UID                 PID                 PPID                C                   STIME               TTY                 TIME                CMD
root                38235               38217               0                   21:36               ?                   00:00:00            su -c python3.6 forever-mp.py
root                38297               38235               0                   21:36               ?                   00:00:00            python3.6 forever-mp.py
root                38300               38297               0                   21:36               ?                   00:00:00            /usr/local/bin/python3.6 -c from multiprocessing.semaphore_tracker import main;main(3)
root                38301               38297               0                   21:36               ?                   00:00:00            /usr/local/bin/python3.6 -c from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=4, pipe_handle=6) --multiprocessing-fork
5

For a quick fix, do not use the spawn start method and/or do not use su -c ...; both are unnecessary, in my opinion. Change the process creation to:

p = mp.Process(target=do_smth, args=tuple())

Alternatively, start the container with the --init option.
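Putting the first suggestion together, a minimal sketch of mp.py without the spawn start method might look like the following. The file name mp_fork.py and the helper run_once are hypothetical. The sketch requests "fork" explicitly, on the assumption that the container runs Linux, where fork is the default start method for Python 3.6 and is therefore what a plain mp.Process would use there.

```python
# mp_fork.py (hypothetical name): mp.py without the "spawn" start method
import multiprocessing as mp


def do_smth():
    print('something')


def run_once():
    # "fork" is the default start method on Linux for Python 3.6, so this
    # is the explicit form of mp.Process(target=do_smth) on that platform.
    ctx = mp.get_context("fork")
    p = ctx.Process(target=do_smth)
    p.start()
    p.join()
    # exitcode is 0 when the worker finished normally.
    return p.exitcode


if __name__ == '__main__':
    run_once()
```

With this start method the worker is a direct fork of the main process, which should avoid the extra helper process described below.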

With the spawn start method, Python also starts a semaphore tracker process to prevent semaphore leaks. You can see this process by pausing mp.py partway through; it looks like:

472   463 /usr/local/bin/python3 -c from multiprocessing.semaphore_tracker import main;main(3)

This process is started by mp.py but exits after mp.py does, so mp.py never reaps it. By design, it is supposed to be reaped by init.
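The hand-over of an orphaned process to a new parent can be observed with a small sketch. The function observe_reparenting is a hypothetical helper, not part of multiprocessing: it forks a child that forks a grandchild and exits at once, and the grandchild reports which PID the kernel assigned as its new parent. It assumes a Unix-like system where os.fork is available.

```python
import os
import time


def observe_reparenting(timeout=5.0):
    """Return (child_pid, new_parent_pid) for an orphaned grandchild."""
    read_fd, write_fd = os.pipe()
    child_pid = os.fork()
    if child_pid == 0:
        # Child: start a grandchild, then exit right away without waiting.
        os.close(read_fd)
        my_pid = os.getpid()
        if os.fork() == 0:
            # Grandchild: poll until the kernel has assigned a new parent.
            deadline = time.monotonic() + timeout
            while os.getppid() == my_pid and time.monotonic() < deadline:
                time.sleep(0.01)
            os.write(write_fd, str(os.getppid()).encode())
            os._exit(0)
        os._exit(0)

    # Original process: reap the child, then read the grandchild's report.
    os.close(write_fd)
    os.waitpid(child_pid, 0)
    new_parent = int(os.read(read_fd, 32).decode())
    os.close(read_fd)
    return child_pid, new_parent
```

On a typical Linux host the reported parent is PID 1 or a designated subreaper; inside a container it is whatever process happens to be PID 1 in that container.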

The problem is that there is no init in this container (PID namespace). Instead of init, PID 1 is su -c, so the dead semaphore tracker process is adopted by su.

It appears that su mistakenly treats the dead child as its command process (forever.py) without checking which child actually exited, so su exits. Because su is PID 1, the kernel then kills every other process in the container, including forever.py.
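For contrast, here is a rough sketch of the check that a well-behaved PID 1 performs: it reaps every child that exits, but only stops when the PID returned by wait() is the one it started itself. The names run_and_reap and exit_code_from_status are hypothetical, and this illustrates the idea only; it is not a replacement for a real init such as the one that --init provides.

```python
import os


def exit_code_from_status(status):
    """Translate a raw wait() status into a shell-style exit code."""
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    if os.WIFSIGNALED(status):
        return 128 + os.WTERMSIG(status)
    return 1


def run_and_reap(argv):
    """Start argv as a child, reap every child that exits, and return
    the exit code of argv itself (hypothetical helper, not a real init)."""
    main_pid = os.fork()
    if main_pid == 0:
        # Child: replace this process with the requested command.
        try:
            os.execvp(argv[0], argv)
        except OSError:
            pass
        os._exit(127)  # only reached if exec failed

    while True:
        pid, status = os.wait()  # blocks until any child exits
        if pid == main_pid:
            # The command we started has finished; report its result.
            return exit_code_from_status(status)
        # Otherwise an adopted orphan was reaped; keep waiting.
```

Run as PID 1, a loop like this would reap the orphaned semaphore tracker and keep waiting for the real command, rather than exiting on the first child that dies.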

This behavior can be observed with strace:

docker run --security-opt seccomp:unconfined --rm -it ex_bug strace -e trace=process -f su -c 'python3 forever.py'

It prints a message like:

strace: Exit of unknown pid 14 ignored

Reference: Docker and the PID 1 zombie reaping problem (phusion.nl)