Nice threads in Python
Python has two ways to execute code in parallel: processes and threads. Both have their strengths and weaknesses. In particular, threads are cheaper in terms of resources and can share memory, but besides the obvious pitfalls of multi-threaded programming, they suffer from some Python-specific problems, notably:
- Because of the GIL (global interpreter lock), only one Python thread is actually running Python code at any one time. So very little parallelism is possible if the task is CPU-bound. In fact, more often than not, it's more efficient to execute multiple tasks sequentially on a single thread (pinned to a single CPU) than to spawn multiple threads: the interpreter then context-switches too much and still cannot schedule the threads on all available CPUs.
- Python's threads have no priority, so a "less important" thread can starve "more important" ones. I am not sure why setting the scheduling policy for a thread is not possible in Python, given that it's in the POSIX standard.
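To make the first point concrete, here is a minimal sketch that times the same CPU-bound work run sequentially and on several threads. It assumes a standard CPython build with the GIL enabled; `fib`, `run_sequential` and `run_threaded` are illustrative names, not taken from any script in this post.

```python
import threading
import time


def fib(n):
    """Deliberately inefficient recursive Fibonacci, used as a CPU-bound task."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def run_sequential(n, count):
    """Run `count` tasks one after another; return the elapsed seconds."""
    start = time.perf_counter()
    for _ in range(count):
        fib(n)
    return time.perf_counter() - start


def run_threaded(n, count):
    """Run `count` tasks on separate threads; return the elapsed seconds."""
    threads = [threading.Thread(target=fib, args=(n,)) for _ in range(count)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start
```

Comparing `run_sequential(26, 4)` with `run_threaded(26, 4)` on your own machine should show whether the threads buy anything for this kind of task; the exact numbers depend on the hardware and the Python version.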
In general, it's recommended to use multiprocessing, but if you are stuck with threads and you want to make sure certain threads are executed with higher priority, you can use the following trick.
Take a look at this script:
It spawns 4 threads, each of which executes some CPU-intensive task (calculating the 26th Fibonacci number in a very inefficient way!). The whole process is pinned to a single CPU.
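The script itself is not reproduced here, so what follows is only a rough sketch of what the description implies, assuming Linux. It pins the process with `os.sched_setaffinity`, which may not be how the original script does it. The name `loop` matches the function name visible in the logs; `pin_to_single_cpu`, `run_threads` and the log format are my own assumptions.

```python
import logging
import os
import threading
import time

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s {%(process)d-%(thread)d} %(message)s",
)


def fib(n):
    """Deliberately inefficient recursive Fibonacci (CPU-bound)."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def pin_to_single_cpu():
    """Pin the caller to one of its currently allowed CPUs (Linux only).

    Returns the chosen CPU, or None where the call is unavailable.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    cpu = max(os.sched_getaffinity(0))
    # Meant to be called from the main thread before any other thread starts:
    # threads created afterwards inherit this affinity mask.
    os.sched_setaffinity(0, {cpu})
    return cpu


def loop(name, n, timings):
    """Run the task once and record how long it took."""
    logging.info("Starting %s", name)
    start = time.perf_counter()
    fib(n)
    timings[name] = time.perf_counter() - start
    logging.info("Done %s in %.2f", name, timings[name])


def run_threads(count=4, n=26):
    """Spawn `count` threads that all compute fib(n); return per-thread timings."""
    timings = {}
    threads = [
        threading.Thread(target=loop, args=(f"t{i}", n, timings))
        for i in range(count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return timings
```

Calling `pin_to_single_cpu()` and then `run_threads()` from the main thread produces "Starting"/"Done" lines of the same general shape as the logs in this post; the timings will of course differ from machine to machine.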
When all threads are given the same priority, the output is like this:
~ » python3 ~/src/yougov/brandindex/qq7/priv/nice_threads.py
2020-02-01 17:26:18,668 WARNING [root] <nice_threads.py:pin_process_to_cpus:28> {14060-140417266833216} Pinning 14060 to cpus 7
pid 14060's current affinity list: 0-7
pid 14060's new affinity list: 7
2020-02-01 17:26:18,671 INFO [root] <nice_threads.py:loop:39> {14060-140417225721600} Starting t0
2020-02-01 17:26:18,691 INFO [root] <nice_threads.py:loop:39> {14060-140417217328896} Starting t1
2020-02-01 17:26:18,739 INFO [root] <nice_threads.py:loop:39> {14060-140417208936192} Starting t2
2020-02-01 17:26:18,783 INFO [root] <nice_threads.py:loop:39> {14060-140417200543488} Starting t3
2020-02-01 17:26:31,931 INFO [root] <nice_threads.py:loop:44> {14060-140417217328896} Done t1 in 13.20
2020-02-01 17:26:31,940 INFO [root] <nice_threads.py:loop:44> {14060-140417225721600} Done t0 in 13.27
2020-02-01 17:26:31,967 INFO [root] <nice_threads.py:loop:44> {14060-140417200543488} Done t3 in 13.18
2020-02-01 17:26:31,998 INFO [root] <nice_threads.py:loop:44> {14060-140417208936192} Done t2 in 13.26
All threads terminate at about the same time (they are given the same amount of CPU time), but each of them takes about 13 seconds to execute a task (fib(26)) that takes ~3 seconds on its own, because the threads are stepping on each other's toes.
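It would be more efficient to execute the 4 tasks sequentially: the total run time is about the same (roughly 13 to 14 seconds), but each task completes in about 3.4 seconds, so the first results are available much earlier. See the output of a sequential run: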
~ » python3 ~/src/yougov/brandindex/qq7/priv/nice_threads.py
2020-02-01 17:33:31,627 WARNING [root] <nice_threads.py:pin_process_to_cpus:28> {16673-140526022530880} Pinning 16673 to cpus 7
pid 16673's current affinity list: 0-7
pid 16673's new affinity list: 7
2020-02-01 17:33:31,630 INFO [root] <nice_threads.py:loop:39> {16673-140526022530880} Starting seq-0 [None]
2020-02-01 17:33:35,005 INFO [root] <nice_threads.py:loop:45> {16673-140526022530880} Done seq-0 [None] in 3.37
2020-02-01 17:33:35,005 INFO [root] <nice_threads.py:loop:39> {16673-140526022530880} Starting seq-1 [None]
2020-02-01 17:33:38,352 INFO [root] <nice_threads.py:loop:45> {16673-140526022530880} Done seq-1 [None] in 3.35
2020-02-01 17:33:38,352 INFO [root] <nice_threads.py:loop:39> {16673-140526022530880} Starting seq-2 [None]
2020-02-01 17:33:41,830 INFO [root] <nice_threads.py:loop:45> {16673-140526022530880} Done seq-2 [None] in 3.48
2020-02-01 17:33:41,830 INFO [root] <nice_threads.py:loop:39> {16673-140526022530880} Starting seq-3 [None]
2020-02-01 17:33:45,228 INFO [root] <nice_threads.py:loop:45> {16673-140526022530880} Done seq-3 [None] in 3.40
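If instead each thread is given a different priority, things behave better. We can do that by using nice: on Linux, a nice value can be set per thread, and a lower value means a higher priority. In the run below, the four threads get the values 0, 5, 10 and 15 (shown in square brackets). The output looks like this: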
» python3 nice_threads.py
2020-02-01 14:29:51,838 WARNING [root] <nice_threads.py:pin_process_to_cpus:28> {8957-139865220663104} Pinning 8957 to cpus 1
pid 8957's current affinity list: 0-7
pid 8957's new affinity list: 1
2020-02-01 14:29:51,840 INFO [root] <nice_threads.py:loop:39> {8957-139865179551488} Starting t0 [0]
2020-02-01 14:29:51,859 INFO [root] <nice_threads.py:loop:39> {8957-139865171158784} Starting t1 [5]
2020-02-01 14:29:51,893 INFO [root] <nice_threads.py:loop:39> {8957-139865162503936} Starting t2 [10]
2020-02-01 14:29:51,955 INFO [root] <nice_threads.py:loop:39> {8957-139865154111232} Starting t3 [15]
2020-02-01 14:29:57,266 INFO [root] <nice_threads.py:loop:44> {8957-139865179551488} Done t0 [0] in 5.43
2020-02-01 14:30:03,568 INFO [root] <nice_threads.py:loop:44> {8957-139865171158784} Done t1 [5] in 11.71
2020-02-01 14:30:05,315 INFO [root] <nice_threads.py:loop:44> {8957-139865162503936} Done t2 [10] in 13.42
2020-02-01 14:30:05,423 INFO [root] <nice_threads.py:loop:44> {8957-139865154111232} Done t3 [15] in 13.47
Note that the threads terminate at different times: the OS scheduled more CPU time to the high-priority thread, which completed before the others. Here t0 finished in 5.43 seconds, well ahead of t1 (11.71), t2 (13.42) and t3 (13.47).
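For reference, here is a minimal sketch of how a thread can set its own nice value. It assumes Linux, where the nice value is effectively a per-thread attribute, and Python 3.8+ for `threading.get_native_id()`. `set_thread_nice` and `run_with_nice` are hypothetical helpers and not necessarily what the original script does.

```python
import os
import threading


def set_thread_nice(nice):
    """Set the nice value of the calling thread and return the value read back.

    Assumption: Linux, where passing a kernel thread ID to setpriority()
    changes only that thread. An unprivileged process can normally only
    raise its nice value (that is, lower its priority).
    """
    tid = threading.get_native_id()  # kernel thread ID, Python 3.8+
    os.setpriority(os.PRIO_PROCESS, tid, nice)
    return os.getpriority(os.PRIO_PROCESS, tid)


def run_with_nice(nice_values, task=lambda: None):
    """Start one thread per nice value; each sets its own niceness, then runs `task`.

    Returns the nice values the threads observed, in the same order.
    """
    observed = [None] * len(nice_values)

    def worker(index, nice):
        observed[index] = set_thread_nice(nice)
        task()

    threads = [
        threading.Thread(target=worker, args=(i, nice))
        for i, nice in enumerate(nice_values)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return observed
```

Something like `run_with_nice([0, 5, 10, 15], task=some_cpu_bound_function)` would mirror the run above, provided the process starts with a nice value of 0. The effect only shows when the threads compete for the same CPU, which is why the process is pinned.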
htop shows it in "real time", too: