Multithreaded parallel Python through OpenMP support in Numba
Todd Anderson
Tim Mattson
A modern CPU delivers performance through parallelism. A program that exploits the performance
available from a CPU must run in parallel on multiple cores. This is usually best done through multithreading.
Threads belong to a process and share the memory associated with that process. The most
popular approach for writing multithreaded code is to use directives to tell the compiler
how to convert code into multithreaded code. The most commonly used directive-based API for
writing multithreaded code is OpenMP.
Python is not designed for parallel programming with threads.
The GlobalInterpreterLock (GIL) prevents multiple threads from simultaneously accessing Python objects.
This effectively prevents data races and makes Python naturally thread safe. Consequently, the GIL prevents parallel programming
with multiple threads and therefore keeps Python from accessing the full performance
from a CPU.
In this paper, we describe a solution to this problem. We implement OpenMP in
Python so programmers can easily annotate their code and then let the Numba just-in-time (JIT) compiler generate
multithreaded, OpenMP code in LLVM, thereby bypassing the GIL. We describe this new multithreading
system for Python and and show that the performance in our early tests is on par with the analogous C code.
OpenMP, Python, Numba
DOI10.25080/majora-1b6fd038-012