Notes on binding, placement, etc. for hybrid MPI + OpenMP jobs. Using Open MPI 4.1.2 and OpenMP 4.5 here (GCC 9.4 on Ubuntu 20.04 LTS). Also trying to incorporate NUMA/SMT concerns.
OpenMP binding/placement options (OMP_PLACES / OMP_PROC_BIND):
cores/close will pack threads onto consecutive cores (socket 0 fills up first, then socket 1)
cores/spread will distribute threads as evenly as possible across all cores, so they end up split across both sockets
- former might be useful if there’s a lot of shared-memory traffic between threads (keeps it within one socket’s caches)
- latter might be useful if there’s competition for memory bandwidth (each socket’s memory controllers serve fewer threads)
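The two combinations above can be sketched as environment settings at launch time; `./app` is a placeholder binary, and this is a launch-fragment sketch rather than a complete job script:

```shell
# "close": pack threads onto consecutive cores (socket 0 first)
OMP_PLACES=cores OMP_PROC_BIND=close ./app

# "spread": distribute threads as evenly as possible across all cores
OMP_PLACES=cores OMP_PROC_BIND=spread ./app
```

With GCC's libgomp, `OMP_DISPLAY_ENV=true` will echo the settings actually in effect, which is a cheap way to confirm they reached the runtime.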
Linux core numbering in a NUMA/SMT node
(Empirical observation; not a rule guaranteed across vendors, BIOSes, etc.)
Cores 0–7: socket 0
Cores 8–15: socket 1
Cores 16–23: socket 0
Cores 24–31: socket 1
So if you want to avoid hyperthreading, just use core #’s 0–15; cores 16–31 are the SMT siblings (16 is hyperthreaded with 0, 17 with 1, 18 with 2, and so on…).
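Since this numbering is only an empirical pattern, it's worth verifying on each box rather than assuming it. Sysfs exposes the SMT pairing directly (under the numbering above, cpu0's siblings would read "0,16"):

```shell
# List each hardware thread and its SMT siblings as the kernel sees them.
for c in /sys/devices/system/cpu/cpu[0-9]*; do
  printf '%s: ' "${c##*/}"
  cat "$c/topology/thread_siblings_list"
done
```

`lscpu --extended` presents the same CPU/core/socket mapping in table form if util-linux is installed.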
If you just want to use the 16 physical cores and bind to them, pass `--bind-to none` to mpirun so Open MPI imposes no binding of its own, and let the OpenMP runtime do the pinning via OMP_PLACES/OMP_PROC_BIND. Note that `--bind-to none` by itself binds nothing; it is the OpenMP settings that actually bind the threads to distinct physical cores, preventing the OS from moving them subsequently.
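A sketch of that launch, assuming the core numbering above and a placeholder binary `./app` (a launch fragment, not a tested recipe):

```shell
# Open MPI stays out of the way (--bind-to none); OpenMP pins one thread
# per physical core. OMP_PLACES="{0}:16" expands to {0},{1},...,{15},
# i.e. sixteen single-core places covering the physical cores only.
OMP_NUM_THREADS=16 OMP_PLACES="{0}:16" OMP_PROC_BIND=close \
  mpirun -np 1 --bind-to none ./app
```

For a non-MPI run, `taskset -c 0-15 ./app` similarly restricts the whole process to the physical cores, though it constrains the affinity mask rather than pinning individual threads.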