In this application I have groups of N (POSIX) threads. The first group starts up, creates an object A, and winds down. A little bit later a new group with N threads starts up, uses A to create a similar object B, and winds down. This pattern is repeated. The application is highly memory-intensive (A and B have a large number of malloc'ed arrays). I would like local access to memory as much as possible. I can use numactl --localalloc to achieve this, but in order for this to work I also need to make sure that those threads from the first and second group that work on the same data are bound to the same NUMA node. I've looked into sched_setaffinity, but wonder if better approaches exist.
The logic of the application is such that a solution without separate thread groups would tear apart the program logic. That is, a solution where a single group of threads manages first object A and later object B (without winding down in between) would be extremely contrived and obliterate the object-oriented layout of the code.
Binding threads in group B to the same cores that they ran on in group A is more restrictive than what you need. Modern processors use dedicated level 1 cache (L1) and level 2 cache (L2) per core, so binding threads to a specific core makes sense only to get at data that is still "hot" in those caches. What you probably meant is binding group B threads to the same NUMA node as the threads in group A, so that the large arrays are in the same local memory.
That said, you have two choices:
Option (1) is relatively easy, so let's talk about how to implement option (2).
The following SO answer describes how to find out, given a virtual address in your process, which NUMA node that memory is local to:
Can I get the NUMA node from a pointer address (in C on Linux)?
There is a move_pages function in -lnuma: http://linux.die.net/man/2/move_pages which can report the current address (page) to node mappings:
nodes can also be NULL, in which case move_pages() does not move any pages but instead will return the node where each page currently resides, in the status array. Obtaining the status of each page may be necessary to determine pages that need to be moved.
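As a rough sketch of that query mode, the helper below asks the kernel which node currently holds the page containing a given address. The name node_of_address is made up for illustration. To keep the example free of libnuma headers it calls the raw system call; with libnuma installed, move_pages() from <numaif.h> takes the same arguments. It assumes Linux and that the page has already been touched, since an untouched page has no node yet.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: returns the NUMA node that holds the page containing
 * addr, or a negative errno value (for example -ENOENT if the page has not
 * been touched yet, -EFAULT if the address is not mapped). */
static int node_of_address(void *addr)
{
    long page_size = sysconf(_SC_PAGESIZE);
    /* move_pages expects page-aligned addresses. */
    void *page = (void *)((uintptr_t)addr & ~((uintptr_t)page_size - 1));
    int status = -1;

    /* pid 0 = this process; nodes == NULL = query only, move nothing. */
    long rc = syscall(SYS_move_pages, 0, 1UL, &page, NULL, &status, 0);
    if (rc < 0)
        return -errno;   /* call unavailable or not permitted */
    return status;       /* node number, or a negative per-page error */
}
```

For a large array it is probably enough to sample a few pages (say the first, middle, and last) rather than query every page, on the assumption that group A allocated and first touched the whole array from one node.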
Armed with that information, you want to set the affinity of your group B threads to that NUMA node. For how to do that, see this SO answer:
How to ensure that std::thread are created in multi core?
For GNU/Linux with POSIX threads you will want pthread_setaffinity_np(), in FreeBSD cpuset_setaffinity(), in Windows SetThreadAffinityMask(), etc.
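On Linux, the binding step could look roughly like the sketch below. It reads the node's CPU list from sysfs and hands it to pthread_setaffinity_np(). The helper names parse_cpulist and bind_self_to_node are hypothetical, and the sysfs path /sys/devices/system/node/nodeN/cpulist is assumed to be present, which it normally is on NUMA-enabled kernels.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: parse a kernel CPU list such as "0-3,8-11" into set.
 * Returns the number of CPUs added, or -1 on malformed input. */
static int parse_cpulist(const char *s, cpu_set_t *set)
{
    CPU_ZERO(set);
    while (*s && *s != '\n') {
        char *end;
        long lo = strtol(s, &end, 10);
        if (end == s)
            return -1;
        long hi = lo;
        if (*end == '-') {               /* range "lo-hi" */
            s = end + 1;
            hi = strtol(s, &end, 10);
            if (end == s)
                return -1;
        }
        if (lo < 0 || hi < lo || hi >= CPU_SETSIZE)
            return -1;
        for (long c = lo; c <= hi; c++)
            CPU_SET((int)c, set);
        if (*end == ',')
            s = end + 1;
        else if (*end == '\0' || *end == '\n')
            s = end;
        else
            return -1;                   /* unexpected character */
    }
    return CPU_COUNT(set);
}

/* Hypothetical helper: restrict the calling thread to the CPUs of one
 * NUMA node. Returns 0 on success, -1 on any failure. */
static int bind_self_to_node(int node)
{
    char path[64], line[4096];
    snprintf(path, sizeof path,
             "/sys/devices/system/node/node%d/cpulist", node);
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    char *ok = fgets(line, sizeof line, f);
    fclose(f);

    cpu_set_t set;
    if (!ok || parse_cpulist(line, &set) <= 0)
        return -1;                       /* unreadable, or node has no CPUs */
    return pthread_setaffinity_np(pthread_self(), sizeof set, &set) == 0 ? 0 : -1;
}
```

Each group B thread would call bind_self_to_node() at the top of its start routine, passing the node reported for the part of A it works on. Because the mask covers every core of the node, the scheduler can still move the thread between those cores. If libnuma is available, numa_run_on_node() does much the same thing in a single call.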