.\" SPDX-License-Identifier: Linux-man-pages-copyleft-var
.\"
.\" Copyright 2003,2004 Andi Kleen, SuSE Labs.
.\" and Copyright 2007 Lee Schermerhorn, Hewlett Packard
.\"
.\" 2006-02-03, mtk, substantial wording changes and other improvements
.\" 2007-08-27, Lee Schermerhorn <Lee.Schermerhorn@hp.com>
.\" more precise specification of behavior.
.\"
.TH set_mempolicy 2 (date) "Linux man-pages (unreleased)"
.SH NAME
set_mempolicy \- set default NUMA memory policy for a thread and its children
.SH LIBRARY
NUMA (Non-Uniform Memory Access) policy library
.RI ( libnuma ", " \-lnuma )
.SH SYNOPSIS
.nf
.B "#include <numaif.h>"
.P
.BI "long set_mempolicy(int " mode ", const unsigned long *" nodemask ,
.BI "                   unsigned long " maxnode );
.fi
.SH DESCRIPTION
.BR set_mempolicy ()
sets the NUMA memory policy of the calling thread,
which consists of a policy mode and zero or more nodes,
to the values specified by the
.IR mode ,
.IR nodemask ,
and
.I maxnode
arguments.
.P
A NUMA machine has different
memory controllers with different distances to specific CPUs.
The memory policy defines from which node memory is allocated for
the thread.
.P
This system call defines the default policy for the thread.
The thread policy governs allocation of pages in the process's
address space outside of memory ranges
controlled by a more specific policy set by
.BR mbind (2).
The thread default policy also controls allocation of any pages for
memory-mapped files mapped using the
.BR mmap (2)
call with the
.B MAP_PRIVATE
flag that the thread only reads (loads) from,
and for memory-mapped files mapped using the
.BR mmap (2)
call with the
.B MAP_SHARED
flag, regardless of the access type.
The policy is applied only when a new page is allocated
for the thread.
For anonymous memory this is when the page is first
touched by the thread.
.P
The
.I mode
argument must specify one of
.BR MPOL_DEFAULT ,
.BR MPOL_BIND ,
.BR MPOL_INTERLEAVE ,
.BR MPOL_PREFERRED ,
or
.B MPOL_LOCAL
(which are described in detail below).
All modes except
.B MPOL_DEFAULT
and
.B MPOL_LOCAL
require the caller to specify the node or nodes to which the mode applies,
via the
.I nodemask
argument.
.P
The
.I mode
argument may also include an optional
.IR "mode flag" .
The supported
.I "mode flags"
are:
.TP
.BR MPOL_F_NUMA_BALANCING " (since Linux 5.12)"
.\" commit bda420b985054a3badafef23807c4b4fa38a3dff
When
.I mode
is
.BR MPOL_BIND ,
enable kernel NUMA balancing for the task if it is supported by the kernel.
If the flag isn't supported by the kernel, or is used with a
.I mode
other than
.BR MPOL_BIND ,
\-1 is returned and
.I errno
is set to
.BR EINVAL .
.TP
.BR MPOL_F_RELATIVE_NODES " (since Linux 2.6.26)"
A nonempty
.I nodemask
specifies node IDs that are relative to the
set of node IDs allowed by the process's current cpuset.
.TP
.BR MPOL_F_STATIC_NODES " (since Linux 2.6.26)"
A nonempty
.I nodemask
specifies physical node IDs.
Linux will not remap the
.I nodemask
when the process moves to a different cpuset context,
nor when the set of nodes allowed by the process's
current cpuset context changes.
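.P
As a rough illustration of how a mode flag is combined with a mode,
the following C sketch ORs the flag bits into the mode value.
The helper
.I compose_mode
is hypothetical and introduced only for this example;
the sketch assumes that the
.B MPOL_*
constants are taken from the kernel header
.I <linux/mempolicy.h>
rather than from
.IR <numaif.h> .

```c
#include <assert.h>
#include <linux/mempolicy.h>   /* MPOL_* modes and MPOL_F_* mode flags */

/*
 * Hypothetical helper: OR a set of mode flags into a policy mode.
 * Returns the combined value, or -1 if both MPOL_F_STATIC_NODES and
 * MPOL_F_RELATIVE_NODES are requested, a combination that
 * set_mempolicy() rejects with EINVAL.
 */
static inline int compose_mode(int mode, int flags)
{
    const int exclusive = MPOL_F_STATIC_NODES | MPOL_F_RELATIVE_NODES;

    if ((flags & exclusive) == exclusive)
        return -1;
    return mode | flags;
}
```

.P
A caller might pass the result of
.I "compose_mode(MPOL_BIND, MPOL_F_STATIC_NODES)"
as the
.I mode
argument after checking that it is not \-1.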
.P
.I nodemask
points to a bit mask of node IDs that contains up to
.I maxnode
bits.
The bit mask size is rounded to the next multiple of
.IR "sizeof(unsigned long)" ,
but the kernel will use bits only up to
.IR maxnode .
A NULL value of
.I nodemask
or a
.I maxnode
value of zero specifies the empty set of nodes.
If the value of
.I maxnode
is zero,
the
.I nodemask
argument is ignored.
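.P
The bit-mask layout described above can be sketched in C as follows.
The names
.IR BITS_PER_WORD ,
.IR NODEMASK_WORDS ,
.IR nodemask_set ,
and
.I nodemask_isset
are hypothetical helpers introduced for this example
and are not part of any library.
The sketch assumes that node
.I n
corresponds to bit
.I n
of the mask, counting from the least significant bit of the first word.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Number of bits in one unsigned long word of the mask. */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: words needed to hold maxnode bits, rounded up. */
#define NODEMASK_WORDS(maxnode) (((maxnode) + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical helper: mark a node as present in the mask. */
static inline void nodemask_set(unsigned long *mask, unsigned long maxnode,
                                unsigned long node)
{
    assert(node < maxnode);   /* bits at or beyond maxnode are not used */
    mask[node / BITS_PER_WORD] |= 1UL << (node % BITS_PER_WORD);
}

/* Hypothetical helper: report whether a node is present in the mask. */
static inline int nodemask_isset(const unsigned long *mask, unsigned long maxnode,
                                 unsigned long node)
{
    if (node >= maxnode)
        return 0;
    return (int) ((mask[node / BITS_PER_WORD] >> (node % BITS_PER_WORD)) & 1UL);
}
```

.P
An array declared as
.I "unsigned long mask[NODEMASK_WORDS(maxnode)]"
and initialized to zero represents the empty set;
each call to
.I nodemask_set
then adds one node ID to it.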
.P
Where a
.I nodemask
is required, it must contain at least one node that is on-line,
allowed by the process's current cpuset context
(unless the
.B MPOL_F_STATIC_NODES
mode flag is specified),
and contains memory.
If the
.B MPOL_F_STATIC_NODES
flag is set in
.I mode
and a required
.I nodemask
contains no nodes that are allowed by the process's current cpuset context,
the memory policy reverts to
.IR "local allocation" .
This effectively overrides the specified policy until the process's
cpuset context includes one or more of the nodes specified by
.IR nodemask .
.P
The
.I mode
argument must include one of the following values:
.TP
.B MPOL_DEFAULT
This mode specifies that any nondefault thread memory policy be removed,
so that the memory policy "falls back" to the system default policy.
The system default policy is "local allocation"; that is,
allocate memory on the node of the CPU that triggered the allocation.
If the "local node" contains no free memory, the system will
attempt to allocate memory from a nearby node.
.I nodemask
must be specified as NULL.
.TP
.B MPOL_BIND
This mode defines a strict policy that restricts memory allocation to the
nodes specified in
.IR nodemask .
If
.I nodemask
specifies more than one node, page allocations will come from
the node with the lowest numeric node ID first, until that node
contains no free memory.
Allocations will then come from the node with the next highest
node ID specified in
.I nodemask
and so forth, until none of the specified nodes contain free memory.
Pages will not be allocated from any node not specified in the
.IR nodemask .
.TP
.B MPOL_INTERLEAVE
This mode interleaves page allocations across the nodes specified in
.I nodemask
in numeric node ID order.
This optimizes for bandwidth instead of latency
by spreading out pages and memory accesses to those pages across
multiple nodes.
However, accesses to a single page will still be limited to
the memory bandwidth of a single node.
.\" NOTE: the following sentence doesn't make sense in the context
.\" of set_mempolicy() -- no memory area specified.
.\" To be effective the memory area should be fairly large,
.\" at least 1 MB or bigger.
.TP
.B MPOL_PREFERRED
This mode sets the preferred node for allocation.
The kernel will try to allocate pages from this node first
and fall back to nearby nodes if the preferred node is low on free
memory.
If
.I nodemask
specifies more than one node ID, the first node in the
mask will be selected as the preferred node.
If the
.I nodemask
and
.I maxnode
arguments specify the empty set, then the policy
specifies "local allocation"
(like the system default policy discussed above).
.TP
.BR MPOL_LOCAL " (since Linux 3.8)"
.\" commit 479e2802d09f1e18a97262c4c6f8f17ae5884bd8
.\" commit f2a07f40dbc603c15f8b06e6ec7f768af67b424f
This mode specifies "local allocation"; the memory is allocated on
the node of the CPU that triggered the allocation (the "local node").
The
.I nodemask
and
.I maxnode
arguments must specify the empty set.
If the "local node" is low on free memory,
the kernel will try to allocate memory from other nodes.
The kernel will allocate memory from the "local node"
whenever memory for this node is available.
If the "local node" is not allowed by the process's current cpuset context,
the kernel will try to allocate memory from other nodes.
The kernel will allocate memory from the "local node" whenever
it becomes allowed by the process's current cpuset context.
.P
The thread memory policy is preserved across an
.BR execve (2),
and is inherited by child threads created using
.BR fork (2)
or
.BR clone (2).
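.P
The modes described above might be requested as in the following
minimal C sketch.
It is an illustration under stated assumptions, not a reference implementation:
it invokes the system call through
.BR syscall (2)
so that it does not need
.I libnuma
at link time,
it assumes that
.I <sys/syscall.h>
defines
.B SYS_set_mempolicy
and that
.I <linux/mempolicy.h>
supplies the
.B MPOL_*
constants,
and the helper names
.IR raw_set_mempolicy ,
.IR reset_policy ,
and
.I interleave_nodes_0_1
are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <linux/mempolicy.h>   /* MPOL_* constants */
#include <sys/syscall.h>       /* SYS_set_mempolicy */
#include <unistd.h>            /* syscall() */

/* Hypothetical wrapper: call set_mempolicy() without linking libnuma. */
static inline long raw_set_mempolicy(int mode, const unsigned long *nodemask,
                                     unsigned long maxnode)
{
    return syscall(SYS_set_mempolicy, mode, nodemask, maxnode);
}

/* Fall back to the system default policy; MPOL_DEFAULT takes a NULL nodemask. */
static inline long reset_policy(void)
{
    return raw_set_mempolicy(MPOL_DEFAULT, NULL, 0);
}

/* Request interleaving across nodes 0 and 1 (assumed here to be of interest). */
static inline long interleave_nodes_0_1(void)
{
    unsigned long mask = (1UL << 0) | (1UL << 1);

    /* maxnode spans every bit of the single-word mask. */
    return raw_set_mempolicy(MPOL_INTERLEAVE, &mask, sizeof(mask) * CHAR_BIT);
}
```

.P
Either helper returns 0 on success, or \-1 with
.I errno
set on failure,
for example when the calling environment does not permit the call.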
.SH RETURN VALUE
On success,
.BR set_mempolicy ()
returns 0;
on error, \-1 is returned and
.I errno
is set to indicate the error.
.SH ERRORS
.TP
.B EFAULT
Part or all of the memory range specified by
.I nodemask
and
.I maxnode
points outside your accessible address space.
.TP
.B EINVAL
.I mode
is invalid.
Or,
.I mode
is
.B MPOL_DEFAULT
and
.I nodemask
is nonempty,
or
.I mode
is
.B MPOL_BIND
or
.B MPOL_INTERLEAVE
and
.I nodemask
is empty.
Or,
.I maxnode
specifies more than a page worth of bits.
Or,
.I nodemask
specifies one or more node IDs that are
greater than the maximum supported node ID.
Or, none of the node IDs specified by
.I nodemask
are on-line and allowed by the process's current cpuset context,
or none of the specified nodes contain memory.
Or, the
.I mode
argument specified both
.B MPOL_F_STATIC_NODES
and
.BR MPOL_F_RELATIVE_NODES .
Or, the
.B MPOL_F_NUMA_BALANCING
flag isn't supported by the kernel, or is used with a
.I mode
other than
.BR MPOL_BIND .
.TP
.B ENOMEM
Insufficient kernel memory was available.
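.P
A caller could turn the error codes listed above into short diagnostics,
as in the following C sketch.
The helper
.I mempolicy_error_hint
and its message strings are hypothetical and introduced only for this example;
values not listed in this section are passed to
.BR strerror (3).

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical helper: map an errno value from set_mempolicy() to a hint. */
static inline const char *mempolicy_error_hint(int err)
{
    switch (err) {
    case EFAULT:
        return "nodemask points outside the accessible address space";
    case EINVAL:
        return "invalid mode, mode flag, nodemask, or maxnode";
    case ENOMEM:
        return "insufficient kernel memory";
    default:
        return strerror(err);   /* errors not listed in this section */
    }
}
```

.P
The helper is meant to be called with the value of
.I errno
saved immediately after a failed call.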
.SH STANDARDS
Linux.
.SH HISTORY
Linux 2.6.7.
.SH NOTES
Memory policy is not remembered if the page is swapped out.
When such a page is paged back in, it will use the policy of
the thread or memory range that is in effect at the time the
page is allocated.
.P
For information on library support, see
.BR numa (7).
.SH SEE ALSO
.BR get_mempolicy (2),
.BR getcpu (2),
.BR mbind (2),
.BR mmap (2),
.BR numa (3),
.BR cpuset (7),
.BR numa (7),
.BR numactl (8)
323 .BR cpuset (7),
324 .BR numa (7),
325 .BR numactl (8)