1 .\" Copyright 2003,2004 Andi Kleen, SuSE Labs.
2 .\" and Copyright 2007 Lee Schermerhorn, Hewlett Packard
3 .\"
4 .\" %%%LICENSE_START(VERBATIM_PROF)
5 .\" Permission is granted to make and distribute verbatim copies of this
6 .\" manual provided the copyright notice and this permission notice are
7 .\" preserved on all copies.
8 .\"
9 .\" Permission is granted to copy and distribute modified versions of this
10 .\" manual under the conditions for verbatim copying, provided that the
11 .\" entire resulting derived work is distributed under the terms of a
12 .\" permission notice identical to this one.
13 .\"
14 .\" Since the Linux kernel and libraries are constantly changing, this
15 .\" manual page may be incorrect or out-of-date. The author(s) assume no
16 .\" responsibility for errors or omissions, or for damages resulting from
17 .\" the use of the information contained herein.
18 .\"
19 .\" Formatted or processed versions of this manual, if unaccompanied by
20 .\" the source, must acknowledge the copyright and authors of this work.
21 .\" %%%LICENSE_END
22 .\"
23 .\" 2006-02-03, mtk, substantial wording changes and other improvements
24 .\" 2007-08-27, Lee Schermerhorn <Lee.Schermerhorn@hp.com>
25 .\" more precise specification of behavior.
26 .\"
27 .TH SET_MEMPOLICY 2 2015-05-07 Linux "Linux Programmer's Manual"
28 .SH NAME
29 set_mempolicy \- set default NUMA memory policy for a thread and its children
30 .SH SYNOPSIS
31 .nf
32 .B "#include <numaif.h>"
33 .sp
34 .BI "long set_mempolicy(int " mode ", const unsigned long *" nodemask ,
35 .BI " unsigned long " maxnode );
36 .sp
37 Link with \fI\-lnuma\fP.
38 .fi
39 .SH DESCRIPTION
40 .BR set_mempolicy ()
41 sets the NUMA memory policy of the calling thread,
42 which consists of a policy mode and zero or more nodes,
43 to the values specified by the
44 .IR mode ,
45 .I nodemask
46 and
47 .I maxnode
48 arguments.
49
50 A NUMA machine has different
51 memory controllers with different distances to specific CPUs.
52 The memory policy defines from which node memory is allocated for
53 the thread.
54
55 This system call defines the default policy for the thread.
56 The thread policy governs allocation of pages in the process's
57 address space outside of memory ranges
58 controlled by a more specific policy set by
59 .BR mbind (2).
60 The thread default policy also controls allocation of any pages for
61 memory-mapped files mapped using the
62 .BR mmap (2)
63 call with the
64 .B MAP_PRIVATE
65 flag and that are only read (loaded) by the thread
66 and of memory-mapped files mapped using the
67 .BR mmap (2)
68 call with the
69 .B MAP_SHARED
70 flag, regardless of the access type.
71 The policy is applied only when a new page is allocated
72 for the thread.
73 For anonymous memory this is when the page is first
74 touched by the thread.
75
76 The
77 .I mode
78 argument must specify one of
79 .BR MPOL_DEFAULT ,
80 .BR MPOL_BIND ,
81 .BR MPOL_INTERLEAVE ,
82 .BR MPOL_PREFERRED ,
83 or
84 .BR MPOL_LOCAL
85 (which are described in detail below).
86 The
87 .BR MPOL_BIND " and " MPOL_INTERLEAVE
88 modes require the caller to specify one or more nodes via the
89 .I nodemask
90 argument; the other modes accept or require an empty set, as described below.
91
92 The
93 .I mode
94 argument may also include an optional
95 .IR "mode flag" .
96 The supported
97 .I "mode flags"
98 are:
99 .TP
100 .BR MPOL_F_STATIC_NODES " (since Linux 2.6.26)"
101 A nonempty
102 .I nodemask
103 specifies physical node IDs.
104 Linux will not remap the
105 .I nodemask
106 when the process moves to a different cpuset context,
107 nor when the set of nodes allowed by the process's
108 current cpuset context changes.
109 .TP
110 .BR MPOL_F_RELATIVE_NODES " (since Linux 2.6.26)"
111 A nonempty
112 .I nodemask
113 specifies node IDs that are relative to the set of
114 node IDs allowed by the process's current cpuset.
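.PP
As a minimal sketch of how a mode flag is combined with a policy mode,
the hypothetical helper below ORs the flag bits into the mode value and
refuses the combination of both flags, which the ERRORS section lists
as an EINVAL condition.
The helper name is an assumption made for illustration;
the constant values are repeated from the kernel headers so that the
fragment compiles without the NUMA headers installed.

```c
#include <assert.h>

/* Values as in <linux/mempolicy.h>, repeated so no NUMA headers are needed. */
#ifndef MPOL_BIND
#define MPOL_BIND 2
#endif
#ifndef MPOL_INTERLEAVE
#define MPOL_INTERLEAVE 3
#endif
#ifndef MPOL_F_STATIC_NODES
#define MPOL_F_STATIC_NODES (1 << 15)
#endif
#ifndef MPOL_F_RELATIVE_NODES
#define MPOL_F_RELATIVE_NODES (1 << 14)
#endif

/*
 * Hypothetical helper: build the value passed as the mode argument.
 * Returns the combined value, or -1 if both mode flags are requested
 * at once (documented as an EINVAL condition).
 */
static int mempolicy_mode(int base_mode, int flags)
{
    const int both = MPOL_F_STATIC_NODES | MPOL_F_RELATIVE_NODES;

    /* Only the two documented mode flags are accepted by this sketch. */
    assert((flags & ~both) == 0);

    if ((flags & both) == both)
        return -1;
    return base_mode | flags;
}
```

The result of mempolicy_mode(MPOL_BIND, MPOL_F_STATIC_NODES) would then
be passed as the first argument of the system call.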
115 .PP
116 .I nodemask
117 points to a bit mask of node IDs that contains up to
118 .I maxnode
119 bits.
120 The bit mask size is rounded to the next multiple of
121 .IR "sizeof(unsigned long)" ,
122 but the kernel will use bits only up to
123 .IR maxnode .
124 A NULL value of
125 .I nodemask
126 or a
127 .I maxnode
128 value of zero specifies the empty set of nodes.
129 If the value of
130 .I maxnode
131 is zero,
132 the
133 .I nodemask
134 argument is ignored.
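.PP
The bit mask layout described above can be sketched with a few helpers
that treat the mask as an array of unsigned long values.
All function names here are hypothetical and chosen for illustration;
the fragment only manipulates the mask and does not call the kernel.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Number of bits held by one element of the nodemask array. */
#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Number of unsigned long elements needed to hold maxnode bits. */
static size_t nodemask_longs(unsigned long maxnode)
{
    return (maxnode + BITS_PER_ULONG - 1) / BITS_PER_ULONG;
}

/* Clear every bit of a mask that holds maxnode bits. */
static void nodemask_clear(unsigned long *mask, unsigned long maxnode)
{
    memset(mask, 0, nodemask_longs(maxnode) * sizeof(unsigned long));
}

/* Add a node to the mask; the node ID must be below maxnode. */
static void nodemask_set(unsigned long *mask, unsigned long maxnode,
                         unsigned long node)
{
    assert(node < maxnode);
    mask[node / BITS_PER_ULONG] |= 1UL << (node % BITS_PER_ULONG);
}

/* Return 1 if the node is in the mask, 0 otherwise. */
static int nodemask_isset(const unsigned long *mask, unsigned long maxnode,
                          unsigned long node)
{
    if (node >= maxnode)
        return 0;
    return (int)((mask[node / BITS_PER_ULONG] >> (node % BITS_PER_ULONG)) & 1UL);
}
```

A caller would typically declare an array of nodemask_longs(maxnode)
elements, clear it, set the wanted node IDs, and pass the array together
with maxnode to the system call.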
135
136 Where a
137 .I nodemask
138 is required, it must contain at least one node that is on-line,
139 allowed by the process's current cpuset context,
140 (unless the
141 .B MPOL_F_STATIC_NODES
142 mode flag is specified),
143 and contains memory.
144 If the
145 .B MPOL_F_STATIC_NODES
146 flag is set in
147 .I mode
148 and a required
149 .I nodemask
150 contains no nodes that are allowed by the process's current cpuset context,
151 the memory policy reverts to
152 .IR "local allocation" .
153 This effectively overrides the specified policy until the process's
154 cpuset context includes one or more of the nodes specified by
155 .IR nodemask .
156
157 The
158 .B MPOL_DEFAULT
159 mode specifies that any nondefault thread memory policy be removed,
160 so that the memory policy "falls back" to the system default policy.
161 The system default policy is "local allocation"\(emthat is,
162 allocate memory on the node of the CPU that triggered the allocation.
163 .I nodemask
164 must be specified as NULL.
165 If the "local node" contains no free memory, the system will
166 attempt to allocate memory from a nearby node.
167
168 The
169 .B MPOL_BIND
170 mode defines a strict policy that restricts memory allocation to the
171 nodes specified in
172 .IR nodemask .
173 If
174 .I nodemask
175 specifies more than one node, page allocations will come from
176 the node with the lowest numeric node ID first, until that node
177 contains no free memory.
178 Allocations will then come from the node with the next highest
179 node ID specified in
180 .I nodemask
181 and so forth, until none of the specified nodes contain free memory.
182 Pages will not be allocated from any node not specified in the
183 .IR nodemask .
184
185 .B MPOL_INTERLEAVE
186 interleaves page allocations across the nodes specified in
187 .I nodemask
188 in numeric node ID order.
189 This optimizes for bandwidth instead of latency
190 by spreading out pages and memory accesses to those pages across
191 multiple nodes.
192 However, accesses to a single page will still be limited to
193 the memory bandwidth of a single node.
194 .\" NOTE: the following sentence doesn't make sense in the context
195 .\" of set_mempolicy() -- no memory area specified.
196 .\" To be effective the memory area should be fairly large,
197 .\" at least 1MB or bigger.
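.PP
To make the interleaving order concrete, the following toy model maps
a page index to a node by walking the set bits of the mask in numeric
node ID order, round-robin.
It is a sketch of the ordering described above, not the kernel's
implementation; the function name and the idea of a global page index
are assumptions made for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Toy model: return the node that would receive the page with the given
 * index (counting from 0) if pages were handed out round-robin over the
 * nodes in the mask, in ascending node ID order.
 * Returns -1 if the mask contains no nodes.
 */
static long interleave_node(const unsigned long *mask, unsigned long maxnode,
                            unsigned long page_index)
{
    const unsigned long bits = sizeof(unsigned long) * CHAR_BIT;
    unsigned long node, count = 0, target;

    assert(mask != NULL);

    /* Count the nodes present in the mask. */
    for (node = 0; node < maxnode; node++)
        if ((mask[node / bits] >> (node % bits)) & 1UL)
            count++;
    if (count == 0)
        return -1;

    /* Pick the (page_index mod count)-th set bit. */
    target = page_index % count;
    for (node = 0; node < maxnode; node++) {
        if ((mask[node / bits] >> (node % bits)) & 1UL) {
            if (target == 0)
                return (long)node;
            target--;
        }
    }
    return -1; /* not reached when count > 0 */
}
```

With nodes 0, 1 and 3 in the mask, successive page indexes map to
nodes 0, 1, 3, 0, 1, 3 and so on in this model.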
198
199 .B MPOL_PREFERRED
200 sets the preferred node for allocation.
201 The kernel will try to allocate pages from this node first
202 and fall back to nearby nodes if the preferred node is low on free
203 memory.
204 If
205 .I nodemask
206 specifies more than one node ID, the first node in the
207 mask will be selected as the preferred node.
208 If the
209 .I nodemask
210 and
211 .I maxnode
212 arguments specify the empty set, then the policy
213 specifies "local allocation"
214 (like the system default policy discussed above).
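.PP
The selection rule for the preferred node can be sketched as a search
for the lowest set bit of the mask, with an empty set standing for
local allocation.
The function name and the use of -1 to denote local allocation are
assumptions made for this illustration.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Sketch of the rule described above: the first (lowest numbered) node
 * in the mask is the preferred node.  A NULL mask, a maxnode of zero, or
 * a mask with no bits set is the empty set; -1 is returned in that case
 * to stand for local allocation.
 */
static long preferred_node(const unsigned long *mask, unsigned long maxnode)
{
    const unsigned long bits = sizeof(unsigned long) * CHAR_BIT;
    unsigned long node;

    if (mask == NULL)
        return -1;

    for (node = 0; node < maxnode; node++)
        if ((mask[node / bits] >> (node % bits)) & 1UL)
            return (long)node;

    return -1;
}
```

For a mask containing nodes 2 and 3, the sketch yields node 2.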
215
216 .BR MPOL_LOCAL " (since Linux 3.8)"
217 .\" commit 479e2802d09f1e18a97262c4c6f8f17ae5884bd8
218 .\" commit f2a07f40dbc603c15f8b06e6ec7f768af67b424f
219 specifies "local allocation"; the memory is allocated on
220 the node of the CPU that triggered the allocation (the "local node").
221 The
222 .I nodemask
223 and
224 .I maxnode
225 arguments must specify the empty set.
226 If the "local node" is low on free memory,
227 the kernel will try to allocate memory from other nodes.
228 The kernel will allocate memory from the "local node"
229 whenever memory for this node is available.
230 If the "local node" is not allowed by the process's current cpuset context,
231 the kernel will try to allocate memory from other nodes.
232 The kernel will allocate memory from the "local node" whenever
233 it becomes allowed by the process's current cpuset context.
234
235 The thread memory policy is preserved across an
236 .BR execve (2),
237 and is inherited by child threads created using
238 .BR fork (2)
239 or
240 .BR clone (2).
241 .SH RETURN VALUE
242 On success,
243 .BR set_mempolicy ()
244 returns 0;
245 on error, \-1 is returned and
246 .I errno
247 is set to indicate the error.
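.PP
The return convention can be exercised with a thin wrapper around the
raw system call.
This is a sketch under stated assumptions: it uses syscall(2) with
SYS_set_mempolicy instead of the libnuma wrapper so that it builds
without the NUMA headers, and the wrapper name is hypothetical.
On kernels without NUMA support or inside restricted sandboxes the call
may fail with an error that this page does not list.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for the syscall() declaration */
#endif
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Value as in <linux/mempolicy.h>, repeated so no NUMA headers are needed. */
#ifndef MPOL_DEFAULT
#define MPOL_DEFAULT 0
#endif

/*
 * Hypothetical wrapper: issue the raw system call and translate the
 * "-1 and errno" convention into a single return value.
 * Returns 0 on success, or the errno value on failure.
 */
static int try_set_mempolicy(int mode, const unsigned long *nodemask,
                             unsigned long maxnode)
{
    if (syscall(SYS_set_mempolicy, mode, nodemask, maxnode) == -1) {
        int err = errno; /* save before calling anything else */

        fprintf(stderr, "set_mempolicy: %s\n", strerror(err));
        return err;
    }
    return 0;
}
```

Calling try_set_mempolicy(MPOL_DEFAULT, NULL, 0) asks the kernel to drop
any nondefault thread policy; a nonzero result is the errno value to
compare against the ERRORS section.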
248 .SH ERRORS
249 .TP
250 .B EFAULT
251 Part or all of the memory range specified by
252 .I nodemask
253 and
254 .I maxnode
255 points outside your accessible address space.
256 .TP
257 .B EINVAL
258 .I mode
259 is invalid.
260 Or,
261 .I mode
262 is
263 .B MPOL_DEFAULT
264 and
265 .I nodemask
266 is nonempty,
267 or
268 .I mode
269 is
270 .B MPOL_BIND
271 or
272 .B MPOL_INTERLEAVE
273 and
274 .I nodemask
275 is empty.
276 Or,
277 .I maxnode
278 specifies more than a page worth of bits.
279 Or,
280 .I nodemask
281 specifies one or more node IDs that are
282 greater than the maximum supported node ID.
283 Or, none of the node IDs specified by
284 .I nodemask
285 are on-line and allowed by the process's current cpuset context,
286 or none of the specified nodes contain memory.
287 Or, the
288 .I mode
289 argument specified both
290 .B MPOL_F_STATIC_NODES
291 and
292 .BR MPOL_F_RELATIVE_NODES .
293 .TP
294 .B ENOMEM
295 Insufficient kernel memory was available.
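.PP
Some of the EINVAL conditions listed above depend only on the arguments
and can be checked before the call is made.
The following sketch mirrors those conditions as this page states them,
together with the empty-set requirement given for MPOL_LOCAL in the
DESCRIPTION; it does not reproduce every check the kernel performs
(for example, it knows nothing about on-line nodes or cpusets).
The function names are hypothetical, and the constant values are
repeated from the kernel headers.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/* Values as in <linux/mempolicy.h>, repeated so no NUMA headers are needed. */
#ifndef MPOL_DEFAULT
#define MPOL_DEFAULT 0
#define MPOL_PREFERRED 1
#define MPOL_BIND 2
#define MPOL_INTERLEAVE 3
#define MPOL_LOCAL 4
#endif
#ifndef MPOL_F_STATIC_NODES
#define MPOL_F_STATIC_NODES (1 << 15)
#define MPOL_F_RELATIVE_NODES (1 << 14)
#endif

/* Return 1 if the arguments describe the empty set of nodes, else 0. */
static int nodemask_is_empty(const unsigned long *mask, unsigned long maxnode)
{
    const unsigned long bits = sizeof(unsigned long) * CHAR_BIT;
    unsigned long node;

    if (mask == NULL)
        return 1;
    for (node = 0; node < maxnode; node++)
        if ((mask[node / bits] >> (node % bits)) & 1UL)
            return 0;
    return 1;
}

/* Return 0 if the arguments pass the documented checks, else EINVAL. */
static int check_mempolicy_args(int mode, const unsigned long *mask,
                                unsigned long maxnode)
{
    const int flag_bits = MPOL_F_STATIC_NODES | MPOL_F_RELATIVE_NODES;
    const int empty = nodemask_is_empty(mask, maxnode);

    /* Both mode flags together are rejected. */
    if ((mode & flag_bits) == flag_bits)
        return EINVAL;

    switch (mode & ~flag_bits) {
    case MPOL_DEFAULT:
    case MPOL_LOCAL:
        return empty ? 0 : EINVAL;   /* these modes take the empty set */
    case MPOL_BIND:
    case MPOL_INTERLEAVE:
        return empty ? EINVAL : 0;   /* these modes need at least one node */
    case MPOL_PREFERRED:
        return 0;                    /* empty set means local allocation */
    default:
        return EINVAL;               /* unknown mode */
    }
}
```

A zero result only means that these argument checks passed; the call
itself can still fail for the other reasons listed in this section.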
296 .SH VERSIONS
297 The
298 .BR set_mempolicy ()
299 system call was added to the Linux kernel in version 2.6.7.
300 .SH CONFORMING TO
301 This system call is Linux-specific.
302 .SH NOTES
303 Memory policy is not remembered if the page is swapped out.
304 When such a page is paged back in, it will use the policy of
305 the thread or memory range that is in effect at the time the
306 page is allocated.
307
308 For information on library support, see
309 .BR numa (7).
310 .SH SEE ALSO
311 .BR get_mempolicy (2),
312 .BR getcpu (2),
313 .BR mbind (2),
314 .BR mmap (2),
315 .BR numa (3),
316 .BR cpuset (7),
317 .BR numa (7),
318 .BR numactl (8)