1 .\" Copyright 2003,2004 Andi Kleen, SuSE Labs.
2 .\" and Copyright 2007 Lee Schermerhorn, Hewlett Packard
3 .\"
4 .\" %%%LICENSE_START(VERBATIM_PROF)
5 .\" Permission is granted to make and distribute verbatim copies of this
6 .\" manual provided the copyright notice and this permission notice are
7 .\" preserved on all copies.
8 .\"
9 .\" Permission is granted to copy and distribute modified versions of this
10 .\" manual under the conditions for verbatim copying, provided that the
11 .\" entire resulting derived work is distributed under the terms of a
12 .\" permission notice identical to this one.
13 .\"
14 .\" Since the Linux kernel and libraries are constantly changing, this
15 .\" manual page may be incorrect or out-of-date. The author(s) assume no
16 .\" responsibility for errors or omissions, or for damages resulting from
17 .\" the use of the information contained herein.
18 .\"
19 .\" Formatted or processed versions of this manual, if unaccompanied by
20 .\" the source, must acknowledge the copyright and authors of this work.
21 .\" %%%LICENSE_END
22 .\"
23 .\" 2006-02-03, mtk, substantial wording changes and other improvements
24 .\" 2007-08-27, Lee Schermerhorn <Lee.Schermerhorn@hp.com>
25 .\" more precise specification of behavior.
26 .\"
27 .TH SET_MEMPOLICY 2 2015-05-07 Linux "Linux Programmer's Manual"
28 .SH NAME
29 set_mempolicy \- set default NUMA memory policy for a thread and its children
30 .SH SYNOPSIS
31 .nf
32 .B "#include <numaif.h>"
33 .sp
34 .BI "long set_mempolicy(int " mode ", const unsigned long *" nodemask ,
35 .BI " unsigned long " maxnode );
36 .sp
37 Link with \fI\-lnuma\fP.
38 .fi
39 .SH DESCRIPTION
40 .BR set_mempolicy ()
41 sets the NUMA memory policy of the calling thread,
42 which consists of a policy mode and zero or more nodes,
43 to the values specified by the
44 .IR mode ,
45 .I nodemask
46 and
47 .I maxnode
48 arguments.
49
50 A NUMA machine has different
51 memory controllers with different distances to specific CPUs.
52 The memory policy defines from which node memory is allocated for
53 the thread.
54
55 This system call defines the default policy for the thread.
56 The thread policy governs allocation of pages in the process's
57 address space outside of memory ranges
58 controlled by a more specific policy set by
59 .BR mbind (2).
60 The thread default policy also controls allocation of any pages for
61 memory-mapped files mapped using the
62 .BR mmap (2)
63 call with the
64 .B MAP_PRIVATE
65 flag and that are only read [loaded] from by the thread
66 and of memory-mapped files mapped using the
67 .BR mmap (2)
68 call with the
69 .B MAP_SHARED
70 flag, regardless of the access type.
71 The policy is applied only when a new page is allocated
72 for the thread.
73 For anonymous memory this is when the page is first
74 touched by the thread.
75
76 The
77 .I mode
78 argument must specify one of
79 .BR MPOL_DEFAULT ,
80 .BR MPOL_BIND ,
81 .BR MPOL_INTERLEAVE ,
82 .BR MPOL_PREFERRED ,
83 or
84 .BR MPOL_LOCAL .
85 The
86 .BR MPOL_BIND " and " MPOL_INTERLEAVE
87 modes require the caller to specify one or more nodes via the
88 .I nodemask
89 argument.
90
91 The
92 .I mode
93 argument may also include an optional
94 .IR "mode flag" .
95 The supported
96 .I "mode flags"
97 are:
98 .TP
99 .BR MPOL_F_STATIC_NODES " (since Linux 2.6.26)"
100 A nonempty
101 .I nodemask
102 specifies physical node IDs.
103 Linux will not remap the
104 .I nodemask
105 when the process moves to a different cpuset context,
106 nor when the set of nodes allowed by the process's
107 current cpuset context changes.
108 .TP
109 .BR MPOL_F_RELATIVE_NODES " (since Linux 2.6.26)"
110 A nonempty
111 .I nodemask
112 specifies node IDs that are relative to the set of
113 node IDs allowed by the process's current cpuset.
114 .PP
115 .I nodemask
116 points to a bit mask of node IDs that contains up to
117 .I maxnode
118 bits.
119 The bit mask size is rounded to the next multiple of
120 .IR "sizeof(unsigned long)" ,
121 but the kernel will use bits only up to
122 .IR maxnode .
123 A NULL value of
124 .I nodemask
125 or a
126 .I maxnode
127 value of zero specifies the empty set of nodes.
128 If the value of
129 .I maxnode
130 is zero,
131 the
132 .I nodemask
133 argument is ignored.
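The layout of the bit mask described above can be sketched in C as follows.
This is a minimal illustration, not part of any library: the helper names
(nodemask_words, nodemask_zero, nodemask_set, nodemask_isset) are hypothetical,
and the sketch assumes that bit N of the mask stands for node ID N and that
node IDs run from 0 to maxnode - 1, as the text above suggests.

```c
#include <assert.h>   /* assert */
#include <limits.h>   /* CHAR_BIT */
#include <stddef.h>   /* size_t, NULL */
#include <string.h>   /* memset */

/* Number of bits held by one word of the mask. */
#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: words needed to hold maxnode bits, rounded up. */
static size_t nodemask_words(unsigned long maxnode)
{
    return (maxnode + BITS_PER_ULONG - 1) / BITS_PER_ULONG;
}

/* Hypothetical helper: clear every word that backs a maxnode-bit mask. */
static void nodemask_zero(unsigned long *mask, unsigned long maxnode)
{
    assert(mask != NULL);
    memset(mask, 0, nodemask_words(maxnode) * sizeof(unsigned long));
}

/* Hypothetical helper: set the bit for one node ID.
 * Returns 0 on success, -1 if the node does not fit in maxnode bits. */
static int nodemask_set(unsigned long *mask, unsigned long maxnode,
                        unsigned long node)
{
    assert(mask != NULL);
    if (node >= maxnode)
        return -1;
    mask[node / BITS_PER_ULONG] |= 1UL << (node % BITS_PER_ULONG);
    return 0;
}

/* Hypothetical helper: 1 if the node's bit is set, 0 otherwise. */
static int nodemask_isset(const unsigned long *mask, unsigned long maxnode,
                          unsigned long node)
{
    assert(mask != NULL);
    if (node >= maxnode)
        return 0;
    return (mask[node / BITS_PER_ULONG] >> (node % BITS_PER_ULONG)) & 1UL;
}
```

A caller would typically declare an array of nodemask_words(maxnode) words,
clear it, set the wanted node IDs, and pass the array together with maxnode.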
134
135 Where a
136 .I nodemask
137 is required, it must contain at least one node that is on-line,
138 allowed by the process's current cpuset context,
139 [unless the
140 .B MPOL_F_STATIC_NODES
141 mode flag is specified],
142 and contains memory.
143 If the
144 .B MPOL_F_STATIC_NODES
145 flag is set in
146 .I mode
147 and a required
148 .I nodemask
149 contains no nodes that are allowed by the process's current cpuset context,
150 the memory policy reverts to
151 .IR "local allocation" .
152 This effectively overrides the specified policy until the process's
153 cpuset context includes one or more of the nodes specified by
154 .IR nodemask .
155
156 The
157 .B MPOL_DEFAULT
158 mode specifies that any nondefault thread memory policy be removed,
159 so that the memory policy "falls back" to the system default policy.
160 The system default policy is "local allocation"\(emthat is,
161 allocate memory on the node of the CPU that triggered the allocation.
162 .I nodemask
163 must be specified as NULL.
164 If the "local node" contains no free memory, the system will
165 attempt to allocate memory from a "nearby" node.
166
167 The
168 .B MPOL_BIND
169 mode defines a strict policy that restricts memory allocation to the
170 nodes specified in
171 .IR nodemask .
172 If
173 .I nodemask
174 specifies more than one node, page allocations will come from
175 the node with the lowest numeric node ID first, until that node
176 contains no free memory.
177 Allocations will then come from the node with the next highest
178 node ID specified in
179 .I nodemask
180 and so forth, until none of the specified nodes contain free memory.
181 Pages will not be allocated from any node not specified in the
182 .IR nodemask .
183
184 .B MPOL_INTERLEAVE
185 interleaves page allocations across the nodes specified in
186 .I nodemask
187 in numeric node ID order.
188 This optimizes for bandwidth instead of latency
189 by spreading out pages and memory accesses to those pages across
190 multiple nodes.
191 However, accesses to a single page will still be limited to
192 the memory bandwidth of a single node.
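The interleaving order described above can be modeled in user space as a
round-robin walk over the set bits of the mask.
The following is only an illustrative model, not the kernel's implementation:
the function name interleave_next_node is hypothetical, and the sketch assumes
that successive allocations simply advance to the next higher node ID in
nodemask and wrap around after the highest one.

```c
#include <assert.h>   /* assert */
#include <limits.h>   /* CHAR_BIT */
#include <stddef.h>   /* NULL */

/* Number of bits held by one word of the mask. */
#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical model: return the node that follows prev in numeric
 * node ID order among the nodes set in mask, wrapping around at maxnode.
 * Pass prev = -1 to obtain the first node; prev must be >= -1.
 * Returns -1 if the mask contains no nodes. */
static long interleave_next_node(const unsigned long *mask,
                                 unsigned long maxnode, long prev)
{
    assert(mask != NULL);
    assert(prev >= -1);
    for (unsigned long step = 1; step <= maxnode; step++) {
        unsigned long node = (unsigned long)(prev + (long)step) % maxnode;
        if ((mask[node / BITS_PER_ULONG] >> (node % BITS_PER_ULONG)) & 1UL)
            return (long)node;
    }
    return -1;   /* empty mask, or maxnode == 0 */
}
```

With nodes 1 and 3 set and maxnode equal to 4, repeated calls in this model
yield 1, 3, 1, 3, and so on.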
193 .\" NOTE: the following sentence doesn't make sense in the context
194 .\" of set_mempolicy() -- no memory area specified.
195 .\" To be effective the memory area should be fairly large,
196 .\" at least 1MB or bigger.
197
198 .B MPOL_PREFERRED
199 sets the preferred node for allocation.
200 The kernel will try to allocate pages from this node first
201 and fall back to "nearby" nodes if the preferred node is low on free
202 memory.
203 If
204 .I nodemask
205 specifies more than one node ID, the first node in the
206 mask will be selected as the preferred node.
207 If the
208 .I nodemask
209 and
210 .I maxnode
211 arguments specify the empty set, then the policy
212 specifies "local allocation"
213 (like the system default policy discussed above).
214
215 .B MPOL_LOCAL
216 specifies "local allocation": memory is allocated on
217 the node of the CPU that triggered the allocation (the "local node").
218 The
219 .I nodemask
220 and
221 .I maxnode
222 arguments must specify the empty set. If the "local node" is low
223 on free memory the kernel will try to allocate memory from other
224 nodes. The kernel will allocate memory from the "local node"
225 whenever memory for this node is available. If the "local node"
226 is not allowed by the process's current cpuset context the kernel
227 will try to allocate memory from other nodes. The kernel will
228 allocate memory from the "local node" whenever it becomes allowed
229 by the process's current cpuset context.
230
231 The thread memory policy is preserved across an
232 .BR execve (2),
233 and is inherited by child threads created using
234 .BR fork (2)
235 or
236 .BR clone (2).
237 .SH RETURN VALUE
238 On success,
239 .BR set_mempolicy ()
240 returns 0;
241 on error, \-1 is returned and
242 .I errno
243 is set to indicate the error.
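The return convention above can be exercised with a small wrapper.
This is a sketch under stated assumptions: it invokes the system call through
syscall(2) rather than the libnuma wrapper declared in <numaif.h>, so that it
builds without libnuma; the names raw_set_mempolicy and
reset_to_default_policy are hypothetical, and the fallback definition of
MPOL_DEFAULT as 0 is taken from the kernel's user-space headers.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* for syscall() on glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#if defined(__linux__)
#include <sys/syscall.h>
#endif

#ifndef MPOL_DEFAULT
#define MPOL_DEFAULT 0   /* assumed value, used when <numaif.h> is absent */
#endif

/* Hypothetical wrapper: call set_mempolicy directly.
 * Where the call is unavailable, fail with ENOSYS. */
static long raw_set_mempolicy(int mode, const unsigned long *nodemask,
                              unsigned long maxnode)
{
#if defined(__linux__) && defined(SYS_set_mempolicy)
    return syscall(SYS_set_mempolicy, mode, nodemask, maxnode);
#else
    (void)mode; (void)nodemask; (void)maxnode;
    errno = ENOSYS;
    return -1;
#endif
}

/* Hypothetical helper: drop any nondefault thread policy.
 * Returns 0 on success; on failure prints the error, preserves errno,
 * and returns -1. */
static int reset_to_default_policy(void)
{
    if (raw_set_mempolicy(MPOL_DEFAULT, NULL, 0) == -1) {
        int saved = errno;
        fprintf(stderr, "set_mempolicy: %s\n", strerror(saved));
        errno = saved;
        return -1;
    }
    return 0;
}
```

The call may legitimately fail, for example on a kernel built without NUMA
support or inside a sandbox that filters the system call, so callers should
always check the return value.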
244 .SH ERRORS
245 .TP
246 .B EFAULT
247 Part or all of the memory range specified by
248 .I nodemask
249 and
250 .I maxnode
251 points outside your accessible address space.
252 .TP
253 .B EINVAL
254 .I mode
255 is invalid.
256 Or,
257 .I mode
258 is
259 .B MPOL_DEFAULT
260 and
261 .I nodemask
262 is nonempty,
263 or
264 .I mode
265 is
266 .B MPOL_BIND
267 or
268 .B MPOL_INTERLEAVE
269 and
270 .I nodemask
271 is empty.
272 Or,
273 .I maxnode
274 specifies more than a page worth of bits.
275 Or,
276 .I nodemask
277 specifies one or more node IDs that are
278 greater than the maximum supported node ID.
279 Or, none of the node IDs specified by
280 .I nodemask
281 are on-line and allowed by the process's current cpuset context,
282 or none of the specified nodes contain memory.
283 Or, the
284 .I mode
285 argument specified both
286 .B MPOL_F_STATIC_NODES
287 and
288 .BR MPOL_F_RELATIVE_NODES .
289 .TP
290 .B ENOMEM
291 Insufficient kernel memory was available.
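Some of the EINVAL conditions listed above can be checked in user space before
making the call.
The following is a partial sketch, not a substitute for the kernel's checks:
the function name mempolicy_precheck is hypothetical, the numeric fallback
values for the MPOL_* constants are assumptions taken from the kernel's
user-space headers, and the conditions that depend on system state (on-line
nodes, cpuset context, nodes with memory, the maximum supported node ID) are
not covered.
Rejecting a nonempty nodemask for MPOL_LOCAL follows the DESCRIPTION rather
than the list above and is likewise an assumption.

```c
#include <assert.h>
#include <errno.h>    /* EINVAL */
#include <limits.h>   /* CHAR_BIT */
#include <stddef.h>   /* NULL */

/* Assumed values, used only when <numaif.h> has not defined them. */
#ifndef MPOL_DEFAULT
#define MPOL_DEFAULT    0
#define MPOL_PREFERRED  1
#define MPOL_BIND       2
#define MPOL_INTERLEAVE 3
#endif
#ifndef MPOL_LOCAL
#define MPOL_LOCAL      4
#endif
#ifndef MPOL_F_STATIC_NODES
#define MPOL_F_STATIC_NODES   (1 << 15)
#endif
#ifndef MPOL_F_RELATIVE_NODES
#define MPOL_F_RELATIVE_NODES (1 << 14)
#endif

#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* 1 if nodemask/maxnode describe the empty set, 0 otherwise. */
static int precheck_mask_is_empty(const unsigned long *nodemask,
                                  unsigned long maxnode)
{
    if (nodemask == NULL || maxnode == 0)
        return 1;
    for (unsigned long n = 0; n < maxnode; n++)
        if ((nodemask[n / BITS_PER_ULONG] >> (n % BITS_PER_ULONG)) & 1UL)
            return 0;
    return 1;
}

/* Hypothetical pre-check: returns 0 if the arguments pass the
 * state-independent checks, EINVAL otherwise. */
static int mempolicy_precheck(int mode, const unsigned long *nodemask,
                              unsigned long maxnode)
{
    int both = MPOL_F_STATIC_NODES | MPOL_F_RELATIVE_NODES;
    int base = mode & ~both;
    int empty = precheck_mask_is_empty(nodemask, maxnode);

    if ((mode & both) == both)
        return EINVAL;              /* the two mode flags are exclusive */
    switch (base) {
    case MPOL_DEFAULT:
    case MPOL_LOCAL:                /* assumption: see lead-in */
        return empty ? 0 : EINVAL;  /* nodemask must be empty */
    case MPOL_BIND:
    case MPOL_INTERLEAVE:
        return empty ? EINVAL : 0;  /* nodemask must be nonempty */
    case MPOL_PREFERRED:
        return 0;                   /* empty set means local allocation */
    default:
        return EINVAL;              /* unknown mode */
    }
}
```

A zero result only means that the arguments are not obviously invalid; the
kernel may still reject them.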
292 .SH VERSIONS
293 The
294 .BR set_mempolicy ()
295 system call was added to the Linux kernel in version 2.6.7.
296 .SH CONFORMING TO
297 This system call is Linux-specific.
298 .SH NOTES
299 Memory policy is not remembered if the page is swapped out.
300 When such a page is paged back in, it will use the policy of
301 the thread or memory range that is in effect at the time the
302 page is allocated.
303
304 For information on library support, see
305 .BR numa (7).
306 .SH SEE ALSO
307 .BR get_mempolicy (2),
308 .BR getcpu (2),
309 .BR mbind (2),
310 .BR mmap (2),
311 .BR numa (3),
312 .BR cpuset (7),
313 .BR numa (7),
314 .BR numactl (8)