Commit | Line | Data |
---|---|---|
314093c9 | 1 | .\" Copyright 2003,2004 Andi Kleen, SuSE Labs. |
73ae0a09 | 2 | .\" and Copyright 2007 Lee Schermerhorn, Hewlett Packard |
314093c9 | 3 | .\" |
9f882130 | 4 | .\" %%%LICENSE_START(VERBATIM_PROF) |
314093c9 MK |
5 | .\" Permission is granted to make and distribute verbatim copies of this |
6 | .\" manual provided the copyright notice and this permission notice are | |
7 | .\" preserved on all copies. | |
8 | .\" | |
9 | .\" Permission is granted to copy and distribute modified versions of this | |
10 | .\" manual under the conditions for verbatim copying, provided that the | |
11 | .\" entire resulting derived work is distributed under the terms of a | |
12 | .\" permission notice identical to this one. | |
c13182ef | 13 | .\" |
314093c9 MK |
14 | .\" Since the Linux kernel and libraries are constantly changing, this |
15 | .\" manual page may be incorrect or out-of-date. The author(s) assume no | |
16 | .\" responsibility for errors or omissions, or for damages resulting from | |
c13182ef MK |
17 | .\" the use of the information contained herein. |
18 | .\" | |
314093c9 MK |
19 | .\" Formatted or processed versions of this manual, if unaccompanied by |
20 | .\" the source, must acknowledge the copyright and authors of this work. | |
9f882130 | 21 | .\" %%%LICENSE_END |
c13182ef | 22 | .\" |
314093c9 | 23 | .\" 2006-02-03, mtk, substantial wording changes and other improvements |
00045cbb MK |
24 | .\" 2007-08-27, Lee Schermerhorn <Lee.Schermerhorn@hp.com> |
25 | .\" more precise specification of behavior. | |
314093c9 | 26 | .\" |
67d2c687 | 27 | .TH SET_MEMPOLICY 2 2015-05-07 Linux "Linux Programmer's Manual" |
314093c9 | 28 | .SH NAME |
85677816 | 29 | set_mempolicy \- set default NUMA memory policy for a thread and its children |
314093c9 | 30 | .SH SYNOPSIS |
521bf584 | 31 | .nf |
c13182ef | 32 | .B "#include <numaif.h>" |
314093c9 | 33 | .sp |
2cbf26f1 RV |
34 | .BI "long set_mempolicy(int " mode ", const unsigned long *" nodemask , |
35 | .BI " unsigned long " maxnode ); | |
73ae0a09 | 36 | .sp |
4ed3353d | 37 | Link with \fI\-lnuma\fP. |
521bf584 | 38 | .fi |
314093c9 MK |
39 | .SH DESCRIPTION |
40 | .BR set_mempolicy () | |
85677816 | 41 | sets the NUMA memory policy of the calling thread, |
73ae0a09 MK |
42 | which consists of a policy mode and zero or more nodes, |
43 | to the values specified by the | |
44 | .IR mode , | |
45 | .I nodemask | |
46 | and | |
0daa9e92 | 47 | .I maxnode |
73ae0a09 | 48 | arguments. |
314093c9 MK |
49 | |
50 | A NUMA machine has different | |
51 | memory controllers with different distances to specific CPUs. | |
73ae0a09 | 52 | The memory policy defines from which node memory is allocated for |
85677816 | 53 | the thread. |
314093c9 | 54 | |
85677816 BG |
55 | This system call defines the default policy for the thread. |
56 | The thread policy governs allocation of pages in the process's | |
73ae0a09 MK |
57 | address space outside of memory ranges |
58 | controlled by a more specific policy set by | |
314093c9 | 59 | .BR mbind (2). |
85677816 | 60 | The thread default policy also controls allocation of any pages for |
9a141bfb | 61 | memory-mapped files mapped using the |
73ae0a09 MK |
62 | .BR mmap (2) |
63 | call with the | |
64 | .B MAP_PRIVATE | |
8831d464 | 65 | flag and that are only read from (loaded) by the thread, |
9a141bfb | 66 | and of memory-mapped files mapped using the |
73ae0a09 MK |
67 | .BR mmap (2) |
68 | call with the | |
69 | .B MAP_SHARED | |
70 | flag, regardless of the access type. | |
33a0ccb2 | 71 | The policy is applied only when a new page is allocated |
85677816 | 72 | for the thread. |
c13182ef | 73 | For anonymous memory this is when the page is first |
85677816 | 74 | touched by the thread. |
314093c9 | 75 | |
73ae0a09 MK |
76 | The |
77 | .I mode | |
78 | argument must specify one of | |
314093c9 MK |
79 | .BR MPOL_DEFAULT , |
80 | .BR MPOL_BIND , | |
bcc7c6dc | 81 | .BR MPOL_INTERLEAVE , |
a2b94599 | 82 | .BR MPOL_PREFERRED , |
73ae0a09 | 83 | or |
5fcb90fd MK |
84 | .BR MPOL_LOCAL |
85 | (which are described in detail below). | |
73ae0a09 | 86 | All modes except |
314093c9 | 87 | .BR MPOL_DEFAULT " and " MPOL_LOCAL |
f5a936f4 MK |
88 | require the caller to specify the node or nodes to which the mode applies, |
89 | via the | |
c13182ef | 90 | .I nodemask |
f5a936f4 | 91 | argument. |
73ae0a09 | 92 | |
f98b728e MK |
93 | The |
94 | .I mode | |
95 | argument may also include an optional | |
adfbcbeb | 96 | .IR "mode flag" . |
f98b728e MK |
97 | The supported |
98 | .I "mode flags" | |
99 | are: | |
100 | .TP | |
101 | .BR MPOL_F_STATIC_NODES " (since Linux 2.6.26)" | |
aa796481 | 102 | A nonempty |
f98b728e | 103 | .I nodemask |
b763062b | 104 | specifies physical node IDs. |
f6374cc2 | 105 | Linux will not remap the |
f98b728e MK |
106 | .I nodemask |
107 | when the process moves to a different cpuset context, | |
108 | nor when the set of nodes allowed by the process's | |
109 | current cpuset context changes. | |
110 | .TP | |
111 | .BR MPOL_F_RELATIVE_NODES " (since Linux 2.6.26)" | |
aa796481 | 112 | A nonempty |
f98b728e | 113 | .I nodemask |
b763062b MK |
114 | specifies node IDs that are relative to the set of |
115 | node IDs allowed by the process's current cpuset. | |
f98b728e | 116 | .PP |
c13182ef | 117 | .I nodemask |
00045cbb | 118 | points to a bit mask of node IDs that contains up to |
314093c9 | 119 | .I maxnode |
c13182ef | 120 | bits. |
73ae0a09 | 121 | The bit mask size is rounded to the next multiple of |
c13182ef | 122 | .IR "sizeof(unsigned long)" , |
33a0ccb2 | 123 | but the kernel will use bits only up to |
314093c9 | 124 | .IR maxnode . |
73ae0a09 MK |
125 | A NULL value of |
126 | .I nodemask | |
127 | or a | |
128 | .I maxnode | |
129 | value of zero specifies the empty set of nodes. | |
130 | If the value of | |
131 | .I maxnode | |
132 | is zero, | |
133 | the | |
134 | .I nodemask | |
135 | argument is ignored. | |
f98b728e | 136 | |
cdba9253 MK |
137 | Where a |
138 | .I nodemask | |
139 | is required, it must contain at least one node that is online, | |
140 | allowed by the process's current cpuset context | |
bdf71bd3 | 141 | (unless the |
f98b728e | 142 | .B MPOL_F_STATIC_NODES |
bdf71bd3 | 143 | mode flag is specified), |
cdba9253 | 144 | and contains memory. |
f98b728e MK |
145 | If the |
146 | .B MPOL_F_STATIC_NODES | |
147 | flag is set in | |
148 | .I mode | |
149 | and a required | |
150 | .I nodemask | |
151 | contains no nodes that are allowed by the process's current cpuset context, | |
152 | the memory policy reverts to | |
153 | .IR "local allocation" . | |
154 | This effectively overrides the specified policy until the process's | |
155 | cpuset context includes one or more of the nodes specified by | |
fe48639f | 156 | .IR nodemask . |
314093c9 | 157 | |
c13182ef | 158 | The |
da451626 MK |
159 | .I mode |
160 | argument must include one of the following values: | |
161 | .TP | |
314093c9 | 162 | .B MPOL_DEFAULT |
da451626 | 163 | This mode specifies that any nondefault thread memory policy be removed, |
f98b728e | 164 | so that the memory policy "falls back" to the system default policy. |
88879aeb MK |
165 | The system default policy is "local allocation"\(emthat is, |
166 | allocate memory on the node of the CPU that triggered the allocation. | |
c13182ef | 167 | .I nodemask |
73ae0a09 MK |
168 | must be specified as NULL. |
169 | If the "local node" contains no free memory, the system will | |
170 | attempt to allocate memory from a "nearby" node. | |
da451626 | 171 | .TP |
314093c9 | 172 | .B MPOL_BIND |
da451626 | 173 | This mode defines a strict policy that restricts memory allocation to the |
c13182ef | 174 | nodes specified in |
314093c9 | 175 | .IR nodemask . |
73ae0a09 MK |
176 | If |
177 | .I nodemask | |
178 | specifies more than one node, page allocations will come from | |
00045cbb | 179 | the node with the lowest numeric node ID first, until that node |
73ae0a09 MK |
180 | contains no free memory. |
181 | Allocations will then come from the node with the next highest | |
00045cbb | 182 | node ID specified in |
73ae0a09 MK |
183 | .I nodemask |
184 | and so forth, until none of the specified nodes contain free memory. | |
185 | Pages will not be allocated from any node not specified in the | |
186 | .IR nodemask . | |
314093c9 | 187 | |
da451626 | 188 | .TP |
314093c9 | 189 | .B MPOL_INTERLEAVE |
da451626 | 190 | This mode interleaves page allocations across the nodes specified in |
73ae0a09 | 191 | .I nodemask |
00045cbb | 192 | in numeric node ID order. |
73ae0a09 MK |
193 | This optimizes for bandwidth instead of latency |
194 | by spreading out pages and memory accesses to those pages across | |
195 | multiple nodes. | |
196 | However, accesses to a single page will still be limited to | |
197 | the memory bandwidth of a single node. | |
198 | .\" NOTE: the following sentence doesn't make sense in the context | |
199 | .\" of set_mempolicy() -- no memory area specified. | |
200 | .\" To be effective the memory area should be fairly large, | |
201 | .\" at least 1MB or bigger. | |
da451626 | 202 | .TP |
314093c9 | 203 | .B MPOL_PREFERRED |
da451626 | 204 | This mode sets the preferred node for allocation. |
73ae0a09 MK |
205 | The kernel will try to allocate pages from this node first |
206 | and fall back to "nearby" nodes if the preferred node is low on free | |
c13182ef | 207 | memory. |
73ae0a09 MK |
208 | If |
209 | .I nodemask | |
00045cbb | 210 | specifies more than one node ID, the first node in the |
73ae0a09 MK |
211 | mask will be selected as the preferred node. |
212 | If the | |
c13182ef | 213 | .I nodemask |
73ae0a09 MK |
214 | and |
215 | .I maxnode | |
1313d297 MK |
216 | arguments specify the empty set, then the policy |
217 | specifies "local allocation" | |
218 | (like the system default policy discussed above). | |
da451626 | 219 | .TP |
4b006572 MK |
220 | .BR MPOL_LOCAL " (since Linux 3.8)" |
221 | .\" commit 479e2802d09f1e18a97262c4c6f8f17ae5884bd8 | |
222 | .\" commit f2a07f40dbc603c15f8b06e6ec7f768af67b424f | |
da451626 | 223 | This mode specifies "local allocation"; the memory is allocated on |
c0649ed5 | 224 | the node of the CPU that triggered the allocation (the "local node"). |
a2b94599 PK |
225 | The |
226 | .I nodemask | |
227 | and | |
228 | .I maxnode | |
5e38e258 | 229 | arguments must specify the empty set. |
c0649ed5 | 230 | If the "local node" is low on free memory, |
5e38e258 MK |
231 | the kernel will try to allocate memory from other nodes. |
232 | The kernel will allocate memory from the "local node" | |
233 | whenever memory for this node is available. | |
c0649ed5 | 234 | If the "local node" is not allowed by the process's current cpuset context, |
5e38e258 MK |
235 | the kernel will try to allocate memory from other nodes. |
236 | The kernel will allocate memory from the "local node" whenever | |
237 | it becomes allowed by the process's current cpuset context. | |
da451626 | 238 | .PP |
85677816 | 239 | The thread memory policy is preserved across an |
3bd6a9b1 | 240 | .BR execve (2), |
85677816 | 241 | and is inherited by child threads created using |
c13182ef MK |
242 | .BR fork (2) |
243 | or | |
314093c9 | 244 | .BR clone (2). |
314093c9 MK |
245 | .SH RETURN VALUE |
246 | On success, | |
247 | .BR set_mempolicy () | |
248 | returns 0; | |
249 | on error, \-1 is returned and | |
c13182ef | 250 | .I errno |
314093c9 | 251 | is set to indicate the error. |
73ae0a09 MK |
252 | .SH ERRORS |
253 | .TP | |
b3a7b55e MK |
254 | .B EFAULT |
255 | Part or all of the memory range specified by | |
256 | .I nodemask | |
257 | and | |
258 | .I maxnode | |
259 | points outside your accessible address space. | |
260 | .TP | |
73ae0a09 | 261 | .B EINVAL |
4d2be0ee MK |
262 | .I mode |
263 | is invalid. | |
73ae0a09 MK |
264 | Or, |
265 | .I mode | |
266 | is | |
00045cbb | 267 | .B MPOL_DEFAULT |
73ae0a09 MK |
268 | and |
269 | .I nodemask | |
aa796481 | 270 | is nonempty, |
73ae0a09 MK |
271 | or |
272 | .I mode | |
273 | is | |
00045cbb | 274 | .B MPOL_BIND |
73ae0a09 | 275 | or |
00045cbb | 276 | .B MPOL_INTERLEAVE |
73ae0a09 MK |
277 | and |
278 | .I nodemask | |
279 | is empty. | |
280 | Or, | |
281 | .I maxnode | |
282 | specifies more than a page's worth of bits. | |
283 | Or, | |
284 | .I nodemask | |
00045cbb | 285 | specifies one or more node IDs that are |
cdba9253 | 286 | greater than the maximum supported node ID. |
00045cbb | 287 | Or, none of the node IDs specified by |
73ae0a09 | 288 | .I nodemask |
cdba9253 MK |
289 | are online and allowed by the process's current cpuset context, |
290 | or none of the specified nodes contain memory. | |
f98b728e MK |
291 | Or, the |
292 | .I mode | |
293 | argument specified both | |
294 | .B MPOL_F_STATIC_NODES | |
295 | and | |
296 | .BR MPOL_F_RELATIVE_NODES . | |
73ae0a09 | 297 | .TP |
73ae0a09 MK |
298 | .B ENOMEM |
299 | Insufficient kernel memory was available. | |
adfbcbeb MK |
300 | .SH VERSIONS |
301 | The | |
d0749cdf | 302 | .BR set_mempolicy () |
adfbcbeb | 303 | system call was added to the Linux kernel in version 2.6.7. |
9d9dc1e8 | 304 | .SH CONFORMING TO |
8382f16d | 305 | This system call is Linux-specific. |
a1d5f77c | 306 | .SH NOTES |
85677816 | 307 | Memory policy is not remembered if the page is swapped out. |
73ae0a09 | 308 | When such a page is paged back in, it will use the policy of |
85677816 | 309 | the thread or memory range that is in effect at the time the |
73ae0a09 | 310 | page is allocated. |
adfbcbeb MK |
311 | |
312 | For information on library support, see | |
313 | .BR numa (7). | |
314093c9 | 314 | .SH SEE ALSO |
fa23e023 | 315 | .BR get_mempolicy (2), |
f0c34053 | 316 | .BR getcpu (2), |
314093c9 | 317 | .BR mbind (2), |
73ae0a09 | 318 | .BR mmap (2), |
a18e2edb MK |
319 | .BR numa (3), |
320 | .BR cpuset (7), | |
adfbcbeb | 321 | .BR numa (7), |
a18e2edb | 322 | .BR numactl (8) |
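
The nodemask and maxnode arguments described in the page above can be easier to follow with a small sketch. The C fragment below is an illustration only, not part of the manual page: it builds a bit mask of node IDs, sizes it in whole unsigned long words, and passes it to the raw system call. The SK_ names and the helpers nodemask_words(), nodemask_set(), sk_set_mempolicy() and interleave_first_nodes() are hypothetical. The numeric mode and flag values are assumed to match the kernel's <linux/mempolicy.h>, and the raw syscall(2) interface is used so that the sketch builds without the libnuma headers.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: values copied from the kernel's <linux/mempolicy.h>.
 * The SK_ prefix is hypothetical; it only avoids clashing with <numaif.h>. */
enum {
    SK_MPOL_DEFAULT    = 0,
    SK_MPOL_PREFERRED  = 1,
    SK_MPOL_BIND       = 2,
    SK_MPOL_INTERLEAVE = 3,
    SK_MPOL_LOCAL      = 4
};
#define SK_MPOL_F_RELATIVE_NODES (1 << 14)
#define SK_MPOL_F_STATIC_NODES   (1 << 15)

#define SK_BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Number of unsigned long words needed to hold maxnode bits
 * (the mask size is rounded up to whole words). */
static inline size_t nodemask_words(unsigned long maxnode)
{
    return (maxnode + SK_BITS_PER_ULONG - 1) / SK_BITS_PER_ULONG;
}

/* Set the bit for node in mask; returns -1 if node is not below maxnode. */
static inline int nodemask_set(unsigned long *mask, unsigned long maxnode,
                               unsigned long node)
{
    assert(mask != NULL);
    if (node >= maxnode)
        return -1;
    mask[node / SK_BITS_PER_ULONG] |= 1UL << (node % SK_BITS_PER_ULONG);
    return 0;
}

/* Thin wrapper over the raw system call; returns 0, or -1 with errno set. */
static inline long sk_set_mempolicy(int mode, const unsigned long *nodemask,
                                    unsigned long maxnode)
{
    return syscall(SYS_set_mempolicy, mode, nodemask, maxnode);
}

/* Ask the kernel to interleave the calling thread's future page
 * allocations across nodes 0..n-1, using a one-word mask. */
static inline int interleave_first_nodes(unsigned long n)
{
    unsigned long mask[1] = { 0 };
    unsigned long maxnode = SK_BITS_PER_ULONG;

    for (unsigned long node = 0; node < n; node++) {
        if (nodemask_set(mask, maxnode, node) != 0) {
            errno = EINVAL;
            return -1;
        }
    }
    return sk_set_mempolicy(SK_MPOL_INTERLEAVE, mask, maxnode) == 0 ? 0 : -1;
}
```

A caller might try interleave_first_nodes(2) and later restore the system default with sk_set_mempolicy(SK_MPOL_DEFAULT, NULL, 0). Either call can fail as listed under ERRORS, for example with EINVAL when none of the requested nodes is online and allowed by the current cpuset, so errno should be checked. A mode flag would be combined with the mode by bitwise OR, for example SK_MPOL_BIND | SK_MPOL_F_STATIC_NODES.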