Commit | Line | Data |
---|---|---|
314093c9 | 1 | .\" Copyright 2003,2004 Andi Kleen, SuSE Labs. |
73ae0a09 | 2 | .\" and Copyright 2007 Lee Schermerhorn, Hewlett Packard |
314093c9 MK |
3 | .\" |
4 | .\" Permission is granted to make and distribute verbatim copies of this | |
5 | .\" manual provided the copyright notice and this permission notice are | |
6 | .\" preserved on all copies. | |
7 | .\" | |
8 | .\" Permission is granted to copy and distribute modified versions of this | |
9 | .\" manual under the conditions for verbatim copying, provided that the | |
10 | .\" entire resulting derived work is distributed under the terms of a | |
11 | .\" permission notice identical to this one. | |
c13182ef | 12 | .\" |
314093c9 MK |
13 | .\" Since the Linux kernel and libraries are constantly changing, this |
14 | .\" manual page may be incorrect or out-of-date. The author(s) assume no | |
15 | .\" responsibility for errors or omissions, or for damages resulting from | |
c13182ef MK |
16 | .\" the use of the information contained herein. |
17 | .\" | |
314093c9 MK |
18 | .\" Formatted or processed versions of this manual, if unaccompanied by |
19 | .\" the source, must acknowledge the copyright and authors of this work. | |
c13182ef | 20 | .\" |
314093c9 | 21 | .\" 2006-02-03, mtk, substantial wording changes and other improvements |
00045cbb MK |
22 | .\" 2007-08-27, Lee Schermerhorn <Lee.Schermerhorn@hp.com> |
23 | .\" more precise specification of behavior. | |
314093c9 | 24 | .\" |
00045cbb | 25 | .TH SET_MEMPOLICY 2 2007-08-27 Linux "Linux Programmer's Manual" |
314093c9 | 26 | .SH NAME |
73ae0a09 | 27 | set_mempolicy \- set default NUMA memory policy for a process and its children |
314093c9 | 28 | .SH SYNOPSIS |
521bf584 | 29 | .nf |
c13182ef | 30 | .B "#include <numaif.h>" |
314093c9 | 31 | .sp |
73ae0a09 | 32 | .BI "int set_mempolicy(int " mode ", unsigned long *" nodemask , |
521bf584 | 33 | .BI " unsigned long " maxnode ); |
73ae0a09 | 34 | .sp |
00045cbb | 35 | Link with \fI\-lnuma\fP |
521bf584 | 36 | .fi |
314093c9 MK |
37 | .SH DESCRIPTION |
38 | .BR set_mempolicy () | |
73ae0a09 MK |
39 | sets the NUMA memory policy of the calling process, |
40 | which consists of a policy mode and zero or more nodes, | |
41 | to the values specified by the | |
42 | .IR mode , | |
43 | .I nodemask | |
44 | and | |
0daa9e92 | 45 | .I maxnode |
73ae0a09 | 46 | arguments. |
314093c9 MK |
47 | |
48 | A NUMA machine has different | |
49 | memory controllers with different distances to specific CPUs. | |
73ae0a09 | 50 | The memory policy defines from which node memory is allocated for |
c13182ef | 51 | the process. |
314093c9 | 52 | |
73ae0a09 | 53 | This system call defines the default policy for the process. |
ecccf7c2 | 54 | The process policy governs allocation of pages in the process's |
73ae0a09 MK |
55 | address space outside of memory ranges |
56 | controlled by a more specific policy set by | |
314093c9 | 57 | .BR mbind (2). |
73ae0a09 MK |
58 | The process default policy also controls allocation of any pages for |
59 | memory mapped files mapped using the | |
60 | .BR mmap (2) | |
61 | call with the | |
62 | .B MAP_PRIVATE | |
63 | flag and that are only read from (loaded) by the task, | |
64 | and of memory mapped files mapped using the | |
65 | .BR mmap (2) | |
66 | call with the | |
67 | .B MAP_SHARED | |
68 | flag, regardless of the access type. | |
314093c9 | 69 | The policy is only applied when a new page is allocated |
c13182ef MK |
70 | for the process. |
71 | For anonymous memory this is when the page is first | |
314093c9 MK |
72 | touched by the application. |
73 | ||
73ae0a09 MK |
74 | The |
75 | .I mode | |
76 | argument must specify one of | |
314093c9 MK |
77 | .BR MPOL_DEFAULT , |
78 | .BR MPOL_BIND , | |
73ae0a09 MK |
79 | .B MPOL_INTERLEAVE |
80 | or | |
314093c9 | 81 | .BR MPOL_PREFERRED . |
73ae0a09 | 82 | All modes except |
314093c9 | 83 | .B MPOL_DEFAULT |
73ae0a09 | 84 | require the caller to specify via the |
c13182ef | 85 | .I nodemask |
73ae0a09 MK |
86 | parameter |
87 | one or more nodes. | |
88 | ||
c13182ef | 89 | .I nodemask |
00045cbb | 90 | points to a bit mask of node IDs that contains up to |
314093c9 | 91 | .I maxnode |
c13182ef | 92 | bits. |
73ae0a09 | 93 | The bit mask size is rounded to the next multiple of |
c13182ef MK |
94 | .IR "sizeof(unsigned long)" , |
95 | but the kernel will only use bits up to | |
314093c9 | 96 | .IR maxnode . |
73ae0a09 MK |
97 | A NULL value of |
98 | .I nodemask | |
99 | or a | |
100 | .I maxnode | |
101 | value of zero specifies the empty set of nodes. | |
102 | If the value of | |
103 | .I maxnode | |
104 | is zero, | |
105 | the | |
106 | .I nodemask | |
107 | argument is ignored. | |
314093c9 | 108 | |
c13182ef | 109 | The |
314093c9 | 110 | .B MPOL_DEFAULT |
73ae0a09 | 111 | mode is the default and means to allocate memory locally, |
c13182ef MK |
112 | i.e., on the node of the CPU that triggered the allocation. |
113 | .I nodemask | |
73ae0a09 MK |
114 | must be specified as NULL. |
115 | If the "local node" contains no free memory, the system will | |
116 | attempt to allocate memory from a "nearby" node. | |
314093c9 MK |
117 | |
118 | The | |
119 | .B MPOL_BIND | |
73ae0a09 | 120 | mode defines a strict policy that restricts memory allocation to the |
c13182ef | 121 | nodes specified in |
314093c9 | 122 | .IR nodemask . |
73ae0a09 MK |
123 | If |
124 | .I nodemask | |
125 | specifies more than one node, page allocations will come from | |
00045cbb | 126 | the node with the lowest numeric node ID first, until that node |
73ae0a09 MK |
127 | contains no free memory. |
128 | Allocations will then come from the node with the next higher | |
00045cbb | 129 | node ID specified in |
73ae0a09 MK |
130 | .I nodemask |
131 | and so forth, until none of the specified nodes contain free memory. | |
132 | Pages will not be allocated from any node not specified in the | |
133 | .IR nodemask . | |
314093c9 MK |
134 | |
135 | .B MPOL_INTERLEAVE | |
73ae0a09 MK |
136 | interleaves page allocations across the nodes specified in |
137 | .I nodemask | |
00045cbb | 138 | in numeric node ID order. |
73ae0a09 MK |
139 | This optimizes for bandwidth instead of latency |
140 | by spreading out pages and memory accesses to those pages across | |
141 | multiple nodes. | |
142 | However, accesses to a single page will still be limited to | |
143 | the memory bandwidth of a single node. | |
144 | .\" NOTE: the following sentence doesn't make sense in the context | |
145 | .\" of set_mempolicy() -- no memory area specified. | |
146 | .\" To be effective the memory area should be fairly large, | |
147 | .\" at least 1MB or bigger. | |
314093c9 MK |
148 | |
149 | .B MPOL_PREFERRED | |
c13182ef | 150 | sets the preferred node for allocation. |
73ae0a09 MK |
151 | The kernel will try to allocate pages from this node first |
152 | and fall back to "nearby" nodes if the preferred node is low on free | |
c13182ef | 153 | memory. |
73ae0a09 MK |
154 | If |
155 | .I nodemask | |
00045cbb | 156 | specifies more than one node ID, the first node in the |
73ae0a09 MK |
157 | mask will be selected as the preferred node. |
158 | If the | |
c13182ef | 159 | .I nodemask |
73ae0a09 MK |
160 | and |
161 | .I maxnode | |
162 | arguments specify the empty set, then the memory is allocated on | |
0dd0df4e | 163 | the node of the CPU that triggered the allocation (like |
314093c9 MK |
164 | .BR MPOL_DEFAULT ). |
165 | ||
73ae0a09 | 166 | The process memory policy is preserved across an |
3bd6a9b1 MK |
167 | .BR execve (2), |
168 | and is inherited by child processes created using | |
c13182ef MK |
169 | .BR fork (2) |
170 | or | |
314093c9 | 171 | .BR clone (2). |
314093c9 MK |
172 | .SH RETURN VALUE |
173 | On success, | |
174 | .BR set_mempolicy () | |
175 | returns 0; | |
176 | on error, \-1 is returned and | |
c13182ef | 177 | .I errno |
314093c9 | 178 | is set to indicate the error. |
73ae0a09 MK |
179 | .SH ERRORS |
180 | .TP | |
b3a7b55e MK |
181 | .B EFAULT |
182 | Part or all of the memory range specified by | |
183 | .I nodemask | |
184 | and | |
185 | .I maxnode | |
186 | points outside your accessible address space. | |
187 | .TP | |
73ae0a09 | 188 | .B EINVAL |
4d2be0ee MK |
189 | .I mode |
190 | is invalid. | |
73ae0a09 MK |
191 | Or, |
192 | .I mode | |
193 | is | |
00045cbb | 194 | .B MPOL_DEFAULT |
73ae0a09 MK |
195 | and |
196 | .I nodemask | |
1f04cc97 | 197 | is non-empty, |
73ae0a09 MK |
198 | or |
199 | .I mode | |
200 | is | |
00045cbb | 201 | .B MPOL_BIND |
73ae0a09 | 202 | or |
00045cbb | 203 | .B MPOL_INTERLEAVE |
73ae0a09 MK |
204 | and |
205 | .I nodemask | |
206 | is empty. | |
207 | Or, | |
208 | .I maxnode | |
209 | specifies more than a page's worth of bits. | |
210 | Or, | |
211 | .I nodemask | |
00045cbb MK |
212 | specifies one or more node IDs that are |
213 | greater than the maximum supported node ID, | |
73ae0a09 | 214 | or are not allowed in the calling task's context. |
00045cbb MK |
215 | .\" "calling task's context" refers to cpusets. |
216 | .\" No man page avail to ref. --Lee Schermerhorn | |
217 | Or, none of the node IDs specified by | |
73ae0a09 MK |
218 | .I nodemask |
219 | are on-line, or none of the specified nodes contain memory. | |
220 | .TP | |
73ae0a09 MK |
221 | .B ENOMEM |
222 | Insufficient kernel memory was available. | |
9d9dc1e8 | 223 | .SH CONFORMING TO |
8382f16d | 224 | This system call is Linux-specific. |
a1d5f77c MK |
225 | .SH NOTES |
226 | Process policy is not remembered if the page is swapped out. | |
73ae0a09 MK |
227 | When such a page is paged back in, it will use the policy of |
228 | the process or memory range that is in effect at the time the | |
229 | page is allocated. | |
0e99f2a5 MK |
230 | .SS "Versions and Library Support" |
231 | See | |
232 | .BR mbind (2). | |
314093c9 MK |
233 | .SH SEE ALSO |
234 | .BR mbind (2), | |
73ae0a09 | 235 | .BR mmap (2), |
314093c9 | 236 | .BR get_mempolicy (2), |
a18e2edb MK |
237 | .BR numa (3), |
238 | .BR cpuset (7), | |
239 | .BR numactl (8) |
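
The page above describes nodemask as a bit mask of node IDs, stored in an array of unsigned long and holding up to maxnode bits, with the size rounded up to a whole number of words. The following is a minimal sketch of how a caller might build such a mask. The helper names (nodemask_words, nodemask_alloc, nodemask_set, nodemask_isset) are hypothetical and are not part of numaif.h or libnuma; the sketch only prepares the buffer and does not invoke the system call itself.

```c
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Number of bits in one unsigned long word of the mask. */
#define BITS_PER_WORD ((unsigned long)(sizeof(unsigned long) * CHAR_BIT))

/* Hypothetical helper: words needed to hold maxnode bits, rounded up. */
size_t nodemask_words(unsigned long maxnode)
{
    return (size_t)(maxnode / BITS_PER_WORD + (maxnode % BITS_PER_WORD != 0));
}

/*
 * Hypothetical helper: allocate a zeroed mask able to hold maxnode bits.
 * Returns NULL when maxnode is zero (the empty node set) or on failure.
 */
unsigned long *nodemask_alloc(unsigned long maxnode)
{
    size_t words = nodemask_words(maxnode);

    if (words == 0)
        return NULL;
    return calloc(words, sizeof(unsigned long));
}

/* Hypothetical helper: mark a node ID in the mask; -1 if out of range. */
int nodemask_set(unsigned long *mask, unsigned long maxnode,
                 unsigned long node)
{
    if (mask == NULL || node >= maxnode)
        return -1;
    mask[node / BITS_PER_WORD] |= 1UL << (node % BITS_PER_WORD);
    return 0;
}

/* Hypothetical helper: 1 if the node ID is set in the mask, otherwise 0. */
int nodemask_isset(const unsigned long *mask, unsigned long maxnode,
                   unsigned long node)
{
    if (mask == NULL || node >= maxnode)
        return 0;
    return (int)((mask[node / BITS_PER_WORD] >> (node % BITS_PER_WORD)) & 1UL);
}
```

A caller would typically allocate the mask, set the bits for the desired node IDs, and then pass the pointer and the bit count as the nodemask and maxnode arguments of set_mempolicy() together with a mode such as MPOL_BIND or MPOL_INTERLEAVE, checking the return value and errno as described under RETURN VALUE and ERRORS. For MPOL_DEFAULT, the page requires nodemask to be NULL instead.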