.\" Copyright 2003,2004 Andi Kleen, SuSE Labs.
.\" and Copyright 2007 Lee Schermerhorn, Hewlett Packard
.\"
.\" Permission is granted to make and distribute verbatim copies of this
.\" manual provided the copyright notice and this permission notice are
.\" preserved on all copies.
.\"
.\" Permission is granted to copy and distribute modified versions of this
.\" manual under the conditions for verbatim copying, provided that the
.\" entire resulting derived work is distributed under the terms of a
.\" permission notice identical to this one.
.\"
.\" Since the Linux kernel and libraries are constantly changing, this
.\" manual page may be incorrect or out-of-date. The author(s) assume no
.\" responsibility for errors or omissions, or for damages resulting from
.\" the use of the information contained herein.
.\"
.\" Formatted or processed versions of this manual, if unaccompanied by
.\" the source, must acknowledge the copyright and authors of this work.
.\"
.\" 2006-02-03, mtk, substantial wording changes and other improvements
.\" 2007-08-27, Lee Schermerhorn <Lee.Schermerhorn@hp.com>
.\" more precise specification of behavior.
.\"
.TH SET_MEMPOLICY 2 2008-08-11 Linux "Linux Programmer's Manual"
.SH NAME
set_mempolicy \- set default NUMA memory policy for a process and its children
.SH SYNOPSIS
.nf
.B "#include <numaif.h>"
.sp
.BI "int set_mempolicy(int " mode ", unsigned long *" nodemask ,
.BI "                  unsigned long " maxnode );
.sp
Link with \fI\-lnuma\fP
.fi
.SH DESCRIPTION
.BR set_mempolicy ()
sets the NUMA memory policy of the calling process,
which consists of a policy mode and zero or more nodes,
to the values specified by the
.IR mode ,
.I nodemask
and
.I maxnode
arguments.
.PP
A NUMA machine has different
memory controllers with different distances to specific CPUs.
The memory policy defines from which node memory is allocated for
the process.

This system call defines the default policy for the process.
The process policy governs allocation of pages in the process's
address space outside of memory ranges
controlled by a more specific policy set by
.BR mbind (2).
The process default policy also controls allocation of any pages for
memory-mapped files mapped using the
.BR mmap (2)
call with the
.B MAP_PRIVATE
flag and that are only read from (loaded) by the process,
and of memory-mapped files mapped using the
.BR mmap (2)
call with the
.B MAP_SHARED
flag, regardless of the access type.
The policy is only applied when a new page is allocated
for the process.
For anonymous memory this is when the page is first
touched by the application.
.PP
The
.I mode
argument must specify one of
.BR MPOL_DEFAULT ,
.BR MPOL_BIND ,
.B MPOL_INTERLEAVE
or
.BR MPOL_PREFERRED .
All modes except
.B MPOL_DEFAULT
require the caller to specify via the
.I nodemask
argument one or more nodes.
.PP
The
.I mode
argument may also include an optional
.IR "mode flag" .
The supported
.I "mode flags"
are:
.TP
.BR MPOL_F_STATIC_NODES " (since Linux 2.6.26)"
A non-empty
.I nodemask
specifies physical node IDs.
Linux does not remap the
.I nodemask
when the process moves to a different cpuset context,
nor when the set of nodes allowed by the process's
current cpuset context changes.
.TP
.BR MPOL_F_RELATIVE_NODES " (since Linux 2.6.26)"
A non-empty
.I nodemask
specifies node IDs that are relative to the set of
node IDs allowed by the process's current cpuset.
.PP
.I nodemask
points to a bit mask of node IDs that contains up to
.I maxnode
bits.
The bit mask size is rounded to the next multiple of
.IR "sizeof(unsigned long)" ,
but the kernel will only use bits up to
.IR maxnode .
A NULL value of
.I nodemask
or a
.I maxnode
value of zero specifies the empty set of nodes.
If the value of
.I maxnode
is zero,
the
.I nodemask
argument is ignored.
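.PP
The bit mask layout described above can be sketched with a few helpers.
This is a minimal illustration only: the names node_set, node_set_clear,
node_set_add, node_set_has and the bound SKETCH_MAX_NODES are hypothetical
and are not part of any library or kernel interface.
.nf
```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical upper bound on node IDs for this sketch; not a kernel constant. */
#define SKETCH_MAX_NODES 128
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_WORDS ((SKETCH_MAX_NODES + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* A node set stored as the array of unsigned long that nodemask points to. */
struct node_set {
    unsigned long bits[SKETCH_WORDS];
};

/* Start from the empty set of nodes. */
void node_set_clear(struct node_set *s)
{
    assert(s != NULL);
    memset(s->bits, 0, sizeof(s->bits));
}

/* Mark node ID `node` as a member; IDs beyond the sketch bound are ignored. */
void node_set_add(struct node_set *s, unsigned int node)
{
    assert(s != NULL);
    if (node < SKETCH_MAX_NODES)
        s->bits[node / BITS_PER_WORD] |= 1UL << (node % BITS_PER_WORD);
}

/* Return 1 if node ID `node` is a member, 0 otherwise. */
int node_set_has(const struct node_set *s, unsigned int node)
{
    assert(s != NULL);
    if (node >= SKETCH_MAX_NODES)
        return 0;
    return (int)((s->bits[node / BITS_PER_WORD] >> (node % BITS_PER_WORD)) & 1UL);
}
```
.fi
.PP
A caller would then pass the bits array as nodemask, together with a
maxnode value that covers the bits in use; SKETCH_MAX_NODES is one
possible choice under the assumptions of this sketch.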
.PP
Where a
.I nodemask
is required, it must contain at least one node that is on-line,
allowed by the process's current cpuset context
(unless the
.B MPOL_F_STATIC_NODES
mode flag is specified),
and contains memory.
If the
.B MPOL_F_STATIC_NODES
flag is set in
.I mode
and a required
.I nodemask
contains no nodes that are allowed by the process's current cpuset context,
the memory policy reverts to
.IR "local allocation" .
This effectively overrides the specified policy until the process's
cpuset context includes one or more of the nodes specified by
.IR nodemask .
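.PP
Because a mode flag travels in the same integer as the base mode,
it may help to see how the two could be combined and separated again.
The sketch below assumes the numeric values used by the Linux header
linux/mempolicy.h and defines them locally only so that it compiles
on its own; the helper names are hypothetical.
.nf
```c
#include <assert.h>

/* Values assumed to match <linux/mempolicy.h>; defined here only if absent. */
#ifndef MPOL_BIND
#define MPOL_BIND 2
#endif
#ifndef MPOL_F_STATIC_NODES
#define MPOL_F_STATIC_NODES (1 << 15)
#endif
#ifndef MPOL_F_RELATIVE_NODES
#define MPOL_F_RELATIVE_NODES (1 << 14)
#endif

/* Both optional mode flags; every other bit of `mode` is the base mode. */
#define SKETCH_MODE_FLAGS (MPOL_F_STATIC_NODES | MPOL_F_RELATIVE_NODES)

/* Combine a flag-free base mode with one optional mode flag (or 0 for none). */
int mode_with_flag(int base_mode, int flag)
{
    assert((base_mode & SKETCH_MODE_FLAGS) == 0);
    return base_mode | flag;
}

/* Strip the optional flags to recover the base mode. */
int mode_base(int mode)
{
    return mode & ~SKETCH_MODE_FLAGS;
}

/* Return 1 if the mode asks for physical (non-remapped) node IDs. */
int mode_has_static_nodes(int mode)
{
    return (mode & MPOL_F_STATIC_NODES) != 0;
}
```
.fi
.PP
For example, mode_with_flag(MPOL_BIND, MPOL_F_STATIC_NODES) yields a
value that could be passed as the mode argument to request a bind
policy whose nodemask is not remapped.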
.PP
The
.B MPOL_DEFAULT
mode specifies that any non-default process memory policy be removed,
so that the memory policy "falls back" to the system default policy.
The system default policy is "local allocation",
that is, memory is allocated on the node of the CPU that triggered the allocation.
If the "local node" contains no free memory, the system will
attempt to allocate memory from a nearby node.
With this mode,
.I nodemask
must be specified as NULL.
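.PP
Resetting to the default policy can be sketched as follows.
To avoid a dependency on libnuma, this sketch goes through the raw
system call with syscall(2), which is an assumption of the example
rather than the interface shown in the SYNOPSIS; the function names
raw_set_mempolicy and reset_to_default_policy are hypothetical.
.nf
```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Value assumed to match <linux/mempolicy.h>; defined here only if absent. */
#ifndef MPOL_DEFAULT
#define MPOL_DEFAULT 0
#endif

/* Thin wrapper over the raw system call, used so the sketch builds without
 * libnuma. Returns 0 on success, or -1 with errno set. */
long raw_set_mempolicy(int mode, const unsigned long *nodemask,
                       unsigned long maxnode)
{
    assert(mode >= 0);
    return syscall(SYS_set_mempolicy, mode, nodemask, maxnode);
}

/* Remove any non-default policy: MPOL_DEFAULT with a NULL nodemask. */
int reset_to_default_policy(void)
{
    return (int)raw_set_mempolicy(MPOL_DEFAULT, NULL, 0);
}
```
.fi
.PP
On a kernel built without NUMA support, or in a sandbox that filters
the call, the wrapper is expected to fail and set errno, so callers
should check the return value.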
.PP
The
.B MPOL_BIND
mode defines a strict policy that restricts memory allocation to the
nodes specified in
.IR nodemask .
If
.I nodemask
specifies more than one node, page allocations will come from
the node with the lowest numeric node ID first, until that node
contains no free memory.
Allocations will then come from the node with the next highest
node ID specified in
.I nodemask
and so forth, until none of the specified nodes contain free memory.
Pages will not be allocated from any node not specified in the
.IR nodemask .
.PP
.B MPOL_INTERLEAVE
interleaves page allocations across the nodes specified in
.I nodemask
in numeric node ID order.
This optimizes for bandwidth instead of latency
by spreading out pages and memory accesses to those pages across
multiple nodes.
However, accesses to a single page will still be limited to
the memory bandwidth of a single node.
.\" NOTE: the following sentence doesn't make sense in the context
.\" of set_mempolicy() -- no memory area specified.
.\" To be effective the memory area should be fairly large,
.\" at least 1MB or bigger.
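.PP
The interleaving order described above can be modelled in user space.
The function below is only a model of the documented behavior, not
kernel code; it assumes that maxnode is the number of bits in the mask,
and the names mask_has and interleave_next are hypothetical.
.nf
```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Return 1 if bit `node` is set in the mask, 0 otherwise. */
int mask_has(const unsigned long *mask, unsigned long node)
{
    return (int)((mask[node / BITS_PER_WORD] >> (node % BITS_PER_WORD)) & 1UL);
}

/* Given the node used for the previous page, return the next node ID set in
 * the mask in numeric order, wrapping around to the lowest one.
 * Pass prev = -1 for the first page. Returns -1 if the mask is empty. */
long interleave_next(const unsigned long *mask, unsigned long maxnode, long prev)
{
    unsigned long i;

    assert(mask != NULL);
    assert(prev >= -1);
    for (i = 1; i <= maxnode; i++) {
        unsigned long candidate = (unsigned long)(prev + (long)i) % maxnode;
        if (mask_has(mask, candidate))
            return (long)candidate;
    }
    return -1;
}
```
.fi
.PP
With nodes 0, 2 and 3 set in the mask, successive calls starting from
prev = -1 visit 0, 2, 3 and then 0 again.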
.PP
.B MPOL_PREFERRED
sets the preferred node for allocation.
The kernel will try to allocate pages from this node first
and fall back to nearby nodes if the preferred node is low on free
memory.
If
.I nodemask
specifies more than one node ID, the first node in the
mask will be selected as the preferred node.
If the
.I nodemask
and
.I maxnode
arguments specify the empty set, then the policy
specifies "local allocation"
(like the system default policy discussed above).
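.PP
The selection rule for the preferred node can be illustrated with a
small helper.
This sketch assumes that "the first node in the mask" means the
lowest-numbered bit that is set and that maxnode is the number of bits
in the mask; the function name and the sentinel value are hypothetical.
.nf
```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Sentinel used by this sketch for "empty set, so local allocation". */
#define SKETCH_LOCAL_ALLOCATION (-1L)

/* Return the lowest node ID set in nodemask, or SKETCH_LOCAL_ALLOCATION when
 * nodemask and maxnode describe the empty set. */
long preferred_node(const unsigned long *nodemask, unsigned long maxnode)
{
    unsigned long node;

    assert(maxnode <= (unsigned long)LONG_MAX);
    if (nodemask == NULL || maxnode == 0)
        return SKETCH_LOCAL_ALLOCATION;
    for (node = 0; node < maxnode; node++) {
        if ((nodemask[node / BITS_PER_WORD] >> (node % BITS_PER_WORD)) & 1UL)
            return (long)node;
    }
    return SKETCH_LOCAL_ALLOCATION;
}
```
.fi
.PP
A mask with nodes 3 and 5 set therefore selects node 3, while a NULL
mask or a maxnode of zero falls through to local allocation.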
.PP
The process memory policy is preserved across an
.BR execve (2),
and is inherited by child processes created using
.BR fork (2)
or
.BR clone (2).
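.PP
Inheritance across fork(2) could be observed with a check along the
following lines.
This is a sketch under stated assumptions: it reads the policy mode
through the raw get_mempolicy system call with syscall(2) instead of
the libnuma wrapper, and the helper names are hypothetical.
.nf
```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Read the caller's policy mode via the raw get_mempolicy system call.
 * Returns 0 on success, or -1 with errno set. */
int current_policy_mode(int *mode)
{
    assert(mode != NULL);
    return (int)syscall(SYS_get_mempolicy, mode, NULL, 0UL, NULL, 0UL);
}

/* Returns 1 if a forked child reports the same mode as the parent,
 * 0 if it differs, and -1 if the check could not be carried out. */
int child_inherits_policy_mode(void)
{
    int parent_mode, status;
    pid_t pid;

    if (current_policy_mode(&parent_mode) == -1)
        return -1;

    pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* Child: compare its own mode with the value copied from the parent. */
        int child_mode;
        if (current_policy_mode(&child_mode) == -1)
            _exit(2);
        _exit(child_mode == parent_mode ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    if (WEXITSTATUS(status) == 2)
        return -1;
    return WEXITSTATUS(status) == 0;
}
```
.fi
.PP
The check compares only the policy mode, not the node set, so it is a
partial observation of the inheritance described above.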
.SH RETURN VALUE
On success,
.BR set_mempolicy ()
returns 0;
on error, \-1 is returned and
.I errno
is set to indicate the error.
.SH ERRORS
.TP
.B EFAULT
Part or all of the memory range specified by
.I nodemask
and
.I maxnode
points outside your accessible address space.
.TP
.B EINVAL
.I mode
is invalid.
Or,
.I mode
is
.B MPOL_DEFAULT
and
.I nodemask
is non-empty,
or
.I mode
is
.B MPOL_BIND
or
.B MPOL_INTERLEAVE
and
.I nodemask
is empty.
Or,
.I maxnode
specifies more than a page worth of bits.
Or,
.I nodemask
specifies one or more node IDs that are
greater than the maximum supported node ID.
Or, none of the node IDs specified by
.I nodemask
are on-line and allowed by the process's current cpuset context,
or none of the specified nodes contain memory.
Or, the
.I mode
argument specified both
.B MPOL_F_STATIC_NODES
and
.BR MPOL_F_RELATIVE_NODES .
.TP
.B ENOMEM
Insufficient kernel memory was available.
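.PP
Some of the EINVAL conditions listed above depend only on the arguments
and could be checked before making the call.
The sketch below covers just those conditions (mode and flag
combinations and empty versus non-empty node sets); the kernel performs
further checks, so a zero result here does not guarantee success.
The function name is hypothetical, and the constants are assumed to
match linux/mempolicy.h.
.nf
```c
#include <assert.h>
#include <errno.h>

/* Values assumed to match <linux/mempolicy.h>; defined here only if absent. */
#ifndef MPOL_DEFAULT
#define MPOL_DEFAULT 0
#endif
#ifndef MPOL_PREFERRED
#define MPOL_PREFERRED 1
#endif
#ifndef MPOL_BIND
#define MPOL_BIND 2
#endif
#ifndef MPOL_INTERLEAVE
#define MPOL_INTERLEAVE 3
#endif
#ifndef MPOL_F_STATIC_NODES
#define MPOL_F_STATIC_NODES (1 << 15)
#endif
#ifndef MPOL_F_RELATIVE_NODES
#define MPOL_F_RELATIVE_NODES (1 << 14)
#endif

/* Returns 0 if none of the argument-only rules is violated, EINVAL otherwise.
 * `nodes_in_mask` is the caller's count of node IDs set in nodemask. */
int precheck_policy_args(int mode, unsigned int nodes_in_mask)
{
    const int all_flags = MPOL_F_STATIC_NODES | MPOL_F_RELATIVE_NODES;
    int flags = mode & all_flags;
    int base = mode & ~all_flags;

    if (flags == all_flags)
        return EINVAL;                     /* both mode flags specified */

    switch (base) {
    case MPOL_DEFAULT:
        return nodes_in_mask != 0 ? EINVAL : 0;   /* must be the empty set */
    case MPOL_BIND:
    case MPOL_INTERLEAVE:
        return nodes_in_mask == 0 ? EINVAL : 0;   /* need at least one node */
    case MPOL_PREFERRED:
        return 0;                          /* empty set means local allocation */
    default:
        return EINVAL;                     /* unknown mode */
    }
}
```
.fi
.PP
A check like this can give an earlier and more specific diagnostic than
the single EINVAL value returned by the call itself.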
.SH CONFORMING TO
This system call is Linux-specific.
.SH NOTES
Process policy is not remembered if the page is swapped out.
When such a page is paged back in, it will use the policy of
the process or memory range that is in effect at the time the
page is allocated.
.SS "Versions and Library Support"
See
.BR mbind (2).
.SH SEE ALSO
.BR get_mempolicy (2),
.BR getcpu (2),
.BR mbind (2),
.BR mmap (2),
.BR numa (3),
.BR cpuset (7),
.BR numactl (8)