.\" Copyright (C) Michael Kerrisk, 2004
.\" using some material drawn from earlier man pages
.\" written by Thomas Kuhn, Copyright 1996
.\"
.\" SPDX-License-Identifier: GPL-2.0-or-later
.\"
.TH MLOCK 2 2021-08-27 "Linux man-pages (unreleased)"
.SH NAME
mlock, mlock2, munlock, mlockall, munlockall \- lock and unlock memory
.SH LIBRARY
Standard C library
.RI ( libc ", " \-lc )
.SH SYNOPSIS
.nf
.B #include <sys/mman.h>
.PP
.BI "int mlock(const void *" addr ", size_t " len );
.BI "int mlock2(const void *" addr ", size_t " len ", unsigned int " flags );
.BI "int munlock(const void *" addr ", size_t " len );
.PP
.BI "int mlockall(int " flags );
.B int munlockall(void);
.fi
.SH DESCRIPTION
.BR mlock (),
.BR mlock2 (),
and
.BR mlockall ()
lock part or all of the calling process's virtual address
space into RAM, preventing that memory from being paged to the
swap area.
.PP
.BR munlock ()
and
.BR munlockall ()
perform the converse operation,
unlocking part or all of the calling process's virtual
address space, so that pages in the specified virtual address range may
once more be swapped out if required by the kernel memory manager.
.PP
Memory locking and unlocking are performed in units of whole pages.
.SS mlock(), mlock2(), and munlock()
.BR mlock ()
locks pages in the address range starting at
.I addr
and continuing for
.I len
bytes.
All pages that contain a part of the specified address range are
guaranteed to be resident in RAM when the call returns successfully;
the pages are guaranteed to stay in RAM until later unlocked.
.PP
.BR mlock2 ()
.\" commit a8ca5d0ecbdde5cc3d7accacbd69968b0c98764e
.\" commit de60f5f10c58d4f34b68622442c0e04180367f3f
.\" commit b0f205c2a3082dd9081f9a94e50658c5fa906ff1
also locks pages in the specified range starting at
.I addr
and continuing for
.I len
bytes.
However, the state of the pages contained in that range after the call
returns successfully will depend on the value in the
.I flags
argument.
.PP
The
.I flags
argument can be either 0 or the following constant:
.TP
.B MLOCK_ONFAULT
Lock pages that are currently resident and mark the entire range so
that the remaining nonresident pages are locked when they are populated
by a page fault.
.PP
If
.I flags
is 0,
.BR mlock2 ()
behaves exactly the same as
.BR mlock ().
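.PP
The lock-on-fault behavior described above can be sketched as follows.
This is a minimal illustration, not a definitive implementation:
the helper name
.I map_lock_on_fault
is hypothetical, and the sketch assumes a glibc that declares
.BR mlock2 ()
and
.B MLOCK_ONFAULT
when
.B _GNU_SOURCE
is defined before any header is included.

```c
#define _GNU_SOURCE   /* assumption: needed for mlock2() and MLOCK_ONFAULT */
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map npages anonymous pages and mark the whole
 * range lock-on-fault.  Returns the mapping and stores its length in
 * *len_out, or returns NULL with errno set.
 */
void *map_lock_on_fault(size_t npages, size_t *len_out)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || npages == 0) {
        errno = EINVAL;
        return NULL;
    }
    size_t len = npages * (size_t)page;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    /* Nothing is faulted in here; a page is locked when first touched. */
    if (mlock2(p, len, MLOCK_ONFAULT) == -1) {
        int saved = errno;      /* keep the mlock2() error across munmap() */
        munmap(p, len);
        errno = saved;
        return NULL;
    }
    *len_out = len;
    return p;
}
```

The caller releases the range with
.BR munmap (2),
which also removes the lock.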
.PP
.BR munlock ()
unlocks pages in the address range starting at
.I addr
and continuing for
.I len
bytes.
After this call, all pages that contain a part of the specified
memory range can be moved to external swap space again by the kernel.
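.PP
As a rough sketch of how
.BR mlock ()
and
.BR munlock ()
are typically paired, the following allocates a page-aligned buffer,
locks it, and later wipes and unlocks it.
The helper names
.I alloc_locked
and
.I free_locked
are hypothetical and exist only for this illustration.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: allocate len bytes, page aligned and locked in RAM.
 * Returns NULL with errno set if allocation or locking fails.
 */
void *alloc_locked(size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    void *buf = NULL;
    int err;

    if (page <= 0) {
        errno = EINVAL;
        return NULL;
    }

    /* posix_memalign() returns an error number instead of setting errno. */
    err = posix_memalign(&buf, (size_t)page, len);
    if (err != 0) {
        errno = err;
        return NULL;
    }

    if (mlock(buf, len) == -1) {
        err = errno;            /* keep the mlock() error across free() */
        free(buf);
        errno = err;
        return NULL;
    }
    return buf;
}

/* Wipe, unlock, and free a buffer obtained from alloc_locked(). */
void free_locked(void *buf, size_t len)
{
    if (buf == NULL)
        return;
    memset(buf, 0, len);        /* note: a compiler may elide this store */
    munlock(buf, len);
    free(buf);
}
```

Because locking works on whole pages, the last page of the buffer may
also hold unrelated heap data; that data is locked and unlocked along
with the buffer.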
.SS mlockall() and munlockall()
.BR mlockall ()
locks all pages mapped into the address space of the
calling process.
This includes the pages of the code, data, and stack
segment, as well as shared libraries, user space kernel data, shared
memory, and memory-mapped files.
All mapped pages are guaranteed
to be resident in RAM when the call returns successfully;
the pages are guaranteed to stay in RAM until later unlocked.
.PP
The
.I flags
argument is constructed as the bitwise OR of one or more of the
following constants:
.TP
.B MCL_CURRENT
Lock all pages which are currently mapped into the address space of
the process.
.TP
.B MCL_FUTURE
Lock all pages which will become mapped into the address space of the
process in the future.
These could be, for instance, new pages required
by a growing heap and stack as well as new memory-mapped files or
shared memory regions.
.TP
.BR MCL_ONFAULT " (since Linux 4.4)"
Used together with
.BR MCL_CURRENT ,
.BR MCL_FUTURE ,
or both.
Mark all current (with
.BR MCL_CURRENT )
or future (with
.BR MCL_FUTURE )
mappings to lock pages when they are faulted in.
When used with
.BR MCL_CURRENT ,
all present pages are locked, but
.BR mlockall ()
will not fault in non-present pages.
When used with
.BR MCL_FUTURE ,
all future mappings will be marked to lock pages when they are faulted
in, but they will not be populated by the lock when the mapping is
created.
.B MCL_ONFAULT
must be used with either
.B MCL_CURRENT
or
.B MCL_FUTURE
or both.
.PP
If
.B MCL_FUTURE
has been specified, then a later system call (e.g.,
.BR mmap (2),
.BR sbrk (2),
.BR malloc (3)),
may fail if it would cause the number of locked bytes to exceed
the permitted maximum (see below).
In the same circumstances, stack growth may likewise fail:
the kernel will deny stack expansion and deliver a
.B SIGSEGV
signal to the process.
.PP
.BR munlockall ()
unlocks all pages mapped into the address space of the
calling process.
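.PP
The flag combinations described above might be used as in the following
sketch.
The helper name
.I lock_all_memory
is hypothetical;
the code assumes only that
.B MCL_ONFAULT
may be missing from older headers.

```c
#include <errno.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: lock all current and future mappings.
 * If on_fault is nonzero, request lock-on-fault semantics instead of
 * populating every page up front.  Returns 0, or -1 with errno set.
 */
int lock_all_memory(int on_fault)
{
    int flags = MCL_CURRENT | MCL_FUTURE;

    if (on_fault) {
#ifdef MCL_ONFAULT
        flags |= MCL_ONFAULT;   /* valid only together with the flags above */
#else
        errno = EINVAL;         /* headers predate MCL_ONFAULT */
        return -1;
#endif
    }
    return mlockall(flags);
}
```

A matching
.BR munlockall ()
call undoes the effect of a successful call.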
.SH RETURN VALUE
On success, these system calls return 0.
On error, \-1 is returned,
.I errno
is set to indicate the error,
and no changes are made to any locks in the
address space of the process.
.SH ERRORS
.\"SVr4 documents an additional EAGAIN error code.
.TP
.B EAGAIN
.RB ( mlock (),
.BR mlock2 (),
and
.BR munlock ())
Some or all of the specified address range could not be locked.
.TP
.B EINVAL
.RB ( mlock (),
.BR mlock2 (),
and
.BR munlock ())
The result of the addition
.IR addr + len
was less than
.I addr
(e.g., the addition may have resulted in an overflow).
.TP
.B EINVAL
.RB ( mlock2 ())
Unknown \fIflags\fP were specified.
.TP
.B EINVAL
.RB ( mlockall ())
Unknown \fIflags\fP were specified or
.B MCL_ONFAULT
was specified without either
.B MCL_FUTURE
or
.BR MCL_CURRENT .
.TP
.B EINVAL
(Not on Linux)
.I addr
was not a multiple of the page size.
.TP
.B ENOMEM
.RB ( mlock (),
.BR mlock2 (),
and
.BR munlock ())
Some of the specified address range does not correspond to mapped
pages in the address space of the process.
.TP
.B ENOMEM
.RB ( mlock (),
.BR mlock2 (),
and
.BR munlock ())
Locking or unlocking a region would result in the total number of
mappings with distinct attributes (e.g., locked versus unlocked)
exceeding the allowed maximum.
.\" I.e., the number of VMAs would exceed the 64kB maximum
(For example, unlocking a range in the middle of a currently locked
mapping would result in three mappings:
two locked mappings at each end and an unlocked mapping in the middle.)
.TP
.B ENOMEM
(Linux 2.6.9 and later) the caller had a nonzero
.B RLIMIT_MEMLOCK
soft resource limit, but tried to lock more memory than the limit
permitted.
This limit is not enforced if the process is privileged
.RB ( CAP_IPC_LOCK ).
.TP
.B ENOMEM
(Linux 2.4 and earlier) the calling process tried to lock more than
half of RAM.
.\" In the case of mlock(), this check is somewhat buggy: it doesn't
.\" take into account whether the to-be-locked range overlaps with
.\" already locked pages. Thus, suppose we allocate
.\" (num_physpages / 4 + 1) of memory, and lock those pages once using
.\" mlock(), and then lock the *same* page range a second time.
.\" In this case, the second mlock() call will fail, since the check
.\" calculates that the process is trying to lock (num_physpages / 2 + 2)
.\" pages, which of course is not true. (MTK, Nov 04, kernel 2.4.28)
.TP
.B EPERM
The caller is not privileged, but needs privilege
.RB ( CAP_IPC_LOCK )
to perform the requested operation.
.TP
.B EPERM
.RB ( munlockall ())
(Linux 2.6.8 and earlier) The caller was not privileged
.RB ( CAP_IPC_LOCK ).
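.PP
A caller may want to turn the error codes listed above into a short
diagnostic.
The following sketch shows one way to do so;
the function name
.I mlock_errno_hint
and the message texts are illustrative only and are not part of any library.

```c
#include <errno.h>

/*
 * Hypothetical helper: map an errno value from the memory-locking
 * calls to a one-line summary of the causes documented for it.
 */
const char *mlock_errno_hint(int err)
{
    switch (err) {
    case EAGAIN:
        return "some or all of the range could not be locked";
    case EINVAL:
        return "addr+len overflowed, or unknown flags were specified";
    case ENOMEM:
        return "range not fully mapped, too many mappings, "
               "or RLIMIT_MEMLOCK exceeded";
    case EPERM:
        return "CAP_IPC_LOCK is required for this operation";
    default:
        return "error not documented for the memory-locking calls";
    }
}
```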
.SH VERSIONS
.BR mlock2 ()
is available since Linux 4.4;
glibc support was added in version 2.27.
.SH STANDARDS
.BR mlock (),
.BR munlock (),
.BR mlockall (),
and
.BR munlockall ():
POSIX.1-2001, POSIX.1-2008, SVr4.
.PP
.BR mlock2 ()
is Linux specific.
.PP
On POSIX systems on which
.BR mlock ()
and
.BR munlock ()
are available,
.B _POSIX_MEMLOCK_RANGE
is defined in \fI<unistd.h>\fP and the number of bytes in a page
can be determined from the constant
.B PAGESIZE
(if defined) in \fI<limits.h>\fP or by calling
.IR sysconf(_SC_PAGESIZE) .
.PP
On POSIX systems on which
.BR mlockall ()
and
.BR munlockall ()
are available,
.B _POSIX_MEMLOCK
is defined in \fI<unistd.h>\fP to a value greater than 0.
(See also
.BR sysconf (3).)
.\" POSIX.1-2001: It shall be defined to -1 or 0 or 200112L.
.\" -1: unavailable, 0: ask using sysconf().
.\" glibc defines it to 1.
.SH NOTES
Memory locking has two main applications: real-time algorithms and
high-security data processing.
Real-time applications require
deterministic timing, and, like scheduling, paging is one major cause
of unexpected program execution delays.
Real-time applications will
usually also switch to a real-time scheduler with
.BR sched_setscheduler (2).
Cryptographic security software often handles critical bytes like
passwords or secret keys as data structures.
As a result of paging,
these secrets could be transferred onto a persistent swap store medium,
where they might be accessible to the enemy long after the security
software has erased the secrets in RAM and terminated.
(But be aware that the suspend mode on laptops and some desktop
computers will save a copy of the system's RAM to disk, regardless
of memory locks.)
.PP
Real-time processes that are using
.BR mlockall ()
to prevent delays on page faults should reserve enough
locked stack pages before entering the time-critical section,
so that no page fault can be caused by function calls.
This can be achieved by calling a function that allocates a
sufficiently large automatic variable (an array) and writes to the
memory occupied by this array in order to touch these stack pages.
This way, enough pages will be mapped for the stack and can be
locked into RAM.
The dummy writes ensure that not even copy-on-write
page faults can occur in the critical section.
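.PP
The stack pre-faulting technique just described can be sketched as follows.
The function name
.I prefault_stack
and the 64 KiB reservation are assumptions chosen for illustration;
a real application would size the array to its own worst-case stack use.

```c
#include <stddef.h>
#include <unistd.h>

/* Assumption: 64 KiB of stack is enough for the time-critical section. */
#define PREFAULT_STACK_BYTES (64 * 1024)

/*
 * Hypothetical helper: touch every page of a large automatic array so
 * that the stack pages are mapped (and, after mlockall(), locked)
 * before the time-critical section begins.
 * Returns the number of page-sized steps written.
 */
size_t prefault_stack(void)
{
    /* volatile keeps the compiler from discarding the dummy writes */
    volatile unsigned char dummy[PREFAULT_STACK_BYTES];
    long page = sysconf(_SC_PAGESIZE);
    size_t step = page > 0 ? (size_t)page : 4096;
    size_t touched = 0;

    for (size_t i = 0; i < sizeof(dummy); i += step) {
        dummy[i] = 0;
        touched++;
    }
    /* The array need not be page aligned, so also touch its last byte. */
    dummy[sizeof(dummy) - 1] = 0;
    return touched;
}
```

Such a function would normally be called once, after
.BR mlockall ()
and before the time-critical section is entered.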
.PP
Memory locks are not inherited by a child created via
.BR fork (2)
and are automatically removed (unlocked) during an
.BR execve (2)
or when the process terminates.
The
.BR mlockall ()
.B MCL_FUTURE
and
.B MCL_FUTURE | MCL_ONFAULT
settings are not inherited by a child created via
.BR fork (2)
and are cleared during an
.BR execve (2).
.PP
Note that
.BR fork (2)
will prepare the address space for a copy-on-write operation.
The consequence is that any write access that follows will cause
a page fault that in turn may cause high latencies for a real-time process.
Therefore, it is crucial not to invoke
.BR fork (2)
after an
.BR mlockall ()
or
.BR mlock ()
operation\(emnot even from a thread which runs at a low priority within
a process which also has a thread running at elevated priority.
.PP
The memory lock on an address range is automatically removed
if the address range is unmapped via
.BR munmap (2).
.PP
Memory locks do not stack; that is, pages which have been locked several times
by calls to
.BR mlock (),
.BR mlock2 (),
or
.BR mlockall ()
will be unlocked by a single call to
.BR munlock ()
for the corresponding range or by
.BR munlockall ().
Pages which are mapped to several locations or by several processes stay
locked into RAM as long as they are locked at least at one location or by
at least one process.
.PP
If a call to
.BR mlockall ()
which uses the
.B MCL_FUTURE
flag is followed by another call that does not specify this flag, the
changes made by the
.B MCL_FUTURE
call will be lost.
.PP
The
.BR mlock2 ()
.B MLOCK_ONFAULT
flag and the
.BR mlockall ()
.B MCL_ONFAULT
flag allow efficient memory locking for applications that deal with
large mappings where only a (small) portion of pages in the mapping are touched.
In such cases, locking all of the pages in a mapping would incur
a significant penalty for memory locking.
.SS Linux notes
Under Linux,
.BR mlock (),
.BR mlock2 (),
and
.BR munlock ()
automatically round
.I addr
down to the nearest page boundary.
However, the POSIX.1 specification of
.BR mlock ()
and
.BR munlock ()
allows an implementation to require that
.I addr
is page aligned, so portable applications should ensure this.
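.PP
A portable caller can do the rounding itself, as in the following sketch.
The names
.I page_floor
and
.I mlock_portable
are hypothetical;
the code assumes that the page size is a power of two.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round addr down to a multiple of page (page must be a power of two). */
uintptr_t page_floor(uintptr_t addr, uintptr_t page)
{
    return addr & ~(page - 1);
}

/*
 * Hypothetical wrapper: lock [addr, addr+len) after rounding addr down
 * to a page boundary and growing len by the same amount, so that the
 * call does not rely on the Linux-specific rounding of addr.
 */
int mlock_portable(const void *addr, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0) {
        errno = EINVAL;
        return -1;
    }

    uintptr_t start = page_floor((uintptr_t)addr, (uintptr_t)page);
    size_t slack = (size_t)((uintptr_t)addr - start);

    return mlock((const void *)start, len + slack);
}
```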
.PP
The
.I VmLck
field of the Linux-specific
.I /proc/[pid]/status
file shows how many kilobytes of memory the process with ID
.I PID
has locked using
.BR mlock (),
.BR mlock2 (),
.BR mlockall (),
and
.BR mmap (2)
.BR MAP_LOCKED .
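.PP
A process can read its own
.I VmLck
value as in the following sketch.
The function name
.I read_vmlck_kb
is hypothetical, and the code assumes the usual
"VmLck: <number> kB" line layout.

```c
#include <stdio.h>

/*
 * Hypothetical helper: return the VmLck value (in kB) of the calling
 * process, or -1 if /proc/self/status cannot be read or has no such line.
 */
long read_vmlck_kb(void)
{
    FILE *f = fopen("/proc/self/status", "r");
    char line[256];
    long kb = -1;

    if (f == NULL)
        return -1;

    while (fgets(line, sizeof(line), f) != NULL) {
        /* Whitespace in the format matches the tab and padding spaces. */
        if (sscanf(line, "VmLck: %ld kB", &kb) == 1)
            break;
    }
    fclose(f);
    return kb;
}
```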
.SS Limits and permissions
In Linux 2.6.8 and earlier,
a process must be privileged
.RB ( CAP_IPC_LOCK )
in order to lock memory and the
.B RLIMIT_MEMLOCK
soft resource limit defines a limit on how much memory the process may lock.
.PP
Since Linux 2.6.9, no limits are placed on the amount of memory
that a privileged process can lock and the
.B RLIMIT_MEMLOCK
soft resource limit instead defines a limit on how much memory an
unprivileged process may lock.
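.PP
The soft limit mentioned above can be queried with
.BR getrlimit (2),
as in the following sketch.
The function name
.I memlock_soft_limit
is hypothetical.

```c
#include <sys/resource.h>

/*
 * Hypothetical helper: store the RLIMIT_MEMLOCK soft limit (in bytes)
 * in *limit.  A value of RLIM_INFINITY means that no limit applies.
 * Returns 0 on success, or -1 with errno set by getrlimit().
 */
int memlock_soft_limit(rlim_t *limit)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_MEMLOCK, &rl) == -1)
        return -1;

    *limit = rl.rlim_cur;
    return 0;
}
```

An unprivileged process could compare this value with the size of the
range it is about to lock before calling
.BR mlock ().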
.SH BUGS
In Linux 4.8 and earlier,
a bug in the kernel's accounting of locked memory for unprivileged processes
(i.e., without
.BR CAP_IPC_LOCK )
meant that if the region specified by
.I addr
and
.I len
overlapped an existing lock,
then the already locked bytes in the overlapping region were counted twice
when checking against the limit.
Such double accounting could incorrectly calculate a "total locked memory"
value for the process that exceeded the
.B RLIMIT_MEMLOCK
limit, with the result that
.BR mlock ()
and
.BR mlock2 ()
would fail on requests that should have succeeded.
This bug was fixed
.\" commit 0cf2f6f6dc605e587d2c1120f295934c77e810e8
in Linux 4.9.
.PP
In the 2.4 series Linux kernels up to and including 2.4.17,
a bug caused the
.BR mlockall ()
.B MCL_FUTURE
flag to be inherited across a
.BR fork (2).
This was rectified in kernel 2.4.18.
.PP
Since kernel 2.6.9, if a privileged process calls
.I mlockall(MCL_FUTURE)
and later drops privileges (loses the
.B CAP_IPC_LOCK
capability by, for example,
setting its effective UID to a nonzero value),
then subsequent memory allocations (e.g.,
.BR mmap (2),
.BR brk (2))
will fail if the
.B RLIMIT_MEMLOCK
resource limit is encountered.
.\" See the following LKML thread:
.\" http://marc.theaimsgroup.com/?l=linux-kernel&m=113801392825023&w=2
.\" "Rationale for RLIMIT_MEMLOCK"
.\" 23 Jan 2006
.SH SEE ALSO
.BR mincore (2),
.BR mmap (2),
.BR setrlimit (2),
.BR shmctl (2),
.BR sysconf (3),
.BR proc (5),
.BR capabilities (7)