.\" Copyright (C) Michael Kerrisk, 2004
.\"     using some material drawn from earlier man pages
.\"     written by Thomas Kuhn, Copyright 1996
.\"
.\" %%%LICENSE_START(GPLv2+_DOC_FULL)
.\" This is free documentation; you can redistribute it and/or
.\" modify it under the terms of the GNU General Public License as
.\" published by the Free Software Foundation; either version 2 of
.\" the License, or (at your option) any later version.
.\"
.\" The GNU General Public License's references to "object code"
.\" and "executables" are to be interpreted as the output of any
.\" document formatting or typesetting system, including
.\" intermediate and printed output.
.\"
.\" This manual is distributed in the hope that it will be useful,
.\" but WITHOUT ANY WARRANTY; without even the implied warranty of
.\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
.\" GNU General Public License for more details.
.\"
.\" You should have received a copy of the GNU General Public
.\" License along with this manual; if not, see
.\" <http://www.gnu.org/licenses/>.
.\" %%%LICENSE_END
.\"
.TH MLOCK 2 2015-08-28 "Linux" "Linux Programmer's Manual"
.SH NAME
mlock, mlock2, munlock, mlockall, munlockall \- lock and unlock memory
.SH SYNOPSIS
.nf
.B #include <sys/mman.h>
.sp
.BI "int mlock(const void *" addr ", size_t " len );
.BI "int mlock2(const void *" addr ", size_t " len ", int " flags );
.BI "int munlock(const void *" addr ", size_t " len );
.sp
.BI "int mlockall(int " flags );
.B int munlockall(void);
.fi
.SH DESCRIPTION
.BR mlock (),
.BR mlock2 (),
and
.BR mlockall ()
respectively lock part or all of the calling process's virtual address
space into RAM, preventing that memory from being paged to the
swap area.
.BR munlock ()
and
.BR munlockall ()
perform the converse operation,
respectively unlocking part or all of the calling process's virtual
address space, so that pages in the specified virtual address range may
once more be swapped out if required by the kernel memory manager.
Memory locking and unlocking are performed in units of whole pages.
.SS mlock(), mlock2(), and munlock()
.BR mlock ()
locks pages in the address range starting at
.I addr
and continuing for
.I len
bytes.
All pages that contain a part of the specified address range are
guaranteed to be resident in RAM when the call returns successfully;
the pages are guaranteed to stay in RAM until later unlocked.

.BR mlock2 ()
.\" commit a8ca5d0ecbdde5cc3d7accacbd69968b0c98764e
.\" commit de60f5f10c58d4f34b68622442c0e04180367f3f
.\" commit b0f205c2a3082dd9081f9a94e50658c5fa906ff1
also locks pages in the specified range starting at
.I addr
and continuing for
.I len
bytes.
However, the state of the pages contained in that range after the call
returns successfully will depend on the value in the
.I flags
argument.

The
.I flags
argument can be either 0 or the following constant:
.TP
.B MLOCK_ONFAULT
Lock pages that are currently resident and mark the entire range to have
pages locked when they are populated by a page fault.
.PP

If
.I flags
is 0,
.BR mlock2 ()
behaves exactly the same as
.BR mlock ().

Note: currently, there is not a glibc wrapper for
.BR mlock2 (),
so it will need to be invoked using
.BR syscall (2).
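.PP
As an illustration of the note above, the following is a minimal sketch
of invoking the system call through
.BR syscall (2).
The wrapper name
.I my_mlock2
is hypothetical, and the fallback definition of
.B MLOCK_ONFAULT
assumes the value 0x01 used by the kernel's UAPI headers;
if the system headers do not define
.BR SYS_mlock2 ,
the sketch simply fails with
.BR ENOSYS .
.nf
```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: older headers may lack this constant; 0x01 is the
   value in the kernel's UAPI headers. */
#ifndef MLOCK_ONFAULT
#define MLOCK_ONFAULT 0x01
#endif

/* Hypothetical wrapper (not part of glibc at the time of writing).
   Returns 0 on success, or -1 with errno set on error. */
static int my_mlock2(const void *addr, size_t len, int flags)
{
#ifdef SYS_mlock2
    return (int) syscall(SYS_mlock2, addr, len, flags);
#else
    (void) addr;
    (void) len;
    (void) flags;
    errno = ENOSYS;     /* headers predate mlock2() */
    return -1;
#endif
}
```
.fi
A caller might use
.I my_mlock2(buf, len, MLOCK_ONFAULT)
to lock a large mapping without faulting in every page up front.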
.PP
.BR munlock ()
unlocks pages in the address range starting at
.I addr
and continuing for
.I len
bytes.
After this call, all pages that contain a part of the specified
memory range can be moved to external swap space again by the kernel.
.SS mlockall() and munlockall()
.BR mlockall ()
locks all pages mapped into the address space of the
calling process.
This includes the pages of the code, data, and stack
segments, as well as shared libraries, user space kernel data, shared
memory, and memory-mapped files.
All mapped pages are guaranteed
to be resident in RAM when the call returns successfully;
the pages are guaranteed to stay in RAM until later unlocked.

The
.I flags
argument is constructed as the bitwise OR of one or more of the
following constants:
.TP 1.2i
.B MCL_CURRENT
Lock all pages which are currently mapped into the address space of
the process.
.TP
.B MCL_FUTURE
Lock all pages which will become mapped into the address space of the
process in the future.
These could be, for instance, new pages required
by a growing heap and stack as well as new memory-mapped files or
shared memory regions.
.TP
.BR MCL_ONFAULT " (since Linux 4.4)"
Used together with
.BR MCL_CURRENT ,
.BR MCL_FUTURE ,
or both.
Mark all current (with
.BR MCL_CURRENT )
or future (with
.BR MCL_FUTURE )
mappings to lock pages when they are faulted in.
When used with
.BR MCL_CURRENT ,
all present pages are locked, but
.BR mlockall ()
will not fault in non-present pages.
When used with
.BR MCL_FUTURE ,
all future mappings will be marked to lock pages when they are faulted
in, but they will not be populated by the lock when the mapping is
created.
.B MCL_ONFAULT
must be used with either
.B MCL_CURRENT
or
.B MCL_FUTURE
or both.
.PP
If
.B MCL_FUTURE
has been specified, then a later system call (e.g.,
.BR mmap (2),
.BR sbrk (2),
.BR malloc (3)),
may fail if it would cause the number of locked bytes to exceed
the permitted maximum (see below).
In the same circumstances, stack growth may likewise fail:
the kernel will deny stack expansion and deliver a
.B SIGSEGV
signal to the process.

.BR munlockall ()
unlocks all pages mapped into the address space of the
calling process.
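.PP
A minimal sketch of locking the whole address space, as described
above, might look as follows.
The helper name
.I lock_everything
is hypothetical, and returning the
.I errno
value is merely one possible error convention.
.nf
```c
#include <errno.h>
#include <sys/mman.h>

/* Hypothetical helper: lock all current and future mappings.
   Returns 0 on success, or the errno value reported by mlockall()
   (for example ENOMEM or EPERM) on failure. */
static int lock_everything(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) == -1)
        return errno;
    return 0;
}
```
.fi
A later call to
.BR munlockall ()
undoes the effect of both flags.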
.SH RETURN VALUE
On success, these system calls return 0.
On error, \-1 is returned,
.I errno
is set appropriately, and no changes are made to any locks in the
address space of the process.
.SH ERRORS
.TP
.B ENOMEM
(Linux 2.6.9 and later) the caller had a nonzero
.B RLIMIT_MEMLOCK
soft resource limit, but tried to lock more memory than the limit
permitted.
This limit is not enforced if the process is privileged
.RB ( CAP_IPC_LOCK ).
.TP
.B ENOMEM
(Linux 2.4 and earlier) the calling process tried to lock more than
half of RAM.
.\" In the case of mlock(), this check is somewhat buggy: it doesn't
.\" take into account whether the to-be-locked range overlaps with
.\" already locked pages.  Thus, suppose we allocate
.\" (num_physpages / 4 + 1) of memory, and lock those pages once using
.\" mlock(), and then lock the *same* page range a second time.
.\" In that case, the second mlock() call will fail, since the check
.\" calculates that the process is trying to lock (num_physpages / 2 + 2)
.\" pages, which of course is not true.  (MTK, Nov 04, kernel 2.4.28)
.TP
.B EPERM
The caller is not privileged, but needs privilege
.RB ( CAP_IPC_LOCK )
to perform the requested operation.
.\"SVr4 documents an additional EAGAIN error code.
.LP
For
.BR mlock (),
.BR mlock2 (),
and
.BR munlock ():
.TP
.B EAGAIN
Some or all of the specified address range could not be locked.
.TP
.B EINVAL
The result of the addition
.IR addr + len
was less than
.IR addr
(e.g., the addition may have resulted in an overflow).
.TP
.B EINVAL
(Not on Linux)
.I addr
was not a multiple of the page size.
.TP
.B ENOMEM
Some of the specified address range does not correspond to mapped
pages in the address space of the process.
.TP
.B ENOMEM
Locking or unlocking a region would result in the total number of
mappings with distinct attributes (e.g., locked versus unlocked)
exceeding the allowed maximum.
.\" I.e., the number of VMAs would exceed the 64kB maximum
(For example, unlocking a range in the middle of a currently locked
mapping would result in three mappings:
two locked mappings at each end and an unlocked mapping in the middle.)
.LP
For
.BR mlock2 ():
.TP
.B EINVAL
Unknown \fIflags\fP were specified.
.LP
For
.BR mlockall ():
.TP
.B EINVAL
Unknown \fIflags\fP were specified or
.B MCL_ONFAULT
was specified without either
.B MCL_FUTURE
or
.BR MCL_CURRENT .
.LP
For
.BR munlockall ():
.TP
.B EPERM
(Linux 2.6.8 and earlier) The caller was not privileged
.RB ( CAP_IPC_LOCK ).
.SH VERSIONS
.BR mlock2 ()
is available since Linux 4.4.
.SH CONFORMING TO
POSIX.1-2001, POSIX.1-2008, SVr4.

.BR mlock2 ()
is Linux-specific.
.SH AVAILABILITY
On POSIX systems on which
.BR mlock ()
and
.BR munlock ()
are available,
.B _POSIX_MEMLOCK_RANGE
is defined in \fI<unistd.h>\fP and the number of bytes in a page
can be determined from the constant
.B PAGESIZE
(if defined) in \fI<limits.h>\fP or by calling
.IR sysconf(_SC_PAGESIZE) .

On POSIX systems on which
.BR mlockall ()
and
.BR munlockall ()
are available,
.B _POSIX_MEMLOCK
is defined in \fI<unistd.h>\fP to a value greater than 0.
(See also
.BR sysconf (3).)
.\" POSIX.1-2001: It shall be defined to -1 or 0 or 200112L.
.\" -1: unavailable, 0: ask using sysconf().
.\" glibc defines it to 1.
.SH NOTES
Memory locking has two main applications: real-time algorithms and
high-security data processing.
Real-time applications require
deterministic timing, and, like scheduling, paging is one major cause
of unexpected program execution delays.
Real-time applications will
usually also switch to a real-time scheduler with
.BR sched_setscheduler (2).
Cryptographic security software often handles critical bytes like
passwords or secret keys as data structures.
As a result of paging,
these secrets could be transferred onto a persistent swap storage medium,
where they might be accessible to an attacker long after the security
software has erased the secrets in RAM and terminated.
(But be aware that the suspend mode on laptops and some desktop
computers will save a copy of the system's RAM to disk, regardless
of memory locks.)

Real-time processes that are using
.BR mlockall ()
to prevent delays on page faults should reserve enough
locked stack pages before entering the time-critical section,
so that no page fault can be caused by function calls.
This can be achieved by calling a function that allocates a
sufficiently large automatic variable (an array) and writes to the
memory occupied by this array in order to touch these stack pages.
This way, enough pages will be mapped for the stack and can be
locked into RAM.
The dummy writes ensure that not even copy-on-write
page faults can occur in the critical section.
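.PP
The stack pre-faulting technique described above can be sketched as
follows.
The function name
.I stack_prefault
and the 8 kB value of
.B MAX_SAFE_STACK
are assumptions made for illustration;
a real application would choose a bound that matches its own
worst-case stack usage.
.nf
```c
#include <stddef.h>

/* Assumed upper bound on stack use inside the time-critical
   section; tune this for the application. */
#define MAX_SAFE_STACK (8 * 1024)

/* Touch MAX_SAFE_STACK bytes of stack so that the pages are
   mapped (and, with mlockall() in effect, locked) in advance.
   Returns the number of bytes touched. */
static size_t stack_prefault(void)
{
    volatile unsigned char dummy[MAX_SAFE_STACK];
    size_t i;

    for (i = 0; i < sizeof(dummy); i++)
        dummy[i] = 0;   /* volatile store: not optimized away */
    return sizeof(dummy);
}
```
.fi
Such a function would typically be called once, after
.I mlockall(MCL_CURRENT | MCL_FUTURE)
has succeeded and before the time-critical section begins.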
.PP
Memory locks are not inherited by a child created via
.BR fork (2)
and are automatically removed (unlocked) during an
.BR execve (2)
or when the process terminates.
The
.BR mlockall ()
.B MCL_FUTURE
and
.B MCL_FUTURE | MCL_ONFAULT
settings are not inherited by a child created via
.BR fork (2)
and are cleared during an
.BR execve (2).

The memory lock on an address range is automatically removed
if the address range is unmapped via
.BR munmap (2).

Memory locks do not stack; that is, pages which have been locked several times
by calls to
.BR mlock (),
.BR mlock2 (),
or
.BR mlockall ()
will be unlocked by a single call to
.BR munlock ()
for the corresponding range or by
.BR munlockall ().
Pages which are mapped to several locations or by several processes stay
locked into RAM as long as they are locked at least at one location or by
at least one process.

If a call to
.BR mlockall ()
which uses the
.B MCL_FUTURE
flag is followed by another call that does not specify this flag, the
changes made by the
.B MCL_FUTURE
call will be lost.
.SS Linux notes
Under Linux,
.BR mlock (),
.BR mlock2 (),
and
.BR munlock ()
automatically round
.I addr
down to the nearest page boundary.
However, the POSIX.1 specification of
.BR mlock ()
and
.BR munlock ()
allows an implementation to require that
.I addr
is page aligned, so portable applications should ensure this.
.PP
The
.I VmLck
field of the Linux-specific
.I /proc/PID/status
file shows how many kilobytes of memory the process with ID
.I PID
has locked using
.BR mlock (),
.BR mlock2 (),
.BR mlockall (),
and
.BR mmap (2)
.BR MAP_LOCKED .
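.PP
As a rough illustration, a process might read its own
.I VmLck
value as sketched below.
The helper name
.I read_vmlck_kb
is hypothetical, and the sketch assumes that
.I /proc/self/status
is available and that the field is reported in the form
"VmLck: N kB".
.nf
```c
#include <stdio.h>

/* Hypothetical helper: return the VmLck value of the calling
   process in kilobytes, or -1 if it cannot be determined. */
static long read_vmlck_kb(void)
{
    FILE *fp = fopen("/proc/self/status", "r");
    char line[256];
    long kb = -1;

    if (fp == NULL)
        return -1;
    while (fgets(line, sizeof(line), fp) != NULL) {
        /* kb is left untouched unless the line matches */
        if (sscanf(line, "VmLck: %ld kB", &kb) == 1)
            break;
    }
    fclose(fp);
    return kb;
}
```
.fi
Comparing the value before and after a call to
.BR mlock ()
is one way to check that a lock took effect.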
.SS Limits and permissions
In Linux 2.6.8 and earlier,
a process must be privileged
.RB ( CAP_IPC_LOCK )
in order to lock memory, and the
.B RLIMIT_MEMLOCK
soft resource limit defines a limit on how much memory the process may lock.

Since Linux 2.6.9, no limits are placed on the amount of memory
that a privileged process can lock, and the
.B RLIMIT_MEMLOCK
soft resource limit instead defines a limit on how much memory an
unprivileged process may lock.
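.PP
An unprivileged process can inspect the limit before attempting to
lock memory, as in the following sketch using
.BR getrlimit (2).
The helper name
.I memlock_soft_limit
is hypothetical.
.nf
```c
#include <sys/resource.h>

/* Hypothetical helper: store the RLIMIT_MEMLOCK soft limit, in
   bytes, in *soft.  The stored value may be RLIM_INFINITY.
   Returns 0 on success, or -1 with errno set on error. */
static int memlock_soft_limit(rlim_t *soft)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_MEMLOCK, &rl) == -1)
        return -1;
    *soft = rl.rlim_cur;
    return 0;
}
```
.fi
A request larger than the reported value can be expected to fail with
.B ENOMEM
unless the process has
.BR CAP_IPC_LOCK .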
.SH BUGS
In the 2.4 series Linux kernels up to and including 2.4.17,
a bug caused the
.BR mlockall ()
.B MCL_FUTURE
flag to be inherited across a
.BR fork (2).
This was rectified in kernel 2.4.18.

Since kernel 2.6.9, if a privileged process calls
.I mlockall(MCL_FUTURE)
and later drops privileges (loses the
.B CAP_IPC_LOCK
capability by, for example,
setting its effective UID to a nonzero value),
then subsequent memory allocations (e.g.,
.BR mmap (2),
.BR brk (2))
will fail if the
.B RLIMIT_MEMLOCK
resource limit is encountered.
.\" See the following LKML thread:
.\" http://marc.theaimsgroup.com/?l=linux-kernel&m=113801392825023&w=2
.\" "Rationale for RLIMIT_MEMLOCK"
.\" 23 Jan 2006
.SH SEE ALSO
.BR mmap (2),
.BR setrlimit (2),
.BR shmctl (2),
.BR sysconf (3),
.BR proc (5),
.BR capabilities (7)