.\" Hey Emacs! This file is -*- nroff -*- source.
.\"
.\" Copyright (c) 1992 Drew Eckhardt <drew@cs.colorado.edu>, March 28, 1992
.\" and Copyright (c) Michael Kerrisk, 2001, 2002, 2005, 2013
.\" May be distributed under the GNU General Public License.
.\" Modified by Michael Haardt <michael@moria.de>
.\" Modified 24 Jul 1993 by Rik Faith <faith@cs.unc.edu>
.\" Modified 21 Aug 1994 by Michael Chastain <mec@shell.portal.com>:
.\"   New man page (copied from 'fork.2').
.\" Modified 10 June 1995 by Andries Brouwer <aeb@cwi.nl>
.\" Modified 25 April 1998 by Xavier Leroy <Xavier.Leroy@inria.fr>
.\" Modified 26 Jun 2001 by Michael Kerrisk
.\"     Mostly upgraded to 2.4.x
.\"     Added prototype for sys_clone() plus description
.\"     Added CLONE_THREAD with a brief description of thread groups
.\"     Added CLONE_PARENT and revised entire page to remove ambiguity
.\"             between "calling process" and "parent process"
.\"     Added CLONE_PTRACE and CLONE_VFORK
.\"     Added EPERM and EINVAL error codes
.\"     Renamed "__clone" to "clone" (which is the prototype in <sched.h>)
.\"     various other minor tidy ups and clarifications.
.\" Modified 26 Jun 2001 by Michael Kerrisk <mtk.manpages@gmail.com>
.\"     Updated notes for 2.4.7+ behavior of CLONE_THREAD
.\" Modified 15 Oct 2002 by Michael Kerrisk <mtk.manpages@gmail.com>
.\"     Added description for CLONE_NEWNS, which was added in 2.4.19
.\" Slightly rephrased, aeb.
.\" Modified 1 Feb 2003 - added CLONE_SIGHAND restriction, aeb.
.\" Modified 1 Jan 2004 - various updates, aeb
.\" Modified 2004-09-10 - added CLONE_PARENT_SETTID etc. - aeb.
.\" 2005-04-12, mtk, noted the PID caching behavior of NPTL's getpid()
.\"     wrapper under BUGS.
.\" 2005-05-10, mtk, added CLONE_SYSVSEM, CLONE_UNTRACED, CLONE_STOPPED.
.\" 2005-05-17, mtk, Substantially enhanced discussion of CLONE_THREAD.
.\" 2008-11-18, mtk, order CLONE_* flags alphabetically
.\" 2008-11-18, mtk, document CLONE_NEWPID
.\" 2008-11-19, mtk, document CLONE_NEWUTS
.\" 2008-11-19, mtk, document CLONE_NEWIPC
.\" 2008-11-19, Jens Axboe, mtk, document CLONE_IO
.\"
.\" FIXME Document CLONE_NEWUSER, which is new in 2.6.23
.\"       (also supported for unshare()?)
.\"
.TH CLONE 2 2013-01-01 "Linux" "Linux Programmer's Manual"
.SH NAME
clone, __clone2 \- create a child process
.SH SYNOPSIS
.nf
.BR "#define _GNU_SOURCE" "             /* See feature_test_macros(7) */"
.\" Actually _BSD_SOURCE || _SVID_SOURCE
.\" FIXME See http://sources.redhat.com/bugzilla/show_bug.cgi?id=4749
.B #include <sched.h>

.BI "int clone(int (*" "fn" ")(void *), void *" child_stack ,
.BI "          int " flags ", void *" "arg" ", ... "
.BI "          /* pid_t *" ptid ", struct user_desc *" tls \
", pid_t *" ctid " */ );"
.fi
.SH DESCRIPTION
.BR clone ()
creates a new process, in a manner similar to
.BR fork (2).
It is actually a library function layered on top of the underlying
.BR clone ()
system call, hereinafter referred to as
.BR sys_clone .
A description of
.B sys_clone
is given toward the end of this page.

Unlike
.BR fork (2),
these calls
allow the child process to share parts of its execution context with
the calling process, such as the memory space, the table of file
descriptors, and the table of signal handlers.
(Note that on this manual
page, "calling process" normally corresponds to "parent process".
But see the description of
.B CLONE_PARENT
below.)

The main use of
.BR clone ()
is to implement threads: multiple threads of control in a program that
run concurrently in a shared memory space.

When the child process is created with
.BR clone (),
it executes the function
.IR fn ( arg ).
(This differs from
.BR fork (2),
where execution continues in the child from the point
of the
.BR fork (2)
call.)
The
.I fn
argument is a pointer to a function that is called by the child
process at the beginning of its execution.
The
.I arg
argument is passed to the
.I fn
function.

When the
.IR fn ( arg )
function application returns, the child process terminates.
The integer returned by
.I fn
is the exit code for the child process.
The child process may also terminate explicitly by calling
.BR exit (2)
or after receiving a fatal signal.

The
.I child_stack
argument specifies the location of the stack used by the child process.
Since the child and calling process may share memory,
it is not possible for the child process to execute in the
same stack as the calling process.
The calling process must therefore
set up memory space for the child stack and pass a pointer to this
space to
.BR clone ().
Stacks grow downward on all processors that run Linux
(except the HP PA processors), so
.I child_stack
usually points to the topmost address of the memory space set up for
the child stack.
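
The following is a minimal sketch of the calling convention described above, not a definitive implementation.
It clones a child that shares nothing with the caller, passes the topmost address of a heap-allocated stack, and collects the value returned by fn as the exit status.
The helper name run_child, the 64 kB stack size, and the use of malloc(3) are assumptions made for this example, as is an architecture on which stacks grow downward.

.nf
```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (64 * 1024)    /* assumed size, not mandated */

/* fn: runs in the child; its return value becomes the exit code. */
static int child_fn(void *arg)
{
    return *(int *) arg;
}

/* Hypothetical helper: clone a child that runs child_fn(&code) and
   return the exit status collected by the parent, or -1 on error. */
static int run_child(int code)
{
    char *stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* Stacks grow downward here, so pass the topmost address. */
    char *stack_top = stack + CHILD_STACK_SIZE;

    int status;
    int result = -1;
    pid_t pid = clone(child_fn, stack_top, SIGCHLD, &code);
    if (pid != -1 && waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        result = WEXITSTATUS(status);

    free(stack);
    return result;
}
```
.fi

Under these assumptions, run_child(7) would be expected to return 7, because the integer returned by fn is the child's exit code.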

The low byte of
.I flags
contains the number of the
.I "termination signal"
sent to the parent when the child dies.
If this signal is specified as anything other than
.BR SIGCHLD ,
then the parent process must specify the
.B __WALL
or
.B __WCLONE
options when waiting for the child with
.BR wait (2).
If no signal is specified, then the parent process is not signaled
when the child terminates.
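
The effect of the termination signal on waiting can be sketched as follows.
The example creates a child with no termination signal (the low byte of flags is zero), so a plain waitpid(2) is expected to fail with ECHILD, while a call that adds __WCLONE reaps the child.
The helper name wait_needs_wclone and the stack size are assumptions made for this illustration.

.nf
```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (64 * 1024)    /* assumed size */

static int child_fn(void *arg)
{
    (void) arg;
    return 0;
}

/* Hypothetical helper: returns 1 if a plain waitpid() failed with
   ECHILD and a second waitpid() with __WCLONE reaped the child;
   returns 0 otherwise. */
static int wait_needs_wclone(void)
{
    char *stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL)
        return 0;

    int ok = 0;
    /* flags == 0: no termination signal is sent to the parent. */
    pid_t pid = clone(child_fn, stack + CHILD_STACK_SIZE, 0, NULL);
    if (pid != -1) {
        int status;
        pid_t plain = waitpid(pid, &status, 0);
        int plain_failed = (plain == -1 && errno == ECHILD);
        pid_t reaped = plain;
        if (plain_failed)
            reaped = waitpid(pid, &status, __WCLONE);
        ok = plain_failed && reaped == pid;
    }

    free(stack);
    return ok;
}
```
.fi

Using __WALL instead of __WCLONE in the second call would also work, since __WALL waits for all children regardless of type.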

.I flags
may also be bitwise-or'ed with zero or more of the following constants,
in order to specify what is shared between the calling process
and the child process:
.TP
.BR CLONE_CHILD_CLEARTID " (since Linux 2.5.49)"
Erase child thread ID at location
.I ctid
in child memory when the child exits, and do a wakeup on the futex
at that address.
The address involved may be changed by the
.BR set_tid_address (2)
system call.
This is used by threading libraries.
.TP
.BR CLONE_CHILD_SETTID " (since Linux 2.5.49)"
Store child thread ID at location
.I ctid
in child memory.
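
A rough sketch of how the two ctid flags interact is shown here.
It assumes that CLONE_VM is also set, so that parent and child share memory and the parent can observe the location; the struct tid_demo type and the helper name are hypothetical.
The child records the value it finds at ctid, and the parent checks that the location is zero once the child has been reaped.

.nf
```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (64 * 1024)    /* assumed size */

/* State shared by parent and child (CLONE_VM is used). */
struct tid_demo {
    volatile pid_t ctid;    /* location passed as the ctid argument */
    volatile pid_t seen;    /* value the child found at ctid */
};

static int child_fn(void *arg)
{
    struct tid_demo *d = arg;
    d->seen = d->ctid;      /* stored by CLONE_CHILD_SETTID */
    return 0;
}

/* Hypothetical helper: returns 1 if the child saw its own thread ID
   at ctid and the location was cleared when the child exited. */
static int ctid_set_then_cleared(void)
{
    struct tid_demo d = { .ctid = -1, .seen = -1 };
    char *stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL)
        return 0;

    int ok = 0;
    pid_t pid = clone(child_fn, stack + CHILD_STACK_SIZE,
                      CLONE_VM | CLONE_CHILD_SETTID |
                      CLONE_CHILD_CLEARTID | SIGCHLD,
                      &d, NULL, NULL, (pid_t *) &d.ctid);
    if (pid != -1 && waitpid(pid, NULL, 0) == pid)
        ok = (d.seen == pid && d.ctid == 0);

    free(stack);
    return ok;
}
```
.fi

The child function deliberately avoids C library calls, since in this sketch it shares the caller's memory and thread-local storage.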
.TP
.BR CLONE_FILES " (since Linux 2.0)"
If
.B CLONE_FILES
is set, the calling process and the child process share the same file
descriptor table.
Any file descriptor created by the calling process or by the child
process is also valid in the other process.
Similarly, if one of the processes closes a file descriptor,
or changes its associated flags (using the
.BR fcntl (2)
.B F_SETFD
operation), the other process is also affected.

If
.B CLONE_FILES
is not set, the child process inherits a copy of all file descriptors
opened in the calling process at the time of
.BR clone ().
(The duplicated file descriptors in the child refer to the
same open file descriptions (see
.BR open (2))
as the corresponding file descriptors in the calling process.)
Subsequent operations that open or close file descriptors,
or change file descriptor flags,
performed by either the calling
process or the child process do not affect the other process.
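
The difference between the two cases can be sketched as follows.
The child closes a descriptor and the parent then checks, with fcntl(2) F_GETFD, whether its own descriptor of the same number is still open.
The helper name fd_survives_child_close, the use of a pipe as the test descriptor, and the stack size are assumptions made for this example.

.nf
```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define CHILD_STACK_SIZE (64 * 1024)    /* assumed size */

/* Child: close the descriptor whose number is passed via arg. */
static int close_fd(void *arg)
{
    return close(*(int *) arg) == 0 ? 0 : 1;
}

/* Hypothetical helper: returns 1 if a descriptor closed by the child
   is still open in the parent, 0 if it is gone, -1 on error.
   extra_flags is either 0 or CLONE_FILES. */
static int fd_survives_child_close(int extra_flags)
{
    int pipefd[2];
    if (pipe(pipefd) == -1)
        return -1;

    char *stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL) {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }

    int result = -1;
    pid_t pid = clone(close_fd, stack + CHILD_STACK_SIZE,
                      extra_flags | SIGCHLD, &pipefd[0]);
    if (pid != -1 && waitpid(pid, NULL, 0) == pid)
        result = (fcntl(pipefd[0], F_GETFD) != -1);

    if (result != 0)
        close(pipefd[0]);       /* still open in the parent */
    close(pipefd[1]);
    free(stack);
    return result;
}
```
.fi

With these assumptions, fd_survives_child_close(0) would be expected to return 1 and fd_survives_child_close(CLONE_FILES) to return 0.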
.TP
.BR CLONE_FS " (since Linux 2.0)"
If
.B CLONE_FS
is set, the caller and the child process share the same file system
information.
This includes the root of the file system, the current
working directory, and the umask.
Any call to
.BR chroot (2),
.BR chdir (2),
or
.BR umask (2)
performed by the calling process or the child process also affects the
other process.

If
.B CLONE_FS
is not set, the child process works on a copy of the file system
information of the calling process at the time of the
.BR clone ()
call.
Calls to
.BR chroot (2),
.BR chdir (2),
or
.BR umask (2)
performed later by one of the processes do not affect the other process.
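
As a sketch of this behavior, the example below lets the child change its umask and then reports the umask that the parent observes afterward.
The helper name parent_umask_after_child and the stack size are assumptions; the parent's original umask is restored before the helper returns.

.nf
```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (64 * 1024)    /* assumed size */

/* Child: set the umask to the value passed via arg. */
static int set_umask(void *arg)
{
    umask(*(mode_t *) arg);
    return 0;
}

/* Hypothetical helper: returns the parent's umask as seen after the
   child has set its own umask to child_mask, or -1 on error.
   extra_flags is either 0 or CLONE_FS. */
static int parent_umask_after_child(int extra_flags, mode_t child_mask)
{
    char *stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* umask() can only be read by setting it, so set and restore. */
    mode_t original = umask(0);
    umask(original);

    int result = -1;
    pid_t pid = clone(set_umask, stack + CHILD_STACK_SIZE,
                      extra_flags | SIGCHLD, &child_mask);
    if (pid != -1 && waitpid(pid, NULL, 0) == pid) {
        /* Returns the mask now in effect and restores the original. */
        mode_t now = umask(original);
        result = (int) now;
    }

    free(stack);
    return result;
}
```
.fi

If the caller's umask is 022, the helper would be expected to return 022 without CLONE_FS and the child's mask (for example, 077) with CLONE_FS.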
.TP
.BR CLONE_IO " (since Linux 2.6.25)"
If
.B CLONE_IO
is set, then the new process shares an I/O context with
the calling process.
If this flag is not set, then (as with
.BR fork (2))
the new process has its own I/O context.

.\" The following based on text from Jens Axboe
The I/O context is the I/O scope of the disk scheduler (i.e.,
what the I/O scheduler uses to model scheduling of a process's I/O).
If processes share the same I/O context,
they are treated as one by the I/O scheduler.
As a consequence, they get to share disk time.
For some I/O schedulers,
.\" the anticipatory and CFQ scheduler
if two processes share an I/O context,
they will be allowed to interleave their disk access.
If several threads are doing I/O on behalf of the same process
.RB ( aio_read (3),
for instance), they should employ
.BR CLONE_IO
to get better I/O performance.
.\" with CFQ and AS.

If the kernel is not configured with the
.B CONFIG_BLOCK
option, this flag is a no-op.
.TP
.BR CLONE_NEWIPC " (since Linux 2.6.19)"
If
.B CLONE_NEWIPC
is set, then create the process in a new IPC namespace.
If this flag is not set, then (as with
.BR fork (2)),
the process is created in the same IPC namespace as
the calling process.
This flag is intended for the implementation of containers.

An IPC namespace provides an isolated view of System V IPC objects (see
.BR svipc (7))
and (since Linux 2.6.30)
.\" commit 7eafd7c74c3f2e67c27621b987b28397110d643f
.\" https://lwn.net/Articles/312232/
POSIX message queues
(see
.BR mq_overview (7)).
The common characteristic of these IPC mechanisms is that IPC
objects are identified by mechanisms other than filesystem
pathnames.

Objects created in an IPC namespace are visible to all other processes
that are members of that namespace,
but are not visible to processes in other IPC namespaces.

When an IPC namespace is destroyed
(i.e., when the last process that is a member of the namespace terminates),
all IPC objects in the namespace are automatically destroyed.

Use of this flag requires: a kernel configured with the
.B CONFIG_SYSVIPC
and
.B CONFIG_IPC_NS
options and that the process be privileged
.RB ( CAP_SYS_ADMIN ).
This flag can't be specified in conjunction with
.BR CLONE_SYSVSEM .
.TP
.BR CLONE_NEWNET " (since Linux 2.6.24)"
.\" FIXME Check when the implementation was completed
(The implementation of this flag was only completed
by about kernel version 2.6.29.)

If
.B CLONE_NEWNET
is set, then create the process in a new network namespace.
If this flag is not set, then (as with
.BR fork (2)),
the process is created in the same network namespace as
the calling process.
This flag is intended for the implementation of containers.

A network namespace provides an isolated view of the networking stack
(network device interfaces, IPv4 and IPv6 protocol stacks,
IP routing tables, firewall rules, the
.I /proc/net
and
.I /sys/class/net
directory trees, sockets, etc.).
A physical network device can live in exactly one
network namespace.
A virtual network device ("veth") pair provides a pipe-like abstraction
.\" FIXME Add pointer to veth(4) page when it is eventually completed
that can be used to create tunnels between network namespaces,
and can be used to create a bridge to a physical network device
in another namespace.

When a network namespace is freed
(i.e., when the last process in the namespace terminates),
its physical network devices are moved back to the
initial network namespace (not to the parent of the process).

Use of this flag requires: a kernel configured with the
.B CONFIG_NET_NS
option and that the process be privileged
.RB ( CAP_SYS_ADMIN ).
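
One possible way to observe the new namespace is sketched below.
The parent and the child each read the symbolic link /proc/self/ns/net and the child reports whether the two identifiers differ.
The use of that link is an assumption of this example (it is not described on this page and requires procfs mounted on /proc and a kernel that provides the ns directory), and the helper names are hypothetical.
If the caller lacks CAP_SYS_ADMIN, clone() is expected to fail and the helper returns \-1.

.nf
```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define CHILD_STACK_SIZE (64 * 1024)    /* assumed size */
#define NS_LINK "/proc/self/ns/net"     /* assumes procfs on /proc */

/* Read the namespace identifier behind NS_LINK; 0 on success. */
static int read_net_ns(char *buf, size_t size)
{
    ssize_t n = readlink(NS_LINK, buf, size - 1);
    if (n == -1)
        return -1;
    buf[n] = 0;                 /* readlink() does not terminate */
    return 0;
}

/* Child: exit with 0 if its namespace differs from the parent's
   (identifier passed via arg), 1 if it is the same, 2 on error. */
static int compare_ns(void *arg)
{
    char mine[64];
    if (read_net_ns(mine, sizeof(mine)) == -1)
        return 2;
    return strcmp(mine, (const char *) arg) != 0 ? 0 : 1;
}

/* Hypothetical helper: returns 1 if the child ran in a different
   network namespace, 0 if not, and -1 on any failure (for example,
   EPERM from clone() without CAP_SYS_ADMIN). */
static int child_has_new_net_ns(void)
{
    char parent_ns[64];
    if (read_net_ns(parent_ns, sizeof(parent_ns)) == -1)
        return -1;

    char *stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL)
        return -1;

    int result = -1;
    int status;
    pid_t pid = clone(compare_ns, stack + CHILD_STACK_SIZE,
                      CLONE_NEWNET | SIGCHLD, parent_ns);
    if (pid != -1 && waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) != 2)
        result = (WEXITSTATUS(status) == 0);

    free(stack);
    return result;
}
```
.fi

A privileged caller would be expected to get 1; an unprivileged caller would typically get \-1.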
.TP
.BR CLONE_NEWNS " (since Linux 2.4.19)"
Start the child in a new mount namespace.

Every process lives in a mount namespace.
The
.I namespace
of a process is the data (the set of mounts) describing the file hierarchy
as seen by that process.
After a
.BR fork (2)
or
.BR clone ()
where the
.B CLONE_NEWNS
flag is not set, the child lives in the same mount
namespace as the parent.
The system calls
.BR mount (2)
and
.BR umount (2)
change the mount namespace of the calling process, and hence affect
all processes that live in the same namespace, but do not affect
processes in a different mount namespace.

After a
.BR clone ()
where the
.B CLONE_NEWNS
flag is set, the cloned child is started in a new mount namespace,
initialized with a copy of the namespace of the parent.

Only a privileged process (one having the \fBCAP_SYS_ADMIN\fP capability)
may specify the
.B CLONE_NEWNS
flag.
It is not permitted to specify both
.B CLONE_NEWNS
and
.B CLONE_FS
in the same
.BR clone ()
call.
.TP
.BR CLONE_NEWPID " (since Linux 2.6.24)"
.\" This explanation draws a lot of details from
.\" http://lwn.net/Articles/259217/
.\" Authors: Pavel Emelyanov <xemul@openvz.org>
.\" and Kir Kolyshkin <kir@openvz.org>
.\"
.\" The primary kernel commit is 30e49c263e36341b60b735cbef5ca37912549264
.\" Author: Pavel Emelyanov <xemul@openvz.org>
If
.B CLONE_NEWPID
is set, then create the process in a new PID namespace.
If this flag is not set, then (as with
.BR fork (2)),
the process is created in the same PID namespace as
the calling process.
This flag is intended for the implementation of containers.

A PID namespace provides an isolated environment for PIDs:
PIDs in a new namespace start at 1,
somewhat like a standalone system, and calls to
.BR fork (2),
.BR vfork (2),
or
.BR clone ()
will produce processes with PIDs that are unique within the namespace.

The first process created in a new namespace
(i.e., the process created using the
.BR CLONE_NEWPID
flag) has the PID 1, and is the "init" process for the namespace.
Children that are orphaned within the namespace will be reparented
to this process rather than
.BR init (8).
Unlike the traditional
.B init
process, the "init" process of a PID namespace can terminate,
and if it does, all of the processes in the namespace are terminated.
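
The PID 1 property can be sketched as follows.
The child asks the kernel for its PID with a raw system call and reports whether the answer is 1.
The helper name child_is_init_of_new_pid_ns and the stack size are assumptions made for this example; without CAP_SYS_ADMIN, clone() is expected to fail and the helper returns \-1.

.nf
```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define CHILD_STACK_SIZE (64 * 1024)    /* assumed size */

/* Child: exit with status 0 if it sees itself as PID 1.
   The raw system call avoids any PID caching in the C library. */
static int check_pid(void *arg)
{
    (void) arg;
    return syscall(SYS_getpid) == 1 ? 0 : 1;
}

/* Hypothetical helper: returns 1 if the child was PID 1 in its new
   namespace, 0 if it was not, and -1 if the child could not be
   created or reaped (for example, EPERM without CAP_SYS_ADMIN). */
static int child_is_init_of_new_pid_ns(void)
{
    char *stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL)
        return -1;

    int result = -1;
    int status;
    pid_t pid = clone(check_pid, stack + CHILD_STACK_SIZE,
                      CLONE_NEWPID | SIGCHLD, NULL);
    if (pid != -1 && waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        result = (WEXITSTATUS(status) == 0);

    free(stack);
    return result;
}
```
.fi

Note that the value returned by clone() in the parent is the child's PID as seen from the parent's namespace, which is why the check is made inside the child.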

PID namespaces form a hierarchy.