Commit | Line | Data |
---|---|---|
fea681da MK |
1 | .\" Hey Emacs! This file is -*- nroff -*- source. |
2 | .\" | |
3 | .\" Copyright (c) 1992 Drew Eckhardt <drew@cs.colorado.edu>, March 28, 1992 | |
1130df60 | 4 | .\" and Copyright (c) Michael Kerrisk, 2001, 2002, 2005 |
fea681da MK |
5 | .\" May be distributed under the GNU General Public License. |
6 | .\" Modified by Michael Haardt <michael@moria.de> | |
7 | .\" Modified 24 Jul 1993 by Rik Faith <faith@cs.unc.edu> | |
8 | .\" Modified 21 Aug 1994 by Michael Chastain <mec@shell.portal.com>: | |
9 | .\" New man page (copied from 'fork.2'). | |
10 | .\" Modified 10 June 1995 by Andries Brouwer <aeb@cwi.nl> | |
11 | .\" Modified 25 April 1998 by Xavier Leroy <Xavier.Leroy@inria.fr> | |
12 | .\" Modified 26 Jun 2001 by Michael Kerrisk | |
13 | .\" Mostly upgraded to 2.4.x | |
14 | .\" Added prototype for sys_clone() plus description | |
15 | .\" Added CLONE_THREAD with a brief description of thread groups | |
c13182ef | 16 | .\" Added CLONE_PARENT and revised entire page to remove ambiguity |
fea681da MK |
17 | .\" between "calling process" and "parent process" |
18 | .\" Added CLONE_PTRACE and CLONE_VFORK | |
19 | .\" Added EPERM and EINVAL error codes | |
fd8a5be4 | 20 | .\" Renamed "__clone" to "clone" (which is the prototype in <sched.h>) |
fea681da | 21 | .\" various other minor tidy ups and clarifications. |
c11b1abf | 22 | .\" Modified 26 Jun 2001 by Michael Kerrisk <mtk.manpages@gmail.com> |
d9bfdb9c | 23 | .\" Updated notes for 2.4.7+ behavior of CLONE_THREAD |
c11b1abf | 24 | .\" Modified 15 Oct 2002 by Michael Kerrisk <mtk.manpages@gmail.com> |
fea681da MK |
25 | .\" Added description for CLONE_NEWNS, which was added in 2.4.19 |
26 | .\" Slightly rephrased, aeb. | |
27 | .\" Modified 1 Feb 2003 - added CLONE_SIGHAND restriction, aeb. | |
28 | .\" Modified 1 Jan 2004 - various updates, aeb | |
0967c11f | 29 | .\" Modified 2004-09-10 - added CLONE_PARENT_SETTID etc. - aeb. |
d9bfdb9c | 30 | .\" 2005-04-12, mtk, noted the PID caching behavior of NPTL's getpid() |
31830ef0 | 31 | .\" wrapper under BUGS. |
fd8a5be4 MK |
32 | .\" 2005-05-10, mtk, added CLONE_SYSVSEM, CLONE_UNTRACED, CLONE_STOPPED. |
33 | .\" 2005-05-17, mtk, Substantially enhanced discussion of CLONE_THREAD. | |
82ee147a MK |
34 | .\" 2008-11-18, mtk, order CLONE_* flags alphabetically |
35 | .\" 2008-11-18, mtk, document CLONE_NEWPID | |
43ce9dda | 36 | .\" 2008-11-19, mtk, document CLONE_NEWUTS |
667417b3 | 37 | .\" 2008-11-19, mtk, document CLONE_NEWIPC |
cfdc761b | 38 | .\" 2008-11-19, Jens Axboe, mtk, document CLONE_IO |
fea681da | 39 | .\" |
185341d4 MK |
40 | .\" FIXME Document CLONE_NEWUSER, which is new in 2.6.23 |
41 | .\" (also supported for unshare()?) | |
6807c79a | 42 | .\" FIXME . 2.6.25 marks the unused CLONE_STOPPED as obsolete, and it will |
21a0b03d | 43 | .\" probably be removed in the future. |
360ed6b3 | 44 | .\" |
86b91fdf | 45 | .TH CLONE 2 2010-09-10 "Linux" "Linux Programmer's Manual" |
fea681da | 46 | .SH NAME |
9b0e0996 | 47 | clone, __clone2 \- create a child process |
fea681da | 48 | .SH SYNOPSIS |
c10859eb | 49 | .nf |
86b91fdf | 50 | .BR "#define _GNU_SOURCE" " /* See feature_test_macros(7) */" |
cc4615cc MK |
51 | .\" Actually _BSD_SOURCE || _SVID_SOURCE |
52 | .\" See http://sources.redhat.com/bugzilla/show_bug.cgi?id=4749 | |
fea681da | 53 | .B #include <sched.h> |
c10859eb | 54 | |
ff929e3b MK |
55 | .BI "int clone(int (*" "fn" ")(void *), void *" child_stack , |
56 | .BI " int " flags ", void *" "arg" ", ... " | |
d3dbc9b1 | 57 | .BI " /* pid_t *" ptid ", struct user_desc *" tls \ |
ff929e3b | 58 | ", pid_t *" ctid " */ );" |
c10859eb | 59 | .fi |
fea681da | 60 | .SH DESCRIPTION |
edcc65ff MK |
61 | .BR clone () |
62 | creates a new process, in a manner similar to | |
fea681da | 63 | .BR fork (2). |
735f354f | 64 | It is actually a library function layered on top of the underlying |
e511ffb6 | 65 | .BR clone () |
fea681da MK |
66 | system call, hereinafter referred to as |
67 | .BR sys_clone . | |
68 | A description of | |
0daa9e92 | 69 | .B sys_clone |
fea681da MK |
70 | is given towards the end of this page. |
71 | ||
72 | Unlike | |
73 | .BR fork (2), | |
c13182ef | 74 | these calls |
fea681da MK |
75 | allow the child process to share parts of its execution context with |
76 | the calling process, such as the memory space, the table of file | |
c13182ef MK |
77 | descriptors, and the table of signal handlers. |
78 | (Note that on this manual | |
79 | page, "calling process" normally corresponds to "parent process". | |
80 | But see the description of | |
81 | .B CLONE_PARENT | |
fea681da MK |
82 | below.) |
83 | ||
84 | The main use of | |
edcc65ff | 85 | .BR clone () |
fea681da MK |
86 | is to implement threads: multiple threads of control in a program that |
87 | run concurrently in a shared memory space. | |
88 | ||
89 | When the child process is created with | |
c13182ef | 90 | .BR clone (), |
fea681da MK |
91 | it executes the function |
92 | application | |
c13182ef | 93 | .IR fn ( arg ). |
fea681da | 94 | (This differs from |
c13182ef | 95 | .BR fork (2), |
fea681da | 96 | where execution continues in the child from the point |
c13182ef MK |
97 | of the |
98 | .BR fork (2) | |
fea681da MK |
99 | call.) |
100 | The | |
101 | .I fn | |
102 | argument is a pointer to a function that is called by the child | |
103 | process at the beginning of its execution. | |
104 | The | |
105 | .I arg | |
106 | argument is passed to the | |
107 | .I fn | |
108 | function. | |
109 | ||
c13182ef | 110 | When the |
fea681da | 111 | .IR fn ( arg ) |
c13182ef MK |
112 | function application returns, the child process terminates. |
113 | The integer returned by | |
fea681da | 114 | .I fn |
c13182ef MK |
115 | is the exit code for the child process. |
116 | The child process may also terminate explicitly by calling | |
fea681da MK |
117 | .BR exit (2) |
118 | or after receiving a fatal signal. | |
119 | ||
120 | The | |
121 | .I child_stack | |
c13182ef MK |
122 | argument specifies the location of the stack used by the child process. |
123 | Since the child and calling process may share memory, | |
fea681da | 124 | it is not possible for the child process to execute in the |
c13182ef MK |
125 | same stack as the calling process. |
126 | The calling process must therefore | |
fea681da MK |
127 | set up memory space for the child stack and pass a pointer to this |
128 | space to | |
edcc65ff | 129 | .BR clone (). |
fea681da MK |
130 | Stacks grow downwards on all processors that run Linux |
131 | (except the HP PA processors), so | |
132 | .I child_stack | |
133 | usually points to the topmost address of the memory space set up for | |
134 | the child stack. | |
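
The stack handling described above can be sketched as follows. This is a minimal illustration, not part of the original page: the names run_child, child_fn and STACK_SIZE are hypothetical, the stack is assumed to grow downwards, and SIGCHLD is used as the termination signal so that an ordinary waitpid(2) call collects the child.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (1024 * 1024)    /* hypothetical size, chosen for illustration */

/* Runs in the child; its return value becomes the child's exit code. */
static int child_fn(void *arg)
{
    return *(int *) arg;
}

/* Returns the child's exit status, or -1 on error. */
int run_child(int code)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* Assumes a downward-growing stack: pass the topmost address. */
    pid_t pid = clone(child_fn, stack + STACK_SIZE, SIGCHLD, &code);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    int status;
    pid_t w = waitpid(pid, &status, 0);
    free(stack);                    /* safe: memory is not shared (no CLONE_VM) */

    if (w == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling run_child(7) from a main() function should yield 7, the value returned by child_fn in the child.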
135 | ||
136 | The low byte of | |
137 | .I flags | |
fd8a5be4 MK |
138 | contains the number of the |
139 | .I "termination signal" | |
140 | sent to the parent when the child dies. | |
141 | If this signal is specified as anything other than | |
fea681da MK |
142 | .BR SIGCHLD , |
143 | then the parent process must specify the | |
c13182ef MK |
144 | .B __WALL |
145 | or | |
fea681da | 146 | .B __WCLONE |
c13182ef MK |
147 | options when waiting for the child with |
148 | .BR wait (2). | |
fea681da MK |
149 | If no signal is specified, then the parent process is not signaled |
150 | when the child terminates. | |
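
As a hedged sketch of the waiting rule above: the child below is created with no termination signal (low byte of flags is 0), so the parent passes __WALL to waitpid(2). The names wait_for_clone_child, child_fn and STACK_SIZE are hypothetical and introduced only for this illustration.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (1024 * 1024)    /* hypothetical size, chosen for illustration */

static int child_fn(void *arg)
{
    (void) arg;
    return 3;                       /* exit code of the child */
}

/* Returns the child's exit status, or -1 on error. */
int wait_for_clone_child(void)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* flags == 0: no sharing, and no signal is sent to the parent. */
    pid_t pid = clone(child_fn, stack + STACK_SIZE, 0, NULL);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    int status;
    /* __WALL waits for the child regardless of its termination signal. */
    pid_t w = waitpid(pid, &status, __WALL);
    free(stack);

    if (w == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```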
151 | ||
152 | .I flags | |
fd8a5be4 MK |
153 | may also be bitwise-or'ed with zero or more of the following constants, |
154 | in order to specify what is shared between the calling process | |
fea681da | 155 | and the child process: |
fea681da | 156 | .TP |
f5dbc7c8 MK |
157 | .BR CLONE_CHILD_CLEARTID " (since Linux 2.5.49)" |
158 | Erase child thread ID at location | |
d3dbc9b1 | 159 | .I ctid |
f5dbc7c8 MK |
160 | in child memory when the child exits, and do a wakeup on the futex |
161 | at that address. | |
162 | The address involved may be changed by the | |
163 | .BR set_tid_address (2) | |
164 | system call. | |
165 | This is used by threading libraries. | |
166 | .TP | |
167 | .BR CLONE_CHILD_SETTID " (since Linux 2.5.49)" | |
168 | Store child thread ID at location | |
d3dbc9b1 | 169 | .I ctid |
f5dbc7c8 MK |
170 | in child memory. |
171 | .TP | |
172 | .B CLONE_FILES | |
fea681da | 173 | If |
f5dbc7c8 MK |
174 | .B CLONE_FILES |
175 | is set, the calling process and the child process share the same file | |
176 | descriptor table. | |
177 | Any file descriptor created by the calling process or by the child | |
178 | process is also valid in the other process. | |
179 | Similarly, if one of the processes closes a file descriptor, | |
180 | or changes its associated flags (using the | |
181 | .BR fcntl (2) | |
182 | .B F_SETFD | |
183 | operation), the other process is also affected. | |
fea681da MK |
184 | |
185 | If | |
f5dbc7c8 MK |
186 | .B CLONE_FILES |
187 | is not set, the child process inherits a copy of all file descriptors | |
188 | opened in the calling process at the time of | |
189 | .BR clone (). | |
190 | (The duplicated file descriptors in the child refer to the | |
191 | same open file descriptions (see | |
192 | .BR open (2)) | |
193 | as the corresponding file descriptors in the calling process.) | |
194 | Subsequent operations that open or close file descriptors, | |
195 | or change file descriptor flags, | |
196 | performed by either the calling | |
197 | process or the child process do not affect the other process. | |
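
The difference between the two cases can be observed with a small experiment, sketched below under stated assumptions: the child closes a descriptor it was handed, and the parent then checks whether its own descriptor is still valid. The helper names (child_close_visible, close_fd, STACK_SIZE) are hypothetical, and /dev/null is assumed to exist and be readable.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (1024 * 1024)    /* hypothetical size, chosen for illustration */

/* Child: close the descriptor whose number is passed in arg. */
static int close_fd(void *arg)
{
    return close(*(int *) arg) == 0 ? 0 : 1;
}

/* Returns 1 if the child's close() also closed the parent's descriptor,
   0 if the parent's descriptor is still open, -1 on error. */
int child_close_visible(int share_files)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    int fd = open("/dev/null", O_RDONLY);
    if (fd == -1) {
        free(stack);
        return -1;
    }

    int flags = SIGCHLD | (share_files ? CLONE_FILES : 0);
    pid_t pid = clone(close_fd, stack + STACK_SIZE, flags, &fd);
    if (pid == -1 || waitpid(pid, NULL, 0) == -1) {
        close(fd);
        free(stack);
        return -1;
    }
    free(stack);

    /* F_GETFD fails with EBADF if fd is no longer open in this process. */
    int closed = (fcntl(fd, F_GETFD) == -1 && errno == EBADF);
    if (!closed)
        close(fd);
    return closed;
}
```

With share_files set, the function is expected to return 1 (the table is shared); without it, 0 (the child closed only its own copy).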
fea681da MK |
198 | .TP |
199 | .B CLONE_FS | |
200 | If | |
201 | .B CLONE_FS | |
314c8ff4 | 202 | is set, the caller and the child process share the same file system |
c13182ef MK |
203 | information. |
204 | This includes the root of the file system, the current | |
205 | working directory, and the umask. | |
206 | Any call to | |
fea681da MK |
207 | .BR chroot (2), |
208 | .BR chdir (2), | |
209 | or | |
210 | .BR umask (2) | |
edcc65ff | 211 | performed by the calling process or the child process also affects the |
fea681da MK |
212 | other process. |
213 | ||
c13182ef | 214 | If |
fea681da MK |
215 | .B CLONE_FS |
216 | is not set, the child process works on a copy of the file system | |
217 | information of the calling process at the time of the | |
edcc65ff | 218 | .BR clone () |
fea681da MK |
219 | call. |
220 | Calls to | |
221 | .BR chroot (2), | |
222 | .BR chdir (2), | |
223 | .BR umask (2) | |
224 | performed later by one of the processes do not affect the other process. | |
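
A minimal sketch of this behavior, using the umask because it is easy to read back: the child changes its umask, and the parent checks whether its own umask changed as well. The names child_umask_visible, set_umask and STACK_SIZE are hypothetical; the mask values 022 and 077 are arbitrary choices for the illustration.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (1024 * 1024)    /* hypothetical size, chosen for illustration */

/* Child: set the umask to the value passed in arg. */
static int set_umask(void *arg)
{
    umask(*(mode_t *) arg);
    return 0;
}

/* Returns 1 if the child's umask() call changed the parent's umask,
   0 if the parent's umask is unchanged, -1 on error. */
int child_umask_visible(int share_fs)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    mode_t child_mask = 077;
    mode_t saved = umask(022);      /* known starting value; remember the old one */

    int flags = SIGCHLD | (share_fs ? CLONE_FS : 0);
    pid_t pid = clone(set_umask, stack + STACK_SIZE, flags, &child_mask);
    if (pid == -1 || waitpid(pid, NULL, 0) == -1) {
        umask(saved);
        free(stack);
        return -1;
    }
    free(stack);

    mode_t now = umask(saved);      /* read the current mask, restore the original */
    return now == child_mask;
}
```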
fea681da | 225 | .TP |
a4cc375e | 226 | .BR CLONE_IO " (since Linux 2.6.25)" |
11f27a1c JA |
227 | If |
228 | .B CLONE_IO | |
229 | is set, then the new process shares an I/O context with | |
230 | the calling process. | |
231 | If this flag is not set, then (as with | |
232 | .BR fork (2)) | |
233 | the new process has its own I/O context. | |
234 | ||
235 | .\" The following based on text from Jens Axboe | |
a113945f | 236 | The I/O context is the I/O scope of the disk scheduler (i.e., |
11f27a1c JA |
237 | what the I/O scheduler uses to model scheduling of a process's I/O). |
238 | If processes share the same I/O context, | |
239 | they are treated as one by the I/O scheduler. | |
240 | As a consequence, they get to share disk time. | |
241 | For some I/O schedulers, | |
242 | .\" the anticipatory and CFQ scheduler | |
243 | if two processes share an I/O context, | |
244 | they will be allowed to interleave their disk access. | |
245 | If several threads are doing I/O on behalf of the same process | |
246 | .RB ( aio_read (3), | |
247 | for instance), they should employ | |
248 | .BR CLONE_IO | |
249 | to get better I/O performance. | |
250 | .\" with CFQ and AS. | |
251 | ||
252 | If the kernel is not configured with the | |
253 | .B CONFIG_BLOCK | |
254 | option, this flag is a no-op. | |
255 | .TP | |
8722311b | 256 | .BR CLONE_NEWIPC " (since Linux 2.6.19)" |
667417b3 MK |
257 | If |
258 | .B CLONE_NEWIPC | |
259 | is set, then create the process in a new IPC namespace. | |
260 | If this flag is not set, then (as with | |
261 | .BR fork (2)), | |
262 | the process is created in the same IPC namespace as | |
263 | the calling process. | |
0236bea9 | 264 | This flag is intended for the implementation of containers. |
667417b3 MK |
265 | |
266 | An IPC namespace consists of the set of identifiers for | |
267 | System V IPC objects. | |
268 | (These objects are created using | |
269 | .BR msgget (2), | |
270 | .BR semget (2), | |
271 | and | |
272 | .BR shmget (2).) | |
c440fe01 | 273 | Objects created in an IPC namespace are visible to all other processes |
667417b3 MK |
274 | that are members of that namespace, |
275 | but are not visible to processes in other IPC namespaces. | |
276 | ||
83c1f4b5 MK |
277 | When an IPC namespace is destroyed |
278 | (i.e., when the last process that is a member of the namespace terminates), | |
279 | all IPC objects in the namespace are automatically destroyed. | |
280 | ||
667417b3 MK |
281 | Use of this flag requires: a kernel configured with the |
282 | .B CONFIG_SYSVIPC | |
283 | and | |
284 | .B CONFIG_IPC_NS | |
c8e18bd1 | 285 | options and that the process be privileged |
667417b3 MK |
286 | .RB ( CAP_SYS_ADMIN ). |
287 | This flag can't be specified in conjunction with | |
288 | .BR CLONE_SYSVSEM . | |
289 | .TP | |
163bf178 MK |
290 | .BR CLONE_NEWNET " (since Linux 2.6.24)" |
291 | (The implementation of this flag is not yet complete, | |
292 | but probably will be mostly complete by about Linux 2.6.28.) | |
293 | ||
294 | If | |
295 | .B CLONE_NEWNET | |
296 | is set, then create the process in a new network namespace. | |
297 | If this flag is not set, then (as with | |
298 | .BR fork (2)), | |
299 | the process is created in the same network namespace as | |
300 | the calling process. | |
301 | This flag is intended for the implementation of containers. | |
302 | ||
303 | A network namespace provides an isolated view of the networking stack | |
304 | (network device interfaces, IPv4 and IPv6 protocol stacks, | |
305 | IP routing tables, firewall rules, the | |
306 | .I /proc/net | |
307 | and | |
308 | .I /sys/class/net | |
309 | directory trees, sockets, etc.). | |
310 | A physical network device can live in exactly one | |
311 | network namespace. | |
312 | A virtual network device ("veth") pair provides a pipe-like abstraction | |
313 | that can be used to create tunnels between network namespaces, | |
314 | and can be used to create a bridge to a physical network device | |
315 | in another namespace. | |
316 | ||
bf032425 SH |
317 | When a network namespace is freed |
318 | (i.e., when the last process in the namespace terminates), | |
319 | its physical network devices are moved back to the | |
320 | initial network namespace (not to the parent of the process). | |
321 | ||
163bf178 MK |
322 | Use of this flag requires: a kernel configured with the |
323 | .B CONFIG_NET_NS | |
324 | option and that the process be privileged | |
cae2ec15 | 325 | .RB ( CAP_SYS_ADMIN ). |
163bf178 | 326 | .TP |
c10859eb | 327 | .BR CLONE_NEWNS " (since Linux 2.4.19)" |
732e54dd | 328 | Start the child in a new mount namespace. |
fea681da | 329 | |
732e54dd | 330 | Every process lives in a mount namespace. |
c13182ef | 331 | The |
fea681da MK |
332 | .I namespace |
333 | of a process is the data (the set of mounts) describing the file hierarchy | |
c13182ef MK |
334 | as seen by that process. |
335 | After a | |
fea681da MK |
336 | .BR fork (2) |
337 | or | |
2777b1ca | 338 | .BR clone () |
fea681da MK |
339 | where the |
340 | .B CLONE_NEWNS | |
732e54dd | 341 | flag is not set, the child lives in the same mount |
4df2eb09 | 342 | namespace as the parent. |
fea681da MK |
343 | The system calls |
344 | .BR mount (2) | |
345 | and | |
346 | .BR umount (2) | |
732e54dd | 347 | change the mount namespace of the calling process, and hence affect |
fea681da | 348 | all processes that live in the same namespace, but do not affect |
732e54dd | 349 | processes in a different mount namespace. |
fea681da MK |
350 | |
351 | After a | |
2777b1ca | 352 | .BR clone () |
fea681da MK |
353 | where the |
354 | .B CLONE_NEWNS | |
732e54dd | 355 | flag is set, the cloned child is started in a new mount namespace, |
fea681da MK |
356 | initialized with a copy of the namespace of the parent. |
357 | ||
0b9bdf82 | 358 | Only a privileged process (one having the \fBCAP_SYS_ADMIN\fP capability) |
fea681da MK |
359 | may specify the |
360 | .B CLONE_NEWNS | |
361 | flag. | |
362 | It is not permitted to specify both | |
363 | .B CLONE_NEWNS | |
364 | and | |
365 | .B CLONE_FS | |
366 | in the same | |
e511ffb6 | 367 | .BR clone () |
fea681da | 368 | call. |
fea681da | 369 | .TP |
82ee147a MK |
370 | .BR CLONE_NEWPID " (since Linux 2.6.24)" |
371 | .\" This explanation draws a lot of details from | |
372 | .\" http://lwn.net/Articles/259217/ | |
373 | .\" Authors: Pavel Emelyanov <xemul@openvz.org> | |
374 | .\" and Kir Kolyshkin <kir@openvz.org> | |
375 | .\" | |
376 | .\" The primary kernel commit is 30e49c263e36341b60b735cbef5ca37912549264 | |
377 | .\" Author: Pavel Emelyanov <xemul@openvz.org> | |
378 | If | |
5c95e5e8 | 379 | .B CLONE_NEWPID |
82ee147a MK |
380 | is set, then create the process in a new PID namespace. |
381 | If this flag is not set, then (as with | |
382 | .BR fork (2)), | |
383 | the process is created in the same PID namespace as | |
384 | the calling process. | |
0236bea9 | 385 | This flag is intended for the implementation of containers. |
82ee147a MK |
386 | |
387 | A PID namespace provides an isolated environment for PIDs: | |
388 | PIDs in a new namespace start at 1, | |
389 | somewhat like a standalone system, and calls to | |
390 | .BR fork (2), | |
391 | .BR vfork (2), | |
392 | or | |
393 | .BR clone () | |
5584229c | 394 | will produce processes with PIDs that are unique within the namespace. |
82ee147a MK |
395 | |
396 | The first process created in a new namespace | |
397 | (i.e., the process created using the | |
398 | .BR CLONE_NEWPID | |
399 | flag) has the PID 1, and is the "init" process for the namespace. | |
400 | Children that are orphaned within the namespace will be reparented | |
401 | to this process rather than | |
402 | .BR init (8). | |
403 | Unlike the traditional | |
404 | .B init | |
405 | process, the "init" process of a PID namespace can terminate, | |
406 | and if it does, all of the processes in the namespace are terminated. | |
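
The "PID 1" property can be checked with a sketch like the following. It is only an illustration: child_is_init, report_pid and STACK_SIZE are hypothetical names, and the call is expected to fail (typically with EPERM) when the caller lacks CAP_SYS_ADMIN or the kernel lacks PID namespace support, in which case the function returns -1. The raw getpid system call is used in the child on the assumption that the C library wrapper might return a cached PID.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (1024 * 1024)    /* hypothetical size, chosen for illustration */

/* Child: exit with 0 if it sees itself as PID 1, otherwise 1. */
static int report_pid(void *arg)
{
    (void) arg;
    /* Raw system call, to sidestep any PID caching in the library wrapper. */
    return syscall(SYS_getpid) == 1 ? 0 : 1;
}

/* Returns 1 if the child saw itself as PID 1, 0 if it did not,
   -1 if clone() or waitpid() failed (for example, EPERM). */
int child_is_init(void)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    pid_t pid = clone(report_pid, stack + STACK_SIZE,
                      CLONE_NEWPID | SIGCHLD, NULL);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    int status;
    pid_t w = waitpid(pid, &status, 0);
    free(stack);

    if (w == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0;
}
```

Note that the value returned by clone() in the parent is the child's PID as seen from the parent's namespace, which is why the child has to report its own view through its exit status.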
407 | ||
408 | PID namespaces form a hierarchy. | |