Commit | Line | Data |
---|---|---|
fea681da MK |
1 | .\" Hey Emacs! This file is -*- nroff -*- source. |
2 | .\" | |
3 | .\" Copyright (c) 1992 Drew Eckhardt <drew@cs.colorado.edu>, March 28, 1992 | |
8c7b566c | 4 | .\" and Copyright (c) Michael Kerrisk, 2001, 2002, 2005, 2013 |
fea681da MK |
5 | .\" May be distributed under the GNU General Public License. |
6 | .\" Modified by Michael Haardt <michael@moria.de> | |
7 | .\" Modified 24 Jul 1993 by Rik Faith <faith@cs.unc.edu> | |
8 | .\" Modified 21 Aug 1994 by Michael Chastain <mec@shell.portal.com>: | |
9 | .\" New man page (copied from 'fork.2'). | |
10 | .\" Modified 10 June 1995 by Andries Brouwer <aeb@cwi.nl> | |
11 | .\" Modified 25 April 1998 by Xavier Leroy <Xavier.Leroy@inria.fr> | |
12 | .\" Modified 26 Jun 2001 by Michael Kerrisk | |
13 | .\" Mostly upgraded to 2.4.x | |
14 | .\" Added prototype for sys_clone() plus description | |
15 | .\" Added CLONE_THREAD with a brief description of thread groups | |
c13182ef | 16 | .\" Added CLONE_PARENT and revised entire page to remove ambiguity |
fea681da MK |
17 | .\" between "calling process" and "parent process" |
18 | .\" Added CLONE_PTRACE and CLONE_VFORK | |
19 | .\" Added EPERM and EINVAL error codes | |
fd8a5be4 | 20 | .\" Renamed "__clone" to "clone" (which is the prototype in <sched.h>) |
fea681da | 21 | .\" various other minor tidy ups and clarifications. |
c11b1abf | 22 | .\" Modified 26 Jun 2001 by Michael Kerrisk <mtk.manpages@gmail.com> |
d9bfdb9c | 23 | .\" Updated notes for 2.4.7+ behavior of CLONE_THREAD |
c11b1abf | 24 | .\" Modified 15 Oct 2002 by Michael Kerrisk <mtk.manpages@gmail.com> |
fea681da MK |
25 | .\" Added description for CLONE_NEWNS, which was added in 2.4.19 |
26 | .\" Slightly rephrased, aeb. | |
27 | .\" Modified 1 Feb 2003 - added CLONE_SIGHAND restriction, aeb. | |
28 | .\" Modified 1 Jan 2004 - various updates, aeb | |
0967c11f | 29 | .\" Modified 2004-09-10 - added CLONE_PARENT_SETTID etc. - aeb. |
d9bfdb9c | 30 | .\" 2005-04-12, mtk, noted the PID caching behavior of NPTL's getpid() |
31830ef0 | 31 | .\" wrapper under BUGS. |
fd8a5be4 MK |
32 | .\" 2005-05-10, mtk, added CLONE_SYSVSEM, CLONE_UNTRACED, CLONE_STOPPED. |
33 | .\" 2005-05-17, mtk, Substantially enhanced discussion of CLONE_THREAD. | |
4e836144 | 34 | .\" 2008-11-18, mtk, order CLONE_* flags alphabetically |
82ee147a | 35 | .\" 2008-11-18, mtk, document CLONE_NEWPID |
43ce9dda | 36 | .\" 2008-11-19, mtk, document CLONE_NEWUTS |
667417b3 | 37 | .\" 2008-11-19, mtk, document CLONE_NEWIPC |
cfdc761b | 38 | .\" 2008-11-19, Jens Axboe, mtk, document CLONE_IO |
fea681da | 39 | .\" |
185341d4 MK |
40 | .\" FIXME Document CLONE_NEWUSER, which is new in 2.6.23 |
41 | .\" (also supported for unshare()?) | |
360ed6b3 | 42 | .\" |
8c7b566c | 43 | .TH CLONE 2 2013-01-01 "Linux" "Linux Programmer's Manual" |
fea681da | 44 | .SH NAME |
9b0e0996 | 45 | clone, __clone2 \- create a child process |
fea681da | 46 | .SH SYNOPSIS |
c10859eb | 47 | .nf |
86b91fdf | 48 | .BR "#define _GNU_SOURCE" " /* See feature_test_macros(7) */" |
cc4615cc | 49 | .\" Actually _BSD_SOURCE || _SVID_SOURCE |
a4405ff9 | 50 | .\" FIXME See http://sources.redhat.com/bugzilla/show_bug.cgi?id=4749 |
fea681da | 51 | .B #include <sched.h> |
c10859eb | 52 | |
ff929e3b MK |
53 | .BI "int clone(int (*" "fn" ")(void *), void *" child_stack , |
54 | .BI " int " flags ", void *" "arg" ", ... " | |
d3dbc9b1 | 55 | .BI " /* pid_t *" ptid ", struct user_desc *" tls \ |
ff929e3b | 56 | ", pid_t *" ctid " */ );" |
c10859eb | 57 | .fi |
fea681da | 58 | .SH DESCRIPTION |
edcc65ff MK |
59 | .BR clone () |
60 | creates a new process, in a manner similar to | |
fea681da | 61 | .BR fork (2). |
735f354f | 62 | It is actually a library function layered on top of the underlying |
e511ffb6 | 63 | .BR clone () |
fea681da MK |
64 | system call, hereinafter referred to as |
65 | .BR sys_clone . | |
66 | A description of | |
0daa9e92 | 67 | .B sys_clone |
5fab2e7c | 68 | is given toward the end of this page. |
fea681da MK |
69 | |
70 | Unlike | |
71 | .BR fork (2), | |
c13182ef | 72 | these calls |
fea681da MK |
73 | allow the child process to share parts of its execution context with |
74 | the calling process, such as the memory space, the table of file | |
c13182ef MK |
75 | descriptors, and the table of signal handlers. |
76 | (Note that on this manual | |
77 | page, "calling process" normally corresponds to "parent process". | |
78 | But see the description of | |
79 | .B CLONE_PARENT | |
fea681da MK |
80 | below.) |
81 | ||
82 | The main use of | |
edcc65ff | 83 | .BR clone () |
fea681da MK |
84 | is to implement threads: multiple threads of control in a program that |
85 | run concurrently in a shared memory space. | |
86 | ||
87 | When the child process is created with | |
c13182ef | 88 | .BR clone (), |
fea681da | 89 | it executes the function |
c13182ef | 90 | .IR fn ( arg ). |
fea681da | 91 | (This differs from |
c13182ef | 92 | .BR fork (2), |
fea681da | 93 | where execution continues in the child from the point |
c13182ef MK |
94 | of the |
95 | .BR fork (2) | |
fea681da MK |
96 | call.) |
97 | The | |
98 | .I fn | |
99 | argument is a pointer to a function that is called by the child | |
100 | process at the beginning of its execution. | |
101 | The | |
102 | .I arg | |
103 | argument is passed to the | |
104 | .I fn | |
105 | function. | |
106 | ||
c13182ef | 107 | When the |
fea681da | 108 | .IR fn ( arg ) |
c13182ef MK |
109 | function returns, the child process terminates. |
110 | The integer returned by | |
fea681da | 111 | .I fn |
c13182ef MK |
112 | is the exit code for the child process. |
113 | The child process may also terminate explicitly by calling | |
fea681da MK |
114 | .BR exit (2) |
115 | or after receiving a fatal signal. | |
116 | ||
117 | The | |
118 | .I child_stack | |
c13182ef MK |
119 | argument specifies the location of the stack used by the child process. |
120 | Since the child and calling process may share memory, | |
fea681da | 121 | it is not possible for the child process to execute in the |
c13182ef MK |
122 | same stack as the calling process. |
123 | The calling process must therefore | |
fea681da MK |
124 | set up memory space for the child stack and pass a pointer to this |
125 | space to | |
edcc65ff | 126 | .BR clone (). |
5fab2e7c | 127 | Stacks grow downward on all processors that run Linux |
fea681da MK |
128 | (except the HP PA processors), so |
129 | .I child_stack | |
130 | usually points to the topmost address of the memory space set up for | |
131 | the child stack. | |
132 | ||
133 | The low byte of | |
134 | .I flags | |
fd8a5be4 MK |
135 | contains the number of the |
136 | .I "termination signal" | |
137 | sent to the parent when the child dies. | |
138 | If this signal is specified as anything other than | |
fea681da MK |
139 | .BR SIGCHLD , |
140 | then the parent process must specify the | |
c13182ef MK |
141 | .B __WALL |
142 | or | |
fea681da | 143 | .B __WCLONE |
c13182ef MK |
144 | options when waiting for the child with |
145 | .BR wait (2). | |
fea681da MK |
146 | If no signal is specified, then the parent process is not signaled |
147 | when the child terminates. | |
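The points above (the child runs fn(arg), the return value of fn becomes the exit code, child_stack points at the top of the stack area, and the low byte of flags names the termination signal) can be illustrated with a minimal sketch. The helper names child_fn and run_child and the 1 MiB STACK_SIZE are assumptions made for this illustration only; they are not part of the clone() interface.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (1024 * 1024)    /* arbitrary size chosen for this sketch */

/* Runs in the child; its return value becomes the child's exit code. */
static int child_fn(void *arg)
{
    return *(int *) arg;
}

/* Hypothetical helper: create a child that returns `code`, wait for it,
   and report the exit status seen by the parent (-1 on any failure). */
static int run_child(int code)
{
    assert(code >= 0 && code <= 255);   /* exit statuses are 8 bits wide */

    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* Stacks grow downward, so pass the topmost address of the area.
       SIGCHLD in the low byte of flags lets a plain waitpid() work. */
    pid_t pid = clone(child_fn, stack + STACK_SIZE, SIGCHLD, &code);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    int status;
    pid_t w = waitpid(pid, &status, 0);
    free(stack);
    if (w != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Because no sharing flags are given here, the child works on a copy of the caller's memory, much as after fork(2); calling run_child(7) from the parent is expected to yield 7.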
148 | ||
149 | .I flags | |
fd8a5be4 MK |
150 | may also be bitwise-or'ed with zero or more of the following constants, |
151 | in order to specify what is shared between the calling process | |
fea681da | 152 | and the child process: |
fea681da | 153 | .TP |
f5dbc7c8 MK |
154 | .BR CLONE_CHILD_CLEARTID " (since Linux 2.5.49)" |
155 | Erase child thread ID at location | |
d3dbc9b1 | 156 | .I ctid |
f5dbc7c8 MK |
157 | in child memory when the child exits, and do a wakeup on the futex |
158 | at that address. | |
159 | The address involved may be changed by the | |
160 | .BR set_tid_address (2) | |
161 | system call. | |
162 | This is used by threading libraries. | |
163 | .TP | |
164 | .BR CLONE_CHILD_SETTID " (since Linux 2.5.49)" | |
165 | Store child thread ID at location | |
d3dbc9b1 | 166 | .I ctid |
f5dbc7c8 MK |
167 | in child memory. |
168 | .TP | |
1603d6a1 | 169 | .BR CLONE_FILES " (since Linux 2.0)" |
fea681da | 170 | If |
f5dbc7c8 MK |
171 | .B CLONE_FILES |
172 | is set, the calling process and the child process share the same file | |
173 | descriptor table. | |
174 | Any file descriptor created by the calling process or by the child | |
175 | process is also valid in the other process. | |
176 | Similarly, if one of the processes closes a file descriptor, | |
177 | or changes its associated flags (using the | |
178 | .BR fcntl (2) | |
179 | .B F_SETFD | |
180 | operation), the other process is also affected. | |
fea681da MK |
181 | |
182 | If | |
f5dbc7c8 MK |
183 | .B CLONE_FILES |
184 | is not set, the child process inherits a copy of all file descriptors | |
185 | opened in the calling process at the time of | |
186 | .BR clone (). | |
187 | (The duplicated file descriptors in the child refer to the | |
188 | same open file descriptions (see | |
189 | .BR open (2)) | |
190 | as the corresponding file descriptors in the calling process.) | |
191 | Subsequent operations that open or close file descriptors, | |
192 | or change file descriptor flags, | |
193 | performed by either the calling | |
194 | process or the child process do not affect the other process. | |
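The difference between a shared and a copied file descriptor table can be observed with a sketch along the following lines. The helper names close_fd and fd_survives_child_close, the use of /dev/null, and the 1 MiB stack size are assumptions made for this illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (1024 * 1024)    /* arbitrary size chosen for this sketch */

/* Runs in the child: close the descriptor whose number is passed in. */
static int close_fd(void *arg)
{
    return close(*(int *) arg) == 0 ? 0 : 1;
}

/* Hypothetical helper: open a descriptor, let a child close it, and
   report whether it is still open in the caller afterwards.
   Returns 1 (still open), 0 (closed), or -1 on error. */
static int fd_survives_child_close(int extra_flags)
{
    /* The low byte of flags is reserved for the termination signal. */
    assert((extra_flags & 0xff) == 0);

    int fd = open("/dev/null", O_RDONLY);
    if (fd == -1)
        return -1;

    char *stack = malloc(STACK_SIZE);
    if (stack == NULL) {
        close(fd);
        return -1;
    }

    pid_t pid = clone(close_fd, stack + STACK_SIZE,
                      extra_flags | SIGCHLD, &fd);
    int status;
    if (pid == -1 || waitpid(pid, &status, 0) != pid) {
        free(stack);
        close(fd);
        return -1;
    }
    free(stack);

    /* F_GETFD fails with EBADF if the descriptor is no longer open. */
    int still_open = (fcntl(fd, F_GETFD) != -1);
    if (still_open)
        close(fd);
    return still_open;
}
```

With extra_flags set to 0 the child closes only its own copy, so the helper should report 1; with CLONE_FILES the close takes effect in the shared table and the helper should report 0.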
fea681da | 195 | .TP |
1603d6a1 | 196 | .BR CLONE_FS " (since Linux 2.0)" |
fea681da MK |
197 | If |
198 | .B CLONE_FS | |
314c8ff4 | 199 | is set, the caller and the child process share the same file system |
c13182ef MK |
200 | information. |
201 | This includes the root of the file system, the current | |
202 | working directory, and the umask. | |
203 | Any call to | |
fea681da MK |
204 | .BR chroot (2), |
205 | .BR chdir (2), | |
206 | or | |
207 | .BR umask (2) | |
edcc65ff | 208 | performed by the calling process or the child process also affects the |
fea681da MK |
209 | other process. |
210 | ||
c13182ef | 211 | If |
fea681da MK |
212 | .B CLONE_FS |
213 | is not set, the child process works on a copy of the file system | |
214 | information of the calling process at the time of the | |
edcc65ff | 215 | .BR clone () |
fea681da MK |
216 | call. |
217 | Calls to | |
218 | .BR chroot (2), | |
219 | .BR chdir (2), | |
220 | .RB "or " umask (2) | |
221 | performed later by one of the processes do not affect the other process. | |
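A similar sketch shows the effect of CLONE_FS on the umask, which is one of the attributes listed above. The helper names set_umask and parent_umask_after_child, the mask values, and the 1 MiB stack size are assumptions made for this illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (1024 * 1024)    /* arbitrary size chosen for this sketch */

/* Runs in the child: install the umask passed in by the caller. */
static int set_umask(void *arg)
{
    umask(*(mode_t *) arg);
    return 0;
}

/* Hypothetical helper: set the caller's umask to 022, let a child set
   `child_mask`, and return the umask the caller observes afterwards.
   The caller's original umask is restored before returning.
   Returns (mode_t) -1 on error. */
static mode_t parent_umask_after_child(int extra_flags, mode_t child_mask)
{
    /* The low byte of flags is reserved for the termination signal. */
    assert((extra_flags & 0xff) == 0);

    mode_t saved = umask(022);

    char *stack = malloc(STACK_SIZE);
    if (stack == NULL) {
        umask(saved);
        return (mode_t) -1;
    }

    pid_t pid = clone(set_umask, stack + STACK_SIZE,
                      extra_flags | SIGCHLD, &child_mask);
    int status;
    if (pid == -1 || waitpid(pid, &status, 0) != pid) {
        free(stack);
        umask(saved);
        return (mode_t) -1;
    }
    free(stack);

    /* umask() returns the previous mask, so this call both reads the
       current value and restores the original one. */
    return umask(saved);
}
```

Without CLONE_FS the caller should still see 022 after the child has run; with CLONE_FS the caller should see the mask that the child installed.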
fea681da | 222 | .TP |
a4cc375e | 223 | .BR CLONE_IO " (since Linux 2.6.25)" |
11f27a1c JA |
224 | If |
225 | .B CLONE_IO | |
226 | is set, then the new process shares an I/O context with | |
227 | the calling process. | |
228 | If this flag is not set, then (as with | |
229 | .BR fork (2)) | |
230 | the new process has its own I/O context. | |
231 | ||
232 | .\" The following based on text from Jens Axboe | |
a113945f | 233 | The I/O context is the I/O scope of the disk scheduler (i.e., |
11f27a1c JA |
234 | what the I/O scheduler uses to model scheduling of a process's I/O). |
235 | If processes share the same I/O context, | |
236 | they are treated as one by the I/O scheduler. | |
237 | As a consequence, they get to share disk time. | |
238 | For some I/O schedulers, | |
239 | .\" the anticipatory and CFQ scheduler | |
240 | if two processes share an I/O context, | |
241 | they will be allowed to interleave their disk access. | |
242 | If several threads are doing I/O on behalf of the same process | |
243 | .RB ( aio_read (3), | |
244 | for instance), they should employ | |
245 | .BR CLONE_IO | |
246 | to get better I/O performance. | |
247 | .\" with CFQ and AS. | |
248 | ||
249 | If the kernel is not configured with the | |
250 | .B CONFIG_BLOCK | |
251 | option, this flag is a no-op. | |
252 | .TP | |
8722311b | 253 | .BR CLONE_NEWIPC " (since Linux 2.6.19)" |
667417b3 MK |
254 | If |
255 | .B CLONE_NEWIPC | |
256 | is set, then create the process in a new IPC namespace. | |
257 | If this flag is not set, then (as with | |
258 | .BR fork (2)), | |
259 | the process is created in the same IPC namespace as | |
260 | the calling process. | |
0236bea9 | 261 | This flag is intended for the implementation of containers. |
667417b3 | 262 | |
009a049e MK |
263 | An IPC namespace provides an isolated view of System V IPC objects (see |
264 | .BR svipc (7)) | |
265 | and (since Linux 2.6.30) | |
266 | .\" commit 7eafd7c74c3f2e67c27621b987b28397110d643f | |
267 | .\" https://lwn.net/Articles/312232/ | |
268 | POSIX message queues | |
269 | (see | |
270 | .BR mq_overview (7)). | |
19911fa5 MK |
271 | The common characteristic of these IPC mechanisms is that IPC |
272 | objects are identified by mechanisms other than filesystem | |
273 | pathnames. | |
009a049e | 274 | |
c440fe01 | 275 | Objects created in an IPC namespace are visible to all other processes |
667417b3 MK |
276 | that are members of that namespace, |
277 | but are not visible to processes in other IPC namespaces. | |
278 | ||
83c1f4b5 | 279 | When an IPC namespace is destroyed |
009a049e | 280 | (i.e., when the last process that is a member of the namespace terminates), |
83c1f4b5 MK |
281 | all IPC objects in the namespace are automatically destroyed. |
282 | ||
667417b3 MK |
283 | Use of this flag requires: a kernel configured with the |
284 | .B CONFIG_SYSVIPC | |
285 | and | |
286 | .B CONFIG_IPC_NS | |
c8e18bd1 | 287 | options and that the process be privileged |
667417b3 MK |
288 | .RB ( CAP_SYS_ADMIN ). |
289 | This flag can't be specified in conjunction with | |
290 | .BR CLONE_SYSVSEM . | |
291 | .TP | |
163bf178 | 292 | .BR CLONE_NEWNET " (since Linux 2.6.24)" |
b9145b2c | 293 | .\" FIXME Check when the implementation was completed |
9108d867 MK |
294 | (The implementation of this flag was only completed |
295 | by about kernel version 2.6.29.) | |
163bf178 MK |
296 | |
297 | If | |
298 | .B CLONE_NEWNET | |
299 | is set, then create the process in a new network namespace. | |
300 | If this flag is not set, then (as with | |
301 | .BR fork (2)), | |
302 | the process is created in the same network namespace as | |
303 | the calling process. | |
304 | This flag is intended for the implementation of containers. | |
305 | ||
306 | A network namespace provides an isolated view of the networking stack | |
307 | (network device interfaces, IPv4 and IPv6 protocol stacks, | |
308 | IP routing tables, firewall rules, the | |
309 | .I /proc/net | |
310 | and | |
311 | .I /sys/class/net | |
312 | directory trees, sockets, etc.). | |
313 | A physical network device can live in exactly one | |
314 | network namespace. | |
315 | A virtual network device ("veth") pair provides a pipe-like abstraction | |
1a95a1be | 316 | .\" FIXME Add pointer to veth(4) page when it is eventually completed |
163bf178 MK |
317 | that can be used to create tunnels between network namespaces, |
318 | and can be used to create a bridge to a physical network device | |
319 | in another namespace. | |
320 | ||
bf032425 SH |
321 | When a network namespace is freed |
322 | (i.e., when the last process in the namespace terminates), | |
323 | its physical network devices are moved back to the | |
324 | initial network namespace (not to the parent of the process). | |
325 | ||
163bf178 MK |
326 | Use of this flag requires: a kernel configured with the |
327 | .B CONFIG_NET_NS | |
328 | option and that the process be privileged | |
cae2ec15 | 329 | .RB ( CAP_SYS_ADMIN ). |
163bf178 | 330 | .TP |
c10859eb | 331 | .BR CLONE_NEWNS " (since Linux 2.4.19)" |
732e54dd | 332 | Start the child in a new mount namespace. |
fea681da | 333 | |
732e54dd | 334 | Every process lives in a mount namespace. |
c13182ef | 335 | The |
fea681da MK |
336 | .I namespace |
337 | of a process is the data (the set of mounts) describing the file hierarchy | |
c13182ef MK |
338 | as seen by that process. |
339 | After a | |
fea681da MK |
340 | .BR fork (2) |
341 | or | |
2777b1ca | 342 | .BR clone () |
fea681da MK |
343 | where the |
344 | .B CLONE_NEWNS | |
732e54dd | 345 | flag is not set, the child lives in the same mount |
4df2eb09 | 346 | namespace as the parent. |
fea681da MK |
347 | The system calls |
348 | .BR mount (2) | |
349 | and | |
350 | .BR umount (2) | |
732e54dd | 351 | change the mount namespace of the calling process, and hence affect |
fea681da | 352 | all processes that live in the same namespace, but do not affect |
732e54dd | 353 | processes in a different mount namespace. |
fea681da MK |
354 | |
355 | After a | |
2777b1ca | 356 | .BR clone () |
fea681da MK |
357 | where the |
358 | .B CLONE_NEWNS | |
732e54dd | 359 | flag is set, the cloned child is started in a new mount namespace, |
fea681da MK |
360 | initialized with a copy of the namespace of the parent. |
361 | ||
0b9bdf82 | 362 | Only a privileged process (one having the \fBCAP_SYS_ADMIN\fP capability) |
fea681da MK |
363 | may specify the |
364 | .B CLONE_NEWNS | |
365 | flag. | |
366 | It is not permitted to specify both | |
367 | .B CLONE_NEWNS | |
368 | and | |
369 | .B CLONE_FS | |
370 | in the same | |
e511ffb6 | 371 | .BR clone () |
fea681da | 372 | call. |
fea681da | 373 | .TP |
82ee147a MK |
374 | .BR CLONE_NEWPID " (since Linux 2.6.24)" |
375 | .\" This explanation draws a lot of details from | |
376 | .\" http://lwn.net/Articles/259217/ | |
377 | .\" Authors: Pavel Emelyanov <xemul@openvz.org> | |
378 | .\" and Kir Kolyshkin <kir@openvz.org> | |
379 | .\" | |
380 | .\" The primary kernel commit is 30e49c263e36341b60b735cbef5ca37912549264 | |
381 | .\" Author: Pavel Emelyanov <xemul@openvz.org> | |
382 | If | |
5c95e5e8 | 383 | .B CLONE_NEWPID |
82ee147a MK |
384 | is set, then create the process in a new PID namespace. |
385 | If this flag is not set, then (as with | |
386 | .BR fork (2)), | |
387 | the process is created in the same PID namespace as | |
388 | the calling process. | |
0236bea9 | 389 | This flag is intended for the implementation of containers. |
82ee147a MK |
390 | |
391 | A PID namespace provides an isolated environment for PIDs: | |
392 | PIDs in a new namespace start at 1, | |
393 | somewhat like a standalone system, and calls to | |
394 | .BR fork (2), | |
395 | .BR vfork (2), | |
396 | or | |
27d47e71 | 397 | .BR clone () |
5584229c | 398 | will produce processes with PIDs that are unique within the namespace. |
82ee147a MK |
399 | |
400 | The first process created in a new namespace | |
401 | (i.e., the process created using the | |
402 | .BR CLONE_NEWPID | |
403 | flag) has the PID 1, and is the "init" process for the namespace. | |
404 | Children that are orphaned within the namespace will be reparented | |
405 | to this process rather than | |
406 | .BR init (8). | |
407 | Unlike the traditional | |
408 | .B init | |
409 | process, the "init" process of a PID namespace can terminate, | |
410 | and if it does, all of the processes in the namespace are terminated. | |
411 | ||
412 | PID namespaces form a hierarchy. | |