Commit | Line | Data |
---|---|---|
fdcde4c4 AC |
1 | .\" Copyright (C) 1994, 1995, Daniel Quinlan <quinlan@yggdrasil.com> |
2 | .\" Copyright (C) 2002-2008, 2017, Michael Kerrisk <mtk.manpages@gmail.com> | |
3 | .\" Copyright (C) , Andries Brouwer <aeb@cwi.nl> | |
4 | .\" Copyright (C) 2023, Alejandro Colomar <alx@kernel.org> | |
5 | .\" | |
6 | .\" SPDX-License-Identifier: GPL-3.0-or-later | |
7 | .\" | |
8 | .TH proc_sys_fs 5 (date) "Linux man-pages (unreleased)" | |
9 | .SH NAME | |
10 | /proc/sys/fs/ \- kernel variables related to filesystems | |
11 | .SH DESCRIPTION | |
12 | .TP | |
13 | .I /proc/sys/fs/ | |
14 | This directory contains the files and subdirectories for kernel variables | |
15 | related to filesystems. | |
16 | .TP | |
17 | .IR /proc/sys/fs/aio\-max\-nr " and " /proc/sys/fs/aio\-nr " (since Linux 2.6.4)" | |
18 | .I aio\-nr | |
19 | is the running total of the number of events specified by | |
20 | .BR io_setup (2) | |
21 | calls for all currently active AIO contexts. | |
22 | If | |
23 | .I aio\-nr | |
24 | reaches | |
25 | .IR aio\-max\-nr , | |
26 | then | |
27 | .BR io_setup (2) | |
28 | will fail with the error | |
29 | .BR EAGAIN . | |
30 | Raising | |
31 | .I aio\-max\-nr | |
32 | does not result in the preallocation or resizing | |
33 | of any kernel data structures. | |
34 | .TP | |
35 | .I /proc/sys/fs/binfmt_misc | |
36 | Documentation for files in this directory can be found | |
37 | in the Linux kernel source in the file | |
38 | .I Documentation/admin\-guide/binfmt\-misc.rst | |
39 | (or in | |
40 | .I Documentation/binfmt_misc.txt | |
41 | on older kernels). | |
42 | .TP | |
43 | .IR /proc/sys/fs/dentry\-state " (since Linux 2.2)" | |
44 | This file contains information about the status of the | |
45 | directory cache (dcache). | |
46 | The file contains six numbers, | |
47 | .IR nr_dentry , | |
48 | .IR nr_unused , | |
49 | .I age_limit | |
50 | (age in seconds), | |
51 | .I want_pages | |
52 | (pages requested by system) and two dummy values. | |
53 | .RS | |
54 | .IP \[bu] 3 | |
55 | .I nr_dentry | |
56 | is the number of allocated dentries (dcache entries). | |
57 | This field is unused in Linux 2.2. | |
58 | .IP \[bu] | |
59 | .I nr_unused | |
60 | is the number of unused dentries. | |
61 | .IP \[bu] | |
62 | .I age_limit | |
63 | .\" looks like this is unused in Linux 2.2 to Linux 2.6 | |
64 | is the age in seconds after which dcache entries | |
65 | can be reclaimed when memory is short. | |
66 | .IP \[bu] | |
67 | .I want_pages | |
68 | .\" looks like this is unused in Linux 2.2 to Linux 2.6 | |
69 | is nonzero when the kernel has called shrink_dcache_pages() and the | |
70 | dcache isn't pruned yet. | |
71 | .RE | |
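
The six fields described above can be read into named values. The following is a minimal sketch, not part of the kernel interface: the names `DentryState` and `parse_dentry_state` and the sample line are made up for illustration, and the parser assumes the file holds exactly six whitespace-separated integers.

```python
from typing import NamedTuple, Tuple


class DentryState(NamedTuple):
    """Fields of /proc/sys/fs/dentry-state, in file order (hypothetical type)."""
    nr_dentry: int          # number of allocated dentries
    nr_unused: int          # number of unused dentries
    age_limit: int          # age in seconds
    want_pages: int         # pages requested by system
    dummy: Tuple[int, ...]  # the two trailing dummy values


def parse_dentry_state(text: str) -> DentryState:
    """Parse the contents of dentry-state (six whitespace-separated integers)."""
    values = [int(v) for v in text.split()]
    if len(values) != 6:
        raise ValueError(f"expected 6 fields, got {len(values)}")
    return DentryState(*values[:4], tuple(values[4:]))


# Made-up sample contents; on a live system, read /proc/sys/fs/dentry-state.
sample = "81876\t60502\t45\t0\t0\t0\n"
state = parse_dentry_state(sample)
print(state.nr_dentry, state.nr_unused, state.age_limit, state.want_pages)
```
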
72 | .TP | |
73 | .I /proc/sys/fs/dir\-notify\-enable | |
74 | This file can be used to disable or enable the | |
75 | .I dnotify | |
76 | interface described in | |
77 | .BR fcntl (2) | |
78 | on a system-wide basis. | |
79 | A value of 0 in this file disables the interface, | |
80 | and a value of 1 enables it. | |
81 | .TP | |
82 | .I /proc/sys/fs/dquot\-max | |
83 | This file shows the maximum number of cached disk quota entries. | |
84 | On some (2.4) systems, it is not present. | |
85 | If the number of free cached disk quota entries is very low and | |
86 | you have a very large number of simultaneous system users, | |
87 | you might want to raise the limit. | |
88 | .TP | |
89 | .I /proc/sys/fs/dquot\-nr | |
90 | This file shows the number of allocated disk quota | |
91 | entries and the number of free disk quota entries. | |
92 | .TP | |
93 | .IR /proc/sys/fs/epoll/ " (since Linux 2.6.28)" | |
94 | This directory contains the file | |
95 | .IR max_user_watches , | |
96 | which can be used to limit the amount of kernel memory consumed by the | |
97 | .I epoll | |
98 | interface. | |
99 | For further details, see | |
100 | .BR epoll (7). | |
101 | .TP | |
102 | .I /proc/sys/fs/file\-max | |
103 | This file defines | |
104 | a system-wide limit on the number of open files for all processes. | |
105 | System calls that fail when encountering this limit fail with the error | |
106 | .BR ENFILE . | |
107 | (See also | |
108 | .BR setrlimit (2), | |
109 | which can be used by a process to set the per-process limit, | |
110 | .BR RLIMIT_NOFILE , | |
111 | on the number of files it may open.) | |
112 | If you get lots | |
113 | of error messages in the kernel log about running out of file handles | |
114 | (open file descriptions) | |
115 | (look for "VFS: file\-max limit <number> reached"), | |
116 | try increasing this value: | |
117 | .IP | |
118 | .in +4n | |
119 | .EX | |
120 | echo 100000 > /proc/sys/fs/file\-max | |
121 | .EE | |
122 | .in | |
123 | .IP | |
124 | Privileged processes | |
125 | .RB ( CAP_SYS_ADMIN ) | |
126 | can override the | |
127 | .I file\-max | |
128 | limit. | |
129 | .TP | |
130 | .I /proc/sys/fs/file\-nr | |
131 | This (read-only) file contains three numbers: | |
132 | the number of allocated file handles | |
133 | (i.e., the number of open file descriptions; see | |
134 | .BR open (2)); | |
135 | the number of free file handles; | |
136 | and the maximum number of file handles (i.e., the same value as | |
137 | .IR /proc/sys/fs/file\-max ). | |
138 | If the number of allocated file handles is close to the | |
139 | maximum, you should consider increasing the maximum. | |
140 | Before Linux 2.6, | |
141 | the kernel allocated file handles dynamically, | |
142 | but it didn't free them again. | |
143 | Instead the free file handles were kept in a list for reallocation; | |
144 | the "free file handles" value indicates the size of that list. | |
145 | A large number of free file handles indicates that there was | |
146 | a past peak in the usage of open file handles. | |
147 | Since Linux 2.6, the kernel does deallocate freed file handles, | |
148 | and the "free file handles" value is always zero. | |
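
To judge whether the number of allocated file handles is close to the maximum, the three values can be parsed and compared. This is a rough sketch under stated assumptions: `FileNr`, `parse_file_nr`, and `usage_fraction` are hypothetical helper names, and the sample line is made up.

```python
from typing import NamedTuple


class FileNr(NamedTuple):
    """The three numbers in /proc/sys/fs/file-nr (hypothetical type)."""
    allocated: int  # allocated file handles (open file descriptions)
    free: int       # free file handles (always zero since Linux 2.6)
    maximum: int    # same value as /proc/sys/fs/file-max


def parse_file_nr(text: str) -> FileNr:
    """Parse the contents of file-nr (three whitespace-separated integers)."""
    fields = text.split()
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}")
    return FileNr(*(int(f) for f in fields))


def usage_fraction(nr: FileNr) -> float:
    """Fraction of the system-wide limit that is currently allocated."""
    return nr.allocated / nr.maximum


# Made-up sample contents; on a live system, read /proc/sys/fs/file-nr.
sample = "1024\t0\t100000\n"
nr = parse_file_nr(sample)
print(f"{usage_fraction(nr):.2%} of file-max in use")
```

What counts as "close to the maximum" is left to the administrator; the sketch only computes the ratio.
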
149 | .TP | |
150 | .IR /proc/sys/fs/inode\-max " (only present until Linux 2.2)" | |
151 | This file contains the maximum number of in-memory inodes. | |
152 | This value should be 3\[en]4 times larger | |
153 | than the value in | |
154 | .IR file\-max , | |
155 | since \fIstdin\fP, \fIstdout\fP | |
156 | and network sockets also need an inode to handle them. | |
157 | When you regularly run out of inodes, you need to increase this value. | |
158 | .IP | |
159 | Starting with Linux 2.4, | |
160 | there is no longer a static limit on the number of inodes, | |
161 | and this file is removed. | |
162 | .TP | |
163 | .I /proc/sys/fs/inode\-nr | |
164 | This file contains the first two values from | |
165 | .IR inode\-state . | |
166 | .TP | |
167 | .I /proc/sys/fs/inode\-state | |
168 | This file | |
169 | contains seven numbers: | |
170 | .IR nr_inodes , | |
171 | .IR nr_free_inodes , | |
172 | .IR preshrink , | |
173 | and four dummy values (always zero). | |
174 | .IP | |
175 | .I nr_inodes | |
176 | is the number of inodes the system has allocated. | |
177 | .\" This can be slightly more than | |
178 | .\" .I inode\-max | |
179 | .\" because Linux allocates them one page full at a time. | |
180 | .I nr_free_inodes | |
181 | represents the number of free inodes. | |
182 | .IP | |
183 | .I preshrink | |
184 | is nonzero when | |
185 | .I nr_inodes | |
186 | > | |
187 | .I inode\-max | |
188 | and the system needs to prune the inode list instead of allocating more; | |
189 | since Linux 2.4, this field is a dummy value (always zero). | |
190 | .TP | |
191 | .IR /proc/sys/fs/inotify/ " (since Linux 2.6.13)" | |
192 | This directory contains files | |
193 | .IR max_queued_events ", " max_user_instances ", and " max_user_watches , | |
194 | that can be used to limit the amount of kernel memory consumed by the | |
195 | .I inotify | |
196 | interface. | |
197 | For further details, see | |
198 | .BR inotify (7). | |
199 | .TP | |
200 | .I /proc/sys/fs/lease\-break\-time | |
201 | This file specifies the grace period that the kernel grants to a process | |
202 | holding a file lease | |
203 | .RB ( fcntl (2)) | |
204 | after it has sent a signal to that process notifying it | |
205 | that another process is waiting to open the file. | |
206 | If the lease holder does not remove or downgrade the lease within | |
207 | this grace period, the kernel forcibly breaks the lease. | |
208 | .TP | |
209 | .I /proc/sys/fs/leases\-enable | |
210 | This file can be used to enable or disable file leases | |
211 | .RB ( fcntl (2)) | |
212 | on a system-wide basis. | |
213 | If this file contains the value 0, leases are disabled. | |
214 | A nonzero value enables leases. | |
215 | .TP | |
216 | .IR /proc/sys/fs/mount\-max " (since Linux 4.9)" | |
217 | .\" commit d29216842a85c7970c536108e093963f02714498 | |
218 | The value in this file specifies the maximum number of mounts that may exist | |
219 | in a mount namespace. | |
220 | The default value in this file is 100,000. | |
221 | .TP | |
222 | .IR /proc/sys/fs/mqueue/ " (since Linux 2.6.6)" | |
223 | This directory contains files | |
224 | .IR msg_max ", " msgsize_max ", and " queues_max , | |
225 | controlling the resources used by POSIX message queues. | |
226 | See | |
227 | .BR mq_overview (7) | |
228 | for details. | |
229 | .TP | |
230 | .IR /proc/sys/fs/nr_open " (since Linux 2.6.25)" | |
231 | .\" commit 9cfe015aa424b3c003baba3841a60dd9b5ad319b | |
232 | This file imposes a ceiling on the value to which the | |
233 | .B RLIMIT_NOFILE | |
234 | resource limit can be raised (see | |
235 | .BR getrlimit (2)). | |
236 | This ceiling is enforced for both unprivileged and privileged processes. | |
237 | The default value in this file is 1048576. | |
238 | (Before Linux 2.6.25, the ceiling for | |
239 | .B RLIMIT_NOFILE | |
240 | was hard-coded to the same value.) | |
241 | .TP | |
242 | .IR /proc/sys/fs/overflowgid " and " /proc/sys/fs/overflowuid | |
243 | These files | |
244 | allow you to change the value of the fixed overflow UID and GID. | |
245 | The default is 65534. | |
246 | Some filesystems support only 16-bit UIDs and GIDs, although in Linux | |
247 | UIDs and GIDs are 32 bits. | |
248 | When one of these filesystems is mounted | |
249 | with writes enabled, any UID or GID that would exceed 65535 is translated | |
250 | to the overflow value before being written to disk. | |
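
The translation rule above can be illustrated with a small function. This is only a model of the described behavior, not kernel code; the name `id_for_16bit_fs` is hypothetical, and the default of 65534 is the default value given in the text.

```python
def id_for_16bit_fs(id32: int, overflow_id: int = 65534) -> int:
    """Return the ID written to a filesystem that supports only 16-bit IDs.

    IDs that fit in 16 bits are stored unchanged; any ID that would
    exceed 65535 is replaced by the overflow value (overflowuid or
    overflowgid).
    """
    return id32 if id32 <= 0xFFFF else overflow_id


print(id_for_16bit_fs(1000))    # fits in 16 bits, stored unchanged
print(id_for_16bit_fs(100000))  # exceeds 65535, replaced by the overflow value
```
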
251 | .TP | |
252 | .IR /proc/sys/fs/pipe\-max\-size " (since Linux 2.6.35)" | |
253 | See | |
254 | .BR pipe (7). | |
255 | .TP | |
256 | .IR /proc/sys/fs/pipe\-user\-pages\-hard " (since Linux 4.5)" | |
257 | See | |
258 | .BR pipe (7). | |
259 | .TP | |
260 | .IR /proc/sys/fs/pipe\-user\-pages\-soft " (since Linux 4.5)" | |
261 | See | |
262 | .BR pipe (7). | |
263 | .TP | |
264 | .IR /proc/sys/fs/protected_fifos " (since Linux 4.19)" | |
265 | The value in this file can be set to one of the following: | |
266 | .RS | |
267 | .TP 4 | |
268 | 0 | |
269 | Writing to FIFOs is unrestricted. | |
270 | .TP | |
271 | 1 | |
272 | Don't allow | |
273 | .B O_CREAT | |
274 | .BR open (2) | |
275 | on FIFOs that the caller doesn't own in world-writable sticky directories, | |
276 | unless the FIFO is owned by the owner of the directory. | |
277 | .TP | |
278 | 2 | |
279 | As for the value 1, | |
280 | but the restriction also applies to group-writable sticky directories. | |
281 | .RE | |
282 | .IP | |
283 | The intent of the above protections is to avoid unintentional writes to an | |
284 | attacker-controlled FIFO when a program expected to create a regular file. | |
285 | .TP | |
286 | .IR /proc/sys/fs/protected_hardlinks " (since Linux 3.6)" | |
287 | .\" commit 800179c9b8a1e796e441674776d11cd4c05d61d7 | |
288 | When the value in this file is 0, | |
289 | no restrictions are placed on the creation of hard links | |
290 | (i.e., this is the historical behavior before Linux 3.6). | |
291 | When the value in this file is 1, | |
292 | a hard link can be created to a target file | |
293 | only if one of the following conditions is true: | |
294 | .RS | |
295 | .IP \[bu] 3 | |
296 | The calling process has the | |
297 | .B CAP_FOWNER | |
298 | capability in its user namespace | |
299 | and the file UID has a mapping in the namespace. | |
300 | .IP \[bu] | |
301 | The filesystem UID of the process creating the link matches | |
302 | the owner (UID) of the target file | |
303 | (as described in | |
304 | .BR credentials (7), | |
305 | a process's filesystem UID is normally the same as its effective UID). | |
306 | .IP \[bu] | |
307 | All of the following conditions are true: | |
308 | .RS 4 | |
309 | .IP \[bu] 3 | |
310 | the target is a regular file; | |
311 | .IP \[bu] | |
312 | the target file does not have its set-user-ID mode bit enabled; | |
313 | .IP \[bu] | |
314 | the target file does not have both its set-group-ID and | |
315 | group-executable mode bits enabled; and | |
316 | .IP \[bu] | |
317 | the caller has permission to read and write the target file | |
318 | (either via the file's permissions mask or because it has | |
319 | suitable capabilities). | |
320 | .RE | |
321 | .RE | |
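
The conditions listed above can be expressed as a single predicate. The sketch below models only the rules as stated in this page for the value 1; it is not the kernel implementation. The function name and parameters are hypothetical, `has_cap_fowner` is assumed to mean that the caller has CAP_FOWNER in its user namespace and the file UID is mapped there, and `can_read_and_write` is assumed to be computed elsewhere.

```python
import stat


def may_create_hardlink(fsuid: int, file_uid: int, file_mode: int,
                        has_cap_fowner: bool, can_read_and_write: bool) -> bool:
    """Model of the protected_hardlinks=1 rules described in the text."""
    # Condition 1: CAP_FOWNER in the user namespace, with the file UID mapped.
    if has_cap_fowner:
        return True
    # Condition 2: the caller's filesystem UID owns the target file.
    if fsuid == file_uid:
        return True
    # Condition 3: all of the "safe target" checks must hold.
    sgid_and_gexec = stat.S_ISGID | stat.S_IXGRP
    return (stat.S_ISREG(file_mode)                            # regular file
            and not file_mode & stat.S_ISUID                   # no set-user-ID bit
            and (file_mode & sgid_and_gexec) != sgid_and_gexec # not set-GID + group-exec
            and can_read_and_write)                            # read and write access


# A non-owner linking to a set-user-ID binary is refused:
print(may_create_hardlink(1000, 0, stat.S_IFREG | 0o4755, False, True))
```
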
322 | .IP | |
323 | The default value in this file is 0. | |
324 | Setting the value to 1 | |
325 | prevents a longstanding class of security issues caused by | |
326 | hard-link-based time-of-check, time-of-use races, | |
327 | most commonly seen in world-writable directories such as | |
328 | .IR /tmp . | |
329 | The common method of exploiting this flaw | |
330 | is to cross privilege boundaries when following a given hard link | |
331 | (i.e., a root process follows a hard link created by another user). | |
332 | Additionally, on systems without separated partitions, | |
333 | this stops unauthorized users from "pinning" vulnerable set-user-ID and | |
334 | set-group-ID files against being upgraded by | |
335 | the administrator, or linking to special files. | |
336 | .TP | |
337 | .IR /proc/sys/fs/protected_regular " (since Linux 4.19)" | |
338 | The value in this file can be set to one of the following: | |
339 | .RS | |
340 | .TP 4 | |
341 | 0 | |
342 | Writing to regular files is unrestricted. | |
343 | .TP | |
344 | 1 | |
345 | Don't allow | |
346 | .B O_CREAT | |
347 | .BR open (2) | |
348 | on regular files that the caller doesn't own in | |
349 | world-writable sticky directories, | |
350 | unless the regular file is owned by the owner of the directory. | |
351 | .TP | |
352 | 2 | |
353 | As for the value 1, | |
354 | but the restriction also applies to group-writable sticky directories. | |
355 | .RE | |
356 | .IP | |
357 | The intent of the above protections is similar to | |
358 | .IR protected_fifos , | |
359 | but allows an application to | |
360 | avoid writes to an attacker-controlled regular file, | |
361 | where the application expected to create one. | |
362 | .TP | |
363 | .IR /proc/sys/fs/protected_symlinks " (since Linux 3.6)" | |
364 | .\" commit 800179c9b8a1e796e441674776d11cd4c05d61d7 | |
365 | When the value in this file is 0, | |
366 | no restrictions are placed on following symbolic links | |
367 | (i.e., this is the historical behavior before Linux 3.6). | |
368 | When the value in this file is 1, symbolic links are followed only | |
369 | in the following circumstances: | |
370 | .RS | |
371 | .IP \[bu] 3 | |
372 | the filesystem UID of the process following the link matches | |
373 | the owner (UID) of the symbolic link | |
374 | (as described in | |
375 | .BR credentials (7), | |
376 | a process's filesystem UID is normally the same as its effective UID); | |
377 | .IP \[bu] | |
378 | the link is not in a sticky world-writable directory; or | |
379 | .IP \[bu] | |
380 | the symbolic link and its parent directory have the same owner (UID). | |
381 | .RE | |
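
The three circumstances above amount to a short decision function. This is an illustrative model of the rules as stated for the value 1, not kernel code; the function name and parameters are hypothetical.

```python
import stat


def may_follow_symlink(fsuid: int, link_uid: int,
                       dir_uid: int, dir_mode: int) -> bool:
    """Model of the protected_symlinks=1 rules described in the text."""
    # The follower's filesystem UID owns the symbolic link.
    if fsuid == link_uid:
        return True
    # The link is not in a sticky world-writable directory.
    sticky = bool(dir_mode & stat.S_ISVTX)
    world_writable = bool(dir_mode & stat.S_IWOTH)
    if not (sticky and world_writable):
        return True
    # The link and its parent directory have the same owner.
    return link_uid == dir_uid


# A link planted by UID 1001 in a root-owned, mode-1777 directory is not
# followed by a process whose filesystem UID is 1000:
print(may_follow_symlink(1000, 1001, 0, stat.S_IFDIR | 0o1777))
```
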
382 | .IP | |
383 | A system call that fails to follow a symbolic link | |
384 | because of the above restrictions returns the error | |
385 | .B EACCES | |
386 | in | |
387 | .IR errno . | |
388 | .IP | |
389 | The default value in this file is 0. | |
390 | Setting the value to 1 avoids a longstanding class of security issues | |
391 | based on time-of-check, time-of-use races when accessing symbolic links. | |
392 | .TP | |
393 | .IR /proc/sys/fs/suid_dumpable " (since Linux 2.6.13)" | |
394 | .\" The following is based on text from Documentation/sysctl/kernel.txt | |
395 | The value in this file is assigned to a process's "dumpable" flag | |
396 | in the circumstances described in | |
397 | .BR prctl (2). | |
398 | In effect, | |
399 | the value in this file determines whether core dump files are | |
400 | produced for set-user-ID or otherwise protected/tainted binaries. | |
401 | The "dumpable" setting also affects the ownership of files in a process's | |
402 | .IR /proc/ pid | |
403 | directory, as described above. | |
404 | .IP | |
405 | Three different integer values can be specified: | |
406 | .RS | |
407 | .TP | |
408 | \fI0\ (default)\fP | |
409 | .\" In kernel source: SUID_DUMP_DISABLE | |
410 | This provides the traditional (pre-Linux 2.6.13) behavior. | |
411 | A core dump will not be produced for a process which has | |
412 | changed credentials (by calling | |
413 | .BR seteuid (2), | |
414 | .BR setgid (2), | |
415 | or similar, or by executing a set-user-ID or set-group-ID program) | |
416 | or whose binary does not have read permission enabled. | |
417 | .TP | |
418 | \fI1\ ("debug")\fP | |
419 | .\" In kernel source: SUID_DUMP_USER | |
420 | All processes dump core when possible. | |
421 | (Reasons why a process might nevertheless not dump core are described in | |
422 | .BR core (5).) | |
423 | The core dump is owned by the filesystem user ID of the dumping process | |
424 | and no security is applied. | |
425 | This is intended for system debugging situations only: | |
426 | this mode is insecure because it allows unprivileged users to | |
427 | examine the memory contents of privileged processes. | |
428 | .TP | |
429 | \fI2\ ("suidsafe")\fP | |
430 | .\" In kernel source: SUID_DUMP_ROOT | |
431 | Any binary which normally would not be dumped (see "0" above) | |
432 | is dumped readable by root only. | |
433 | This allows the user to remove the core dump file but not to read it. | |
434 | For security reasons core dumps in this mode will not overwrite one | |
435 | another or other files. | |
436 | This mode is appropriate when administrators are | |
437 | attempting to debug problems in a normal environment. | |
438 | .IP | |
439 | Additionally, since Linux 3.6, | |
440 | .\" 9520628e8ceb69fa9a4aee6b57f22675d9e1b709 | |
441 | .I /proc/sys/kernel/core_pattern | |
442 | must either be an absolute pathname | |
443 | or a pipe command, as detailed in | |
444 | .BR core (5). | |
445 | Warnings will be written to the kernel log if | |
446 | .I core_pattern | |
447 | does not follow these rules, and no core dump will be produced. | |
448 | .\" 54b501992dd2a839e94e76aa392c392b55080ce8 | |
449 | .RE | |
450 | .IP | |
451 | For details of the effect of a process's "dumpable" setting | |
452 | on ptrace access mode checking, see | |
453 | .BR ptrace (2). | |
454 | .TP | |
455 | .I /proc/sys/fs/super\-max | |
456 | This file | |
457 | controls the maximum number of superblocks, and | |
458 | thus the maximum number of mounted filesystems the kernel | |
459 | can have. | |
460 | You need to increase | |
461 | .I super\-max | |
462 | only if you need to mount more filesystems than the current value in | |
463 | .I super\-max | |
464 | allows. | |
465 | .TP | |
466 | .I /proc/sys/fs/super\-nr | |
467 | This file | |
468 | contains the number of filesystems currently mounted. | |
469 | .SH SEE ALSO | |
470 | .BR proc (5), | |
471 | .BR proc_sys (5) |