man/man2/mmap.2: CAVEATS: Document danger of mappings larger than PTRDIFF_MAX
References:
- C99 draft: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf
section "6.5.6 Additive operators", paragraph 9
- object size restriction in GCC:
https://gcc.gnu.org/legacy-ml/gcc/2011-08/msg00221.html
- glibc malloc restricts object size to <=PTRDIFF_MAX in
checked_request2size() since glibc v2.30 (released in 2019, as pointed
out by Jakub Wilk):
https://sourceware.org/cgit/glibc/commit/?id=9bf8e29ca136094f
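As a rough illustration of the caveat (not text from the patch), a caller
can refuse oversized requests before they reach mmap(2). The wrapper name
checked_anon_map() is hypothetical, and reporting ENOMEM is an assumption
that mirrors what malloc(3) does for such sizes:

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/*
 * Hypothetical wrapper: refuse anonymous mappings larger than PTRDIFF_MAX.
 * Subtracting two pointers into such an object could overflow ptrdiff_t,
 * which is undefined behavior (C99 6.5.6p9).
 */
void *
checked_anon_map(size_t length)
{
	if (length > (size_t) PTRDIFF_MAX) {
		errno = ENOMEM;  /* assumption: same error malloc(3) would set */
		return MAP_FAILED;
	}

	/* Any other length (including 0) is left for mmap(2) to validate. */
	return mmap(NULL, length, PROT_READ | PROT_WRITE,
	            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}
```

On failure the wrapper returns MAP_FAILED, so callers can treat it exactly
like a direct mmap(2) call.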
Documentation was extracted from the original patch written by Andrea
Arcangeli and upstreamed in [1]. Minor edits were made to maintain
the same documentation style as other userfaultfd ioctl commands.
Jeremy Kerr [Thu, 17 Apr 2025 02:50:07 +0000 (10:50 +0800)]
man/man7/mctp.7: Document Linux MCTP support
This change adds a brief description for the new Management Component
Transport Protocol (MCTP) support added to Linux as of
linux.git bc49d8169aa7 (2021-07-29; "mctp: Add MCTP base").
This is a fairly regular sockets-API implementation, so we're just
describing the semantics of socket(2), bind(2), sendto(2), and
recvfrom(2) for the new protocol.
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Message-ID: <20250417-mctp-v3-1-07fff4d26f73@codeconstruct.com.au>
[alx: minor tweaks] Signed-off-by: Alejandro Colomar <alx@kernel.org>
This seems to be about implementation details that are unimportant to
programmers. It is widely understood that compilers may optimize some
libc calls.
Cc: Anton Zellerhoff <wg14@ascz.de> Signed-off-by: Alejandro Colomar <alx@kernel.org>
Anton Zellerhoff [Sun, 13 Apr 2025 17:50:11 +0000 (19:50 +0200)]
man/man3/abs.3: Document u{,l,ll,imax}abs()
C2Y adds unsigned versions of the abs functions (see C2Y draft N3467 and
proposal N3349). Support for these functions will be included in GCC 15
and glibc 2.42.
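Until those releases are widely available, the behavior can be sketched
portably. The function my_uabs() below is a hypothetical stand-in, not the
libc implementation; it shows why an unsigned result is useful: unlike
abs(INT_MIN), the magnitude of INT_MIN is representable.

```c
#include <limits.h>

/*
 * Hypothetical stand-in for an unsigned abs: returns the magnitude of j
 * as an unsigned int, including for INT_MIN.
 */
unsigned int
my_uabs(int j)
{
	/* Converting to unsigned is well defined (reduction modulo 2^N). */
	unsigned int u = (unsigned int) j;

	/* For negative j, the magnitude is the unsigned negation of u. */
	return (j < 0) ? 0u - u : u;
}
```

For example, my_uabs(INT_MIN) yields (unsigned int) INT_MAX + 1, a value
that a signed int cannot hold.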
Amir Goldstein [Fri, 4 Apr 2025 10:47:22 +0000 (12:47 +0200)]
man/man?/fanotify*: Reorganize documentation of FAN_FS_ERROR
The placement of the FAN_FS_ERROR entry in the event section was rather
arbitrary inside the group of fid info events.
FAN_FS_ERROR is a special event with error info, so place its entry
after the entries for fid info events and before the entries for
permission events.
Reduce unneeded newlines in the FAN_FS_ERROR entry.
Cc: Jan Kara <jack@suse.cz> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Message-ID: <20250404104723.1709188-1-amir73il@gmail.com> Signed-off-by: Alejandro Colomar <alx@kernel.org>
Document FAN_RESPONSE_INFO_AUDIT_RULE extended response info record
that was added in v6.3.
Cc: Jan Kara <jack@suse.cz> Cc: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Message-ID: <20250331082759.1424401-2-amir73il@gmail.com>
[alx: ffix] Signed-off-by: Alejandro Colomar <alx@kernel.org>
Amir Goldstein [Sun, 30 Mar 2025 12:55:36 +0000 (14:55 +0200)]
man/man?/fanotify*: Document FAN_PRE_ACCESS event
The new FAN_PRE_ACCESS events are created before access to a file range,
to provide an opportunity for the event listener to modify the content
of the object before the user can access it.
Those events are available for groups in class FAN_CLASS_PRE_CONTENT.
They are reported with a FAN_EVENT_INFO_TYPE_RANGE info record.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Message-ID: <20250330125536.1408939-1-amir73il@gmail.com> Signed-off-by: Alejandro Colomar <alx@kernel.org>
Amir Goldstein [Mon, 31 Mar 2025 08:27:59 +0000 (10:27 +0200)]
man/man7/fanotify.7: Document FAN_DENY_ERRNO()
Document FAN_DENY_ERRNO(), which was added in Linux 6.13 to
report specific errors on file access.
Cc: Jan Kara <jack@suse.cz> Cc: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Message-ID: <20250331082759.1424401-3-amir73il@gmail.com> Signed-off-by: Alejandro Colomar <alx@kernel.org>
Amir Goldstein [Mon, 31 Mar 2025 08:27:57 +0000 (10:27 +0200)]
man/man7/fanotify.7: The response field is now a bit mask instead of an enum
Since the introduction of the FAN_AUDIT response flag,
the response field of fanotify_response is no longer an enum;
it is now a bit mask, so fix the wording around FAN_ALLOW and
FAN_DENY.
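A minimal sketch of what the bit-mask wording means in practice follows.
The helper make_response() is hypothetical; it assumes a group created
with FAN_ENABLE_AUDIT when the audit flag is requested:

```c
#include <stdbool.h>
#include <sys/fanotify.h>

/*
 * Hypothetical helper: build a permission response.  Exactly one of
 * FAN_ALLOW or FAN_DENY is chosen, and FAN_AUDIT may be ORed in.
 */
struct fanotify_response
make_response(int fd, bool allow, bool audit)
{
	struct fanotify_response r;

	r.fd = fd;
	r.response = allow ? FAN_ALLOW : FAN_DENY;
	if (audit)
		r.response |= FAN_AUDIT;  /* assumes FAN_ENABLE_AUDIT was set */

	return r;
}
```

The resulting structure would then be written to the fanotify file
descriptor with write(2).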
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Message-ID: <20250331082759.1424401-1-amir73il@gmail.com> Signed-off-by: Alejandro Colomar <alx@kernel.org>
Amir Goldstein [Mon, 31 Mar 2025 08:16:42 +0000 (10:16 +0200)]
man/man2/open_by_handle_at.2: name_to_handle_at(): Document the AT_HANDLE_CONNECTABLE flag
A flag since Linux 6.13 to indicate that the requested file_handle is
intended to be used for open_by_handle_at(2) to obtain an open file
with a known path.
Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Jeff Layton <jlayton@poochiereds.net> Cc: Christian Brauner <brauner@kernel.org> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Message-ID: <20250331081642.1423812-2-amir73il@gmail.com> Signed-off-by: Alejandro Colomar <alx@kernel.org>
Amir Goldstein [Mon, 31 Mar 2025 08:16:41 +0000 (10:16 +0200)]
man/man2/open_by_handle_at.2: name_to_handle_at(): Document the AT_HANDLE_MNT_ID_UNIQUE flag
A flag since Linux 6.12 to indicate that the requested mount_id is
a 64-bit unique id.
Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Jeff Layton <jlayton@poochiereds.net> Cc: Christian Brauner <brauner@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Aleksa Sarai <cyphar@cyphar.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Message-ID: <20250331081642.1423812-1-amir73il@gmail.com> Signed-off-by: Alejandro Colomar <alx@kernel.org>
These previously undocumented selection modes for the Linux console
are implemented in <drivers/tty/vt/selection.c>. The name "selection
mode" is slightly misleading as not all of them actually manipulate
the kernel's mouse selection buffer.
Includes clarified semantics pointed out by Jared Finder.
Cc: Hanno Böck <hanno@hboeck.de> Cc: Jann Horn <jannh@google.com> Cc: Jiri Slaby <jirislaby@kernel.org> Cc: Jakub Wilk <jwilk@jwilk.net> Signed-off-by: Günther Noack <gnoack3000@gmail.com>
Message-ID: <20250330143038.4184-5-gnoack3000@gmail.com> Acked-by: Jared Finder <jared@finder.org> Signed-off-by: Alejandro Colomar <alx@kernel.org>
Günther Noack [Sun, 30 Mar 2025 14:30:39 +0000 (16:30 +0200)]
man/man2const/TIOCLINUX.2const: Restructure documentation for TIOCL_SETSEL selection modes
* Indent the documented selection modes into tagged paragraphs.
* Document constants from the header file <tiocl.h> instead of numbers.
* Clarify expansion semantics as suggested by Jared Finder.
Signed-off-by: Günther Noack <gnoack3000@gmail.com>
Message-ID: <20250330143038.4184-4-gnoack3000@gmail.com> Acked-by: Jared Finder <jared@finder.org> Signed-off-by: Alejandro Colomar <alx@kernel.org>
Kang-Che Sung [Sun, 16 Mar 2025 18:32:13 +0000 (02:32 +0800)]
man/man3/wc{,r}tomb.3: wfix regarding MB_CUR_MAX
Add the missing length requirement about MB_CUR_MAX to wcrtomb(3).
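One way to respect the length requirement without over-allocating the
caller's buffer is to convert into a scratch buffer first. This is only a
sketch; wc_to_bytes() is a hypothetical helper, and it relies on
MB_CUR_MAX never exceeding MB_LEN_MAX:

```c
#include <limits.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper: convert wc to its multibyte sequence in the
 * current locale and copy it to buf only if it fits in bufsize bytes.
 * Returns the sequence length, or -1 on conversion error or lack of room.
 */
int
wc_to_bytes(wchar_t wc, char *buf, size_t bufsize)
{
	char       tmp[MB_LEN_MAX];  /* always large enough for wcrtomb(3) */
	mbstate_t  st;
	size_t     n;

	memset(&st, 0, sizeof(st));  /* start in the initial shift state */
	n = wcrtomb(tmp, wc, &st);
	if (n == (size_t) -1 || n > bufsize)
		return -1;

	memcpy(buf, tmp, n);
	return (int) n;
}
```

The copied sequence is not null-terminated; the caller tracks its length
through the return value.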
Change the wording on the MB_CUR_MAX requirement in wctomb(3). If
programmers know the wide character to convert beforehand, they are
allowed to use a buffer smaller than MB_CUR_MAX bytes, as long as it
"fits" the sequence.
man/man2/get_mempolicy.2: SYNOPSIS: Use GNU fwd declaration of parameters for sizes of array parameters
I forgot to include this change in the global change applied recently.
Fixes: d2c2db8830f8 (2025-03-14; "man/: SYNOPSIS: Use GNU forward-declarations of parameters for sizes of array parameters") Signed-off-by: Alejandro Colomar <alx@kernel.org>
Matthieu Buffet [Fri, 7 Mar 2025 22:22:44 +0000 (23:22 +0100)]
man/man7/ip.7: Document capabilities to use IP_TRANSPARENT
CAP_NET_ADMIN has been overkill to use setsockopt(IP_TRANSPARENT)
since a discussion on LKML[1] and a patch[2] in 2011. All that is
left to do is to let devs know they don't need CAP_NET_ADMIN.
[2] linux.git 6cc7a765c298 (2011-10-20; "net: allow CAP_NET_RAW to set socket options IP{,V6}_TRANSPARENT")
Günther Noack [Mon, 3 Mar 2025 19:50:31 +0000 (20:50 +0100)]
man/man7/landlock.7: Document IPC scoping (Landlock ABI v6)
With this ABI version, Landlock can restrict outgoing interactions with
higher-privileged Landlock domains through Abstract Unix Domain sockets
and signals.
Terminology:
* The *IPC Scope* of a Landlock domain is that Landlock domain and its
nested domains.
* An *operation* (e.g., signaling, connecting to abstract UDS) is said
to be *scoped within a domain* when the flag for that operation was
set at ruleset creation time. This means that for the purpose of
this operation, only processes within the domain's IPC scope are
reachable.
Günther Noack [Mon, 3 Mar 2025 19:50:29 +0000 (20:50 +0100)]
man/man7/landlock.7: Document network support
Copy over the existing wording from kernel documentation,
as it was introduced in Linux commit 51442e8d64bc (2023-10-26, "landlock: Document network support").
Landlock rules are not only about the filesystem any more
and the new wording is more appropriate.
We need to escape the # for old versions of make(1). However, new
versions of grep(1) diagnose if it receives an escaped #. To keep both
make(1) and grep(1) happy in both their old and new versions, we need to
take advantage of # not being a comment in bash(1) when not preceded by
a space, and also of \# being translated into # by bash(1).
While browsing a header file in the kernel source, I saw the memory policy
enum used for mbind() and set_mempolicy() using an entry that I didn't
recognize. I checked the section 2 manual pages for both system calls and
didn't see an entry for MPOL_PREFERRED_MANY. The commit on the enum entry:
linux.git b27abaccf8e8 (2021-09-02; "mm/mempolicy: add
MPOL_PREFERRED_MANY for multiple preferred nodes")
The commit message gives the rationale for why the MPOL_PREFERRED_MANY
mode would be beneficial: it gives the ability to set the memory policy
to target different tiers of memory over various NUMA nodes.
Cc: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Matthew Cassell <mcassell411@gmail.com>
Message-ID: <20250220225232.2138-1-mcassell411@gmail.com> Signed-off-by: Alejandro Colomar <alx@kernel.org>
Askar Safin [Thu, 20 Feb 2025 09:19:25 +0000 (09:19 +0000)]
man/man3/getcwd.3: VERSIONS: The syscall can return "(unreachable)", but modern glibc wrapper cannot
I verified using an experiment (see below) that the modern glibc wrapper
getcwd() actually never returns "(unreachable)". I have also read the
modern glibc sources for all three functions documented here. None of
them return "(unreachable)".
Göran Uddeborg [Sun, 16 Feb 2025 18:59:50 +0000 (19:59 +0100)]
man/man7/mount_namespaces.7: Fix an incorrect path in an example
In the example showing how locked mounts in a less privileged mount
namespace cannot be split, first </etc/shadow> is bind mounted, then an
attempt is made to unmount </mnt/dir>, which gives an error complaining
that </etc/shadow> is not mounted. The unmount should also refer to
</etc/shadow>.
The semantics of '?=' are similar to those of '=', but we need simple
assignment as if ':=', so we can't use '?='. In the future, we'll be
able to use '?:='. For now, let's use ifndef.
Fixes: 0d69e51cd4b8 (2025-02-10; "share/mk/: Use ?= assignments for user-facing variables")
Link: <https://lore.kernel.org/linux-man/378a2eba-c973-4de9-a362-6e25123bf75b@systematicsw.ab.ca/T/#m3be93ab6b875569178981b034b4a874632db2fa9> Reported-by: Brian Inglis <brian.inglis@systematicsw.ab.ca> Signed-off-by: Alejandro Colomar <alx@kernel.org>
Amit Pinhas [Wed, 12 Feb 2025 20:56:20 +0000 (22:56 +0200)]
man/man2/kill.2: RETURN VALUE: Fix wording issue with sig=0
The DESCRIPTION says:
If *sig* is 0, then no signal is sent, but existence and
permission checks are still performed; this can be used to
check for the existence of a process ID.
On the other hand, the `RETURN VALUE` section contradicted that.
On success (at least one signal was sent), zero is returned. On
error, -1 is returned...
How can I get 0 when providing sig=0, if no signal was actually
sent, which is the criteria for success of this call???
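The sig=0 case described above is commonly used as follows. This is a
sketch only; process_exists() is a hypothetical helper, and treating EPERM
as "exists" is a design choice of this example:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: report whether a process with this ID exists.
 * With sig=0, kill(2) sends nothing but still performs the existence
 * and permission checks, so a return of 0 means "checks passed".
 */
bool
process_exists(pid_t pid)
{
	if (pid <= 0)
		return false;  /* 0 and negative IDs address groups, not one PID */

	if (kill(pid, 0) == 0)
		return true;

	/* EPERM: the process exists, but we may not signal it. */
	return errno == EPERM;
}
```

For instance, process_exists(getpid()) is true even though no signal is
ever delivered.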
Reported-by: Amit Pinhas <amitpinhass@gmail.com> Co-authored-by: Alejandro Colomar <alx@kernel.org> Signed-off-by: Amit Pinhas <amitpinhass@gmail.com>
Message-ID: <a4fa37e0fc89a3c99982ace3fe381991ebe85b00.1739393685.git.amitpinhass@gmail.com> Signed-off-by: Alejandro Colomar <alx@kernel.org>
GNUmakefile: help: Show only variables assigned with '?='
The others are internal stuff that most likely shouldn't be touched.
Cc: Sam James <sam@gentoo.org> Cc: Paul Smith <psmith@gnu.org> Cc: Guenther Noack <gnoack@google.com> Signed-off-by: Alejandro Colomar <alx@kernel.org>
share/mk/: Use ?= assignments for user-facing variables
This allows users to specify them as environment variables.
Cc: Sam James <sam@gentoo.org> Cc: Paul Smith <psmith@gnu.org> Cc: Guenther Noack <gnoack@google.com> Signed-off-by: Alejandro Colomar <alx@kernel.org>
GNUmakefile: Require the user to specify '-R' if their make(1) is too old
And everyone's make(1) is too old. :-)
This will allow us to use ?= assignments. Once a new GNU make(1)
release is done, we'll be able to rely on our setting of MAKEFLAGS+=-R
at the top of the GNUMakefile, but currently, that's not enough, and the
user must specify -R to unset implicit variables.
Cc: Sam James <sam@gentoo.org> Cc: Paul Smith <psmith@gnu.org> Cc: Guenther Noack <gnoack@google.com> Signed-off-by: Alejandro Colomar <alx@kernel.org>
GNU make(1) 4.2 seems to be interpreting those characters as the start
of a comment, so we need to escape them. That seems to calm those old
versions of make(1), and doesn't negatively affect the newer ones or
grep(1).
Fixes: 35a780a99bd8 (2024-07-20; "share/mk/: CPPFLAGS: Only define _FORTIFY_SOURCE if it's not already defined") Fixes: 2130162900ab (2024-11-03; "share/mk/, etc/shellcheck/: lint-sh: Add target to lint shell scripts") Reported-by: Boris Pigin <boris.pigin@gmail.com> Cc: Sam James <sam@gentoo.org> Signed-off-by: Alejandro Colomar <alx@kernel.org>
Mark Harris [Sun, 9 Feb 2025 01:24:14 +0000 (17:24 -0800)]
man/man3/timespec_get.3: Correct return value and clarify description
- 0, not -1, is returned for an unsupported time base or error
(C23 7.29.2.6, 7.29.2.7; POSIX.1-2024 line 74358).
- Clarify that any supported value of base is always nonzero (i.e.,
there is no overlap between the two return value cases that may
require errno or some other source to disambiguate)
(C23 7.29.2.6, 7.29.2.7; POSIX.1-2024 line 74357).
- Clarify that timespec_getres(NULL, base) is a valid call to check
whether the specified time base is supported (C23 7.29.2.7).
- Clarify that the resolution for a particular time base is constant
for the lifetime of the process (i.e., there is no need to retrieve
it repeatedly) (C23 7.29.2.7).
- Calls to these functions are not technically equivalent to any
clock_* function call; at least the return value will be different.
Clarify that it is the time and resolution that are the same.
- The ERRORS section is removed, because it states only what is true
for every function that does not state otherwise (i.e., errno might
be affected by underlying system calls).
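The corrected return-value rule can be sketched as a small check. The
helper now_utc() is hypothetical and only illustrates that failure is
signaled by 0, never by -1:

```c
#include <stdbool.h>
#include <time.h>

/*
 * Hypothetical helper: fetch the current calendar time.
 * timespec_get() returns base (always nonzero) on success and 0 for an
 * unsupported time base or error, so a single comparison is enough.
 */
bool
now_utc(struct timespec *ts)
{
	return timespec_get(ts, TIME_UTC) != 0;
}
```

Testing the result against -1, as the old text implied, would treat every
failure as success.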
Fixes: 7bda5119fe5e (2024-09-08; "timespec_get.3, timespec_getres.3: Add page and link page") Cc: наб <nabijaczleweli@nabijaczleweli.xyz> Signed-off-by: Mark Harris <mark.hsj@gmail.com>
Message-ID: <5f8dc5d2dc51f080a18de53e98610df43389b98b.1739063937.git.mark.hsj@gmail.com> Signed-off-by: Alejandro Colomar <alx@kernel.org>
man/man3/regex.3: EXAMPLES: Don't use z length modifier with unsigned int
GCC and Clang print warnings:
$ clang main.c -Wall
main.c:30:23: warning: format specifies type 'size_t' (aka 'unsigned long') but the argument has type 'unsigned int' [-Wformat]
30 | printf("#%zu:\n", i);
| ~~~ ^
| %u
1 warning generated.
$ gcc main.c -Wall
main.c: In function ‘main’:
main.c:30:16: warning: format ‘%zu’ expects argument of type ‘size_t’, but argument 2 has type ‘unsigned int’ [-Wformat=]
30 | printf("#%zu:\n", i);
| ~~^ ~
| | |
| | unsigned int
| long unsigned int
| %u
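The fix amounts to matching each conversion to the argument type. The two
helpers below are hypothetical and use snprintf(3) rather than printf(3)
only so that the result can be inspected:

```c
#include <stdio.h>

/* unsigned int takes plain %u, with no length modifier. */
int
format_index(char *buf, size_t size, unsigned int i)
{
	return snprintf(buf, size, "#%u:\n", i);
}

/* size_t is the type that takes the z length modifier. */
int
format_size(char *buf, size_t size, size_t n)
{
	return snprintf(buf, size, "%zu\n", n);
}
```

With these formats, neither GCC nor Clang should have a -Wformat mismatch
to report for these calls.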
The autogroup feature can be controlled at runtime when built into the
kernel. Disabling it in this case still creates autogroups and still
shows the autogroup membership for the task in /proc. The scheduler
code will just not use the autogroup task group. This can be
confusing to users. Add a sentence to this effect to sched.7 to point
this out.
The kernel code shows how this is used. The sched_autogroup_enabled
toggle is only used in one place.
kernel/sched/autogroup.h:
    static inline struct task_group *
    autogroup_task_group(struct task_struct *p, struct task_group *tg)
    {
            extern unsigned int sysctl_sched_autogroup_enabled;
            int enabled = READ_ONCE(sysctl_sched_autogroup_enabled);

            if (enabled && task_wants_autogroup(p, tg))
                    return p->signal->autogroup->tg;

            return tg;
    }
task_wants_autogroup() is in kernel/sched/autogroup.c:
One can see that any group set other than root also bypasses the use of
the autogroup. All of the machinery around the creation of the
autogroup is not affected by the toggle.
From userspace:
0
/autogroup-112 nice 0
Note that systemd-based systems these days are not really using autogroups
at all anyway, because any task in a non-root cgroup bypasses the autogroup
as well.
Cc: Carlos O'Donell <codonell@redhat.com> Signed-off-by: Phil Auld <pauld@redhat.com>
Message-ID: <20250116143747.2366152-1-pauld@redhat.com> Signed-off-by: Alejandro Colomar <alx@kernel.org>
man/man7/pathname.7: Pathnames are opaque C strings
On Mon, Jan 27, 2025 at 07:27:59PM +0100, наб wrote:
> Skimming the thread: UNIX paths are sequences of non-NUL bytes.
>
> It is never correct to expect to be able to have a (parse, unparse)
> operation pair for which unparse(parse(x)) = x for path x.
>
> It's obviously wrong to reject a pathname just because you dont like it.
>
> Thus, when displaying a path, either (a) dump it directly to the output
> (the user has configured their display device to understand the paths they use),
> or if that's not possible (b) setlocale(LC_ALL, "") + mbrtowc() loop
> and render the result (applying usual ?/� substitutions for mbrtowc()
> errors makes sense here).
>
> There are very few operations on paths that are actually reasonable
> to do, ever; those are: appending stuff, prepending stuff
> (this is just appending stuff with the arguments backwards),
> and cleaving at /es;
> the "stuff" better be copied whole-sale from some other path
> or an unprocessed argument (or, sure, the PFCS).
>
> If you're getting bytes to append to a path, do that directly.
>
> If you're getting characters to append to a path,
> then wctomb(3) is the only non-invalid solution,
> since that (obviously) turns characters into bytes in the current
> locale, which (ex def) is the operation desired.
>
> I don't understand what the UTF-32 dance is supposed to be.
>
> If you're recommending transcoding paths, don't.
>
> To re-iterate: paths are not character sequences.
> They do not represent characters.
> You can't meaningfully coerce them thusly without loss of precision
> (this is ok to do for display! and nothing else).
> If at any point you find yourself turning wchar_t -> char
> you are doing something wrong;
> if you find yourself doing char -> wchar_t for anything beside display
> you should probably reconsider.
>
> This is different under Win32 of course. But that concerns us naught.
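Option (b) from the message above could look roughly like this. It is a
sketch, not code from the new page: path_for_display() is a hypothetical
helper, it assumes the caller has already called setlocale(LC_ALL, ""),
and its output is for display only, never for reuse as a pathname.

```c
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper: decode path in the current locale for display.
 * Every byte that mbrtowc(3) cannot decode is shown as L'?'.
 * out must have room for strlen(path) + 1 wide characters.
 * Returns the number of wide characters written, excluding the L'\0'.
 */
size_t
path_for_display(const char *path, wchar_t *out)
{
	mbstate_t  st;
	size_t     len = strlen(path);
	size_t     i = 0, o = 0;

	memset(&st, 0, sizeof(st));
	while (i < len) {
		wchar_t  wc;
		size_t   n = mbrtowc(&wc, path + i, len - i, &st);

		if (n == (size_t) -1 || n == (size_t) -2) {
			/* Invalid or truncated sequence: substitute, skip a byte. */
			out[o++] = L'?';
			i++;
			memset(&st, 0, sizeof(st));  /* state is undefined after error */
		} else {
			out[o++] = wc;
			i += (n == 0) ? 1 : n;
		}
	}
	out[o] = L'\0';

	return o;
}
```

The original bytes stay untouched, so the program can keep passing them
to system calls unchanged.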
The goal of this new manual page is to help people create programs that
do the right thing even in the face of unusual paths. The information
that I used to create this new manual page came from these sources:
Jason Yundt [Tue, 14 Jan 2025 21:14:25 +0000 (16:14 -0500)]
man/man7/man-pages.7: Stop telling contributors to write titles in all caps
Recently, I submitted my first patch to the Linux man-pages project. In
my patch, I had created a new manual page. On the manual page’s title
line, I had written the title of my new page in all caps because
man-pages(7) said that I should write it that way. It turns out that
man-pages(7) was wrong and that the title on the title line should have
matched the title in the manual page’s filename [1][2]. This commit
corrects man-pages(7) so that it does not tell contributors to use all
caps when writing titles on title lines.
The _exit(2) function is a better choice for exiting a child in many
cases. Most prominently it avoids calls of functions registered with
atexit(3) by the parent.
There are valid reasons to call exit(3) and the example is actually one
of them: flush FILE-based output. Since atexit(3) is never called, we
could just stay with exit(3).
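For the general case described in the first paragraph, a child that should
not run the parent's exit handlers can be sketched as follows. The helper
run_child() is hypothetical:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that terminates with `code`, and
 * return the exit status the parent observes, or -1 on failure.
 */
int
run_child(int code)
{
	int    status;
	pid_t  pid;

	fflush(NULL);  /* so the child does not inherit unflushed stdio data */

	pid = fork();
	if (pid == -1)
		return -1;

	if (pid == 0) {
		/* Child: _exit(2) skips atexit(3) handlers and stdio flushing. */
		_exit(code);
	}

	if (waitpid(pid, &status, 0) == -1)
		return -1;

	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A child that writes through FILE streams, as in the page's example, would
need exit(3) or an explicit fflush(3) before _exit(2).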