Many times, this page uses the terminology "mount point", where
"mount" would be better. A "mount point" is the location at which
a mount is attached. A "mount" is an association between a
filesystem and a mount point.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Michael Kerrisk [Tue, 17 Aug 2021 03:04:11 +0000 (05:04 +0200)]
mount_namespaces.7: Relocate the "Restrictions on mount namespaces" subsection
The "Restrictions on mount namespaces" subsection belongs lower in
the page, following the discussion of concepts (e.g., shared
subtrees and propagation) that are discussed elsewhere in the page.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Michael Kerrisk [Tue, 17 Aug 2021 02:19:48 +0000 (04:19 +0200)]
mount_namespaces.7: Repair earlier text after injection of new list item in previous commit
The previous commit injected a large block of text into a list,
separating one example in the previous list item from a
"continuation" in the following list item. Repair that.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Michael Kerrisk [Fri, 13 Aug 2021 21:40:50 +0000 (23:40 +0200)]
mount_namespaces.7: More clearly explain the notion of locked mounts
For a long time, this manual page has had a brief discussion of
"locked" mounts, without clearly saying what this concept is, or
why it exists. Expand the discussion with an explanation of what
locked mounts are, why mounts are locked, and some examples of the
effect of locking.
Thanks to Christian Brauner for a lot of help in understanding
these details.
Reported-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Michael Kerrisk [Wed, 18 Aug 2021 01:02:55 +0000 (03:02 +0200)]
chmod.2, chown.2, open.2, mkdir.2, mknod.2, readlink.2, stat.2, symlink.2, mkfifo.3, scandir.3, sem_wait.3: ERRORS: combine errors into a single alphabetic list
These pages split out extra errors for some APIs into a separate
list. Probably, the pages are easier to read if all errors are
combined into a single list.
Note that there still remain a few pages where the errors are
listed separately for different APIs. For the moment, it seems
best to leave those pages as is, since the error lists are
largely distinct in those pages.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Michael Kerrisk [Tue, 17 Aug 2021 22:35:06 +0000 (00:35 +0200)]
accept.2, access.2, getpriority.2, mlock.2: ERRORS: combine errors into a single list
These split out errors into separate lists (perhaps per API,
perhaps "may" vs "shall", perhaps "Linux-specific" vs
standard(??)), but there's no good reason to do this. It makes
the error list harder to read, and is inconsistent with other
pages. So, combine the errors into a single list.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Michael Kerrisk [Sun, 15 Aug 2021 23:57:52 +0000 (01:57 +0200)]
user_namespaces.7: Minor wording improvement
Mainly in preparation for the following patch on project ID maps.
Add some words that will make the parallels between the rules for
updating uid_map and projid_map clearer.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Many times, these pages use the terminology "mount point", where
"mount" would be better. A "mount point" is the location at which
a mount is attached. A "mount" is an association between a
filesystem and a mount point.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Michael Kerrisk [Thu, 12 Aug 2021 22:26:37 +0000 (00:26 +0200)]
mount_setattr.2: srcfix: add note explaining Christian's use of -ve dirfd values
From email with Christian Brauner:
>>>>>> int fd_tree = open_tree(-EBADF, source,
>>>>>> OPEN_TREE_CLONE | OPEN_TREE_CLOEXEC |
>>>>>> AT_EMPTY_PATH | (recursive ? AT_RECURSIVE : 0));
>>>>>
>>>>> ???
>>>>> What is the significance of -EBADF here? As far as I can tell, it
>>>>> is not meaningful to open_tree()?
>>>>
>>>> I always pass -EBADF for similar reasons to [2]. Feel free to just use -1.
>>>
>>> ????
>>> But here, both -EBADF and -1 seem to be wrong. This argument
>>> is a dirfd, and so should either be a file descriptor or the
>>> value AT_FDCWD, right?
>>
>> [1]: In this code "source" is expected to be absolute. If it's not
>> absolute we should fail. This can be achieved by passing -1/-EBADF,
>> afaict.
>
> D'oh! Okay. I hadn't considered that use case for an invalid dirfd.
> (And now I've done some adjustments to openat(2), which contains a
> rationale for the *at() functions.)
>
> So, now I understand your purpose, but still the code is obscure,
> since
>
> * You use a magic value (-EBADF) rather than (say) -1.
> * There's no explanation (comment about) of the fact that you want
> to prevent relative pathnames.
>
> So, I've changed the code to use -1, not -EBADF, and I've added some
> comments to explain that the intent is to prevent relative pathnames.
> Okay?
Sounds good.
>
> But, there is still the meta question: what's the problem with using
> a relative pathname?
Nothing per se. Ok, you asked so it's your fault:
When writing programs I like to never use relative paths with AT_FDCWD,
because making assumptions about the current working directory
of the calling process is just too easy to get wrong; especially when
pivot_root() or chroot() are in play.
My absolut preference (joke intended) is to open a well-known starting
point with an absolute path to get a dirfd and then scope all future
operations beneath that dirfd. This already works with old-style
openat() and _very_ cautious programming but openat2() and its
resolve-flag space have made this **chef's kiss**.
If I can't operate based on a well-known dirfd I use absolute paths with
a -EBADF dirfd passed to *at() functions.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Michael Kerrisk [Thu, 12 Aug 2021 03:41:56 +0000 (05:41 +0200)]
mount_setattr.2: Clarify the description of "detached" mounts
From email:
>> Thanks. I made it "detached". Elsewhere, the page already explains
>> that a detached mount is one that:
>>
>> must have been created by calling open_tree(2) with the
>> OPEN_TREE_CLONE flag and it must not already have been
>> visible in the filesystem.
>>
>> Which seems a fine explanation.
>>
>> ????
>> But, just a thought... "visible in the filesystem" seems not quite accurate.
>> What you really mean I guess is that it must not already have been
>> /visible in the filesystem hierarchy/previously mounted/something else/,
>> right?
I suppose that I should have clarified that my main problem was
that you were using the word "filesystem" in a way that I find
unconventional/ambiguous. I mean, I normally take the term
"filesystem" to be "a storage system for holding files".
Here, you are using "filesystem" to mean something else, which
I might call "the single directory hierarchy" or "the
filesystem hierarchy" or "the list of mount points".
> A detached mount is created via the OPEN_TREE_CLONE flag. It is a
> separate new mount so "previously mounted" is not applicable.
> A detached mount is _related_ to what the MS_BIND flag gives you with
> mount(2). However, they differ conceptually and technically. An MS_BIND
> mount(2) is always visible in the filesystem when mount(2) returns, i.e.
> it is discoverable by regular path-lookup starting within the
> filesystem.
>
> However, a detached mount can be seen as a split of MS_BIND into two
> distinct steps:
> 1. fd_tree = open_tree(OPEN_TREE_CLONE): create a new mount
> 2. move_mount(fd_tree, <somewhere>): attach the mount to the filesystem
>
> 1. and 2. together give you the equivalent of MS_BIND.
> In between 1. and 2. however the mount is detached. For the kernel
> "detached" means that an anonymous mount namespace is attached to it
> which doesn't appear in proc and has a 0 sequence number (Technically,
> there's a bit of semantical argument to be made that "attached" and
> "detached" are ambiguous as they could also be taken to mean "does or
> does not have a parent mount". This ambiguity e.g. appears in
> do_move_mount(). That's why the kernel itself calls it an "anonymous
> mount". However, an OPEN_TREE_CLONE-detached mount of course doesn't
> have a parent mount so it works.).
>
> For userspace it's better to think of detached and attached in terms of
> visibility in the filesystem or in a mount namespace. That's more
> straightforward, more relevant, and hits the target in 90% of the cases.
>
> However, the better and clearer picture is to say that an
> OPEN_TREE_CLONE-detached mount is a mount that has never been
> move_mount()ed. Which in turn can be defined as the detached mount has
> never been made visible in a mount namespace. Once that has happened the
> mount is irreversibly an attached mount.
>
> I keep thinking that maybe we should just say "anonymous mount"
> everywhere. So changing the wording to:
I'm not against the word "detached". To user space, I think it is a
little more meaningful than "anonymous". For the moment, I'll stay with
"detached", but if you insist on "anonymous", I'll probably change it.
> [...]
> EINVAL The mount that is to be ID mapped is not an anonymous mount;
> that is, the mount has already been visible in a mount namespace.
I like that text *a lot* better! Thanks very much for suggesting
wordings. It makes my life much easier.
I've made the text:
EINVAL The mount that is to be ID mapped is not a detached
mount; that is, the mount has not previously been
visible in a mount namespace.
> [...]
> The mount must be an anonymous mount; that is, it must have been
> created by calling open_tree(2) with the OPEN_TREE_CLONE flag and it
> must not already have been visible in a mount namespace, i.e. it must
> not have been attached to the filesystem hierarchy with syscalls such
> as move_mount().
And that too! I've made the text:
• The mount must be a detached mount; that is, it must have
been created by calling open_tree(2) with the
OPEN_TREE_CLONE flag and it must not already have been
visible in a mount namespace. (To put things another way:
the mount must not have been attached to the filesystem
hierarchy with a system call such as move_mount(2).)
Reported-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Michael Kerrisk [Thu, 12 Aug 2021 03:16:42 +0000 (05:16 +0200)]
mount_setattr.2: EXAMPLES: use -1 rather than -EBADF
From email with Christian Brauner:
> [1]: In this code "source" is expected to be absolute. If it's not
> absolute we should fail. This can be achieved by passing -1/-EBADF,
> afaict.
D'oh! Okay. I hadn't considered that use case for an invalid dirfd.
(And now I've done some adjustments to openat(2), which contains a
rationale for the *at() functions.)
So, now I understand your purpose, but still the code is obscure,
since
* You use a magic value (-EBADF) rather than (say) -1.
* There's no explanation (comment about) of the fact that you want
to prevent relative pathnames.
So, I've changed the code to use -1, not -EBADF, and I've added some
comments to explain that the intent is to prevent relative pathnames.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Make the description of the EBADF error for invalid 'dirfd' more
uniform. In particular, note that the error only occurs when the
pathname is relative, and that it occurs when the 'dirfd' is
neither valid *nor* has the value AT_FDCWD.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Michael Kerrisk [Tue, 10 Aug 2021 09:11:40 +0000 (11:11 +0200)]
pthread_setname_np.3: EXAMPLES: remove a bug by simplifying the code
From an email conversation with Alexis:
Hello Alexis,
On 8/6/21 7:06 PM, Alexis Wilke wrote:
> Hi guys,
>
> The pthread_setname_np(3) manual page has an example where the second
> argument is used to get a size of the thread name.
>
> https://man7.org/linux/man-pages/man3/pthread_setname_np.3.html#EXAMPLES
>
> The current code:
>
> rc = pthread_getname_np(thread, thread_name,
> (argc > 2) ? atoi(argv[1]) : NAMELEN);
>
> The suggested code:
>
> rc = pthread_getname_np(thread, thread_name,
> (argc > 2) ? atoi(argv[2]) : NAMELEN);
I agree that there's a problem, but I think we could go even simpler:
> I'm thinking that maybe the author meant to compute the length like so:
>
> rc = pthread_getname_np(thread, thread_name,
> (argc > 2) ? strlen(argv[1]) + 1 :
> NAMELEN);
>
> But I think that the atoi() points to using argv[2] as a number
> representing the length.
>
> (Of course, it should be tested against NAMELEN as a maximum, but I
> understand that examples do not always show how to verify each possible
> error).
I imagine that the author's intention was to allow the user to do
experiments where argv[2] specified a number less than NAMELEN,
in order to see the resulting ERANGE error. But, that experiment
is of limited value, and complicates the code unnecessarily, IMO,
so that's why I made the change above.
--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
Reported-by: Alexis Wilke <alexis@m2osw.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Michael Kerrisk [Tue, 10 Aug 2021 08:41:05 +0000 (10:41 +0200)]
mount_setattr.2: Rework the discussion of MOUNT_ATTR__ATIME
Phrases such as "In the new mount API" will date fast. Remove it.
Also:
* Make it clear that MOUNT_ATTR__ATIME expresses a bit field.
* Replace 'enum' with 'enumeration'.
* Clarify what is meant by "partially" set MOUNT_ATTR__ATIME.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Michael Kerrisk [Tue, 10 Aug 2021 07:23:08 +0000 (09:23 +0200)]
mount_setattr.2: Remove description of propagation types
These types are already well described in mount_namespaces(7);
indeed, much of the text from that page seems to have just been
cut and pasted into this page! Simply referring the reader to
mount_namespaces(7) is sufficient.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Michael Kerrisk [Tue, 10 Aug 2021 07:13:56 +0000 (09:13 +0200)]
mount_setattr.2: Reword the description of the 'propagation field'
Point out that this field can have the value zero, meaning
no change. And avoid discussions of 'enum', and simply say
that otherwise the field has one of the MS_* values.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
seccomp.2: Clarify that bad system calls kill the thread
Reported-by: Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Michael Kerrisk [Tue, 10 Aug 2021 00:08:49 +0000 (02:08 +0200)]
mount_setattr.2: Move the discussion of ID-mapped mounts to NOTES
Having this discussion under DESCRIPTION clutters that section,
and has the effect of burying the discussion of propagation. Move
the discussion to NOTES, to make the page more readable.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Michael Kerrisk [Mon, 9 Aug 2021 20:56:47 +0000 (22:56 +0200)]
mount_setattr.2: Minor clean-ups in example program
- Change some instances of "-" to "\"
- Use C99 style (declare variables nearer use in code)
- Add a bit of white space
- Remove one 'const...const' added by Alex that caused
compiler warnings
- Use "reverse Christmas tree" form for declarations in main()
- Other minor changes
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
mount_setattr.2: Minor tweaks to Christian's patch
- Fix SYNOPSIS to fit in 78 columns
Also, we don't show when an include is included for a specific type,
unless that header is included _only_ for the type,
or there might be confusion (e.g., termios).
Instead, that type should be documented in system_data_types(7),
with a link page mount_attr-struct(3).
- Fix references to mount_setattr(). See man-pages(7):
Any reference to the subject of the current manual page should be writ‐
ten with the name in bold followed by a pair of parentheses in Roman
(normal) font. For example, in the fcntl(2) man page, references to
the subject of the page would be written as: fcntl(). The preferred
way to write this in the source file is:
.BR fcntl ()
- Fix line breaks according to semantic newline rules (and add some commas)
- Fix wrong usage of .IR when .RI should have been used
- Fix formatting of variable part in FOO<number>:
- Make italic the variable part (as groff_man(7) recommends)
- Remove <>
- Use syntax recommended by G. Branden Robinson (groff)
- Fix unnecessary uses of .BR or .IR when .B or .I would suffice
- Fix formatting of punctuation
In some cases, it was in italics or bold, and it should always be in roman.
- Use uppercase to begin text, even in bullet points, since those were
multi-sentence.
- Simplify usage of .RS/.RE in combination with .IP
- s/fat/FAT/ as fs(7) does
- Slightly reword some sentences for consistency
- Use Linux-specific for consistency with other pages (in VERSIONS)
- EXAMPLES: Place the return type in a line of its own (as in other pages)
- Fix alignment of code
- Replace unnecessary use of the GNU extension ({}) by do {} while (0)
In that case, there was no return value (moreover, it's a noreturn).
- Break complex declaration lines into a line for each variable
The variables were being initialized, some to non-zero values,
so for clarity, a line for each one seems more appropriate.
- Add const to pointers when possible
- s/\\/\e/
- Remove unmatched groff commands
Cc: Christian Brauner <brauner@kernel.org>
Signed-off-by: Alejandro Colomar <alx.manpages@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Kurt Kanzenbach [Sun, 8 Aug 2021 08:41:16 +0000 (10:41 +0200)]
futex.2: Document FUTEX_LOCK_PI2
FUTEX_LOCK_PI2 is a new futex operation which was recently introduced into the
Linux kernel. It works exactly like FUTEX_LOCK_PI. However, it has support for
selectable clocks for timeouts. By default CLOCK_MONOTONIC is used. If
FUTEX_CLOCK_REALTIME is specified then the timeout is measured against
CLOCK_REALTIME.
This new operation addresses an inconsistency in the futex interface:
FUTEX_LOCK_PI only works with timeouts based on CLOCK_REALTIME in contrast to
all the other PI operations.
Document the FUTEX_LOCK_PI2 command.
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Alejandro Colomar <alx.manpages@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
- Move example program to a new EXAMPLES section
- Invert logic in the handler to have the failure in the
conditional path, and the success out of any conditionals.
- Use NULL, EXIT_SUCCESS, and EXIT_FAILURE instead of magic numbers
- Separate declarations from code
- Put function return type on its own line
- Put function opening brace on its line
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Alejandro Colomar <alx.manpages@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>