Finn O'Leary [Wed, 31 Jul 2019 19:53:19 +0000 (19:53 +0000)]
setxattr.2: Add ERANGE to 'ERRORS' section
Hi,
Both the Ext2 filesystem handler and the Ext4 filesystem handler will
return the ERANGE error code. Ext2 will return it if the name or value is
too long to be able to be stored, Ext4 will return it if the name is too
long. For reference, the relevant files/lines (with excerpts) are:
fs/ext2/xattr.c: lines 394 to 396 in ext2_xattr_set
> 394 name_len = strlen(name);
> 395 if (name_len > 255 || value_len > sb->s_blocksize)
> 396 return -ERANGE;
fs/ext4/xattr.c: lines 2317 to 2318 in ext4_xattr_set_handle
> 2317 if (strlen(name) > 255)
> 2318 return -ERANGE;
Other filesystems also return this code:
xfs/libxfs/xfs_attr.h: lines 53 to 55
> * The maximum size (into the kernel or returned from the kernel) of an
> * attribute value or the buffer used for an attr_list() call. Larger
> * sizes will result in an ERANGE return code.
It's possible that more filesystem handlers do this, a cursory grep shows
that most of the filesystem xattr handler files mention ERANGE in some
form. A suggested patch is below (I'm not 100% sure on the wording through).
Thanks
--
- Finn
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Yang Xu [Wed, 24 Jul 2019 06:52:23 +0000 (14:52 +0800)]
prctl.2: Correct some details for PR_SET_TIMERSLACK
In kernel/sys.c, arg2 is an unsigned long value and it will never
less than 0. Also, since kernel commit id da8b44d5a9f8 (Linux
4.6), timer_slack_ns and default timer_slack_ns have been
converted into u64, the return value of PR_GET_TIMERSLACK has been
limited under ULONG_MAX.
The timer slack value also can be inherited by a child created via
fork(2).
Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Yang Xu <xuyang2018.jy@cn.fujitsu.com> Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
But if tmo_p->tv_sec is negative, the ppoll() call is not
equivalent to the corresponding poll() call. The kernel rejects
negative values of tv_sec with an EINVAL error; it does not
interpret the value as meaning an infinite timeout.
(Yes, the kernel interprets tmo_p == NULL as an infinite timeout,
but the man page is still wrong for the case tmo_p->tv_sec < 0.)
Suggested fix: Following the end of the second extract above, add:
except that negative time values in tmo_p are not
interpreted as an infinite timeout.
Also, in the ERRORS section, change the text for EINVAL to:
EINVAL The nfds value exceeds the RLIMIT_NOFILE value or
*tmo_p contains an invalid (negative) time value.
Reported-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Michael Kerrisk [Fri, 26 Jul 2019 14:54:16 +0000 (16:54 +0200)]
pivot_root.2: 'new_root' must be a mount point
It appears that 'new_root' may not have needed to be a mount
point on ancient kernels, but already in Linux 2.4.5, there
was the diff shown below. Verified also by testing.
@@ -1631,8 +1605,9 @@
* - we don't move root/cwd if they are not at the root (reason: if something
* cared enough to change them, it's probably wrong to force them elsewhere)
* - it's okay to pick a root that isn't the root of a file system, e.g.
- * /nfs/my_root where /nfs is the mount point. Better avoid creating
- * unreachable mount points this way, though.
+ * /nfs/my_root where /nfs is the mount point. It must be a mountpoint,
+ * though, so you may need to say mount --bind /nfs/my_root /nfs/my_root
+ * first.
*/
Michael Kerrisk [Tue, 23 Jul 2019 21:04:27 +0000 (23:04 +0200)]
mount_namespaces.7: Clarify implications for other NS if mount point is removed in one NS
If a mount point is deleted or renamed or removed in one mount
namespace, this will cause an object that is mounted at that
location in another mount namespace to be unmounted (as verified
by experiment). This was implied by the existing text, but it is
better to make this detail explicit.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
* Establish the abbreviations DSO and PID in the lead paragraph
since they are used later.
* Parallelize descriptions of help, usage, and version options
with the "and exit" language used in getent(1), iconv(1),
locale(1), localedef(1), memusage(1), memusagestat(1),
mtrace(1), pldd(1), sprof(1), time(1), iconvconfig(8),
zdump(8), and zic(8).
Signed-off-by: G. Branden Robinson <g.branden.robinson@gmail.com> Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
glibc 2.30 isn't released yet, but a fix has been committed, and
Debian has even cherry-picked it for Debian GNU/Linux 10
("buster"). pldd works nicely now.
Signed-off-by: G. Branden Robinson <g.branden.robinson@gmail.com> Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
if (sigprocmask(SIG_UNBLOCK, &blocking, NULL) == -1)
errExit("sigprocmask");
exit(EXIT_SUCCESS);
}
$ ./siginfo_nonqueuing
Child 4042 with UID 1000 exiting with status 10
Child 4040 with UID 1000 exiting with status 20
Child 4041 with UID 1000 exiting with status 30
caught signal 17
si_pid=4042, si_uid=1000, si_status=10
Cc: Oleg Nesterov <oleg@redhat.com> Cc: Lennart Poettering <lennart@poettering.net> Acked-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Michal Sekletar <msekleta@redhat.com> Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Michael Kerrisk [Wed, 3 Jul 2019 08:06:36 +0000 (10:06 +0200)]
dlopen.3: Clarify when an executable's symbols can be used for symbol resolution
The --export-dynamic linker option is not the only way that main's
global symbols may end up in the dynamic symbol table and thus be
used to satisfy symbol reference in a shared object. A symbol
may also be placed into the dynamic symbol table if ld(1)
notices a dependency in another object during the static link.
Verified by experiment; see previous commit.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Michael Kerrisk [Wed, 3 Jul 2019 07:45:55 +0000 (09:45 +0200)]
dlopen.3: Clarify the rules for symbol resolution in a dlopen'ed object
The existing text wrongly implied that symbol look up first
occurred in the object and then in main, and did not mention
whether dependencies of main where used for symbol resolution.
cc $CFLAGS -g -fPIC -shared -o lib_x1.so lib_x1.c
cc $CFLAGS -g -fPIC -shared -o lib_y2.so lib_y2.c
cc $CFLAGS -g -fPIC -shared -o lib_y1.so lib_y1.c ./lib_y2.so
cc $CFLAGS -g -fPIC -shared -o lib_m1.so lib_m1.c
#ED="-Wl,--export-dynamic"
cc $CFLAGS $ED -Wl,--rpath,$PWD -o prog prog.c -ldl lib_m1.so
$ sh Build.sh
$ ./prog x
Link map as shown from dl_iterate_phdr() callbacks:
Name =
Name = linux-vdso.so.1
Name = /lib64/libdl.so.2
Name = /home/mtk/tlpi/code/shlibs/dlopen_sym_res_expt/lib_m1.so
Name = /lib64/libc.so.6
Name = /lib64/ld-linux-x86-64.so.2
Name = ./lib_x1.so
Name = ./lib_y1.so
Name = ./lib_y2.so
Called y1_enter
Called lib_x1.c::prog_x1
Called prog.c::prog_y1_exp
Called lib_y1.c::prog_y1_noexp
Called lib_x1.c::x1_y1
Called lib_m1.c::m1_y1
Called lib_y2.c::y2
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
dlopen.3: Amend error in description of dlclose() behavior
-If the reference count drops to zero and no other loaded libraries use
-symbols in it, then the dynamic library is unloaded.
+If the reference count drops to zero,
+then the dynamic library is unloaded.
I doubted the removed text, because it provide little clue about
the scenario. The POSIX dlclose(3) specification actually details
the scenario sufficiently:
Although a dlclose() operation is not required to remove
any functions or data objects from the address space,
neither is an implementation prohibited from doing so.
The only restriction on such a removal is that no func‐
tion nor data object shall be removed to which references
have been relocated, until or unless all such references
are removed. For instance, an executable object file that
had been loaded with a dlopen() operation specifying the
RTLD_GLOBAL flag might provide a target for dynamic relo‐
cations performed in the processing of other relocatable
objects—in such environments, an application may assume
that no relocation, once made, shall be undone or remade
unless the executable object file containing the relo‐
cated object has itself been removed.
Verified by experiment:
$ cat openlibs.c # Test program
int
main(int argc, char *argv[])
{
void *libHandle[MAX_LIBS];
int lcnt;
static void testref(void) {
/* The following reference, to a symbol in lib_x1.so shows that
RTLD_GLOBAL may pin a library when it might otherwise have been
released with dlclose() */
extern void x1_func(void);
x1_func();
}
$ cc -shared -fPIC -o lib_x1.so lib_x1.c
$ cc -shared -fPIC -o lib_y1.so lib_y1.c
$ cc -o openlibs openlibs.c -ldl
Note that x1_dstor was called only when handle 1 (lib_y1.so) was closed.
But, if we edit lib_y1 to remove the reference to x1_func(), things are
different:
Michael Kerrisk [Mon, 1 Jul 2019 07:48:11 +0000 (09:48 +0200)]
capabilities.7: CAP_FOWNER also allows modifying user xattrs on sticky directories
See fs/xattr.c::xattr_permission()"
/*
* In the user.* namespace, only regular files and directories can have
* extended attributes. For sticky directories, only the owner and
* privileged users can write attributes.
*/
if (!strncmp(name, XATTR_USER_PREFIX, XATTR_USER_PREFIX_LEN)) {
if (!S_ISREG(inode->i_mode) && !S_ISDIR(inode->i_mode))
return (mask & MAY_WRITE) ? -EPERM : -ENODATA;
if (S_ISDIR(inode->i_mode) && (inode->i_mode & S_ISVTX) &&
(mask & MAY_WRITE) && !inode_owner_or_capable(inode))
return -EPERM;
}
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Michael Kerrisk [Tue, 25 Jun 2019 04:40:21 +0000 (06:40 +0200)]
ipc.5: Remove old link to svipc.7/sysvipc.7 page
Long ago, the sysvipc.7 page was called ipc.5, which was both a
misnaming (too general a name) and an inconsistent section. The
page was renamed (to svipc.7) many years ago, and the link with
the old name has probably ceased to be needed. So, remove it.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
if (*xp != INIT_VALUE)
printf("It looks like the variable passed to the exit handler "
"has gone out of scope\n");
/* Produce a core dump, which we can examine with GDB to look at the
frames on the stack, if desired */
printf("===\n");
printf("About to abort\n");
abort();
}
static void
recur(int lev, int *xp)
{
int rloc;
int big[65536-12]; /* 12*4 == 48 other bytes allocated on
this stack frame */
tos = (char *) &rloc;
big[0] = lev;
big[0]++;
printf("&rloc = %p (%d) (%d)\n", (void *) &rloc, lev, *xp);
if (lev > 1)
recur(lev - 1, xp);
else {
printf("exit() from recur()\n");
exit(EXIT_SUCCESS);
}
}
int
main(int argc, char *argv[])
{
int lev;
int *xp;
int xx;
if (argc < 2) {
fprintf(stderr, "Usage: %s {s|h} [how]\n", argv[0]);
fprintf(stderr, "\ts => exitFunc() arg is in main() stack\n");
fprintf(stderr, "\th => exitFunc() arg is allocated on heapn");
fprintf(stderr, "\tIf 'how' is not present, then return from main()\n");
fprintf(stderr, "\tIf 'how' is 0, then exit() from main()\n");
fprintf(stderr, "\tIf 'how' is > 0, then make 'how' recursive "
"function calls, and then exit()\n");
exit(EXIT_FAILURE);
}
tos = (char *) &xp;
if (argv[1][0] == 'h') {
xp = malloc(sizeof(int));
if (xp == NULL) {
perror("malloc");
exit(EXIT_FAILURE);
}
printf("Argument for exitFunc() is allocated on heap\n");
} else {
xp = &xx;
printf("Argument for exitFunc() is allocated on stack in main()\n");
}
Mark Wielaard [Wed, 29 May 2019 23:08:39 +0000 (01:08 +0200)]
mprotect.2: pkey_mprotect() acts like mprotect() if pkey is set to -1, not 0
The mprotect.2 NOTES say:
On systems that do not support protection keys in
hardware, pkey_mprotect() may still be used, but pkey must
be set to 0. When called this way, the operation of
pkey_mprotect() is equivalent to mprotect().
But this is not what the glibc manual says:
It is also possible to call pkey_mprotect with a key value
of -1, in which case it will behave in the same way as
mprotect.
Which is correct. Both the glibc implementation and the
kernel check whether pkey is -1. 0 is not a valid pkey when
memory protection keys are not supported in hardware.
Signed-off-by: Mark Wielaard <mark@klomp.org> Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
I've found the exec man page quite difficult to read when trying
to find the behavior for a specific function. Since the names of
the functions are inline and the order of the descriptions isn't
clear, it's hard to find which paragraphs apply to each function.
I thought it would be much easier to read if the grouping based on
letters is stated.
fanotify.7: Reword FAN_REPORT_FID data structure inclusion semantics
Improved the readability of a sentence that describes the use of
FAN_REPORT_FID and how this particular flag influences what data
structures a listening application could expect to receive when
describing an event.
Signed-off-by: Matthew Bobrowski <mbobrowski@mbobrowski.org> Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>