==================== Changes in man-pages-3.80 ====================

Released: 2015-02-21, Munich


Contributors
------------

The following people contributed patches/fixes or (noted in brackets
in the changelog below) reports, notes, and ideas that have been
incorporated in changes in this release:

Akihiro Motoki
Andy Lutomirski
Bill McConnaughey
Chris Mayo
Christophe Blaess
David Wilson
Denys Vlasenko
Doug Goldstein
Eric Wong
Heinrich Schuchardt
J William Piggott
James Hunt
Jan Chaloupka
Jan Stancek
Jeff Layton
Jens Thoms Toerring
Kevin Easton
Luke Faraone
Mark Seaborn
Mathieu Malaterre
Michael Kerrisk
Michal Hocko
Minchan Kim
Patrick Horgan
Peng Haitao
Ralf Baechle
Rob Somers
Simon Paillard
Stephen Smalley
Tao Ma
Tobias Herzke
Vince Weaver
Vlastimil Babka
Zbigniew Brzeziński

Apologies if I missed anyone!


New and rewritten pages
-----------------------

ioctl_fat.2
    Heinrich Schuchardt  [Michael Kerrisk]
        New man page for the ioctl(2) FAT API
            The ioctl(2) system call may be used to retrieve information
            about the FAT file system and to set file attributes.

madvise.2
    Michael Kerrisk
        Summary: this page has been significantly reorganised and rewritten
    Michael Kerrisk
        Recast discussion of 'advice' into two groups of values
            madvise() is one of those system calls that has congealed over
            time, as has the man page. It's helpful to split the discussion
            of the 'advice' flags into two groups:

            * Those flags that are (1) widespread across implementations;
              (2) have counterparts in posix_madvise(3); and (3) were
              present in the initial Linux madvise implementation.
            * The rest, which are values that (1) may not have counterparts
              in other implementations; (2) have no counterparts in
              posix_madvise(3); and (3) were added to Linux in more recent
              times.
    Michael Kerrisk
        Explicitly list the five flags provided by posix_madvise()
            Over time, bit rot has afflicted this page. Since the original
            text was written many new Linux-specific flags have been added.
            So, now it's better to explicitly list the flags that
            correspond to the POSIX analog of madvise().
    Jan Chaloupka  [Hugh Dickins, Michael Kerrisk]
        Starting with Linux 3.5, more file systems support MADV_REMOVE
    Michael Kerrisk
        Split EINVAL error into separate cases
    Michael Kerrisk
        Explain MADV_REMOVE in terms of file hole punching
    Michael Kerrisk
        MADV_REMOVE can be applied only to shared writable mappings
    Michael Kerrisk
        MADV_REMOVE cannot be applied to locked or Huge TLB pages
    Michael Kerrisk  [Vlastimil Babka]
        Clarify that MADV_DONTNEED has effect on pages only if it succeeds
    Michael Kerrisk  [Vlastimil Babka]
        Clarifications for MADV_DONTNEED
    Michael Kerrisk  [Michal Hocko]
        Improve MADV_DONTNEED description
    Michael Kerrisk
        MADV_DONTNEED cannot be applied to Huge TLB or locked pages
    Michael Kerrisk  [Vlastimil Babka]
        Remove mention of "shared pages" as a cause of EINVAL for
        MADV_DONTNEED
    Michael Kerrisk  [Vlastimil Babka]
        Note Huge TLB as a cause of EINVAL for MADV_DONTNEED
    Michael Kerrisk  [Minchan Kim]
        Add mention of VM_PFNMAP in discussion of MADV_DONTNEED and
        MADV_REMOVE
    Michael Kerrisk
        Drop sentence saying that kernel may ignore 'advice'
            The sentence creates misunderstandings, and does not really
            add information.
    Michael Kerrisk
        Note that some Linux-specific 'advice' change memory-access
        semantics
    Michael Kerrisk
        NOTES: Remove crufty text about "command" versus "advice"
            The point made in this fairly ancient text is more or less
            evident from the DESCRIPTION, and it's not clear what
            "standard" is being referred to.
    Michael Kerrisk
        Mention POSIX.1-2008 addition of POSIX_MADV_NOREUSE
    Michael Kerrisk
        Remove "POSIX.1b" from CONFORMING TO
    Michael Kerrisk
        Move mention of posix_fadvise() from CONFORMING TO to SEE ALSO
    Michael Kerrisk
        ERRORS: add EPERM error case for MADV_HWPOISON
    Michael Kerrisk
        Note that madvise() is nonstandard, but widespread


Newly documented interfaces in existing pages
---------------------------------------------

proc.5
    Michael Kerrisk
        (Briefly) document /proc/PID/attr/socketcreate
    Michael Kerrisk
        (Briefly) document /proc/PID/attr/keycreate
    Michael Kerrisk  [Stephen Smalley]
        Document /proc/PID/attr/{current,exec,fscreate,prev}
            Heavily based on Stephen Smalley's text in
            https://lwn.net/Articles/28222/

                From: Stephen Smalley
                To: LKML and others
                Subject: [RFC][PATCH] Process Attribute API for Security
                         Modules
                Date: 08 Apr 2003 16:17:52 -0400
    Michael Kerrisk
        Document /proc/sys/kernel/auto_msgmni

socket.7
    David Wilson
        Document SO_REUSEPORT socket option


New and changed links
---------------------

get_thread_area.2
    Andy Lutomirski
        Make get_thread_area.2 a link to rewritten set_thread_area.2 page


Changes to individual pages
---------------------------

time.1
    Michael Kerrisk
        Make option argument formatting consistent with other pages

access.2
    Denys Vlasenko
        Explain how access() check treats capabilities
            We have users who are terribly confused why their binaries
            with CAP_DAC_OVERRIDE capability see EACCES from access()
            calls, but are able to read the file.

            The reason is access() isn't the "can I read/write/execute
            this file?" question, it is the "(assuming that I'm a setuid
            binary,) can *the user who invoked me* read/write/execute
            this file?" question.

            That's why it uses real UIDs as documented, and why it ignores
            capabilities when capability-endorsed binaries are run by
            non-root (this patch adds this information).

            To make users more likely to notice this less-known detail,
            the patch expands the explanation with rationale for this
            logic into a separate paragraph.
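
The distinction drawn in the access.2 entry can be illustrated with a
minimal Python sketch. It assumes Linux and CPython's os.access(), which
by default checks against the real UID/GID in the same way as access(2);
the effective_ids=True keyword is used only where os.supports_effective_ids
reports it. The helper name compare_access_checks() and the scratch file
are hypothetical, chosen only for illustration.

```python
import os
import tempfile


def compare_access_checks(path):
    """Return (real_ok, effective_ok, open_ok) for read access to path."""
    # Default check: uses the real UID/GID, like access(2).
    real_ok = os.access(path, os.R_OK)

    # Optional check against the effective IDs, where the platform allows it.
    if os.access in os.supports_effective_ids:
        effective_ok = os.access(path, os.R_OK, effective_ids=True)
    else:
        effective_ok = None

    # Attempting the operation is the only check that reflects what the
    # calling process itself is actually able to do.
    try:
        with open(path, "rb"):
            open_ok = True
    except OSError:
        open_ok = False

    return real_ok, effective_ok, open_ok


# Hypothetical scratch file, created only for this demonstration.
with tempfile.TemporaryDirectory() as tmpdir:
    sample = os.path.join(tmpdir, "sample.txt")
    with open(sample, "w") as f:
        f.write("hello\n")
    result = compare_access_checks(sample)
    print(result)
```

In an ordinary process the three answers agree. They can diverge in a
set-user-ID program or, as the entry describes, in a binary with file
capabilities that is run by a non-root user.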

arch_prctl.2
set_thread_area.2
get_thread_area.2
    Andy Lutomirski
        Improve TLS documentation
            The documentation for set_thread_area was very vague. This
            improves it, accounts for recent kernel changes, and merges
            it with get_thread_area.2.

            get_thread_area.2 now becomes a link.

            While I'm at it, clarify the related arch_prctl.2 man page.

cacheflush.2
    Ralf Baechle
        Update some portability details and bugs
    Michael Kerrisk
        Refer reader to BUGS in discussion of EINVAL error

capget.2
    Michael Kerrisk
        Document V3 capabilities constants
    Michael Kerrisk
        Rewrite discussion of kernel versions that support file
        capabilities
            File capabilities ceased to be optional in Linux 2.6.33.

clone.2
    Peng Haitao
        Fix description of CLONE_PARENT_SETTID
            CLONE_PARENT_SETTID only stores child thread ID in parent
            memory.

clone.2
execve.2
    Kevin Easton
        Document interaction of execve(2) with CLONE_FILES
            This patch documents the fact that a successful execve(2) in
            a process that is sharing a file descriptor table results in
            unsharing the table.

            I discovered this through testing and verified it by source
            inspection - there is a call to unshare_files() early in
            do_execve_common().

fcntl.2
    Michael Kerrisk  [Jeff Layton]
        Clarify cases of conflict between traditional record and OFD locks
            Verified by experiment on Linux 3.15 and 3.19rc4.

fork.2
    Michal Hocko
        EAGAIN is not reported when task allocation fails
            I am not sure why we have:

                "EAGAIN fork() cannot allocate sufficient memory to copy
                the parent's page tables and allocate a task structure
                for the child."

            The text seems to be there from the time when man-pages
            were moved to git so there is no history for it.

            And it doesn't reflect reality: the kernel reports both
            dup_task_struct and dup_mm failures as ENOMEM to the
            userspace. This seems to be the case from early 2.x times
            so let's simply remove this part.
    Heinrich Schuchardt
        Child and parent run in separate memory spaces
            fork.2 should clearly point out that child and parent
            process run in separate memory spaces.
    Michael Kerrisk
        NOTES: add "C library/kernel ABI differences" subheading

getpid.2
    Michael Kerrisk
        NOTES: add "C library/kernel ABI differences" subheading

getxattr.2
    Michael Kerrisk
        Various rewordings plus one or two details clarified
    Michael Kerrisk
        Add pointer to example in listxattr(2)

killpg.2
    Michael Kerrisk
        NOTES: add "C library/kernel ABI differences" subheading

listxattr.2
    Heinrich Schuchardt
        Provide example program
    Michael Kerrisk
        Reword discussion of size==0 case
    Michael Kerrisk
        Add note on handling increases in sizes of keys or values
    Michael Kerrisk
        Remove mention of which filesystems implement ACLs
            Such a list will only become outdated (as it already was).

migrate_pages.2
    Jan Stancek
        Document EFAULT and EINVAL errors
            I encountered these errors while writing a testcase for the
            migrate_pages syscall for LTP (Linux test project).

            I checked stable kernel tree 3.5 to see which paths return
            these. Both can be returned from get_nodes(), which is
            called from:

            SYSCALL_DEFINE4(migrate_pages, pid_t, pid,
                unsigned long, maxnode,
                const unsigned long __user *, old_nodes,
                const unsigned long __user *, new_nodes)

            The testcase does the following:

            EFAULT
            a) old_nodes/new_nodes is area mmapped with PROT_NONE
            b) old_nodes/new_nodes is area not mmapped in process address
               space, -1 or area that has been just munmapped

            EINVAL
            a) maxnode overflows kernel limit
            b) new_nodes contain node, which has no memory or does not
               exist or is not returned for
               get_mempolicy(MPOL_F_MEMS_ALLOWED).

modify_ldt.2
    Andy Lutomirski
        Overhaul the documentation
            This clarifies the behavior and documents all four functions.
    Andy Lutomirski
        Clarify the lm bit's behavior
            The lm bit should never have existed in the first place. Sigh.

mprotect.2
    Mark Seaborn
        Mention effect of READ_IMPLIES_EXEC personality flag
            I puzzled over mprotect()'s effect on /proc/*/maps for a while
            yesterday -- it was setting "x" without PROT_EXEC being
            specified. Here is a patch to add some explanation.
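
The mprotect.2 entry concerns the "x" permission shown in /proc/*/maps.
A minimal Python sketch for spotting executable mappings follows. It
assumes the usual /proc/PID/maps field order (address range, permissions,
offset, device, inode, optional pathname); the helper names and the
sample text are hypothetical and are not taken from any man page.

```python
import os


def parse_maps_line(line):
    """Split one /proc/PID/maps line into (start, end, perms, pathname)."""
    # At most six fields; the last one (pathname) may contain spaces.
    fields = line.split(None, 5)
    start_hex, end_hex = fields[0].split("-")
    perms = fields[1]
    pathname = fields[5].strip() if len(fields) > 5 else ""
    return int(start_hex, 16), int(end_hex, 16), perms, pathname


def executable_mappings(maps_text):
    """Return the parsed entries whose permission field includes 'x'."""
    entries = [
        parse_maps_line(line)
        for line in maps_text.splitlines()
        if line.strip()
    ]
    return [entry for entry in entries if "x" in entry[2]]


# Made-up sample text in the /proc/PID/maps format, for illustration only.
SAMPLE = (
    "00400000-0040b000 r-xp 00000000 08:01 1234 /bin/example\n"
    "0060a000-0060b000 rw-p 0000a000 08:01 1234 /bin/example\n"
    "7ffd1000-7ffd3000 rwxp 00000000 00:00 0 [stack]\n"
)

sample_exec = executable_mappings(SAMPLE)
print([(hex(s), hex(e), p, name) for s, e, p, name in sample_exec])

# On Linux, the same helper can inspect the current process.
if os.path.exists("/proc/self/maps"):
    with open("/proc/self/maps") as f:
        print(len(executable_mappings(f.read())), "executable mappings")
```

Comparing such output before and after an mprotect() call is one way to
observe the behavior that the entry describes.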

msgget.2
    Michael Kerrisk
        Add details of MSGMNI default value

msgop.2
    Michael Kerrisk
        Clarify wording of MSGMAX and MSGMNB limits

perf_event_open.2
    Vince Weaver
        Clarify PERF_EVENT_IOC_REFRESH behavior
            Currently the PERF_EVENT_IOC_REFRESH ioctl, when applied to a
            group leader, will refresh all children. Also if a refresh
            value of 0 is chosen then the refresh becomes infinite (never
            runs out). Back in 2011 PAPI was relying on these behaviors
            but I was told that both were unsupported and subject to
            being removed at any time.
            (See https://lkml.org/lkml/2011/5/24/337 )
            However the behavior has not been changed.

            This patch updates the manpage to still list the behavior as
            unsupported, but removes the inaccurate description of it
            only being a problem with 2.6 kernels.

prctl.2
    Michael Kerrisk  [Bill McConnaughey]
        Mention file capabilities in discussion of PR_SET_DUMPABLE
    Michael Kerrisk
        Greatly expand discussion of "dumpable" flag
            In particular, detail the interactions with
            /proc/sys/fs/suid_dumpable.
    Michael Kerrisk
        Reorder paragraphs describing PR_SET_DUMPABLE
    Michael Kerrisk
        Mention SUID_DUMP_DISABLE and SUID_DUMP_USER under PR_SET_DUMPABLE
    Michael Kerrisk
        Executing a file with capabilities also resets the parent death
        signal

ptrace.2
    James Hunt
        Explain behaviour should ptrace tracer call execve(2)
            This behaviour was verified by reading the kernel source and
            confirming the behaviour using a test program.
    Denys Vlasenko
        Add information on PTRACE_SEIZE versus PTRACE_ATTACH differences
            Extend description of PTRACE_SEIZE with the short summary of
            its differences from PTRACE_ATTACH.

            The following paragraph:

                PTRACE_EVENT_STOP
                    Stop induced by PTRACE_INTERRUPT command, or
                    group-stop, or initial ptrace-stop when a new child
                    is attached (only if attached using PTRACE_SEIZE), or
                    PTRACE_EVENT_STOP if PTRACE_SEIZE was used.

            has an editing error (the part after last comma makes no
            sense). Removing it.

            Mention that legacy post-execve SIGTRAP is disabled by
            PTRACE_SEIZE.

sched_setattr.2
    Michael Kerrisk  [Christophe Blaess]
        SYNOPSIS: remove 'const' from 'attr' sched_getattr() argument

semget.2
    Michael Kerrisk
        Note default value for SEMMNI and SEMMSL

semop.2
    Michael Kerrisk
        Note defaults for SEMOPM and warn against increasing > 1000

sendfile.2
    Eric Wong
        Caution against modifying sent pages

setxattr.2
    Michael Kerrisk
        ERRORS: add ENOTSUP for invalid namespace prefix
    Michael Kerrisk
        Remove redundant text under ENOTSUP error
    Michael Kerrisk
        Note that zero-length attribute values are permitted
    Michael Kerrisk
        Rework text describing 'flags' argument

stat.2
    Michael Kerrisk
        NOTES: add "C library/kernel ABI differences" subheading

statfs.2
    Michael Kerrisk  [Jan Chaloupka]
        Document the 'f_flags' field added in Linux 2.6.36
    Michael Kerrisk
        Clarify that 'statfs' structure has some padding bytes
            The number of padding bytes has changed over time, as some
            bytes are used, so describe this aspect of the structure
            less explicitly.
    Tao Ma
        Add OCFS2_SUPER_MAGIC
    Michael Kerrisk
        Use __fsword_t in statfs structure definition
            This more closely matches modern glibc reality.
    Michael Kerrisk
        Add a note on the __fsword_t type
    Michael Kerrisk
        Document 'f_spare' more vaguely

wait.2
    Michael Kerrisk
        Note that waitpid() is a wrapper for wait4()
    Michael Kerrisk
        Note that wait() is a library function implemented via wait4()

wait4.2
    Michael Kerrisk
        NOTES: add "C library/kernel ABI differences" subheading

encrypt.3
    Rob Somers
        Improve code example
            I (and some others) found that the original example code
            did not seem to work as advertised. The new code (used by
            permission of the original author, Jens Thoms Toerring)
            was found on comp.os.linux.development.

mktemp.3
    Luke Faraone
        DESCRIPTION reference to BUGS corrected
            mktemp(3)'s DESCRIPTION referenced NOTES, but no such
            section exists. Corrected to refer to BUGS.

pthread_attr_setschedparam.3
    Tobias Herzke
        Describe EINVAL in ERRORS

resolver.3
host.conf.5
    Simon Paillard
        host.conf 'order' option deprecated, replaced by nsswitch.conf(5)
            http://www.sourceware.org/bugzilla/show_bug.cgi?id=2389
            http://repo.or.cz/w/glibc.git/commit/b9c65d0902e5890c4f025b574725154032f8120a

            Reported at http://bugs.debian.org/270368,
            http://bugs.debian.org/396633, and
            http://bugs.debian.org/344233.

statvfs.3
    Michael Kerrisk
        Document missing 'f_flag' bit values
            And reorganize information relating to which flags are
            in POSIX.1.
    Michael Kerrisk  [Jan Chaloupka]
        statvfs() now populates 'f_flag' from statfs()'s f_flag field
            These changes came with glibc 2.13, and the kernel's addition
            of a 'f_flags' field in Linux 2.6.36.

syslog.3
    Michael Kerrisk  [Doug Goldstein]
        Remove unneeded vsyslog() does not need this.

tzset.3
    J William Piggott
        Add offset format
            tzset.3 does not illustrate the POSIX offset format.
            Specifically, there is no indication in the manual
            what the optional components of it are.

random.4
    Michael Kerrisk
        Note maximum number of bytes returned by read(2) on /dev/random
    Michael Kerrisk  [Mathieu Malaterre]
        Since Linux 3.16, reads from /dev/urandom return at most 32 MB
            See https://bugs.debian.org/775328 and
            https://bugzilla.kernel.org/show_bug.cgi?id=80981#c9

core.5
    Michael Kerrisk  [Bill McConnaughey]
        Executing a file that has capabilities also prevents core dumps
    Michael Kerrisk
        Document "%i" and "%I" core_pattern specifiers

intro.5
    Michael Kerrisk
        Remove words "and protocols"
            There are no protocol descriptions in Section 5. Protocols
            are in Section 7.

proc.5
    Michael Kerrisk
        Add reference to prctl(2) in discussion of
        /proc/sys/fs/suid_dumpable
            And note that /proc/sys/fs/suid_dumpable defines the
            value assigned to the process "dumpable" flag in certain
            circumstances.
    Michael Kerrisk
        Note that CAP_SYS_ADMIN is required to list /proc/PID/map_files
            This might however change in the future; see the Jan 2015
            LKML thread:

                Re: [RFC][PATCH v2] procfs: Always expose
                /proc//map_files/ and make it readable

resolv.conf.5
    Michael Kerrisk
        SEE ALSO: add nsswitch.conf(5)

capabilities.7
    Michael Kerrisk
        Mention SECBIT_KEEP_CAPS as an alternative to prctl()
        PR_SET_KEEPCAPS
    Chris Mayo
        NOTES: add last kernel versions for obsolete options
            The CONFIG_SECURITY_CAPABILITIES option was removed by
            commit 5915eb53861c5776cfec33ca4fcc1fd20d66dd27

            The CONFIG_SECURITY_FILE_CAPABILITIES option was removed in
            Linux 2.6.33 as already mentioned in DESCRIPTION.

pthreads.7
    Michael Kerrisk
        SEE ALSO: add fork(2)

socket.7
    Michael Kerrisk
        Add some details for SO_REUSEPORT

unix.7
    Jan Chaloupka
        Mention SOCK_STREAM socket for ioctl_type of ioctl()
            from https://bugzilla.redhat.com/show_bug.cgi?id=1110401.

            unix.7 is not clear about socket type of ioctl_type argument
            of ioctl() function. The description of SIOCINQ is applicable
            only for SOCK_STREAM socket. For SOCK_DGRAM, udp(7) man page
            gives correct description of SIOCINQ

ldconfig.8
    Michael Kerrisk
        Place options in alphabetical order
    Michael Kerrisk
        Note glibc version number for '-l' option
    Michael Kerrisk
        Document -c/--format option
    Michael Kerrisk
        Add long form of some options
    Michael Kerrisk  [Patrick Horgan]
        ld.so.conf uses only newlines as delimiters
            mtk: confirmed by reading source of parse_conf() in
            elf/ldconfig.c.
    Michael Kerrisk
        Document -V/--version option
    Michael Kerrisk
        Document -i/--ignore-aux-cache option

ld.so.8
    Michael Kerrisk
        Relocate "Hardware capabilities" to be a subsection under notes
            This is more consistent with standard man-pages headings
            and layout.
    Michael Kerrisk
        (Briefly) document LD_TRACE_PRELINKING
    Michael Kerrisk
        Remove duplicate description of LD_BIND_NOT