Chris MacGregor [Thu, 27 Feb 2014 18:40:59 +0000 (10:40 -0800)]
hwclock: fix possible hang and other set_hardware_clock_exact() issues
In sys-utils/hwclock.c, set_hardware_clock_exact() has some problems when the
process gets pre-empted (for more than 100ms) before reaching the time for
which it waits:
1. The "continue" statement causes execution to skip the final tdiff
assignment at the end of the do...while loop, leading to the while condition
using the wrong value of tdiff, and thus always exiting the loop once
newhwtime != sethwtime (e.g., after 1 second). This masks bug # 2, below.
2. The previously-existing bug is that because it starts over waiting for the
desired time whenever two successive calls to gettimeofday() return values >
100ms apart, the loop will never terminate unless the process holds the CPU
(without losing it for more than 100ms) for at least 500ms. This can happen
on a heavily loaded machine or on a virtual machine (or on a heavily loaded
virtual machine). This has been observed to occur, preventing a machine from
completing the shutdown or reboot process due to a "hwclock --systohc" call in
a shutdown script.
The new implementation presented in this patch takes a somewhat different
approach, intended to accomplish the same goals:
It computes the desired target system time (at which the requested hardware
clock time will be applied to the hardware clock), and waits for that time to
arrive. If it misses the time (such as due to being pre-empted for too long),
it recalculates the target time, and increases the tolerance (how late it can
be relative to the target time, and still be "close enough"). Thus, if all is
well, the time will be set *very* precisely. On a machine where the hwclock
process is repeatedly pre-empted, it will set the time as precisely as is
possible under the conditions present on that particular machine. In any
case, it will always terminate eventually (and pretty quickly); it will never
hang forever.
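The wait-and-relax strategy described above can be sketched roughly as follows. This is a minimal illustration, not the code from sys-utils/hwclock.c: the names wait_target, wait_target_check(), now_secs() and wait_target_spin() are hypothetical, and the assumption that candidate targets are spaced one fixed period apart is a simplification.

```c
#include <stddef.h>
#include <sys/time.h>

/* All names here are hypothetical; this is not the hwclock.c code. */
struct wait_target {
	double target;     /* system time (seconds) we want to hit */
	double tolerance;  /* how late we may be and still accept */
	double incr;       /* added to the tolerance after every miss */
	double period;     /* spacing between candidate targets, must be > 0 */
};

enum wait_state { WAIT_EARLY, WAIT_HIT, WAIT_MISSED };

/* One polling step: classify "now" against the current target. */
enum wait_state wait_target_check(struct wait_target *t, double now)
{
	double late = now - t->target;

	if (late < 0)
		return WAIT_EARLY;      /* not there yet, keep polling */
	if (late <= t->tolerance)
		return WAIT_HIT;        /* close enough */

	/* Missed (e.g. pre-empted too long): relax the tolerance and
	 * move to the next candidate target strictly after "now". */
	t->tolerance += t->incr;
	t->target += t->period * ((long)(late / t->period) + 1);
	return WAIT_MISSED;
}

/* Current system time in seconds, as a double. */
double now_secs(void)
{
	struct timeval tv;

	gettimeofday(&tv, NULL);
	return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/* Busy-wait until some target is hit; returns the target that was hit.
 * Because the tolerance grows on every miss, the loop cannot spin
 * forever on a machine that keeps pre-empting the process. */
double wait_target_spin(struct wait_target *t)
{
	while (wait_target_check(t, now_secs()) != WAIT_HIT)
		;
	return t->target;
}
```

Keeping the decision in a step function that takes the current time as an argument makes the behaviour easy to check with made-up timestamps, without actually waiting.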
[kzak@redhat.com: - tiny coding style changes]
Signed-off-by: Chris MacGregor <chrismacgregor@google.com>
Signed-off-by: Karel Zak <kzak@redhat.com>
Karel Zak [Wed, 5 Mar 2014 10:06:59 +0000 (11:06 +0100)]
chcpu: cleanup return codes
The code currently always returns EXIT_SUCCESS, which is strange. It
seems better to return 0 on success, 1 on complete failure and 64 on
partial success.
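A minimal sketch of that exit-code policy, assuming the tool counts how many of the requested CPU operations failed. The helper name exit_code_for() and the macro name EXIT_PARTIAL are hypothetical; only the values 0, 1 and 64 come from the commit message.

```c
#include <stdlib.h>

/* Hypothetical name for the "partial success" exit code. */
#define EXIT_PARTIAL 64

/* Map "how many operations were requested / how many failed"
 * to the exit code described in the commit message. */
int exit_code_for(int requested, int failed)
{
	if (failed == 0)
		return EXIT_SUCCESS;    /* everything worked */
	if (failed >= requested)
		return EXIT_FAILURE;    /* nothing worked */
	return EXIT_PARTIAL;            /* some worked, some did not */
}
```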
docs: fix two command representations in the man page of more
The previous-file command is not :P but :p, and the back-to-where
command is not an acute accent but an apostrophe. Also condense
some of the descriptions and remove some useless comments.
Sami Kerola [Fri, 21 Feb 2014 19:25:30 +0000 (19:25 +0000)]
logger: allow user to send structured journald messages
This feature will hopefully be used mostly to give MESSAGE_ID labels to
messages coming from scripts, making the messages easy to search. The
logger(1) manual page update should give enough information on how to use
the --journald option.
[kzak@redhat.com: - add missing #ifdefs
- use xalloc.h]
Signed-off-by: Sami Kerola <kerolasa@iki.fi>
Signed-off-by: Karel Zak <kzak@redhat.com>
Stewart Smith [Tue, 4 Mar 2014 04:27:27 +0000 (15:27 +1100)]
lscpu: don't assume filesystem supports d_type when searching for NUMA nodes
Not all file systems support the d_type field and simply checking for
d_type == DT_DIR in is_node_dirent would cause the test suite to fail
if run on (for example) XFS.
The simple fix is to check for DT_DIR or DT_UNKNOWN in is_node_dirent.
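A sketch of what such a check might look like. The function name is_node_dirent comes from the commit message; the body below, including the "node" + digit name test, is an assumption made for illustration rather than a copy of the lscpu source.

```c
#define _DEFAULT_SOURCE
#include <ctype.h>
#include <dirent.h>
#include <string.h>

/* Accept entries named "node<digit>..." that are directories, or whose
 * type the filesystem did not report (DT_UNKNOWN). A caller that needs
 * certainty for DT_UNKNOWN entries can still stat() them afterwards. */
int is_node_dirent(const struct dirent *d)
{
	return d != NULL &&
#ifdef _DIRENT_HAVE_D_TYPE
	       (d->d_type == DT_DIR || d->d_type == DT_UNKNOWN) &&
#endif
	       strncmp(d->d_name, "node", 4) == 0 &&
	       isdigit((unsigned char)d->d_name[4]);
}
```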
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Karel Zak [Mon, 3 Mar 2014 09:36:15 +0000 (10:36 +0100)]
umount: don't use mountinfo if possible
umount(8) always parses /proc/self/mountinfo to get the fstype and to
merge kernel mount options with userspace mount options from
/run/mount/utab. This behavior is overkill in many cases and it's
pretty expensive, as the kernel always has to compose the *whole* mountinfo.
This performance disadvantage is visible in extreme use-cases with a huge
number of mountpoints and frequently called umount(8).
It seems that we can bypass /proc/self/mountinfo by statfs() to get
filesystem type (statfs.f_type magic) and analyze /run/mount/utab
before we parse mountinfo.
This optimization is not used when:
* umount(8) is executed by non-root (as user= in utab is expected)
* umount --lazy / --force is used (the target is probably an unreachable
NFS, so calling statfs() is a pretty bad idea)
* the target is not a directory (e.g. umount /dev/sda1)
* there is a (deprecated) writable mtab
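The conditions above, together with the statfs() lookup, could be sketched like this. The struct umount_request and both function names are hypothetical and are not part of libmount; only statfs() and its f_type field are real interfaces.

```c
#include <sys/vfs.h>

/* Hypothetical summary of one umount invocation. */
struct umount_request {
	int is_root;        /* caller is root */
	int lazy;           /* --lazy given */
	int force;          /* --force given */
	int target_is_dir;  /* target is a directory, not e.g. /dev/sda1 */
	int mtab_writable;  /* a (deprecated) writable mtab exists */
};

/* Non-zero when the cheap statfs() path may be tried before
 * falling back to parsing /proc/self/mountinfo. */
int can_bypass_mountinfo(const struct umount_request *req)
{
	return req->is_root && !req->lazy && !req->force &&
	       req->target_is_dir && !req->mtab_writable;
}

/* Fetch the filesystem magic number (statfs.f_type) for a path.
 * Returns 0 on success, -1 on error with errno set by statfs(). */
int fs_magic(const char *path, long *magic)
{
	struct statfs st;

	if (statfs(path, &st) != 0)
		return -1;
	*magic = (long)st.f_type;
	return 0;
}
```

The magic number would then have to be translated to a filesystem name by a lookup table, which is left out here.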
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Karel Zak <kzak@redhat.com>
Sami Kerola [Fri, 21 Feb 2014 10:17:11 +0000 (10:17 +0000)]
more: fix double free crash
Commit b9579f1f44b46c9f12f1e01b01c02d82ae1cf728 moved fclose() to
checkf(), but missed removing the file closure in magic(). Ironically, the
cause of the regression is in the previous commit message.
If both -f and -t are given, flush the timing fd on each write, similar
to the behavior on the script fd. This allows playback of still-running
sessions, and reduces the risk of ending up with empty timing files when
script(1) exits abnormally.
Karel Zak [Fri, 21 Feb 2014 11:04:18 +0000 (12:04 +0100)]
mkfs: mark this wrapper as DEPRECATED
Theodore Ts'o:
I'll add that I've never been convinced that the mkfs front end is all
that useful. It's probably better for people to explicitly run
/sbin/mkfs.xfs, /sbin/mkfs.ext4, etc.., so you don't have to worry
about which options get passed down to the file system specific mkfs
program, and which ones are interpreted by /sbin/mkfs --- and I don't
believe /sbin/mkfs adds enough (err, any?) value that using
"/sbin/mkfs -t xxx" vs "/sbin/mkfs.xxx" makes any sense whatsoever.
... and I absolutely agree.
Reported-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Karel Zak <kzak@redhat.com>
Sami Kerola [Sun, 16 Feb 2014 23:54:15 +0000 (23:54 +0000)]
tests: make tests to run parallel
Admittedly this change makes the test output messier, but comparing
run times shows with clear numbers that parallel is quicker. For me
speed is an important factor. Running the test suite after every
change should preferably be quick, and if something is reported as broken
it is OK to spend time drilling down into what happened.
$ time ./tests/run.sh --parallel=5
[...]
real 1m48.037s
Same without parallelization.
$ time ./tests/run.sh
real 3m16.687s
The default is changed to be parallel, where job count is same as number
of CPUs.
[kzak@redhat.com: - propagate --parallel into function.sh
- don't use extra title for non-parallel execution
- disable by default]
Signed-off-by: Sami Kerola <kerolasa@iki.fi>
Signed-off-by: Karel Zak <kzak@redhat.com>
Thomas Bächler [Sun, 16 Feb 2014 13:58:06 +0000 (14:58 +0100)]
libmount: initialize *root to NULL in mnt_table_get_root_fs
mnt_table_get_root_fs only works when *root is set to NULL. This
is not only undocumented, but also unintuitive. Fix it by initializing
*root inside mnt_table_get_root_fs.
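The pattern behind the fix, shown on a simplified stand-in. The types and the function table_get_root() below are hypothetical and do not match the libmount API; picking the entry with the smallest parent ID is an assumption made for the example.

```c
#include <stddef.h>

/* Hypothetical, simplified mount table entry. */
struct fs_entry {
	int id;      /* mount ID */
	int parent;  /* parent mount ID */
};

/* Find the entry with the smallest parent ID and return it via *root.
 * Returns 0 on success, -1 if nothing was found or on bad arguments. */
int table_get_root(const struct fs_entry *tab, size_t n,
		   const struct fs_entry **root)
{
	size_t i;

	if (!tab || !root)
		return -1;

	/* Initialize the out-parameter here instead of relying on the
	 * caller; a stale value would otherwise corrupt the search. */
	*root = NULL;

	for (i = 0; i < n; i++) {
		if (!*root || tab[i].parent < (*root)->parent)
			*root = &tab[i];
	}
	return *root ? 0 : -1;
}
```

Without the explicit `*root = NULL`, a caller passing an uninitialized or reused pointer would make the first comparison read from whatever that pointer happened to reference.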
Masatake YAMATO [Thu, 30 Jan 2014 14:52:38 +0000 (23:52 +0900)]
ionice: add the way to specify the target processes with pgid and uid
The ioprio_get and ioprio_set system calls accept not only a process ID but
also a process group ID (pgid) and a user ID (uid) to specify the target
process(es). However, the ionice command accepts only a process ID. With
this patch a user can specify the target processes with a pgid (-P
option) or a uid (-u option).
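For reference, a rough sketch of how the three kinds of targets map onto the system calls. glibc provides no wrappers for them, so the sketch goes through syscall(2); the constants mirror the kernel's definitions, while the thin wrapper functions themselves are only an illustration and not the ionice source.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/* "which" argument: what kind of ID "who" is. */
enum {
	IOPRIO_WHO_PROCESS = 1,  /* who is a process ID */
	IOPRIO_WHO_PGRP,         /* who is a process group ID */
	IOPRIO_WHO_USER          /* who is a user ID */
};

/* The priority value packs a class and per-class data. */
#define IOPRIO_CLASS_SHIFT	13
#define IOPRIO_PRIO_MASK	((1UL << IOPRIO_CLASS_SHIFT) - 1)
#define IOPRIO_PRIO_CLASS(mask)	((mask) >> IOPRIO_CLASS_SHIFT)
#define IOPRIO_PRIO_DATA(mask)	((mask) & IOPRIO_PRIO_MASK)
#define IOPRIO_PRIO_VALUE(class, data) \
	(((class) << IOPRIO_CLASS_SHIFT) | (data))

static inline int ioprio_get(int which, int who)
{
	return (int)syscall(SYS_ioprio_get, which, who);
}

static inline int ioprio_set(int which, int who, int ioprio)
{
	return (int)syscall(SYS_ioprio_set, which, who, ioprio);
}
```

With these helpers, `ioprio_get(IOPRIO_WHO_PGRP, getpgrp())` would query the caller's process group and `ioprio_get(IOPRIO_WHO_USER, getuid())` the caller's user.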
[kzak@redhat.com: - tiny cleanup in usage()]
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: Karel Zak <kzak@redhat.com>
Rodrigo Campos [Sat, 25 Jan 2014 19:17:27 +0000 (19:17 +0000)]
fallocate: Hide #ifdef tricks to call fallocate in a function
Future patches will add more calls to fallocate(), so it will be useful to have
all these tricks inside a function.
The error message when fallocate is not supported is slightly changed: the file
name is not printed as a prefix because it is not available in the context of the
function. Also, to print only one of the two possible errors (as happens when
calling exit() directly), an else clause was added.
Phillip Susi [Fri, 7 Feb 2014 22:01:20 +0000 (17:01 -0500)]
fix mkfs --verbose and man page
mkfs did not actually accept the long form --verbose option.
Also the man page seemed to indicate that version/verbose/help
options were passed to the filesystem specific utility when this
is not the case.
Andy Lutomirski [Fri, 24 Jan 2014 20:02:59 +0000 (12:02 -0800)]
setpriv: Fix --apparmor-profile
There were two bugs. First, trying to access /proc/self/attr/exec
with O_CREAT | O_EXCL has no chance of working. Second, it turns
out that the correct command to send is "exec", not "changeprofile".
Of course, there was no way to know this until:
Sami Kerola [Tue, 21 Jan 2014 22:05:05 +0000 (22:05 +0000)]
last: make session gone determination more robust
The earlier determination, which used kill with signal zero on the pid, was
prone to false positive reports due to reuse of pid space and unrelated processes.
The new function is_phantom() tries to do a little bit better job, but fails to
be perfect. It seems linking together the utmp session start time or
terminal id with /proc/<pid>/ information is not as simple as one might
hope.
Reported-by: Karel Zak <kzak@redhat.com>
Signed-off-by: Sami Kerola <kerolasa@iki.fi>
Sami Kerola [Wed, 15 Jan 2014 20:15:51 +0000 (20:15 +0000)]
cal: limit year to 32 bit value
This is done to keep things simple, when considering tests, for both 64
and 32 bit architectures. Setting the upper limit of a year value to
2^31-1 (2147483646) should be enough for anyone.
Reported-by: Mike Frysinger <vapier@gentoo.org>
Reference: http://www.spinics.net/lists/util-linux-ng/msg08662.html
Signed-off-by: Sami Kerola <kerolasa@iki.fi>
Karel Zak [Fri, 24 Jan 2014 12:58:40 +0000 (13:58 +0100)]
losetup: wait for udev
On systems with /dev/loop-control, udevd creates the /dev/loopN nodes.
It seems better to wait a moment after an unsuccessful open(/dev/loopN)
and try to open it again.
The problem is pretty visible on systems where udevd also modifies
permissions for loopN devices; then open() fails with EACCES when
losetup is executed by a non-root user (but a user who is in the "disk" group).
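A retry loop of this kind might look like the following sketch. The function name open_retry(), the attempt count and the delay are hypothetical; retrying on ENOENT in addition to the EACCES case named above is an assumption (the node may not exist yet while udevd is still creating it).

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <time.h>

/* Try to open "path" up to "attempts" times, sleeping "delay_ms"
 * milliseconds between tries. Only errors that udevd is expected to
 * resolve shortly are retried; anything else fails immediately.
 * Returns a file descriptor, or -1 with errno from the last open(). */
int open_retry(const char *path, int flags, int attempts, long delay_ms)
{
	int fd = -1;

	while (attempts-- > 0) {
		fd = open(path, flags | O_CLOEXEC);
		if (fd >= 0 || (errno != EACCES && errno != ENOENT))
			break;

		if (attempts > 0) {
			struct timespec ts = {
				.tv_sec  = delay_ms / 1000,
				.tv_nsec = (delay_ms % 1000) * 1000000L
			};
			nanosleep(&ts, NULL);
		}
	}
	return fd;
}
```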
Addresses: https://bugzilla.redhat.com/show_bug.cgi?id=1045432
Signed-off-by: Karel Zak <kzak@redhat.com>
Karel Zak [Fri, 24 Jan 2014 12:04:14 +0000 (13:04 +0100)]
include/c.h: prefer nanosleep() over usleep()
Let's use nanosleep() even if usleep() exists. The nanosleep()
function does not interact with signals and other timers.
The patch introduces xusleep() as a replacement for libc (or our fallback)
usleep(). Yes, we don't want to use struct timespec + nanosleep()
everywhere in the code, as nanosecond resolution is useless for us.
The patch also enlarges delays in some busy wait loops. It seems
enough to try to read/write 4x per second.
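A wrapper along these lines is one way xusleep() could be written. The name comes from the commit message; the exact signature and body below are a sketch and may differ from what include/c.h actually contains.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Sleep for "usec" microseconds using nanosleep(), so callers keep the
 * convenient usleep()-style interface. Returns what nanosleep() returns:
 * 0 on success, -1 on error or interruption. */
static inline int xusleep(unsigned long usec)
{
	struct timespec waittime = {
		.tv_sec  = usec / 1000000L,
		.tv_nsec = (usec % 1000000L) * 1000
	};

	return nanosleep(&waittime, NULL);
}
```

Polling four times per second, as mentioned above, would then be `xusleep(250000)` between attempts.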
Karel Zak [Mon, 20 Jan 2014 11:07:35 +0000 (12:07 +0100)]
wipefs: call BLKRRPART when erase partition table
It's better to be smart than make things inconsistent (without
BLKRRPART kernel still uses the erased PT and udev-db still contains
obsolete information).
Karel Zak [Thu, 16 Jan 2014 15:38:30 +0000 (16:38 +0100)]
libblkid: no more probe for btrfs backup superblock
* the Linux kernel cares about the first superblock only
* backup superblocks are FS-specific stuff and there is no reason to
care about them in generic tools
* the problem with broken btrfs utils has already been fixed (it was
possible to use the utils on a filesystem with an erased primary
superblock without any warning message).
Karel Zak [Thu, 16 Jan 2014 11:22:13 +0000 (12:22 +0100)]
script: don't wait for empty descriptors if child is dead
The current code waits for the master and slave file descriptors to become
empty, but this makes sense only if there is a child process that cares about
(reads) the data in the descriptors.