Pádraig Brady [Tue, 23 Feb 2010 08:43:04 +0000 (08:43 +0000)]
sort: add a --debug option to highlight key extents
* src/sort (usage): Add description for --debug.
(write_bytes): Pass a line structure so it can subsequently
be passed to compare to highlight the keys when in debug mode.
Also transform TAB and NUL characters written to stdout so
that the highlighting in debug mode aligns correctly.
(human_numcompare): Pass an "endptr" so we can record the extent
of the number matched.
(general_numcompare): Likewise.
(find_unit_order): Likewise.
(getmonth): Likewise.
(numcompare): Likewise. Note we reuse find_unit_order() for this,
which is a good enough approximation, and means we don't need to
change the strnumcmp() interface.
(check_mixed_SI_IEC): Return whether iec_present, so that can be
used to set the "endptr" in find_unit_order. Also make the key
parameter optional, which will be the case from numcompare().
(count_tabs): A new function to determine how much to adjust
the mbswidth() values by (TABs don't have a width).
(mark_key): A new function to output the key highlighting to stdout.
(debug_key): A new function to determine the offset and width
of the key highlighting.
(key_compare): Pass the show_debug parameter so the key highlighting
is only displayed when explicitly called. For each key type, set
the length (lena) and whether leading blanks are auto skipped (skipb)
which are then used by debug_key() to highlight the portion of the
key used in the comparison.
(compare): Pass the show_debug parameter so the key highlighting
is only displayed when explicitly called. Call debug_key() to
highlight the last resort comparison.
(check): Output highlighting for disorder line to stdout.
(main): Process the --debug option and make it mutually exlusive
with the -o option as I don't see it useful there, even potentially
harmful if someone left a --debug in by mistake when updating a file.
Also restricting debug output to stdout, simplifies the logic
for dealing with temporary files.
* doc/coreutils.texi (sort invocation): Describe the --debug option,
and reference it from the --key description.
* tests/misc/sort-debug-keys: A new test for highlighting keys.
* tests/Makefile.am: Reference the new test.
* NEWS: Mention the new feature.
Jim Meyering [Sun, 9 May 2010 16:29:22 +0000 (18:29 +0200)]
tests: loosen/tighten the always_defined_macros check
* cfg.mk (.re-defmac): Generate better regexps: allow white space
before the '#', and append a word-boundary requirement.
Without the latter, #define NULL_DEV ... would evoke a false-positive.
Jim Meyering [Sun, 9 May 2010 10:08:21 +0000 (12:08 +0200)]
tests: improve the always_defined_macros check
* cfg.mk (sc_always_defined_macros): Adjust its helpers not to depend
on the existence of ./lib. Instead, extract symbols directly from
gnulib/lib/*.in.h files.
Jim Meyering [Mon, 3 May 2010 20:00:30 +0000 (22:00 +0200)]
doc: factor hard-coded project-specific bits from README-release, ...
using the new --mail-headers option to gnulib's announce-gen, and
the updated maint.mk rules to connect the pieces.
* README-release: Remove hard-coded To:, Cc: etc. parts, now
that they're emitted automatically into the announcement template.
* cfg.mk (announcement_Cc_): Override the default.
* gnulib: Update to latest, to get newer announce-gen and maint.mk.
Jim Meyering [Mon, 3 May 2010 13:35:56 +0000 (15:35 +0200)]
maint: factor trap-related code out of two syntax-check rules
* cfg.mk (gl_trap_): Define, using a loop and eval'd trap,
rather than repeated "trap" uses. Also handle "13", SIGPIPE.
(sc_always_defined_macros): Use it.
(sc_system_h_headers): Likewise.
Jim Meyering [Mon, 3 May 2010 10:05:14 +0000 (12:05 +0200)]
maint: extend the always_defined_macros syntax-check
* cfg.mk (gl_generated_headers_): Define.
(headers_with_interesting_macro_defs): Remove headers covered
by the above.
(.re-defmac): Extract symbol names from many more files.
(sc_always_defined_macros): Use VC_LIST_EXCEPT, not VC_LIST, so
that we can use the usual exception mechanism.
Test for $(gnulib_dir), not system.h.
* .x-sc_always_defined_macros: New file. Exempt src/seq.c.
* Makefile.am (syntax_check_exceptions): Add it here.
Jim Meyering [Mon, 3 May 2010 08:50:23 +0000 (10:50 +0200)]
maint: remove now-redundant definitions provided by signal.h
* src/dd.c (SA_NODEFER, SA_RESETHAND): Remove definitions,
now that gnulib guarantees they are defined in <signal.h>.
* src/ls.c (SA_RESTART): Likewise.
Jim Meyering [Mon, 3 May 2010 08:45:47 +0000 (10:45 +0200)]
maint: remove now-redundant definitions provided by sys/wait.h
* src/timeout.c (WIFSIGNALED, WTERMSIG): Remove definitions,
now that gnulib guarantees they are defined in <sys/wait.h>.
* src/operand2sig.c: Likewise.
* src/kill.c: Likewise.
* src/sort.c (general_numcompare): Use long doubles unconditionally,
and strtold when available, to convert numbers with greater range and
precision. Performance was seen to be on par with standard doubles.
* doc/coreutils.texi (sort invocation): Amend the -g description to
mention long double rather than double, and strtold rather than strtod.
* src/getlimits.c (main): Output floating point limits for use in tests.
* tests/misc/sort-float: A new test to ensure sort is using long
doubles when possible, and that locale specific floats are handled.
* tests/Makefile.am: Reference the new test.
* tests/test-lib.sh (getlimits_): Normalize indenting.
* NEWS: Mention the new behaviour.
Reported by Nelson Beebe.
Jim Meyering [Sun, 25 Apr 2010 08:35:51 +0000 (10:35 +0200)]
doc: tweak factor-describing wording
* doc/coreutils.texi (factor invocation): Don't say that "factoring
large prime numbers is hard". A pedant might ding you, since it's
trivial to factor a number that is known to be prime. Instead, say
that "factoring large numbers... is hard". Reported by Andreas Eder.
Jim Meyering [Sat, 24 Apr 2010 15:16:56 +0000 (17:16 +0200)]
build: enable gnulib modules for more replacement headers
* bootstrap.conf (gnulib_modules): Add the following:
netinet_in, sys_ioctl, sys_wait, so that we can eliminate
the #if HAVE_<header>_H tests guarding their header inclusions.
Dmitry V. Levin [Sat, 30 Jan 2010 16:02:36 +0000 (16:02 +0000)]
tests: fix exit status of signal handlers in shell scripts
The value of `$?' on entrance to signal handlers in shell scripts
cannot be relied upon, so set the exit code explicitly.
* cfg.mk (sc_always_defined_macros, sc_system_h_headers): Set
the exit code in signal handler explicitly to 128 + SIG<SIGNAL>.
* src/Makefile.am (sc_tight_scope): Likewise.
* tests/test-lib.sh: Likewise.
Eric Blake [Wed, 21 Apr 2010 14:17:59 +0000 (08:17 -0600)]
base64: always treat input in binary mode
Necessary for cygwin. Technically, this patch is not correct,
in that it clobbers O_APPEND, but it is no different than any
other use of xfreopen to force binary mode, so all such uses
should be fixed at once in a later patch.
* src/base64.c (main): Open input in binary mode.
* THANKS: Update.
Reported by Yutaka Amanai.
maint: update a couple of NEWS items for the pending release
* NEWS: Mention that cp and mv from the previous release did
not support preserving extended attributes (fixed in e489fd04).
Improve the grammar for the "cp capabilities" item.
sort: fix parsing of end field in obsolescent key formats
This regression was introduced in commit 224a69b5, 2009-02-24,
"sort: Fix two bugs with determining the end of field".
The specific regression being that we include 1 field too many when
an end field is specified using obsolescent key syntax (+POS -POS).
* src/sort.c (struct keyfield): Clarify the description of the eword
member, as suggested by Alan Curry.
(main): When processing obsolescent format key specifications,
normalize eword to a zero based count when no specific end char is given
for an end field. This matches what's done when keys are specified with -k.
* tests/misc/sort: Add a few more tests for the obsolescent key formats,
with test 07i being the particular failure addressed by this change.
* THANKS: Add Alan Curry who precisely identified the issue.
* NEWS: Mention the fix.
Reported by Santiago Rodríguez
cp: preserve "capabilities" when also preserving file ownership
* src/copy.c (copy_reg): Copy xattrs _after_ setting file ownership
so that capabilities are not cleared when setting ownership.
* tests/cp/capability: A new root test.
* tests/Makefile.am (root_tests): Reference the new test.
* NEWS: Mention the fix.
maint: fix build on platforms that replace strsignal
* src/Makefile.am (kill_LDADD): Add $(LIBTHREAD) so that
we link with the appropriate libraries to provide Thread Local Storage
on platforms that replace strsignal (like AIX for example).
Tested-by: Daniel Richard G. <danielg@teragram.com>
tests: avoid spurious failure of ls/color-norm test
* tests/ls/color-norm: Use the "time" output by `ls -l`
to check normal style. Previously we used the size from `ls -s`,
but the size of "empty" files can vary depending on whether
SELinux is enabled for example.
Jim Meyering [Wed, 14 Apr 2010 13:48:31 +0000 (15:48 +0200)]
tests: avoid spurious failure of root-only ls/capability test
* tests/ls/capability: Adjust this test not to expect the no-op escape
sequence that was removed from all other tests by 2010-01-30 commit 5d43617e, "ls --color: don't emit a final no-op escape sequence".
* src/copy.c (copy_reg): Suppress SELinux ENOTSUP warnings consistently
between the destination being present or not. Previously we did
not suppress ENOTSUP messages when the destination was present.
(copy_internal): Use the same ENOTSUP supression method as
copy_reg() even though the issue was not seen in this case.
* tests/cp/cp-a-selinux: Add a test case for the issue and
group the other test cases in the file more coherently.
* tests/cp/cp-mv-enotsup-xattr: Do the same check for xattr
warnings, even though they did not have the issue.
The 2010-03-26 commit, 4c38625e, "doc: fix info on cp --preserve..."
was not entirely correct as cp --preserve=all does produce some
xattr warnings.
* src/copy.h: Update and clarify the comments for reduce_diagnostics
and require_preserve_{xattr,context}.
* doc/coreutils.texi (cp invocation): Update the -a and
--preserve=xattr,context options to say when and which
xattr warnings are output.
(mv invocation): Mention that some warnings are output
when preserving xattrs.
doc: mention that "capabilities" are preserved by cp/mv
* doc/coreutils.texi (cp invocation): Mention that
"capabilities" are preserved when implemented using
extended attributes.
(mv invocation): Mention ACLs etc. are maintained
due to xattrs being copied.
Jim Meyering [Sat, 10 Apr 2010 12:19:11 +0000 (14:19 +0200)]
maint: new syntax-check rule: prohibit empty lines at EOF
* cfg.mk (detect_empty_lines_at_EOF_): Define.
(sc_prohibit_empty_lines_at_EOF): New rule.
* .x-sc_prohibit_empty_lines_at_EOF: New file. Exempt pr test inputs.
* Makefile.am (syntax_check_exceptions): Add it.
Pádraig Brady suggested to parse the output of tail -n1.
Jim Meyering [Fri, 9 Apr 2010 08:49:38 +0000 (10:49 +0200)]
maint: ftruncate is always available, even without gnulib
Now that even MinGW provides ftruncate, we know that all
reasonable portability targets provide this function.
Remove the workaround code. We nearly removed the gnulib
module three years ago:
http://thread.gmane.org/gmane.comp.lib.gnulib.bugs/9203
and it is now officially "obsolete".
* bootstrap.conf (gnulib_modules): Remove ftruncate.
* src/copy.c (copy_reg): Remove use of HAVE_FTRUNCATE and its
no-longer-used workaround code.
* src/truncate.c: Remove a comment about handling missing ftruncate.
Jim Meyering [Tue, 6 Apr 2010 14:01:32 +0000 (16:01 +0200)]
maint: fix a masked syntax-check violation
* m4/jm-macros.m4 (ARGMATCH_DIE): Use usage(EXIT_FAILURE), not usage(1).
* .x-sc_prohibit_magic_number_exit: Remove *.m4 exemption that was
masking the above.
Jim Meyering [Fri, 2 Apr 2010 18:30:48 +0000 (20:30 +0200)]
build: update gnulib submodule to latest
* cfg.mk: Update to use new _sc_search_regexp interface. Run this:
perl -pi -e 's/\b_prohibit_regexp\b/_sc_search_regexp/;'
-e 's/\bmsg=/halt=/; s/\bre=/prohibit=/;' cfg.mk
and then adjust backslashes so they still line up.
Pádraig Brady [Fri, 26 Mar 2010 07:42:01 +0000 (07:42 +0000)]
nice,chroot: use more standard option parsing
Related to the 2010-03-25 commit, 88d4b346,
"timeout: use more standard option parsing".
* src/nice.c (main): Don't use parse_long_options()
which is a helper for commands that don't have any
long options specific to them.
* src/chroot.c (main): Likewise.
* tests/misc/nice-fail: Remove a case that now
passes due to us accepting multiple instances of the
--help and --version options.
* tests/misc/chroot-fail: Likewise.
Kim Hansen [Thu, 25 Mar 2010 17:43:10 +0000 (17:43 +0000)]
timeout: use more standard option parsing
* src/timeout.c (main): Don't use parse_long_options()
which is a helper for commands that don't have any
long options specific to them.
* tests/misc/timeout-parameters: Remove a case that now
passes due to us accepting multiple instances of the
--help and --version options.
* THANKS: Add the author.
Pádraig Brady [Fri, 26 Mar 2010 11:19:16 +0000 (11:19 +0000)]
doc: fix info on cp --preserve=all, which does _not_ give xattr warnings
The info docs have been inaccurate since 2009-04-17, commit 941bd482,
"mv: ignore xattr-preservation failure when not supported by filesystem"
* doc/coreutils.texi (cp invocation): Say that cp --preserve=all
does _not_ output errors when failing to copy xattrs.
Jim Meyering [Sat, 20 Mar 2010 20:05:24 +0000 (21:05 +0100)]
cfg.mk: remove comments with sed rather than cpp -fpreprocessed
* cfg.mk (_sed_remove_comments): Define, starting with gettext's
moopp sed code, but factoring it to be more understandable.
(sc_space_before_open_paren): Adapt.
Prompted by Bruno Haible's suggestion to use gettext's moopp code.
Pádraig Brady [Fri, 19 Mar 2010 21:40:05 +0000 (21:40 +0000)]
maint: mbsalign: fix an edge case where we truncate too much
* gl/lib/mbsalign.c (mbsalign): Ensure the temporary destination buffer
is big enough, as it may need to be bigger than the source buffer
in the presence of single byte non printable chars.
* gl/tests/test-mbsalign.c (main): Add a test to trigger the issue.
Pádraig Brady [Mon, 15 Mar 2010 14:04:31 +0000 (14:04 +0000)]
maint: update the mbsalign module
* gl/lib/mbsalign.c (mbsalign): Support the MBA_UNIBYTE_FALLBACK
flag which reverts to unibyte mode if one can't allocate memory
or if there are invalid multibyte characters present.
Note memory is no longer dynamically allocated in unibyte mode so
one can assume that mbsalign() will not return an error if this
flag is present. Don't calculate twice, the number of spaces,
when centering. Suppress a signed/unsigned comparison warning.
(ambsalign): A new wrapper function to dynamically allocate
the minimum memory required to hold the aligned string.
* gl/lib/mbsalign.h: Add the MBA_UNIBYTE_FALLBACK flag and
also document others that may be implemented in future.
(ambsalign): A prototype for the new wrapper.
* gl/tests/test-mbsalign.c (main): New test program.
* gl/modules/mbsalign-tests: A new index to reference the tests.
* .x-sc_program_name: Exclude test-mbsalign.c from this check.
Eric Blake [Wed, 17 Mar 2010 15:31:44 +0000 (09:31 -0600)]
rm: tweak wording about loss of data warning
* src/rm.c (usage): Update wording to make two points more
apparent: undelete is not trivial, and partial recovery should be
a consideration factor in deciding whether rm is secure enough.
Initially suggested by Reuben Thomas.
Pádraig Brady [Mon, 15 Mar 2010 23:03:30 +0000 (23:03 +0000)]
timeout: add the --kill-after option
Based on a report from Kim Hansen who wanted to
send a KILL signal to the monitored command
when `timeout` itself received a termination signal.
Rather than changing such a signal into a KILL,
we provide the more general mechanism of sending
the KILL after the specified grace period.
* src/timeout.c (cleanup): If a non zero kill delay
is specified, (re)set the alarm to that delay, after
which a KILL signal will be sent to the process group.
(usage): Mention the new option. Separate the description
of DURATION since it's now specified in 2 places.
Clarify that the duration is an integer.
(parse_duration): A new function refactored from main(),
since this logic is now called for two parameters.
(main): Parse the -k option.
* doc/coreutils.texi (timeout invocation): Describe the
new --kill-after option and use @display rather than
@table to show the duration suffixes. Clarify that
a duration of 0 disables the associated timeout.
* tests/misc/timeout-parameters: Check invalid --kill-after.
* tests/misc/timeout: Check a valid --kill-after works.
* NEWS: Mention the new feature.
Joey Degges [Mon, 1 Mar 2010 10:26:22 +0000 (10:26 +0000)]
sort: inform the system about our input access pattern
Tell the system that we'll access input sequentially,
so that we more efficiently process uncached files in a few cases:
Reading from faster flash devices. E.g. 21 MB/s key:
NORMAL 31.6s (26.8 user)
SEQUENTIAL 27.7s
WILLNEED 27.7s
Processing in parallel with readahead when using a small 1M buffer:
NORMAL 24.7s (21.1 user)
SEQUENTIAL 22.7s
WILLNEED 25.6s
A small benefit when merging:
NORMAL 25.0s (16.9 user)
SEQUENTIAL 24.6s (16.6 user)
WILLNEED 38.4s (13.1 user)
Note WILLNEED is presented above for comparison to show it
has some unwanted characteristics due to its synchronous
prepopulation of the cache. It has a good benefit on a
mechanical disk @ 80MB/s and a multicore system with
competing processes:
NORMAL 14.73s
SEQUENTIAL 10.95s
WILLNEED 05.22s
However the scheduling differences causing this result
are probably best explicitly managed using `nice` etc.
* m4/jm-macros.m4 (coreutils_MACROS): check for posix_fadvise().
* src/sort.c (fadvise_input): A new function to apply
the POSIX_FADV_SEQUENTIAL hint to an input stream.
(stream_open): Call the above function for all input streams.
Jim Meyering [Wed, 3 Mar 2010 19:55:27 +0000 (20:55 +0100)]
tests: don't let the LANGUAGE envvar perturb tests
* tests/envvar-check (vars): Add LANGUAGE to the list of envvars
to unset. At least in glibc (as an extension to POSIX), its value
actually trumps LC_ALL:
$ LC_ALL=es_ES LANGUAGE=fr_FR.UTF-8 /bin/cat no-such
/bin/cat: no-such: Aucun fichier ou dossier de ce type
but only when the default locale is not C:
$ LC_ALL=C LANGUAGE=fr_FR.UTF-8 /bin/cat no-such
/bin/cat: no-such: No such file or directory
Pádraig Brady [Fri, 26 Feb 2010 15:33:16 +0000 (15:33 +0000)]
sort: fix issues with month sorting in some locales
* src/sort.c (char fold_toupper[]): Change to unsigned
so as the correct comparisons are made in getmonth().
This fixes unibyte locales where abbreviated months
have characters that are > 0x7F, but it also works for
multibyte locales with the caveat that multibyte characters
are matched case sensitively.
With this change, the following example sorts correctly:
$ echo -e "1 márta\n2 Feabhra" | LANG=ga_IE.utf8 sort -k2,2M
2 Feabhra
1 márta
* src/sort.c (inittables): Since we ignore blanks around months
in the input, don't include them when they're present in the locale.
With this change, the following example sorts correctly:
$ echo -e "1 2月\n2 1月" | LANG=ja_JP.utf8 sort -k2,2M
2 1月
1 2月
* tests/misc/sort-month: A new test to exercise the above cases.
* tests/Makefile.am: Reference the new test.
* NEWS: Mention the fix.
Eric Blake [Thu, 25 Feb 2010 15:36:39 +0000 (08:36 -0700)]
expr: clarify error message
* src/expr.c (eval4, eval3): Clarify that expr expects integers,
and not the broader category of numbers.
* tests/misc/expr: Update test accordingly.
Suggested by Dan Jacobson.
Pádraig Brady [Thu, 18 Feb 2010 08:38:30 +0000 (08:38 +0000)]
maint: clean up the output from syntax-check rules
* cfg.mk (sc_tight_scope): Pass the -s (silent) flag to `make`
so that it doesn't report about calling sub makes.
(sc_check-AUTHORS): Likewise.
(sc_strftime_check): Don't display stderr from `info`.
* src/Makefile.am (sc_tight_scope): Don't annotate with "GEN".
(sc_check-AUTHORS): Likewise.
Moritz Orbach [Tue, 16 Feb 2010 22:48:00 +0000 (22:48 +0000)]
ls: fix a regression by honoring NORMAL attributes again
Output the NORMAL attribute before non file name text.
This attribute will continue into file names that would
not otherwise be colored unless FILE is also set.
The regression was introduced with commit 483297d5, 28-02-2009,
"ls --color no longer outputs unnecessary escape sequences".
* src/ls.c (set_normal_color): A new function to output the
NORMAL attribute sequence if it's enabled.
(print_current_files): Output NORMAL before printing long format info.
(print_file_name_and_frills): Output NORMAL before printing file name.
(print_color_indicator): Reset the attributes before a file name with
attributes so that NORMAL attributes will not combine with them.
(print_name_with_quoting): Ensure attributes are reset after printing
the file name if NORMAL attributes were output.
* tests/ls/color-norm: A new test for NORMAL and FILE combinations.
* tests/Makefile.am: Reference the new test.
* NEWS: Mention the fix.
Reported in https://savannah.gnu.org/bugs/?26512
Pádraig Brady [Fri, 12 Feb 2010 18:34:33 +0000 (18:34 +0000)]
maint: fix the man page correlation tests
These checks were not being run as distcheck-hook targets
are only supported in the top-level Makefile. Instead
these tests are now run during a syntax-check.
* cfg.mk (sc_man_file_correlation): A new syntax check to
call the 2 existing tests to check the correlation between
the programs and man/*.[1x].
* man/Makefile.am (sc_man_file_correlation): Call the 2 existing
man page correlation tests.
(check-x-vs-1): Remove the "GEN" annotation as it's a bit verbose.
(check-programs-vs-x): Likewise.
* src/Makefile.am (all_programs.list): Exclude libstdbuf.so
from the list of programs. This issue was not noticed as
the checks were not actually being run.