Paul Eggert [Fri, 11 Jul 2025 21:43:32 +0000 (14:43 -0700)]
tests: fix fraction comparison in sort-float
Problem reported by Cosima Neidahl <https://bugs.gnu.org/78985#13>.
* tests/sort/sort-float.sh: At top level, use C locale at first.
(dbl_minima_order): Assume C locale.
Use string comparison for the fractional parts.
2025-07-10 Paul Eggert <eggert@cs.ucla.edu>
tests: fix integer overflow in sort-float
Problem reported by Cosima Neidahl <https://bugs.gnu.org/78985>.
* tests/sort/sort-float.sh (dbl_minima_order):
Use expr instead of test, to avoid problems with integers
too large for the shell.
Paul Eggert [Thu, 10 Jul 2025 17:17:29 +0000 (10:17 -0700)]
tests: fix integer overflow in sort-float
Problem reported by Cosima Neidahl <https://bugs.gnu.org/78985>.
* tests/sort/sort-float.sh (dbl_minima_order):
Use expr instead of test to compare fractions,
to avoid problems with integers too large for the shell.
Paul Eggert [Thu, 10 Jul 2025 04:59:50 +0000 (21:59 -0700)]
factor: use 64-bit internal counters
* src/factor.c (factor_using_pollard_rho)
(factor_using_pollard_rho2, mp_factor_using_pollard_rho):
Use int_fast64_t for internal counters rather than int, as int
could overflow on some somewhat-practical examples. Problem
discovered on a hypothetical platform where W_TYPE_SIZE is neither
32 nor 64, when factoring 0x7ffffffffffffeab7fffffffffff7369 == 170141183460469225450570946617781744489, causing k to overflow in
mp_factor_using_pollard_rho. Presumably a similar problem exists
in the previous stable coreutils 9.7, too, on 32-bit platforms
with somewhat-larger test cases, though I haven’t take the
somewhat-extensive CPU time to discover it.
Paul Eggert [Thu, 10 Jul 2025 04:22:12 +0000 (21:22 -0700)]
factor: fix mp_factor_using_pollard_rho aliasing
* src/factor.c (mp_factor_using_pollard_rho):
Fix recently-introduced aliasing bug by computing q
before g gets updated in place. Problem discovered
on a hypothetical platform where W_TYPE_SIZE
is neither 32 nor 64.
Paul Eggert [Wed, 9 Jul 2025 17:36:04 +0000 (10:36 -0700)]
factor: speed up converting strings to uuint
* src/factor.c: Do not include c-ctype.h.
(strtouuint): Don’t bother generating a number on
error; just return a strtol_error value other than LONGINT_OK.
Speed up overflow checking.
Paul Eggert [Wed, 9 Jul 2025 16:48:21 +0000 (09:48 -0700)]
factor: simplify primes table
* src/factor.c (primes_ptab): New table of primes, replacing
primes_diff and primes_diff8. All uses changed. This is simpler
and should improve performance slightly. Although this limits the
table’s primes to 2**15 instead of to 668221, the limit can easily
grow to 2**32 by changing the type of ‘prime’, without hurting
performance significantly compared to the primes_diff and
primes_diff8 approach.
* src/make-prime-list.c (output_primes):
For each prime p, output p instead of two differences.
Paul Eggert [Mon, 7 Jul 2025 03:18:10 +0000 (20:18 -0700)]
factor: speed up Pollard-rho loop counters
* src/factor.c (factor_using_pollard_rho)
(factor_using_pollard_rho2, mp_factor_using_pollard_rho):
Use int, not unsigned long int, for counters that won’t go above
2**31 on practical platforms. This yields a significant speedup
on GCC 15 x86-64, and using signed values allows for automatic
checks for overflow when using gcc -fsanitize=undefined.
Paul Eggert [Sun, 6 Jul 2025 20:59:56 +0000 (13:59 -0700)]
factor: avoid an mpz init+clear
* src/factor.c (mp_factor_insert_ui): Rename fom
mp_factor_insert_ui, and change arg type from unsigned long int to
mp_limb_t. All uses changed. This avoids creating and freeing a
small mpz_t.
(mp_factor_using_division): Add a static assert requiring that
that mp_limb_t be wide enough, which it should be (and is in
standard GMP).
Paul Eggert [Sun, 6 Jul 2025 16:34:00 +0000 (09:34 -0700)]
factor: Pollard-rho a is now mp_limb_t
* src/factor.c (factor_using_pollard_rho)
(factor_using_pollard_rho2): Use mp_limb_t, not unsigned long int,
for a parameter. This avoids some casts, and avoids a theoretical
bug where converting to mp_limb_t loses info.
Paul Eggert [Sun, 6 Jul 2025 05:12:22 +0000 (22:12 -0700)]
factor: redo ge2 in terms of lt2
lt2 a bit more natural, given the current implementation.
* src/factor.c (lt2): New function.
(ge2): Rewrite in terms of lt2.
(gt2): Remove. All callers changed to use lt2.
Paul Eggert [Sat, 5 Jul 2025 18:04:54 +0000 (11:04 -0700)]
factor: speed up umul_ppmm when !USE_LONGLONG_H
* src/factor.c (umul_ppmm): When !USE_LONGLONG_H so we need to
define this, speed things up if there is an unsigned type uuroom_t
wide enough to hold two words. Do not make a similar change for
udiv_qrnnd, as it is not performance critical and anyway on GCC 15
x86-64 that would mean a subroutine call.
Paul Eggert [Sat, 5 Jul 2025 14:07:25 +0000 (07:07 -0700)]
factor: faster gcd_odd since 2nd is odd
* src/factor.c (gcd_odd, gcd2_odd): Speed up, given that the
second argument is always odd.
(gcd_odd): Avoid recomputing a temporary.
(gcd2_odd): Test for zero only if a multiple of B.
This saves an ‘assume’.
Paul Eggert [Wed, 2 Jul 2025 21:46:00 +0000 (14:46 -0700)]
factor: tune submod2
* src/factor.c (submod2): Use ckd_sub to subtract by hand rather
than using sub_ddmmss plus a compare. This speeds things up a
bit, on x86-64 with GCC 15 anyway.
Paul Eggert [Tue, 1 Jul 2025 18:48:44 +0000 (11:48 -0700)]
factor: define SQUARE_OF_FIRST_OMITTED_PRIME
* src/make-prime-list.c (output_primes): Output
SQUARE_OF_FIRST_OMITTED_PRIME, not FIRST_OMITTED_PRIME. All uses
changed. This way, the uses don’t need to worry about casts to
avoid overflow.
Paul Eggert [Tue, 1 Jul 2025 02:05:55 +0000 (19:05 -0700)]
factor: improve millerrabin2 API
* src/factor.c (make_uuint2): New function.
(powm2, millerrabin2): Pass two-word args as uuints,
not as mp_limb_t const [2] pointers. All uses changed.
(prime2_p): Rework to use the new API, fixing a FIXME.
Paul Eggert [Mon, 30 Jun 2025 14:34:53 +0000 (07:34 -0700)]
factor: put FACTORS first
* src/factor.c (factor_using_division, mp_finish_up_in_single)
(mp_finish_in_single, factor_using_pollard_rho)
(factor_using_pollard_rho2, factor_up, factor):
Put FACTORS arg first, for consistency.
This is just a refactoring and should not affect speed.
Paul Eggert [Mon, 23 Jun 2025 17:26:28 +0000 (10:26 -0700)]
factor: use 1-word code only when tested
* src/factor.c (print_factors): Use single-precision code only
when word size is 32 or 64, as it isn't tested for other sizes and
is known to not work when it the size is 128. This change does
not affect any known practical platforms.
Paul Eggert [Sat, 21 Jun 2025 18:41:11 +0000 (11:41 -0700)]
factor: don’t prove primality
Suggested for consideration by Torbjörn Granlund in:
https://lists.gnu.org/r/coreutils/2025-01/msg00000.html
* src/factor.c (PROVE_PRIMALITY): Now defaults to false.
(mp_prime_p): Help the compiler by telling it mpz_prob_prime_p
returns nonnegative.
* tests/factor/create-test.sh (bigprime): Test 2^400 - 593,
since that’s now practical.
* tests/local.mk (factor_tests): Add new test.
Paul Eggert [Sun, 15 Jun 2025 06:10:25 +0000 (23:10 -0700)]
factor: always use Baillie-PSW
* src/factor.c (USE_BAILLIE_PSW): New constant.
(prime_p, prime2_p): Use it, i.e., always use Baillie-PSW.
Do so by using mp_prime_p. Do not tell GCC these functions
are pure; the pure mark was present only to pacify GCC
and is no longer needed now that thes functions call mp_prime_p.
Paul Eggert [Thu, 12 Jun 2025 20:51:46 +0000 (13:51 -0700)]
factor: use mpz_probab_prime_p
Inspired by a proposal by Torbjörn Granlund in:
https://lists.gnu.org/r/coreutils/2025-01/msg00000.html
* src/factor.c (mp_prime_p): Use mpz_probab_prime_p rather than doing
it by hand, as mpz_probab_prime_p uses Baillie-PSW which is a win.
This removes the need for the ret2 label and goto.
Paul Eggert [Thu, 12 Jun 2025 20:45:57 +0000 (13:45 -0700)]
factor: don’t give up before last prime in table
* src/factor.c (prime_p, prime2_p, mp_prime_p): Do not skip the
flag_prove_primality test for the last prime in the table, i.e.,
when r == PRIMES_PTAB_ENTRIES - 1.
(mp_prime_p): There is no longer a need for the ret1 label or goto.
Paul Eggert [Fri, 4 Jul 2025 18:52:36 +0000 (11:52 -0700)]
factor: port back to mini-gmp
mini-gmp lacs mpn_tdiv_qr, so supply an emulation of it
when using mini-gmp.
* src/factor.c (copy_mpn_from_mpz, mpn_tdiv_qr) [!mpn_tdiv_qr]:
New functions.
Paul Eggert [Wed, 11 Jun 2025 00:40:34 +0000 (17:40 -0700)]
factor: speed up multiprecision Pollard’s rho
These changes are taken from a proposal by Torbjörn Granlund in:
https://lists.gnu.org/r/coreutils/2025-01/msg00000.html
On my x86-64 platform, they improve speed by more than 8× when
factoring 340282366920938463463374607431768211457.
* src/factor.c (mp_modadd, mp_modsub, mp_modadd_1, mp_mulredc):
New functions.
(MP_FACTOR_USING_POLLARD_RHO_N_MAX): New macro.
(mp_factor_using_pollard_rho): Act on mpn not mpz, and on
mp_limb_t not unsigned long int. Reorder args. All uses changed.
Paul Eggert [Tue, 10 Jun 2025 00:28:34 +0000 (17:28 -0700)]
factor: use a more functional style
This is mostly to make the code a bit easier to read.
It shrinks the size of factor.o by 0.7% on my x86-64 platform
though it doesn’t affect CPU performance significantly.
* src/factor.c (mp_no_factors): Rename from mp_factor_init.
(mp_no_factors, mp_factor_using_division, mp_factor):
Return struct mp_factors rather than modifying one passed by reference.
All uses changed.
Paul Eggert [Sun, 8 Jun 2025 18:34:51 +0000 (11:34 -0700)]
factor: omit unnecessary divisions by small primes
* src/factor.c (mp_factor_using_division):
When continuing in single precision, don’t divide by primes that
were already cast out in multiple precision.
On my platform this gave a 2.5% speedup when factoring
2**128 + 172261 = 4999 * 68070087401668026297934508388031283,
as W_TYPE_SIZE == 64 and 4999 is the last prime in the primes table.
Paul Eggert [Sun, 8 Jun 2025 07:59:44 +0000 (00:59 -0700)]
factor: use primes_diff more consistently
* src/factor.c (mp_factor_using_division): Use index into
primes_diff consistently with other uses. This is mostly just a
style change, and likely doesn’t change the generated machine code.
Paul Eggert [Sun, 8 Jun 2025 05:29:54 +0000 (22:29 -0700)]
factor: refactor to for later performance speedup
This does not affect performance much, but it should allow future
performance improvements.
* src/factor.c (factor_using_division): Two new args I and P,
which generalize this function. All uses changed.
(mp_finish_up_in_single, factor_up): New functions, like the
non-*_up* versions but with two new args PRIME_IDX and PRIME.
They mostly just have the old body of the old non-*_up_ versions.
(mp_finish_in_single, factor): Rewrite in terms of the new functions.
Paul Eggert [Wed, 4 Jun 2025 17:12:29 +0000 (10:12 -0700)]
factor: switch from mp to single when doable
This significantly improves performance when a number exceeds
2**(W_TYPE_SIZE - 1) and is the product of a prime less than
FIRST_OMITTED_PRIME and another prime less than 2**(W_TYPE_SIZE - 1).
On my platform, for example, it doubled the speed of factoring
4999 * (2**128 - 159).
* src/factor.c (mp_size, mp_finish_in_single): New functions.
(mp_factor_using_division, mp_factor_using_pollard_rho):
Finish using single precision when possible.
* tests/factor/factor.pl (lt-5000-times-128-bit): New test.
Paul Eggert [Wed, 4 Jun 2025 15:07:23 +0000 (08:07 -0700)]
factor: primes_diff idx type consistency
* src/factor.c (factor_insert_refind):
Use idx_t for indexes into primes_diff,
for consistency with other indexes into primes_diff.
This has no practical effect unless the primes_diff
table becomes unreasonably large.
Paul Eggert [Mon, 2 Jun 2025 17:02:14 +0000 (10:02 -0700)]
factor: mp insert multiplicity too
Support a multiplicity argument in the mp case, too.
This helps keeps the two cases in sync, for maintenance.
* src/factor.c (mp_factor_insert, mp_factor_insert_ui):
New arg M, for multiplicity. All callers changed.
Paul Eggert [Mon, 2 Jun 2025 05:47:36 +0000 (22:47 -0700)]
factor: prefer non-macros
Use something other than a macro when that is easy and won’t hurt
performance.
* src/factor.c (__ll_B, __ll_lowpart, _ll_highpart) [!USE_LONGLONG_H]:
(MAX_NFACTS, highbit_to_mask, factor_insert, PRIMES_PTAB_ENTRIES):
Make these enums, or constants, or static functions instead of macros.
(highbit_to_mask): Rename from HIGHBIT_TO_MASK. All uses changed.
Paul Eggert [Sat, 31 May 2025 01:58:21 +0000 (18:58 -0700)]
factor: factor insertion simplifications
* src/factor.c (factor_insert_multiplicity):
Adjust to keep in sync with mp_factor_insert changes below,
by adding 1 to the index and using memmove to move.
(mp_factor_insert): Omit redundant call to mpz_cmp.
Prefer idx_t (always nonnegative) to ptrdiff_t,
by adding 1 to the indexes.
Prefer mpz_init_set to mpz_init+mpz_set.
Use memmove to move, rather than doing it by hand.
Paul Eggert [Sat, 24 May 2025 18:57:05 +0000 (11:57 -0700)]
factor: check unsigned char counts
* src/factor.c (MAX_NFACTS): Allow word size of 128 bits,
even if this is only theoretical now.
Check that struct factors’s unsigned char counts won’t overflow.
Paul Eggert [Thu, 22 May 2025 22:35:22 +0000 (15:35 -0700)]
factor: simplify longlong.h setup
* src/factor.c (USE_LONGLONG_H):
Default to false on unusual (but standard-conforming)
platforms that lack int64_t etc.
(UWtype, UHWtype): Now typedefs, not macros.
(UQItype): Remove.
(SItype, USItype, DItype, UDItype): Use standard C types.
Paul Eggert [Sun, 1 Jun 2025 23:28:47 +0000 (16:28 -0700)]
factor: prefer uuint to two words in a couple of places
This simplifies things slightly by using uuint for
some two-word integers.
* src/factor.c (strtouuint): Accept uuint *, not two mp_limb_t *.
All callers changed.
(print_factors_single): Accept uuint, not two limbs.
All callers changed.
(print_factors): Use simpler test for high bit,
one that need not worry about promoting to int.
Paul Eggert [Sun, 1 Jun 2025 23:18:09 +0000 (16:18 -0700)]
factor: remove wide_uuint
Simplify by using GMP’s word type instead of pretending to roll our own.
* src/factor.c (wide_uuint): Remove. All uses replaced by mp_limb_t.
(umul_ppmm) [!umul_ppmm]: Don’t assume unsigned long is at least half
as wide as mp_limb_t. This simpler anyway.
(strtouuint): Rename from strto2wide_uint. All uses changed.
Paul Eggert [Sun, 1 Jun 2025 23:02:21 +0000 (16:02 -0700)]
factor: use same word size as GMP
Remove experimental code for 128-bit words as it does not work and
we lack time to figure out why. Instead, ensure that words are
the same size as with GMP.
* src/factor.c (USE_INT128): Remove. All uses removed.
(wide_uint, W_TYPE_SIZE): Define to be the same as GMP.
(MP_LIMB_MAX): New macro. Check that it matches W_TYPE_SIZE.
(USE_LONGLONG_H): Default to true.
(UHWtype) [USE_LONGLONG_H]: Define to unsigned int, same as GMP.
(prime_p): Go back to not worrying about 128-bit words,
since GMP doesn’t worry and doesn’t use them.
(lbuf_putbitcnt): New function, since we cannot assume
that bitcnt_t fits into mp_limb_t.
(print_factors): Use it.
* src/make-prime-list.c (output_primes):
Don’t assume that wide_uint’s maximum is UINTMAX_MAX.
Paul Eggert [Mon, 2 Jun 2025 06:18:23 +0000 (23:18 -0700)]
factor: unsigned long → mp_bitcnt_t
* src/factor.c (struct mp_factors): e (multiplicity) member
is now of type mp_bitcnt_t, not unsigned long int, since
its value is at most a bit count. All uses changed.
Paul Eggert [Sun, 1 Jun 2025 22:55:27 +0000 (15:55 -0700)]
factor: generalize BIG_POWER_OF_10
* src/factor.c (BIG_POWER_OF_10, LOG_BIG_POWER_OF_10):
Place fewer restrictions on BIG_POWER_OF_10.
This is only for currently-theoretical hosts;
it shouldn’t affect machine code on practical platforms.
Paul Eggert [Sun, 1 Jun 2025 22:51:26 +0000 (15:51 -0700)]
factor: remove wide_int
* src/factor.c (wide_int): Remove, since it gets in the
way of using mp_limb_t for words. All uses removed.
(submod2, HIGHBIT_TO_MASK, divexact_21):
Rewrite without using wide_int.
This shouldn't change the machine code these days,
as compilers are pretty smart about isolating the
top bit of an unsigned int.
Paul Eggert [Sun, 1 Jun 2025 01:17:34 +0000 (18:17 -0700)]
factor: don’t used uninitialized uu[0]
In practice there’s no bug but we might as well avoid the
undefined behavior.
* src/factor.c (hi_is_set): New static function.
(factor_insert_large, prime2_p, print_factors_single): Use it.
* src/local.mk: Similarly to commit v8.22-156-g09937e9d0
track speedlist.h with nodist_src_stty_SOURCES and DISTCLEANFILES
to ensure the make distcheck manifest comparison passes.
Addresses https://bug.gnu.org/78960
* src/local.mk: Use the coarser BUILT_SOURCES mechanism
to generate speedlist.h, rather than a specific dependency
(which did seem to work for parallel builds).
Fixes https://bugs.gnu.org/78960
Paul Eggert [Mon, 30 Jun 2025 22:40:57 +0000 (15:40 -0700)]
od: port to Apple clang 14
* src/od.c (print_function_type): New type. Use it for convenience.
(width_bytes): Omit duplicate entries, such as ‘double’ vs ‘long
double’ on macOS. Problem reported by Bruno Haible
<https://bugs.gnu.org/78933>.
(decode_one_format): Cast null pointer to print_function_type
to pacify Apple clang-1400.0.29.202.
Pádraig Brady [Mon, 30 Jun 2025 14:34:22 +0000 (15:34 +0100)]
build: fix VPATH builds with --enable-single-binary
* src/local.mk: Adjust the dependency so that speedlist.h
is built irrespective of the object file name.
Note we could use BUILT_SOURCES for this,
but it's better to have this more accurate dependency.
Pádraig Brady [Mon, 30 Jun 2025 13:25:56 +0000 (14:25 +0100)]
od: reinstate half float validation check
Reinstate check removed in commit 56aa549a0 so that we
disallow -f2 when configured with utils_cv_ieee_16_bit_supported=no.
Otherwise the output routines will consume floats,
i.e. 4 bytes at a time. Without this extra check
the tests/od/od-endian.sh will fail with this configuration.
* src/od.c (decode_one_format): Reinstate the explicit check
for this configuration edge case.
Paul Eggert [Mon, 30 Jun 2025 00:12:07 +0000 (17:12 -0700)]
od: pacify gcc -Wduplicated-cond
Problem reported by Pádraig Brady <https://bugs.gnu.org/78880#43>.
This patch doesn’t fix any bugs; it merely pacifies GCC.
* src/od.c (ispec_to_format): New function, replacing
the old ISPEC_TO_FORMAT macro. All uses changed.
This part of the change is just refactoring.
(decode_one_format): Pacify à la ispec_to_format.
Paul Eggert [Sun, 29 Jun 2025 03:57:00 +0000 (20:57 -0700)]
od: be more consistent re sizeof
* src/od.c (width_bytes, decode_one_format): Don’t assume a signed
type has the same size as the corresponding unsigned type.
This has no effect on practical platforms; it’s just for
consistency there.
Paul Eggert [Sat, 28 Jun 2025 16:49:42 +0000 (09:49 -0700)]
od: simpler static initialization
* src/od.c (address_base, address_pad_len, format_address):
Initialize statically rather than dynamically.
(limit_bytes_to_format): Remove. All uses replaced by
checking sign of end_offset.
(max_bytes_to_format): Remove static var. Now local to ‘main’.
(end_offset): -1 now means no limit. All uses changed.
Paul Eggert [Sat, 28 Jun 2025 00:08:28 +0000 (17:08 -0700)]
od: omit some duplicate code
On x86-64 (for example) print_long, print_long_long, and
print_intmax all behave identically, so give GCC enough info so
that it generates code for just one of these functions.
* src/od.c (enum size_spec): Arrange for enum values to
be the same if they represent types that behave the same.
(width_bytes, ISPEC_TO_FORMAT, decode_one_format):
Match the enum size_spec changes.
Paul Eggert [Fri, 27 Jun 2025 23:08:06 +0000 (16:08 -0700)]
od: replace lookup tables with simple arithmetic
* src/od.c (FMT_BYTES_ALLOCATED): Use a simpler formula.
Although slightly too generous, the storage wasted is very small
and it pacifies gcc -Wformat-overflow=2.
(bytes_to_oct_digits, bytes_to_signed_dec_digits)
(bytes_to_unsigned_dec_digits, bytes_to_hex_digoits): Remove.
All uses replaced by algorithmic calculations, which are good
enough: they are valid for integers up to 2620 bits (!) and might
be slightly conservative for wider integers. Remove related
static_asserts, which are no longer needed.
Paul Eggert [Fri, 27 Jun 2025 15:45:53 +0000 (08:45 -0700)]
od: support uintmax_t too
This has practical effect only on hypothetical platforms where
uintmax_t is wider than unsigned long long int.
* src/od.c (enum size_spec): New constant INTMAX.
(MAX_INTEGRAL_TYPE_WIDTH): Now equals UINTMAX_WIDTH.
(FMT_BYTES_ALLOCATED): Allow for the extra "l" in "%lld".
Also, fix off-by-two error in size calculation.
(width_bytes, integral_type_size): Add entries for uintmax_t.
(print_intmax): New function.
(decode_one_function): Use it.
(ISPEC_TO_FORMAT): New arg Max_fmt. All uses changed.