Paul Eggert [Sat, 5 Jul 2025 14:07:25 +0000 (07:07 -0700)]
factor: faster gcd_odd since 2nd is odd
* src/factor.c (gcd_odd, gcd2_odd): Speed up, given that the
second argument is always odd.
(gcd_odd): Avoid recomputing a temporary.
(gcd2_odd): Test for zero only if a multiple of B.
This saves an ‘assume’.
Paul Eggert [Wed, 2 Jul 2025 21:46:00 +0000 (14:46 -0700)]
factor: tune submod2
* src/factor.c (submod2): Use ckd_sub to subtract by hand rather
than using sub_ddmmss plus a compare. This speeds things up a
bit, on x86-64 with GCC 15 anyway.
Paul Eggert [Tue, 1 Jul 2025 18:48:44 +0000 (11:48 -0700)]
factor: define SQUARE_OF_FIRST_OMITTED_PRIME
* src/make-prime-list.c (output_primes): Output
SQUARE_OF_FIRST_OMITTED_PRIME, not FIRST_OMITTED_PRIME. All uses
changed. This way, the uses don’t need to worry about casts to
avoid overflow.
Paul Eggert [Tue, 1 Jul 2025 02:05:55 +0000 (19:05 -0700)]
factor: improve millerrabin2 API
* src/factor.c (make_uuint2): New function.
(powm2, millerrabin2): Pass two-word args as uuints,
not as mp_limb_t const [2] pointers. All uses changed.
(prime2_p): Rework to use the new API, fixing a FIXME.
Paul Eggert [Mon, 30 Jun 2025 14:34:53 +0000 (07:34 -0700)]
factor: put FACTORS first
* src/factor.c (factor_using_division, mp_finish_up_in_single)
(mp_finish_in_single, factor_using_pollard_rho)
(factor_using_pollard_rho2, factor_up, factor):
Put FACTORS arg first, for consistency.
This is just a refactoring and should not affect speed.
Paul Eggert [Mon, 23 Jun 2025 17:26:28 +0000 (10:26 -0700)]
factor: use 1-word code only when tested
* src/factor.c (print_factors): Use single-precision code only
when word size is 32 or 64, as it isn't tested for other sizes and
is known to not work when it the size is 128. This change does
not affect any known practical platforms.
Paul Eggert [Sat, 21 Jun 2025 18:41:11 +0000 (11:41 -0700)]
factor: don’t prove primality
Suggested for consideration by Torbjörn Granlund in:
https://lists.gnu.org/r/coreutils/2025-01/msg00000.html
* src/factor.c (PROVE_PRIMALITY): Now defaults to false.
(mp_prime_p): Help the compiler by telling it mpz_prob_prime_p
returns nonnegative.
* tests/factor/create-test.sh (bigprime): Test 2^400 - 593,
since that’s now practical.
* tests/local.mk (factor_tests): Add new test.
Paul Eggert [Sun, 15 Jun 2025 06:10:25 +0000 (23:10 -0700)]
factor: always use Baillie-PSW
* src/factor.c (USE_BAILLIE_PSW): New constant.
(prime_p, prime2_p): Use it, i.e., always use Baillie-PSW.
Do so by using mp_prime_p. Do not tell GCC these functions
are pure; the pure mark was present only to pacify GCC
and is no longer needed now that thes functions call mp_prime_p.
Paul Eggert [Thu, 12 Jun 2025 20:51:46 +0000 (13:51 -0700)]
factor: use mpz_probab_prime_p
Inspired by a proposal by Torbjörn Granlund in:
https://lists.gnu.org/r/coreutils/2025-01/msg00000.html
* src/factor.c (mp_prime_p): Use mpz_probab_prime_p rather than doing
it by hand, as mpz_probab_prime_p uses Baillie-PSW which is a win.
This removes the need for the ret2 label and goto.
Paul Eggert [Thu, 12 Jun 2025 20:45:57 +0000 (13:45 -0700)]
factor: don’t give up before last prime in table
* src/factor.c (prime_p, prime2_p, mp_prime_p): Do not skip the
flag_prove_primality test for the last prime in the table, i.e.,
when r == PRIMES_PTAB_ENTRIES - 1.
(mp_prime_p): There is no longer a need for the ret1 label or goto.
Paul Eggert [Fri, 4 Jul 2025 18:52:36 +0000 (11:52 -0700)]
factor: port back to mini-gmp
mini-gmp lacs mpn_tdiv_qr, so supply an emulation of it
when using mini-gmp.
* src/factor.c (copy_mpn_from_mpz, mpn_tdiv_qr) [!mpn_tdiv_qr]:
New functions.
Paul Eggert [Wed, 11 Jun 2025 00:40:34 +0000 (17:40 -0700)]
factor: speed up multiprecision Pollard’s rho
These changes are taken from a proposal by Torbjörn Granlund in:
https://lists.gnu.org/r/coreutils/2025-01/msg00000.html
On my x86-64 platform, they improve speed by more than 8× when
factoring 340282366920938463463374607431768211457.
* src/factor.c (mp_modadd, mp_modsub, mp_modadd_1, mp_mulredc):
New functions.
(MP_FACTOR_USING_POLLARD_RHO_N_MAX): New macro.
(mp_factor_using_pollard_rho): Act on mpn not mpz, and on
mp_limb_t not unsigned long int. Reorder args. All uses changed.
Paul Eggert [Tue, 10 Jun 2025 00:28:34 +0000 (17:28 -0700)]
factor: use a more functional style
This is mostly to make the code a bit easier to read.
It shrinks the size of factor.o by 0.7% on my x86-64 platform
though it doesn’t affect CPU performance significantly.
* src/factor.c (mp_no_factors): Rename from mp_factor_init.
(mp_no_factors, mp_factor_using_division, mp_factor):
Return struct mp_factors rather than modifying one passed by reference.
All uses changed.
Paul Eggert [Sun, 8 Jun 2025 18:34:51 +0000 (11:34 -0700)]
factor: omit unnecessary divisions by small primes
* src/factor.c (mp_factor_using_division):
When continuing in single precision, don’t divide by primes that
were already cast out in multiple precision.
On my platform this gave a 2.5% speedup when factoring
2**128 + 172261 = 4999 * 68070087401668026297934508388031283,
as W_TYPE_SIZE == 64 and 4999 is the last prime in the primes table.
Paul Eggert [Sun, 8 Jun 2025 07:59:44 +0000 (00:59 -0700)]
factor: use primes_diff more consistently
* src/factor.c (mp_factor_using_division): Use index into
primes_diff consistently with other uses. This is mostly just a
style change, and likely doesn’t change the generated machine code.
Paul Eggert [Sun, 8 Jun 2025 05:29:54 +0000 (22:29 -0700)]
factor: refactor to for later performance speedup
This does not affect performance much, but it should allow future
performance improvements.
* src/factor.c (factor_using_division): Two new args I and P,
which generalize this function. All uses changed.
(mp_finish_up_in_single, factor_up): New functions, like the
non-*_up* versions but with two new args PRIME_IDX and PRIME.
They mostly just have the old body of the old non-*_up_ versions.
(mp_finish_in_single, factor): Rewrite in terms of the new functions.
Paul Eggert [Wed, 4 Jun 2025 17:12:29 +0000 (10:12 -0700)]
factor: switch from mp to single when doable
This significantly improves performance when a number exceeds
2**(W_TYPE_SIZE - 1) and is the product of a prime less than
FIRST_OMITTED_PRIME and another prime less than 2**(W_TYPE_SIZE - 1).
On my platform, for example, it doubled the speed of factoring
4999 * (2**128 - 159).
* src/factor.c (mp_size, mp_finish_in_single): New functions.
(mp_factor_using_division, mp_factor_using_pollard_rho):
Finish using single precision when possible.
* tests/factor/factor.pl (lt-5000-times-128-bit): New test.
Paul Eggert [Wed, 4 Jun 2025 15:07:23 +0000 (08:07 -0700)]
factor: primes_diff idx type consistency
* src/factor.c (factor_insert_refind):
Use idx_t for indexes into primes_diff,
for consistency with other indexes into primes_diff.
This has no practical effect unless the primes_diff
table becomes unreasonably large.
Paul Eggert [Mon, 2 Jun 2025 17:02:14 +0000 (10:02 -0700)]
factor: mp insert multiplicity too
Support a multiplicity argument in the mp case, too.
This helps keeps the two cases in sync, for maintenance.
* src/factor.c (mp_factor_insert, mp_factor_insert_ui):
New arg M, for multiplicity. All callers changed.
Paul Eggert [Mon, 2 Jun 2025 05:47:36 +0000 (22:47 -0700)]
factor: prefer non-macros
Use something other than a macro when that is easy and won’t hurt
performance.
* src/factor.c (__ll_B, __ll_lowpart, _ll_highpart) [!USE_LONGLONG_H]:
(MAX_NFACTS, highbit_to_mask, factor_insert, PRIMES_PTAB_ENTRIES):
Make these enums, or constants, or static functions instead of macros.
(highbit_to_mask): Rename from HIGHBIT_TO_MASK. All uses changed.
Paul Eggert [Sat, 31 May 2025 01:58:21 +0000 (18:58 -0700)]
factor: factor insertion simplifications
* src/factor.c (factor_insert_multiplicity):
Adjust to keep in sync with mp_factor_insert changes below,
by adding 1 to the index and using memmove to move.
(mp_factor_insert): Omit redundant call to mpz_cmp.
Prefer idx_t (always nonnegative) to ptrdiff_t,
by adding 1 to the indexes.
Prefer mpz_init_set to mpz_init+mpz_set.
Use memmove to move, rather than doing it by hand.
Paul Eggert [Sat, 24 May 2025 18:57:05 +0000 (11:57 -0700)]
factor: check unsigned char counts
* src/factor.c (MAX_NFACTS): Allow word size of 128 bits,
even if this is only theoretical now.
Check that struct factors’s unsigned char counts won’t overflow.
Paul Eggert [Thu, 22 May 2025 22:35:22 +0000 (15:35 -0700)]
factor: simplify longlong.h setup
* src/factor.c (USE_LONGLONG_H):
Default to false on unusual (but standard-conforming)
platforms that lack int64_t etc.
(UWtype, UHWtype): Now typedefs, not macros.
(UQItype): Remove.
(SItype, USItype, DItype, UDItype): Use standard C types.
Paul Eggert [Sun, 1 Jun 2025 23:28:47 +0000 (16:28 -0700)]
factor: prefer uuint to two words in a couple of places
This simplifies things slightly by using uuint for
some two-word integers.
* src/factor.c (strtouuint): Accept uuint *, not two mp_limb_t *.
All callers changed.
(print_factors_single): Accept uuint, not two limbs.
All callers changed.
(print_factors): Use simpler test for high bit,
one that need not worry about promoting to int.
Paul Eggert [Sun, 1 Jun 2025 23:18:09 +0000 (16:18 -0700)]
factor: remove wide_uuint
Simplify by using GMP’s word type instead of pretending to roll our own.
* src/factor.c (wide_uuint): Remove. All uses replaced by mp_limb_t.
(umul_ppmm) [!umul_ppmm]: Don’t assume unsigned long is at least half
as wide as mp_limb_t. This simpler anyway.
(strtouuint): Rename from strto2wide_uint. All uses changed.
Paul Eggert [Sun, 1 Jun 2025 23:02:21 +0000 (16:02 -0700)]
factor: use same word size as GMP
Remove experimental code for 128-bit words as it does not work and
we lack time to figure out why. Instead, ensure that words are
the same size as with GMP.
* src/factor.c (USE_INT128): Remove. All uses removed.
(wide_uint, W_TYPE_SIZE): Define to be the same as GMP.
(MP_LIMB_MAX): New macro. Check that it matches W_TYPE_SIZE.
(USE_LONGLONG_H): Default to true.
(UHWtype) [USE_LONGLONG_H]: Define to unsigned int, same as GMP.
(prime_p): Go back to not worrying about 128-bit words,
since GMP doesn’t worry and doesn’t use them.
(lbuf_putbitcnt): New function, since we cannot assume
that bitcnt_t fits into mp_limb_t.
(print_factors): Use it.
* src/make-prime-list.c (output_primes):
Don’t assume that wide_uint’s maximum is UINTMAX_MAX.
Paul Eggert [Mon, 2 Jun 2025 06:18:23 +0000 (23:18 -0700)]
factor: unsigned long → mp_bitcnt_t
* src/factor.c (struct mp_factors): e (multiplicity) member
is now of type mp_bitcnt_t, not unsigned long int, since
its value is at most a bit count. All uses changed.
Paul Eggert [Sun, 1 Jun 2025 22:55:27 +0000 (15:55 -0700)]
factor: generalize BIG_POWER_OF_10
* src/factor.c (BIG_POWER_OF_10, LOG_BIG_POWER_OF_10):
Place fewer restrictions on BIG_POWER_OF_10.
This is only for currently-theoretical hosts;
it shouldn’t affect machine code on practical platforms.
Paul Eggert [Sun, 1 Jun 2025 22:51:26 +0000 (15:51 -0700)]
factor: remove wide_int
* src/factor.c (wide_int): Remove, since it gets in the
way of using mp_limb_t for words. All uses removed.
(submod2, HIGHBIT_TO_MASK, divexact_21):
Rewrite without using wide_int.
This shouldn't change the machine code these days,
as compilers are pretty smart about isolating the
top bit of an unsigned int.
Paul Eggert [Sun, 1 Jun 2025 01:17:34 +0000 (18:17 -0700)]
factor: don’t used uninitialized uu[0]
In practice there’s no bug but we might as well avoid the
undefined behavior.
* src/factor.c (hi_is_set): New static function.
(factor_insert_large, prime2_p, print_factors_single): Use it.
* src/local.mk: Similarly to commit v8.22-156-g09937e9d0
track speedlist.h with nodist_src_stty_SOURCES and DISTCLEANFILES
to ensure the make distcheck manifest comparison passes.
Addresses https://bug.gnu.org/78960
* src/local.mk: Use the coarser BUILT_SOURCES mechanism
to generate speedlist.h, rather than a specific dependency
(which did seem to work for parallel builds).
Fixes https://bugs.gnu.org/78960
Paul Eggert [Mon, 30 Jun 2025 22:40:57 +0000 (15:40 -0700)]
od: port to Apple clang 14
* src/od.c (print_function_type): New type. Use it for convenience.
(width_bytes): Omit duplicate entries, such as ‘double’ vs ‘long
double’ on macOS. Problem reported by Bruno Haible
<https://bugs.gnu.org/78933>.
(decode_one_format): Cast null pointer to print_function_type
to pacify Apple clang-1400.0.29.202.
Pádraig Brady [Mon, 30 Jun 2025 14:34:22 +0000 (15:34 +0100)]
build: fix VPATH builds with --enable-single-binary
* src/local.mk: Adjust the dependency so that speedlist.h
is built irrespective of the object file name.
Note we could use BUILT_SOURCES for this,
but it's better to have this more accurate dependency.
Pádraig Brady [Mon, 30 Jun 2025 13:25:56 +0000 (14:25 +0100)]
od: reinstate half float validation check
Reinstate check removed in commit 56aa549a0 so that we
disallow -f2 when configured with utils_cv_ieee_16_bit_supported=no.
Otherwise the output routines will consume floats,
i.e. 4 bytes at a time. Without this extra check
the tests/od/od-endian.sh will fail with this configuration.
* src/od.c (decode_one_format): Reinstate the explicit check
for this configuration edge case.
Paul Eggert [Mon, 30 Jun 2025 00:12:07 +0000 (17:12 -0700)]
od: pacify gcc -Wduplicated-cond
Problem reported by Pádraig Brady <https://bugs.gnu.org/78880#43>.
This patch doesn’t fix any bugs; it merely pacifies GCC.
* src/od.c (ispec_to_format): New function, replacing
the old ISPEC_TO_FORMAT macro. All uses changed.
This part of the change is just refactoring.
(decode_one_format): Pacify à la ispec_to_format.
Paul Eggert [Sun, 29 Jun 2025 03:57:00 +0000 (20:57 -0700)]
od: be more consistent re sizeof
* src/od.c (width_bytes, decode_one_format): Don’t assume a signed
type has the same size as the corresponding unsigned type.
This has no effect on practical platforms; it’s just for
consistency there.
Paul Eggert [Sat, 28 Jun 2025 16:49:42 +0000 (09:49 -0700)]
od: simpler static initialization
* src/od.c (address_base, address_pad_len, format_address):
Initialize statically rather than dynamically.
(limit_bytes_to_format): Remove. All uses replaced by
checking sign of end_offset.
(max_bytes_to_format): Remove static var. Now local to ‘main’.
(end_offset): -1 now means no limit. All uses changed.
Paul Eggert [Sat, 28 Jun 2025 00:08:28 +0000 (17:08 -0700)]
od: omit some duplicate code
On x86-64 (for example) print_long, print_long_long, and
print_intmax all behave identically, so give GCC enough info so
that it generates code for just one of these functions.
* src/od.c (enum size_spec): Arrange for enum values to
be the same if they represent types that behave the same.
(width_bytes, ISPEC_TO_FORMAT, decode_one_format):
Match the enum size_spec changes.
Paul Eggert [Fri, 27 Jun 2025 23:08:06 +0000 (16:08 -0700)]
od: replace lookup tables with simple arithmetic
* src/od.c (FMT_BYTES_ALLOCATED): Use a simpler formula.
Although slightly too generous, the storage wasted is very small
and it pacifies gcc -Wformat-overflow=2.
(bytes_to_oct_digits, bytes_to_signed_dec_digits)
(bytes_to_unsigned_dec_digits, bytes_to_hex_digoits): Remove.
All uses replaced by algorithmic calculations, which are good
enough: they are valid for integers up to 2620 bits (!) and might
be slightly conservative for wider integers. Remove related
static_asserts, which are no longer needed.
Paul Eggert [Fri, 27 Jun 2025 15:45:53 +0000 (08:45 -0700)]
od: support uintmax_t too
This has practical effect only on hypothetical platforms where
uintmax_t is wider than unsigned long long int.
* src/od.c (enum size_spec): New constant INTMAX.
(MAX_INTEGRAL_TYPE_WIDTH): Now equals UINTMAX_WIDTH.
(FMT_BYTES_ALLOCATED): Allow for the extra "l" in "%lld".
Also, fix off-by-two error in size calculation.
(width_bytes, integral_type_size): Add entries for uintmax_t.
(print_intmax): New function.
(decode_one_function): Use it.
(ISPEC_TO_FORMAT): New arg Max_fmt. All uses changed.
Paul Eggert [Fri, 27 Jun 2025 06:51:37 +0000 (23:51 -0700)]
od: initialize type-size tables statically
* src/od.c (NO_SIZE): Make it explicitly 0, as the
initializers now rely on this.
(MAX_INTEGRAL_TYPE_SIZE): Remove. All uses replaced by
ARRAY_CARDINALITY (integral_type_size) - 1.
Move static assertion down to where this can be used.
(integral_type_size, fp_type_size): Make them const,
and initialize them statically.
(main): Omit no-longer-needed initialization code.
Paul Eggert [Thu, 26 Jun 2025 15:45:47 +0000 (08:45 -0700)]
od: prefer intmax_t to uintmax_t
* src/od.c (MAX_ADDRESS_LENGTH, pseudo_offset, n_bytes_to_skip)
(max_bytes_to_format, end_offset, skip, format_address_none)
(format_address_std, format_address_paren, format_address_label)
(write_block, parse_old_offset, dump, dump_strings, main):
Prefer intmax_t to uintmax_t. This makes no practical difference,
and lets -fsanitize=undefined check for signed integer overflow.
(skip, dump): Remove no-longer-needed casts.
(xstr2nonneg): New static function. All callers of xstrtoumax
now call this function instead.
(main): Use ckd_add to detect signed integer overflow, since
the unsigned trick no longer works reliably.
Let xstrtol_fatal report the overflow, instead of doing
it by hand ourselves.
Paul Eggert [Sun, 29 Jun 2025 00:29:22 +0000 (17:29 -0700)]
od: fix '+N.' bug
* src/od.c (parse_old_offset): First arg is now char *,
not char const *. If a decimal number, temporarily
modify the string so that xstrtoumax does not complain
about the '.'.
* tests/od/od.pl: Test for the bug.
Paul Eggert [Thu, 26 Jun 2025 06:22:37 +0000 (23:22 -0700)]
od: fix some unlikely integer overflows
* src/od.c (print_n_spaces, pad_at, pad_at_overflow):
New static functions.
(struct tspec, PRINT_FIELDS, print_named_ascii, print_ascii)
(decode_one_format, write_block, main):
Use idx_t, not int, for counts that depend on the number
of bytes in an object.
(decode_one_format): Use print_n_spaces to output spaces.
(PRINT_FIELDS, print_named_ascii, print_ascii):
Use pad_at to avoid integer overflow.
(write_block): Do not use %*s to pad, as the total pad might
exceed INT_MAX. Instead, pad by hand with putchar (' ').
(main): Use pad_at_overflow to report integer overflow due to
oversize -w. Use better way to tell whether -w is used,
without needing IF_LINT.
* tests/od/big-w.sh: New test.
* tests/local.mk (all_tests): Add it.
Paul Eggert [Wed, 25 Jun 2025 03:01:04 +0000 (20:01 -0700)]
od: don’t assume no holes in wide unsigned
Also, fix minor related typos.
* src/od.c (MAX_INTEGRAL_TYPE_SIZE, MAX_ADDRESS_LENGTH):
Now a constant, not a macro.
(MAX_INTEGRAL_TYPE_WIDTH): New constant. Use it instead of
CHAR_BIT, so as not to assume that uintmax_t and unsigned long
long int are hole-free. This doesn’t matter on practical porting
targets, though there is still a mainframe or two that have holes.
(FMT_BYTES_ALLOCATED): Fix typo by changing "jd" to "jo".
Fix off-by-one typo in static assertion.
Paul Eggert [Wed, 25 Jun 2025 02:13:20 +0000 (19:13 -0700)]
maint: assume long long int
It’s long been safe to assume C99+ support for long long int.
* .gitignore: Remove m4/longlong.m4.
* bootstrap.conf (buildreq): Boost git prereq from 1.4.4 to 1.5.5,
syncing with Gnulib.
(bootstrap_post_import_hook): Remove m4/longlong.m4.
* m4/jm-macros.m4 (gl_CHECK_ALL_TYPES):
No need to require AC_TYPE_UNSIGNED_LONG_LONG_INT.
* src/factor.c (DItype, UDItype):
* src/od.c (main):
Assume HAVE_LONG_LONG_INT.
* src/od.c: (unsigned_long_long_int):
Remove. All uses replaced with unsigned long long int.
Paul Eggert [Sat, 28 Jun 2025 15:15:42 +0000 (08:15 -0700)]
od: fix theoretical size_t malloc overflow
* src/od.c (dump, dump_strings): Use idx_t allocators
rather than size_t allocators, to avoid unchecked integer
overflow on theoretical platforms where SIZE_MAX < IDX_MAX.
* doc/coreutils.texi (cksum common options): Reorder and tweak the info
to make it clearer that --check does not support the legacy crc output
from the cksum command.
Reported at https://bugs.debian.org/1108363
Pádraig Brady [Tue, 24 Jun 2025 14:47:48 +0000 (15:47 +0100)]
od: output standard diagnostics for invalid -w arguments
* src/od.c (main): Don't pass LONGINT_OK to xstrtol_fatal(),
as otherwise it will abort().
* tests/od/od.pl: Add test cases.
* NEWS: Mention the bug fix.
Pádraig Brady [Tue, 24 Jun 2025 00:17:12 +0000 (01:17 +0100)]
od: fix various off-by-one issues with --strings with -N
* src/od.c (dump_strings): There are three related issues here
due to not accounting for the terminating NUL char appropriately.
1. Ensure BUF always has enough space for the terminating NUL.
This avoids CWE-122: Heap-based Buffer Overflow,
where we wrote a single NUL byte directly after the allocated buffer.
I.e., there should be no buffer overflow with:
printf '%100s' | od -N100 -S1
2. Ensure we support -S == -N (END_OFFSET - STRING_MIN == ADDRESS):
I.e., there should be output with:
printf '%100s' | od -N10 -S10
3. Ensure we always output a valid address by ensuring
the ADDRESS and I variables are kept in sync.
I.e., this should output address 0000000 not 1777777777777777777777:
printf '%100s' | od -N10 -S1
As well as fixing these we simplify by using a single loop
to read the data, rather than two.
* doc/coreutils.texi (od invocation): Clarify that -N
implicitly NUL terminates strings.
* tests/od/od-N.sh: Add test cases.
* NEWS: Mention the bug fixes.