From: Dean Rasheed <dean.a.rasheed@gmail.com>
Date: Thu, 7 Aug 2025 08:20:02 +0000 (+0100)
Subject: Optimise non-native 128-bit addition in int128.h.
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=d9bb8ef093d62763cfd19d37e6bb8182998a3f88;p=thirdparty%2Fpostgresql.git

Optimise non-native 128-bit addition in int128.h.

On platforms without native 128-bit integer support, simplify the test
for carry in int128_add_uint64() by noting that the low-part addition
is unsigned integer arithmetic, which is just modular arithmetic.
Therefore the test for carry can simply be written as "new value < old
value" (i.e., a test for modular wrap-around). Adding the 0-or-1 result
of that comparison directly to the high part, instead of using an "if"
statement, makes the code branchless, so that modern compilers produce
the same machine instructions as for native 128-bit addition. The
result is significantly simpler and faster.
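
As a minimal standalone sketch of the wrap-around test described above
(the demo_int128 type and demo_add_uint64() name are hypothetical
stand-ins, not the identifiers used in int128.h, and the real struct's
field order depends on endianness):

```c
#include <stdint.h>

/* Hypothetical stand-in for the non-native two-word 128-bit struct. */
typedef struct
{
	uint64_t	lo;			/* least significant 64 bits, unsigned */
	int64_t		hi;			/* most significant 64 bits, signed */
} demo_int128;

/* Add an unsigned 64-bit value, propagating the carry without a branch. */
static void
demo_add_uint64(demo_int128 *x, uint64_t v)
{
	uint64_t	oldlo = x->lo;

	x->lo += v;					/* unsigned addition wraps modulo 2^64 */
	x->hi += (x->lo < oldlo);	/* comparison is 1 if it wrapped, else 0 */
}
```

For example, adding 1 to a value whose low part is UINT64_MAX wraps the
low part to 0, so the comparison yields 1 and the high part is
incremented.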

Similarly, the test for carry in int128_add_int64() can be written in
much the same way, but with an extra term, the sign of the value being
added (v >> 63, which is 0 or -1), to account for the sign extension
of a negative value into the high part. Again, on modern compilers this
leads to branchless code, often identical to the native 128-bit integer
addition machine code.
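
The signed case can be sketched the same way. The names demo2_int128
and demo2_add_int64() are hypothetical, and the sketch assumes that
right-shifting a negative int64_t is an arithmetic shift (this is
implementation-defined in C, though it is what common compilers do):

```c
#include <stdint.h>

/* Hypothetical stand-in for the non-native two-word 128-bit struct. */
typedef struct
{
	uint64_t	lo;			/* least significant 64 bits, unsigned */
	int64_t		hi;			/* most significant 64 bits, signed */
} demo2_int128;

/* Add a signed 64-bit value without branching. */
static void
demo2_add_int64(demo2_int128 *x, int64_t v)
{
	uint64_t	oldlo = x->lo;

	x->lo += (uint64_t) v;		/* wraps modulo 2^64 */

	/*
	 * (x->lo < oldlo) is the carry out of the low part (0 or 1), and
	 * (v >> 63) is the sign extension of v (0 or -1, assuming an
	 * arithmetic shift).  For negative v the two cancel when the low part
	 * wrapped; otherwise the high part is decremented.
	 */
	x->hi += (x->lo < oldlo) + (v >> 63);
}
```

For example, adding -1 to zero leaves the low part at UINT64_MAX with
no wrap-around, so only the sign term applies and the high part becomes
-1, which is the two's complement representation of -1 in 128 bits.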

Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Discussion: https://postgr.es/m/CAEZATCWgBMc9ZwKMYqQpaQz2X6gaamYRB+RnMsUNcdMcL2Mj_w@mail.gmail.com
---

diff --git a/src/include/common/int128.h b/src/include/common/int128.h
index 8c300e56d9a..a84e5ca25f0 100644
--- a/src/include/common/int128.h
+++ b/src/include/common/int128.h
@@ -68,17 +68,17 @@ int128_add_uint64(INT128 *i128, uint64 v)
 #else
 	/*
 	 * First add the value to the .lo part, then check to see if a carry needs
-	 * to be propagated into the .hi part.  A carry is needed if both inputs
-	 * have high bits set, or if just one input has high bit set while the new
-	 * .lo part doesn't.  Remember that .lo part is unsigned; we cast to
-	 * signed here just as a cheap way to check the high bit.
+	 * to be propagated into the .hi part.  Since this is unsigned integer
+	 * arithmetic, which is just modular arithmetic, a carry is needed if the
+	 * new .lo part is less than the old .lo part (i.e., if modular
+	 * wrap-around occurred).  Writing this in the form below, rather than
+	 * using an "if" statement causes modern compilers to produce branchless
+	 * machine code identical to the native code.
 	 */
 	uint64		oldlo = i128->lo;
 
 	i128->lo += v;
-	if (((int64) v < 0 && (int64) oldlo < 0) ||
-		(((int64) v < 0 || (int64) oldlo < 0) && (int64) i128->lo >= 0))
-		i128->hi++;
+	i128->hi += (i128->lo < oldlo);
 #endif
 }
 
@@ -93,23 +93,19 @@ int128_add_int64(INT128 *i128, int64 v)
 #else
 	/*
 	 * This is much like the above except that the carry logic differs for
-	 * negative v.  Ordinarily we'd need to subtract 1 from the .hi part
-	 * (corresponding to adding the sign-extended bits of v to it); but if
-	 * there is a carry out of the .lo part, that cancels and we do nothing.
+	 * negative v -- we need to subtract 1 from the .hi part if the new .lo
+	 * value is greater than the old .lo value.  That can be achieved without
+	 * any branching by adding the sign bit from v (v >> 63 = 0 or -1) to the
+	 * previous result (for negative v, if the new .lo value is less than the
+	 * old .lo value, the two terms cancel and we leave the .hi part
+	 * unchanged, otherwise we subtract 1 from the .hi part).  With modern
+	 * compilers this often produces machine code identical to the native
+	 * code.
 	 */
 	uint64		oldlo = i128->lo;
 
 	i128->lo += v;
-	if (v >= 0)
-	{
-		if ((int64) oldlo < 0 && (int64) i128->lo >= 0)
-			i128->hi++;
-	}
-	else
-	{
-		if (!((int64) oldlo < 0 || (int64) i128->lo >= 0))
-			i128->hi--;
-	}
+	i128->hi += (i128->lo < oldlo) + (v >> 63);
 #endif
 }