Dickinson's post at <http://bugs.python.org/file10357/msum4.py>.
See those links for more details, proofs and other references.
- Note 1: IEEE 754R floating point semantics are assumed,
- but the current implementation does not re-establish special
- value semantics across iterations (i.e. handling -Inf + Inf).
+ Note 1: IEEE 754 floating-point semantics with a rounding mode of
+ roundTiesToEven are assumed.
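As an illustrative aside (not part of the patch), roundTiesToEven can be observed directly from Python, whose floats are IEEE 754 binary64:

```python
# roundTiesToEven: a sum landing exactly halfway between two representable
# doubles rounds to the neighbour whose last significand bit is zero.

# 1.0 + 2**-53 is exactly halfway between 1.0 and 1.0 + 2**-52;
# the tie goes to 1.0, whose significand is "even" (ends in 0):
print(1.0 + 2**-53 == 1.0)                      # True

# (1.0 + 2**-52) + 2**-53 is halfway between 1 + 2**-52 and 1 + 2**-51;
# here the tie rounds *up*, because 1 + 2**-51 is the even neighbour:
print((1.0 + 2**-52) + 2**-53 == 1.0 + 2**-51)  # True
```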
- Note 2: No provision is made for intermediate overflow handling;
- therefore, sum([1e+308, 1e-308, 1e+308]) returns 1e+308 while
- sum([1e+308, 1e+308, 1e-308]) raises an OverflowError due to the
+ Note 2: No provision is made for intermediate overflow handling;
+ therefore, fsum([1e+308, -1e+308, 1e+308]) returns 1e+308 while
+ fsum([1e+308, 1e+308, -1e+308]) raises an OverflowError due to the
overflow of the first partial sum.
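For illustration (assuming CPython's `math.fsum`), the order dependence described in Note 2 can be checked directly:

```python
import math

# Cancellation occurs before any partial sum overflows:
print(math.fsum([1e+308, -1e+308, 1e+308]))   # 1e+308

# Here the first partial sum, 1e+308 + 1e+308, overflows to infinity:
try:
    math.fsum([1e+308, 1e+308, -1e+308])
except OverflowError as exc:
    print("OverflowError:", exc)
```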
- Note 3: The intermediate values lo, yr, and hi are declared volatile so
- aggressive compilers won't algebraically reduce lo to always be exactly 0.0.
- Also, the volatile declaration forces the values to be stored in memory as
- regular doubles instead of extended long precision (80-bit) values. This
- prevents double rounding because any addition or subtraction of two doubles
- can be resolved exactly into double-sized hi and lo values. As long as the
- hi value gets forced into a double before yr and lo are computed, the extra
- bits in downstream extended precision operations (x87 for example) will be
- exactly zero and therefore can be losslessly stored back into a double,
- thereby preventing double rounding.
-
- Note 4: A similar implementation is in Modules/cmathmodule.c.
- Be sure to update both when making changes.
-
- Note 5: The signature of math.fsum() differs from builtins.sum()
+ Note 3: The algorithm has two potential sources of fragility. First, C
+ permits arithmetic operations on `double`s to be performed in an
+ intermediate format whose range and precision may be greater than those of
+ `double` (see for example C99 §5.2.4.2.2, paragraph 8). This can happen for
+ example on machines using the now largely historical x87 FPUs. In this case,
+ `fsum` can produce incorrect results. If `FLT_EVAL_METHOD` is `0` or `1`, or
+ `FLT_EVAL_METHOD` is `2` and `long double` is identical to `double`, then we
+ should be safe from this source of errors. Second, an aggressively
+ optimizing compiler can re-associate operations so that (for example) the
+ statement `yr = hi - x;` is treated as `yr = (x + y) - x` and then
+ re-associated as `yr = y + (x - x)`, giving `y = yr` and `lo = 0.0`. That
+ re-association would be in violation of the C standard, and should not occur
+ except possibly in the presence of unsafe optimizations (e.g., -ffast-math,
+ -fassociative-math). Such optimizations should be avoided for this module.
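The hi/yr/lo step that this note protects can be illustrated in Python (whose floats are IEEE 754 binary64 and are not evaluated in an extended format). Under correct semantics, `lo` captures the exact rounding error of `hi = x + y`; the re-association described above would instead force `lo` to 0.0:

```python
x, y = 1.0, 1e-16      # |x| >= |y|, as the algorithm arranges by swapping
hi = x + y             # 1e-16 is below half an ulp of 1.0, so hi rounds to 1.0
yr = hi - x            # exactly 0.0 when evaluated as written
lo = y - yr            # recovers the lost 1e-16; hi + lo == x + y exactly
print(hi, lo)          # 1.0 1e-16
```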
+
+ Note 4: The signature of math.fsum() differs from builtins.sum()
because the start argument doesn't make sense in the context of
accurate summation. Since the partials table is collapsed before
returning a result, sum(seq2, start=sum(seq1)) may not equal the
accurate result returned by sum(itertools.chain(seq1, seq2)).
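A small illustration of why collapsing the partials early loses information (the sequences here are just example data):

```python
import itertools
import math

seq1 = [1.0, 1e100]
seq2 = [1.0, -1e100]

# Summing everything in one pass keeps both 1.0 contributions exactly:
print(math.fsum(itertools.chain(seq1, seq2)))   # 2.0

# Collapsing seq1 first rounds 1e100 + 1.0 down to 1e100, losing one 1.0:
print(math.fsum([math.fsum(seq1)] + seq2))      # 1.0
```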
Py_ssize_t i, j, n = 0, m = NUM_PARTIALS;
double x, y, t, ps[NUM_PARTIALS], *p = ps;
double xsave, special_sum = 0.0, inf_sum = 0.0;
- volatile double hi, yr, lo;
+ double hi, yr, lo;
iter = PyObject_GetIter(seq);
if (iter == NULL)
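For orientation, the partials strategy set up by these declarations can be sketched in Python, following the msum recipe cited at the top (this is a simplification, not the C code verbatim: the C implementation additionally handles special values, intermediate overflow, and correct rounding of the final collapse):

```python
def msum_sketch(iterable):
    """Full-precision summation via a list of nonoverlapping partials."""
    partials = []                  # sorted, nonoverlapping partial sums
    for x in iterable:
        i = 0
        for y in partials:
            if abs(x) < abs(y):
                x, y = y, x        # ensure |x| >= |y| for the two-sum step
            hi = x + y
            lo = y - (hi - x)      # exact rounding error of hi = x + y
            if lo:
                partials[i] = lo   # keep the nonzero low-order part
                i += 1
            x = hi                 # carry the high-order part upward
        partials[i:] = [x]         # x is now the largest partial
    return sum(partials, 0.0)
```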