From: Willy Tarreau Date: Fri, 25 Nov 2016 10:55:10 +0000 (+0100) Subject: BUG/MINOR: freq-ctr: make swrate_add() support larger values X-Git-Tag: v1.7.0~6 X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=3758581e197dd7b390fb3da94c08f34e0d319c07;p=thirdparty%2Fhaproxy.git BUG/MINOR: freq-ctr: make swrate_add() support larger values Reinhard Vicinus reported that the reported average response times cannot be larger than 16s due to the double multiply being performed by swrate_add() which causes an overflow very quickly. Indeed, with N=512, the highest average value is 16448. One solution proposed by Reinhard is to turn to long long, but this involves 64x64 multiplies and 64->32 divides, which are extremely expensive on 32-bit platforms. There is in fact another way to avoid the overflow without using larger integers, it consists in avoiding the multiply using the fact that x*(n-1)/N = x-(x/N). Now it becomes possible to store average values as large as 8.4 millions, which is around 2h18mn. Interestingly, this improvement also makes the code cheaper to execute both on 32 and on 64 bit platforms : Before : 00000000 : 0: 8b 54 24 04 mov 0x4(%esp),%edx 4: 8b 0a mov (%edx),%ecx 6: 89 c8 mov %ecx,%eax 8: c1 e0 09 shl $0x9,%eax b: 29 c8 sub %ecx,%eax d: 8b 4c 24 0c mov 0xc(%esp),%ecx 11: c1 e8 09 shr $0x9,%eax 14: 01 c8 add %ecx,%eax 16: 89 02 mov %eax,(%edx) After : 00000020 : 20: 8b 4c 24 04 mov 0x4(%esp),%ecx 24: 8b 44 24 0c mov 0xc(%esp),%eax 28: 8b 11 mov (%ecx),%edx 2a: 01 d0 add %edx,%eax 2c: 81 c2 ff 01 00 00 add $0x1ff,%edx 32: c1 ea 09 shr $0x9,%edx 35: 29 d0 sub %edx,%eax 37: 89 01 mov %eax,(%ecx) This fix may be backported to 1.6. --- diff --git a/include/proto/freq_ctr.h b/include/proto/freq_ctr.h index 65388b1f0d..70b295e8a9 100644 --- a/include/proto/freq_ctr.h +++ b/include/proto/freq_ctr.h @@ -182,7 +182,19 @@ unsigned int freq_ctr_remain_period(struct freq_ctr_period *ctr, unsigned int pe * * So basically by summing values and applying the last result an (N-1)/N factor * we just get N times the values over the long term, so we can recover the - * constant value V by dividing by N. + * constant value V by dividing by N. In order to limit the impact of integer + * overflows, we'll use this equivalence which saves us one multiply : + * + * N - 1 1 x0 + * x1 = x0 * ------- = x0 * ( 1 - --- ) = x0 - ---- + * N N N + * + * And given that x0 is discrete here we'll have to saturate the values before + * performing the divide, so the value insertion will become : + * + * x0 + N - 1 + * x1 = x0 - ------------ + * N * * A value added at the entry of the sliding window of N values will thus be * reduced to 1/e or 36.7% after N terms have been added. After a second batch, @@ -220,7 +232,7 @@ unsigned int freq_ctr_remain_period(struct freq_ctr_period *ctr, unsigned int pe */ static inline unsigned int swrate_add(unsigned int *sum, unsigned int n, unsigned int v) { - return *sum = *sum * (n - 1) / n + v; + return *sum = *sum - (*sum + n - 1) / n + v; } /* Returns the average sample value for the sum over a sliding window of