From: Willy Tarreau
Date: Thu, 4 Jul 2019 09:48:16 +0000 (+0200)
Subject: MINOR: pools: always pre-initialize allocated memory outside of the lock
X-Git-Tag: v2.1-dev1~20
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=828675421e32d136c35d27d26e4ee5d24d5b440c;p=thirdparty%2Fhaproxy.git

MINOR: pools: always pre-initialize allocated memory outside of the lock

When calling mmap(), in general the system gives us a page but does not
really allocate it until we first dereference it. And it turns out that
this time is much longer than the time to perform the mmap() syscall
itself. Unfortunately, when running with memory debugging enabled, we
mmap/munmap() each object, resulting in lots of such calls and high
contention on the allocator. And having the first accesses to the page
performed under the pool lock is extremely damaging to other threads.

The simple fact of writing a 0 at the beginning of the page after
allocating it, and of placing the POOL_LINK pointer, outside of the
lock is enough to boost the performance by 8x in debug mode and to
save the watchdog from triggering on lock contention. This is what
this patch does.
---

diff --git a/include/common/memory.h b/include/common/memory.h
index 8fc7cc842a..5f96ac079b 100644
--- a/include/common/memory.h
+++ b/include/common/memory.h
@@ -421,6 +421,10 @@ static inline void *pool_alloc_area(size_t size)
 	ret = mmap(NULL, (size + 4095) & -4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
 	if (ret == MAP_FAILED)
 		return NULL;
+	/* let's dereference the page before returning so that the real
+	 * allocation in the system is performed without holding the lock.
+	 */
+	*(int *)ret = 0;
 	if (pad >= sizeof(void *))
 		*(void **)(ret + pad - sizeof(void *)) = ret + pad;
 	return ret + pad;
diff --git a/src/memory.c b/src/memory.c
index 8fed2f4e94..a858a09834 100644
--- a/src/memory.c
+++ b/src/memory.c
@@ -337,6 +337,14 @@ void *__pool_refill_alloc(struct pool_head *pool, unsigned int avail)
 	HA_SPIN_UNLOCK(POOL_LOCK, &pool->lock);
 	ptr = pool_alloc_area(pool->size + POOL_EXTRA);
+#ifdef DEBUG_MEMORY_POOLS
+	/* keep track of where the element was allocated from. This
+	 * is done out of the lock so that the system really allocates
+	 * the data without harming other threads waiting on the lock.
+	 */
+	if (ptr)
+		*POOL_LINK(pool, ptr) = (void *)pool;
+#endif
 	HA_SPIN_LOCK(POOL_LOCK, &pool->lock);
 	if (!ptr) {
 		pool->failed++;
@@ -355,10 +363,6 @@ void *__pool_refill_alloc(struct pool_head *pool, unsigned int avail)
 		pool->free_list = ptr;
 	}
 	pool->used++;
-#ifdef DEBUG_MEMORY_POOLS
-	/* keep track of where the element was allocated from */
-	*POOL_LINK(pool, ptr) = (void *)pool;
-#endif
 	return ptr;
 }

 void *pool_refill_alloc(struct pool_head *pool, unsigned int avail)