From: Willy Tarreau <w@1wt.eu>
Date: Sun, 17 Mar 2024 09:20:56 +0000 (+0100)
Subject: MEDIUM: ring: significant boost in the loop by checking the ring queue ptr first
X-Git-Tag: v3.0-dev6~11
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=30a659c3554ce45ae2c280ba02d53a4f6aba17b3;p=thirdparty%2Fhaproxy.git

MEDIUM: ring: significant boost in the loop by checking the ring queue ptr first

By doing that and placing the cpu_relax at the right places, the ARM
reaches 6.0M/s on 80 threads. On x86_64, at 3C6T the EPYC sees a small
increase from 4.45M to 4.57M but at 24C48T it sees a drop from 3.82M
to 3.33M due to the write contention hidden behind the CAS that
implements the FETCH_OR(), that we'll address next.
---

diff --git a/src/ring.c b/src/ring.c
index c445a23cc4..74772314a4 100644
--- a/src/ring.c
+++ b/src/ring.c
@@ -272,21 +272,23 @@ ssize_t ring_write(struct ring *ring, size_t maxlen, const struct ist pfx[], siz
 		 * we must detect a new leader ASAP so that the fewest possible
 		 * threads check the tail.
 		 */
-		while ((tail_ofs = HA_ATOMIC_LOAD(tail_ptr)) & RING_TAIL_LOCK) {
-			next_cell = HA_ATOMIC_LOAD(ring_queue_ptr);
-			if (next_cell != &cell)
-				goto wait_for_flush; // another thread arrived, we should go to wait now
-			__ha_cpu_relax_for_read();
-		}
 
 		/* the tail is available again and we're still the leader, try
 		 * again.
 		 */
-		if (HA_ATOMIC_LOAD(ring_queue_ptr) != &cell)
-			goto wait_for_flush; // another thread arrived, we should go to wait now
+		while (1) {
+			next_cell = HA_ATOMIC_LOAD(ring_queue_ptr);
+			if (next_cell != &cell)
+				goto wait_for_flush; // FIXME: another thread arrived, we should go to wait now
+			__ha_cpu_relax_for_read();
+
+			tail_ofs = HA_ATOMIC_FETCH_OR(tail_ptr, RING_TAIL_LOCK);
+			if (!(tail_ofs & RING_TAIL_LOCK))
+				break;
 
+			__ha_cpu_relax_for_read();
+		}
 		/* OK the queue is locked, let's attempt to get the tail lock */
-		tail_ofs = HA_ATOMIC_FETCH_OR(tail_ptr, RING_TAIL_LOCK);
 
 		/* did we get it ? */
 		if (!(tail_ofs & RING_TAIL_LOCK)) {