From: Willy Tarreau
Date: Wed, 4 Aug 2021 09:44:17 +0000 (+0200)
Subject: MEDIUM: threads: add a stronger thread_isolate_full() call
X-Git-Tag: v2.5-dev4~75
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=88d1c5d3fbe36b29fb50b699ef4478d275d736ae;p=thirdparty%2Fhaproxy.git

The current principle of running under isolation was made to access
sensitive data while being certain that no other thread was using them
in parallel, without necessarily having to place locks everywhere. The
main use cases are "show sess" and "show fd" which run over long chains
of pointers.

The thread_isolate() call relies on the "harmless" bit that indicates
for a given thread that it's not currently doing such sensitive things,
which is advertised using thread_harmless_now() and ends using
thread_harmless_end(); the latter also waits for possibly concurrent
threads to complete their work if they took this opportunity to start
something tricky.

As some system calls were notoriously slow (e.g. mmap()), a bunch of
thread_harmless_now() / thread_harmless_end() calls were placed around
them to let waiting threads do their work while such other threads were
not able to modify memory contents.

But this is not sufficient for performing memory modifications. One
such example is the server deletion code. By modifying memory, it not
only requires that other threads are not playing with it, but also that
they are not about to touch it. The fact that a pool_alloc() or
pool_free() on some structure may call thread_harmless_now() and let
another thread start to release the same object's memory is not
acceptable.

This patch introduces the concept of "idle threads". Threads entering
the polling loop are idle, as well as those that are waiting for all
others to become idle via the new function thread_isolate_full(). Once
thread_isolate_full() is granted, the thread is not idle anymore, and
it is released using thread_release() just like regular isolation. Its
users have to keep in mind that across this call nothing is granted, as
another thread might have performed shared memory modifications. But
such users are extremely rare and are actually expecting this from
their peers as well.
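For illustration only (not part of the patch), here is a sketch of how a
caller such as the server deletion code could use the new call. The
structure and function below are hypothetical; only thread_isolate_full()
and thread_release() come from this patch:

    #include <stdlib.h>
    #include <haproxy/thread.h>

    /* hypothetical object that other threads may traverse without locks */
    struct shared_obj {
            struct shared_obj *next;
    };

    /* Sketch: unlink and free <obj> from a list that other threads walk
     * lock-free. thread_isolate_full() only returns once every other
     * thread is parked in the poller (harmless AND idle), so none of them
     * can be in the middle of allocating or releasing the same object
     * while we free it.
     */
    static void delete_shared_obj(struct shared_obj **head, struct shared_obj *obj)
    {
            struct shared_obj **prev;

            thread_isolate_full();  /* wait for all other threads to become idle */

            for (prev = head; *prev; prev = &(*prev)->next) {
                    if (*prev == obj) {
                            *prev = obj->next;  /* unlink while nobody else runs */
                            break;
                    }
            }
            free(obj);

            thread_release();       /* let the other threads resume */
    }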
Note that in case of backport, this patch depends on the previous patch:
    MINOR: threads: make thread_release() not wait for other ones to complete
---
diff --git a/include/haproxy/thread.h b/include/haproxy/thread.h
index 0ca773cf60..51c7278489 100644
--- a/include/haproxy/thread.h
+++ b/include/haproxy/thread.h
@@ -59,6 +59,7 @@ extern int thread_cpus_enabled_at_boot;
  */
 enum { all_threads_mask = 1UL };
 enum { threads_harmless_mask = 0 };
+enum { threads_idle_mask = 0 };
 enum { threads_sync_mask = 0 };
 enum { threads_want_rdv_mask = 0 };
 enum { tid_bit = 1UL };
@@ -119,6 +120,14 @@ static inline void ha_tkillall(int sig)
 	raise(sig);
 }
 
+static inline void thread_idle_now()
+{
+}
+
+static inline void thread_idle_end()
+{
+}
+
 static inline void thread_harmless_now()
 {
 }
@@ -131,6 +140,10 @@ static inline void thread_isolate()
 {
 }
 
+static inline void thread_isolate_full()
+{
+}
+
 static inline void thread_release()
 {
 }
@@ -155,6 +168,7 @@ static inline unsigned long thread_isolated()
 void thread_harmless_till_end();
 void thread_isolate();
+void thread_isolate_full();
 void thread_release();
 void thread_sync_release();
 void ha_tkill(unsigned int thr, int sig);
@@ -164,6 +178,7 @@ void ha_rwlock_init(HA_RWLOCK_T *l);
 
 extern volatile unsigned long all_threads_mask;
 extern volatile unsigned long threads_harmless_mask;
+extern volatile unsigned long threads_idle_mask;
 extern volatile unsigned long threads_sync_mask;
 extern volatile unsigned long threads_want_rdv_mask;
 extern THREAD_LOCAL unsigned long tid_bit; /* The bit corresponding to the thread id */
@@ -245,6 +260,36 @@ static inline void ha_thread_relax(void)
 #endif
 }
 
+/* Marks the thread as idle, which means that not only it's not doing anything
+ * dangerous, but in addition it has not started anything sensitive either.
+ * This essentially means that the thread currently is in the poller, thus
+ * outside of any execution block. Needs to be terminated using
+ * thread_idle_end(). This is needed to release a concurrent call to
+ * thread_isolate_full().
+ */
+static inline void thread_idle_now()
+{
+	HA_ATOMIC_OR(&threads_idle_mask, tid_bit);
+}
+
+/* Ends the harmless period started by thread_idle_now(), i.e. the thread is
+ * about to restart engaging in sensitive operations. This must not be done on
+ * a thread marked harmless, as it could cause a deadlock between another
+ * thread waiting for idle again and thread_harmless_end() in this thread.
+ *
+ * The right sequence is thus:
+ *    thread_idle_now();
+ *    thread_harmless_now();
+ *    poll();
+ *    thread_harmless_end();
+ *    thread_idle_end();
+ */
+static inline void thread_idle_end()
+{
+	HA_ATOMIC_AND(&threads_idle_mask, ~tid_bit);
+}
+
+
 /* Marks the thread as harmless. Note: this must be true, i.e. the thread must
  * not be touching any unprotected shared resource during this period. Usually
  * this is called before poll(), but it may also be placed around very slow
diff --git a/src/ev_epoll.c b/src/ev_epoll.c
index 1de2e0fc4c..35a8a9c79e 100644
--- a/src/ev_epoll.c
+++ b/src/ev_epoll.c
@@ -183,6 +183,7 @@ static void _do_poll(struct poller *p, int exp, int wake)
 		_update_fd(fd);
 	}
 
+	thread_idle_now();
 	thread_harmless_now();
 
 	/* now let's wait for polled events */
@@ -210,6 +211,8 @@ static void _do_poll(struct poller *p, int exp, int wake)
 	tv_leaving_poll(wait_time, status);
 
 	thread_harmless_end();
+	thread_idle_end();
+
 	if (sleeping_thread_mask & tid_bit)
 		_HA_ATOMIC_AND(&sleeping_thread_mask, ~tid_bit);
diff --git a/src/ev_evports.c b/src/ev_evports.c
index c7bf4f6f70..a61392d54a 100644
--- a/src/ev_evports.c
+++ b/src/ev_evports.c
@@ -151,6 +151,7 @@ static void _do_poll(struct poller *p, int exp, int wake)
 		_update_fd(fd);
 	}
 
+	thread_idle_now();
 	thread_harmless_now();
 
 	/*
@@ -204,6 +205,8 @@ static void _do_poll(struct poller *p, int exp, int wake)
 	tv_leaving_poll(wait_time, nevlist);
 
 	thread_harmless_end();
+	thread_idle_end();
+
 	if (sleeping_thread_mask & tid_bit)
 		_HA_ATOMIC_AND(&sleeping_thread_mask, ~tid_bit);
diff --git a/src/ev_kqueue.c b/src/ev_kqueue.c
index ce71484de3..5ccaf159aa 100644
--- a/src/ev_kqueue.c
+++ b/src/ev_kqueue.c
@@ -125,6 +125,7 @@ static void _do_poll(struct poller *p, int exp, int wake)
 		changes = _update_fd(fd, changes);
 	}
 
+	thread_idle_now();
 	thread_harmless_now();
 
 	if (changes) {
@@ -176,6 +177,8 @@ static void _do_poll(struct poller *p, int exp, int wake)
 	tv_leaving_poll(wait_time, status);
 
 	thread_harmless_end();
+	thread_idle_end();
+
 	if (sleeping_thread_mask & tid_bit)
 		_HA_ATOMIC_AND(&sleeping_thread_mask, ~tid_bit);
diff --git a/src/ev_poll.c b/src/ev_poll.c
index bb9d8f87a8..627f4bf8e2 100644
--- a/src/ev_poll.c
+++ b/src/ev_poll.c
@@ -164,6 +164,7 @@ static void _do_poll(struct poller *p, int exp, int wake)
 			break;
 	} while (!_HA_ATOMIC_CAS(&maxfd, &old_maxfd, new_maxfd));
 
+	thread_idle_now();
 	thread_harmless_now();
 
 	fd_nbupdt = 0;
@@ -207,6 +208,8 @@ static void _do_poll(struct poller *p, int exp, int wake)
 	tv_leaving_poll(wait_time, status);
 
 	thread_harmless_end();
+	thread_idle_end();
+
 	if (sleeping_thread_mask & tid_bit)
 		_HA_ATOMIC_AND(&sleeping_thread_mask, ~tid_bit);
diff --git a/src/ev_select.c b/src/ev_select.c
index c143bd9168..12fdaff39c 100644
--- a/src/ev_select.c
+++ b/src/ev_select.c
@@ -156,6 +156,7 @@ static void _do_poll(struct poller *p, int exp, int wake)
 			break;
 	} while (!_HA_ATOMIC_CAS(&maxfd, &old_maxfd, new_maxfd));
 
+	thread_idle_now();
 	thread_harmless_now();
 
 	fd_nbupdt = 0;
@@ -182,6 +183,8 @@ static void _do_poll(struct poller *p, int exp, int wake)
 	tv_leaving_poll(delta_ms, status);
 
 	thread_harmless_end();
+	thread_idle_end();
+
 	if (sleeping_thread_mask & tid_bit)
 		_HA_ATOMIC_AND(&sleeping_thread_mask, ~tid_bit);
diff --git a/src/thread.c b/src/thread.c
index d1f88a3e78..44375e62f3 100644
--- a/src/thread.c
+++ b/src/thread.c
@@ -37,6 +37,7 @@ THREAD_LOCAL struct thread_info *ti = &ha_thread_info[0];
 
 volatile unsigned long threads_want_rdv_mask __read_mostly = 0;
 volatile unsigned long threads_harmless_mask = 0;
+volatile unsigned long threads_idle_mask = 0;
 volatile unsigned long threads_sync_mask = 0;
 volatile unsigned long all_threads_mask __read_mostly = 1; // nbthread 1 assumed by default
 THREAD_LOCAL unsigned int tid = 0;
@@ -92,6 +93,50 @@ void thread_isolate()
 	 */
 }
 
+/* Isolates the current thread: request the ability to work while all other
+ * threads are idle, as defined by thread_idle_now(). It only returns once
+ * all of them are both harmless and idle, with the current thread's bit in
+ * threads_harmless_mask and idle_mask cleared. Needs to be completed using
+ * thread_release(). By doing so the thread also engages in being safe against
+ * any actions that other threads might be about to start under the same
+ * conditions. This specifically targets destruction of any internal structure,
+ * which implies that the current thread may not hold references to any object.
+ *
+ * Note that a concurrent thread_isolate() will usually win against
+ * thread_isolate_full() as it doesn't consider the idle_mask, allowing it to
+ * get back to the poller or any other fully idle location, that will
+ * ultimately release this one.
+ */
+void thread_isolate_full()
+{
+	unsigned long old;
+
+	_HA_ATOMIC_OR(&threads_idle_mask, tid_bit);
+	_HA_ATOMIC_OR(&threads_harmless_mask, tid_bit);
+	__ha_barrier_atomic_store();
+	_HA_ATOMIC_OR(&threads_want_rdv_mask, tid_bit);
+
+	/* wait for all threads to become harmless */
+	old = threads_harmless_mask;
+	while (1) {
+		unsigned long idle = _HA_ATOMIC_LOAD(&threads_idle_mask);
+
+		if (unlikely((old & all_threads_mask) != all_threads_mask))
+			old = _HA_ATOMIC_LOAD(&threads_harmless_mask);
+		else if ((idle & all_threads_mask) == all_threads_mask &&
+			 _HA_ATOMIC_CAS(&threads_harmless_mask, &old, old & ~tid_bit))
+			break;
+
+		ha_thread_relax();
+	}
+
+	/* we're not idle anymore at this point. Other threads waiting on this
+	 * condition will need to wait until our next pass to the poller, or
+	 * our next call to thread_isolate_full().
+	 */
+	_HA_ATOMIC_AND(&threads_idle_mask, ~tid_bit);
+}
+
 /* Cancels the effect of thread_isolate() by releasing the current thread's bit
  * in threads_want_rdv_mask. This immediately allows other threads to expect to
  * be executed, though they will first have to wait for this thread to become