From: Willy Tarreau
Date: Fri, 15 Jul 2022 10:48:58 +0000 (+0200)
Subject: BUG/MINOR: debug: enter ha_panic() only once
X-Git-Tag: v2.7-dev2~50
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=55433f9b344b430f3f8ec91106b9d8772168eafc;p=thirdparty%2Fhaproxy.git

BUG/MINOR: debug: enter ha_panic() only once

Some panic dumps are mangled or truncated due to the watchdog firing at
the same time on multiple threads and calling ha_panic() simultaneously.
What may happen in this case is that the second one waits for the first
one to finish, but as soon as it's done the second one resets the buffer
and dumps again, sometimes resetting the first one's dump. Also the first
one's abort() may trigger while the second one is currently dumping,
resulting in a full dump followed by a truncated one, leading to
confusion. Sometimes some lines appear in the middle of a dump as well.

It doesn't happen often and is easier to trigger by causing massive
deadlocks.

There's no reason for the process to survive a panic, so we can safely
add a counter and do nothing on subsequent calls. Ideally we'd wait there
forever, but as this may happen inside a signal handler (e.g. watchdog),
it doesn't always work, so the easiest thing to do is to return so that
the thread is interrupted as soon as possible and brought to the debug
handler to be dumped.

This should be backported, at least to 2.6 and possibly to older versions
as well.
---

diff --git a/src/debug.c b/src/debug.c
index 74249ff513..1dd446f78e 100644
--- a/src/debug.c
+++ b/src/debug.c
@@ -55,6 +55,7 @@
  * dump + 1. Only used when USE_THREAD_DUMP is set.
  */
 volatile unsigned int thread_dump_state = 0;
+unsigned int panic_started = 0;
 unsigned int debug_commands_issued = 0;
 
 /* dumps a backtrace of the current thread that is appended to buffer .
@@ -337,6 +338,14 @@ static int debug_parse_cli_show_libs(char **args, char *payload, struct appctx *
 /* dumps a state of all threads into the trash and on fd #2, then aborts. */
 void ha_panic()
 {
+	if (HA_ATOMIC_FETCH_ADD(&panic_started, 1) != 0) {
+		/* a panic dump is already in progress, let's not disturb it,
+		 * we'll be called via signal DEBUGSIG. By returning we may be
+		 * able to leave a current signal handler (e.g. WDT) so that
+		 * this will ensure more reliable signal delivery.
+		 */
+		return;
+	}
 	chunk_reset(&trash);
 	chunk_appendf(&trash, "Thread %u is about to kill the process.\n", tid + 1);
 	ha_thread_dump_all_to_trash();
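
For illustration only, here is a minimal standalone sketch of the
"enter only once" guard described in the commit message. It is not
HAProxy code: it assumes C11 atomic_fetch_add() as a stand-in for
HA_ATOMIC_FETCH_ADD() (both return the value held before the increment),
and the names panic_once(), dumps_performed and run_concurrent_panics()
are hypothetical. The dump and abort() are replaced by a counter so that
the effect can be observed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define MAX_THREADS 64

/* Hypothetical stand-ins for illustration, not HAProxy symbols. */
static atomic_uint panic_started;   /* incremented on every call */
static atomic_uint dumps_performed; /* incremented only by the winner */

/* Returns 1 if the caller was first and performed the "dump", 0 otherwise. */
static int panic_once(void)
{
	/* atomic_fetch_add() returns the previous value, so exactly one
	 * caller observes 0; all later callers return immediately.
	 */
	if (atomic_fetch_add(&panic_started, 1) != 0)
		return 0;

	/* the real function would dump all threads and abort() here */
	atomic_fetch_add(&dumps_performed, 1);
	return 1;
}

static void *worker(void *arg)
{
	(void)arg;
	panic_once();
	return NULL;
}

/* Starts <nthreads> threads that all call panic_once() concurrently and
 * returns how many of them actually performed the dump.
 */
unsigned int run_concurrent_panics(int nthreads)
{
	pthread_t th[MAX_THREADS];
	int i, started = 0;

	assert(nthreads >= 0 && nthreads <= MAX_THREADS);

	/* reset the state so that the function can be called repeatedly */
	atomic_store(&panic_started, 0);
	atomic_store(&dumps_performed, 0);

	for (i = 0; i < nthreads; i++) {
		if (pthread_create(&th[i], NULL, worker, NULL) != 0)
			break;
		started++;
	}
	for (i = 0; i < started; i++)
		pthread_join(th[i], NULL);

	return atomic_load(&dumps_performed);
}
```

Calling run_concurrent_panics(8) from a main() built with -pthread is
expected to return 1 regardless of how the threads are scheduled, since
only the caller that sees the counter at 0 proceeds. Returning instead
of blocking mirrors the choice made in the patch, where the losing
caller may be running inside a signal handler.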