From 285b6285e799d3480150375cb853d40973d3ab4c Mon Sep 17 00:00:00 2001
From: Yann Ylavic
Date: Mon, 28 Feb 2022 11:56:43 +0000
Subject: [PATCH] ap_regex: Use Thread Local Storage (if efficient) to avoid allocations.
MIME-Version: 1.0
Content-Type: text/plain; charset=utf8
Content-Transfer-Encoding: 8bit

PCRE2 wants an opaque context by providing the API to allocate and free it, so
to minimize these calls we maintain one opaque context per thread (in Thread
Local Storage, TLS) grown as needed, and while at it we do the same for PCRE1
ints vectors. Note that this requires a fast TLS mechanism to be worth it,
which is the case of apr_thread_data_get/set() from/to apr_thread_current()
when APR_HAS_THREAD_LOCAL; otherwise we'll do the allocation and freeing for
each ap_regexec().

The small stack vector is used for PCRE1 && !APR_HAS_THREAD_LOCAL only now.

Follow up to r1897240: APR_HAS_THREAD_LOCAL wants #ifdef instead of #if.

Follow up to r1897240: CHANGES entry.

ap_regex: PCRE needs buffers sized against the number of captures only.

No more (useless), no less (or PCRE will allocate a new buffer by itself to
satisfy the needs), so we should base our buffer size solely on the number
of captures in the regex (determined at compile time from the pattern).

The nmatch provided by the user is used to fill in pmatch only (up to that),
but "our" buffers are sized exactly as needed to avoid oversized allocations
or PCRE allocating by itself.

ap_regex: Follow up to r1897244: Fix pmatch overflow and returned value at limits.

Don't write to pmatch[nlimit:] when ncaps > nmatch, rc should not exceed
nmatch either as before r1897244.

ap_regex: Follow up to r1897240: Fix issues spotted by Rüdiger (thanks!).

#include "apr_thread_proc.h" is enough/needed by util_pcre.c and main.c.
Fix compilation (vector => ovector) for !HAVE_PCRE2 && APR_HAS_THREAD_LOCAL.
Check pcre2_match_data_create() return value for HAVE_PCRE2 && !APR_HAS_THREAD_LOCAL.

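As an illustration of the "one buffer per thread, grown as needed" policy
described above, here is a minimal standalone sketch. It assumes a C11 compiler
with _Thread_local and uses plain realloc() instead of APR pools and thread
data; MIN_SLOTS, next_capacity() and get_scratch() are hypothetical names for
this example only, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical floor, standing in for POSIX_MALLOC_THRESHOLD. */
#define MIN_SLOTS 10

/* One scratch vector per thread, reused across calls. */
static _Thread_local int *tls_vec = NULL;
static _Thread_local size_t tls_slots = 0;

/* Growth policy: double the current capacity; if that is still too
 * small, jump straight to the requested size (but never below MIN_SLOTS).
 */
static size_t next_capacity(size_t current, size_t need)
{
    size_t cap = current * 2;
    if (cap < need) {
        cap = need;
        if (cap < MIN_SLOTS) {
            cap = MIN_SLOTS;
        }
    }
    return cap;
}

/* Return a per-thread vector with room for ncaps captures (3 ints each,
 * as for PCRE1 ovectors), growing it only when it is too small.
 * Returns NULL on allocation failure. This sketch never frees the
 * buffer at thread exit; the patch relies on the thread's pool for that.
 */
static int *get_scratch(size_t ncaps)
{
    if (tls_slots < ncaps) {
        size_t cap = next_capacity(tls_slots, ncaps);
        int *vec = realloc(tls_vec, cap * 3 * sizeof(int));
        if (!vec) {
            return NULL;
        }
        tls_vec = vec;
        tls_slots = cap;
    }
    return tls_vec;
}
```

With this policy, repeated calls from the same thread with small capture counts
hit the same buffer and perform no allocation after the first call.
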
ap_regex: Follow up to r1897240: runtime fallback to alloc/free.

Even though APR_HAS_THREAD_LOCAL is compiled in, ap_regexec() might still be
called by a non apr_thread_t thread, let's fall back to alloc/free in this
case too.

ap_regex: Follow up to r1897240: no ap_thread_current() yet.

ap_regex: Follow up to r1897240: cleanups.

ap_regex: Follow up to r1897240: cleanup PCRE2 match data on exit.

ap_regex: Follow up to r1897240: #if APR_HAS_THREAD_LOCAL, not #ifdef.

core: Efficient ap_thread_current() with APR < 1.8.

#define ap_thread_create, ap_thread_current_create and ap_thread_current to
their apr-1.8+ equivalent if available, or implement them using the compiler's
thread_local mechanism if available, or finally provide stubs otherwise.

#define AP_HAS_THREAD_LOCAL to 1 in the two former cases or 0 otherwise, while
AP_THREAD_LOCAL is defined to the compiler's keyword iff AP_HAS_THREAD_LOCAL.

Replace all apr_thread_create() calls with ap_thread_create() so that httpd
threads can use ap_thread_current()'s pool data as Thread Local Storage.

Bump MMN minor.

* include/httpd.h():
  Define AP_HAS_THREAD_LOCAL, AP_THREAD_LOCAL (eventually), ap_thread_create(),
  ap_thread_current_create() and ap_thread_current().

* server/util.c:
  Implement ap_thread_create(), ap_thread_current_create() and
  ap_thread_current() when APR < 1.8.

* modules/core/mod_watchdog.c, modules/http2/h2_workers.c,
  modules/ssl/mod_ssl_ct.c:
  Use ap_thread_create() instead of apr_thread_create.

* server/main.c:
  Use AP_HAS_THREAD_LOCAL and ap_thread_current_create instead of APR's.

* server/util_pcre.c:
  Use AP_HAS_THREAD_LOCAL and ap_thread_current instead of APR's.

* server/mpm/event/event.c, server/mpm/worker/worker.c,
  server/mpm/prefork/prefork.c:
  Use ap_thread_create() instead of apr_thread_create.
  Create an apr_thread_t/ap_thread_current() for the main child thread usable
  at child_init().

* server/mpm/winnt/child.c:
  Use ap_thread_create() instead of CreateThread().
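
The compiler thread_local fallback described above (wrap the thread's entry
point so it records its own handle before running the caller's function) can
be sketched outside of APR as follows. This is an illustration only, assuming
C11 _Thread_local and POSIX threads; struct thread_handle,
thread_create_tracked() and thread_current() are hypothetical stand-ins for
apr_thread_t, ap_thread_create() and ap_thread_current().

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical per-thread handle, standing in for apr_thread_t. */
struct thread_handle {
    void *(*func)(void *);  /* the caller's real entry point */
    void *data;             /* argument passed to func */
};

/* Set by the trampoline in each tracked thread; stays NULL in threads
 * that were created by other means.
 */
static _Thread_local struct thread_handle *current_handle = NULL;

static struct thread_handle *thread_current(void)
{
    return current_handle;
}

/* Runs in the new thread: record the handle, then call the real function. */
static void *trampoline(void *arg)
{
    struct thread_handle *h = arg;
    current_handle = h;
    return h->func(h->data);
}

/* Create a thread whose thread_current() returns h.
 * The handle must outlive the thread.
 */
static int thread_create_tracked(pthread_t *tid, struct thread_handle *h)
{
    return pthread_create(tid, NULL, trampoline, h);
}

/* Example entry point: reports what thread_current() sees from inside. */
static void *report_current(void *data)
{
    (void)data;
    return thread_current();
}
```

A thread that was not started through the wrapper sees NULL from
thread_current(), which is the case the runtime fallback to alloc/free in
ap_regexec() has to handle.
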
  Create an apr_thread_t/ap_thread_current() for the main child thread usable

Follow up to r1897460: APLOGNOs.

Follow up to r1897460: !APR_HAS_THREADS implies no ap_thread_* either.

core: Follow up to r1897460: Implement and use ap_thread_current_after_fork().

thread_local variables are not (always?) reset on fork(), so we need a way
to set the current_thread to NULL in the child process.

Implement and use ap_thread_current_after_fork() for that.

* include/httpd.h:
  Define ap_thread_current_after_fork().

* server/util.c:
  Implement ap_thread_current_after_fork().

* server/mpm/event/event.c, server/mpm/prefork/prefork.c,
  server/mpm/worker/worker.c:
  Use ap_thread_current_after_fork().

* server/mpm/winnt/child.c:
  Windows processes are not fork()ed and each child runs the main(), so
  ap_thread_current_create() was already called there.

core: Follow up to r1897460: Provide ap_thread_main_create().

Replace ap_thread_current_create() by ap_thread_main_create() which is how
it's used by httpd. The former is now a local helper only to implement the
latter.

This allows consolidating/factorizing common code in the main() of httpd and
the unix MPMs.

ap_regex: Follow up to r1897240: Fetch the ovector _after_ the match.

Possibly(?) pcre2_match() can modify the given pcre2_match_data and invalidate
the old ovector, be safe and fetch it after.

main: Follow up to r1897240: Fix bad log copypasta.

Don't stderr printf the "stat" and "failed" results from the previous
apr_app_initialize() call for an error in ap_thread_main_create().

core: Follow up to r1897240: Opt-out for AP_HAS_THREAD_LOCAL and/or pcre's usage.

If the compiler's thread_local is not efficient enough on some platforms, or
not desired, have a way to disable its usage in httpd (at compile time).

Handle -DAP_NO_THREAD_LOCAL and/or -DAPREG_NO_THREAD_LOCAL as build opt-out for
thread_local usage in httpd globally and/or in ap_regex only (respectively).

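The fork() concern above can be reproduced in isolation. The sketch below
assumes POSIX fork()/waitpid() and C11 _Thread_local; current_name,
current_after_fork() and reset_clears_in_child() are hypothetical names for
this illustration, and the thread-local is a plain string pointer rather than
an apr_thread_t.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for the patch's thread-local current_thread pointer. */
static _Thread_local const char *current_name = NULL;

/* Counterpart of ap_thread_current_after_fork(): forget the parent's value. */
static void current_after_fork(void)
{
    current_name = NULL;
}

/* Set the thread-local in the parent, fork, and let the child reset it.
 * The child typically starts with a copy of the forking thread's value,
 * hence the explicit reset before it creates its own thread context.
 * Returns 0 if the child saw NULL after the reset, 1 if not, and -1 if
 * fork() or waitpid() failed.
 */
static int reset_clears_in_child(void)
{
    pid_t pid;
    int status = 0;

    current_name = "parent-main";
    pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        current_after_fork();
        _exit(current_name == NULL ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status)) {
        return -1;
    }
    return WEXITSTATUS(status);
}
```

The parent's own value is untouched by the child's reset, since each process
has its own copy of the thread-local after the fork.
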
core: Follow up to r1897240: Provide/export ap_thread_current_create()

For completeness, and possibly to ease backport to 2.4.x for MPM winnt.

core: Follow up to r1897240 & r1897691: Syntax.

Add compiled and loaded PCRE version numbers to "httpd -V" output and to
mod_info page.

Forgotten file needed for r1612934.

Minor mmn bump due to r1612940.

Backports: r1897240, r1897241, r1897242, r1897244, r1897248, r1897250,
  r1897260, r1897261, r1897263, r1897386, r1897459, r1897460, r1897461,
  r1897462, r1897472, r1897543, r1897651, r1897680, r1897689, r1897691,
  r1897692, r1612934, r1612940, r1613189
Submitted by: ylavic, rjung, rjung, rjung
Reviewed by: ylavic, rpluem, covener, steffenal, wrowe

GH: closes #289 (https://github.com/apache/httpd/pull/289)

git-svn-id: https://svn.apache.org/repos/asf/httpd/httpd/branches/2.4.x@1898467 13f79535-47bb-0310-9956-ffa450edef68
---
 CHANGES                       |  14 ++-
 include/ap_mmn.h              |   8 +-
 include/ap_regex.h            |  15 +++
 include/httpd.h               |  64 ++++++++++
 modules/core/mod_watchdog.c   |   2 +-
 modules/generators/mod_info.c |   6 +
 modules/http2/h2_workers.c    |   4 +-
 server/main.c                 |  34 +++++-
 server/mpm/event/event.c      |  33 +++--
 server/mpm/prefork/prefork.c  |  23 +++-
 server/mpm/winnt/child.c      |  25 +++-
 server/mpm/worker/worker.c    |  33 +++--
 server/util.c                 | 129 ++++++++++++++++++++
 server/util_pcre.c            | 219 ++++++++++++++++++++++++++--------
 14 files changed, 522 insertions(+), 87 deletions(-)

diff --git a/CHANGES b/CHANGES index 42c30d6106a..3c909b3bc59 100644 --- a/CHANGES +++ b/CHANGES @@ -1,10 +1,16 @@ -*- coding: utf-8 -*- Changes with Apache 2.4.53 - * mod_md) do not interfere with requests to /.well-known/acme-challenge/ - resources if challenge type 'http-01' is not configured for a domain. - Fixes . - [Stefan Eissing] + *) ap_regex: Use Thread Local Storage (TLS) to recycle ap_regexec() buffers + when an efficient TLS implementation is available. [Yann Ylavic] + + *) core, mod_info: Add compiled and loaded PCRE versions to version + number display. 
[Rainer Jung] + + *) mod_md: do not interfere with requests to /.well-known/acme-challenge/ + resources if challenge type 'http-01' is not configured for a domain. + Fixes . + [Stefan Eissing] *) mod_dav: Fix regression when gathering properties which could lead to huge memory consumption proportional to the number of resources. diff --git a/include/ap_mmn.h b/include/ap_mmn.h index 90ff1a86a6f..15b0cdc849c 100644 --- a/include/ap_mmn.h +++ b/include/ap_mmn.h @@ -587,7 +587,11 @@ * 20120211.120 (2.4.51-dev) Add dav_liveprop_elem structure and * dav_get_liveprop_element(). * 20120211.121 (2.4.51-dev) Add ap_post_read_request() - * + * 20120211.122 (2.4.51-dev) Add ap_thread_create(), ap_thread_main_create() + * and ap_thread_current() + * 20120211.123 (2.4.51-dev) Added ap_pcre_version_string(), AP_REG_PCRE_COMPILED + * and AP_REG_PCRE_LOADED to ap_regex.h. + * */ #define MODULE_MAGIC_COOKIE 0x41503234UL /* "AP24" */ @@ -595,7 +599,7 @@ #ifndef MODULE_MAGIC_NUMBER_MAJOR #define MODULE_MAGIC_NUMBER_MAJOR 20120211 #endif -#define MODULE_MAGIC_NUMBER_MINOR 121 /* 0...n */ +#define MODULE_MAGIC_NUMBER_MINOR 123 /* 0...n */ /** * Determine if the server's current MODULE_MAGIC_NUMBER is at least a diff --git a/include/ap_regex.h b/include/ap_regex.h index 7588ad149e1..50d5abaa0fd 100644 --- a/include/ap_regex.h +++ b/include/ap_regex.h @@ -90,6 +90,12 @@ extern "C" { #define AP_REG_DEFAULT (AP_REG_DOTALL|AP_REG_DOLLAR_ENDONLY) +/* Arguments for ap_pcre_version_string */ +enum { + AP_REG_PCRE_COMPILED = 0, /** PCRE version used during program compilation */ + AP_REG_PCRE_LOADED /** PCRE version loaded at runtime */ +}; + /* Error values: */ enum { AP_REG_ASSERT = 1, /** internal error ? */ @@ -113,6 +119,15 @@ typedef struct { /* The functions */ +/** + * Return PCRE version string. 
+ * @param which Either AP_REG_PCRE_COMPILED (PCRE version used + * during program compilation) or AP_REG_PCRE_LOADED + * (PCRE version used at runtime) + * @return The PCRE version string + */ +AP_DECLARE(const char *) ap_pcre_version_string(int which); + /** * Get default compile flags * @return Bitwise OR of AP_REG_* flags diff --git a/include/httpd.h b/include/httpd.h index 2057ec31b2c..f27bb2fb0e8 100644 --- a/include/httpd.h +++ b/include/httpd.h @@ -47,6 +47,7 @@ #include "ap_release.h" #include "apr.h" +#include "apr_version.h" #include "apr_general.h" #include "apr_tables.h" #include "apr_pools.h" @@ -2410,6 +2411,69 @@ AP_DECLARE(void *) ap_realloc(void *ptr, size_t size) AP_FN_ATTR_WARN_UNUSED_RESULT AP_FN_ATTR_ALLOC_SIZE(2); +#if APR_HAS_THREADS + +#if APR_VERSION_AT_LEAST(1,8,0) && !defined(AP_NO_THREAD_LOCAL) + +/** + * APR 1.8+ implement those already. + */ +#if APR_HAS_THREAD_LOCAL +#define AP_HAS_THREAD_LOCAL 1 +#define AP_THREAD_LOCAL APR_THREAD_LOCAL +#else +#define AP_HAS_THREAD_LOCAL 0 +#endif +#define ap_thread_create apr_thread_create +#define ap_thread_current apr_thread_current +#define ap_thread_current_create apr_thread_current_create +#define ap_thread_current_after_fork apr_thread_current_after_fork + +#else /* APR_VERSION_AT_LEAST(1,8,0) && !defined(AP_NO_THREAD_LOCAL) */ + +#ifndef AP_NO_THREAD_LOCAL +/** + * AP_THREAD_LOCAL keyword mapping the compiler's. 
+ */ +#if defined(__cplusplus) && __cplusplus >= 201103L +#define AP_THREAD_LOCAL thread_local +#elif defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112 +#define AP_THREAD_LOCAL _Thread_local +#elif defined(__GNUC__) /* works for clang too */ +#define AP_THREAD_LOCAL __thread +#elif defined(WIN32) && defined(_MSC_VER) +#define AP_THREAD_LOCAL __declspec(thread) +#endif +#endif /* ndef AP_NO_THREAD_LOCAL */ + +#ifndef AP_THREAD_LOCAL +#define AP_HAS_THREAD_LOCAL 0 +#define ap_thread_create apr_thread_create +#else /* AP_THREAD_LOCAL */ +#define AP_HAS_THREAD_LOCAL 1 +AP_DECLARE(apr_status_t) ap_thread_create(apr_thread_t **thread, + apr_threadattr_t *attr, + apr_thread_start_t func, + void *data, apr_pool_t *pool); +#endif /* AP_THREAD_LOCAL */ + +AP_DECLARE(apr_status_t) ap_thread_current_create(apr_thread_t **current, + apr_threadattr_t *attr, + apr_pool_t *pool); +AP_DECLARE(void) ap_thread_current_after_fork(void); +AP_DECLARE(apr_thread_t *) ap_thread_current(void); + +#endif /* APR_VERSION_AT_LEAST(1,8,0) && !defined(AP_NO_THREAD_LOCAL) */ + +AP_DECLARE(apr_status_t) ap_thread_main_create(apr_thread_t **thread, + apr_pool_t *pool); + +#else /* APR_HAS_THREADS */ + +#define AP_HAS_THREAD_LOCAL 0 + +#endif /* APR_HAS_THREADS */ + /** * Get server load params * @param ld struct to populate: -1 in fields means error diff --git a/modules/core/mod_watchdog.c b/modules/core/mod_watchdog.c index e7e05287a03..99ec7cfc4dc 100644 --- a/modules/core/mod_watchdog.c +++ b/modules/core/mod_watchdog.c @@ -280,7 +280,7 @@ static apr_status_t wd_startup(ap_watchdog_t *w, apr_pool_t *p) } /* Start the newly created watchdog */ - rc = apr_thread_create(&w->thread, NULL, wd_worker, w, p); + rc = ap_thread_create(&w->thread, NULL, wd_worker, w, p); if (rc == APR_SUCCESS) { apr_pool_pre_cleanup_register(p, w, wd_worker_cleanup); } diff --git a/modules/generators/mod_info.c b/modules/generators/mod_info.c index b044273062e..1662242afe9 100644 --- a/modules/generators/mod_info.c 
+++ b/modules/generators/mod_info.c @@ -454,6 +454,12 @@ static int show_server_settings(request_rec * r) "
<dt><strong>Compiled with APU Version:</strong> " "<tt>%s</tt></dt>
\n", APU_VERSION_STRING); #endif + ap_rprintf(r, + "
<dt><strong>Server loaded PCRE Version:</strong> " "<tt>%s</tt></dt>
\n", ap_pcre_version_string(AP_REG_PCRE_LOADED)); + ap_rprintf(r, + "
<dt><strong>Compiled with PCRE Version:</strong> " "<tt>%s</tt></dt>
\n", ap_pcre_version_string(AP_REG_PCRE_COMPILED)); ap_rprintf(r, "
<dt><strong>Module Magic Number:</strong> " "<tt>%d:%d</tt></dt>
\n", MODULE_MAGIC_NUMBER_MAJOR, diff --git a/modules/http2/h2_workers.c b/modules/http2/h2_workers.c index ae250b0f5ae..a4883eec71b 100644 --- a/modules/http2/h2_workers.c +++ b/modules/http2/h2_workers.c @@ -101,8 +101,8 @@ static apr_status_t activate_slot(h2_workers *workers, h2_slot *slot) * to the idle queue */ apr_atomic_inc32(&workers->worker_count); slot->timed_out = 0; - rv = apr_thread_create(&slot->thread, workers->thread_attr, - slot_run, slot, workers->pool); + rv = ap_thread_create(&slot->thread, workers->thread_attr, + slot_run, slot, workers->pool); if (rv != APR_SUCCESS) { apr_atomic_dec32(&workers->worker_count); } diff --git a/server/main.c b/server/main.c index 62e06df0453..7da7aa2ca20 100644 --- a/server/main.c +++ b/server/main.c @@ -21,6 +21,7 @@ #include "apr_lib.h" #include "apr_md5.h" #include "apr_time.h" +#include "apr_thread_proc.h" #include "apr_version.h" #include "apu_version.h" @@ -98,13 +99,17 @@ static void show_compile_settings(void) printf("Server's Module Magic Number: %u:%u\n", MODULE_MAGIC_NUMBER_MAJOR, MODULE_MAGIC_NUMBER_MINOR); #if APR_MAJOR_VERSION >= 2 - printf("Server loaded: APR %s\n", apr_version_string()); - printf("Compiled using: APR %s\n", APR_VERSION_STRING); + printf("Server loaded: APR %s, PCRE %s\n", + apr_version_string(), ap_pcre_version_string(AP_REG_PCRE_LOADED)); + printf("Compiled using: APR %s, PCRE %s\n", + APR_VERSION_STRING, ap_pcre_version_string(AP_REG_PCRE_COMPILED)); #else - printf("Server loaded: APR %s, APR-UTIL %s\n", - apr_version_string(), apu_version_string()); - printf("Compiled using: APR %s, APR-UTIL %s\n", - APR_VERSION_STRING, APU_VERSION_STRING); + printf("Server loaded: APR %s, APR-UTIL %s, PCRE %s\n", + apr_version_string(), apu_version_string(), + ap_pcre_version_string(AP_REG_PCRE_LOADED)); + printf("Compiled using: APR %s, APR-UTIL %s, PCRE %s\n", + APR_VERSION_STRING, APU_VERSION_STRING, + ap_pcre_version_string(AP_REG_PCRE_COMPILED)); #endif /* sizeof(foo) is long on some 
platforms so we might as well * make it long everywhere to keep the printf format @@ -348,6 +353,23 @@ static process_rec *init_process(int *argc, const char * const * *argv) process->argc = *argc; process->argv = *argv; process->short_name = apr_filepath_name_get((*argv)[0]); + +#if AP_HAS_THREAD_LOCAL + { + apr_status_t rv; + apr_thread_t *thd = NULL; + if ((rv = ap_thread_main_create(&thd, process->pool))) { + char ctimebuff[APR_CTIME_LEN]; + apr_ctime(ctimebuff, apr_time_now()); + fprintf(stderr, "[%s] [crit] (%d) %s: failed " + "to initialize thread context, exiting\n", + ctimebuff, rv, (*argv)[0]); + apr_terminate(); + exit(1); + } + } +#endif + return process; } diff --git a/server/mpm/event/event.c b/server/mpm/event/event.c index 4c65fb6b809..5b1604d7935 100644 --- a/server/mpm/event/event.c +++ b/server/mpm/event/event.c @@ -2201,11 +2201,11 @@ static void create_listener_thread(thread_starter * ts) my_info = (proc_info *) ap_malloc(sizeof(proc_info)); my_info->pslot = my_child_num; my_info->tslot = -1; /* listener thread doesn't have a thread slot */ - rv = apr_thread_create(&ts->listener, thread_attr, listener_thread, - my_info, pruntime); + rv = ap_thread_create(&ts->listener, thread_attr, listener_thread, + my_info, pruntime); if (rv != APR_SUCCESS) { ap_log_error(APLOG_MARK, APLOG_ALERT, rv, ap_server_conf, APLOGNO(00474) - "apr_thread_create: unable to create listener thread"); + "ap_thread_create: unable to create listener thread"); /* let the parent decide how bad this really is */ clean_child_exit(APEXIT_CHILDSICK); } @@ -2396,12 +2396,12 @@ static void *APR_THREAD_FUNC start_threads(apr_thread_t * thd, void *dummy) /* We let each thread update its own scoreboard entry. This is * done because it lets us deal with tid better. 
*/ - rv = apr_thread_create(&threads[i], thread_attr, - worker_thread, my_info, pruntime); + rv = ap_thread_create(&threads[i], thread_attr, + worker_thread, my_info, pruntime); if (rv != APR_SUCCESS) { ap_log_error(APLOG_MARK, APLOG_ALERT, rv, ap_server_conf, APLOGNO(03104) - "apr_thread_create: unable to create worker thread"); + "ap_thread_create: unable to create worker thread"); /* let the parent decide how bad this really is */ clean_child_exit(APEXIT_CHILDSICK); } @@ -2536,6 +2536,17 @@ static void child_main(int child_num_arg, int child_bucket) apr_pool_create(&pchild, pconf); apr_pool_tag(pchild, "pchild"); +#if AP_HAS_THREAD_LOCAL + if (!one_process) { + apr_thread_t *thd = NULL; + if ((rv = ap_thread_main_create(&thd, pchild))) { + ap_log_error(APLOG_MARK, APLOG_EMERG, rv, ap_server_conf, APLOGNO(10377) + "Couldn't initialize child main thread"); + clean_child_exit(APEXIT_CHILDFATAL); + } + } +#endif + /* close unused listeners and pods */ for (i = 0; i < retained->mpm->num_buckets; i++) { if (i != child_bucket) { @@ -2605,11 +2616,11 @@ static void child_main(int child_num_arg, int child_bucket) ts->child_num_arg = child_num_arg; ts->threadattr = thread_attr; - rv = apr_thread_create(&start_thread_id, thread_attr, start_threads, - ts, pchild); + rv = ap_thread_create(&start_thread_id, thread_attr, start_threads, + ts, pchild); if (rv != APR_SUCCESS) { ap_log_error(APLOG_MARK, APLOG_ALERT, rv, ap_server_conf, APLOGNO(00480) - "apr_thread_create: unable to create worker thread"); + "ap_thread_create: unable to create worker thread"); /* let the parent decide how bad this really is */ clean_child_exit(APEXIT_CHILDSICK); } @@ -2744,6 +2755,10 @@ static int make_child(server_rec * s, int slot, int bucket) } if (!pid) { +#if AP_HAS_THREAD_LOCAL + ap_thread_current_after_fork(); +#endif + my_bucket = &all_buckets[bucket]; #ifdef HAVE_BINDPROCESSOR diff --git a/server/mpm/prefork/prefork.c b/server/mpm/prefork/prefork.c index 0bdfd5ae743..b5adb57bea1 100644 --- 
a/server/mpm/prefork/prefork.c +++ b/server/mpm/prefork/prefork.c @@ -395,7 +395,6 @@ static void child_main(int child_num_arg, int child_bucket) { #if APR_HAS_THREADS apr_thread_t *thd = NULL; - apr_os_thread_t osthd; sigset_t sig_mask; #endif apr_pool_t *ptrans; @@ -427,9 +426,23 @@ static void child_main(int child_num_arg, int child_bucket) apr_allocator_owner_set(allocator, pchild); apr_pool_tag(pchild, "pchild"); +#if AP_HAS_THREAD_LOCAL + if (one_process) { + thd = ap_thread_current(); + } + else if ((status = ap_thread_main_create(&thd, pchild))) { + ap_log_error(APLOG_MARK, APLOG_EMERG, status, ap_server_conf, APLOGNO(10378) + "Couldn't initialize child main thread"); + clean_child_exit(APEXIT_CHILDFATAL); + } +#elif APR_HAS_THREADS + { + apr_os_thread_t osthd = apr_os_thread_current(); + apr_os_thread_put(&thd, &osthd, pchild); + } +#endif #if APR_HAS_THREADS - osthd = apr_os_thread_current(); - apr_os_thread_put(&thd, &osthd, pchild); + ap_assert(thd != NULL); #endif apr_pool_create(&ptrans, pchild); @@ -722,6 +735,10 @@ static int make_child(server_rec *s, int slot) } if (!pid) { +#if AP_HAS_THREAD_LOCAL + ap_thread_current_after_fork(); +#endif + my_bucket = &all_buckets[bucket]; #ifdef HAVE_BINDPROCESSOR diff --git a/server/mpm/winnt/child.c b/server/mpm/winnt/child.c index ad03d24e1c0..05151a885ea 100644 --- a/server/mpm/winnt/child.c +++ b/server/mpm/winnt/child.c @@ -784,8 +784,8 @@ static winnt_conn_ctx_t *winnt_get_connection(winnt_conn_ctx_t *context) */ static DWORD __stdcall worker_main(void *thread_num_val) { - apr_thread_t *thd; - apr_os_thread_t osthd; + apr_thread_t *thd = NULL; + apr_os_thread_t osthd = NULL; static int requests_this_child = 0; winnt_conn_ctx_t *context = NULL; int thread_num = (int)thread_num_val; @@ -793,7 +793,16 @@ static DWORD __stdcall worker_main(void *thread_num_val) conn_rec *c; apr_int32_t disconnected; +#if AP_HAS_THREAD_LOCAL + if (ap_thread_current_create(&thd, NULL, pchild) != APR_SUCCESS) { + 
ap_log_error(APLOG_MARK, APLOG_WARNING, 0, ap_server_conf, APLOGNO(10376) + "Couldn't initialize worker thread, thread locals won't " + "be available"); + osthd = apr_os_thread_current(); + } +#else osthd = apr_os_thread_current(); +#endif while (1) { @@ -826,8 +835,10 @@ static DWORD __stdcall worker_main(void *thread_num_val) continue; } - thd = NULL; - apr_os_thread_put(&thd, &osthd, context->ptrans); + if (osthd) { + thd = NULL; + apr_os_thread_put(&thd, &osthd, context->ptrans); + } c->current_thread = thd; ap_process_connection(c, context->sock); @@ -842,6 +853,12 @@ static DWORD __stdcall worker_main(void *thread_num_val) ap_update_child_status_from_indexes(0, thread_num, SERVER_DEAD, NULL); +#if AP_HAS_THREAD_LOCAL + if (!osthd) { + apr_pool_destroy(apr_thread_pool_get(thd)); + } +#endif + return 0; } diff --git a/server/mpm/worker/worker.c b/server/mpm/worker/worker.c index 5cf8ead94e1..7e3a5542406 100644 --- a/server/mpm/worker/worker.c +++ b/server/mpm/worker/worker.c @@ -847,11 +847,11 @@ static void create_listener_thread(thread_starter *ts) my_info->pid = my_child_num; my_info->tid = -1; /* listener thread doesn't have a thread slot */ my_info->sd = 0; - rv = apr_thread_create(&ts->listener, thread_attr, listener_thread, - my_info, pruntime); + rv = ap_thread_create(&ts->listener, thread_attr, listener_thread, + my_info, pruntime); if (rv != APR_SUCCESS) { ap_log_error(APLOG_MARK, APLOG_ALERT, rv, ap_server_conf, APLOGNO(00275) - "apr_thread_create: unable to create listener thread"); + "ap_thread_create: unable to create listener thread"); /* let the parent decide how bad this really is */ clean_child_exit(APEXIT_CHILDSICK); } @@ -967,11 +967,11 @@ static void * APR_THREAD_FUNC start_threads(apr_thread_t *thd, void *dummy) /* We let each thread update its own scoreboard entry. This is * done because it lets us deal with tid better. 
*/ - rv = apr_thread_create(&threads[i], thread_attr, - worker_thread, my_info, pruntime); + rv = ap_thread_create(&threads[i], thread_attr, + worker_thread, my_info, pruntime); if (rv != APR_SUCCESS) { ap_log_error(APLOG_MARK, APLOG_ALERT, rv, ap_server_conf, APLOGNO(03142) - "apr_thread_create: unable to create worker thread"); + "ap_thread_create: unable to create worker thread"); /* let the parent decide how bad this really is */ clean_child_exit(APEXIT_CHILDSICK); } @@ -1115,6 +1115,17 @@ static void child_main(int child_num_arg, int child_bucket) apr_pool_create(&pchild, pconf); apr_pool_tag(pchild, "pchild"); +#if AP_HAS_THREAD_LOCAL + if (!one_process) { + apr_thread_t *thd = NULL; + if ((rv = ap_thread_main_create(&thd, pchild))) { + ap_log_error(APLOG_MARK, APLOG_EMERG, rv, ap_server_conf, APLOGNO(10375) + "Couldn't initialize child main thread"); + clean_child_exit(APEXIT_CHILDFATAL); + } + } +#endif + /* close unused listeners and pods */ for (i = 0; i < retained->mpm->num_buckets; i++) { if (i != child_bucket) { @@ -1194,11 +1205,11 @@ static void child_main(int child_num_arg, int child_bucket) ts->child_num_arg = child_num_arg; ts->threadattr = thread_attr; - rv = apr_thread_create(&start_thread_id, thread_attr, start_threads, - ts, pchild); + rv = ap_thread_create(&start_thread_id, thread_attr, start_threads, + ts, pchild); if (rv != APR_SUCCESS) { ap_log_error(APLOG_MARK, APLOG_ALERT, rv, ap_server_conf, APLOGNO(00282) - "apr_thread_create: unable to create worker thread"); + "ap_thread_create: unable to create worker thread"); /* let the parent decide how bad this really is */ clean_child_exit(APEXIT_CHILDSICK); } @@ -1315,6 +1326,10 @@ static int make_child(server_rec *s, int slot, int bucket) } if (!pid) { +#if AP_HAS_THREAD_LOCAL + ap_thread_current_after_fork(); +#endif + my_bucket = &all_buckets[bucket]; #ifdef HAVE_BINDPROCESSOR diff --git a/server/util.c b/server/util.c index 424a28a59f3..6cfe0035c49 100644 --- a/server/util.c +++ 
b/server/util.c @@ -3161,6 +3161,135 @@ AP_DECLARE(void *) ap_realloc(void *ptr, size_t size) return p; } +#if APR_HAS_THREADS + +#if APR_VERSION_AT_LEAST(1,8,0) && !defined(AP_NO_THREAD_LOCAL) + +#define ap_thread_current_create apr_thread_current_create + +#else /* APR_VERSION_AT_LEAST(1,8,0) && !defined(AP_NO_THREAD_LOCAL) */ + +#if AP_HAS_THREAD_LOCAL + +struct thread_ctx { + apr_thread_start_t func; + void *data; +}; + +static AP_THREAD_LOCAL apr_thread_t *current_thread = NULL; + +static void *APR_THREAD_FUNC thread_start(apr_thread_t *thread, void *data) +{ + struct thread_ctx *ctx = data; + + current_thread = thread; + return ctx->func(thread, ctx->data); +} + +AP_DECLARE(apr_status_t) ap_thread_create(apr_thread_t **thread, + apr_threadattr_t *attr, + apr_thread_start_t func, + void *data, apr_pool_t *pool) +{ + struct thread_ctx *ctx = apr_palloc(pool, sizeof(*ctx)); + + ctx->func = func; + ctx->data = data; + return apr_thread_create(thread, attr, thread_start, ctx, pool); +} + +#endif /* AP_HAS_THREAD_LOCAL */ + +AP_DECLARE(apr_status_t) ap_thread_current_create(apr_thread_t **current, + apr_threadattr_t *attr, + apr_pool_t *pool) +{ + apr_status_t rv; + apr_abortfunc_t abort_fn = apr_pool_abort_get(pool); + apr_allocator_t *allocator; + apr_os_thread_t osthd; + apr_pool_t *p; + + *current = ap_thread_current(); + if (*current) { + return APR_EEXIST; + } + + rv = apr_allocator_create(&allocator); + if (rv != APR_SUCCESS) { + if (abort_fn) + abort_fn(rv); + return rv; + } + rv = apr_pool_create_unmanaged_ex(&p, abort_fn, allocator); + if (rv != APR_SUCCESS) { + apr_allocator_destroy(allocator); + return rv; + } + apr_allocator_owner_set(allocator, p); + + osthd = apr_os_thread_current(); + rv = apr_os_thread_put(current, &osthd, p); + if (rv != APR_SUCCESS) { + apr_pool_destroy(p); + return rv; + } + +#if AP_HAS_THREAD_LOCAL + current_thread = *current; +#endif + return APR_SUCCESS; +} + +AP_DECLARE(void) ap_thread_current_after_fork(void) +{ +#if 
AP_HAS_THREAD_LOCAL + current_thread = NULL; +#endif +} + +AP_DECLARE(apr_thread_t *) ap_thread_current(void) +{ +#if AP_HAS_THREAD_LOCAL + return current_thread; +#else + return NULL; +#endif +} + +#endif /* APR_VERSION_AT_LEAST(1,8,0) && !defined(AP_NO_THREAD_LOCAL) */ + +static apr_status_t main_thread_cleanup(void *arg) +{ + apr_thread_t *thd = arg; + apr_pool_destroy(apr_thread_pool_get(thd)); + return APR_SUCCESS; +} + +AP_DECLARE(apr_status_t) ap_thread_main_create(apr_thread_t **thread, + apr_pool_t *pool) +{ + apr_status_t rv; + apr_threadattr_t *attr = NULL; + + /* Create an apr_thread_t for the main child thread to set up its Thread + * Local Storage. Since it's detached and won't apr_thread_exit(), destroy + * its pool before exiting via a cleanup of the given pool. + */ + if ((rv = apr_threadattr_create(&attr, pool)) + || (rv = apr_threadattr_detach_set(attr, 1)) + || (rv = ap_thread_current_create(thread, attr, pool))) { + *thread = NULL; + return rv; + } + + apr_pool_cleanup_register(pool, *thread, main_thread_cleanup, + apr_pool_cleanup_null); + return APR_SUCCESS; +} + +#endif /* APR_HAS_THREADS */ + AP_DECLARE(void) ap_get_sload(ap_sload_t *ld) { int i, j, server_limit, thread_limit; diff --git a/server/util_pcre.c b/server/util_pcre.c index aa0b442d0ea..0a9dc50112d 100644 --- a/server/util_pcre.c +++ b/server/util_pcre.c @@ -55,6 +55,7 @@ POSSIBILITY OF SUCH DAMAGE. #include "httpd.h" #include "apr_strings.h" #include "apr_tables.h" +#include "apr_thread_proc.h" #ifdef HAVE_PCRE2 #define PCRE2_CODE_UNIT_WIDTH 8 @@ -89,6 +90,26 @@ static const char *const pstring[] = { "match failed" /* AP_REG_NOMATCH */ }; +AP_DECLARE(const char *) ap_pcre_version_string(int which) +{ +#ifdef HAVE_PCRE2 + static char buf[80]; +#endif + switch (which) { + case AP_REG_PCRE_COMPILED: + return APR_STRINGIFY(PCREn(MAJOR)) "." 
APR_STRINGIFY(PCREn(MINOR)) " " APR_STRINGIFY(PCREn(DATE)); + case AP_REG_PCRE_LOADED: +#ifdef HAVE_PCRE2 + pcre2_config(PCRE2_CONFIG_VERSION, buf); + return buf; +#else + return pcre_version(); +#endif + default: + return "Unknown"; + } +} + AP_DECLARE(apr_size_t) ap_regerror(int errcode, const ap_regex_t *preg, char *errbuf, apr_size_t errbuf_size) { @@ -243,7 +264,134 @@ AP_DECLARE(int) ap_regcomp(ap_regex_t * preg, const char *pattern, int cflags) * ints. However, if the number of possible capturing brackets is small, use a * block of store on the stack, to reduce the use of malloc/free. The threshold * is in a macro that can be changed at configure time. + * Yet more unfortunately, PCRE2 wants an opaque context by providing the API + * to allocate and free it, so to minimize these calls we maintain one opaque + * context per thread (in Thread Local Storage, TLS) grown as needed, and while + * at it we do the same for PCRE1 ints vectors. Note that this requires a fast + * TLS mechanism to be worth it, which is the case of apr_thread_data_get/set() + * from/to ap_thread_current() when AP_HAS_THREAD_LOCAL; otherwise we'll do + * the allocation and freeing for each ap_regexec(). 
*/ + +#ifdef HAVE_PCRE2 +typedef pcre2_match_data* match_data_pt; +typedef size_t* match_vector_pt; +#else +typedef int* match_data_pt; +typedef int* match_vector_pt; +#endif + +static APR_INLINE +match_data_pt alloc_match_data(apr_size_t size, + match_vector_pt small_vector) +{ + match_data_pt data; + +#ifdef HAVE_PCRE2 + data = pcre2_match_data_create(size, NULL); +#else + if (size > POSIX_MALLOC_THRESHOLD) { + data = malloc(size * sizeof(int) * 3); + } + else { + data = small_vector; + } +#endif + + return data; +} + +static APR_INLINE +void free_match_data(match_data_pt data, apr_size_t size) +{ +#ifdef HAVE_PCRE2 + pcre2_match_data_free(data); +#else + if (size > POSIX_MALLOC_THRESHOLD) { + free(data); + } +#endif +} + +#if AP_HAS_THREAD_LOCAL && !defined(APREG_NO_THREAD_LOCAL) + +struct apreg_tls { + match_data_pt data; + apr_size_t size; +}; + +#ifdef HAVE_PCRE2 +static apr_status_t apreg_tls_cleanup(void *arg) +{ + struct apreg_tls *tls = arg; + pcre2_match_data_free(tls->data); /* NULL safe */ + return APR_SUCCESS; +} +#endif + +static match_data_pt get_match_data(apr_size_t size, + match_vector_pt small_vector, + int *to_free) +{ + apr_thread_t *current; + struct apreg_tls *tls = NULL; + + /* Even though AP_HAS_THREAD_LOCAL, we may still be called by a + * native/non-apr thread, let's fall back to alloc/free in this case. 
+ */ + current = ap_thread_current(); + if (!current) { + *to_free = 1; + return alloc_match_data(size, small_vector); + } + + apr_thread_data_get((void **)&tls, "apreg", current); + if (!tls || tls->size < size) { + apr_pool_t *tp = apr_thread_pool_get(current); + if (!tls) { + tls = apr_pcalloc(tp, sizeof(*tls)); +#ifdef HAVE_PCRE2 + apr_thread_data_set(tls, "apreg", apreg_tls_cleanup, current); +#else + apr_thread_data_set(tls, "apreg", NULL, current); +#endif + } + + tls->size *= 2; + if (tls->size < size) { + tls->size = size; + if (tls->size < POSIX_MALLOC_THRESHOLD) { + tls->size = POSIX_MALLOC_THRESHOLD; + } + } + +#ifdef HAVE_PCRE2 + pcre2_match_data_free(tls->data); /* NULL safe */ + tls->data = pcre2_match_data_create(tls->size, NULL); + if (!tls->data) { + tls->size = 0; + return NULL; + } +#else + tls->data = apr_palloc(tp, tls->size * sizeof(int) * 3); +#endif + } + + return tls->data; +} + +#else /* AP_HAS_THREAD_LOCAL && !defined(APREG_NO_THREAD_LOCAL) */ + +static APR_INLINE match_data_pt get_match_data(apr_size_t size, + match_vector_pt small_vector, + int *to_free) +{ + *to_free = 1; + return alloc_match_data(size, small_vector); +} + +#endif /* AP_HAS_THREAD_LOCAL && !defined(APREG_NO_THREAD_LOCAL) */ + AP_DECLARE(int) ap_regexec(const ap_regex_t *preg, const char *string, apr_size_t nmatch, ap_regmatch_t *pmatch, int eflags) @@ -257,78 +405,55 @@ AP_DECLARE(int) ap_regexec_len(const ap_regex_t *preg, const char *buff, ap_regmatch_t *pmatch, int eflags) { int rc; - int options = 0; - apr_size_t nlim; + int options = 0, to_free = 0; + match_vector_pt ovector = NULL; + apr_size_t ncaps = (apr_size_t)preg->re_nsub + 1; #ifdef HAVE_PCRE2 - pcre2_match_data *matchdata; - size_t *ovector; + match_data_pt data = get_match_data(ncaps, NULL, &to_free); #else - int small_ovector[POSIX_MALLOC_THRESHOLD * 3]; - int allocated_ovector = 0; - int *ovector = NULL; + int small_vector[POSIX_MALLOC_THRESHOLD * 3]; + match_data_pt data = get_match_data(ncaps, 
small_vector, &to_free); #endif + if (!data) { + return AP_REG_ESPACE; + } + if ((eflags & AP_REG_NOTBOL) != 0) options |= PCREn(NOTBOL); if ((eflags & AP_REG_NOTEOL) != 0) options |= PCREn(NOTEOL); #ifdef HAVE_PCRE2 - /* TODO: create a generic TLS matchdata buffer of some nmatch limit, - * e.g. 10 matches, to avoid a malloc-per-call. If it must be alloced, - * implement a general context using palloc and no free implementation. - */ - nlim = ((apr_size_t)preg->re_nsub + 1) > nmatch - ? ((apr_size_t)preg->re_nsub + 1) : nmatch; - matchdata = pcre2_match_data_create(nlim, NULL); - if (matchdata == NULL) - return AP_REG_ESPACE; - ovector = pcre2_get_ovector_pointer(matchdata); rc = pcre2_match((const pcre2_code *)preg->re_pcre, (const unsigned char *)buff, len, - 0, options, matchdata, NULL); - if (rc == 0) - rc = nlim; /* All captured slots were filled in */ + 0, options, data, NULL); + ovector = pcre2_get_ovector_pointer(data); #else - if (nmatch > 0) { - if (nmatch <= POSIX_MALLOC_THRESHOLD) { - ovector = &(small_ovector[0]); - } - else { - ovector = (int *)malloc(sizeof(int) * nmatch * 3); - if (ovector == NULL) - return AP_REG_ESPACE; - allocated_ovector = 1; - } - } + ovector = data; rc = pcre_exec((const pcre *)preg->re_pcre, NULL, buff, (int)len, - 0, options, ovector, nmatch * 3); - if (rc == 0) - rc = nmatch; /* All captured slots were filled in */ + 0, options, ovector, ncaps * 3); #endif if (rc >= 0) { - apr_size_t i; - nlim = (apr_size_t)rc < nmatch ? 
(apr_size_t)rc : nmatch; - for (i = 0; i < nlim; i++) { + apr_size_t n = rc, i; + if (n == 0 || n > nmatch) + rc = n = nmatch; /* All capture slots were filled in */ + for (i = 0; i < n; i++) { pmatch[i].rm_so = ovector[i * 2]; pmatch[i].rm_eo = ovector[i * 2 + 1]; } for (; i < nmatch; i++) pmatch[i].rm_so = pmatch[i].rm_eo = -1; - } - -#ifdef HAVE_PCRE2 - pcre2_match_data_free(matchdata); -#else - if (allocated_ovector) - free(ovector); -#endif - - if (rc >= 0) { + if (to_free) { + free_match_data(data, ncaps); + } return 0; } else { + if (to_free) { + free_match_data(data, ncaps); + } #ifdef HAVE_PCRE2 if (rc <= PCRE2_ERROR_UTF8_ERR1 && rc >= PCRE2_ERROR_UTF8_ERR21) return AP_REG_INVARG; -- 2.47.2