From: William A. Rowe Jr Date: Mon, 14 Nov 2016 15:29:20 +0000 (+0000) Subject: Improve legibility of reviewing the generated table, using hex rather than dec X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=e6a8f03afcd39dea4d2e36eaeaa41364384e4e00;p=thirdparty%2Fapache%2Fhttpd.git Improve legibility of reviewing the generated table, using hex rather than dec Submitted by: wrowe Backport: r1754536 Correct T_HTTP_TOKEN_STOP per RFC2068 (2.2) - RFC7230 (3.2.6), which has always defined 'token' as CHAR or VCHAR - visible USASCII only. NUL char is also a stop, end of parsing. Submitted by: wrowe Backport: r1754538 Be more explicit about NUL in case iscntrl is inconsistent Submitted by: wrowe Backport: r1754539 Introduce T_HTTP_CTRLS for efficiently finding non-text chars Submitted by: wrowe Backport: r1754540 Introduce ap_scan_http_field_content, ap_scan_http_token and ap_get_http_token [later reverted] for more efficient string handling. Submitted by: wrowe Backport: r1754541 With NUL as a TOKEN_STOP, this code is more efficient Submitted by: wrowe Backport: r1754544 We arrive here for more than one cause; offer a more general statement Submitted by: wrowe Backport: r1754547 Strictly observe spec on obs-fold Submitted by: wrowe Backport: r1754548 Leave an emphatic TODO per Jeff's observations Submitted by: trawick Backport: r1754555 Introduce ap_scan_http_token / ap_scan_http_field_content for a much more efficient pass through the header text; rather than reparsing the strings over and over under the HTTP_CONFORMANCE_STRICT fules. Improve logic and legibility by eliminating multiple repetitive tests of the STRICT flag, and simply reorder 'classic' behavior first and this new parser second to simplify the diff. Because of the whitespace change (which I had wished to dodge), reading this --ignore-all-space is a whole lot easier. Particularly against 2.4.x branch, which is now identical in the 'classic' logic flow. Both of which I'll share with dev@ Submitted by: wrowe Backport: r1754556 Friendly catch by RĂ¼diger, restore line mis-removed by the previous commit Submitted by: rpluem Backport: r1754568 Clean up doubled-'{' Correct usage for ap_scan_http_token (had used _get_ syntax) Correct logic, detect no 'token' chars, or missing ':' Submitted by: wrowe, rpluem Backport: r1754569,r1754570,r1754577 Replacement solution to identify VCHAR/ASCII symbols, even in EBCDIC. Looking for someone with an EBCDIC environment to post the output of the test_char.h generated file for verification. Submitted by: wrowe Backport: r1754579 Clean up an edge case where obs-fold continuation preceeds the first header, as with r1755098, but this time ensure the previous header processing logic ensures there was a previous header as identified by jchampion. This patch restructures the loop for legibility with a loop continuation, allowing us to flatten all of this hard-to-follow code. The subsequent patch will be a whitespace-only change for formatting. Testing len > 0 is redundant when *field is a "\0" and mismatches here, folded flag was a no-op, unused once we added continue; logic. Fix these as initially attempted in r1755114. Improve comments and reflow whitespace. Submitted by: wrowe Backport: r1755123,r1755124,r1755125,r1755126 As promised, reduce this logic by net 9 code lines, shifting the burden of killing trailing whitespace to the purpose-agnostic read logic. Whitespace before or after an obs-fold, and before or after a field value have no semantic purpose at all. Because we are building a buffer for all folded values, reducing the size of the newly allocated buffer is always to our advantage. Submitted by: wrowe Backport: r1755233 Treat empty obs-fold line as a noop, eliminate all intra-obs-fold excess whitespace, and observe the 1 SP per obs-folding per spec. Submitted by: wrowe Backport: r1755234,r1755235,r1755236 Treat empty obs-fold line as abusive traffic. Submitted by: wrowe Backport: r1755263 Stop reflecting irrelevant data to the request error notes, particularly for abusive and malformed traffic the non-technical consumer of a user-agent has no control over. Simply take note where the administrator-configured limits have been exceeded, that administrator can find details in the error log if desired. Submitted by: wrowe Backport: r1755264 Follow up to r1755264. Don't crash when ap_rgetline() returns a NULL field on ENOSPC. Submitted by: ylavic Backport: r1755343 Follow on to r1755264, for the case of merged header length exceptions, and ensure the field header name is truncated to a sane log width. Submitted by: wrowe Backport: r1755744 git-svn-id: https://svn.apache.org/repos/asf/httpd/httpd/branches/2.4.x-merge-http-strict@1769649 13f79535-47bb-0310-9956-ffa450edef68 --- diff --git a/include/ap_mmn.h b/include/ap_mmn.h index 6e6bef47c78..6fd27fe57a5 100644 --- a/include/ap_mmn.h +++ b/include/ap_mmn.h @@ -488,8 +488,9 @@ * 20120211.66 (2.4.24-dev) Rename ap_proxy_check_backend() to * ap_proxy_check_connection(). * 20120211.67 (2.5.0-dev) Add http09_enable to core_server_config - * Add http_conformance to core_server_config, - * add ap_has_cntrl() + * Add http_conformance to core_server_config + * Add ap_has_cntrl(), ap_get_http_token() + * Add ap_scan_http_field_[content|token]() */ #define MODULE_MAGIC_COOKIE 0x41503234UL /* "AP24" */ diff --git a/include/httpd.h b/include/httpd.h index 18cfd04c245..592e54188a3 100644 --- a/include/httpd.h +++ b/include/httpd.h @@ -1585,6 +1585,31 @@ AP_DECLARE(int) ap_find_etag_weak(apr_pool_t *p, const char *line, const char *t */ AP_DECLARE(int) ap_find_etag_strong(apr_pool_t *p, const char *line, const char *tok); +/* Scan a string for field content chars, as defined by RFC7230 section 3.2 + * including VCHAR/obs-text, as well as HT and SP + * @param ptr The string to scan + * @return A pointer to the first (non-HT) ASCII ctrl character. + * @note lws and trailing whitespace are scanned, the caller is responsible + * for trimming leading and trailing whitespace + */ +AP_DECLARE(const char *) ap_scan_http_field_content(const char *ptr); + +/* Scan a string for token characters, as defined by RFC7230 section 3.2.6 + * @param ptr The string to scan + * @return A pointer to the first non-token character. + */ +AP_DECLARE(const char *) ap_scan_http_token(const char *ptr); + +/* Retrieve a token, advancing the pointer to the first non-token character + * and returning a copy of the token string. + * @param ptr The string to scan. On return, this points to the first non-token + * character encountered, or NULL if *ptr was not a token character + * @return A copy of the token string + * @note The caller must handle leading and trailing whitespace as applicable + * and evaluate the terminating character. + */ +AP_DECLARE(char *) ap_get_http_token(apr_pool_t *p, const char **ptr); + /** * Retrieve an array of tokens in the format "1#token" defined in RFC2616. Only * accepts ',' as a delimiter, does not accept quoted strings, and errors on diff --git a/server/gen_test_char.c b/server/gen_test_char.c index 9cace3e4ee7..9e55bad6c40 100644 --- a/server/gen_test_char.c +++ b/server/gen_test_char.c @@ -20,6 +20,7 @@ #define apr_isalpha(c) (isalpha(((unsigned char)(c)))) #define apr_iscntrl(c) (iscntrl(((unsigned char)(c)))) #define apr_isprint(c) (isprint(((unsigned char)(c)))) +#define apr_isascii(c) (isascii(((unsigned char)(c)))) #include #define APR_HAVE_STDIO_H 1 #define APR_HAVE_STRING_H 1 @@ -31,6 +32,48 @@ #endif +#if APR_CHARSET_EBCDIC +/* See util.c for complete explanation of this table */ +static const short ucharmap[] = { + 0x00, 0x01, 0x02, 0x03, 0x9C, 0x09, 0x86, 0x7F, + 0x97, 0x8D, 0x8E, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F, + 0x10, 0x11, 0x12, 0x13, 0x9D, 0x85, 0x08, 0x87, + 0x18, 0x19, 0x92, 0x8F, 0x1C, 0x1D, 0x1E, 0x1F, + 0x80, 0x81, 0x82, 0x83, 0x84, 0x0A, 0x17, 0x1B, + 0x88, 0x89, 0x8A, 0x8B, 0x8C, 0x05, 0x06, 0x07, + 0x90, 0x91, 0x16, 0x93, 0x94, 0x95, 0x96, 0x04, + 0x98, 0x99, 0x9A, 0x9B, 0x14, 0x15, 0x9E, 0x1A, + 0x20, 0xA0, 0xE2, 0xE4, 0xE0, 0xE1, 0xE3, 0xE5, + 0xE7, 0xF1, 0xA2, 0x2E, 0x3C, 0x28, 0x2B, 0x7C, + 0x26, 0xE9, 0xEA, 0xEB, 0xE8, 0xED, 0xEE, 0xEF, + 0xEC, 0xDF, 0x21, 0x24, 0x2A, 0x29, 0x3B, 0xAC, + 0x2D, 0x2F, 0xC2, 0xC4, 0xC0, 0xC1, 0xC3, 0xC5, + 0xC7, 0xD1, 0xA6, 0x2C, 0x25, 0x5F, 0x3E, 0x3F, + 0xF8, 0xC9, 0xCA, 0xCB, 0xC8, 0xCD, 0xCE, 0xCF, + 0xCC, 0x60, 0x3A, 0x23, 0x40, 0x27, 0x3D, 0x22, + 0xD8, 0x61, 0x62, 0x63, 0x64, 0x65, 0x66, 0x67, + 0x68, 0x69, 0xAB, 0xBB, 0xF0, 0xFD, 0xFE, 0xB1, + 0xB0, 0x6A, 0x6B, 0x6C, 0x6D, 0x6E, 0x6F, 0x70, + 0x71, 0x72, 0xAA, 0xBA, 0xE6, 0xB8, 0xC6, 0xA4, + 0xB5, 0x7E, 0x73, 0x74, 0x75, 0x76, 0x77, 0x78, + 0x79, 0x7A, 0xA1, 0xBF, 0xD0, 0xDD, 0xDE, 0xAE, + 0x5E, 0xA3, 0xA5, 0xB7, 0xA9, 0xA7, 0xB6, 0xBC, + 0xBD, 0xBE, 0x5B, 0x5D, 0xAF, 0xA8, 0xB4, 0xD7, + 0x7B, 0x61, 0x62, 0x63, 0x64, 0x65, 0x66, 0x67, + 0x68, 0x69, 0xAD, 0xF4, 0xF6, 0xF2, 0xF3, 0xF5, + 0x7D, 0x6A, 0x6B, 0x6C, 0x6D, 0x6E, 0x6F, 0x70, + 0x71, 0x72, 0xB9, 0xFB, 0xFC, 0xF9, 0xFA, 0xFF, + 0x5C, 0xF7, 0x73, 0x74, 0x75, 0x76, 0x77, 0x78, + 0x79, 0x7A, 0xB2, 0xD4, 0xD6, 0xD2, 0xD3, 0xD5, + 0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37, + 0x38, 0x39, 0xB3, 0xDB, 0xDC, 0xD9, 0xDA, 0x9F +}; +#define test_isascii_equiv(c) ((ucharmap[(unsigned char)c] & ~0x7f) == 0) +#else +#define test_isascii_equiv(c) apr_isascii(c) +#endif + + #if defined(WIN32) || defined(OS2) #define NEED_ENHANCED_ESCAPES #endif @@ -52,6 +95,7 @@ #define T_ESCAPE_LOGITEM (0x10) #define T_ESCAPE_FORENSIC (0x20) #define T_ESCAPE_URLENCODED (0x40) +#define T_HTTP_CTRLS (0x80) int main(int argc, char *argv[]) { @@ -67,6 +111,7 @@ int main(int argc, char *argv[]) "#define T_ESCAPE_LOGITEM (%u)\n" "#define T_ESCAPE_FORENSIC (%u)\n" "#define T_ESCAPE_URLENCODED (%u)\n" + "#define T_HTTP_CTRLS (%u)\n" "\n" "static const unsigned char test_char_table[256] = {", T_ESCAPE_SHELL_CMD, @@ -75,11 +120,12 @@ int main(int argc, char *argv[]) T_HTTP_TOKEN_STOP, T_ESCAPE_LOGITEM, T_ESCAPE_FORENSIC, - T_ESCAPE_URLENCODED); + T_ESCAPE_URLENCODED, + T_HTTP_CTRLS); for (c = 0; c < 256; ++c) { flags = 0; - if (c % 20 == 0) + if (c % 8 == 0) printf("\n "); /* escape_shell_cmd */ @@ -115,11 +161,25 @@ int main(int argc, char *argv[]) flags |= T_ESCAPE_URLENCODED; } - /* these are the "tspecials" (RFC2068) or "separators" (RFC2616) */ - if (c && (apr_iscntrl(c) || strchr(" \t()<>@,;:\\\"/[]?={}", c))) { + /* Stop for any non-'token' character, including ctrls, obs-text, + * and "tspecials" (RFC2068) a.k.a. "separators" (RFC2616) + * XXX: We need to verify that ASCII C0 ctrls/DEL in our EBCDIC table + * are captured by apr_iscntrl() + */ + if (!c || apr_iscntrl(c) || strchr(" \t()<>@,;:\\\"/[]?={}", c) + || !test_isascii_equiv(c)) { flags |= T_HTTP_TOKEN_STOP; } + /* Catch CTRLs other than VCHAR, HT and SP, and obs-text (RFC7230 3.2) + * This includes only the C0 plane, not C1 (which is obs-text itself.) + * XXX: We need to verify that ASCII C0 ctrls/DEL in our EBCDIC table + * are captured by apr_iscntrl() + */ + if (!c || (apr_iscntrl(c) && c != '\t' && test_isascii_equiv(c))) { + flags |= T_HTTP_CTRLS; + } + /* For logging, escape all control characters, * double quotes (because they delimit the request in the log file) * backslashes (because we use backslash for escaping) @@ -137,7 +197,7 @@ int main(int argc, char *argv[]) flags |= T_ESCAPE_FORENSIC; } - printf("%u%c", flags, (c < 255) ? ',' : ' '); + printf("0x%02x%c", flags, (c < 255) ? ',' : ' '); } printf("\n};\n"); diff --git a/server/protocol.c b/server/protocol.c index 9b8c19263a3..d8a2ae1cb64 100644 --- a/server/protocol.c +++ b/server/protocol.c @@ -743,6 +743,16 @@ static int read_request_line(request_rec *r, apr_bucket_brigade *bb) return 1; } +/* get the length of the field name for logging, but no more than 80 bytes */ +#define LOG_NAME_MAX_LEN 80 +static int field_name_len(const char *field) +{ + const char *end = ap_strchr_c(field, ':'); + if (end == NULL || end - field > LOG_NAME_MAX_LEN) + return LOG_NAME_MAX_LEN; + return end - field; +} + static int table_do_fn_check_lengths(void *r_, const char *key, const char *value) { @@ -752,26 +762,13 @@ static int table_do_fn_check_lengths(void *r_, const char *key, r->status = HTTP_BAD_REQUEST; apr_table_setn(r->notes, "error-notes", - apr_pstrcat(r->pool, "Size of a request header field " - "after merging exceeds server limit.
" - "\n
\n",
-                               ap_escape_html(r->pool, key),
-                               "
\n", NULL)); - ap_log_rerror(APLOG_MARK, APLOG_INFO, 0, r, APLOGNO(00560) "Request header " - "exceeds LimitRequestFieldSize after merging: %s", key); + "Size of a request header field exceeds server limit."); + ap_log_rerror(APLOG_MARK, APLOG_INFO, 0, r, APLOGNO(00560) "Request " + "header exceeds LimitRequestFieldSize after merging: %.*s", + field_name_len(key), key); return 0; } -/* get the length of the field name for logging, but no more than 80 bytes */ -#define LOG_NAME_MAX_LEN 80 -static int field_name_len(const char *field) -{ - const char *end = ap_strchr_c(field, ':'); - if (end == NULL || end - field > LOG_NAME_MAX_LEN) - return LOG_NAME_MAX_LEN; - return end - field; -} - AP_DECLARE(void) ap_get_mime_headers_core(request_rec *r, apr_bucket_brigade *bb) { char *last_field = NULL; @@ -790,7 +787,6 @@ AP_DECLARE(void) ap_get_mime_headers_core(request_rec *r, apr_bucket_brigade *bb */ while(1) { apr_status_t rv; - int folded = 0; field = NULL; rv = ap_rgetline(&field, r->server->limit_req_fieldsize + 2, @@ -811,103 +807,126 @@ AP_DECLARE(void) ap_get_mime_headers_core(request_rec *r, apr_bucket_brigade *bb * exceeds the configured limit for a field size. */ if (rv == APR_ENOSPC) { - const char *field_escaped; - if (field && len) { - /* ensure ap_escape_html will terminate correctly */ - field[len - 1] = '\0'; - field_escaped = ap_escape_html(r->pool, field); - } - else { - field_escaped = field = ""; - } - apr_table_setn(r->notes, "error-notes", - apr_psprintf(r->pool, - "Size of a request header field " - "exceeds server limit.
\n" - "
\n%.*s\n
\n", - field_name_len(field_escaped), - field_escaped)); + "Size of a request header field " + "exceeds server limit."); ap_log_rerror(APLOG_MARK, APLOG_INFO, 0, r, APLOGNO(00561) "Request header exceeds LimitRequestFieldSize%s" "%.*s", - *field ? ": " : "", - field_name_len(field), field); + (field && *field) ? ": " : "", + (field) ? field_name_len(field) : 0, + (field) ? field : ""); } return; } - if (last_field != NULL) { - if ((len > 0) && ((*field == '\t') || *field == ' ')) { - /* This line is a continuation of the preceding line(s), - * so append it to the line that we've set aside. - * Note: this uses a power-of-two allocator to avoid - * doing O(n) allocs and using O(n^2) space for - * continuations that span many many lines. - */ - apr_size_t fold_len = last_len + len + 1; /* trailing null */ + /* For all header values, and all obs-fold lines, the presence of + * additional whitespace is a no-op, so collapse trailing whitespace + * to save buffer allocation and optimize copy operations. + * Do not remove the last single whitespace under any condition. + */ + while (len > 1 && (field[len-1] == '\t' || field[len-1] == ' ')) { + field[--len] = '\0'; + } - if (fold_len >= (apr_size_t)(r->server->limit_req_fieldsize)) { - const char *field_escaped; + if (*field == '\t' || *field == ' ') { - r->status = HTTP_BAD_REQUEST; - /* report what we have accumulated so far before the - * overflow (last_field) as the field with the problem - */ - field_escaped = ap_escape_html(r->pool, last_field); - apr_table_setn(r->notes, "error-notes", - apr_psprintf(r->pool, - "Size of a request header field " - "after folding " - "exceeds server limit.
\n" - "
\n%.*s\n
\n", - field_name_len(field_escaped), - field_escaped)); - ap_log_rerror(APLOG_MARK, APLOG_INFO, 0, r, APLOGNO(00562) - "Request header exceeds LimitRequestFieldSize " - "after folding: %.*s", - field_name_len(last_field), last_field); - return; - } + /* Append any newly-read obs-fold line onto the preceding + * last_field line we are processing + */ + apr_size_t fold_len; + + if (last_field == NULL) { + r->status = HTTP_BAD_REQUEST; + ap_log_rerror(APLOG_MARK, APLOG_INFO, 0, r, APLOGNO(03442) + "Line folding encountered before first" + " header line"); + return; + } + + if (field[1] == '\0') { + r->status = HTTP_BAD_REQUEST; + ap_log_rerror(APLOG_MARK, APLOG_INFO, 0, r, APLOGNO(03443) + "Empty folded line encountered"); + return; + } + + /* Leading whitespace on an obs-fold line can be + * similarly discarded */ + while (field[1] == '\t' || field[1] == ' ') { + ++field; --len; + } + /* This line is a continuation of the preceding line(s), + * so append it to the line that we've set aside. + * Note: this uses a power-of-two allocator to avoid + * doing O(n) allocs and using O(n^2) space for + * continuations that span many many lines. + */ + fold_len = last_len + len + 1; /* trailing null */ + + if (fold_len >= (apr_size_t)(r->server->limit_req_fieldsize)) { + r->status = HTTP_BAD_REQUEST; + /* report what we have accumulated so far before the + * overflow (last_field) as the field with the problem + */ + apr_table_setn(r->notes, "error-notes", + "Size of a request header field " + "exceeds server limit."); + ap_log_rerror(APLOG_MARK, APLOG_INFO, 0, r, APLOGNO(00562) + "Request header exceeds LimitRequestFieldSize " + "after folding: %.*s", + field_name_len(last_field), last_field); + return; + } + + if (fold_len > alloc_len) { + char *fold_buf; + alloc_len += alloc_len; if (fold_len > alloc_len) { - char *fold_buf; - alloc_len += alloc_len; - if (fold_len > alloc_len) { - alloc_len = fold_len; - } - fold_buf = (char *)apr_palloc(r->pool, alloc_len); - memcpy(fold_buf, last_field, last_len); - last_field = fold_buf; + alloc_len = fold_len; } - memcpy(last_field + last_len, field, len +1); /* +1 for nul */ - last_len += len; - folded = 1; + fold_buf = (char *)apr_palloc(r->pool, alloc_len); + memcpy(fold_buf, last_field, last_len); + last_field = fold_buf; } - else /* not a continuation line */ { + memcpy(last_field + last_len, field, len +1); /* +1 for nul */ + /* Replace obs-fold w/ SP per RFC 7230 3.2.4 */ + if (conf->http_conformance & AP_HTTP_CONFORMANCE_STRICT) { + last_field[last_len] = ' '; + } + last_len += len; + + /* We've appended this obs-fold line to last_len, proceed to + * read the next input line + */ + continue; + } + else if (last_field != NULL) { - if (r->server->limit_req_fields + /* Process the previous last_field header line with all obs-folded + * segments already concatinated (this is not operating on the + * most recently read input line). + */ + + if (r->server->limit_req_fields && (++fields_read > r->server->limit_req_fields)) { - r->status = HTTP_BAD_REQUEST; - apr_table_setn(r->notes, "error-notes", - "The number of request header fields " - "exceeds this server's limit."); - ap_log_rerror(APLOG_MARK, APLOG_INFO, 0, r, APLOGNO(00563) - "Number of request headers exceeds " - "LimitRequestFields"); - return; - } + r->status = HTTP_BAD_REQUEST; + apr_table_setn(r->notes, "error-notes", + "The number of request header fields " + "exceeds this server's limit."); + ap_log_rerror(APLOG_MARK, APLOG_INFO, 0, r, APLOGNO(00563) + "Number of request headers exceeds " + "LimitRequestFields"); + return; + } - if (!(value = strchr(last_field, ':'))) { /* Find ':' or */ - r->status = HTTP_BAD_REQUEST; /* abort bad request */ - apr_table_setn(r->notes, "error-notes", - apr_psprintf(r->pool, - "Request header field is " - "missing ':' separator.
\n" - "
\n%.*s
\n", - (int)LOG_NAME_MAX_LEN, - ap_escape_html(r->pool, - last_field))); + if (!(conf->http_conformance & AP_HTTP_CONFORMANCE_STRICT)) + { + /* Not Strict, using the legacy parser */ + + if (!(value = strchr(last_field, ':'))) { /* Find ':' or */ + r->status = HTTP_BAD_REQUEST; /* abort bad request */ ap_log_rerror(APLOG_MARK, APLOG_INFO, 0, r, APLOGNO(00564) "Request header field is missing ':' " "separator: %.*s", (int)LOG_NAME_MAX_LEN, @@ -920,74 +939,70 @@ AP_DECLARE(void) ap_get_mime_headers_core(request_rec *r, apr_bucket_brigade *bb *value++ = '\0'; /* NUL-terminate at colon */ while (*value == ' ' || *value == '\t') { - ++value; /* Skip to start of value */ + ++value; /* Skip to start of value */ } /* Strip LWS after field-name: */ while (tmp_field > last_field - && (*tmp_field == ' ' || *tmp_field == '\t')) { + && (*tmp_field == ' ' || *tmp_field == '\t')) { *tmp_field-- = '\0'; } - - /* Strip LWS after field-value: */ - tmp_field = last_field + last_len - 1; - while (tmp_field > value - && (*tmp_field == ' ' || *tmp_field == '\t')) { - *tmp_field-- = '\0'; + } + else /* Using strict RFC7230 parsing */ + { + /* Ensure valid token chars before ':' per RFC 7230 3.2.4 */ + value = (char *)ap_scan_http_token(last_field); + if ((value == last_field) || *value != ':') { + r->status = HTTP_BAD_REQUEST; + ap_log_rerror(APLOG_MARK, APLOG_INFO, 0, r, APLOGNO(02426) + "Request header field name is malformed: " + "%.*s", (int)LOG_NAME_MAX_LEN, last_field); + return; } - if (conf->http_conformance & AP_HTTP_CONFORMANCE_STRICT) { - int err = 0; + *value++ = '\0'; /* NUL-terminate last_field name at ':' */ - if (*last_field == '\0') { - err = HTTP_BAD_REQUEST; - ap_log_rerror(APLOG_MARK, APLOG_INFO, 0, r, APLOGNO(02425) - "Empty request header field name not allowed"); - } - else if (ap_has_cntrl(last_field)) { - err = HTTP_BAD_REQUEST; - ap_log_rerror(APLOG_MARK, APLOG_INFO, 0, r, APLOGNO(02426) - "[HTTP strict] Request header field name contains " - "control character: %.*s", - (int)LOG_NAME_MAX_LEN, last_field); - } - else if (ap_has_cntrl(value)) { - err = HTTP_BAD_REQUEST; - ap_log_rerror(APLOG_MARK, APLOG_INFO, 0, r, APLOGNO(02427) - "Request header field '%.*s' contains " - "control character", (int)LOG_NAME_MAX_LEN, - last_field); - } - if (err && !(conf->http_conformance & AP_HTTP_CONFORMANCE_LOGONLY)) { - r->status = err; - return; - } + while (*value == ' ' || *value == '\t') { + ++value; /* Skip LWS of value */ } - apr_table_addn(r->headers_in, last_field, value); - /* reset the alloc_len so that we'll allocate a new - * buffer if we have to do any more folding: we can't - * use the previous buffer because its contents are - * now part of r->headers_in + /* Find invalid, non-HT ctrl char, or the trailing NULL */ + tmp_field = (char *)ap_scan_http_field_content(value); + + /* Reject value for all garbage input (CTRLs excluding HT) + * e.g. only VCHAR / SP / HT / obs-text are allowed per + * RFC7230 3.2.6 - leave all more explicit rule enforcement + * for specific header handler logic later in the cycle */ - alloc_len = 0; + if (*tmp_field != '\0') { + r->status = HTTP_BAD_REQUEST; + ap_log_rerror(APLOG_MARK, APLOG_INFO, 0, r, APLOGNO(02427) + "Request header value is malformed: " + "%.*s", (int)LOG_NAME_MAX_LEN, value); + return; + } + } + + apr_table_addn(r->headers_in, last_field, value); - } /* end if current line is not a continuation starting with tab */ + /* This last_field header is now stored in headers_in, + * resume processing of the current input line. + */ } - /* Found a blank line, stop. */ + /* Found the terminating empty end-of-headers line, stop. */ if (len == 0) { break; } - /* Keep track of this line so that we can parse it on - * the next loop iteration. (In the folded case, last_field - * has been updated already.) + /* Keep track of this new header line so that we can extend it across + * any obs-fold or parse it on the next loop iteration. We referenced + * our previously allocated buffer in r->headers_in, + * so allocate a fresh buffer if required. */ - if (!folded) { - last_field = field; - last_len = len; - } + alloc_len = 0; + last_field = field; + last_len = len; } /* Combine multiple message-header fields with the same @@ -1083,7 +1098,7 @@ request_rec *ap_read_request(conn_rec *conn) } else if (r->method == NULL) { ap_log_rerror(APLOG_MARK, APLOG_INFO, 0, r, APLOGNO(00566) - "request failed: invalid characters in URI"); + "request failed: malformed request line"); } access_status = r->status; r->status = HTTP_OK; diff --git a/server/util.c b/server/util.c index 2fe71a13c5c..d878c32990c 100644 --- a/server/util.c +++ b/server/util.c @@ -79,7 +79,7 @@ * char in here and get it to work, because if char is signed then it * will first be sign extended. */ -#define TEST_CHAR(c, f) (test_char_table[(unsigned)(c)] & (f)) +#define TEST_CHAR(c, f) (test_char_table[(unsigned char)(c)] & (f)) /* Win32/NetWare/OS2 need to check for both forward and back slashes * in ap_getparents() and ap_escape_url. @@ -1525,7 +1525,7 @@ AP_DECLARE(const char *) ap_parse_token_list_strict(apr_pool_t *p, while (!string_end) { const unsigned char c = (unsigned char)*cur; - if (!TEST_CHAR(c, T_HTTP_TOKEN_STOP) && c != '\0') { + if (!TEST_CHAR(c, T_HTTP_TOKEN_STOP)) { /* Non-separator character; we are finished with leading * whitespace. We must never have encountered any trailing * whitespace before the delimiter (comma) */ @@ -1593,6 +1593,46 @@ AP_DECLARE(const char *) ap_parse_token_list_strict(apr_pool_t *p, return NULL; } +/* Scan a string for HTTP VCHAR/obs-text characters including HT and SP + * (as used in header values, for example, in RFC 7230 section 3.2) + * returning the pointer to the first non-HT ASCII ctrl character. + */ +AP_DECLARE(const char *) ap_scan_http_field_content(const char *ptr) +{ + for ( ; !TEST_CHAR(*ptr, T_HTTP_CTRLS); ++ptr) ; + + return ptr; +} + +/* Scan a string for HTTP token characters, returning the pointer to + * the first non-token character. + */ +AP_DECLARE(const char *) ap_scan_http_token(const char *ptr) +{ + for ( ; !TEST_CHAR(*ptr, T_HTTP_TOKEN_STOP); ++ptr) ; + + return ptr; +} + +/* Retrieve a token, advancing the pointer to the first non-token character + * and returning a copy of the token string. + * The caller must handle whitespace and determine the meaning of the + * terminating character. Returns NULL if the character at **ptr is not + * a valid token character. + */ +AP_DECLARE(char *) ap_get_http_token(apr_pool_t *p, const char **ptr) +{ + const char *tok_end = ap_scan_http_token(*ptr); + char *tok; + + if (tok_end == *ptr) + return NULL; + + tok = apr_pstrmemdup(p, *ptr, tok_end - *ptr); + *ptr = tok_end; + return tok; +} + /* Retrieve a token, spacing over it and returning a pointer to * the first non-white byte afterwards. Note that these tokens * are delimited by semis and commas; and can also be delimited