Ben Darnell [Wed, 27 May 2026 01:30:28 +0000 (21:30 -0400)]
simple_httpclient: Strip auth headers on cross-origin redirects
When following a redirect to a different origin (scheme, host, or port),
auth-related headers (Authorization and Cookie) should be stripped to
avoid exposing them to the new host.
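
A minimal sketch of the origin comparison described above, not simple_httpclient's actual code. The names `origin`, `headers_for_redirect`, `DEFAULT_PORTS`, and `SENSITIVE_HEADERS` are hypothetical, and the default-port handling is an assumption.

```python
from urllib.parse import urlsplit

# Hypothetical names for illustration only.
DEFAULT_PORTS = {"http": 80, "https": 443}
SENSITIVE_HEADERS = {"authorization", "cookie"}


def origin(url):
    """Return the (scheme, host, port) triple that identifies an origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, (parts.hostname or "").lower(), port)


def headers_for_redirect(headers, old_url, new_url):
    """Copy headers, dropping auth-related ones when the origin changes."""
    if origin(old_url) == origin(new_url):
        return dict(headers)
    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in SENSITIVE_HEADERS
    }
```

Under these assumptions, a redirect from https://a.example/ to https://a.example:443/ keeps its headers, while a change of scheme, host, or port drops Authorization and Cookie.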
Ben Darnell [Tue, 26 May 2026 17:39:53 +0000 (13:39 -0400)]
http1connection: Enforce max_body_size in _GzipMessageDelegate
This ensures we limit the post-decompression size of the body, and not
only the compressed size (which is enforced via the Content-Length
header at header-processing time).
This previously used substring search, which is incorrect, although
unlikely to be a vulnerability because there are no free-form text
fields allowed in this response format.
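
The post-decompression limit in the http1connection commit can be illustrated with the standard library's zlib module. This is a sketch, not the code in _GzipMessageDelegate. `gunzip_limited` and `BodyTooLarge` are hypothetical names.

```python
import zlib


class BodyTooLarge(Exception):
    """Raised when the decompressed body exceeds the configured limit."""


def gunzip_limited(chunks, max_body_size):
    """Decompress gzip chunks, failing once output exceeds max_body_size."""
    decompressor = zlib.decompressobj(16 + zlib.MAX_WBITS)  # gzip framing
    out = bytearray()
    for chunk in chunks:
        data = chunk
        while data:
            remaining = max_body_size - len(out)
            # Request one byte more than allowed so overflow is detectable
            # without inflating the whole input first.
            out += decompressor.decompress(data, remaining + 1)
            if len(out) > max_body_size:
                raise BodyTooLarge("decompressed body exceeds limit")
            data = decompressor.unconsumed_tail
    out += decompressor.flush()
    if len(out) > max_body_size:
        raise BodyTooLarge("decompressed body exceeds limit")
    return bytes(out)
```

Passing a max_length to decompress() bounds how much memory a highly compressible payload can consume before the limit is checked.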
Ben Darnell [Tue, 10 Mar 2026 16:19:50 +0000 (12:19 -0400)]
httputil: Add CRLF to _FORBIDDEN_HEADER_CHARS_RE
I think these were omitted due to quirks of an older version of the
parsing code. Linefeeds are already effectively prohibited within
header values since they are interpreted as delimiters, so the net
effect of this change is to prohibit bare carriage returns within
header values. This RE is used only when parsing headers inside
multipart/form-data bodies; for HTTP headers CR was already prohibited.
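
As a rough illustration of the effect, the sketch below rejects control characters, including CR and LF, in a header value. The pattern is an assumption made for this example and is not necessarily the exact contents of _FORBIDDEN_HEADER_CHARS_RE.

```python
import re

# Assumed pattern: C0 controls except tab (so CR and LF are included) plus DEL.
FORBIDDEN_CHARS_RE = re.compile(r"[\x00-\x08\x0a-\x1f\x7f]")


def is_valid_header_value(value):
    """Return True if the value contains no forbidden characters."""
    return FORBIDDEN_CHARS_RE.search(value) is None
```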
Ben Darnell [Fri, 6 Mar 2026 19:50:25 +0000 (14:50 -0500)]
web: Validate characters in all cookie attributes.
Our previous control character check was missing a check for
U+007F, and also semicolons, which are only allowed in quoted
parts of values. This commit checks all attributes and
updates the set of disallowed characters.
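
A sketch of what checking every attribute might look like. `check_cookie_attrs` is a hypothetical helper, and the disallowed set (C0 controls, U+007F, semicolon) is an assumption based on the description above rather than Tornado's exact rule.

```python
import re

# Assumed disallowed set: C0 controls, DEL (U+007F), and semicolons.
_BAD_COOKIE_CHARS = re.compile(r"[\x00-\x1f\x7f;]")


def check_cookie_attrs(**attrs):
    """Raise ValueError if any cookie attribute has a disallowed character."""
    for name, value in attrs.items():
        if value is None:
            continue
        if _BAD_COOKIE_CHARS.search(str(value)):
            raise ValueError(
                f"invalid character in cookie attribute {name}: {value!r}"
            )
```

Checking path and domain as well as the value matters because a semicolon in any attribute could introduce an extra attribute into the Set-Cookie header.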
Ben Darnell [Tue, 3 Mar 2026 19:36:14 +0000 (14:36 -0500)]
httputil: Add limits on multipart form data parsing
The new default limits prevent a DoS vulnerability involving
requests with many multipart parts. It also adds a defense-in-depth
limit on the size of multipart headers, which would have mitigated
the vulnerability fixed in 6.5.3.
New data structures are added to allow users to configure these limits,
and to disable multipart parsing entirely if they choose. However,
due to the complexity of the plumbing required to pass these
configuration options through the stack, the only configuration
provided in this commit is the ability to set a global default.
Ben Darnell [Thu, 11 Dec 2025 03:00:03 +0000 (22:00 -0500)]
test: Use time.perf_counter instead of time.time for performance tests
On Windows, time.time has low resolution (about 15ms), which makes
performance tests flaky. time.perf_counter has much higher resolution
and is the recommended way to measure elapsed time.
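
For example, an elapsed-time measurement with time.perf_counter might look like this (`timed` is a hypothetical helper, not part of the test suite):

```python
import time


def timed(func, *args):
    """Call func(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed
```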
Ben Darnell [Wed, 10 Dec 2025 20:15:25 +0000 (15:15 -0500)]
web: Harden against invalid HTTP reason phrases
We allow applications to set custom reason phrases for the HTTP status
line (to support custom status codes), but if this were exposed to
untrusted data it could be exploited in various ways. This commit
guards against invalid reason phrases in both HTTP headers and in
error pages.
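
A sketch of the two guards, assuming the reason-phrase grammar from RFC 9112 (tab, space, visible ASCII, and obs-text). `safe_reason`, `status_line`, and `error_page` are hypothetical names, and falling back to "Unknown" is an assumption of this example.

```python
import html
import re

# reason-phrase per RFC 9112: HTAB / SP / VCHAR / obs-text.
_REASON_RE = re.compile(r"[\t \x21-\x7e\x80-\xff]*")


def safe_reason(reason, default="Unknown"):
    """Return the reason if it is a valid reason-phrase, else a default."""
    return reason if _REASON_RE.fullmatch(reason) else default


def status_line(code, reason):
    """Build a status line that cannot contain injected CR or LF."""
    return f"HTTP/1.1 {code} {safe_reason(reason)}\r\n"


def error_page(code, reason):
    """Build an error page with the reason HTML-escaped."""
    escaped = html.escape(safe_reason(reason))
    return f"<html><title>{code}: {escaped}</title></html>"
```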
Ben Darnell [Wed, 10 Dec 2025 15:55:02 +0000 (10:55 -0500)]
httputil: Fix quadratic behavior in _parseparam
Prior to this change, _parseparam had O(n^2) behavior when parsing
certain inputs, which could be a DoS vector. This change adapts
logic from the equivalent function in the python standard library
in https://github.com/python/cpython/pull/136072/files
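
To illustrate the general idea of linear-time parameter splitting, here is a single-pass sketch. It is not the logic from the linked pull request or from _parseparam. `split_params` is a hypothetical name, and the quoting rules are simplified.

```python
def split_params(s):
    """Split on semicolons outside double quotes in one pass over s."""
    parts = []
    current = []
    in_quotes = False
    escaped = False
    for ch in s:
        if escaped:
            # Character following a backslash inside quotes is literal.
            current.append(ch)
            escaped = False
        elif ch == "\\" and in_quotes:
            current.append(ch)
            escaped = True
        elif ch == '"':
            in_quotes = not in_quotes
            current.append(ch)
        elif ch == ";" and not in_quotes:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return [p for p in parts if p]
```

Each character is examined once, so the cost grows linearly with the input instead of rescanning the prefix for every semicolon.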
Ben Darnell [Tue, 9 Dec 2025 15:19:34 +0000 (10:19 -0500)]
demos: Remove s3server demo
This program does not demonstrate anything particularly interesting
about Tornado, nor is it a good stylistic example to follow. Its
handling of path validation is rudimentary and can be insecure in
some configurations. It makes more sense to remove it than to
try to improve it.
Ben Darnell [Tue, 9 Dec 2025 18:27:27 +0000 (13:27 -0500)]
httputil: Fix quadratic performance of repeated header lines
Previously, when many header lines with the same name were found
in an HTTP request or response, repeated string concatenation would
result in quadratic performance. This change does the concatenation
lazily (with a cache) so that repeated headers can be processed
efficiently.
Security: The previous behavior allowed a denial of service attack
via a maliciously crafted HTTP message, but only if the
max_header_size was increased from its default of 64kB.
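
The lazy-concatenation idea can be sketched as follows. `LazyHeaders` is a hypothetical class, not the HTTPHeaders implementation, and joining repeated values with a comma is an assumption of this example.

```python
class LazyHeaders:
    """Store repeated header values in lists and join them on first read."""

    def __init__(self):
        self._parts = {}   # lowercased name -> list of raw values
        self._joined = {}  # lowercased name -> cached joined string

    def add(self, name, value):
        key = name.lower()
        # Appending to a list is O(1); no string is rebuilt here.
        self._parts.setdefault(key, []).append(value)
        self._joined.pop(key, None)  # invalidate the cached join

    def get(self, name):
        key = name.lower()
        if key not in self._joined:
            self._joined[key] = ",".join(self._parts[key])
        return self._joined[key]
```

Adding n values costs O(n) in total, and the single join happens only when the value is read, instead of rebuilding a growing string on every repeated line.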
Ben Darnell [Tue, 9 Dec 2025 17:10:18 +0000 (12:10 -0500)]
process_test: Use isolated mode for subprocess tests
Prompt customizations (notably the PYTHONSTARTUP file used by
vscode's terminal integration) can interfere with tests that run
interactive interpreters in a subprocess. Run those interpreters
in isolated mode to avoid this problem.
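
For reference, isolated mode is the interpreter's -I flag. A sketch of launching a subprocess that way (`run_isolated` is a hypothetical helper, not the one used in process_test):

```python
import subprocess
import sys


def run_isolated(code):
    """Run code in a child interpreter started with -I and return stdout."""
    result = subprocess.run(
        [sys.executable, "-I", "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

With -I the child ignores PYTHON* environment variables such as PYTHONSTARTUP and does not add the user site directory to sys.path.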
Ben Darnell [Tue, 22 Jul 2025 17:54:03 +0000 (17:54 +0000)]
http1connection: Improve error logging for invalid host headers
This was previously being logged as an uncaught exception in application
code, which is wrong for a malformed request. HTTPInputError now passes
through the app-error logging to be caught and reported as a 400
(which logs at the warning level to the access log and info to the
general log).
Ben Darnell [Thu, 24 Jul 2025 20:37:48 +0000 (20:37 +0000)]
websocket: Expand testing of next-ping calculation
Includes end-to-end tests that the correct number of pings are sent
(piggybacking on an existing test) and a unit test for the
`ping_sleep_time` calculation.
Oliver Sanders [Thu, 19 Jun 2025 10:06:29 +0000 (11:06 +0100)]
websocket_ping: fix ping interval with non-zero timeout and improve docs.
* Fix a bug that caused the ping interval to be less frequent than
configured.
* Fix erroneous documentation of the websocket_ping_timeout default and
clarify units for the ping interval.
Ben Darnell [Thu, 22 May 2025 14:59:48 +0000 (10:59 -0400)]
httputil: Fix support for non-latin1 filenames in multipart uploads
The change to be stricter about characters allowed in HTTP headers
inadvertently broke support for non-latin1 filenames in multipart
uploads (this was missed in testing because our i18n test case only
used characters in latin1). This commit adds a hacky workaround without
changing any APIs to make it safe for a 6.5.1 patch release; a more
robust solution will follow for future releases.
Ben Darnell [Thu, 8 May 2025 17:29:43 +0000 (13:29 -0400)]
httputil: Raise errors instead of logging in multipart/form-data parsing
We used to continue after logging an error, which allowed repeated
errors to spam the logs. The error raised here will still be logged,
but only once per request, consistent with other error handling in
Tornado.
Ben Darnell [Fri, 25 Apr 2025 19:31:13 +0000 (15:31 -0400)]
httputil: Reject header lines beginning with invalid whitespace
The obs-fold feature is defined only for tabs and spaces.
The str.isspace() method also accepts other whitespace characters.
These characters are not valid in HTTP headers and should be treated
as errors instead of triggering line folding.
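
The difference is easy to see in a small sketch (`is_continuation` is a hypothetical helper, not the function in httputil):

```python
def is_continuation(line):
    """Return True only for lines starting with a space or a tab."""
    return line[:1] in (" ", "\t")


# str.isspace() is broader: vertical tab and no-break space also count.
assert "\x0b".isspace() and "\xa0".isspace()
assert not is_continuation("\x0bfolded")
assert is_continuation("\tfolded")
```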
Ben Darnell [Fri, 25 Apr 2025 18:08:18 +0000 (14:08 -0400)]
httputil: Process the Host header more strictly
- It is now an error to have multiple Host headers
- The Host header is now mandatory except in HTTP/1.0 mode
- Host headers containing characters that are disallowed by RFC 3986
are now rejected
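
The three rules above can be sketched as a single check. `check_host` is a hypothetical function, and the character set is this example's reading of RFC 3986 (unreserved, sub-delims, colon, brackets, and percent), not necessarily the exact pattern Tornado uses.

```python
import re

# Assumed allowed characters: unreserved / sub-delims / ":" / "[" "]" / "%".
_HOST_RE = re.compile(r"[A-Za-z0-9\-._~!$&'()*+,;=:\[\]%]*")


def check_host(host_values, version):
    """Validate the list of Host header values seen in one request."""
    if len(host_values) > 1:
        raise ValueError("multiple Host headers")
    if not host_values:
        if version == "HTTP/1.0":
            return None  # Host is optional in HTTP/1.0
        raise ValueError("missing Host header")
    host = host_values[0]
    if not _HOST_RE.fullmatch(host):
        raise ValueError("invalid character in Host header")
    return host
```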
Ben Darnell [Fri, 25 Apr 2025 15:53:44 +0000 (11:53 -0400)]
websocket: deprecate callback argument to websocket_connect
This was missed in the 6.0-era deprecation of callback arguments.
The on_message_callback remains because even in coroutine-oriented
code it is often more convenient to use a callback than to
loop on read_message.
Oliver Sanders [Tue, 22 Apr 2025 17:19:00 +0000 (18:19 +0100)]
websockets: fix ping_timeout (#3376)
* websockets: fix ping_timeout
* Closes #3258
* Closes #2905
* Closes #2655
* Fixes an issue with the calculation of ping timeout interval that
could cause connections to be erroneously timed out and closed
from the server end.
Ben Darnell [Wed, 19 Feb 2025 19:06:22 +0000 (14:06 -0500)]
httputil: Improve handling of trailing whitespace in headers
HTTPHeaders had undocumented assumptions about trailing whitespace,
leading to an unintentional regression in Tornado 6.4.1 in which
passing the arguments of an AsyncHTTPClient header_callback to
HTTPHeaders.parse_line would result in errors.
This commit moves newline parsing from parse to parse_line.
It also strips trailing whitespace from continuation lines (trailing
whitespace is not allowed in HTTP headers, but we didn't reject it
in continuation lines).
This commit also deprecates continuation lines and the legacy
handling of LF without CR.
Ben Darnell [Thu, 27 Mar 2025 20:30:08 +0000 (16:30 -0400)]
httputil: Make parse_request_start_line stricter
The method is now restricted to valid token characters as defined
in RFC 9110, allowing us to correctly issue status code 400 or 405
as appropriate (this can make a difference with some caching proxies).
The request-target no longer allows control characters. This is less
strict than the RFC (which does not allow non-ascii characters),
but prioritizes backwards compatibility.
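
A sketch of the stricter request-line rules, assuming the token grammar from RFC 9110. `parse_request_line_sketch` is a hypothetical name and is not parse_request_start_line itself.

```python
import re

# token = 1*tchar, per RFC 9110.
_TOKEN_RE = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")
# Request-target: no control characters or spaces; non-ASCII is tolerated.
_TARGET_RE = re.compile(r"[^\x00-\x20\x7f]+")
_VERSION_RE = re.compile(r"HTTP/[0-9]\.[0-9]")


def parse_request_line_sketch(line):
    """Split a request line into (method, target, version) or raise."""
    parts = line.split(" ")
    if len(parts) != 3:
        raise ValueError("malformed request line")
    method, target, version = parts
    if not _TOKEN_RE.fullmatch(method):
        raise ValueError("invalid method")
    if not _TARGET_RE.fullmatch(target):
        raise ValueError("invalid request-target")
    if not _VERSION_RE.fullmatch(version):
        raise ValueError("invalid HTTP version")
    return method, target, version
```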
Ben Darnell [Thu, 27 Mar 2025 20:22:33 +0000 (16:22 -0400)]
httputil: Centralize regexes based directly on RFCs
This will make it easier to stay in strict conformance with the RFCs.
Note that this commit makes a few small semantic changes to response
start-line parsing: status codes must be exactly three digits, and
control characters are not allowed in reason phrases.
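
The two semantic changes can be illustrated with a pattern like the one below. It is an assumption written for this example from the RFC 9112 status-line grammar, not the regex added in the commit. `parse_status_line_sketch` is a hypothetical name.

```python
import re

# HTTP-version SP 3DIGIT, then optionally SP and a reason-phrase
# (HTAB / SP / VCHAR / obs-text, so no control characters).
_STATUS_LINE_RE = re.compile(
    r"(HTTP/[0-9]\.[0-9]) ([0-9]{3})(?: ([\t \x21-\x7e\x80-\xff]*))?"
)


def parse_status_line_sketch(line):
    """Return (version, code, reason) or raise ValueError."""
    match = _STATUS_LINE_RE.fullmatch(line)
    if match is None:
        raise ValueError("malformed status line")
    return match.group(1), int(match.group(2)), match.group(3) or ""
```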