From: Rich Bowen
Date: Thu, 14 May 2026 20:12:01 +0000 (+0000)
Subject: rewrite guide: document URL encoding/decoding pipeline
X-Git-Url: http://git.ipfire.org/gitweb/index.cgi?a=commitdiff_plain;h=81a33d0ead60a6b90d40bd1e1e079802c364914d;p=thirdparty%2Fapache%2Fhttpd.git

rewrite guide: document URL encoding/decoding pipeline

tech.xml: new "URL Encoding and Decoding" section explaining that httpd
unescapes the URI before pattern matching, how to use %{THE_REQUEST} for
the raw form, AllowEncodedSlashes options, and a summary of the
[B]/[BNP]/[NE] flags with links to flags.xml.

flags.xml: add cross-references to tech.html#encoding from the [B],
[BNP], and [NE] flag sections. Restore section headers for flag_bnp
and flag_bctls that were inadvertently dropped.

git-svn-id: https://svn.apache.org/repos/asf/httpd/httpd/trunk@1934202 13f79535-47bb-0310-9956-ffa450edef68
---

diff --git a/docs/manual/rewrite/TODO.md b/docs/manual/rewrite/TODO.md
index 705b876746..06bd28adfb 100644
--- a/docs/manual/rewrite/TODO.md
+++ b/docs/manual/rewrite/TODO.md
@@ -66,7 +66,7 @@ address. Sorted by priority.
 Behind a reverse proxy, check %{HTTP:X-Forwarded-Proto} instead.
 Add to the HTTPS redirect recipe in remapping.xml.
-- [ ] **URL encoding pipeline** — Apache decodes percent-encoded chars
+- [x] **URL encoding pipeline** — Apache decodes percent-encoded chars
 before pattern matching. %{THE_REQUEST} preserves the raw form.
 AllowEncodedSlashes, [B]/[NE]/[BNP] flags. No coherent explanation
 exists in the guide. Could be a new section in
diff --git a/docs/manual/rewrite/flags.xml b/docs/manual/rewrite/flags.xml
index 1e586b8a1d..ae1c13e01a 100644
--- a/docs/manual/rewrite/flags.xml
+++ b/docs/manual/rewrite/flags.xml
@@ -123,8 +123,12 @@ RewriteRule "^search/(.*)$" "/search.php?term=$1" "[B= ?]"

To limit the characters escaped this way, see #flag_bne and #flag_bctls

+
+See URL Encoding and Decoding for
+a full explanation of how Apache decodes URIs before pattern matching.

+
BNP|backrefnoplus (don't escape space to +)

The [BNP] flag instructs RewriteRule to escape the space character
@@ -139,8 +143,12 @@ RewriteRule "^search/(.*)$" "/search.php/$1" "[B,BNP]"

This flag is available in version 2.4.26 and later.

+
+See URL Encoding and Decoding for
+background on how encoding is handled in the rewrite pipeline.

+
BCTLS

The [BCTLS] flag is similar to the [B] flag, but only escapes control characters and the space character. This is the same set of
@@ -583,6 +591,9 @@ being converted to its hexcode equivalent, %23, which will then result in a 404 Not Found error condition.

+
+See URL Encoding and Decoding for
+the full picture of how Apache encodes and decodes URIs during
+rewriting.

NS|nosubreq
diff --git a/docs/manual/rewrite/tech.xml b/docs/manual/rewrite/tech.xml
index d70ae1fe39..e73de19ddd 100644
--- a/docs/manual/rewrite/tech.xml
+++ b/docs/manual/rewrite/tech.xml
@@ -129,6 +129,77 @@ RewriteRule "^/old" "/other" [L]
+URL Encoding and Decoding
+
+Apache httpd unescapes URL-encoded characters in the request URI before any
+RewriteRule pattern
+matching takes place. A request for
+/my%20page/cats%3Fdogs is decoded to
+/my page/cats?dogs, and that decoded string is what the
+RewriteRule pattern matches against.

+
+This means you cannot write a pattern that matches the literal
+URL-encoded form. If you need to distinguish
+/horses%2Fponies from /horses/ponies, use
+%{THE_REQUEST} in a RewriteCond, which preserves the
+original request line exactly as the client sent it, before any
+decoding:

+
+# Match only the literally-encoded %2F, not a real path separator
+RewriteCond "%{THE_REQUEST}" "/horses%2F"
+RewriteRule "^/horses/ponies$" "/special-handler" [L]
+
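For contrast, and purely as an illustration that is not part of this patch, a rule aimed at the first example above has to be written against the decoded path. The sketch assumes server or virtual-host context (hence the leading slash) and a hypothetical target, /pets.html. The space and the question mark appear literally in the pattern because decoding has already happened by the time the pattern is evaluated.

```apache
# Sketch only: /pets.html is a hypothetical target.
# A request for /my%20page/cats%3Fdogs is matched in its decoded form,
# so the pattern contains a literal space and an escaped literal "?".
RewriteEngine On
RewriteRule "^/my page/cats\?dogs$" "/pets.html" [L]
```

A pattern written as "^/my%20page/" would not match this request, which is why the raw form has to be tested through %{THE_REQUEST} instead.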

+After substitution, mod_rewrite re-encodes the
+resulting URI for output. Several flags control this behavior:

+ + + +
+AllowEncodedSlashes
+
+By default, Apache returns 404 for any request whose path contains an
+encoded slash (%2F). The AllowEncodedSlashes directive controls
+this behavior:

+ +
+
+  • Off (default): reject %2F with 404.
+  • On: allow %2F and decode it to / before passing to handlers.
+  • NoDecode: allow %2F but keep it in encoded form, letting the backend
+    application distinguish it from a real path separator.
+
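The three settings above might appear in a virtual host as follows. This is a minimal sketch, not part of this patch: the host name is hypothetical, and only one AllowEncodedSlashes line would be active at a time.

```apache
<VirtualHost *:80>
    # example.test is a hypothetical host name
    ServerName example.test

    # Default: a request path containing %2F is refused with 404
    # AllowEncodedSlashes Off

    # Accept %2F and decode it to "/"
    # AllowEncodedSlashes On

    # Accept %2F and leave it encoded for the backend to interpret
    AllowEncodedSlashes NoDecode
</VirtualHost>
```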

+When using the [B] flag with
+URLs that may contain encoded slashes, you typically need
+AllowEncodedSlashes NoDecode to prevent Apache from
+rejecting the re-encoded result.
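A sketch of that combination, reusing the search rule that appears in the [B] examples in flags.xml. It is not part of this patch, and the directory path is an assumption chosen only to give the per-directory rule somewhere to live.

```apache
# Server or virtual-host level: tolerate %2F in the request path
AllowEncodedSlashes NoDecode

<Directory "/var/www/html">
    # /var/www/html is an assumed document root
    RewriteEngine On
    # [B] re-escapes the captured term before it is placed in the query string
    RewriteRule "^search/(.*)$" "/search.php?term=$1" "[B]"
</Directory>
```

Whether NoDecode or On is the better fit depends on whether the backend needs to tell an encoded slash apart from a real path separator.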

+ +
+ +
+
Ruleset Processing

Now when mod_rewrite is triggered in these two API phases, it