[3.14] gh-138907: Support RFC 9309 in robotparser (GH-138908) (GH-149374)
author Miss Islington (bot) <31488909+miss-islington@users.noreply.github.com>
Mon, 4 May 2026 18:28:54 +0000 (20:28 +0200)
committer GitHub <noreply@github.com>
Mon, 4 May 2026 18:28:54 +0000 (18:28 +0000)
commit 3b0a3c4738da288956a83a1b402d56d2da1fc977
tree 4bc870d514ce73cff3bca8d0bd2a0e3353254080
parent b05ee207513991541fe280e109a7aeb6eb6be9b6

* empty lines are always ignored instead of separating groups
* the "user-agent" line after a rule starts a new group
* groups matching the same user agent are now merged
* the rule with the longest match wins instead of the first matching rule
* in case of equal matches, the “Allow” rule wins over “Disallow”
* special characters “$” and “*” are now supported in rules
* prefer full match for user agent
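
The rules listed above can be illustrated with a small standalone sketch. This is not the code from Lib/urllib/robotparser.py; the helper names `parse_groups`, `rule_to_regex` and `is_allowed` are hypothetical and exist only for this illustration. It assumes that match length is measured as the length of the rule's pattern, that an empty pattern matches nothing, and it does not model user-agent selection (the "prefer full match" item).

```python
import re
from collections import defaultdict


def parse_groups(text):
    """Return {lowercased user agent: [(allow, pattern), ...]} (illustrative only)."""
    groups = defaultdict(list)
    agents = []        # user agents of the group currently being read
    seen_rule = False  # True once the current group contains a rule
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue  # empty or malformed lines never separate groups
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if seen_rule:  # a user-agent line after a rule starts a new group
                agents, seen_rule = [], False
            agents.append(value.lower())
        elif field in ("allow", "disallow"):
            seen_rule = True
            for agent in agents:  # groups for the same agent are merged
                groups[agent].append((field == "allow", value))
    return dict(groups)


def rule_to_regex(pattern):
    """Translate a rule path: "*" matches any run, a trailing "$" anchors the end."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_allowed(rules, path):
    """Longest matching pattern wins; on equal length, Allow beats Disallow."""
    best = None  # (pattern length, allow); tuple order makes True win ties
    for allow, pattern in rules:
        if not pattern:
            continue  # assumption: an empty pattern matches nothing
        if rule_to_regex(pattern).match(path):
            key = (len(pattern), allow)
            if best is None or key > best:
                best = key
    return True if best is None else best[1]


robots_txt = """
User-agent: bot

Disallow: /a
Disallow: /*.php$
User-agent: other
Disallow: /
User-agent: bot
Allow: /a/b
"""
groups = parse_groups(robots_txt)
print(is_allowed(groups["bot"], "/a/b/c"))     # longest match is the Allow rule
print(is_allowed(groups["bot"], "/a/x"))       # only the Disallow rule matches
print(is_allowed(groups["bot"], "/index.php")) # matched by the "*" and "$" rule
```

Comparing `(length, allow)` tuples keeps the tie-break in one place: a longer pattern always wins, and at equal length `True` (Allow) sorts above `False` (Disallow).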

(cherry picked from commit bc285e583286c739e553e49c19fd946cb63432c7)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Doc/library/urllib.robotparser.rst
Lib/test/test_robotparser.py
Lib/urllib/robotparser.py
Misc/NEWS.d/next/Library/2026-04-25-14-11-24.gh-issue-138907.u21Wnh.rst [new file with mode: 0644]