From 91b85c534678fc21facb04530eac3609116b4dc9 Mon Sep 17 00:00:00 2001
From: "Miss Islington (bot)" <31488909+miss-islington@users.noreply.github.com>
Date: Wed, 5 Nov 2025 20:06:37 +0100
Subject: [PATCH] [3.13] gh-139313: Improve docs on XML security (GH-139460) (GH-141066)

Clarify that:
- it takes parsing for an attack
- that some doors are closed by default
- only Expat version 2.7.2 has all the fixes
- use of the bundle depends on configuration

(cherry picked from commit baa9f338971c6a13433a8232db77cd45e6b87b77)

Co-authored-by: Sebastian Pipping
---
 Doc/library/pyexpat.rst |  9 +++++++++
 Doc/library/xml.rst     | 22 +++++++++++++++++-----
 2 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/Doc/library/pyexpat.rst b/Doc/library/pyexpat.rst
index 10bc4c7bdf93..0164d2d358c5 100644
--- a/Doc/library/pyexpat.rst
+++ b/Doc/library/pyexpat.rst
@@ -558,6 +558,15 @@ otherwise stated.
 
 .. method:: xmlparser.ExternalEntityRefHandler(context, base, systemId, publicId)
 
+   .. warning::
+
+      Implementing a handler that accesses local files and/or the network
+      may create a vulnerability to
+      `external entity attacks `_
+      if :class:`xmlparser` is used with user-provided XML content.
+      Please reflect on your `threat model `_
+      before implementing this handler.
+
    Called for references to external entities.  *base* is the current base, as set
    by a previous call to :meth:`SetBase`.  The public and system identifiers,
    *systemId* and *publicId*, are strings if given; if the public identifier is not
diff --git a/Doc/library/xml.rst b/Doc/library/xml.rst
index 3f7455734744..acd8d399fe32 100644
--- a/Doc/library/xml.rst
+++ b/Doc/library/xml.rst
@@ -53,11 +53,22 @@ XML security
 
 An attacker can abuse XML features to carry out denial of service attacks,
 access local files, generate network connections to other machines, or
-circumvent firewalls.
-
-Expat versions lower than 2.6.0 may be vulnerable to "billion laughs",
-"quadratic blowup" and "large tokens".  Python may be vulnerable if it uses such
-older versions of Expat as a system-provided library.
+circumvent firewalls when attacker-controlled XML is being parsed,
+in Python or elsewhere.
+
+The built-in XML parsers of Python rely on the library `libexpat`_, commonly
+called Expat, for parsing XML.
+
+By default, Expat itself does not access local files or create network
+connections.
+
+Expat versions lower than 2.7.2 may be vulnerable to the "billion laughs",
+"quadratic blowup" and "large tokens" vulnerabilities, or to disproportional
+use of dynamic memory.
+Python bundles a copy of Expat, and whether Python uses the bundled or a
+system-wide Expat, depends on how the Python interpreter
+:option:`has been configured <--with-system-expat>` in your environment.
+Python may be vulnerable if it uses such older versions of Expat.
 Check :const:`!pyexpat.EXPAT_VERSION`.
 
 :mod:`xmlrpc` is **vulnerable** to the "decompression bomb" attack.
@@ -90,5 +101,6 @@ large tokens
 be used to cause denial of service in the application parsing XML.
 The issue is known as :cve:`2023-52425`.
 
+.. _libexpat: https://github.com/libexpat/libexpat
 .. _Billion Laughs: https://en.wikipedia.org/wiki/Billion_laughs
 .. _ZIP bomb: https://en.wikipedia.org/wiki/Zip_bomb
-- 
2.47.3
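
Note for readers of this patch: the new text in xml.rst asks users to check
pyexpat.EXPAT_VERSION against 2.7.2. A minimal sketch of such a check follows.
It is not part of the patch. The helper names parse_expat_version and
expat_is_at_least are hypothetical, and the sketch assumes the version string
has the form "expat_X.Y.Z".

```python
import pyexpat


def parse_expat_version(version_string):
    """Turn a string such as 'expat_2.7.2' into the tuple (2, 7, 2).

    Assumption: the string ends in a dotted numeric version, optionally
    preceded by a prefix and an underscore.
    """
    number_part = version_string.rsplit("_", 1)[-1]
    return tuple(int(piece) for piece in number_part.split("."))


def expat_is_at_least(minimum=(2, 7, 2), version_string=None):
    """Return True if the given (or the running) Expat is >= minimum."""
    if version_string is None:
        # Version of the Expat library this interpreter is actually using,
        # whether bundled or system-wide.
        version_string = pyexpat.EXPAT_VERSION
    return parse_expat_version(version_string) >= minimum


if __name__ == "__main__":
    print(pyexpat.EXPAT_VERSION, expat_is_at_least())
```

The check reports only which Expat the running interpreter uses. It does not
say whether that copy is the bundled one or a system-wide one, which depends
on how the interpreter was configured.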