From: Bernát Gábor
Date: Tue, 2 Jun 2026 07:45:30 +0000 (-0700)
Subject: gh-150717: Avoid mark-array allocation for groupless regex patterns (GH-150719)
X-Git-Url: http://git.ipfire.org/gitweb/index.cgi?a=commitdiff_plain;h=c79e18a8e5ff4fda1a3d9201e65b0c6048b56b68;p=thirdparty%2FPython%2Fcpython.git

gh-150717: Avoid mark-array allocation for groupless regex patterns (GH-150719)

state_init() always did PyMem_New(state->mark, groups*2), which for a
pattern with no capturing groups is PyMem_Malloc(0) -- a real allocation
(plus matching free) on every match/search/fullmatch call, for an array
that is never read: groupless patterns emit no MARK opcodes and group 0's
span is taken from state->start/ptr.

Guard the allocation with `if (pattern->groups)`. state->mark stays NULL
(set by the preceding memset), and both the error path and state_fini
already PyMem_Free(NULL) safely.
---

diff --git a/Misc/NEWS.d/next/Library/2026-06-01-08-12-34.gh-issue-150717.LVRJXH.rst b/Misc/NEWS.d/next/Library/2026-06-01-08-12-34.gh-issue-150717.LVRJXH.rst
new file mode 100644
index 000000000000..da7171fcc72c
--- /dev/null
+++ b/Misc/NEWS.d/next/Library/2026-06-01-08-12-34.gh-issue-150717.LVRJXH.rst
@@ -0,0 +1,2 @@
+Avoid an unnecessary per-call memory allocation when matching :mod:`re`
+patterns that have no capturing groups. Patch by Bernát Gábor.
diff --git a/Modules/_sre/sre.c b/Modules/_sre/sre.c
index 7a07ed1d7aca..058a03148c82 100644
--- a/Modules/_sre/sre.c
+++ b/Modules/_sre/sre.c
@@ -548,10 +548,16 @@ state_init(SRE_STATE* state, PatternObject* pattern, PyObject* string,
 
     memset(state, 0, sizeof(SRE_STATE));
 
-    state->mark = PyMem_New(const void *, pattern->groups * 2);
-    if (!state->mark) {
-        PyErr_NoMemory();
-        goto err;
+    /* Patterns with no capturing groups never emit MARK opcodes and never
+       read state->mark (group 0's span comes from state->start/ptr), so skip
+       the allocation entirely -- state->mark stays NULL, which both the err
+       path and state_fini already free safely. */
+    if (pattern->groups) {
+        state->mark = PyMem_New(const void *, pattern->groups * 2);
+        if (!state->mark) {
+            PyErr_NoMemory();
+            goto err;
+        }
     }
     state->lastmark = -1;
     state->lastindex = -1;
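
For readers who want to try the guard outside CPython, the following is a minimal standalone sketch of the same pattern. It is not the code in sre.c: demo_state, demo_state_init and demo_state_fini are hypothetical names, and malloc/free stand in for PyMem_New/PyMem_Free. It also assumes, as the patch does, that memset to zero leaves the pointer NULL, which holds on mainstream platforms.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for SRE_STATE: only the fields the hunk touches. */
typedef struct {
    const void **mark;    /* group boundaries, two slots per capturing group */
    ptrdiff_t lastmark;
    ptrdiff_t lastindex;
} demo_state;

/* Returns 0 on success, -1 if the mark array cannot be allocated. */
static int
demo_state_init(demo_state *state, size_t groups)
{
    memset(state, 0, sizeof(*state));    /* leaves state->mark == NULL */

    /* Skip the allocation entirely when there are no capturing groups. */
    if (groups) {
        /* Overflow check, since plain malloc does not do one for us. */
        if (groups > SIZE_MAX / (2 * sizeof(*state->mark))) {
            return -1;
        }
        state->mark = malloc(groups * 2 * sizeof(*state->mark));
        if (!state->mark) {
            return -1;
        }
    }
    state->lastmark = -1;
    state->lastindex = -1;
    return 0;
}

static void
demo_state_fini(demo_state *state)
{
    free(state->mark);    /* free(NULL) is a no-op, so no guard is needed */
    state->mark = NULL;
}
```

With groups == 0 the init path performs no allocation and the teardown path needs no special case, which mirrors the reasoning in the commit message.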