Python opcodes, rather than one large C case statement.
For certain newer compilers, this interpreter provides
significantly better performance. Preliminary numbers on our machines suggest
-anywhere from -3% to 30% faster Python code, and a geometric mean of 9-15%
+anywhere up to 30% faster Python code, and a geometric mean of 3-5%
faster on ``pyperformance`` depending on platform and architecture. The
baseline is Python 3.14 built with Clang 19 without this new interpreter.
__ https://en.wikipedia.org/wiki/Tail_call
+.. attention::
+
+ This section previously reported a 9-15% geomean speedup. This number has since been
+ cautiously revised down to 3-5%. While we expect performance results to be better
+ than what we report, our estimates are more conservative due to a
+ `compiler bug <https://github.com/llvm/llvm-project/issues/106846>`_ found in
+ Clang/LLVM 19. We were unaware of this bug, and it artificially boosted
+ our numbers, resulting in inaccurate results. We sincerely apologize for
+ communicating results that were only accurate for certain versions of LLVM 19
+ and 20. At the time of writing, this bug has not yet been fixed in LLVM 19-21. Thus
+ any benchmarks with those versions of LLVM may produce artificially inflated numbers.
+ (Thanks to Nelson Elhage for bringing this to light.)
+
(Contributed by Ken Jin in :gh:`128563`, with ideas on how to implement this
in CPython by Mark Shannon, Garrett Gu, Haoran Xu, and Josh Haberman.)