Historically, the adaptive interpreter was referred to as `tier 1` and
the JIT as `tier 2`. You will see remnants of this in the code.
-## The Optimizer and Executors
+## The Trace Recorder and Executors
-The program begins running on the adaptive interpreter, until a `JUMP_BACKWARD`
-instruction determines that it is "hot" because the counter in its
+This section describes two interpreters:
+ 1. Adaptive interpreter (the default behavior)
+ 2. Trace recording interpreter (enabled on JIT builds)
+
+The program begins running on the adaptive interpreter, until a `JUMP_BACKWARD` or
+`RESUME` instruction determines that it is "hot" because the counter in its
[inline cache](interpreter.md#inline-cache-entries) indicates that it
executed more than some threshold number of times (see
[`backoff_counter_triggers`](../Include/internal/pycore_backoff.h)).
-It then calls the function `_PyOptimizer_Optimize()` in
+It then calls the function `_PyJit_TryInitializeTracing` in
[`Python/optimizer.c`](../Python/optimizer.c), passing it the current
-[frame](frames.md) and instruction pointer. `_PyOptimizer_Optimize()`
-constructs an object of type
-[`_PyExecutorObject`](../Include/internal/pycore_optimizer.h) which implements
-an optimized version of the instruction trace beginning at this jump.
-
-The optimizer determines where the trace ends, and the executor is set up
+[frame](frames.md), instruction pointer and state.
+The interpreter then switches into "tracing mode" via the macro
+`ENTER_TRACING()`. On platforms that support computed gotos or tail-calling
+interpreters, the dispatch table is swapped out; platforms that support
+neither use a single flag in the opcode instead.
+Execution of the normal interpreter and the tracing interpreter is
+interleaved via this dispatch mechanism. This means that while logically
+there are two interpreters, the implementation appears to be a single
+interpreter.
+
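The hot-counter trigger and the dispatch-table swap described above can be modeled with a small toy, shown here as a minimal sketch rather than CPython's actual C implementation. The names `ToyInterpreter`, `HOT_THRESHOLD`, the three toy opcodes, and the "stop when the loop closes" condition are all assumptions made for illustration. The point is only that a single dispatch loop can behave like two interpreters once its table is swapped.

```python
from typing import Callable, Dict, List

HOT_THRESHOLD = 3  # hypothetical value, not CPython's real threshold


class ToyInterpreter:
    """Toy model: one dispatch loop whose dispatch table can be swapped."""

    def __init__(self) -> None:
        self.counter = 0               # stands in for the inline-cache counter
        self.recorded: List[str] = []  # instruction names seen while tracing
        self.normal_table: Dict[str, Callable[[], None]] = {
            "LOAD": self.op_noop,
            "ADD": self.op_noop,
            "JUMP_BACKWARD": self.op_jump_backward,
        }
        # Each tracing handler records the instruction, then runs the normal one.
        self.tracing_table: Dict[str, Callable[[], None]] = {
            name: self.make_tracing_handler(name, handler)
            for name, handler in self.normal_table.items()
        }
        self.table = self.normal_table

    def op_noop(self) -> None:
        pass

    def op_jump_backward(self) -> None:
        if self.table is self.tracing_table:
            return                     # already tracing: do not count
        self.counter += 1
        if self.counter >= HOT_THRESHOLD:
            self.counter = 0
            self.table = self.tracing_table   # analogue of ENTER_TRACING()

    def make_tracing_handler(
        self, name: str, handler: Callable[[], None]
    ) -> Callable[[], None]:
        def record_then_run() -> None:
            self.recorded.append(name)
            handler()
            if name == "JUMP_BACKWARD":         # toy stop condition: loop closed
                self.table = self.normal_table  # analogue of LEAVE_TRACING()
        return record_then_run

    def run(self, program: List[str], iterations: int) -> None:
        for _ in range(iterations):
            for name in program:
                self.table[name]()     # the table lookup is the only dispatch point


interp = ToyInterpreter()
interp.run(["LOAD", "ADD", "JUMP_BACKWARD"], iterations=5)
print(interp.recorded)
```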
+During tracing mode, after each interpreter instruction's `DISPATCH()`,
+the interpreter jumps to the `TRACE_RECORD` instruction. This instruction
+records the previously executed instruction, along with any live values of the
+next operation that it may require. It then translates the previous instruction to
+a sequence of micro-ops using `_PyJit_translate_single_bytecode_to_trace`.
+To ensure that the adaptive interpreter instructions
+and cache entries are up-to-date, the trace recording interpreter always resets
+the adaptive counters of adaptive instructions it sees.
+This forces re-specialization of the new instruction should an instruction
+deoptimize, thus feeding the trace recorder up-to-date information.
+Finally, the `TRACE_RECORD` instruction decides when to stop tracing
+using various heuristics.
+
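As a rough illustration of the recording step, the sketch below translates each executed instruction into micro-ops and applies two stop conditions. It is a simplified model under stated assumptions: `UOP_EXPANSION`, `MAX_TRACE_LENGTH`, `TraceRecorder`, and `record_trace` are hypothetical names, the micro-op names are illustrative and not checked against the generated metadata, and the stop conditions are stand-ins for the real heuristics, which this document does not enumerate.

```python
from typing import Dict, List

# Hypothetical expansion table with illustrative micro-op names.
UOP_EXPANSION: Dict[str, List[str]] = {
    "LOAD_FAST": ["_LOAD_FAST"],
    "BINARY_OP_ADD_INT": ["_GUARD_TOS_INT", "_GUARD_NOS_INT", "_BINARY_OP_ADD_INT"],
    "JUMP_BACKWARD": ["_JUMP_TO_TOP"],
}
MAX_TRACE_LENGTH = 8  # hypothetical limit on the number of micro-ops


class TraceRecorder:
    """Collects micro-ops for the instructions executed in tracing mode."""

    def __init__(self) -> None:
        self.uops: List[str] = []
        self.done = False

    def record(self, opcode: str) -> None:
        """Translate one executed bytecode and decide whether to stop."""
        expansion = UOP_EXPANSION.get(opcode)
        if expansion is None:
            self.done = True   # assumed rule: untranslatable opcode ends the trace
            return
        if len(self.uops) + len(expansion) > MAX_TRACE_LENGTH:
            self.done = True   # assumed rule: trace buffer would overflow
            return
        self.uops.extend(expansion)
        if opcode == "JUMP_BACKWARD":
            self.done = True   # assumed rule: the loop has been closed


def record_trace(executed: List[str]) -> List[str]:
    """Feed executed opcodes to a recorder until it decides to stop."""
    recorder = TraceRecorder()
    for opcode in executed:
        recorder.record(opcode)
        if recorder.done:
            break
    return recorder.uops


trace = record_trace(
    ["LOAD_FAST", "LOAD_FAST", "BINARY_OP_ADD_INT", "JUMP_BACKWARD", "LOAD_FAST"]
)
print(trace)
```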
+Once trace recording concludes, `LEAVE_TRACING()` swaps the dispatch table
+back, or unsets the opcode flag set earlier by `ENTER_TRACING()`.
+`stop_tracing_and_jit()` then calls `_PyOptimizer_Optimize()`, which optimizes
+the trace and constructs an
+[`_PyExecutorObject`](../Include/internal/pycore_optimizer.h).
+
+JIT execution is set up
to either return to the adaptive interpreter and resume execution, or
transfer control to another executor (see `_PyExitData` in
-Include/internal/pycore_optimizer.h).
+Include/internal/pycore_optimizer.h). When resuming to the adaptive interpreter,
+a "side exit", generated by an `EXIT_IF`, may trigger recording of another trace,
+while a "deopt", generated by a `DEOPT_IF`, does not trigger recording.
The executor is stored on the [`code object`](code_objects.md) of the frame,
in the `co_executors` field which is an array of executors. The start
The micro-op (abbreviated `uop` to approximate `μop`) optimizer is defined in
[`Python/optimizer.c`](../Python/optimizer.c) as `_PyOptimizer_Optimize`.
-It translates an instruction trace into a sequence of micro-ops by replacing
-each bytecode by an equivalent sequence of micro-ops (see
-`_PyOpcode_macro_expansion` in
-[pycore_opcode_metadata.h](../Include/internal/pycore_opcode_metadata.h)
-which is generated from [`Python/bytecodes.c`](../Python/bytecodes.c)).
-The micro-op sequence is then optimized by
+It takes a micro-op sequence from the trace recorder and optimizes it with
`_Py_uop_analyze_and_optimize` in
[`Python/optimizer_analysis.c`](../Python/optimizer_analysis.c)
and an instance of `_PyUOpExecutor_Type` is created to contain it.