]> git.ipfire.org Git - thirdparty/postgresql.git/commit
instrumentation: Use Time-Stamp Counter on x86-64 to lower overhead
authorAndres Freund <andres@anarazel.de>
Tue, 7 Apr 2026 16:48:07 +0000 (12:48 -0400)
committerAndres Freund <andres@anarazel.de>
Tue, 7 Apr 2026 17:00:24 +0000 (13:00 -0400)
commit294520c44487ecaade7a6ea8781b973f9ed03909
treeeae8a6668008306c8216fae91cfd65e484ceaa9b
parentbcb2cf41f964e9ed4d27abe4dee8daa249f659bb
instrumentation: Use Time-Stamp Counter on x86-64 to lower overhead

This allows the direct use of the Time-Stamp Counter (TSC) value retrieved
from the CPU using RDTSC/RDTSCP instructions, instead of APIs like
clock_gettime() on POSIX systems.

This reduces the overhead of EXPLAIN with ANALYZE and TIMING ON. Tests showed
that the overhead on top of actual runtime when instrumenting queries moving
lots of rows through the plan can be reduced from 2x as slow to 1.2x as slow
compared to the actual runtime. More complex workloads such as TPCH queries
have also shown ~20% gains when instrumented compared to before.

To control use of the TSC, the new "timing_clock_source" GUC is introduced,
whose default ("auto") automatically uses the TSC when reliable, for example
when running on modern Intel CPUs, or when running on Linux and the system
clocksource is reported as "tsc". The use of the operating system clock source
can be enforced by setting "system", or on x86-64 architectures the use of TSC
can be enforced by explicitly setting "tsc".

In order to use the TSC the frequency is first determined by use of CPUID, and
if not available, by running a short calibration loop at program start,
falling back to the system clock source if TSC values are not stable.

Note, that we split TSC usage into the RDTSC CPU instruction which does not
wait for out-of-order execution (faster, less precise) and the RDTSCP
instruction, which waits for outstanding instructions to retire. RDTSCP is
deemed to have little benefit in the typical InstrStartNode() /
InstrStopNode() use case of EXPLAIN, and can be up to twice as slow. To
separate these use cases, the new macro INSTR_TIME_SET_CURRENT_FAST() is
introduced, which uses RDTSC.

The original macro INSTR_TIME_SET_CURRENT() uses RDTSCP and is supposed to be
used when precision is more important than performance. When the system timing
clock source is used both of these macros instead utilize the system
APIs (clock_gettime / QueryPerformanceCounter) like before.

Additional users of interval timing, such as track_io_timing and
track_wal_io_timing could also benefit from being converted to use
INSTR_TIME_SET_CURRENT_FAST() but are left for future changes.

Author: Lukas Fittl <lukas@fittl.com>
Author: Andres Freund <andres@anarazel.de>
Author: David Geier <geidav.pg@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: David Geier <geidav.pg@gmail.com>
Reviewed-by: Lukas Fittl <lukas@fittl.com>
Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com> (in an earlier version)
Reviewed-by: Maciek Sakrejda <m.sakrejda@gmail.com> (in an earlier version)
Reviewed-by: Robert Haas <robertmhaas@gmail.com> (in an earlier version)
Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com> (in an earlier version)
Discussion: https://postgr.es/m/20200612232810.f46nbqkdhbutzqdg@alap3.anarazel.de
doc/src/sgml/config.sgml
src/backend/executor/instrument.c
src/backend/postmaster/launch_backend.c
src/backend/utils/misc/guc_parameters.dat
src/backend/utils/misc/guc_tables.c
src/backend/utils/misc/postgresql.conf.sample
src/common/instr_time.c
src/include/portability/instr_time.h
src/include/utils/guc_hooks.h
src/include/utils/guc_tables.h
src/tools/pgindent/typedefs.list