@node Tunables
@c @node Tunables, , Internal Probes, Top
@c %MENU% Tunable switches to alter libc internal behavior
@chapter Tunables
@cindex tunables

@dfn{Tunables} are a feature in @theglibc{} that allows application authors and
distribution maintainers to alter the runtime library behavior to match
their workload. These are implemented as a set of switches that may be
modified in different ways. The current default method to do this is via
the @env{GLIBC_TUNABLES} environment variable by setting it to a string
of colon-separated @var{name}=@var{value} pairs. For example, the following
enables @code{malloc} checking and sets the @code{malloc}
trim threshold to 128
bytes:

@example
GLIBC_TUNABLES=glibc.malloc.trim_threshold=128:glibc.malloc.check=3
export GLIBC_TUNABLES
@end example

Tunables are not part of the @glibcadj{} stable ABI, and they are
subject to change or removal across releases. Additionally, the method to
modify tunable values may change between releases and across distributions.
It is possible to implement multiple `frontends' for the tunables allowing
distributions to choose their preferred method at build time.

Finally, the set of tunables available may vary between distributions as
the tunables feature allows distributions to add their own tunables under
their own namespace.

Pass @option{--list-tunables} to the dynamic loader to print all
tunables with their minimum and maximum values:

@example
$ /lib64/ld-linux-x86-64.so.2 --list-tunables
glibc.rtld.nns: 0x4 (min: 0x1, max: 0x10)
glibc.elision.skip_lock_after_retries: 3 (min: 0, max: 2147483647)
glibc.malloc.trim_threshold: 0x0 (min: 0x0, max: 0xffffffffffffffff)
glibc.malloc.perturb: 0 (min: 0, max: 255)
glibc.cpu.x86_shared_cache_size: 0x100000 (min: 0x0, max: 0xffffffffffffffff)
glibc.pthread.rseq: 1 (min: 0, max: 1)
glibc.cpu.prefer_map_32bit_exec: 0 (min: 0, max: 1)
glibc.mem.tagging: 0 (min: 0, max: 255)
glibc.elision.tries: 3 (min: 0, max: 2147483647)
glibc.elision.enable: 0 (min: 0, max: 1)
glibc.malloc.hugetlb: 0x0 (min: 0x0, max: 0xffffffffffffffff)
glibc.cpu.x86_rep_movsb_threshold: 0x2000 (min: 0x100, max: 0xffffffffffffffff)
glibc.malloc.mxfast: 0x0 (min: 0x0, max: 0xffffffffffffffff)
glibc.rtld.dynamic_sort: 2 (min: 1, max: 2)
glibc.elision.skip_lock_busy: 3 (min: 0, max: 2147483647)
glibc.malloc.top_pad: 0x20000 (min: 0x0, max: 0xffffffffffffffff)
glibc.cpu.x86_rep_stosb_threshold: 0x800 (min: 0x1, max: 0xffffffffffffffff)
glibc.cpu.x86_non_temporal_threshold: 0xc0000 (min: 0x4040, max: 0xfffffffffffffff)
glibc.cpu.x86_memset_non_temporal_threshold: 0xc0000 (min: 0x4040, max: 0xfffffffffffffff)
glibc.cpu.x86_shstk:
glibc.pthread.stack_cache_size: 0x2800000 (min: 0x0, max: 0xffffffffffffffff)
glibc.cpu.hwcap_mask: 0x6 (min: 0x0, max: 0xffffffffffffffff)
glibc.malloc.mmap_max: 0 (min: 0, max: 2147483647)
glibc.elision.skip_trylock_internal_abort: 3 (min: 0, max: 2147483647)
glibc.cpu.plt_rewrite: 0 (min: 0, max: 2)
glibc.malloc.tcache_unsorted_limit: 0x0 (min: 0x0, max: 0xffffffffffffffff)
glibc.cpu.x86_ibt:
glibc.cpu.hwcaps:
glibc.elision.skip_lock_internal_abort: 3 (min: 0, max: 2147483647)
glibc.malloc.arena_max: 0x0 (min: 0x1, max: 0xffffffffffffffff)
glibc.malloc.mmap_threshold: 0x0 (min: 0x0, max: 0xffffffffffffffff)
glibc.cpu.x86_data_cache_size: 0x8000 (min: 0x0, max: 0xffffffffffffffff)
glibc.malloc.tcache_count: 0x0 (min: 0x0, max: 0xffffffffffffffff)
glibc.malloc.arena_test: 0x0 (min: 0x1, max: 0xffffffffffffffff)
glibc.pthread.mutex_spin_count: 100 (min: 0, max: 32767)
glibc.rtld.optional_static_tls: 0x200 (min: 0x0, max: 0xffffffffffffffff)
glibc.malloc.tcache_max: 0x0 (min: 0x0, max: 0xffffffffffffffff)
glibc.malloc.check: 0 (min: 0, max: 3)
@end example
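
Because the full list is long, it can help to filter it. The following is
a small sketch, assuming the same loader path as in the example above and
that @command{grep} is available:

@example
# Show only the tunables in the glibc.malloc namespace.
/lib64/ld-linux-x86-64.so.2 --list-tunables | grep '^glibc\.malloc\.'
@end example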

@menu
* Tunable names::  The structure of a tunable name
* Memory Allocation Tunables::  Tunables in the memory allocation subsystem
* Dynamic Linking Tunables::  Tunables in the dynamic linking subsystem
* Elision Tunables::  Tunables in elision subsystem
* POSIX Thread Tunables::  Tunables in the POSIX thread subsystem
* Hardware Capability Tunables::  Tunables that modify the hardware
                                  capabilities seen by @theglibc{}
* Memory Related Tunables::  Tunables that control the use of memory by
                             @theglibc{}.
* gmon Tunables::  Tunables that control the gmon profiler, used in
                   conjunction with gprof

@end menu

@node Tunable names
@section Tunable names
@cindex Tunable names
@cindex Tunable namespaces

A tunable name is split into three components: a top namespace, a tunable
namespace and the tunable name. The top namespace for tunables implemented in
@theglibc{} is @code{glibc}. Distributions that choose to add custom tunables
in their maintained versions of @theglibc{} may choose to do so under their own
top namespace.

The tunable namespace is a logical grouping of tunables in a single
module. This currently holds no special significance, although that may
change in the future.

The tunable name is the actual name of the tunable. It is possible that
different tunable namespaces may have tunables within them that have the
same name, likewise for top namespaces. Hence, we only support
identification of tunables by their full name, i.e. with the top
namespace, tunable namespace and tunable name, separated by periods.
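
To illustrate the three components, the following sketch splits a full
tunable name at the periods using plain POSIX shell. The variable names
@code{full} and @code{old_ifs} are chosen here for illustration only.

@example
# Split glibc.malloc.check into its three components.
full=glibc.malloc.check
old_ifs=$IFS
IFS=.
set -- $full
IFS=$old_ifs
echo "top namespace: $1"
echo "tunable namespace: $2"
echo "tunable name: $3"
@end example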

@node Memory Allocation Tunables
@section Memory Allocation Tunables
@cindex memory allocation tunables
@cindex malloc tunables
@cindex tunables, malloc

@deftp {Tunable namespace} glibc.malloc
Memory allocation behavior can be modified by setting any of the
following tunables in the @code{malloc} namespace:
@end deftp

@deftp Tunable glibc.malloc.check
This tunable supersedes the @env{MALLOC_CHECK_} environment variable and is
identical in features. This tunable has no effect by default and needs the
debug library @file{libc_malloc_debug} to be preloaded using the
@code{LD_PRELOAD} environment variable.

Setting this tunable to a non-zero value less than 4 enables a special (less
efficient) memory allocator for the @code{malloc} family of functions that is
designed to be tolerant against simple errors such as double calls of
@code{free} with the same argument, or overruns of a single byte (off-by-one
bugs). Not all such errors can be protected against, however, and memory
leaks can result. Any detected heap corruption results in immediate
termination of the process.

Like @env{MALLOC_CHECK_}, @code{glibc.malloc.check} has a problem in that it
diverges from normal program behavior by writing to @code{stderr}, which could
be exploited in SUID and SGID binaries. Therefore, @code{glibc.malloc.check}
is disabled by default for SUID and SGID binaries.
@end deftp
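
As a rough sketch of how @code{glibc.malloc.check} might be enabled for a
single run: the shared object name @file{libc_malloc_debug.so.0} and the
program name @file{./app} are assumptions made for this example and may
differ on a given system.

@example
# Assumed file name of the debug library; ./app is a placeholder program.
LD_PRELOAD=libc_malloc_debug.so.0 \
GLIBC_TUNABLES=glibc.malloc.check=3 ./app
@end example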

@deftp Tunable glibc.malloc.top_pad
This tunable supersedes the @env{MALLOC_TOP_PAD_} environment variable and is
identical in features.

This tunable determines the amount of extra memory in bytes to obtain from the
system when any of the arenas need to be extended. It also specifies the
number of bytes to retain when shrinking any of the arenas. This provides the
necessary hysteresis in heap size such that an excessive number of system calls
can be avoided.

The default value of this tunable is @samp{131072} (128 KB).
@end deftp

@deftp Tunable glibc.malloc.perturb
This tunable supersedes the @env{MALLOC_PERTURB_} environment variable and is
identical in features.

If set to a non-zero value, memory blocks are initialized with values depending
on some low order bits of this tunable when they are allocated (except when
allocated by @code{calloc}) and freed. This can be used to debug the use of
uninitialized or freed heap memory. Note that this option does not guarantee
that the freed block will have any specific values. It only guarantees that the
content the block had before it was freed will be overwritten.

The default value of this tunable is @samp{0}.
@end deftp

@deftp Tunable glibc.malloc.mmap_threshold
This tunable supersedes the @env{MALLOC_MMAP_THRESHOLD_} environment variable
and is identical in features.

When this tunable is set, all chunks larger than this value in bytes are
allocated outside the normal heap, using the @code{mmap} system call. This way
it is guaranteed that the memory for these chunks can be returned to the system
on @code{free}. Note that requests smaller than this threshold might still be
allocated via @code{mmap}.

If this tunable is not set, the default value is set to @samp{131072} bytes and
the threshold is adjusted dynamically to suit the allocation patterns of the
program. If the tunable is set, the dynamic adjustment is disabled and the
value is set as static.
@end deftp

@deftp Tunable glibc.malloc.trim_threshold
This tunable supersedes the @env{MALLOC_TRIM_THRESHOLD_} environment variable
and is identical in features.

The value of this tunable is the minimum size (in bytes) of the top-most,
releasable chunk in an arena that will trigger a system call in order to return
memory to the system from that arena.

If this tunable is not set, the default value is set as 128 KB and the
threshold is adjusted dynamically to suit the allocation patterns of the
program. If the tunable is set, the dynamic adjustment is disabled and the
value is set as static.
@end deftp
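
For instance, a program that should not depend on the dynamic adjustment
could pin both @code{glibc.malloc.mmap_threshold} and
@code{glibc.malloc.trim_threshold}. This is only a sketch: the values
(1 MiB each) are arbitrary and @file{./app} is a placeholder program.

@example
# Both thresholds become static values; 1048576 bytes is an arbitrary choice.
GLIBC_TUNABLES=glibc.malloc.mmap_threshold=1048576:glibc.malloc.trim_threshold=1048576 ./app
@end example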

@deftp Tunable glibc.malloc.mmap_max
This tunable supersedes the @env{MALLOC_MMAP_MAX_} environment variable and is
identical in features.

The value of this tunable is the maximum number of chunks to allocate with
@code{mmap}. Setting this to zero disables all use of @code{mmap}.

The default value of this tunable is @samp{65536}.
@end deftp

@deftp Tunable glibc.malloc.arena_test
This tunable supersedes the @env{MALLOC_ARENA_TEST} environment variable and is
identical in features.

The @code{glibc.malloc.arena_test} tunable specifies the number of arenas that
can be created before the test on the limit to the number of arenas is
conducted. The value is ignored if @code{glibc.malloc.arena_max} is set.

The default value of this tunable is 2 for 32-bit systems and 8 for 64-bit
systems.
@end deftp

@deftp Tunable glibc.malloc.arena_max
This tunable supersedes the @env{MALLOC_ARENA_MAX} environment variable and is
identical in features.

This tunable sets the number of arenas to use in a process regardless of the
number of cores in the system.

The default value of this tunable is @code{0}, meaning that the limit on the
number of arenas is determined by the number of CPU cores online. For 32-bit
systems the limit is twice the number of cores online and on 64-bit systems, it
is 8 times the number of cores online.
@end deftp
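
As an illustration, a heavily threaded program running on a machine with
many cores might cap @code{glibc.malloc.arena_max} to reduce memory
use. The value @samp{2} and the program name @file{./app} are
placeholders, not recommendations.

@example
# Use at most two arenas, whatever the number of online cores.
GLIBC_TUNABLES=glibc.malloc.arena_max=2 ./app
@end example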

@deftp Tunable glibc.malloc.tcache_max
The maximum size of a request (in bytes) which may be met via the
per-thread cache. The default (and maximum) value is 1032 bytes on
64-bit systems and 516 bytes on 32-bit systems.
@end deftp

@deftp Tunable glibc.malloc.tcache_count
The maximum number of chunks of each size to cache. The default is 7.
The upper limit is 65535. If set to zero, the per-thread cache is effectively
disabled.

The approximate maximum overhead of the per-thread cache is thus equal
to the number of bins times the chunk count in each bin times the size
of each chunk. With defaults, the maximum overhead of the
per-thread cache is approximately 236 KB on 64-bit systems and 118 KB
on 32-bit systems.
@end deftp
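
The 236 KB figure for @code{glibc.malloc.tcache_count} can be reproduced
with a little shell arithmetic. The sketch below relies on assumptions
that are not stated above: 64 bins per thread, a smallest bin of 24-byte
chunks, and each following bin being 16 bytes larger, which ends at the
1032-byte maximum on 64-bit systems.

@example
# Assumed layout: 64 bins of sizes 24, 40, ..., 1032 bytes, 7 chunks each.
total=0
i=0
while [ "$i" -lt 64 ]; do
  total=$((total + 7 * (24 + 16 * i)))
  i=$((i + 1))
done
echo "$total bytes"
@end example

Under these assumptions the sum is 236544 bytes, which agrees with the
figure quoted above when 1 KB is taken as 1000 bytes.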

@deftp Tunable glibc.malloc.tcache_unsorted_limit
When the user requests memory and the request cannot be met via the
per-thread cache, the arenas are used to meet the request. At this
time, additional chunks will be moved from existing arena lists to
pre-fill the corresponding cache. While copies from the fastbins,
smallbins, and regular bins are bounded and predictable due to the bin
sizes, copies from the unsorted bin are not bounded, and incur
additional time penalties as they need to be sorted as they're
scanned. To make scanning the unsorted list more predictable and
bounded, the user may set this tunable to limit the number of chunks
that are scanned from the unsorted list while searching for chunks to
pre-fill the per-thread cache with. The default, or when set to zero,
is no limit.
@end deftp

@deftp Tunable glibc.malloc.mxfast
One of the optimizations @code{malloc} uses is to maintain a series of ``fast
bins'' that hold chunks up to a specific size. The default and
maximum size which may be held this way is 80 bytes on 32-bit systems
or 160 bytes on 64-bit systems. Applications which value size over
speed may choose to reduce the size of requests which are serviced
from fast bins with this tunable. Note that the value specified
includes @code{malloc}'s internal overhead, which is normally the size of one
pointer, so add 4 on 32-bit systems or 8 on 64-bit systems to the size
passed to @code{malloc} for the largest bin size to enable.
@end deftp
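
As a worked example of the rule for @code{glibc.malloc.mxfast}: to let
requests of up to 64 bytes be served from fast bins on a 64-bit system,
add the 8 bytes of overhead to get 72. The request size of 64 bytes and
the program name @file{./app} are chosen only for illustration.

@example
# 64-byte requests plus 8 bytes of malloc overhead on a 64-bit system.
GLIBC_TUNABLES=glibc.malloc.mxfast=72 ./app
@end example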

@deftp Tunable glibc.malloc.hugetlb
This tunable controls the usage of Huge Pages on @code{malloc} calls. The
default value is @code{0}, which disables any additional support on
@code{malloc}.

Setting its value to @code{1} enables the use of @code{madvise} with
@code{MADV_HUGEPAGE} after memory allocation with @code{mmap}. It is enabled
only if the system supports Transparent Huge Page (currently only on Linux).

Setting its value to @code{2} enables the use of Huge Page directly with
@code{mmap} with the use of the @code{MAP_HUGETLB} flag. The huge page size
to use will be the default one provided by the system. A value larger than
@code{2} specifies the huge page size, which will be matched against the
sizes supported by the system. If the provided value is invalid,
@code{MAP_HUGETLB} will not be used.
@end deftp
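
The three modes of @code{glibc.malloc.hugetlb} might be selected as
sketched below. The explicit size of 2 MiB is an assumption about what
the system supports, and @file{./app} is a placeholder program.

@example
# Transparent Huge Pages through madvise:
GLIBC_TUNABLES=glibc.malloc.hugetlb=1 ./app
# MAP_HUGETLB with the system default huge page size:
GLIBC_TUNABLES=glibc.malloc.hugetlb=2 ./app
# MAP_HUGETLB with an explicit size (2 MiB, assumed to be supported):
GLIBC_TUNABLES=glibc.malloc.hugetlb=2097152 ./app
@end example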

@node Dynamic Linking Tunables
@section Dynamic Linking Tunables
@cindex dynamic linking tunables
@cindex rtld tunables

@deftp {Tunable namespace} glibc.rtld
Dynamic linker behavior can be modified by setting the
following tunables in the @code{rtld} namespace:
@end deftp

@deftp Tunable glibc.rtld.nns
Sets the number of supported dynamic link namespaces (see @code{dlmopen}).
Currently this limit can be set between 1 and 16 inclusive; the default is 4.
Each link namespace consumes some memory in all threads, and thus raising the
limit will increase the amount of memory each thread uses. Raising the limit
is useful when your application uses more than 4 dynamic link namespaces as
created by @code{dlmopen} with an lmid argument of @code{LM_ID_NEWLM}.
Dynamic linker audit modules are loaded in their own dynamic link namespaces,
but they are not accounted for in @code{glibc.rtld.nns}. They implicitly
increase the per-thread memory usage as necessary, so this tunable does
not need to be changed to allow many audit modules e.g. via @env{LD_AUDIT}.
@end deftp
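
For example, an application that is assumed to open up to eight
namespaces with @code{dlmopen} could raise @code{glibc.rtld.nns}
accordingly. The value and the program name @file{./app} are
placeholders.

@example
# Allow up to 8 dynamic link namespaces (the permitted range is 1 to 16).
GLIBC_TUNABLES=glibc.rtld.nns=8 ./app
@end example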

@deftp Tunable glibc.rtld.optional_static_tls
Sets the amount of surplus static TLS in bytes to allocate at program
startup. Every thread created allocates this amount of specified surplus
static TLS. This is a minimum value and additional space may be allocated
for internal purposes including alignment. Optional static TLS is used for
optimizing dynamic TLS access for platforms that support such optimizations
e.g. TLS descriptors or optimized TLS access for POWER (@code{DT_PPC64_OPT}
and @code{DT_PPC_OPT}). In order to make the best use of such optimizations
the value should be as many bytes as would be required to hold all TLS
variables in all dynamically loaded shared libraries. The value cannot be known
by the dynamic loader because it doesn't know the expected set of shared
libraries which will be loaded. The existing static TLS space cannot be
changed once allocated at process startup. The default allocation of
optional static TLS is 512 bytes and is allocated in every thread.
@end deftp

@deftp Tunable glibc.rtld.dynamic_sort
Sets the algorithm to use for DSO sorting, valid values are @samp{1} and
@samp{2}. For value of @samp{1}, an older O(n^3) algorithm is used, which is
long time tested, but may have performance issues when dependencies between
shared objects contain cycles due to circular dependencies. When set to the
value of @samp{2}, a different algorithm is used, which implements a
topological sort through depth-first search, and does not exhibit the
performance issues of @samp{1}.

The default value of this tunable is @samp{2}.
@end deftp

@deftp Tunable glibc.rtld.enable_secure
Used to run a program as if it were a setuid process. The only valid value
is @samp{1} as this tunable can only be used to set and not unset
@code{enable_secure}. Setting this tunable to @samp{1} also disables all other
tunables. This tunable is intended to facilitate more extensive verification
tests for @code{AT_SECURE} programs and not meant to be a security feature.

The default value of this tunable is @samp{0}.
@end deftp

@node Elision Tunables
@section Elision Tunables
@cindex elision tunables
@cindex tunables, elision

@deftp {Tunable namespace} glibc.elision
Contended locks are usually slow and can lead to performance and scalability
issues in multithreaded code. Lock elision uses memory transactions, under
certain conditions, to elide locks and improve performance.
Elision behavior can be modified by setting the following tunables in
the @code{elision} namespace:
@end deftp

@deftp Tunable glibc.elision.enable
The @code{glibc.elision.enable} tunable enables lock elision if the feature is
supported by the hardware. If elision is not supported by the hardware this
tunable has no effect.

Elision tunables are supported for 64-bit Intel, IBM POWER, and z System
architectures.
@end deftp

@deftp Tunable glibc.elision.skip_lock_busy
The @code{glibc.elision.skip_lock_busy} tunable sets how many times to use a
non-transactional lock after a transactional failure has occurred because the
lock is already acquired. Expressed in number of lock acquisition attempts.

The default value of this tunable is @samp{3}.
@end deftp

@deftp Tunable glibc.elision.skip_lock_internal_abort
The @code{glibc.elision.skip_lock_internal_abort} tunable sets how many times
the thread should avoid using elision if a transaction aborted for any reason
other than a different thread's memory accesses. Expressed in number of lock
acquisition attempts.

The default value of this tunable is @samp{3}.
@end deftp

@deftp Tunable glibc.elision.skip_lock_after_retries
The @code{glibc.elision.skip_lock_after_retries} tunable sets how many times
to try to elide a lock with transactions, that only failed due to a different
thread's memory accesses, before falling back to regular lock.
Expressed in number of lock elision attempts.

This tunable is supported only on IBM POWER, and z System architectures.

The default value of this tunable is @samp{3}.
@end deftp

@deftp Tunable glibc.elision.tries
The @code{glibc.elision.tries} tunable sets how many times to retry elision if
there is a chance for the transaction to finish execution, e.g., it wasn't
aborted due to the lock being already acquired. If elision is not supported
by the hardware this tunable is set to @samp{0} to avoid retries.

The default value of this tunable is @samp{3}.
@end deftp

@deftp Tunable glibc.elision.skip_trylock_internal_abort
The @code{glibc.elision.skip_trylock_internal_abort} tunable sets how many
times the thread should avoid trying the lock if a transaction aborted due to
reasons other than a different thread's memory accesses. Expressed in number
of try lock attempts.

The default value of this tunable is @samp{3}.
@end deftp
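
Several elision tunables can be combined in one setting. The following
sketch enables elision and raises the retry count; the value @samp{5}
and the program name @file{./app} are arbitrary, and the setting only
has an effect on hardware that supports elision.

@example
# Enable lock elision and retry a transaction up to 5 times.
GLIBC_TUNABLES=glibc.elision.enable=1:glibc.elision.tries=5 ./app
@end example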

@node POSIX Thread Tunables
@section POSIX Thread Tunables
@cindex pthread mutex tunables
@cindex thread mutex tunables
@cindex mutex tunables
@cindex tunables thread mutex

@deftp {Tunable namespace} glibc.pthread
The behavior of POSIX threads can be tuned to gain performance improvements
according to specific hardware capabilities and workload characteristics by
setting the following tunables in the @code{pthread} namespace:
@end deftp

@deftp Tunable glibc.pthread.mutex_spin_count
The @code{glibc.pthread.mutex_spin_count} tunable sets the maximum number of times
a thread should spin on the lock before calling into the kernel to block.
Adaptive spin is used for mutexes initialized with the
@code{PTHREAD_MUTEX_ADAPTIVE_NP} GNU extension. It affects both
@code{pthread_mutex_lock} and @code{pthread_mutex_timedlock}.

The thread spins until either the maximum spin count is reached or the lock
is acquired.

The default value of this tunable is @samp{100}.
@end deftp

@deftp Tunable glibc.pthread.stack_cache_size
This tunable configures the maximum size of the stack cache. Once the
stack cache exceeds this size, unused thread stacks are returned to
the kernel, to bring the cache size below this limit.

The value is measured in bytes. The default is @samp{41943040}
(forty mebibytes).
@end deftp

@deftp Tunable glibc.pthread.rseq
The @code{glibc.pthread.rseq} tunable can be set to @samp{0}, to disable
restartable sequences support in @theglibc{}. This enables applications
to perform direct restartable sequence registration with the kernel.
The default is @samp{1}, which means that @theglibc{} performs
registration on behalf of the application.

Restartable sequences are a Linux-specific extension.
@end deftp
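
A program that registers restartable sequences itself would be started
with @code{glibc.pthread.rseq} set to @samp{0}, for example as sketched
below. The program name @file{./app} is a placeholder.

@example
# Leave rseq registration to the application.
GLIBC_TUNABLES=glibc.pthread.rseq=0 ./app
@end example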

@deftp Tunable glibc.pthread.stack_hugetlb
This tunable controls whether to use Huge Pages in the stacks created by
@code{pthread_create}. This tunable only affects the stacks created by
@theglibc{}; it has no effect on stacks assigned with
@code{pthread_attr_setstack}.

The default is @samp{1} where the system default value is used. Setting
its value to @code{0} enables the use of @code{madvise} with
@code{MADV_NOHUGEPAGE} after stack creation with @code{mmap}.

This is a memory utilization optimization, since internal glibc setup of either
the thread descriptor or the guard page might force the kernel to move the
thread stack originally backed by Huge Pages to default pages.
@end deftp

@node Hardware Capability Tunables
@section Hardware Capability Tunables
@cindex hardware capability tunables
@cindex hwcap tunables
@cindex tunables, hwcap
@cindex hwcaps tunables
@cindex tunables, hwcaps
@cindex data_cache_size tunables
@cindex tunables, data_cache_size
@cindex shared_cache_size tunables
@cindex tunables, shared_cache_size
@cindex non_temporal_threshold tunables
@cindex memset_non_temporal_threshold tunables
@cindex tunables, non_temporal_threshold, memset_non_temporal_threshold

@deftp {Tunable namespace} glibc.cpu
Behavior of @theglibc{} can be tuned to assume specific hardware capabilities
by setting the following tunables in the @code{cpu} namespace:
@end deftp

@deftp Tunable glibc.cpu.hwcap_mask
This tunable supersedes the @env{LD_HWCAP_MASK} environment variable and is
identical in features.

The @code{AT_HWCAP} key in the Auxiliary Vector specifies instruction set
extensions available in the processor at runtime for some architectures. The
@code{glibc.cpu.hwcap_mask} tunable allows the user to mask out those
capabilities at runtime, thus disabling use of those extensions.
@end deftp

@deftp Tunable glibc.cpu.hwcaps
The @code{glibc.cpu.hwcaps=-xxx,yyy,-zzz...} tunable allows the user to
enable CPU/ARCH feature @code{yyy}, disable CPU/ARCH feature @code{xxx}
and @code{zzz} where the feature name is case-sensitive and has to match
the ones in @code{sysdeps/x86/include/cpu-features.h}.

On s390x, the supported HWCAP and STFLE features can be found in
@code{sysdeps/s390/cpu-features.c}. In addition the user can also set
a CPU arch-level like @code{z13} instead of single HWCAP and STFLE features.

On powerpc, the supported HWCAP and HWCAP2 features can be found in
@code{sysdeps/powerpc/dl-procinfo.c}.

On loongarch, the supported HWCAP features can be found in
@code{sysdeps/loongarch/cpu-tunables.c}.

This tunable is specific to i386, x86-64, s390x, powerpc and loongarch.
@end deftp
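
A possible use of @code{glibc.cpu.hwcaps} on x86-64 is to disable
individual features while testing fallback code paths. In the sketch
below, @code{AVX2} and @code{ERMS} are assumed feature names that should
be checked against @code{sysdeps/x86/include/cpu-features.h} for the
version in use; @file{./app} is a placeholder program.

@example
# Disable two (assumed) x86 feature names for this run only.
GLIBC_TUNABLES=glibc.cpu.hwcaps=-AVX2,-ERMS ./app
@end example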

@deftp Tunable glibc.cpu.cached_memopt
The @code{glibc.cpu.cached_memopt=[0|1]} tunable allows the user to
enable optimizations recommended for cacheable memory. If set to
@code{1}, @theglibc{} assumes that the process memory image consists
of cacheable (non-device) memory only. The default, @code{0},
indicates that the process may use device memory.

This tunable is specific to powerpc, powerpc64 and powerpc64le.
@end deftp

@deftp Tunable glibc.cpu.name
The @code{glibc.cpu.name=xxx} tunable allows the user to tell @theglibc{} to
assume that the CPU is @code{xxx} where xxx may have one of these values:
@code{generic}, @code{thunderxt88}, @code{thunderx2t99},
@code{thunderx2t99p1}, @code{ares}, @code{emag}, @code{kunpeng},
@code{a64fx}.

This tunable is specific to aarch64.
@end deftp

@deftp Tunable glibc.cpu.x86_data_cache_size
The @code{glibc.cpu.x86_data_cache_size} tunable allows the user to set
data cache size in bytes for use in memory and string routines.

This tunable is specific to i386 and x86-64.
@end deftp

@deftp Tunable glibc.cpu.x86_shared_cache_size
The @code{glibc.cpu.x86_shared_cache_size} tunable allows the user to
set shared cache size in bytes for use in memory and string routines.
@end deftp

@deftp Tunable glibc.cpu.x86_non_temporal_threshold
The @code{glibc.cpu.x86_non_temporal_threshold} tunable allows the user
to set the threshold in bytes for non-temporal stores. Non-temporal stores
give a hint to the hardware to move data directly to memory without
displacing other data from the cache. This tunable is used by some
platforms to determine when to use non-temporal stores in operations
like @code{memmove} and @code{memcpy}.

This tunable is specific to i386 and x86-64.
@end deftp

@deftp Tunable glibc.cpu.x86_memset_non_temporal_threshold
The @code{glibc.cpu.x86_memset_non_temporal_threshold} tunable allows
the user to set the threshold in bytes for non-temporal stores in
@code{memset}. Non-temporal stores give a hint to the hardware to move data
directly to memory without displacing other data from the cache. This
tunable is used by some platforms to determine when to use
non-temporal stores in @code{memset}.

This tunable is specific to i386 and x86-64.
@end deftp
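
The two non-temporal thresholds can be set together. In this sketch the
value of 1 MiB is arbitrary, chosen only to lie above the minimum of
@code{0x4040} shown in the @option{--list-tunables} output earlier, and
@file{./app} is a placeholder program.

@example
# Use non-temporal stores only for operations of 1 MiB or more.
GLIBC_TUNABLES=glibc.cpu.x86_non_temporal_threshold=1048576:glibc.cpu.x86_memset_non_temporal_threshold=1048576 ./app
@end example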

@deftp Tunable glibc.cpu.x86_rep_movsb_threshold
The @code{glibc.cpu.x86_rep_movsb_threshold} tunable allows the user to
set threshold in bytes to start using "rep movsb". The value must be
greater than zero, and currently defaults to 2048 bytes.

This tunable is specific to i386 and x86-64.
@end deftp

@deftp Tunable glibc.cpu.x86_rep_stosb_threshold
The @code{glibc.cpu.x86_rep_stosb_threshold} tunable allows the user to
set threshold in bytes to start using "rep stosb". The value must be
greater than zero, and currently defaults to 2048 bytes.

This tunable is specific to i386 and x86-64.
@end deftp

@deftp Tunable glibc.cpu.x86_ibt
The @code{glibc.cpu.x86_ibt} tunable allows the user to control how
indirect branch tracking (IBT) should be enabled. Accepted values are
@code{on}, @code{off}, and @code{permissive}. @code{on} always turns
on IBT regardless of whether IBT is enabled in the executable and its
dependent shared libraries. @code{off} always turns off IBT regardless
of whether IBT is enabled in the executable and its dependent shared
libraries. @code{permissive} is the same as the default which disables
IBT on non-CET executables and shared libraries.

This tunable is specific to i386 and x86-64.
@end deftp

@deftp Tunable glibc.cpu.x86_shstk
The @code{glibc.cpu.x86_shstk} tunable allows the user to control how
the shadow stack (SHSTK) should be enabled. Accepted values are
@code{on}, @code{off}, and @code{permissive}. @code{on} always turns on
SHSTK regardless of whether SHSTK is enabled in the executable and its
dependent shared libraries. @code{off} always turns off SHSTK regardless
of whether SHSTK is enabled in the executable and its dependent shared
libraries. @code{permissive} changes how dlopen works on non-CET shared
libraries. By default, when SHSTK is enabled, dlopening a non-CET shared
library returns an error. With @code{permissive}, it turns off SHSTK
instead.

This tunable is specific to i386 and x86-64.
@end deftp
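
For a program that loads plugins which may not be CET-enabled, the
@code{permissive} setting of @code{glibc.cpu.x86_shstk} could be
selected as sketched below. The program name @file{./app} is a
placeholder.

@example
# Turn off SHSTK instead of failing when a non-CET library is dlopened.
GLIBC_TUNABLES=glibc.cpu.x86_shstk=permissive ./app
@end example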

@deftp Tunable glibc.cpu.prefer_map_32bit_exec
When this tunable is set to @code{1}, shared libraries of non-setuid
programs will be loaded below 2GB with @code{MAP_32BIT}.

Note that the @env{LD_PREFER_MAP_32BIT_EXEC} environment variable is an
alias of this tunable.

This tunable is specific to 64-bit x86-64.
@end deftp

@deftp Tunable glibc.cpu.plt_rewrite
When this tunable is set to @code{1}, the dynamic linker will rewrite
the PLT section with 32-bit direct jump. When it is set to @code{2},
the dynamic linker will rewrite the PLT section with 32-bit direct
jump and on APX processors with 64-bit absolute jump.

This tunable is specific to x86-64 and effective only when the lazy
binding is disabled.
@end deftp

@node Memory Related Tunables
@section Memory Related Tunables
@cindex memory related tunables

@deftp {Tunable namespace} glibc.mem
This tunable namespace supports operations that affect the way @theglibc{}
and the process manage memory.
@end deftp
663
@deftp Tunable glibc.mem.tagging
If the hardware supports memory tagging, this tunable can be used to
control the way @theglibc{} uses this feature. At present this is only
supported on AArch64 systems with the MTE extension; it is ignored for
all other systems.

This tunable takes a value between 0 and 255 and acts as a bitmask
that enables various capabilities.

Bit 0 (the least significant bit) causes the @code{malloc} subsystem
to allocate tagged memory, with each allocation being assigned a
random tag.

Bit 1 enables precise faulting mode for tag violations on systems that
support deferred tag violation reporting. This may cause programs
to run more slowly.

Bit 2 enables either precise or deferred faulting mode for tag
violations, whichever is preferred by the system.

Other bits are currently reserved.

@Theglibc{} startup code will automatically enable memory tagging
support in the kernel if this tunable has any non-zero value.

The default value is @samp{0}, which disables all memory tagging.
@end deftp

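Since the value of @code{glibc.mem.tagging} is a bitmask, the bits
described above are combined by adding their weights. As a sketch,
assuming an AArch64 system with MTE, the value 3 sets bit 0 (weight 1)
and bit 1 (weight 2), which requests tagged @code{malloc} allocations
together with precise faulting mode:

@example
GLIBC_TUNABLES=glibc.mem.tagging=3
export GLIBC_TUNABLES
@end example
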
@deftp Tunable glibc.mem.decorate_maps
If the kernel supports naming anonymous virtual memory areas (since
Linux version 5.17, although not enabled in all kernel
configurations), this tunable can be used to control whether
@theglibc{} decorates the underlying memory obtained from the operating
system with a string describing its usage (for instance, on the thread
stack created by @code{pthread_create} or memory allocated by
@code{malloc}).

The process mappings can be obtained by reading
@code{/proc/<pid>/maps} (with @code{pid} being either the
@dfn{process ID} or @code{self} for the process's own mappings).

This tunable takes a value of 0 or 1, where 1 enables the feature.
The default value is @samp{0}, which disables the decoration.
@end deftp

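As a rough illustration, the command below enables the decoration for
a single @command{cat} process and has that process print its own
mappings. On a kernel without support for naming anonymous memory
areas the listing is expected to look the same as without the tunable,
and the exact strings used for the decoration are not specified here:

@example
$ GLIBC_TUNABLES=glibc.mem.decorate_maps=1 cat /proc/self/maps
@end example
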
@node gmon Tunables
@section gmon Tunables
@cindex gmon tunables

@deftp {Tunable namespace} glibc.gmon
This tunable namespace affects the behaviour of the gmon profiler.
gmon is a component of @theglibc{} which is normally used in
conjunction with gprof.

When GCC compiles a program with the @code{-pg} option, it instruments
the program with calls to the @code{mcount} function, to record the
program's call graph. At program startup, a memory buffer is allocated
to store this call graph; the size of the buffer is calculated using a
heuristic based on code size. If during execution, the buffer is found
to be too small, profiling will be aborted and no @file{gmon.out} file
will be produced. In that case, you will see the following message
printed to standard error:

@example
mcount: call graph buffer size limit exceeded, gmon.out will not be generated
@end example

Most of the symbols discussed in this section are defined in the header
@code{sys/gmon.h}. However, some symbols (for example @code{mcount})
are not defined in any header file, since they are only intended to be
called from code generated by the compiler.
@end deftp

@deftp Tunable glibc.gmon.minarcs
The heuristic for sizing the call graph buffer is known to be
insufficient for small programs; hence, the calculated value is clamped
to be at least a minimum size. The default minimum (in units of
call graph entries, @code{struct tostruct}), is given by the macro
@code{MINARCS}. If you have some program with an unusually complex
call graph, for which the heuristic fails to allocate enough space,
you can use this tunable to increase the minimum to a larger value.
@end deftp

@deftp Tunable glibc.gmon.maxarcs
To prevent excessive memory consumption when profiling very large
programs, the call graph buffer is allowed to have a maximum of
@code{MAXARCS} entries. For some very large programs, the default
value of @code{MAXARCS} defined in @file{sys/gmon.h} is too small; in
that case, you can use this tunable to increase it.

Note the value of the @code{maxarcs} tunable must be greater than or
equal to that of the @code{minarcs} tunable; if this constraint is
violated, a warning will be printed to standard error at program
startup, and the @code{minarcs} value will be used as the maximum as
well.

Setting either tunable too high may result in a call graph buffer
whose size exceeds the available memory; in that case, an out of memory
error will be printed at program startup, the profiler will be
disabled, and no @file{gmon.out} file will be generated.
@end deftp
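
As a sketch of how the two gmon tunables might be used together, the
commands below build a program with profiling enabled and run it with
both limits raised, keeping the maximum above the minimum. The file
names @file{app.c} and @file{app} and the two numeric values are
arbitrary placeholders chosen for illustration, not recommended
settings:

@example
$ gcc -pg -o app app.c
$ GLIBC_TUNABLES=glibc.gmon.minarcs=100000:glibc.gmon.maxarcs=2000000 ./app
$ gprof ./app gmon.out
@end example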