loongarch: Enable THP-aligned load segments by default on 64-bit

On LoongArch64 Linux, aligning ELF load segments to Transparent Huge Page
(THP) boundaries provides consistent performance benefits for large
binaries by reducing TLB pressure and improving instruction fetch
efficiency.

Enable THP-based load segment alignment by default on LoongArch64 by
setting `glibc.elf.thp=1` during startup.  Define the default THP
page size for load segment alignment on LoongArch64 as 32MB.

This allows the dynamic loader to apply THP-friendly alignment without
requiring the `glibc.elf.thp` tunable to be explicitly set.
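
As a rough illustration of what aligning to a 32MB boundary means
arithmetically, the sketch below rounds an address up to the next
multiple of the THP size.  It is not the loader's code: the names
`THP_SIZE`, `align_up` and `is_thp_aligned` are hypothetical, and any
additional conditions the loader applies before aligning a segment are
not modeled here.

```python
# Illustrative only: power-of-two alignment arithmetic for a 32MB THP size.
THP_SIZE = 32 * 1024 * 1024  # 32MB, the LoongArch64 default named in this commit


def align_up(addr: int, align: int = THP_SIZE) -> int:
    """Round addr up to the next multiple of align (align must be a power of two)."""
    return (addr + align - 1) & ~(align - 1)


def is_thp_aligned(addr: int, align: int = THP_SIZE) -> bool:
    """Return True if addr sits exactly on an align-sized boundary."""
    return addr & (align - 1) == 0


if __name__ == "__main__":
    for addr in (0x1234, 0x2000000, 0x12345678):
        print(hex(addr), "->", hex(align_up(addr)))
```

A segment whose start address satisfies `is_thp_aligned` can be backed
by a single huge page from its first byte, which is what reduces the
number of TLB entries needed to cover the text of a large binary.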

Workload 1: building Cargo 1.93.0
Rustc: nightly-2026-02-26

                    Without patch          With patch
  instructions      3,690,358,948,176      3,690,301,774,568
  cpu-cycles        4,233,025,766,760      4,035,866,635,741
  itlb-misses           9,708,829,532          2,700,014,717
  time elapsed               302.40 s               289.68 s

Instructions remain essentially unchanged.  iTLB misses drop by about
72%, reducing CPU cycles by about 4.7% and wall time by about 4.2%.

Workload 2: building Linux kernel v7.0-rc1
LLVM: 21.1.8

                    Without patch          With patch
  instructions     14,163,739,876,387     14,169,418,598,675
  cpu-cycles       19,231,890,317,741     16,851,494,928,181
  itlb-misses          91,142,010,440             90,779,245
  time elapsed              1022.09 s               893.22 s

Instructions remain roughly the same.  iTLB misses drop from about 91B
to about 90M (roughly 99.9% reduction), reducing CPU cycles by about
12% and wall time by about 12.6%.
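
The percentages quoted for both workloads follow directly from the
counters listed above.  A small sketch to recompute them (the helper
name `pct_reduction` and the dictionary layout are illustrative; the
figures are copied from the two result tables):

```python
# Recompute the relative reductions from the before/after figures above.
def pct_reduction(before: float, after: float) -> float:
    """Relative reduction from before to after, in percent."""
    return (before - after) / before * 100.0


# (without patch, with patch) for each metric, copied from the tables.
RESULTS = {
    "cargo": {
        "cpu-cycles": (4_233_025_766_760, 4_035_866_635_741),
        "itlb-misses": (9_708_829_532, 2_700_014_717),
        "time elapsed": (302.40, 289.68),
    },
    "kernel": {
        "cpu-cycles": (19_231_890_317_741, 16_851_494_928_181),
        "itlb-misses": (91_142_010_440, 90_779_245),
        "time elapsed": (1022.09, 893.22),
    },
}

if __name__ == "__main__":
    for workload, metrics in RESULTS.items():
        for name, (before, after) in metrics.items():
            print(f"{workload:7s} {name:13s} {pct_reduction(before, after):6.2f}%")
```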

Reviewed-by: caiyinyu <caiyinyu@loongson.cn>
Signed-off-by: WANG Rui <wangrui@loongson.cn>