From: Willy Tarreau
Date: Wed, 12 May 2021 07:47:30 +0000 (+0200)
Subject: BUILD: makefile: add a few popular ARMv8 CPU targets
X-Git-Tag: v2.4.0~19
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=40a871f09d13dc666ebe0f97ed576ca93c5ff38c;p=thirdparty%2Fhaproxy.git

BUILD: makefile: add a few popular ARMv8 CPU targets

This adds the following CPUs to the makefile:
  - armv81    : modern ARM cores (Cortex A55/A75/A76/A78/X1, Neoverse, Graviton2)
  - a72       : ARM Cortex-A72 or A73 (e.g. RPi4, Odroid N2, VIM3, AWS Graviton)
  - a53       : ARM Cortex-A53 or any of its successors in 64-bit mode (e.g. RPi3)
  - armv8-auto: both older and newer ARMv8 cores, with a minor runtime penalty

The reasons for these ones are:
  - a53 is the common denominator of all of its successors, and does
    support CRC32 which is used by the gzip compression, that the generic
    armv8-a does not ;

  - a72 supports the same features but is an out-of-order one that deserves
    better optimizations; it's found in a number of high-performance
    multi-core CPUs mainly oriented towards I/O and network processing
    (Armada 8040, NXP LX2160A, AWS Graviton), and more recently the
    Raspberry Pi 4. The A73 found in VIM3 and Odroid-N2 can use the same
    optimizations ;

  - armv81 is for generic ARMv8.1-A and above, automatically enables LSE
    atomics which are way more scalable, and CRC32. This one covers modern
    ARMv8 cores such as Cortex A55/A75/A76/A77/A78/X1 and the Neoverse
    family such as found in AWS's Graviton2. The LSE instructions are
    essential for large numbers of cores (8 and above).

  - armv8-auto dynamically enables support for LSE extensions when
    detected while still being compatible with older cores. There is a
    small performance penalty in doing this (~3%) but a same executable
    will perform optimally on a wider range of hardware. This should be
    the best option for distros. It requires gcc-10 or gcc-9.4 and above.

When no CPU is specified, GCC version 10.2 and above will automatically
implement the wrapper used to detect the LSE extensions.
---

diff --git a/INSTALL b/INSTALL
index e5143c6b7b..e9d38f931e 100644
--- a/INSTALL
+++ b/INSTALL
@@ -285,7 +285,10 @@ systems, by passing "USE_SLZ=" to the "make" command.
 
 Please note that SLZ will benefit from some CPU-specific instructions like the
 availability of the CRC32 extension on some ARM processors. Thus it can further
-improve its performance to build with "CPU=native" on the target system.
+improve its performance to build with "CPU=native" on the target system, or
+"CPU=armv81" (modern systems such as Graviton2 or A55/A75 and beyond),
+"CPU=a72" (e.g. for RPi4, or AWS Graviton), "CPU=a53" (e.g. for RPi3), or
+"CPU=armv8-auto" (automatic detection with minor runtime penalty).
 
 A second option involves the widely known zlib library, which is very likely
 installed on your system. In order to use zlib, simply pass "USE_ZLIB=1" to the
@@ -421,6 +424,11 @@ one of the following choices to the CPU variable :
   - ultrasparc : Sun UltraSparc I/II/III/IV processor
   - power8     : IBM POWER8 processor
   - power9     : IBM POWER9 processor
+  - armv81     : modern ARM cores (Cortex A55/A75/A76/A78/X1, Neoverse, Graviton2)
+  - a72        : ARM Cortex-A72 or A73 (e.g. RPi4, Odroid N2, AWS Graviton)
+  - a53        : ARM Cortex-A53 or any of its successors in 64-bit mode (e.g. RPi3)
+  - armv8-auto : support both older and newer armv8 cores with a minor penalty,
+                 thanks to gcc 10's outline atomics (default with gcc 10.2).
   - native     : use the build machine's specific processor optimizations. Use with
                  extreme care, and never in virtualized environments (known to break).
   - generic    : any other processor or no CPU-specific optimization. (default)
diff --git a/Makefile b/Makefile
index 74fca3529d..6571dc6fc4 100644
--- a/Makefile
+++ b/Makefile
@@ -162,7 +162,8 @@ TARGET =
 #### TARGET CPU
 # Use CPU= to optimize for a particular CPU, among the following
 # list :
-#    generic, native, i586, i686, ultrasparc, power8, power9, custom
+#    generic, native, i586, i686, ultrasparc, power8, power9, custom,
+#    a53, a72, armv81, armv8-auto
 CPU = generic
 
 #### Architecture, used when not building for native architecture
@@ -274,6 +275,10 @@ CPU_CFLAGS.i686 = -O2 -march=i686
 CPU_CFLAGS.ultrasparc = -O6 -mcpu=v9 -mtune=ultrasparc
 CPU_CFLAGS.power8     = -O2 -mcpu=power8 -mtune=power8
 CPU_CFLAGS.power9     = -O2 -mcpu=power9 -mtune=power9
+CPU_CFLAGS.a53        = -O2 -mcpu=cortex-a53
+CPU_CFLAGS.a72        = -O2 -mcpu=cortex-a72
+CPU_CFLAGS.armv81     = -O2 -march=armv8.1-a
+CPU_CFLAGS.armv8-auto = -O2 -march=armv8-a+crc -moutline-atomics
 CPU_CFLAGS            = $(CPU_CFLAGS.$(CPU))
 
 #### ARCH dependent flags, may be overridden by CPU flags
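
Note: the choice between the new CPU values depends on whether the target
core has the CRC32 and LSE extensions. The sketch below is not part of the
patch; it is a minimal illustration, assuming a Linux/aarch64 target, of how
those two features can be read from the kernel's hwcaps and mapped to one of
the CPU= names above. The function names detect_features() and suggest_cpu()
are hypothetical helpers, and the mapping is a simplification of the reasons
given in the commit message (it never suggests a72 or armv8-auto, which depend
on the core model and on distribution needs rather than on feature bits).

```c
#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>   /* getauxval(), AT_HWCAP */
#include <asm/hwcap.h>  /* HWCAP_CRC32, HWCAP_ATOMICS */
#endif

/* Hypothetical helper: map detected features to a CPU= value.
 * - LSE atomics present: ARMv8.1-A or above, where CRC32 is mandatory,
 *   so "armv81" applies.
 * - CRC32 only: "a53" is the common denominator that still enables CRC32.
 * - neither: fall back to "generic".
 */
const char *suggest_cpu(int has_crc32, int has_lse)
{
	if (has_lse)
		return "armv81";
	if (has_crc32)
		return "a53";
	return "generic";
}

/* Hypothetical helper: report CRC32 and LSE support of the running CPU.
 * On anything other than Linux/aarch64 both flags are reported as 0.
 */
void detect_features(int *has_crc32, int *has_lse)
{
	*has_crc32 = 0;
	*has_lse = 0;
#if defined(__linux__) && defined(__aarch64__)
	{
		unsigned long hwcap = getauxval(AT_HWCAP);

		*has_crc32 = !!(hwcap & HWCAP_CRC32);
		*has_lse   = !!(hwcap & HWCAP_ATOMICS);
	}
#endif
}
```

A caller would run detect_features() on the target machine and pass the two
flags to suggest_cpu(); the returned string can then be given to make as
CPU=<value>. For a binary that must run on both kinds of cores, the commit
message recommends CPU=armv8-auto instead of a per-machine choice.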