Alan Modra [Mon, 24 Aug 2020 07:02:57 +0000 (16:32 +0930)]
PowerPC TPREL_HA/LO optimisation
ppc64 ld optimises sequences like the following
addis 3,13,wot@tprel@ha
lwz 3,wot@tprel@l(3)
to
nop
lwz 3,wot@tprel(13)
when "wot" is located near enough to the thread pointer.
However, the ABI doesn't require that R_PPC64_TPREL16_HA always be on
an addis rt,13,imm instruction, and while ld checked for that on the
high-part instruction it didn't disable the optimisation on the
low-part instruction. This patch fixes that problem, disabling the
tprel optimisation globally if high-part instructions don't pass
sanity checks. The optimisation is also enabled for ppc32, where
before ld.bfd had the code in the wrong place and ld.gold had it in a
block only enabled for ppc64.
bfd/
* elf32-ppc.c (ppc_elf_check_relocs): Set has_tls_reloc for
high part tprel16 relocs.
(ppc_elf_tls_optimize): Sanity check high part tprel16 relocs.
Clear do_tls_opt on odd instructions.
(ppc_elf_relocate_section): Move TPREL16_HA/LO optimisation later.
Don't sanity check them here.
* elf64-ppc.c (ppc64_elf_check_relocs): Set has_tls_reloc for
high part tprel16 relocs.
(ppc64_elf_tls_optimize): Sanity check high part tprel16 relocs.
Clear do_tls_opt on odd instructions.
(ppc64_elf_relocate_section): Don't sanity check TPREL16_HA.
ld/
* testsuite/ld-powerpc/tls32.d: Update for TPREL_HA/LO optimisation.
* testsuite/ld-powerpc/tlsexe32.d: Likewise.
* testsuite/ld-powerpc/tlsldopt32.d: Likewise.
* testsuite/ld-powerpc/tlsmark32.d: Likewise.
* testsuite/ld-powerpc/tlsopt4_32.d: Likewise.
* testsuite/ld-powerpc/tprel.s,
* testsuite/ld-powerpc/tprel.d,
* testsuite/ld-powerpc/tprel32.d: New tests.
* testsuite/ld-powerpc/tprelbad.s,
* testsuite/ld-powerpc/tprelbad.d: New test.
* testsuite/ld-powerpc/powerpc.exp: Run them.
gold/
* powerpc.cc (Target_powerpc): Add tprel_opt_ and accessors.
(Target_powerpc::Scan::local): Sanity check tprel high relocs.
(Target_powerpc::Scan::global): Likewise.
(Target_powerpc::Relocate::relocate): Control tprel optimisation
with tprel_opt_ and enable for 32-bit.
Szabolcs Nagy [Wed, 29 Jul 2020 14:47:50 +0000 (15:47 +0100)]
aarch64: set sh_entsize of .plt to 0
On aarch64 the first PLT entry is 32 bytes, subsequent entries
are 16 bytes by default but can be 24 bytes with BTI or with
PAC-PLT.
sh_entsize of .plt was set to the PLT entry size, so in some
cases sh_size % sh_entsize != 0, which breaks some tools.
Note that PLT0 (and the TLSDESC stub code which is also in the
PLT) were historically not padded up to meet the sh_size
requirement, but to ensure that PLT stub code is aligned on
cache lines. Similar layout is present on other targets too
which just happens to make sh_size a multiple of sh_entsize and
it is not expected that sh_entsize of .plt is used for anything.
This patch sets sh_entsize of .plt to 0: the section does not
hold a table of fixed-size entries so other values are not
conforming in principle to the ELF spec.
bfd/ChangeLog:
Backport from mainline.
2020-07-30 Szabolcs Nagy <szabolcs.nagy@arm.com>
PR ld/26312
* elfnn-aarch64.c (elfNN_aarch64_init_small_plt0_entry): Set sh_entsize
to 0.
(elfNN_aarch64_finish_dynamic_sections): Remove sh_entsize setting.
Accept vector alignment hints on z13 although they are ignored there.
The advantage is that any binary compiled for architecture level z13 may
run on z14 or later and benefit from vector alignment hints.
Backport from mainline.
gas/ChangeLog:
2020-05-18 Stefan Schulze Frielinghaus <stefansf@linux.ibm.com>
* testsuite/gas/s390/zarch-z13.d: Add regexp checks for vector
load/store instruction variants with alignment hints.
* testsuite/gas/s390/zarch-z13.s: Emit new vector load/store
instruction variants with alignment hints.
opcodes/ChangeLog:
2020-05-18 Stefan Schulze Frielinghaus <stefansf@linux.ibm.com>
* s390-opc.txt: Relocate vector load/store instructions with
additional alignment parameter and change architecture level
constraint from z14 to z13.
Alex Coplan [Fri, 29 May 2020 15:04:50 +0000 (16:04 +0100)]
gas: Fix checking for backwards .org with negative offset
This patch fixes internal errors in (at least) arm and aarch64 GAS
when assembling code that attempts a negative .org. The bug appears
to be a regression introduced in binutils-2.29 by commit 9875b36538d.
* write.c (relax_segment): Fix handling of negative offset when
relaxing an rs_org frag.
* testsuite/gas/aarch64/org-neg.d: New test.
* testsuite/gas/aarch64/org-neg.l: Error output for test.
* testsuite/gas/aarch64/org-neg.s: Input for test.
* testsuite/gas/arm/org-neg.d: New test.
* testsuite/gas/arm/org-neg.l: Error output for test.
* testsuite/gas/arm/org-neg.s: Input for test.
Alan Modra [Fri, 15 May 2020 08:36:05 +0000 (18:06 +0930)]
Fix tight loop on recursively-defined symbols
This patch fixes a bug in GAS where the assembler enters a tight loop
when attempting to resolve recursively-defined symbols, e.g. when
trying to assemble "a=a".
This is a regression introduced between binutils 2.32 and 2.33,
by commit 1903f1385bff9
* symbols.c (struct local_symbol): Update comment.
(resolve_symbol_value): For resolved symbols equated to other
symbols, verify that the referenced symbol is not a local_symbol
before accessing sy_value. Don't leave symbol loops during
finalize_syms resolution.
* testsuite/gas/all/assign-bad-recursive.d: New test.
* testsuite/gas/all/assign-bad-recursive.l: Error output for test.
* testsuite/gas/all/assign-bad-recursive.s: Assembly for test.
* testsuite/gas/all/gas.exp: Run it.
gas: PR 25863: Fix scalar vmul inside it block when assembling for MVE
This fixes PR 25863 by fixing the condition in the parsing of vmul in
do_mve_vmull. It also simplifies the code in there fixing latent issues that
would lead to NEON code being accepted when it shouldn't.
2020-05-07 Andre Vieira <andre.simoesdiasvieira@arm.com>
Backport from mainline.
2020-05-04 Andre Vieira <andre.simoesdiasvieira@arm.com>
PR gas/25863
* config/tc-arm.c (do_mve_vmull): Fix scalar and NEON parsing of vmul.
* testsuite/gas/arm/mve-scalar-vmult-it.d: New test.
* testsuite/gas/arm/mve-scalar-vmult-it.s: New test.
Tamar Christina [Tue, 21 Apr 2020 14:16:21 +0000 (15:16 +0100)]
BFD: Exclude sections with no content from compress check.
The check in bfd_get_full_section_contents is trying to check that we don't
allocate more space for a section than the size of the section is on disk.
Previously we excluded linker created sections since they didn't have a size on
disk. However we also need to exclude sections with no content as well such as
the BSS section. Space for these would not have been allocated by the assembler
and so the check would incorrectly fail.
bfd/ChangeLog:
PR binutils/24753
* compress.c (bfd_get_full_section_contents): Exclude sections with no
content.
gas/ChangeLog:
PR binutils/24753
* testsuite/gas/arm/pr24753.d: New test.
* testsuite/gas/arm/pr24753.s: New test.