Alan Modra [Fri, 9 Oct 2020 06:26:33 +0000 (16:56 +1030)]
[GOLD] Power10 segv due to wild r2
Calling non-pcrel functions from pcrel code requires a stub to set up
r2. Gold created the stub, but an "optimisation" made the stub jump
to the function local entry, i.e. r2 was not initialised.
This patch fixes that long branch stub problem, and another that might
occur for plt call stubs to local functions.
bfd/
* elf64-ppc.c (write_plt_relocs_for_local_syms): Don't do local
entry offset optimisation.
gold/
* powerpc.cc (Powerpc_relobj::do_relocate_sections): Don't do
local entry offset optimisation for lplt_section.
(Target_powerpc::Branch_info::make_stub): Don't add local
entry offset to long branch dest passed to
add_long_branch_entry. Do pass st_other bits.
(Stub_table::Branch_stub_ent): Add "other_" field.
(Stub_table::add_long_branch_entry): Add "other" param, and
save.
(Stub_table::branch_stub_size): Adjust long branch offset.
(Stub_table::do_write): Likewise.
(Target_powerpc::Relocate::relocate): Likewise.
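The local entry offset that the stub must no longer skip is encoded in three bits of st_other. The following is a minimal sketch, assuming the usual ELFv2 encoding (bits 5 to 7 of st_other) and a deliberately simplified rule for picking the branch destination. `branch_dest` and its `caller_sets_r2` parameter are hypothetical names for illustration and do not mirror gold's internals.

```python
STO_PPC64_LOCAL_BIT = 5
STO_PPC64_LOCAL_MASK = 7 << STO_PPC64_LOCAL_BIT


def local_entry_offset(st_other: int) -> int:
    """Decode the ELFv2 local entry offset (in bytes) from st_other."""
    bits = (st_other & STO_PPC64_LOCAL_MASK) >> STO_PPC64_LOCAL_BIT
    # Encodings 0 and 1 mean "no separate local entry"; 2..6 give 4..64 bytes.
    return ((1 << bits) >> 2) << 2


def branch_dest(sym_value: int, st_other: int, caller_sets_r2: bool) -> int:
    """Hypothetical helper: pick the address a call should land on.

    A caller that maintains r2 may skip the callee's global entry code.
    A pcrel caller has no valid r2, so the global entry code must run.
    """
    if caller_sets_r2:
        return sym_value + local_entry_offset(st_other)
    return sym_value
```

In this model the bug described above corresponds to adding the local entry offset even when `caller_sets_r2` is false.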
Alan Modra [Fri, 9 Oct 2020 00:29:33 +0000 (10:59 +1030)]
[GOLD] internal error in relocate, at powerpc.cc:10473
GOT relocations can refer directly to a function in a fixed position
executable, unlike ADDR64 which needs a global entry stub, or branch
relocs, which need PLT stubs.
* powerpc.cc (is_got_reloc): New function.
(Target_powerpc::Relocate::relocate): Use it here, exclude GOT
relocs when looking for stubs.
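The distinction drawn above can be pictured as a simple predicate. This is only a sketch: the relocation names are an illustrative subset, and `needs_stub_lookup` with its `refers_to_plt_function` parameter is a hypothetical helper, not the code in powerpc.cc.

```python
# Illustrative subset of PowerPC64 GOT relocation names.
GOT_RELOCS = frozenset({
    "R_PPC64_GOT16",
    "R_PPC64_GOT16_LO",
    "R_PPC64_GOT16_HI",
    "R_PPC64_GOT16_HA",
    "R_PPC64_GOT16_DS",
    "R_PPC64_GOT16_LO_DS",
    "R_PPC64_GOT_PCREL34",
})


def is_got_reloc(r_type: str) -> bool:
    """Return True if the relocation goes through a GOT entry."""
    return r_type in GOT_RELOCS


def needs_stub_lookup(r_type: str, refers_to_plt_function: bool) -> bool:
    """Hypothetical: GOT relocs are excluded when looking for stubs."""
    return refers_to_plt_function and not is_got_reloc(r_type)
```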
Alan Modra [Wed, 7 Oct 2020 23:57:43 +0000 (10:27 +1030)]
[GOLD] Increase --split-stack-adjust-size
For functions with small (< 256 bytes) stack frames, the current x86
do_calls_non_split ignores --split-stack-adjust-size and, in
combination with __morestack_non_split, supplies a non-split-stack
function with at least 0x100000 (1M) available stack. On powerpc64, a
default of 0x4000 is not large enough to reliably work with the golang
testsuite. This increases the default size to the de facto x86 value.
* options.h (split_stack_adjust_size): Default to 0x100000.
H.J. Lu [Sat, 3 Oct 2020 11:23:55 +0000 (04:23 -0700)]
x86: Update register operand check for AddrPrefixOpReg
When the address size prefix applies to both the memory and the register
operand, we need to extract the address size prefix from the register
operand if the memory operand has no real registers, like symbol, DISP
or symbol(%rip).
NB: GCC always generates symbol(%rip) for RIP-relative addressing for
both x32 and x86-64.
Move the .code16 tests in movdir.s to movdir-16bit to show the correct
output from objdump.
gas/
PR gas/26685
* config/tc-i386.c (process_suffix): Also check the register
operand for the address size prefix if the memory operand has
no real registers.
* testsuite/gas/i386/enqcmd-16bit.d: New file.
* testsuite/gas/i386/enqcmd-16bit.s: Likewise.
* testsuite/gas/i386/movdir-16bit.d: Likewise.
* testsuite/gas/i386/movdir-16bit.s: Likewise.
* testsuite/gas/i386/enqcmd.s: Add tests with symbol and DISP.
* testsuite/gas/i386/x86-64-enqcmd.s: Likewise.
* testsuite/gas/i386/x86-64-movdir.s: Likewise.
* testsuite/gas/i386/movdir.s: Add tests with symbol and DISP.
Remove the .code16 test.
* testsuite/gas/i386/i386.exp: Run movdir-16bit and enqcmd-16bit.
* testsuite/gas/i386/x86-64-enqcmd-intel.d: Updated.
* testsuite/gas/i386/x86-64-enqcmd.d: Likewise.
* testsuite/gas/i386/x86-64-movdir-intel.d: Likewise.
* testsuite/gas/i386/x86-64-movdir.d: Likewise.
* testsuite/gas/i386/enqcmd-intel.d: Likewise.
* testsuite/gas/i386/enqcmd.d: Likewise.
* testsuite/gas/i386/movdir-intel.d: Likewise.
* testsuite/gas/i386/movdir.d: Likewise.
opcodes/
PR gas/26685
* i386-dis.c (mod_table): Replace Gv with Gdq on movdiri.
Jan Beulich [Tue, 21 Jul 2020 12:20:11 +0000 (14:20 +0200)]
Revert "x86: Don't display eiz with no scale"
This reverts commit 04c662e2b66bedd050f97adec19afe0fcfce9ea7.
In my underlying suggestion I neglected the fact that in those
cases (,%eiz,1) is the only visible indication that 32-bit
addressing is in effect.
Alex Coplan [Tue, 6 Oct 2020 14:56:44 +0000 (15:56 +0100)]
aarch64: Fix bogus type punning in parse_barrier() [PR26699]
This patch fixes a bogus use of type punning in parse_barrier() which
was causing an assembly failure on big endian LP64 hosts when attempting
to assemble "isb sy" for AArch64.
The type of the entries in aarch64_barrier_opt_hsh is
aarch64_name_value_pair. We were incorrectly casting this to the
locally-defined asm_barrier_opt which has a wider type (on LP64) for the
second member. This happened to work on little-endian hosts but fails on
LP64 big endian.
The fix is to use the correct type in parse_barrier(). This makes the
locally-defined asm_barrier_opt redundant, so remove it.
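The endianness dependence can be reproduced outside the assembler. The sketch below assumes an LP64 layout with an 8-byte pointer, a 4-byte value and four bytes of zeroed tail padding; `read_as_wider` is a hypothetical helper that stores the narrower record and reads it back through the wider one.

```python
import struct


def read_as_wider(value: int, byteorder: str) -> int:
    """Store (pointer, uint32) and read it back as (pointer, uint64)."""
    prefix = "<" if byteorder == "little" else ">"
    # 8-byte pointer, 4-byte value, 4 bytes of zero padding.
    raw = struct.pack(prefix + "QI4x", 0, value)
    # Reinterpret the same 16 bytes with an 8-byte second member.
    _, punned = struct.unpack(prefix + "QQ", raw)
    return punned
```

On a little-endian layout the value survives because the padding lands in the high-order bytes; on a big-endian layout it ends up shifted left by 32 bits.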
* powerpc.cc (Target_powerpc): Rename power10_stubs_ to
power10_relocs_.
(Target_powerpc::set_power10_relocs): New accessor.
(Target_powerpc::set_power10_stubs): Delete.
(Target_powerpc::power10_stubs): Adjust.
(Target_powerpc::has_localentry0): New accessor.
(ld_0_11): New constant.
(glink_eh_frame_fde_64v1, glink_eh_frame_fde_64v2): Adjust.
(glink_eh_frame_fde_64v2_localentry0): New.
(Output_data_glink::pltresolve_size): Update.
(Output_data_glink::add_eh_frame): Use localentry0 version eh_frame.
(Output_data_glink::do_write): Move r2 save to start of ELFv2 stub
and only emit for has_localentry0. Don't use r2 in the stub.
(Target_powerpc::Scan::local, global): Adjust for
set_power10_relocs renaming.
(Target_powerpc::scan_relocs): Warn and reset plt_localentry0_.
Alan Modra [Sat, 26 Sep 2020 05:40:09 +0000 (15:10 +0930)]
PPC64_OPT_LOCALENTRY is incompatible with tail calls
The save of r2 in __glink_PLTresolve is the culprit. Remove it,
unless we know we need it for --plt-localentry. --plt-localentry
should not be used with power10 pc-relative code that makes tail
calls.
The patch also removes use of r2 as a scratch reg in the ELFv2
__glink_PLTresolve. Using r2 isn't a problem, this is just reducing
the number of scratch regs.
bfd/
* elf64-ppc.c (GLINK_PLTRESOLVE_SIZE): Depend on has_plt_localentry0.
(LD_R0_0R11, ADD_R11_R0_R11): Define.
(ppc64_elf_tls_setup): Disable params->plt_localentry0 when power10
code detected.
(ppc64_elf_size_stubs): Update __glink_PLTresolve eh_frame.
(ppc64_elf_build_stubs): Move r2 save to start of __glink_PLTresolve,
and only emit for has_plt_localentry0. Don't use r2 in the stub.
ld/
* testsuite/ld-powerpc/elfv2so.d,
* testsuite/ld-powerpc/notoc2.d,
* testsuite/ld-powerpc/tlsdesc.wf,
* testsuite/ld-powerpc/tlsdesc2.d,
* testsuite/ld-powerpc/tlsdesc2.wf,
* testsuite/ld-powerpc/tlsopt5.d,
* testsuite/ld-powerpc/tlsopt5.wf,
* testsuite/ld-powerpc/tlsopt6.d,
* testsuite/ld-powerpc/tlsopt6.wf: Update __glink_PLTresolve.
* options.h (DEFINE_enum): Add optional_arg__ param, adjust
all uses.
(General_options): Add --power10-stubs and --no-power10-stubs.
* options.cc (General_options::finalize): Handle --power10-stubs.
* powerpc.cc (set_power10_stubs): Don't set when --power10-stubs=no.
(power10_stubs_auto): New.
(struct Plt_stub_ent): Add toc_ and tocoff_. Don't use a bitfield
for indx_.
(struct Branch_stub_ent): Add toc_ and tocoff_. Use bitfields for
iter_, notoc_ and save_res_.
(add_plt_call_entry): Set toc_. Adjust resizing conditions for
--power10-stubs=auto.
(add_long_branch_entry): Set toc_.
(add_eh_frame, define_stub_syms): No longer use const_iterators
for plt and long branch stub iteration.
(build_tls_opt_head, build_tls_opt_tail): Change parameters and
return value. Move tests for __tls_get_addr to callers.
(plt_call_size): Handle --power10-stubs=auto.
(branch_stub_size): Likewise.
(Stub_table::do_write): Likewise.
(relocate): Likewise.
Alan Modra [Tue, 22 Sep 2020 13:21:42 +0000 (22:51 +0930)]
PR26656, power10 libstdc++.so segfault in __cxxabiv1::__cxa_throw
This adds missing support for a power10 version of the __tls_get_addr
call stub implementing DT_PPC64_OPT PPC64_OPT_TLS. Without this,
power10 code using __tls_get_addr fails miserably at runtime unless
the --no-tls-get-addr-optimize option is given.
PR 26656
* elf64-ppc.c (plt_stub_size): Add "odd" param. Use it with
size_power10_offset rather than calculating from start of stub.
Add size for notoc tls_get_addr_opt stub.
(plt_stub_pad): Add "odd" param, pass to plt_stub_size.
(build_tls_get_addr_head, build_tls_get_addr_tail): New functions.
(build_tls_get_addr_stub): Delete.
(ppc_build_one_stub): Use a temp for htab->params->stub_bfd.
Emit notoc tls_get_addr_opt stub. Move eh_frame code to
suit. Adjust code to use build_tls_get_addr_head/tail in place
of build_tls_get_addr_stub.
(ppc_size_one_stub): Size notoc tls_get_addr_opt stub.
Adjust plt_stub_size and plt_stub_pad calls. Correct "odd"
when padding stub. Size eh_frame for notoc stub too.
Correct lr_restore value.
(ppc64_elf_relocate_section): Don't skip over first insn of
notoc tls_get_addr_opt stub.
Some of the powerpc64 code editing functions are better run after
dynamic symbols have stabilised in order to make proper decisions
based on SYMBOL_REFERENCES_LOCAL. The dynamic symbols are processed
early in bfd_elf_size_dynamic_sections, before the backend
always_size_sections function is called.
One function, ppc64_elf_tls_setup must run before
bfd_elf_size_dynamic_sections because it changes dynamic symbols.
ppc64_elf_edit_opd and ppc64_elf_inline_plt can run early or late, I
think. ppc64_elf_tls_optimize and ppc64_elf_edit_toc are better run
later.
So this patch arranges to call some edit functions later via
always_size_sections.
Alan Modra [Tue, 18 Aug 2020 23:17:35 +0000 (08:47 +0930)]
Correct vcmpsq, vcmpuq and xvtlsbb BF field
These shouldn't be optional. The record form of vector instructions
sets CR6, giving an expectation that omitting BF should be the same as
specifying CR6.
opcodes/
* ppc-opc.c (powerpc_opcodes): Replace OBF with BF for vcmpsq,
vcmpuq and xvtlsbb.
gas/
* testsuite/gas/ppc/int128.s: Correct vcmpuq.
* testsuite/gas/ppc/int128.d: Update.
* testsuite/gas/ppc/xvtlsbb.d: Update.
Alan Modra [Wed, 12 Aug 2020 14:01:28 +0000 (23:31 +0930)]
PowerPC64 --no-pcrel-optimize
This new option effectively ignores R_PPC64_PCREL_OPT, disabling the
optimization of instructions marked by that relocation. The patch
also disables GOT indirect to GOT/TOC pointer relative code editing
when --no-toc-optimize.
bfd/
* elf64-ppc.h (struct ppc64_elf_params): Add no_pcrel_opt.
* elf64-ppc.c (ppc64_elf_relocate_section): Disable GOT reloc
optimizations when --no-toc-optimize. Disable R_PPC64_PCREL_OPT
optimization when --no-pcrel-optimize.
ld/
* emultempl/ppc64elf.em (params): Init new field.
(enum ppc64_opt): Add OPTION_NO_PCREL_OPT.
(PARSE_AND_LIST_LONGOPTS, PARSE_AND_LIST_OPTIONS),
(PARSE_AND_LIST_ARGS_CASES): Support --no-pcrel-optimize.
Nick Clifton [Tue, 15 Sep 2020 09:27:50 +0000 (10:27 +0100)]
Add support to the assembler for a ".nop" directive which inserts a single no-op instruction.
Import from mainline:
2020-09-14 Nick Clifton <nickc@redhat.com>
* read.c (s_nop): New function. Handles the .nop directive.
(potable): Add entry for "nop".
(s_nops): Code tidy.
* read.h (s_nop): Add prototype.
* config/tc-bpf.h (md_single_noop_insn): Define.
* config/tc-mmix.h (md_single_noop_insn): Define.
* config/tc-or1k.h (md_single_noop_insn): Define.
* config/tc-ia64.h (md_single_noop_insn): Define.
* write.c (relax_segment): Update error message regarding
non-absolute values passed to .fill and .nops.
* NEWS: Mention the new directive.
* doc/as.texi: Document the new directive.
* doc/internals.texi: Document the new internal macros used to
implement the new directive.
* testsuite/gas/all/nop.s: New test.
* testsuite/gas/all/nop.d: New test control file.
* testsuite/gas/all/gas.exp: Run the new test.
* testsuite/gas/elf/dwarf-5-nop-for-line-table.s: New test.
* testsuite/gas/elf/dwarf-5-nop-for-line-table.d: New test
control file.
* testsuite/gas/elf/elf.exp: Run the new test.
* testsuite/gas/i386/space1.l: Adjust expected output.
CRIS: fix PR ld/26589, a missing NULL check in fix for PR ld/22269
Not sure why there wasn't a NULL check in the ld/22269 patch
(e01c16a8) at the time, as there was one for the corresponding patch
to elf32-m68k.c (5056ba1d).
Incidentally, I had missed that in 2017, as a prerequisite for the
ld/22269 series, the check_relocs functions were finally made "safe"!
(I.e. the number of references and symbol types are final, garbage
collection done, so port-specific accounting can be made sanely.)
Committed.
bfd:
PR ld/26589
* elf32-cris.c (cris_elf_check_relocs): Add missing NULL check
on argument before calling UNDEFWEAK_NO_DYNAMIC_RELOC.
ld:
PR ld/26589
* testsuite/ld-elf/pr26589.d, testsuite/ld-elf/locref3.s: New test.
Mark Wielaard [Mon, 7 Sep 2020 12:25:25 +0000 (14:25 +0200)]
gas: Don't error when .debug_line already exists, unless .loc was used
When -g was used to generate DWARF, gas would error out if a .debug_line
section already existed. But if a .debug_info section already existed, it
would simply skip generating one without warning or error. Do the same for
.debug_line. It is only an error when the user explicitly uses .loc
directives and also generates the .debug_line table itself.
The tests are unfortunately arch specific because the line table is only
generated when actual instructions have been emitted. Use i386 because
that is probably the most used architecture. Before this patch the new
dwarf-line-2 testcase would fail, with this patch it succeeds (and doesn't
try to add its own line table).
gas/ChangeLog:
* as.texi (-g): Explicitly mention when .debug_info and .debug_line
are generated for the DWARF format.
(Loc): Add that it is an error to both use a .loc directive and
generate a .debug_line yourself.
* dwarf2dbg.c (dwarf2_any_loc_directive_seen): New static variable.
(dwarf2_directive_loc): Set dwarf2_any_loc_directive_seen to TRUE.
(dwarf2_finish): Check dwarf2_any_loc_directive_seen before emitting
an error. Only create .debug_line if it is empty (or doesn't exist).
* testsuite/gas/i386/i386.exp: Add dwarf2-line-{1,2,3,4} when testing
an elf target.
* testsuite/gas/i386/dwarf2-line-{1,2,3,4}.{s,d,l}: New test files.
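The behaviour described in this commit message boils down to a three-way decision. A minimal sketch follows; `debug_line_action` and its return strings are hypothetical and only model the rule as stated in the message, not the control flow of dwarf2_finish.

```python
def debug_line_action(section_has_contents: bool, loc_directive_seen: bool) -> str:
    """Decide what to do about .debug_line at the end of assembly."""
    if not section_has_contents:
        # Empty or missing section: the assembler builds the table itself.
        return "generate"
    if loc_directive_seen:
        # User supplied both .loc directives and their own line table.
        return "error"
    # User supplied their own table and no .loc: leave it alone.
    return "skip"
```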
Mark Wielaard [Mon, 7 Sep 2020 13:03:20 +0000 (14:03 +0100)]
gas: Output directory and file names in .debug_line_str for DWARF5
* dwarf2dbg.c (add_line_strp): New function.
(out_dir_and_file_list): Take line_seg and sizeof_offset as
arguments, Use DW_FORM_line_strp for dir and file. Call
add_line_strp and set symbol offset for DWARF2_LINE_VERSION 5.
(out_debug_line): Call out_dir_and_file_list with line_seg and
sizeof_offset.
* gas/testsuite/gas/elf/dwarf-5-file0.d: Expect indirect line
strings.
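DW_FORM_line_strp stores an offset into .debug_line_str instead of the string itself. The toy model below shows the idea; `LineStrTable` is a hypothetical class, and the sketch makes no attempt to merge duplicate strings or to reflect how add_line_strp is implemented.

```python
class LineStrTable:
    """Toy model of a .debug_line_str section."""

    def __init__(self) -> None:
        self.data = bytearray()

    def add(self, text: str) -> int:
        """Append a NUL-terminated string and return its section offset."""
        offset = len(self.data)
        self.data += text.encode("utf-8") + b"\0"
        return offset
```

The directory and file entries in the line table header would then carry the returned offsets rather than inline strings.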
Mark Wielaard [Mon, 7 Sep 2020 12:04:45 +0000 (13:04 +0100)]
gas: Output .debug_rnglists for DWARF 5.
* dwarf2dbg.c (DWARF2_RNGLISTS_VERSION): New constant.
(out_debug_ranges): Add ranges_sym argument and set it.
(out_debug_rnglists): New function.
(out_debug_info): Change ranges_seg argument to ranges_sym
and use it to set DW_AT_ranges value.
(dwarf2_finish): Remove ranges_seg, add ranges_sym. For
DWARF2_VERSION 5 call out_debug_rnglists.
Mark Wielaard [Tue, 1 Sep 2020 13:29:56 +0000 (15:29 +0200)]
gas: Use DW_FORM_sec_offset for DWARF version 4 or higher.
Older DWARF versions used DW_FORM_data4 or DW_FORM_data8 for offsets
into sections for e.g. DW_AT_stmt_list or DW_AT_ranges. But version 4
introduced a dedicated form for such section offsets. Make sure to emit
the proper form for newer DWARF versions.
gas/ChangeLog:
* dwarf2dbg.c (out_debug_abbrev): Use DW_FORM_sec_offset for DWARF
version 4 or higher.
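The form selection described above can be sketched as follows. The form codes are the standard DWARF values; `section_offset_form` and its `offset_size` parameter are hypothetical names, and the code is not taken from out_debug_abbrev.

```python
# Standard DWARF form codes.
DW_FORM_data4 = 0x06
DW_FORM_data8 = 0x07
DW_FORM_sec_offset = 0x17


def section_offset_form(dwarf_version: int, offset_size: int) -> int:
    """Pick the attribute form used for an offset into another section."""
    if dwarf_version >= 4:
        # Version 4 introduced a dedicated form for section offsets.
        return DW_FORM_sec_offset
    # Older versions reuse the plain data forms, sized by the offset width.
    return DW_FORM_data8 if offset_size == 8 else DW_FORM_data4
```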
Mark Wielaard [Wed, 26 Aug 2020 19:46:04 +0000 (21:46 +0200)]
gas: Handle bad -gdwarf options, just like bad --gdwarf options.
parse_args uses getopt_long_only so it can handle long options both
with double and single dash. But this means that some single dash
options like -gdwarf-1 don't generate an error (unlike --gdwarf-1).
This is especially confusing since there is also --gdwarf2, but no
--gdwarf4 (it is --gdwarf-4). When giving -gdwarf4 the option is
silently interpreted as -g (which sets dwarf_version to 2). This causes
some confusion for people who don't expect this and suddenly get
DWARF2 instead of DWARF4 as they might expect.
So make it so that the -gdwarf<unknown> creates an error, just like
--gdwarf<unknown> would.
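The intended behaviour can be modelled as a strict table lookup that treats single-dash and double-dash spellings alike. This is a sketch only: the option table is an illustrative subset, and `dwarf_version_for` is a hypothetical function unrelated to the parse_args code.

```python
# Illustrative subset of accepted spellings (without leading dashes).
KNOWN_GDWARF_OPTIONS = {
    "gdwarf-2": 2,
    "gdwarf2": 2,
    "gdwarf-3": 3,
    "gdwarf-4": 4,
    "gdwarf-5": 5,
}


def dwarf_version_for(option: str) -> int:
    """Map a -gdwarf/--gdwarf option to a DWARF version, or raise."""
    name = option.lstrip("-")
    if name not in KNOWN_GDWARF_OPTIONS:
        # Reject unknown spellings instead of falling back to plain -g.
        raise ValueError(f"unknown DWARF option: {option}")
    return KNOWN_GDWARF_OPTIONS[name]
```

With this rule, -gdwarf4 is reported as an error rather than silently selecting DWARF 2.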
Alan Modra [Mon, 24 Aug 2020 07:02:57 +0000 (16:32 +0930)]
PowerPC TPREL_HA/LO optimisation
ppc64 ld optimises sequences like the following
addis 3,13,wot@tprel@ha
lwz 3,wot@tprel@l(3)
to
nop
lwz 3,wot@tprel(13)
when "wot" is located near enough to the thread pointer.
However, the ABI doesn't require that R_PPC64_TPREL16_HA always be on
an addis rt,13,imm instruction, and while ld checked for that on the
high-part instruction it didn't disable the optimisation on the
low-part instruction. This patch fixes that problem, disabling the
tprel optimisation globally if high-part instructions don't pass
sanity checks. The optimisation is also enabled for ppc32, where
before ld.bfd had the code in the wrong place and ld.gold had it in a
block only enabled for ppc64.
bfd/
* elf32-ppc.c (ppc_elf_check_relocs): Set has_tls_reloc for
high part tprel16 relocs.
(ppc_elf_tls_optimize): Sanity check high part tprel16 relocs.
Clear do_tls_opt on odd instructions.
(ppc_elf_relocate_section): Move TPREL16_HA/LO optimisation later.
Don't sanity check them here.
* elf64-ppc.c (ppc64_elf_check_relocs): Set has_tls_reloc for
high part tprel16 relocs.
(ppc64_elf_tls_optimize): Sanity check high part tprel16 relocs.
Clear do_tls_opt on odd instructions.
(ppc64_elf_relocate_section): Don't sanity check TPREL16_HA.
ld/
* testsuite/ld-powerpc/tls32.d: Update for TPREL_HA/LO optimisation.
* testsuite/ld-powerpc/tlsexe32.d: Likewise.
* testsuite/ld-powerpc/tlsldopt32.d: Likewise.
* testsuite/ld-powerpc/tlsmark32.d: Likewise.
* testsuite/ld-powerpc/tlsopt4_32.d: Likewise.
* testsuite/ld-powerpc/tprel.s,
* testsuite/ld-powerpc/tprel.d,
* testsuite/ld-powerpc/tprel32.d: New tests.
* testsuite/ld-powerpc/tprelbad.s,
* testsuite/ld-powerpc/tprelbad.d: New test.
* testsuite/ld-powerpc/powerpc.exp: Run them.
gold/
* powerpc.cc (Target_powerpc): Add tprel_opt_ and accessors.
(Target_powerpc::Scan::local): Sanity check tprel high relocs.
(Target_powerpc::Scan::global): Likewise.
(Target_powerpc::Relocate::relocate): Control tprel optimisation
with tprel_opt_ and enable for 32-bit.
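The two conditions in this commit message, "near enough to the thread pointer" and the sanity check on the high-part instruction, can be sketched numerically. The helper names are hypothetical; the sketch assumes the 64-bit thread pointer r13 as in the example above, and it models the global disabling as a single pass over all high-part instructions.

```python
def tprel_ha(offset: int) -> int:
    """High-adjusted 16 bits of a thread-pointer-relative offset (@tprel@ha)."""
    return ((offset + 0x8000) >> 16) & 0xFFFF


def tprel_lo(offset: int) -> int:
    """Low 16 bits of the offset (@tprel@l)."""
    return offset & 0xFFFF


def is_addis_from_r13(insn: int) -> bool:
    """True for 'addis rt,13,imm': primary opcode 15 with RA = 13."""
    return (insn & 0xFC1F0000) == 0x3C0D0000


def tprel_opt_enabled(high_part_insns: list[int]) -> bool:
    """One odd high-part instruction disables the optimisation everywhere."""
    return all(is_addis_from_r13(insn) for insn in high_part_insns)


def can_nop_high_part(offset: int) -> bool:
    """The addis is redundant when the offset fits in a signed 16-bit field."""
    return tprel_ha(offset) == 0
```

For example, 0x3c6d0000 encodes addis 3,13,0 and passes the check, while the same instruction with a base register other than r13 does not.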
Nick Clifton [Thu, 3 Sep 2020 15:00:48 +0000 (16:00 +0100)]
Partially fix a quadratic slowdown when processing secondary relocations for inputs with lots of sections.
PR 26406
* elf-bfd.h (struct bfd_elf_section_data): Add
has_secondary_relocs field.
* elf.c (_bfd_elf_copy_special_section_fields): Set the
has_secondary_relocs field for sections which have associated
secondary relocs.
* elfcode.h (elf_write_relocs): Only call write_secondary_relocs
on sections which have associated secondary relocs.
Jose E. Marchesi [Wed, 26 Aug 2020 13:46:09 +0000 (15:46 +0200)]
bpf: add xBPF ISA
This patch adds support for xBPF, another ISA targeting the BPF
virtual architecture. For now, the primary difference between eBPF
and xBPF is that xBPF supports indirect calls through the
'call %reg' form of the call instruction.
bfd/
* archures.c (bfd_mach_xbpf): Define.
* bfd-in2.h: Regenerate.
* cpu-bpf.c (bfd_xbpf_arch): New.
(bfd_bpf_arch): Update next in list field to point to xbpf arch.
cpu/
* bpf.cpu (arch bpf): Add xbpf mach and isas.
(define-xbpf-isa): New pmacro.
(all-isas): Add xbpfle,xbpfbe.
(endian-isas): New pmacro.
(mach xbpf): New.
(model xbpf-def): Likewise.
(h-gpr): Add xbpf mach.
(f-dstle, f-srcle, dstle, srcle): Add xbpfle isa.
(f-dstbe, f-srcbe, dstbe, srcbe): Add xbpfbe isa.
(define-alu-insn-un): Use new endian-isas pmacro.
(define-alu-insn-bin, define-alu-insn-mov): Likewise.
(define-endian-insn, define-lddw): Likewise.
(dlind, dxli, dxsi, dsti): Likewise.
(define-cond-jump-insn, define-call-insn): Likewise.
(define-atomic-insns): Likewise.
gas/
* config/tc-bpf.c: Add option -mxbpf to select xbpf isa.
* testsuite/gas/bpf/indcall-1.d: New file.
* testsuite/gas/bpf/indcall-1.s: Likewise.
* testsuite/gas/bpf/indcall-bad-1.l: Likewise.
* testsuite/gas/bpf/indcall-bad-1.s: Likewise.
* testsuite/gas/bpf/bpf.exp: Run new tests.
opcodes/
* bpf-desc.c: Regenerate.
* bpf-desc.h: Likewise.
* bpf-opc.c: Likewise.
* bpf-opc.h: Likewise.
* disassemble.c (disassemble_init_for_target): Set bits for xBPF
ISA when appropriate.