- A kernel is launched
- The internal runtime breakpoint is hit during the second
hipLaunchKernelGGL call, which causes
amd_dbgapi_target_breakpoint::check_status to be called
- Meanwhile, all waves of the kernel hit the breakpoint on vectorADD
- amd_dbgapi_target_breakpoint::check_status calls process_event_queue,
which pulls the thousands of breakpoint hit events from the kernel
- As part of handling the breakpoint hit events, we write the PC of the
waves that stopped to decrement it. Because the forward progress
requirement is not disabled, this causes a suspend/resume of the
queue each time, which is time-consuming.
The stack trace where this all happens is:
#32 0x00007ffff6b9abda in amd_dbgapi_write_register (wave_id=..., register_id=..., offset=0, value_size=8, value=0x7fffea9fdcc0) at /home/smarchi/src/amd-dbgapi/src/register.cpp:587
#33 0x00005555588c0bed in amd_dbgapi_target::store_registers (this=0x55555c7b1d20 <the_amd_dbgapi_target>, regcache=0x507000002240, regno=470) at /home/smarchi/src/wt/amd/gdb/amd-dbgapi-target.c:2504
#34 0x000055555a5186a1 in target_store_registers (regcache=0x507000002240, regno=470) at /home/smarchi/src/wt/amd/gdb/target.c:3973
#35 0x0000555559fab831 in regcache::raw_write (this=0x507000002240, regnum=470, src=...) at /home/smarchi/src/wt/amd/gdb/regcache.c:890
#36 0x0000555559fabd2b in regcache::cooked_write (this=0x507000002240, regnum=470, src=...) at /home/smarchi/src/wt/amd/gdb/regcache.c:915
#37 0x0000555559fc3ca5 in regcache::cooked_write<unsigned long, void> (this=0x507000002240, regnum=470, val=140737323456768) at /home/smarchi/src/wt/amd/gdb/regcache.c:850
#38 0x0000555559fab09a in regcache_cooked_write_unsigned (regcache=0x507000002240, regnum=470, val=140737323456768) at /home/smarchi/src/wt/amd/gdb/regcache.c:858
#39 0x0000555559fb0678 in regcache_write_pc (regcache=0x507000002240, pc=0x7ffff62bd900) at /home/smarchi/src/wt/amd/gdb/regcache.c:1460
#40 0x00005555588bb37d in process_one_event (event_id=..., event_kind=AMD_DBGAPI_EVENT_KIND_WAVE_STOP) at /home/smarchi/src/wt/amd/gdb/amd-dbgapi-target.c:1873
#41 0x00005555588bbf7b in process_event_queue (process_id=..., until_event_kind=AMD_DBGAPI_EVENT_KIND_BREAKPOINT_RESUME) at /home/smarchi/src/wt/amd/gdb/amd-dbgapi-target.c:2006
#42 0x00005555588b1aca in amd_dbgapi_target_breakpoint::check_status (this=0x511000140900, bs=0x50600014ed00) at /home/smarchi/src/wt/amd/gdb/amd-dbgapi-target.c:890
#43 0x0000555558c50080 in bpstat_stop_status (aspace=0x5070000061b0, bp_addr=0x7fffed0b9ab0, thread=0x518000026c80, ws=..., stop_chain=0x50600014ed00) at /home/smarchi/src/wt/amd/gdb/breakpoint.c:6126
#44 0x000055555984f4ff in handle_signal_stop (ecs=0x7fffeaa40ef0) at /home/smarchi/src/wt/amd/gdb/infrun.c:7169
#45 0x000055555984b889 in handle_inferior_event (ecs=0x7fffeaa40ef0) at /home/smarchi/src/wt/amd/gdb/infrun.c:6621
#46 0x000055555983eab6 in fetch_inferior_event () at /home/smarchi/src/wt/amd/gdb/infrun.c:4750
#47 0x00005555597caa5f in inferior_event_handler (event_type=INF_REG_EVENT) at /home/smarchi/src/wt/amd/gdb/inf-loop.c:42
#48 0x00005555588b838e in handle_target_event (client_data=0x0) at /home/smarchi/src/wt/amd/gdb/amd-dbgapi-target.c:1513
Fix that performance problem by disabling the forward progress
requirement in amd_dbgapi_target_breakpoint::check_status, before
calling process_event_queue, so that we can process all events
efficiently.
Since the same performance problem could theoretically happen any time
process_event_queue is called with forward progress requirement enabled,
add an assert to ensure that forward progress requirement is disabled
when process_event_queue is invoked. This makes it necessary to add a
require_forward_progress call to amd_dbgapi_finalize_core_attach. It
looks a bit strange, since core files don't have execution, but it
doesn't hurt.
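The cost difference described above can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions: fake_queue, set_forward_progress_required, write_pc and process_stop_events are hypothetical names invented for illustration, not the amd-dbgapi or GDB API, and the 4-byte PC adjustment is an arbitrary placeholder. Unlike the real change, the sketch re-enables the requirement at the end of the batch so that the cycles can be counted.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a GPU queue; not the amd-dbgapi API.
struct fake_queue
{
  bool forward_progress_required = true;
  bool suspended = false;
  int suspend_resume_cycles = 0;

  // Dropping the requirement lets the queue stay suspended; restoring it
  // resumes the queue, which completes one suspend/resume cycle.
  void set_forward_progress_required (bool required)
  {
    if (required == forward_progress_required)
      return;
    forward_progress_required = required;
    if (!required)
      suspended = true;
    else
      {
        suspended = false;
        ++suspend_resume_cycles;
      }
  }

  // Writing a wave's PC needs the queue to be suspended.
  void write_pc (std::uint64_t &pc, std::uint64_t new_pc)
  {
    if (forward_progress_required)
      ++suspend_resume_cycles;  // Suspend and resume around this one write.
    else
      assert (suspended);       // Already suspended, nothing extra to pay.
    pc = new_pc;
  }
};

// Rewind the PC of every stopped wave and return how many suspend/resume
// cycles the whole batch cost.
int process_stop_events (fake_queue &q, std::vector<std::uint64_t> &wave_pcs,
                         bool disable_first)
{
  const int before = q.suspend_resume_cycles;
  if (disable_first)
    q.set_forward_progress_required (false);
  for (std::uint64_t &pc : wave_pcs)
    q.write_pc (pc, pc - 4);  // Placeholder adjustment size.
  if (disable_first)
    q.set_forward_progress_required (true);
  return q.suspend_resume_cycles - before;
}
```

In this model, handling N stop events costs N cycles with the requirement left enabled, and a single cycle when it is disabled before the batch.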
Add a test that replicates this scenario. The test launches a kernel
that hits a breakpoint (with an always false condition) repeatedly.
Meanwhile, the host process loads and unloads a code object, causing
check_status to be called.
Bug: SWDEV-482511
Change-Id: Ida86340d679e6bd8462712953458c07ba3fd49ec
Approved-by: Lancelot Six <lancelot.six@amd.com>
Simon Marchi [Mon, 9 Jun 2025 16:09:01 +0000 (12:09 -0400)]
gdb/amd-dbgapi: factor out require_forward_progress overload to target one inferior
A following patch will want to call require_forward_progress for a given
inferior. Extract a new require_forward_progress overload from the
existing require_forward_progress function that targets a specific
inferior.
Change-Id: I54f42b83eb8443d4d91747ffbc86eaeb017f1e49
Approved-by: Lancelot Six <lancelot.six@amd.com>
Simon Marchi [Mon, 9 Jun 2025 16:09:00 +0000 (12:09 -0400)]
gdb/amd-dbgapi: pass amd_dbgapi_inferior_info to process_one_event
Pass the amd_dbgapi_inferior_info object from process_event_queue to
process_one_event. Since process_event_queue pulls events for one
specific inferior, we know for which inferior the event is. This
removes the need for process_one_event to do two dbgapi calls to get the
relevant pid. It also removes one inferior lookup.
Change-Id: I22927e4b6251513eb3be95785082058aa3d09954
Approved-by: Lancelot Six <lancelot.six@amd.com>
Simon Marchi [Mon, 9 Jun 2025 16:08:59 +0000 (12:08 -0400)]
gdb/amd-dbgapi: pass amd_dbgapi_inferior_info to process_event_queue
A following patch will make process_event_queue access a field of
amd_dbgapi_inferior_info. Prepare for this by making
process_event_queue accept an amd_dbgapi_inferior_info object, instead
of a process id.
Change-Id: I9adc491dd1ff64ff74c40aa7662fffb11bd8332b
Approved-by: Lancelot Six <lancelot.six@amd.com>
Simon Marchi [Mon, 9 Jun 2025 16:08:58 +0000 (12:08 -0400)]
gdb/amd-dbgapi: add assert in require_forward_progress
I didn't have a problem in this area, but it seems to me that this
pre-condition should always hold. We should only disable forward
progress requirement if the target says it's ok to do so. Otherwise, we
could get in a situation where we wait for events from amd-dbgapi, which
will never arrive, because amd-dbgapi didn't actually resume things.
Change-Id: Ifc49f55c7874924b7c47888b8391a07a01d960fc
Approved-by: Lancelot Six <lancelot.six@amd.com>
Tom de Vries [Mon, 16 Jun 2025 13:13:25 +0000 (15:13 +0200)]
[gdb/testsuite] Fix gdb.python/py-source-styling-2.exp with TERM=dumb
When running test-case gdb.python/py-source-styling-2.exp with TERM=dumb, I
get:
...
(gdb) set style enabled on^M
warning: The current terminal doesn't support styling. \
Styled output might not appear as expected.^M
(gdb) FAIL: $exp: set style enabled on
...
Fix this by using with_ansi_styling_terminal on clean_restart.
+#if BFD_SUPPORTS_PLUGINS
+ /* Copy LTO IR file as unknown object. */
+ if (bfd_plugin_target_p (ibfd->xvec))
^^^^ A typo, should be this_element.
+ ok_object = false;
+ else
+#endif
if (ok_object)
{
ok = copy_object (this_element, output_element, input_arch);
to check if the archive element is a LTO IR file. "ibfd" is the archive
BFD. "this_element" should be used to check for LTO IR in the archive
element. Fix it by replacing "ibfd" with "this_element".
PR binutils/33078
* objcopy.c (copy_archive): Correctly check archive element for
LTO IR.
* testsuite/binutils-all/objcopy.exp (strip_test_archive): New.
Run strip_test_archive.
Stafford Horne [Sun, 1 Jun 2025 05:39:01 +0000 (06:39 +0100)]
or1k: Add support for numcores and coreid sprs
These are needed when running GCC tests for newlib toolchains built with
multicore support. Without these SPRs we get the following warnings
when running tests.
spawn or1k-elf-run ./20000112-1.exe^M
WARNING: l.mfspr with invalid SPR address 0x80^M
WARNING: l.mfspr with invalid SPR address 0x81^M
WARNING: l.mfspr with invalid SPR address 0x81^M
WARNING: l.mfspr with invalid SPR address 0x81^M
Support is added by defining the SPRs in the cgen machine definition and
regenerating the machine code. In or1k/or1k.c we initialize NUMCORES to
1 and COREID to 0 as the sim has only one CPU. In or1k/traps.c we allow
returning the NUMCORES and COREID spr values in the mfspr function.
Simon Marchi [Mon, 5 May 2025 20:15:26 +0000 (16:15 -0400)]
gdbsupport: make gdb::parallel_for_each's n parameter a template parameter
This value will likely never change at runtime, so we might as well make
it a template parameter. This has the "advantage" of being able to
remove the unnecessary param from gdb::sequential_for_each.
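As a rough illustration of the idea, the sketch below turns a batch-size value into a template parameter, so the sequential variant needs no such parameter at all. The names chunked_for_each and sequential_for_each_sketch, their signatures, and the interpretation of the value as a batch size are assumptions made for this example; they are not the gdbsupport declarations.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical simplified stand-in: invoke CALLBACK on consecutive ranges
// of at most MinBatch elements.  The batch size is fixed at compile time.
template<std::size_t MinBatch, typename Iter, typename Func>
void chunked_for_each (Iter first, Iter last, Func callback)
{
  static_assert (MinBatch > 0, "batch size must be positive");
  while (first != last)
    {
      const std::size_t remaining = static_cast<std::size_t> (last - first);
      const std::size_t len = remaining < MinBatch ? remaining : MinBatch;
      callback (first, first + len);
      first += len;
    }
}

// Sequential variant: it handles the whole range in one call, so it has no
// use for a batch-size parameter.
template<typename Iter, typename Func>
void sequential_for_each_sketch (Iter first, Iter last, Func callback)
{
  callback (first, last);
}

// Usage example: 10 elements with a batch size of 4 give batches of 4, 4, 2.
inline void demo_chunked_for_each ()
{
  std::vector<int> v (10, 1);
  int batches = 0;
  chunked_for_each<4> (v.begin (), v.end (),
                       [&] (auto, auto) { ++batches; });
  assert (batches == 3);
}
```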
Change-Id: Ia172ab8e08964e30d4e3378a95ccfa782abce674
Approved-By: Tom Tromey <tom@tromey.com>
Simon Marchi [Fri, 2 May 2025 17:57:57 +0000 (13:57 -0400)]
gdb: re-work parallel-for-selftests.c
I find this file difficult to work with and modify, due to how it uses
the preprocessor to include itself, to generate variations of the test
functions. Change it to something a bit more C++-y, with a test
function that accepts a callback to invoke the foreach function under
test.
Jan Beulich [Fri, 13 Jun 2025 11:46:30 +0000 (13:46 +0200)]
x86: don't constrain %axl/%cxl
They can be used like their %al/%cl counterparts everywhere else;
there's no apparent reason why they shouldn't be usable as accumulator /
shift count respectively. Enforcing such a restriction only makes
writing heavily macro-ized code more cumbersome.
Jan Beulich [Fri, 13 Jun 2025 11:46:06 +0000 (13:46 +0200)]
x86: swap operands in OUT-with-immediate template
In a number of places we assume that immediates come first in the set of
operands. It is mere luck that so far OUT, having operands the other way
around, wasn't negatively impacted by this.
Leverage this to have a few loops start from the first non-immediate
operand (or in one case to stop there). Note, however, that
process_immext() inserts an immediate last, so especially all output_*()
functions cannot be changed in the same way.
objcopy: /tmp/objcopy-poc(OrcError.cpp.o): invalid entry (0x22000000) in group [3]
objcopy: /tmp/objcopy-poc(OrcError.cpp.o): invalid entry (0x21000000) in group [3]
objcopy: /tmp/objcopy-poc(OrcError.cpp.o)(.text._ZNK12_GLOBAL__N_116OrcErrorCategory7messageB5cxx11Ei): relocation 29 has invalid symbol index 1160982879
objcopy: /tmp/stv73zYw/OrcError.cpp.o[.text._ZN4llvm3orc8orcErrorENS0_12OrcErrorCodeE]: bad value
instead of
objcopy: /tmp/objcopy-poc(OrcError.cpp.o): invalid entry (0x22000000) in group [3]
objcopy: /tmp/objcopy-poc(OrcError.cpp.o): invalid entry (0x21000000) in group [3]
objcopy: /tmp/objcopy-poc(OrcError.cpp.o)(.text._ZNK12_GLOBAL__N_116OrcErrorCategory7messageB5cxx11Ei): relocation 29 has invalid symbol index 1160982879
Segmentation fault (core dumped)
PR binutils/33075
* elf.c (elf_map_symbols): Return false if output_section is
NULL.
Jan Beulich [Fri, 13 Jun 2025 06:40:32 +0000 (08:40 +0200)]
x86: refine UD<n> kind-of-insns
While documentation of these continues to be lacking sufficient detail,
it is becoming increasingly clear that in 66f1eba0b7e8 ("x86: correct
UDn") I went too far with requiring operands, to populate a ModR/M byte.
AMD hardware appears to always behave as indicated as "may" in PM 3.36,
which for all practical purposes means there's no ModR/M byte. The SDM
(rev 087) indicates that such behavior can occur on older hardware for
UD0. Re-add an operand-less UD1 form (as well as its UD2B alias), while
newly adding such a form also for UD0. Because of the ambiguity, there's
no good/easy way of handling both possibilities in the disassembler,
which hence remains unaltered.
Further, from all information I'm able to gather, the 0F opcode space
was only introduced with the i286; bump the minimal hardware requirement
for all UD<n> accordingly.
Jan Beulich [Fri, 13 Jun 2025 06:40:01 +0000 (08:40 +0200)]
gas: switch convert_to_bignum() to taking just an expression
Both callers, despite spelling things differently, now pass the same
input for its 2nd parameter. Therefore, as was supposed to be the case
anyway, this 2nd parameter isn't needed anymore - the function can
calculate "sign" all by itself from the incoming expression. Instead
make the function return the resulting value, for emit_expr_with_reloc()
to consume for setting its "extra_digit" local variable.
Jan Beulich [Fri, 13 Jun 2025 06:39:44 +0000 (08:39 +0200)]
gas: also maintain signed-ness for O_big expressions
Interestingly emit_leb128_expr() already assumes X_unsigned is properly
set for O_big. Adjust its conversion-to-bignum to respect the incoming
flag, and have convert_to_bignum() correctly set it on output.
It further can't be quite right that convert_to_bignum() depends on
anything other than the incoming expression. Therefore adjust
emit_expr_with_reloc() to be in line with the other invocation.
This also requires an adjustment for SH, which really should have been
part of 762acf217c40 ("gas: maintain O_constant signedness in more
cases").
Jeremy Drake [Fri, 13 Jun 2025 05:52:47 +0000 (07:52 +0200)]
ld,dlltool: move read-only delayimp data into .rdata
This allows the delay IAT to be in its own section with nothing else, as
required by IMAGE_GUARD_DELAYLOAD_IAT_IN_ITS_OWN_SECTION, documented at
https://learn.microsoft.com/en-us/windows/win32/debug/pe-format#load-configuration-layout
Signed-off-by: Jeremy Drake <sourceware-bugzilla@jdrake.com>
LIU Hao [Fri, 13 Jun 2025 05:52:29 +0000 (07:52 +0200)]
bfd,ld,dlltool: Emit delay-load import data into its own section
A delay-import symbol (of a function) is resolved when a call to it is made.
The delay loader may overwrite the `__imp_` pointer to the actual function
after it has been resolved, which requires the pointer itself be in a
writeable section.
Previously it was placed in the ordinary Import Address Table (IAT), which
is emitted into the `.idata` section, which had been changed to read-only
in db00f6c3aceabbf03acdb69e74b59b2d2b043cd7, which caused segmentation
faults when functions from delay-import library were called. This is
PR 32675.
This commit makes DLLTOOL emit delay-import IAT into `.didat`, as specified
by Microsoft. Most of the code is copied from `.idata`, except that this
section is writeable. As a side-effect of this, PR 14339 is also fixed.
Reference: https://learn.microsoft.com/en-us/windows/win32/secbp/pe-metadata#import-handling
Co-authored-by: Jeremy Drake <sourceware-bugzilla@jdrake.com>
Signed-off-by: LIU Hao <lh_mouse@126.com>
Signed-off-by: Jeremy Drake <sourceware-bugzilla@jdrake.com>
Klaus Gerlicher [Thu, 12 Jun 2025 15:37:50 +0000 (15:37 +0000)]
gdb, linespec: avoid multiple locations with same PC
Setting a BP on a line like this would incorrectly yield two BP locations:
01 void two () { {int var = 0;} }
(gdb) break 1
Breakpoint 1 at 0x1164: main.cpp:1. (2 locations)
(gdb) info breakpoints
Num Type Disp Enb Address What
1 breakpoint keep y <MULTIPLE>
1.1 y 0x0000000000001164 in two() at main.cpp:1
1.2 y 0x0000000000001164 in two() at main.cpp:1
In this case decode_digits_ordinary () returns two SALs, exactly matching the
requested line. One for the entry PC and one for the prologue end PC. This was
tested with GCC, CLANG and ICPX. Subsequent code tries to skip the prologue
on these PCs, which in turn makes them the same.
To fix this, ignore SALs with the same PC and program space when adding to the
list of SALs.
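The de-duplication rule can be sketched as follows. This is a minimal illustration, not the linespec.c code: simple_sal and add_sal_unique are hypothetical names, and the structure only carries the fields needed to show the comparison.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified symtab-and-line record.
struct simple_sal
{
  const void *pspace;  // Identity of the program space.
  std::uint64_t pc;    // Address, after prologue skipping.
  int line;
};

// Append SAL to SALS unless an entry with the same program space and PC is
// already present.  Return true if SAL was added.
bool add_sal_unique (std::vector<simple_sal> &sals, const simple_sal &sal)
{
  for (const simple_sal &existing : sals)
    if (existing.pspace == sal.pspace && existing.pc == sal.pc)
      return false;
  sals.push_back (sal);
  return true;
}

// Usage example: the entry-PC SAL and the prologue-end SAL collapse into
// one location once both resolve to 0x1164.
inline void demo_add_sal_unique ()
{
  int pspace = 0;
  std::vector<simple_sal> sals;
  add_sal_unique (sals, {&pspace, 0x1164, 1});
  add_sal_unique (sals, {&pspace, 0x1164, 1});
  assert (sals.size () == 1);
}
```

The same PC in a different program space is kept, since it denotes a distinct location.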
This will then properly set only one location:
(gdb) break 1
Breakpoint 1 at 0x1164: file main.cpp, line 1
(gdb) info breakpoints
Num Type Disp Enb Address What
1 breakpoint keep y 0x0000000000001164 in two() at main.cpp:1
Approved-By: Simon Marchi <simon.marchi@efficios.com>
Andrew Burgess [Wed, 11 Jun 2025 09:20:17 +0000 (10:20 +0100)]
gdb: convert linux-namespaces debug to the new(er) debug scheme
Convert 'set debug linux-namespaces' to the new(er) debug scheme. As
part of this change I converted the mnsh_debug_print_message function,
which previously printed its output, to instead return a std::string,
this string is then printed using linux_namespaces_debug_printf. The
mnsh_debug_print_message function is only used as part of the debug
output.
I also updated one place in the code where debug_linux_namespaces, the
debug control variable, which is a boolean, was assigned an integer.
When debug is turned on then clearly the output is now different, but
in all other cases, there should be no user visible change in GDB
after this commit.
Richard Ball [Thu, 12 Jun 2025 00:39:24 +0000 (01:39 +0100)]
aarch64: Add support for FEAT_FPRCVT
FEAT_FPRCVT introduces new versions of previous instructions.
The instructions are used to convert between floating-point values and
integers. These new versions take as operands SIMD&FP registers
for both the source and destination register. FEAT_FPRCVT also
enables the use of some existing AdvSIMD instructions in
streaming mode. However, no changes are needed in gas to support this.
Aaron Griffith [Mon, 9 Jun 2025 19:19:41 +0000 (15:19 -0400)]
gdb: fix size of z80 "add ii,rr" and "ld (ii+d),n" instructions
The tables in z80-tdep.c previously either gave these instructions the
wrong size, or failed to recognize them by using the wrong masks, or
both. The fixed instructions alongside their representation in octal are:
GDB: doc: Improve AArch64 subsubsection titles and index entries in gdb.texinfo
Remove period from subsubsection titles in the AArch64 configuration-specific
subsection, and expand acronyms.
Regarding @cindex entries, remove periods and standardise their order
and the position of "AArch64" to make it easier to find them by
using the index-searching commands of Info readers that offer TAB
completion.
Matthieu Longo [Wed, 21 May 2025 10:08:31 +0000 (11:08 +0100)]
Arm tests: reduce objdump's output and improve some matching patterns
Linker scripts can change the sections order in the output. Some matching
patterns in tests try to detect the end of a section by detecting the
beginning of the next one. However, they mistakenly enforce the name of
the next section without any need. This caused the tests to break due to
minor changes to the linker scripts.
This patch adds '-j <interesting-section>' to the arguments of objdump
to dump only relevant information for the tests. This removed the issue
related to the ordering of the sections. The matching patterns were also
made stricter to better match the expected output.
Pedro Alves [Thu, 1 Jun 2023 17:43:15 +0000 (18:43 +0100)]
gdb testsuite: Introduce allow_multi_inferior_tests and use it throughout
The Windows port does not support multi-process debugging. Testcases
that want to exercise multi-process currently FAIL and some hit
cascading timeouts. Add a new allow_multi_inferior_tests procedure,
meant to be used with require, and sprinkle it throughout testcases as
needed.
Approved-by: Kevin Buettner <kevinb@redhat.com>
Change-Id: I4a10d8f04f9fa10f4b751f140ad0a6d31fbd9dfb
Pedro Alves [Thu, 1 Jun 2023 15:19:03 +0000 (16:19 +0100)]
gdb testsuite: Introduce allow_fork_tests and use it throughout
Cygwin debugging does not support follow fork. There is currently no
interface between the debugger and the Cygwin runtime to be able to
intercept forks and execs. Consequently, testcases that try to
exercise fork/exec all FAIL, and several hit long cascading timeouts.
Add a new allow_fork_tests procedure, meant to be used with require,
and sprinkle it throughout testcases that exercise fork.
Note that some tests currently are skipped on targets other than
Linux, with something like:
# Until "set follow-fork-mode" and "catch vfork" are implemented on
# other targets...
#
if {![istarget "*-linux*"]} {
continue
}
However, some BSD ports also support fork debugging nowadays, and the
testcases were never adjusted... That is why the new allow_fork_tests
procedure doesn't look for linux.
With this patch, on Cygwin, I get this:
$ make check TESTS="*/*fork*.exp"
...
=== gdb Summary ===
# of expected passes 6
# of untested testcases 1
# of unsupported tests 31
Reviewed-By: Keith Seitz <keiths@redhat.com>
Change-Id: I0c5e8c574d1f61b28d370c22a0b0b6bc3efaf978
Pedro Alves [Fri, 2 Jun 2023 00:05:38 +0000 (01:05 +0100)]
gdb.multi/attach-no-multi-process.exp: Detect no remote non-stop
Running gdb.multi/attach-no-multi-process.exp on Cygwin, where
GDBserver does not support non-stop mode, I see:
FAIL: gdb.multi/attach-no-multi-process.exp: target_non_stop=off: info threads
FAIL: gdb.multi/attach-no-multi-process.exp: target_non_stop=on: attach to the program via remote (timeout)
FAIL: gdb.multi/attach-no-multi-process.exp: target_non_stop=on: info threads (timeout)
Let's ignore the first "info threads" fail. The timeouts look like
this:
builtin_spawn /home/alves/gdb-cache-cygwin/gdb/../gdbserver/gdbserver --once --multi localhost:2346
Listening on port 2346
target extended-remote localhost:2346
Remote debugging using localhost:2346
Non-stop mode requested, but remote does not support non-stop
(gdb) gdb_do_cache: can_spawn_for_attach ( )
builtin_spawn /home/alves/gdb/build-cygwin-testsuite/outputs/gdb.multi/attach-no-multi-process/attach-no-multi-process
attach 14540
FAIL: gdb.multi/attach-no-multi-process.exp: target_non_stop=on: attach to the program via remote (timeout)
info threads
FAIL: gdb.multi/attach-no-multi-process.exp: target_non_stop=on: info threads (timeout)
Note the "Non-stop mode requested, but remote does not support
non-stop" line.
The intro to gdb_target_cmd_ext says:
# gdb_target_cmd_ext
# Send gdb the "target" command. Returns 0 on success, 1 on failure, 2 on
# unsupported.
That's perfect here, we can just use gdb_target_cmd_ext instead of
gdb_target_cmd, and check for 2 (unsupported). That's what this patch
does.
However gdb_target_cmd_ext incorrectly returns 1 instead of 2 for the
case where the remote target says it does not support non-stop. That
is also fixed by this patch.
With this, we no longer get those timeout fails. We get instead:
target extended-remote localhost:2346
Remote debugging using localhost:2346
Non-stop mode requested, but remote does not support non-stop
(gdb) UNSUPPORTED: gdb.multi/attach-no-multi-process.exp: target_non_stop=on: non-stop RSP
Approved-by: Kevin Buettner <kevinb@redhat.com>
Change-Id: I1ab3162f74200c6c02a17a0600b102d2d12db236
Pedro Alves [Wed, 3 Apr 2024 21:34:47 +0000 (22:34 +0100)]
Convert gdb.base/watchpoint-hw-attach.exp to spawn_wait_for_attach
On Cygwin, starting an inferior under GDB, and detaching it, quitting
GDB, and then closing the shell, like so:
(gdb) start
(gdb) detach
(gdb) quit
# close shell
... hangs the parent shell of GDB (not GDB!) until the inferior
process that was detached (as it is still using the same terminal GDB
was using) exits too.
This leads to odd failures in gdb.base/watchpoint-hw-attach.exp like
so:
detach
Detaching from program: .../outputs/gdb.base/watchpoint-hw-attach/watchpoint-hw-attach, process 16580
[Inferior 1 (process 16580) detached]
(gdb) FAIL: gdb.base/watchpoint-hw-attach.exp: detach
Fix this by converting the testcase to spawn the inferior outside GDB,
with spawn_wait_for_attach.
With this patch, the testcase passes cleanly on Cygwin, for me.
Approved-By: Tom Tromey <tom@tromey.com>
Change-Id: I8e3884073a510d6fd2fff611e1d26fc808adc4fa
ld: arm32: fix segfault when linking foreign BFDs [PR32870]
PR ld/32870
The linker may occasionally need to process a BFD that is from a
non-Arm architecture. There will not be any Arm-specific tdata in
that case, so skip such BFDs when looking for iplt information as the
necessary tdata will not be present.
Tom Tromey [Tue, 10 Jun 2025 13:15:10 +0000 (07:15 -0600)]
Fix Solaris build
Commit 58984e4a ("Use gdb::function_view in iterate_over_threads")
broke the Solaris build. This patch attempts to fix it, changing
find_signalled_thread to have the correct signature, and correcting a
couple of problems in sol_thread_target::get_ada_task_ptid.
Jan Beulich [Wed, 11 Jun 2025 12:32:34 +0000 (14:32 +0200)]
ld/PE: special-case relocation types only for COFF inputs
In 72cd2c709779 ("ld/PE: no base relocs for section (relative) ones") I
made a pre-existing problem quite a bit worse: When looking at a
relocation's (numerical) howto->type, that value is meaningful only if
the object was of corresponding COFF type. ELF objects in particular
have their own enumeration. As it stands, specifically the not entirely
unusual R_X86_64_32 and R_X86_64_32S no longer had relocations
emitted for them, due to matching R_AMD64_SECTION and R_AMD64_SECREL in
value respectively.
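The numeric clash, and the kind of guard that avoids it, can be sketched as below. The constant values are the ones defined by the PE/COFF format and the x86-64 ELF psABI. The input_flavour enum and the needs_base_reloc function are hypothetical names for this example and do not reflect how ld is actually structured.

```cpp
#include <cassert>

// COFF (PE) relocation types for AMD64.
constexpr unsigned COFF_R_AMD64_SECTION = 10;
constexpr unsigned COFF_R_AMD64_SECREL = 11;

// ELF relocation types for x86-64.
constexpr unsigned ELF_R_X86_64_32 = 10;
constexpr unsigned ELF_R_X86_64_32S = 11;

// The two enumerations overlap, so a bare number is ambiguous.
static_assert (COFF_R_AMD64_SECTION == ELF_R_X86_64_32, "values collide");
static_assert (COFF_R_AMD64_SECREL == ELF_R_X86_64_32S, "values collide");

// Hypothetical tag for the object format the relocation came from.
enum class input_flavour { coff, elf };

// Decide whether a base relocation is wanted.  The section-relative special
// case applies only when the numeric type is known to be a COFF one.
bool needs_base_reloc (input_flavour flavour, unsigned howto_type)
{
  if (flavour == input_flavour::coff
      && (howto_type == COFF_R_AMD64_SECTION
          || howto_type == COFF_R_AMD64_SECREL))
    return false;
  return true;
}

// Usage example: the same number gives different answers per flavour.
inline void demo_needs_base_reloc ()
{
  assert (!needs_base_reloc (input_flavour::coff, 10));
  assert (needs_base_reloc (input_flavour::elf, 10));
}
```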
Jan Beulich [Wed, 11 Jun 2025 12:32:13 +0000 (14:32 +0200)]
arm: ignore inapplicable .arch=no...
Unlike for command line options, where a base architecture needs to be
provided explicitly, the .arch directive doesn't have such a
requirement. Therefore it is odd that disabling of an inapplicable
extension isn't silently ignored; claiming "not allowed for the current
base architecture" is at best misleading. Alter the error path to emit a
more "soft" diagnostic in that case instead.
Matthieu Longo [Wed, 21 May 2025 10:20:40 +0000 (11:20 +0100)]
AArch64 variant PCS tests: remove RWX permissions on segments
The symbols of variant PCS functions require special handling. The variant PCS
tests check both the relocation information and the markings in the symbol table.
Those tests dump a lot of addresses, so a custom linker script, variant_pcs.ld
was used to reliably control the addresses of the sections.
However, the linker script does not provide enough information to the linker to
assess the right set of permissions on segments (i.e. Read/Write/Execute).
This insufficiency caused the linker to bundle all the sections into the same segment
with the union of all the required permissions, i.e. RWX.
A segment with such lax permissions constitutes a security hole, so the linker
emits the following warning message:
<ELF file> has a LOAD segment with RWX permissions.
This warning message is noisy in the tests, and has no reason to exist.
This issue can be addressed in two ways:
- either by providing the right set of permissions on a section so that the
linker assigns them to a segment with compatible permissions.
- or by providing alignment constraints so that the linker can move the sections
automatically to a new segment and set the right permission for non-executable
data.
The second option seems to be the preferred approach, even if not explicitly
recommended. Examples of linker scripts for AArch64 are available at [1].
This patch reorganizes the linker script to eliminate RWX segments by changing
the order of the sections and their offset. The tests needed to be amended to
match the new addresses.
Matthieu Longo [Wed, 21 May 2025 10:19:48 +0000 (11:19 +0100)]
AArch64 BTI/PAC PLT tests: remove RWX permissions on segments
The bti-far.ld and bti-plt.ld scripts don't provide enough information to the
linker to assess the right set of permissions on segments (i.e. Read/Write/Execute).
This insufficiency caused the linker to bundle all the sections into the same segment
with the union of all the required permissions, i.e. RWX.
A segment with such lax permissions constitutes a security hole, so the linker
emits the following warning message:
<ELF file> has a LOAD segment with RWX permissions.
This warning message is noisy in the tests, and has no reason to exist.
This issue can be addressed in two ways:
- either by providing the right set of permissions on a section so that the
linker assigns them to a segment with compatible permissions.
- or by providing alignment constraints so that the linker can move the sections
automatically to a new segment and set the right permission for non-executable
data.
The second option seems to be the preferred approach, even if not explicitly
recommended. Examples of linker scripts for AArch64 are available at [1].
The fixes in bti-far.ld and bti-plt.ld are the same, except that bti-far.ld also
contains a ".far" section, to make sure that it generates the trampolines correctly.
Matthieu Longo [Wed, 21 May 2025 10:18:48 +0000 (11:18 +0100)]
AArch64 tests: remove RWX permissions on segments
aarch64.ld is the linker script used by most of the relocation tests in the AArch64
testsuite. The script does not provide enough information to the linker to assess
the right set of permissions on segments (i.e. Read/Write/Execute).
This insufficiency caused the linker to bundle all the sections into the same segment
with the union of all the required permissions, i.e. RWX.
A segment with such lax permissions constitutes a security hole, so the linker
emits the following warning message:
<ELF file> has a LOAD segment with RWX permissions.
This warning message is noisy in the tests, and has no reason to exist.
This issue can be addressed in two ways:
- either by providing the right set of permissions on a section so that the
linker assigns them to a segment with compatible permissions.
- or by providing alignment constraints so that the linker can move the sections
automatically to a new segment and set the right permission for non-executable
data.
The second option seems to be the preferred approach, even if not explicitly
recommended. Examples of linker scripts for AArch64 are available at [1].
Alan Modra [Mon, 9 Jun 2025 05:30:30 +0000 (15:00 +0930)]
gas md_apply_fix bad casts
ns32k and z8k cast a valueT pointer to a long pointer when loading
md_apply_fix's value. That's quite wrong if the types have different
sizes, as they may eg. on a 32-bit host with 64-bit bfd support.
sparc also loads the value via a cast pointer, but at least in that
case the cast is to the same size pointer. None of these casts are
needed. Get rid of them.
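A minimal sketch of the difference, assuming a 64-bit value type and a 32-bit host long: the type aliases and the function name load_fix_value are stand-ins chosen for this example, not the gas definitions.

```cpp
#include <cassert>
#include <cstdint>

using valueT = std::uint64_t;    // Stand-in for a 64-bit valueT.
using host_long = std::int32_t;  // Stand-in for 'long' on a 32-bit host.

// Convert the value, not the pointer: read the whole valueT object first,
// then narrow the result.  Dereferencing '(host_long *) valP' instead would
// read only the first four bytes of the object, which is the wrong half of
// the value on a big-endian host.
host_long load_fix_value (const valueT *valP)
{
  return static_cast<host_long> (*valP);
}

// Usage example: a small negative fix-up value survives the narrowing.
inline void demo_load_fix_value ()
{
  valueT v = static_cast<valueT> (-5);
  assert (load_fix_value (&v) == -5);
}
```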
Alan Modra [Tue, 10 Jun 2025 10:59:33 +0000 (20:29 +0930)]
loongarch gcc-4.5 build fixes
Yet another case of missing fields in struct initialisation, which
I've replaced with a memset, and some complaints about identifiers
shadowing global declarations. Fixing the shadowing in
loongarch-parse.y is easy. This one isn't so easy:
gas/expr.c: In function 'expr':
gas/expr.c:1891:12: error: declaration of 'is_unsigned' shadows a global declaration
include/opcode/loongarch.h:224:14: error: shadowed declaration is here
opcode/loongarch.h declares lots of stuff that shouldn't be made
available to generic gas code, so I've removed that header from
tc-loongarch.h and moved the parts of TC_FORCE_RELOCATION_SUB_LOCAL
and TC_FORCE_RELOCATION_SUB_SAME that need LARCH_opts to functions
in tc-loongarch.c.
* config/loongarch-parse.y (loongarch_parse_expr): Rename
param to avoid shadowing.
* config/tc-loongarch.c (loongarch_assemble_INSNs): Use memset
rather than struct initialisation.
(loongarch_force_relocation_sub_local): New function.
(loongarch_force_relocation_sub_same): Likewise.
* config/tc-loongarch.h: Don't include opcode/loongarch.h.
(loongarch_force_relocation_sub_local): Declare, and..
(TC_FORCE_RELOCATION_SUB_LOCAL): ..use here.
(loongarch_force_relocation_sub_same): Declare, and..
(TC_FORCE_RELOCATION_SUB_SAME): ..use here.
Alan Modra [Tue, 10 Jun 2025 10:58:42 +0000 (20:28 +0930)]
kvx gcc-4.5 build fixes
More missing struct initialisers, for expressionS vars that in this
case don't need to be initialised. Also an error: redefinition of
typedef 'symbolS'. OK, so don't use a typedef.
Alan Modra [Tue, 10 Jun 2025 09:03:04 +0000 (18:33 +0930)]
csky gcc-4.5 build fix
gcc-4.5 warns about missing csky_cpus struct initialisers. Fix that
by providing everything in the init macros and the zero sentinel,
rather than just a single {0} as allowed by C99.
Alan Modra [Mon, 9 Jun 2025 11:04:02 +0000 (20:34 +0930)]
gas: xtensa build failure with --enable-64-bit-bfd
A 32-bit host with --enable-64-bit-bfd --target=xtensa-lx106-elf gives:
gas/config/tc-xtensa.c: In function ‘xg_get_best_chain_entry’:
gas/config/tc-xtensa.c:7689:11: error: absolute value function ‘labs’ given an argument of type ‘offsetT’ {aka ‘long long int’} but has parameter of type ‘long int’ which may cause truncation of value [-Werror=absolute-value]
7689 | if (labs (off) >= J_RANGE - J_MARGIN)
| ^~~~
Let's not use labs. Unlike labs vma_abs deliberately returns an
unsigned value, and does the negation in an unsigned type so that
signed overflow can't happen.
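The technique described above can be sketched like this. The function name vma_abs_sketch and the fixed-width type aliases are assumptions for illustration; the actual helper in tc-xtensa.c may be written differently.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

using offsetT = std::int64_t;    // Stand-in for a 64-bit signed offset.
using addressT = std::uint64_t;  // Unsigned type of the same width.

// Absolute value returned as an unsigned quantity.  The negation is done on
// the unsigned copy, where wrap-around is well defined, so even the most
// negative input cannot trigger signed overflow.
addressT vma_abs_sketch (offsetT v)
{
  const addressT u = static_cast<addressT> (v);
  return v < 0 ? -u : u;
}

// Usage example, including the case that a signed abs cannot represent.
inline void demo_vma_abs_sketch ()
{
  assert (vma_abs_sketch (-5) == 5);
  assert (vma_abs_sketch (std::numeric_limits<offsetT>::min ())
          == 0x8000000000000000ULL);
}
```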
* config/tc-xtensa.c (vma_abs): New function.
(xg_get_best_chain_entry, xg_get_fulcrum, xg_find_best_trampoline),
(xg_is_relaxable_fixup): Use in place of labs.
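To illustrate the idea (this is a sketch, not the tc-xtensa.c code), an
absolute-value helper can convert to an unsigned type first and negate
there. The name abs_unsigned and the 64-bit aliases offset_t and value_t
are assumptions standing in for vma_abs and the gas offset/value types.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the signed offset and unsigned value types
// on a build where offsets are 64 bits wide.
using offset_t = std::int64_t;
using value_t = std::uint64_t;

// Return |off| as an unsigned value.  The subtraction is done after the
// conversion to unsigned, so it is well defined even for INT64_MIN,
// where negating in the signed type would overflow.
static inline value_t
abs_unsigned (offset_t off)
{
  value_t u = static_cast<value_t> (off);
  return off < 0 ? value_t{0} - u : u;
}
```

A comparison such as "abs_unsigned (off) >= J_RANGE - J_MARGIN" then
works for the whole range of the offset type.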
Matthieu Longo [Wed, 21 May 2025 10:13:33 +0000 (11:13 +0100)]
AArch64, Arm and TIC6x tests: fix typo in linker scripts
The linker scripts for AArch64 and TIC6x were probably originally copied from
the Arm testsuite, and contain the same typo in the name of the attributes section.
This patch fixes the typo across all the testsuites.
Simon Marchi [Tue, 10 Jun 2025 03:07:04 +0000 (23:07 -0400)]
gdb/dwarf2: remove erroneous comment in open_and_init_dwo_file
When writing commit 28f15782adab ("gdb/dwarf: read multiple .debug_info.dwo
sections"), I initially thought that the gcc behavior of producing multiple
.debug_info.dwo sections was a bug (it is not). I updated the commit
message, but it looks like this comment stayed. Remove it, since it can
be misleading.
Change-Id: I027712d44b778e836f41afbfafab993da02726ef
Approved-By: Tom Tromey <tom@tromey.com>
Simon Marchi [Thu, 5 Jun 2025 19:18:43 +0000 (15:18 -0400)]
gdb/solib-svr4: remove svr4_have_link_map_offsets
While C++ifying the solib code, I concluded that all arches that use
SVR4 libraries do provide link map offsets, so I think this function is
unnecessary now.
Change-Id: Ifaae2560d92f658df3724def6219e2f89054e4b7
Approved-By: Tom Tromey <tom@tromey.com>
Here, the breakpoint only got one location because both the in-charge
and the not-in-charge dtors are identical and got the same address:
$ nm -A ./testsuite/outputs/gdb.cp/cpexprs/cpexprs| c++filt |grep "~base"
./testsuite/outputs/gdb.cp/cpexprs/cpexprs:0000000000001d84 W base::~base()
./testsuite/outputs/gdb.cp/cpexprs/cpexprs:0000000000001d84 W base::~base()
While on Cygwin, we get two locations for the same breakpoint, which
the testcase isn't expecting:
Thread 1 "cpexprs" hit Breakpoint 117.1, base::~base (this=0x7ffffcaf8, __in_chrg=<optimized out>) at .../src/gdb/testsuite/gdb.cp/cpexprs.cc:135
135 ~base (void) { } // base::~base
(gdb) FAIL: gdb.cp/cpexprs.exp: continue to base::~base
We got two locations because the in-charge and the not-in-charge dtors
have different addresses:
$ nm -A outputs/gdb.cp/cpexprs/cpexprs.exe | c++filt | grep "~base"
outputs/gdb.cp/cpexprs/cpexprs.exe:0000000100402680 T base::~base()
outputs/gdb.cp/cpexprs/cpexprs.exe:0000000100402690 T base::~base()
On Cygwin, we also see the typical failure due to not expecting the
inferior to be multi-threaded:
(gdb) continue
Continuing.
[New Thread 628.0xe08]
Thread 1 "cpexprs" hit Breakpoint 200, test_function (argc=1, argv=0x7ffffcc20) at .../src/gdb/testsuite/gdb.cp/cpexprs.cc:336
336 derived d;
(gdb) FAIL: gdb.cp/cpexprs.exp: continue to test_function for policyd3::~policyd
Both issues are fixed by this patch, and now the testcase passes
cleanly on Cygwin, for me.
Reviewed-By: Keith Seitz <keiths@redhat.com>
Change-Id: If7eb95d595f083f36dfebf9045c0fc40ef5c5df1
Pedro Alves [Fri, 23 Jun 2023 20:01:39 +0000 (21:01 +0100)]
gdb.threads/thread-execl, don't re-exec forever
I noticed on Cygwin, gdb.threads/thread-execl.exp would hang (not that
surprising since we can't follow-exec on Cygwin). Looking at the
process list running on the machine, we end up with a thread-execl.exe
process constantly respawning another process [1].
We see the same constant-reexec if we launch gdb.threads/thread-execl
manually on the shell:
Pedro Alves [Mon, 26 Jun 2023 12:56:26 +0000 (13:56 +0100)]
Support core dumping testcases with Cygwin's dumper
Cygwin supports dumping ELF cores via a dumper.exe utility, see
https://www.cygwin.com/cygwin-ug-net/dumper.html.
When I run a testcase that has the "kernel" generate a corefile, like
gdb.base/corefile.exp, Cygwin invokes dumper.exe correctly and
generates an ELF core file, however, the testsuite doesn't find the
generated core:
foreach i "${coredir}/core ${coredir}/core.coremaker.c ${binfile}.core" {
Note that that isn't looking for "${binfile}.core" inside
${coredir}... That is fixed in this patch.
However, that still isn't sufficient for Cygwin + dumper, as in that
case the core is going to be called foo.exe.core, not foo.core. Fix
that by looking for foo.exe.core in the core dir as well.
With this, gdb.base/corefile.exp and other tests that use core_find
now run. They don't pass cleanly, but at least now they're exercised.
Approved-By: Tom Tromey <tom@tromey.com>
Change-Id: Ic807dd2d7f22c5df291360a18c1d4fbbbb9b993e
Pedro Alves [Mon, 26 Jun 2023 20:03:32 +0000 (21:03 +0100)]
Adjust gdb.base/sigall.exp for Cygwin
The gdb.base/sigall.exp testcase has many FAILs on Cygwin currently.
From:
Thread 1 "sigall" received signal SIGPWR, Power fail/restart.
0x00007ffeac9ed134 in ntdll!ZwWaitForSingleObject () from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
(gdb) FAIL: gdb.base/sigall.exp: get signal LOST
we see two issues. The test is expecting "Program received ..." which
only appears if the inferior is single-threaded. All Cygwin inferiors
are multi-threaded, because both Windows and the Cygwin runtime spawn
a few helper threads.
And then, SIGLOST is the same as SIGPWR on Cygwin. The testcase
already knows to treat them the same on SPARC64 GNU/Linux. We just
need to extend the relevant code to treat Cygwin the same.
With this, the test passes cleanly on Cygwin.
Approved-By: Tom Tromey <tom@tromey.com>
Change-Id: Ie3553d043f4aeafafc011347b6cb61ed58501667
Pedro Alves [Tue, 5 Sep 2023 12:38:14 +0000 (13:38 +0100)]
Adjust gdb.base/bp-permanent.exp for Cygwin
On Cygwin, all inferiors are multi-threaded, because both Windows and
the Cygwin runtime spawn a few helper threads. Adjust the
gdb.base/bp-permanent.exp testcase to work with either single- or
multi-threaded inferiors.
Approved-by: Kevin Buettner <kevinb@redhat.com>
Change-Id: I28935b34fc9f739c2a5490e83aa4995d29927be2
The difference is the "Thread 1" part in the beginning of the quoted
output. It appears on Cygwin, but not on Linux. That's because on
Cygwin, all inferiors are multi-threaded, because both Windows and the
Cygwin runtime spawn a few helper threads.
Fix this by adjusting the gdb.base/bp-cond-failure.exp testcase to
work with either single- or multi-threaded inferiors.
The testcase passes cleanly for me after this.
Approved-by: Kevin Buettner <kevinb@redhat.com>
Change-Id: I5ff11d06ac1748d044cef025f1e78b8f84ad3349
aarch64: Increase the number of feature words to 3
Now that most of the effort of updating the number of feature words is
handled by macros, add an additional one, taking the number of
supported features to 192.
Richard Earnshaw [Thu, 22 May 2025 15:18:11 +0000 (16:18 +0100)]
aarch64: use macro trickery to automate feature array size replication
There are quite a few macros that need to be changed when we need to
increase the number of words in the features data structure. With
some macro trickery we can automate most of this so that a single
macro needs to be updated.
With C2X we could probably do even better by using recursion, but this
is still a much better situation than we had previously.
A static assertion is used to ensure that there is always enough space
in the flags macro for the number of feature bits we need to support.
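As a rough sketch of the "single knob plus static assertion" idea
(not the aarch64 opcode headers themselves), the word count can size
the flags array and be checked at compile time. FEATURE_WORDS,
feature_count, feature_set and the two helpers are all hypothetical
names, and the feature count of 130 is made up.

```cpp
#include <cassert>
#include <cstdint>

// The one knob to bump when more feature bits are needed.
// All names in this sketch are hypothetical.
#define FEATURE_WORDS 3

// Made-up number of features, for illustration only.
constexpr unsigned feature_count = 130;

struct feature_set
{
  std::uint64_t flags[FEATURE_WORDS];
};

// Fails the build if the flags array cannot hold every feature bit.
static_assert (feature_count <= 64 * FEATURE_WORDS,
               "FEATURE_WORDS is too small for feature_count");

static inline void
set_feature (feature_set &set, unsigned bit)
{
  set.flags[bit / 64] |= std::uint64_t{1} << (bit % 64);
}

static inline bool
has_feature (const feature_set &set, unsigned bit)
{
  return (set.flags[bit / 64] >> (bit % 64)) & 1;
}
```

With three 64-bit words the array holds 192 bits, so raising
feature_count past 192 would trip the assertion until FEATURE_WORDS is
increased.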
Alan Modra [Mon, 9 Jun 2025 03:21:01 +0000 (12:51 +0930)]
dwarf2dbg.c line_entry.next assert
I was puzzling over how it was correct to cast what is clearly a
struct line_entry** pointer to a struct line_entry* pointer for a
few moments, and was going to write a comment but then decided we
really don't require the "next" pointer to be where it is. Replace
the assert with an inline function that does any necessary pointer
adjustments.
* dwarf2dbg.c (line_entry.next): Delete static assertion.
(line_entry_at_tail): New inline function.
(dwarf2_gen_line_info_1, dwarf2_finish): Replace casts in
set_or_check_view arguments with line_entry_at_tail.
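A minimal sketch of the pointer adjustment, assuming a simplified list
node: the struct entry and the function entry_at_tail below are
hypothetical and only mirror the shape of line_entry and
line_entry_at_tail.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical list node; the real struct line_entry has other fields.
struct entry
{
  int line;
  entry *next;
};

// Given the address of some entry's "next" member (as a list tail
// pointer would hold), recover the entry that contains it.  offsetof
// supplies the adjustment, so "next" need not be the first member.
static inline entry *
entry_at_tail (entry **pnext)
{
  char *p = reinterpret_cast<char *> (pnext);
  return reinterpret_cast<entry *> (p - offsetof (entry, next));
}
```

This only makes sense when the tail pointer really points at the next
member of an entry, not at a standalone list head.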
Alan Modra [Mon, 9 Jun 2025 03:16:23 +0000 (12:46 +0930)]
str_hash_find casts
Putting an explicit cast on the void* return from str_hash_find isn't
necessary and doesn't add much to code clarity. In other cases, poor
choice of function parameter types, eg. "void *value" in
tc-aarch64.c checked_hash_insert rather than "const void *value" leads
to needing (void *) casts all over the place just to cast away const.
Fix that by correcting the parameter type. (And it really is a const,
the function and str_hash_insert don't modify the strings.)
This patch also removes some unnecessary casts in hash.c
Alan Modra [Mon, 9 Jun 2025 02:36:00 +0000 (12:06 +0930)]
str_hash_find_int
This changes the internal representation of string_tuple.value from
a void* to an intptr_t, removing any concerns that code wanting to
store an integer value will use values that are trap encodings or
suchlike for a pointer. The ISO C standard says any void* can be
converted to intptr_t and back again and will compare equal to the
original pointer. It does *not* say any intptr_t can be converted to
void* and back again to get the original integer.
Two new functions, str_hash_find_int and str_hash_insert_int are
provided for handling integer values. str_hash_find_int returns
(intptr_t) -1 on failing to find the key string.
Most target code needs minimal changes to use the new interface, but
some simplification is possible since now a zero can be stored and
differentiated from the NULL "can't find" return. (Yes, that means
(intptr_t) -1 can't be stored.)
I've changed the avr_no_sreg_hash dummy value to zero, and the
loongarch register numbers don't need to be incremented. loongarch
also doesn't need to store an empty key string (if it ever did).
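The interface described above can be pictured with a toy table. This
is a sketch built on std::unordered_map, not the gas hash table; the
type int_table and its member names are assumptions made for the
example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Toy string-to-integer table, for illustration only.
struct int_table
{
  std::unordered_map<std::string, std::intptr_t> map;

  void
  insert_int (const std::string &key, std::intptr_t value)
  {
    // (intptr_t) -1 is reserved as the "not found" result.
    assert (value != -1);
    map[key] = value;
  }

  std::intptr_t
  find_int (const std::string &key) const
  {
    auto it = map.find (key);
    return it == map.end () ? static_cast<std::intptr_t> (-1) : it->second;
  }
};
```

Because the failure value is -1 rather than a null pointer, a register
numbered zero can be stored directly and still be told apart from a
missing key.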
Alan Modra [Sat, 7 Jun 2025 11:58:41 +0000 (21:28 +0930)]
metag build error
gas/config/tc-metag.c: In function ‘parse_dsp_addr’:
gas/config/tc-metag.c:4386:29: error: ‘regs[0]’ may be used uninitialized [-Werror=maybe-uninitialized]
4386 | if (!is_addr_unit (regs[0]->unit) &&
| ~~~~~~~^~~~~~
It looks like regs_read can be zero with "l" non-NULL, so this gcc
complaint is accurate.
Tom de Vries [Sat, 7 Jun 2025 21:28:53 +0000 (23:28 +0200)]
[gdb/build] Fix buildbreaker in hardwire_setbaudrate
When building on x86_64-cygwin, I run into:
...
In file included from gdbsupport/common-defs.h:203,
from gdb/defs.h:26,
from <command-line>:
gdb/ser-unix.c: In function ‘void hardwire_setbaudrate(serial*, int)’:
gdbsupport/gdb_locale.h:28:20: error: expected ‘)’ before ‘gettext’
28 | # define _(String) gettext (String)
| ^~~~~~~
gdbsupport/gdb_assert.h:43:43: note: in expansion of macro ‘_’
43 | internal_error_loc (__FILE__, __LINE__, _("%s: " message), __func__, \
| ^
gdb/ser-unix.c:590:7: note: in expansion of macro ‘gdb_assert_not_reached’
590 | gdb_assert_not_reached (_("Serial baud rate was not found in B_codes"));
| ^~~~~~~~~~~~~~~~~~~~~~
gdb/ser-unix.c:590:31: note: in expansion of macro ‘_’
590 | gdb_assert_not_reached (_("Serial baud rate was not found in B_codes"));
| ^
gdbsupport/gdb_locale.h:28:28: note: to match this ‘(’
28 | # define _(String) gettext (String)
| ^
gdbsupport/gdb_assert.h:43:43: note: in expansion of macro ‘_’
43 | internal_error_loc (__FILE__, __LINE__, _("%s: " message), __func__, \
| ^
gdb/ser-unix.c:590:7: note: in expansion of macro ‘gdb_assert_not_reached’
590 | gdb_assert_not_reached (_("Serial baud rate was not found in B_codes"));
| ^~~~~~~~~~~~~~~~~~~~~~
...
Fix this by dropping the unneeded _() on the gdb_assert_not_reached argument.
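The error comes from string literal concatenation inside the macro.
A reduced sketch, using hypothetical stand-ins (TR for _(), identity
for gettext, FORMAT_FOR for the part of gdb_assert_not_reached that
builds the format string):

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-ins, not the gdbsupport definitions.
static const char *identity (const char *s) { return s; }
#define TR(String) identity (String)
#define FORMAT_FOR(message) TR ("%s: " message)

// Fine: the two adjacent string literals are concatenated before the
// call to identity is made.
static const char *ok_format
  = FORMAT_FOR ("Serial baud rate was not found in B_codes");

// Would not compile: the argument is itself a function call, leaving
//   identity ("%s: " identity ("..."))
// which is the "expected ')'" error shown above.
// static const char *bad_format = FORMAT_FOR (TR ("..."));
```

So the message argument has to stay a plain string literal, and the
macro takes care of the translation wrapper.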
Tom de Vries [Sat, 7 Jun 2025 11:59:52 +0000 (13:59 +0200)]
[gdb/testsuite] Fix gdb.ada/dyn-bit-offset.exp on s390x
On s390x-linux, with test-case gdb.ada/dyn-bit-offset.exp and gcc 7.5.0 I get:
...
(gdb) print spr^M
$1 = (discr => 3, array_field => (-5, -6, -7), field => -6, another_field => -6)^M
(gdb) FAIL: $exp: print spr
print spr.field^M
$2 = -6^M
(gdb) FAIL: $exp: print spr.field
...
On x86_64-linux, with the same compiler version I get:
...
(gdb) print spr^M
$1 = (discr => 3, array_field => (-5, -6, -7), field => -4, another_field => -4)^M
(gdb) XFAIL: $exp: print spr
print spr.field^M
$2 = -4^M
(gdb) PASS: $exp: print spr.field
...
In both cases, we're hitting the same compiler problem, but it manifests
differently on little and big endian.
Make sure the values seen for both little and big endian trigger xfails
for both tests.
Printing spr.field gives the expected value -4 for x86_64, but that's an
accident. Change the actual spr.field value to -5, to make sure
that we get the same number of xfails on x86_64 and s390x.
Finally, make the xfails conditional on the compiler version.
Tested using gcc 7.5.0 on both x86_64-linux and s390x-linux.
Approved-By: Andrew Burgess <aburgess@redhat.com>
PR testsuite/33042
https://sourceware.org/bugzilla/show_bug.cgi?id=33042
Georg-Johann Lay [Thu, 15 May 2025 08:29:25 +0000 (10:29 +0200)]
AVR: ld/32968 - Assert that .progmem data resides in the lower 64 KiB.
This patch locates the linker stubs / trampolines *after* all the .progmem
sections. This is the natural placement since progmem data has to reside
in the lower 64 KiB (it is accessed using LPM), whereas the linker stubs
are only required to be located in the lower 128 KiB of program memory.
(They must be in the range of EICALL / EIJMP with EIND = 0.)
The current location of the linker stubs was motivated by an invalid test
case from PR13812 that allocates more than 64 KiB of progmem data.
The patch adds an assertion that makes sure that no progmem data is
allocated past 0xffff.
Data that is accessed using ELPM should be located to .progmemx so that
no .progmem addresses are wasted. .progmemx was introduced in 2017 and
is used by __memx, __flashx and by the current AVR-LibC.
(The compiler uses .jumptables.gcc for its jump dispatch tables since
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63223 / GCC v4.9.2).
PR ld/32968
ld/
* scripttempl/avr.sc: Move the trampolines section after the
.progmem sections. Assert that .progmem is in the lower 64 KiB.
Andrew Burgess [Thu, 5 Jun 2025 13:50:41 +0000 (14:50 +0100)]
gdb/guile: fix memory leak in gdbscm_parse_command_name
For reference see the previous commit.
Fix a memory leak in gdbscm_parse_command_name when a guile exception
is thrown. To reveal the memory leak I placed the following content
into a file 'leak.scm':
(gdb) source leak.scm
ERROR: In procedure register-command!:
In procedure gdbscm_register_command_x: Out of range: 'break' is not a prefix command in position 1: "break cmd"
Error while executing Scheme code.
Running this under valgrind reveals a memory leak for 'result' and
'prefix_text' from gdbscm_parse_command_name.
Another leak can be revealed with this input script:
This one occurs earlier in gdbscm_parse_command_name, and now only
'result' leaks.
The problem is that, when guile throws an exception then a longjmp is
performed from the function that raises the exception back to the guile
run-time. A consequence of this is that no function local destructors
will be run.
In gdbscm_parse_command_name, this means that the two function locals
`result` and `prefix_text` will not have their destructors run, and
any memory managed by these objects will be leaked.
Fix this by assigning nullptr to these two function locals before
throwing an exception. This will cause the managed memory to be
deallocated.
I could have implemented a fix that made use of Guile's dynwind
mechanism to register a cleanup callback, however, this felt like
overkill. At the point the exception is being thrown we know that we
no longer need the managed memory, so we might as well just free the
memory at that point.
With this fix in place, the two leaks are now fixed in the valgrind
output.
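A minimal sketch of the "assign nullptr before raising" pattern. It
deliberately leaves out the longjmp itself and only shows that the
assignment frees the memory immediately. The counting deleter, the
unique_cstr alias and frees_before_raise are hypothetical names, and
the 16-byte buffers are placeholders for the real strings.

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>

static int frees = 0;   // counts deallocations, for illustration only

struct counting_free
{
  void operator() (char *p) const
  {
    std::free (p);
    ++frees;
  }
};

using unique_cstr = std::unique_ptr<char, counting_free>;

// Returns the number of frees observed at the point where the real
// code would raise the Guile exception.
static int
frees_before_raise ()
{
  unique_cstr result (static_cast<char *> (std::malloc (16)));
  unique_cstr prefix_text (static_cast<char *> (std::malloc (16)));

  // Release both buffers now; a longjmp out of this frame would skip
  // the destructors that normally do this.
  result = nullptr;
  prefix_text = nullptr;

  return frees;   // stand-in for the non-returning error call
}
```

Both buffers are already gone by the time the error call would be
made, so nothing is left for the skipped destructors to do.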
Andrew Burgess [Wed, 4 Jun 2025 18:54:01 +0000 (19:54 +0100)]
gdb/python/guile: remove some explicit calls to xmalloc
In gdbpy_parse_command_name (python/py-cmd.c) there is a call to
xmalloc that can easily be replaced with a call to
make_unique_xstrndup, which makes the code easier to read (I think).
In gdbscm_parse_command_name (guile/scm-cmd.c) the same fix can be
applied to remove an identical xmalloc call. And there is an
additional xmalloc call, which can also be replaced with
make_unique_xstrndup in the same way.
The second xmalloc call in gdbscm_parse_command_name was also present
in gdbpy_parse_command_name at one point, but was replaced with a use
of std::string by this commit:
I haven't changed the gdbscm_parse_command_name to use std::string
though, as that doesn't work well with the guile exception model.
Guile exceptions work by performing a longjmp from the function that
raises the exception, back to the guile run-time. The consequence of
this is that destructors are not run. For example, if
gdbscm_parse_command_name calls gdbscm_out_of_range_error, then any
function local objects in gdbscm_parse_command_name will not have
their destructors called.
What this means is that, for the existing `result` and `prefix_text`
locals, any allocated memory managed by these objects will be leaked
if an exception is raised. However, fixing this is pretty easy: one
way is to just assign nullptr to these locals before raising the
exception, which causes the allocated memory to be released.
But for std::string it is harder to ensure that the managed memory has
actually been released. We can call std::string::clear() and then
maybe std::string::shrink_to_fit(), but this is still not guaranteed
to release any managed memory. In fact, I believe the only way to
ensure all managed memory is released, is to call the std::string
destructor.
And so, for functions that can throw a guile exception, it is easier
to just avoid std::string.
As for the memory leak that I identify above; I'll fix that in a
follow on commit.
gdb/solib: make _linker_namespace use selected frame
When the convenience variable $_linker_namespace was introduced, I meant
for it to print the namespace of the frame where the user was
stopped. However, due to confusing what "current_frame" and
"selected_frame" meant, it instead printed the namespace of the
lowermost frame.
This commit updates the code to follow my original intent. Since the
variable was never in a GDB release, updating the behavior should not
cause any disruption. It also adds a test to verify the functionality.
Tom Tromey [Tue, 27 May 2025 18:18:30 +0000 (12:18 -0600)]
Fix regression with DW_AT_bit_offset handling
Internal AdaCore testing using -gdwarf-4 found a spot where GCC will
emit a negative DW_AT_bit_offset. However, my recent signed/unsigned
changes assumed that this value had to be positive.
I feel this bug somewhat invalidates my previous thinking about how
DWARF attributes should be handled.
In particular, both GCC and LLVM understand that a negative bit
offset can be generated -- but for positive offsets they might use a
smaller "data" form, which is expected not to be sign-extended. LLVM
has similar code but GCC does:
What this means is that this attribute is "signed but default
unsigned".
To fix this, I've added a new attribute::confused_constant method.
This should be used when a constant value might be signed, but where
narrow forms (e.g., DW_FORM_data1) should *not* cause sign extension.
I examined the GCC and LLVM DWARF writers to come up with the list of
attributes where this applies, namely DW_AT_bit_offset,
DW_AT_const_value and DW_AT_data_member_location (GCC only, but LLVM
always emits it as unsigned, so we're safe here).
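The "signed but default unsigned" rule can be sketched as follows.
This is not the attribute class code: the form enum and
decode_constant are hypothetical, only three forms are modelled, and
the raw value is assumed to have already been read into 64 bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical subset of DWARF forms, for illustration only.
enum class form { data1, data2, sdata };

// Interpret a raw value as a "signed but default unsigned" constant:
// the fixed-size dataN forms are zero-extended, and only the
// explicitly signed form yields a negative number.
static std::int64_t
decode_constant (form f, std::uint64_t raw)
{
  switch (f)
    {
    case form::data1:
      return static_cast<std::int64_t> (raw & 0xff);
    case form::data2:
      return static_cast<std::int64_t> (raw & 0xffff);
    case form::sdata:
      return static_cast<std::int64_t> (raw);
    }
  return 0;
}
```

Under this rule a DW_FORM_data1 byte of 0xff reads as 255, not -1,
while a negative offset emitted in a signed form stays negative.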
This patch corrects the bug and imports the relevant test case.
Regression tested on x86-64 Fedora 41.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32680
Bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118837
Approved-By: Simon Marchi <simon.marchi@efficios.com>
The problem is that catch breakpoints don't set the
bp_location::gdbarch member variable, they have a "dummy" location added
with a call to add_dummy_location (breakpoint.c).
The breakpoint_location_address_str function (which is only used for
breakpoint debug output) relies on bp_location::gdbarch being set in
order to call the paddress function.
I considered trying to ensure that the bp_location::gdbarch variable
is always set to a sane value. For example, in add_dummy_location I
tried copying the gdbarch from the breakpoint object, and this does
work for the catchpoint case, but for some of the watchpoint cases,
even the breakpoint object has no gdbarch value set.
Now this seemed a little suspect, but, the more I thought about it, I
wondered if "fixing" the gdbarch was allowing me to solve the wrong
problem.
If the gdbarch was set, then this would allow us to print the address
field of the bp_location, which is going to be 0, after all, as this
is a dummy location, which has no address.
But does it really make sense to print the address 0? For some
targets, 0 is a valid address. But that wasn't an address we actually
selected, it's just the default value for dummy locations.
And we already have a helper function bl_address_is_meaningful, which
returns false for dummy locations.
So, I propose that in breakpoint_location_address_str, we use
bl_address_is_meaningful to detect dummy locations, and skip the
address printing code in that case.
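In outline, the proposal looks like the sketch below. The location
struct, address_is_meaningful and location_address_str are simplified,
hypothetical stand-ins for bp_location, bl_address_is_meaningful and
breakpoint_location_address_str, and the "<no address>" text is a
placeholder chosen for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical, simplified location; the real bp_location is richer.
struct location
{
  bool is_dummy;
  std::uint64_t address;
};

static bool
address_is_meaningful (const location &loc)
{
  return !loc.is_dummy;
}

static std::string
location_address_str (const location &loc)
{
  // Dummy locations have no real address, so skip the formatting step
  // (which, in gdb, is the part that needs a gdbarch).
  if (!address_is_meaningful (loc))
    return "<no address>";

  char buf[32];
  std::snprintf (buf, sizeof buf, "0x%llx",
                 static_cast<unsigned long long> (loc.address));
  return buf;
}
```

The check comes first, so the formatting code is never reached for a
location whose address is just the default zero.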
For testing, I temporarily changed insert_bp_location so that
breakpoint_location_address_str was always called, even when
breakpoint debugging was off. I then ran the whole testsuite.
Without the fixes included in this commit I saw lots of assertion
failures, but with the fixes from this commit in place, I now see no
assertion failures.
I've added a new test which reveals the original assertion failure.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
commit 4b42385c470c5f72f158f382f4d9c36f927aa84f
Author: Guinevere Larsen <guinevere@redhat.com>
Date: Wed Feb 12 08:25:46 2025 -0300
gdb: Make dwarf support optional at compile time
Introduced a change that made the configure script not POSIX compliant,
by using fallthrough in some case statements. This commit reworks that
part of the change to only use if statements, so that no code is
duplicated but things remain POSIX compliant.
Reviewed-by: Sam James <sam@gentoo.org>
Approved-By: Tom Tromey <tom@tromey.com>
Pedro Alves [Mon, 10 Mar 2025 20:08:54 +0000 (20:08 +0000)]
Make default_gdb_exit resilient to failed closes
For some reason, when testing GDB on Cygwin, I get:
child process exited abnormally
while executing
"exec sh -c "exec > /dev/null 2>&1 && (kill -2 -$spid || kill -2 $spid)""
(procedure "close_wait_program" line 20)
invoked from within
"close_wait_program $shell_id $pid"
(procedure "standard_close" line 23)
invoked from within
"standard_close "Windows-ROCm""
("eval" body line 1)
invoked from within
"eval ${try}_${proc} \"$dest\" $args"
(procedure "call_remote" line 42)
invoked from within
"call_remote "" close $host"
(procedure "remote_close" line 3)
invoked from within
"remote_close host"
(procedure "log_and_exit" line 30)
invoked from within
"log_and_exit"
When that happens from within clean_restart, clean_restart doesn't
clear the gdb_spawn_id variable, and then when clean_restart starts up
a new GDB, that sees that gdb_spawn_id is already set, so it doesn't
actually spawn a new GDB, and so clean_restart happens to reuse the
same GDB (!). Many tests happen to actually work OK with this, but
some don't, and the failure modes can be head-scratching.
Of course, the failure to close GDB should be fixed, but when it
happens, I think it's good to not end up with the current weird state.
Connecting the "child process exited abnormally" errors at the end of a
testcase run with weird FAILs in other testcases took me a while (as
in, weeks!), it wasn't obvious to me immediately.
Thus, this patch makes default_gdb_exit more resilient to failed
closes, so that gdb_spawn_id is unset even if closing GDB fails, and
we move on to start a new GDB.
Approved-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: I9ec95aa61872a40095775534743525e0ad2097d2
Pedro Alves [Thu, 5 Jun 2025 17:09:44 +0000 (18:09 +0100)]
gdb_test_multiple: Anchor prompt match if -lbl
The testcase added by this patch has a gdb_test_multiple call that
wants to match different lines of output that all have a common
prefix, and do different actions on each. Instead of a single regular
expression with alternatives, it's clearer code if the different
expressions are handled with different "-re", like so:
gdb_test_multiple "command" "" -lbl {
-re "^command(?=\r\n)" {
exp_continue
}
-re "^\r\nprefix foo(?=\r\n)" {
# Some action for "foo".
exp_continue
}
-re "^\r\nprefix bar(?=\r\n)" {
# Some action for "bar".
exp_continue
}
-re "^\r\nprefix \[^\r\n\]*(?=\r\n)" {
# Some action for all others.
exp_continue
}
-re "^\r\n$::gdb_prompt $" {
gdb_assert {$all_prefixes_were_seen} $gdb_test_name
}
}
Above, the leading anchors in the "^\r\nprefix..." matches are needed
to avoid too-eager matching due to the common prefix. Without the
anchors, if the expect output buffer happens to contain at least:
"\r\nprefix xxx\r\nprefix foo\r\n"
... then the "prefix foo" pattern match inadvertently consumes the
first "prefix xxx" line.
... then the prompt regexp matches this, consuming the "prefix" line
inadvertently, and we get a FAIL. The built-in regexp matcher for
-lbl doesn't get a chance to match the
"\r\nmeant-to-be-matched-by-lbl\r\n" part, because the built-in prompt
match appears first within gdb_test_multiple.
By adding the anchor to the prompt regexp, we avoid that problem.
However, the same expect output buffer contents will still match the
built-in prompt regexp. That is what is fixed by this patch. It
makes it so that if -lbl is specified, the built-in prompt regexp has
a leading anchor.
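The effect of the leading anchor can be reproduced outside of expect.
The sketch below uses C++ std::regex rather than Tcl regular
expressions, so it is only an analogy: match_offset is a hypothetical
helper, and std::regex_constants::match_continuous plays the role of
the leading "^".

```cpp
#include <cassert>
#include <regex>
#include <string>

// Return the offset at which the "prefix foo" pattern matches in
// BUFFER, or -1 if it does not match.  When ANCHORED is true the match
// must start at the very beginning of the buffer.
static long
match_offset (const std::string &buffer, bool anchored)
{
  static const std::regex re ("\r\nprefix foo(?=\r\n)");
  std::smatch m;
  auto flags = anchored ? std::regex_constants::match_continuous
                        : std::regex_constants::match_default;
  if (!std::regex_search (buffer, m, re, flags))
    return -1;
  return static_cast<long> (m.position (0));
}
```

For the buffer "\r\nprefix xxx\r\nprefix foo\r\n", the unanchored
search matches past the "prefix xxx" line, which expect would then
discard along with the match. The anchored search refuses to match
until that earlier line has been consumed by its own pattern.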
Original idea for turning this into a gdb.testsuite/ testcase by Tom
de Vries <tdevries@suse.de>.
Approved-By: Tom de Vries <tdevries@suse.de>
Change-Id: Ic2571ec793d856a89ee0d533ec363e2ac6036ea2