Pedro Alves [Tue, 21 Jun 2022 17:05:19 +0000 (18:05 +0100)]
Cancel execution command on thread exit, when stepping, nexting, etc.
If your target has no support for TARGET_WAITKIND_NO_RESUMED events
(and no way to support them, such as the yet-unsubmitted AMDGPU
target), and you step over thread exit with scheduler-locking on, this
is what you get:
(gdb) n
[Thread ... exited]
*hang*
Getting back the prompt by typing Ctrl-C may not even work, since no
inferior thread is running to receive the SIGINT. Even if it works,
it seems unnecessarily harsh. If you started an execution command for
which there's a clear thread of interest (step, next, until, etc.),
and that thread disappears, then I think it's more user friendly if
GDB just detects the situation and aborts the command, giving back the
prompt.
That is what this commit implements. It does this by explicitly
requesting the target to report thread exit events whenever the main
resumed thread has a thread_fsm. Note that unlike stepping over a
breakpoint, we don't need to enable clone events in this case.
Pedro Alves [Tue, 21 Jun 2022 18:30:48 +0000 (19:30 +0100)]
Centralize "[Thread ...exited]" notifications
Currently, each target backend is responsible for printing "[Thread
...exited]" before deleting a thread. This leads to unnecessary
differences between targets, like e.g. with the remote target, we
never print such messages, even though we do print "[New Thread ...]".
E.g., debugging the gdb.threads/attach-many-short-lived-threads.exp
with gdbserver, letting it run for a bit, and then pressing Ctrl-C, we
currently see:
Thread 1 "attach-many-sho" received signal SIGINT, Interrupt.
0x00007ffff7e6a23f in __GI___clock_nanosleep (clock_id=clock_id@entry=0, flags=flags@entry=0, req=req@entry=0x7fffffffda80, rem=rem@entry=0x7fffffffda80)
at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:78
78 in ../sysdeps/unix/sysv/linux/clock_nanosleep.c
(gdb)
Above, we only see "New Thread" notifications, even though threads
were deleted.
Thread 1 "attach-many-sho" received signal SIGINT, Interrupt.
0x00007ffff7e6a23f in __GI___clock_nanosleep (clock_id=clock_id@entry=0, flags=flags@entry=0, req=req@entry=0x7fffffffda80, rem=rem@entry=0x7fffffffda80)
at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:78
78 in ../sysdeps/unix/sysv/linux/clock_nanosleep.c
(gdb) q
This commit fixes this by moving the thread exit printing to common
code instead, triggered from within delete_thread (or rather,
set_thread_exited).
There's one wrinkle, though. While most targest want to print:
[Thread ... exited]
the Windows target wants to print:
[Thread ... exited with code <exit_code>]
... and sometimes wants to suppress the notification for the main
thread. To address that, this commits adds a delete_thread_with_code
function, only used by that target (so far).
Pedro Alves [Thu, 30 Jun 2022 12:04:07 +0000 (13:04 +0100)]
inferior::clear_thread_list always silent
Now that the MI mi_thread_exit observer ignores "silent", there's no
difference between inferior::clean_thread_list with silent true or
false. And for the CLI, clean_thread_list() never printed anything
either. So we can eliminate the 'silent' parameter.
Pedro Alves [Mon, 13 Jun 2022 18:26:59 +0000 (19:26 +0100)]
Document remote clone events, and QThreadOptions packet
This commit documents in both manual and NEWS:
- the new remote clone event stop reply,
- the new QThreadOptions packet and its current defined options,
- the associated "set/show remote thread-events-packet" command,
- and the associated QThreadOptions qSupported feature.
Simon Marchi [Fri, 5 Feb 2021 21:42:32 +0000 (16:42 -0500)]
Testcases for stepping over thread exit syscall (PR gdb/27338)
- New in this series version:
- gdb.threads/step-over-thread-exit.exp has a couple new testing axes:
- non-stop vs all-stop.
- in non-stop have the testcase explicitly stop all threads in
non-stop mode, vs not doing that.
- bail out of gdb.threads/step-over-thread-exit.exp early on FAIL, to
avoid cascading timeouts.
Add new gdb.threads/step-over-thread-exit.exp and
gdb.threads/step-over-thread-exit-while-stop-all-threads.exp
testcases, exercising stepping over thread exit syscall. These make
use of lib/my-syscalls.S to define the exit syscall.
Co-authored-by: Pedro Alves <pedro@palves.net>
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27338
Change-Id: Ie8b2c5747db99b7023463a897a8390d9e814a9c9
Pedro Alves [Fri, 2 Jul 2021 10:46:40 +0000 (11:46 +0100)]
gdb/testsuite/lib/my-syscalls.S: Refactor new SYSCALL macro
Refactor the syscall assembly code in gdb/testsuite/lib/my-syscalls.S
behind a SYSCALL macro so that it's easy to add new syscalls without
duplicating code.
Note that the way the macro is implemented, it only works correctly
for syscalls with up to 3 arguments, and, if the syscall doesn't
return (the macro doesn't bother to save/restore callee-saved
registers).
The following patch will want to use the macro to define a wrapper for
the "exit" syscall, so the limitations continue to be sufficient.
Pedro Alves [Tue, 7 Jun 2022 13:20:42 +0000 (14:20 +0100)]
Ignore failure to read PC when resuming
If GDB sets a GDB_TO_EXIT option on a thread, and the thread exits,
the server reports the corresponding thread exit event, and forgets
about the thread, i.e., removes the exited thread from its thread
list.
On the GDB side, GDB set the GDB_TO_EXIT option on a thread, GDB
delays deleting the thread from its thread list until it sees the
corresponding thread exit event, as that event needs special handling
in infrun.
When a thread disappears from the target, but it still exists on GDB's
thread list, in all-stop RSP mode, it can happen that GDB ends up
trying to resume such an already-exited-thread that GDB doesn't yet
know is gone. When that happens, against GDBserver, typically the
ongoing execution command fails with this error:
...
PC register is not available
(gdb)
At the remote protocol level, we may see e.g., this:
GDB saw an exit event for thread 619641.620754. After processing it,
infrun decides to re-resume the target again. To do that, infrun
picks some other thread that isn't exited yet from GDB's perspective,
switches to it, and calls keep_going. Below, infrun happens to pick
thread p97479.97479, the leader, which also exited, but GDB doesn't
know yet:
...
[remote] Sending packet: $Hgp97479.97479#75
[remote] Packet received: OK
[remote] Sending packet: $g#67
[remote] Packet received: xxxxxxxxxxxxxxxxx (...snip...) [1120 bytes omitted]
[infrun] reset: reason=handling event
[infrun] maybe_set_commit_resumed_all_targets: not requesting commit-resumed for target remote, no resumed threads
[infrun] fetch_inferior_event: exit
PC register is not available
(gdb)
The Linux backends, both in GDB and in GDBserver, already silently
ignore failures to resume, with the understanding that we'll see an
exit event soon. Core of GDB doesn't do that yet, though.
This patch is a small step in that direction. It swallows the error
when thrown from within resume_1. There are likely are spots where we
will need similar treatment, but we can tackle them as we find them.
After this patch, we'll see something like this instead:
Pedro Alves [Wed, 15 Jun 2022 12:19:47 +0000 (13:19 +0100)]
Report thread exit event for leader if reporting thread exit events
If GDB sets the GDB_TO_EXIT option on a thread, then if the thread
disappears from the thread list, GDB expects to shortly see a thread
exit event for it. See e.g., here, in
remote_target::update_thread_list():
/* Do not remove the thread if we've requested to be
notified of its exit. For example, the thread may be
displaced stepping, infrun will need to handle the
exit event, and displaced stepping info is recorded
in the thread object. If we deleted the thread now,
we'd lose that info. */
if ((tp->thread_options () & GDB_TO_EXIT) != 0)
continue;
There's one scenario that is deleting a thread from the
remote/gdbserver thread list without ever reporting a corresponding
thread exit event though -- check_zombie_leaders. This can lead to
GDB getting confused. For example, with a following patch that
enables GDB_TO_EXIT whenever schedlock is enabled, we'd see this
regression:
$ make check RUNTESTFLAGS="--target_board=native-extended-gdbserver" TESTS="gdb.threads/no-unwaited-for-left.exp"
...
Running src/gdb/testsuite/gdb.threads/no-unwaited-for-left.exp ...
FAIL: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits (timeout)
... some more cascading FAILs ...
gdb.log shows:
(gdb) continue
Continuing.
FAIL: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits (timeout)
A passing run would have resulted in:
(gdb) continue
Continuing.
No unwaited-for children left.
(gdb) PASS: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits
note how the leader thread is not listed in the remote-reported XML
thread list below:
(gdb) set debug remote 1
(gdb) set debug infrun 1
(gdb) info threads
Id Target Id Frame
* 1 Thread 1163850.1163850 "no-unwaited-for" main () at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.threads/no-unwaited-for-left.c:65
3 Thread 1163850.1164130 "no-unwaited-for" [remote] Sending packet: $Hgp11c24a.11c362#39
(gdb) c
Continuing.
[infrun] clear_proceed_status_thread: 1163850.1163850.0
...
[infrun] resume_1: step=1, signal=GDB_SIGNAL_0, trap_expected=1, current thread [1163850.1163850.0] at 0x55555555534f
[remote] Sending packet: $QPassSignals:#f3
[remote] Packet received: OK
[remote] Sending packet: $QThreadOptions;3:p11c24a.11c24a#f3
[remote] Packet received: OK
...
[infrun] target_set_thread_options: [options for Thread 1163850.1163850 are now 0x3]
...
[infrun] do_target_resume: resume_ptid=1163850.1163850.0, step=0, sig=GDB_SIGNAL_0
[remote] Sending packet: $vCont;c:p11c24a.11c24a#98
[infrun] prepare_to_wait: prepare_to_wait
[infrun] reset: reason=handling event
[infrun] maybe_set_commit_resumed_all_targets: enabling commit-resumed for target extended-remote
[infrun] maybe_call_commit_resumed_all_targets: calling commit_resumed for target extended-remote
[infrun] fetch_inferior_event: exit
[infrun] fetch_inferior_event: enter
[infrun] scoped_disable_commit_resumed: reason=handling event
[infrun] random_pending_event_thread: None found.
[remote] wait: enter
[remote] Packet received: N
[remote] wait: exit
[infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) =
[infrun] print_target_wait_results: -1.0.0 [process -1],
[infrun] print_target_wait_results: status->kind = NO_RESUMED
[infrun] handle_inferior_event: status->kind = NO_RESUMED
[remote] Sending packet: $Hgp0.0#ad
[remote] Packet received: OK
[remote] Sending packet: $qXfer:threads:read::0,1000#92
[remote] Packet received: l<threads>\n<thread id="p11c24a.11c362" core="0" name="no-unwaited-for" handle="0097d8f7ff7f0000"/>\n</threads>\n
[infrun] handle_no_resumed: TARGET_WAITKIND_NO_RESUMED (ignoring: found resumed)
...
... however, infrun decided there was a resumed thread still, so
ignored the TARGET_WAITKIND_NO_RESUMED event. Debugging GDB, we see
that the "found resumed" thread that GDB finds, is the leader thread.
Even though that thread is not on the remote-reported thread list, it
is still on the GDB thread list, due to the special case in remote.c
mentioned above.
This commit addresses the issue by fixing GDBserver to report a thread
exit event for the zombie leader too, i.e., making GDBserver respect
the "if thread has GDB_TO_EXIT set, report a thread exit" invariant.
To do that, it takes a bit more code than one would imagine off hand,
due to the fact that we currently always report LWP exit pending
events as TARGET_WAITKIND_EXITED, and then decide whether to convert
it to TARGET_WAITKIND_THREAD_EXITED just before reporting the event to
GDBserver core. For the zombie leader scenario described, we need to
record early on that we want to report a THREAD_EXITED event, and then
make sure that decision isn't lost along the way to reporting the
event to GDBserver core.
Pedro Alves [Tue, 30 Nov 2021 19:52:11 +0000 (19:52 +0000)]
Don't resume new threads if scheduler-locking is in effect
If scheduler-locking is in effect, like e.g., with "set
scheduler-locking on", and you step over a function that spawns a new
thread, the new thread is allowed to run free, at least until some
event is hit, at which point, whether the new thread is re-resumed
depends on a number of seemingly random factors. E.g., if the target
is all-stop, and the parent thread hits a breakpoint, and gdb decides
the breakpoint isn't interesting to report to the user, then the
parent thread is resumed, but the new thread is left stopped.
I think that letting the new threads run with scheduler-locking
enabled is a defect. This commit fixes that, making use of the new
clone events on Linux, and of target_thread_events() on targets where
new threads have no connection to the thread that spawned them.
Pedro Alves [Fri, 22 Apr 2022 20:03:07 +0000 (21:03 +0100)]
gdbserver: Queue no-resumed event after thread exit
Normally, if the last thread resumed thread on the target exits, the
server sends a no-resumed event to GDB. If however, GDB enables the
GDB_TO_EXIT option on a thread, and, that thread exits, the server
sends a thread exit event for that thread instead.
In all-stop RSP mode, since events can only be forwarded to GDB one at
a time, and the whole target stops whenever an event is reported, GDB
resumes the target again after getting a THREAD_EXITED event, and then
the server finally reports back a no-resumed event if/when
appropriate.
For non-stop RSP though, events are asynchronous, and if the server
sends a thread-exit event for the last resumed thread, the no-resumed
event is never sent. This patch makes sure that in non-stop mode, the
server queues a no-resumed event after the thread-exit event if it was
the last resumed thread that exited.
Without this, we'd see failures in step-over-thread-exit testcases
added later in the series, like so:
continue
Continuing.
- No unwaited-for children left.
- (gdb) PASS: gdb.threads/step-over-thread-exit.exp: displaced-stepping=off: non-stop=on: target-non-stop=on: schedlock=off: ns_stop_all=1: continue stops when thread exits
+ FAIL: gdb.threads/step-over-thread-exit.exp: displaced-stepping=off: non-stop=on: target-non-stop=on: schedlock=off: ns_stop_all=1: continue stops when thread exits (timeout)
Pedro Alves [Mon, 28 Jun 2021 13:05:54 +0000 (14:05 +0100)]
stop_all_threads: (re-)enable async before waiting for stops
Running the
gdb.threads/step-over-thread-exit-while-stop-all-threads.exp testcase
added later in the series against gdbserver, after the
TARGET_WAITKIND_NO_RESUMED fix from the following patch, would run
into an infinite loop in stop_all_threads, leading to a timeout:
FAIL: gdb.threads/step-over-thread-exit-while-stop-all-threads.exp: displaced-stepping=off: target-non-stop=on: iter 0: continue (timeout)
The is really a latent bug, and it is about the fact that
stop_all_threads stops listening to events from a target as soon as it
sees a TARGET_WAITKIND_NO_RESUMED, ignoring that
TARGET_WAITKIND_NO_RESUMED may be delayed. handle_no_resumed knows
how to handle delayed no-resumed events, but stop_all_threads was
never taught to.
In more detail, here's what happens with that testcase:
#1 - Multiple threads report breakpoint hits to gdb.
#2 - gdb picks one events, and it's for thread 1. All other stops are
left pending. thread 1 needs to move past a breakpoint, so gdb
stops all threads to start an inline step over for thread 1.
While stopping threads, some of the threads that were still
running report events that are also left pending.
#2 - gdb steps thread 1
#3 - Thread 1 exits while stepping (it steps over an exit syscall),
gdbserver reports thread exit for thread 1
#4 - Thread 1 was the last resumed thread, so gdbserver also reports
no-resumed:
[remote] Notification received: Stop:w0;p3445d0.3445d3
[remote] Sending packet: $vStopped#55
[remote] Packet received: N
[remote] Sending packet: $vStopped#55
[remote] Packet received: OK
#5 - gdb processes the thread exit for thread 1, finishes the step
over and restarts threads.
#6 - gdb picks the next event to process out of one of the resumed
threads with pending events:
[infrun] random_resumed_with_pending_wait_status: Found 32 events, selecting #11
#7 - This is again a breakpoint hit and the breakpoint needs to be
stepped over too, so gdb starts a step-over dance again.
#8 - We reach stop_all_threads, which finds that some threads need to
be stopped.
#9 - wait_one finally consumes the no-resumed event queue by #4.
Seeing this, wait_one disable target async, to stop listening for
events out of the remote target.
#10 - We still haven't seen all the stops expected, so
stop_all_threads tries another iteration.
#11 - Because the remote target is no longer async, and there are no
other targets, wait_one return no-resumed immediately without
polling the remote target.
#12 - We still haven't seen all the stops expected, so
stop_all_threads tries another iteration. goto #11, looping
forever.
Fix this by explicitly enabling/re-enabling target async on targets
that can async, before waiting for stops.
Simon Marchi [Fri, 5 Feb 2021 21:42:32 +0000 (16:42 -0500)]
gdb: clear step over information on thread exit (PR gdb/27338)
GDB doesn't handle correctly the case where a thread steps over a
breakpoint (using either in-line or displaced stepping), and the
executed instruction causes the thread to exit.
Using the test program included later in the series, this is what it
looks like with displaced-stepping, on x86-64 Linux, where we have two
displaced-step buffers:
$ ./gdb -q -nx --data-directory=data-directory build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-over-thread-exit/step-over-thread-exit -ex "b my_exit_syscall" -ex r
Reading symbols from build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-over-thread-exit/step-over-thread-exit...
Breakpoint 1 at 0x123c: file src/binutils-gdb/gdb/testsuite/lib/my-syscalls.S, line 68.
Starting program: build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-over-thread-exit/step-over-thread-exit
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/../lib/libthread_db.so.1".
[New Thread 0x7ffff7c5f640 (LWP 2915510)]
[Switching to Thread 0x7ffff7c5f640 (LWP 2915510)]
Thread 2 "step-over-threa" hit Breakpoint 1, my_exit_syscall () at src/binutils-gdb/gdb/testsuite/lib/my-syscalls.S:68
68 syscall
(gdb) c
Continuing.
[New Thread 0x7ffff7c5f640 (LWP 2915524)]
[Thread 0x7ffff7c5f640 (LWP 2915510) exited]
[Switching to Thread 0x7ffff7c5f640 (LWP 2915524)]
Thread 3 "step-over-threa" hit Breakpoint 1, my_exit_syscall () at src/binutils-gdb/gdb/testsuite/lib/my-syscalls.S:68
68 syscall
(gdb) c
Continuing.
[New Thread 0x7ffff7c5f640 (LWP 2915616)]
[Thread 0x7ffff7c5f640 (LWP 2915524) exited]
[Switching to Thread 0x7ffff7c5f640 (LWP 2915616)]
Thread 4 "step-over-threa" hit Breakpoint 1, my_exit_syscall () at src/binutils-gdb/gdb/testsuite/lib/my-syscalls.S:68
68 syscall
(gdb) c
Continuing.
... hangs ...
The first two times we do "continue", we displaced-step the syscall
instruction that causes the thread to exit. When the thread exits,
the main thread, waiting on pthread_join, is unblocked. It spawns a
new thread, which hits the breakpoint on the syscall again. However,
infrun was never notified that the displaced-stepping threads are done
using the displaced-step buffer, so now both buffers are marked as
used. So when we do the third continue, there are no buffers
available to displaced-step the syscall, so the thread waits forever
for its turn.
When trying the same but with in-line step over (displaced-stepping
disabled):
$ ./gdb -q -nx --data-directory=data-directory \
build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-over-thread-exit/step-over-thread-exit \
-ex "b my_exit_syscall" -ex "set displaced-stepping off" -ex r
Reading symbols from build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-over-thread-exit/step-over-thread-exit...
Breakpoint 1 at 0x123c: file src/binutils-gdb/gdb/testsuite/lib/my-syscalls.S, line 68.
Starting program: build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-over-thread-exit/step-over-thread-exit
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/../lib/libthread_db.so.1".
[New Thread 0x7ffff7c5f640 (LWP 2928290)]
[Switching to Thread 0x7ffff7c5f640 (LWP 2928290)]
Thread 2 "step-over-threa" hit Breakpoint 1, my_exit_syscall () at src/binutils-gdb/gdb/testsuite/lib/my-syscalls.S:68
68 syscall
(gdb) c
Continuing.
[Thread 0x7ffff7c5f640 (LWP 2928290) exited]
No unwaited-for children left.
(gdb) i th
Id Target Id Frame
1 Thread 0x7ffff7c60740 (LWP 2928285) "step-over-threa" 0x00007ffff7f7c9b7 in __pthread_clockjoin_ex () from /usr/lib/libpthread.so.0
The current thread <Thread ID 2> has terminated. See `help thread'.
(gdb) thread 1
[Switching to thread 1 (Thread 0x7ffff7c60740 (LWP 2928285))]
#0 0x00007ffff7f7c9b7 in __pthread_clockjoin_ex () from /usr/lib/libpthread.so.0
(gdb) c
Continuing.
^C^C
... hangs ...
The "continue" causes an in-line step to occur, meaning the main
thread is stopped while we step the syscall. The stepped thread exits
when executing the syscall, the linux-nat target notices there are no
more resumed threads to be waited for, so returns
TARGET_WAITKIND_NO_RESUMED, which causes the prompt to return. But
infrun never clears the in-line step over info. So if we try
continuing the main thread, GDB doesn't resume it, because it thinks
there's an in-line step in progress that we need to wait for to
finish, and we are stuck there.
To fix this, infrun needs to be informed when a thread doing a
displaced or in-line step over exits. We can do that with the new
target_set_thread_options mechanism which is optimal for only enabling
exit events of the thread that needs it; or, if that is not supported,
by using target_thread_events, which enables thread exit events for
all threads. This is done by this commit.
This patch then modifies handle_inferior_event in infrun.c to clean up
any step-over the exiting thread might have been doing at the time of
the exit. The cases to consider are:
- the exiting thread was doing an in-line step-over with an all-stop
target
- the exiting thread was doing an in-line step-over with a non-stop
target
- the exiting thread was doing a displaced step-over with a non-stop
target
The displaced-stepping buffer implementation in displaced-stepping.c
is modified to account for the fact that it's possible that we
"finish" a displaced step after a thread exit event. The buffer that
the exiting thread was using is marked as available again and the
original instructions under the scratch pad are restored. However, it
skips applying the fixup, which wouldn't make sense since the thread
does not exist anymore.
Another case that needs handling is if a displaced-stepping thread
exits, and the event is reported while we are in stop_all_threads. We
should call displaced_step_finish in the handle_one function, in that
case. It was already called in other code paths, just not the "thread
exited" path.
This commit doesn't make infrun ask the target to report the
TARGET_WAITKIND_THREAD_EXITED events yet, that'll be done later in the
series.
Note that "stop_print_frame = false;" line is moved to normal_stop,
because TARGET_WAITKIND_THREAD_EXITED can also end up with the event
transmorphed into TARGET_WAITKIND_NO_RESUMED. Moving it to
normal_stop keeps it centralized.
Co-authored-by: Pedro Alves <pedro@palves.net>
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27338
Change-Id: I745c6955d7ef90beb83bcf0ff1d1ac8b9b6285a5
When stepping over a breakpoint with displaced stepping, GDB needs to
be informed if the stepped thread exits, otherwise the displaced
stepping buffer that was allocated to that thread leaks, and this can
result in deadlock, with other threads waiting for their turn to
displaced step, but their turn never comes.
Similarly, when stepping over a breakpoint in line, GDB also needs to
be informed if the stepped thread exits, so that is can clear the step
over state and re-resume threads.
This commit makes it possible for GDB to ask the target to report
thread exit events for a given thread, using the new "thread options"
mechanism introduced by a previous patch.
This only adds the core bits. Following patches in the series will
teach the Linux backends (native & gdbserver) to handle the
GDB_TO_EXIT option, and then a later patch will make use of these
thread exit events to clean up displaced stepping and inline stepping
state properly.
Pedro Alves [Tue, 5 Jul 2022 11:21:50 +0000 (12:21 +0100)]
gdbserver/linux-low.cc: Ignore event_ptid if TARGET_WAITKIND_IGNORE
gdbserver's linux_process_target::wait loops if called sync mode, and
wait_1 returns TARGET_WAITKIND_IGNORE, _and_ wait_1 also returns
null_ptid. The null_ptid check fails however when this path is taken:
Pedro Alves [Mon, 4 Apr 2022 20:12:03 +0000 (21:12 +0100)]
all-stop/synchronous RSP support thread-exit events
Currently, GDB does not understand the THREAD_EXITED stop reply in
remote all-stop mode. There's no good reason for this, it just
happened that THREAD_EXITED was only ever reported in non-stop mode so
far. This patch teaches GDB to parse that event in all-stop RSP too.
There is no need to add a qSupported feature for this, because the
server won't send a THREAD_EXITED event unless GDB explicitly asks for
it, with QThreadEvents, or with the GDB_TO_EXIT QThreadOptions option
added in the next patch.
Andrew Burgess [Wed, 30 Jun 2021 19:55:03 +0000 (20:55 +0100)]
Add test for stepping over clone syscall
- New in this in series version:
Fixed race, remove end anchor after prompt.
Fix leading anchor and inferior output flushing issue against
gdbserver.
Now works with displaced stepping.
Avoid printing inferior addresses in gdb.sum messages
Fail when clone child thread is stuck on scratchpad
Misc minor tweaks throughout.
This adds a new gdb.threads/stepi-over-clone.exp testcase, which
exercises stepping over a clone syscall, with displaced stepping vs
inline stepping, and all-stop vs non-stop. We already test stepping
over clone syscalls with gdb.base/step-over-syscall.exp, but this test
uses pthreads, while the other test uses raw clone, and this one is
more thorough.
Co-authored-by: Pedro Alves <pedro@palves.net>
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=19675
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27830
Change-Id: I95c06024736384ae8542a67ed9fdf6534c325c8e
Pedro Alves [Fri, 3 Dec 2021 22:10:05 +0000 (22:10 +0000)]
gdbserver: Hide and don't detach pending clone children
This commit extends the logic added by these two commits from a while
ago:
#1 7b961964f866 (gdbserver: hide fork child threads from GDB),
#2 df5ad102009c (gdb, gdbserver: detach fork child when detaching from fork parent)
... to handle thread clone events, which are very similar to (v)fork
events.
For #1, we want to hide clone children as well, so just update the
comments.
For #2, unlike (v)fork children, pending clone children aren't full
processes, they're just threads, so don't detach them in
handle_detach. linux-low.cc will take care of detaching them along
with all other threads of the process, there's nothing special that
needs to be done.
Pedro Alves [Tue, 23 Nov 2021 20:35:12 +0000 (20:35 +0000)]
Thread options & clone events (Linux GDBserver)
This patch teaches the Linux GDBserver backend to report clone events
to GDB, when GDB has requested them with the GDB_TO_CLONE thread
option, via the new QThreadOptions packet.
This shuffles code in linux_process_target::handle_extended_wait
around to a more logical order when we now have to handle and
potentially report all of fork/vfork/clone.
Pedro Alves [Tue, 23 Nov 2021 20:35:12 +0000 (20:35 +0000)]
Thread options & clone events (native Linux)
This commit teaches the native Linux target about the GDB_TO_CLONE
thread option. It's actually simpler to just continue reporting all
clone events unconditionally to the core. There's never any harm in
reporting a clone event when the option is disabled. All we need to
do report support for the option, otherwise GDB falls back to use
target_thread_events().
Pedro Alves [Tue, 23 Nov 2021 20:35:12 +0000 (20:35 +0000)]
Thread options & clone events (core + remote)
A previous patch taught GDB about a new TARGET_WAITKIND_THREAD_CLONED
event kind, and made the Linux target report clone events.
A following patch will teach Linux GDBserver to do the same thing.
However, for remote debugging, it wouldn't be ideal for GDBserver to
report every clone event to GDB, when GDB only cares about such events
in some specific situations. Reporting clone events all the time
would be potentially chatty. We don't enable thread create/exit
events all the time for the same reason. Instead we have the
QThreadEvents packet. QThreadEvents is target-wide, though.
This patch makes GDB instead explicitly request that the target
reports clone events or not, on a per-thread basis.
In order to be able to do that with GDBserver, we need a new remote
protocol feature. Since a following patch will want to enable thread
exit events on per-thread basis too, the packet introduced here is
more generic than just for clone events. It lets you enable/disable a
set of options at once, modelled on Linux ptrace's PTRACE_SETOPTIONS.
IOW, this commit introduces a new QThreadOptions packet, that lets you
specify a set of per-thread event options you want to enable. The
packet accepts a list of options/thread-id pairs, similarly to vCont,
processed left to right, with the options field being a number
interpreted as a bit mask of options. The only option defined in this
commit is GDB_TO_CLONE (0x1), which ask the remote target to report
clone events. Another patch later in the series will introduce
another option.
For example, this packet sets option "1" (clone events) on thread
p1000.2345:
QThreadOptions;1:p1000.2345
and this clears options for all threads of process 1000, and then sets
option "1" (clone events) on thread p1000.2345:
QThreadOptions;0:p1000.-1;1:p1000.2345
This clears options of all threads of all processes:
QThreadOptions;0
The target reports the set of supported options by including
"QThreadOptions=<supported options>" in its qSupported response.
infrun is then tweaked to enable GDB_TO_CLONE when stepping over a
breakpoint.
Unlike PTRACE_SETOPTIONS, fork/vfork/clone children do NOT inherit
their parent's thread options. This is so that GDB can send e.g.,
"QThreadOptions;0;1:TID" without worrying about threads it doesn't
know about yet.
Documentation for this new remote protocol feature is included in a
documentation patch later in the series.
Pedro Alves [Tue, 23 Nov 2021 20:35:12 +0000 (20:35 +0000)]
Support clone events in the remote protocol
The previous patch taught GDB about a new
TARGET_WAITKIND_THREAD_CLONED event kind, and made the Linux target
report clone events.
A following patch will teach Linux GDBserver to do the same thing.
But before we get there, we need to teach the remote protocol about
TARGET_WAITKIND_THREAD_CLONED. That's what this patch does. Clone is
very similar to vfork and fork, and the new stop reply is likewise
handled similarly. The stub reports "T05clone:...".
GDBserver core is taught to handle TARGET_WAITKIND_THREAD_CLONED and
forward it to GDB in this patch, but no backend actually emits it yet.
That will be done in a following patch.
Documentation for this new remote protocol feature is included in a
documentation patch later in the series.
Pedro Alves [Fri, 12 Nov 2021 20:50:29 +0000 (20:50 +0000)]
Step over clone syscall w/ breakpoint, TARGET_WAITKIND_THREAD_CLONED
(A good chunk of the problem statement in the commit log below is
Andrew's, adjusted for a different solution, and for covering
displaced stepping too.)
This commit addresses bugs gdb/19675 and gdb/27830, which are about
stepping over a breakpoint set at a clone syscall instruction, one is
about displaced stepping, and the other about in-line stepping.
Currently, when a new thread is created through a clone syscall, GDB
sets the new thread running. With 'continue' this makes sense
(assuming no schedlock):
- all-stop mode, user issues 'continue', all threads are set running,
a newly created thread should also be set running.
- non-stop mode, user issues 'continue', other pre-existing threads
are not effected, but as the new thread is (sort-of) a child of the
thread the user asked to run, it makes sense that the new threads
should be created in the running state.
Similarly, if we are stopped at the clone syscall, and there's no
software breakpoint at this address, then the current behaviour is
fine:
- all-stop mode, user issues 'stepi', stepping will be done in place
(as there's no breakpoint to step over). While stepping the thread
of interest all the other threads will be allowed to continue. A
newly created thread will be set running, and then stopped once the
thread of interest has completed its step.
- non-stop mode, user issues 'stepi', stepping will be done in place
(as there's no breakpoint to step over). Other threads might be
running or stopped, but as with the continue case above, the new
thread will be created running. The only possible issue here is
that the new thread will be left running after the initial thread
has completed its stepi. The user would need to manually select
the thread and interrupt it, this might not be what the user
expects. However, this is not something this commit tries to
change.
The problem then is what happens when we try to step over a clone
syscall if there is a breakpoint at the syscall address.
- For both all-stop and non-stop modes, with in-line stepping:
+ user issues 'stepi',
+ [non-stop mode only] GDB stops all threads. In all-stop mode all
threads are already stopped.
+ GDB removes s/w breakpoint at syscall address,
+ GDB single steps just the thread of interest, all other threads
are left stopped,
+ New thread is created running,
+ Initial thread completes its step,
+ [non-stop mode only] GDB resumes all threads that it previously
stopped.
There are two problems in the in-line stepping scenario above:
1. The new thread might pass through the same code that the initial
thread is in (i.e. the clone syscall code), in which case it will
fail to hit the breakpoint in clone as this was removed so the
first thread can single step,
2. The new thread might trigger some other stop event before the
initial thread reports its step completion. If this happens we
end up triggering an assertion as GDB assumes that only the
thread being stepped should stop. The assert looks like this:
infrun.c:5899: internal-error: int finish_step_over(execution_control_state*): Assertion `ecs->event_thread->control.trap_expected' failed.
- For both all-stop and non-stop modes, with displaced stepping:
+ user issues 'stepi',
+ GDB starts the displaced step, moves thread's PC to the
out-of-line scratch pad, maybe adjusts registers,
+ GDB single steps the thread of interest, [non-stop mode only] all
other threads are left as they were, either running or stopped.
In all-stop, all other threads are left stopped.
+ New thread is created running,
+ Initial thread completes its step, GDB re-adjusts its PC,
restores/releases scratchpad,
+ [non-stop mode only] GDB resumes the thread, now past its
breakpoint.
+ [all-stop mode only] GDB resumes all threads.
There is one problem with the displaced stepping scenario above:
3. When the parent thread completed its step, GDB adjusted its PC,
but did not adjust the child's PC, thus that new child thread
will continue execution in the scratch pad, invoking undefined
behavior. If you're lucky, you see a crash. If unlucky, the
inferior gets silently corrupted.
What is needed is for GDB to have more control over whether the new
thread is created running or not. Issue #1 above requires that the
new thread not be allowed to run until the breakpoint has been
reinserted. The only way to guarantee this is if the new thread is
held in a stopped state until the single step has completed. Issue #3
above requires that GDB is informed of when a thread clones itself,
and of what is the child's ptid, so that GDB can fixup both the parent
and the child.
When looking for solutions to this problem I considered how GDB
handles fork/vfork as these have some of the same issues. The main
difference between fork/vfork and clone is that the clone events are
not reported back to core GDB. Instead, the clone event is handled
automatically in the target code and the child thread is immediately
set running.
Note we have support for requesting thread creation events out of the
target (TARGET_WAITKIND_THREAD_CREATED). However, those are reported
for the new/child thread. That would be sufficient to address in-line
stepping (issue #1), but not for displaced-stepping (issue #3). To
handle displaced-stepping, we need an event that is reported to the
_parent_ of the clone, as the information about the displaced step is
associated with the clone parent. TARGET_WAITKIND_THREAD_CREATED
includes no indication of which thread is the parent that spawned the
new child. In fact, for some targets, like e.g., Windows, it would be
impossible to know which thread that was, as thread creation there
doesn't work by "cloning".
The solution implemented here is to model clone on fork/vfork, and
introduce a new TARGET_WAITKIND_THREAD_CLONED event. This event is
similar to TARGET_WAITKIND_FORKED and TARGET_WAITKIND_VFORKED, except
that we end up with a new thread in the same process, instead of a new
thread of a new process. Like FORKED and VFORKED, THREAD_CLONED
waitstatuses have a child_ptid property, and the child is help stopped
until GDB explicitly resumes it. This addresses the in-line stepping
case (issues #1 and #2).
The infrun code that handles displaced stepping fixup for the child
after a fork/vfork event is thus reused for THREAD_CLONE, with some
minimal conditions added, addressing the displaced stepping case
(issue #3).
The native Linux backend is adjusted to unconditionally report
TARGET_WAITKIND_THREAD_CLONED events to the core.
Following the follow_fork model in core GDB, we introduce a
target_follow_clone target method, which is responsible for making the
new clone child visible to the rest of GDB.
Subsequent patches will add clone events support to the remote
protocol and gdbserver.
A testcase will be added by a later patch.
displaced_step_in_progress_thread is removed in this patch, but it is
added back again in a subsequent patch. We need to do this because
the function is static, and with no callers, the compiler would warn,
(error with -Werror), breaking the build.
Pedro Alves [Wed, 13 Jul 2022 16:16:38 +0000 (17:16 +0100)]
gdb/linux: Delete all other LWPs immediately on ptrace exec event
I noticed that after a following patch ("Step over clone syscall w/
breakpoint, TARGET_WAITKIND_THREAD_CLONED"), the
gdb.threads/step-over-exec.exp was passing cleanly, but still, we'd
end up with four new unexpected GDB core dumps:
=== gdb Summary ===
# of unexpected core files 4
# of expected passes 48
That said patch is making the pre-existing
gdb.threads/step-over-exec.exp testcase (almost silently) expose a
latent problem in gdb/linux-nat.c, resulting in a GDB crash when:
#1 - a non-leader thread execs
#2 - the post-exec program stops somewhere
#3 - you kill the inferior
Instead of #3 directly, the testcase just returns, which ends up in
gdb_exit, tearing down GDB, which kills the inferior, and is thus
equivalent to #3 above.
Vis:
$ gdb --args ./gdb /home/pedro/gdb/build/gdb/testsuite/outputs/gdb.threads/step-over-exec/step-over-exec-execr-thread-other-diff-text-segs-true
...
(top-gdb) r
...
(gdb) b main
...
(gdb) r
...
Breakpoint 1, main (argc=1, argv=0x7fffffffdb88) at /home/pedro/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.threads/step-over-exec.c:69
69 argv0 = argv[0];
(gdb) c
Continuing.
[New Thread 0x7ffff7d89700 (LWP 2506975)]
Other going in exec.
Exec-ing /home/pedro/gdb/build/gdb/testsuite/outputs/gdb.threads/step-over-exec/step-over-exec-execr-thread-other-diff-text-segs-true-execd
process 2506769 is executing new program: /home/pedro/gdb/build/gdb/testsuite/outputs/gdb.threads/step-over-exec/step-over-exec-execr-thread-other-diff-text-segs-true-execd
Thread 1 "step-over-exec-" hit Breakpoint 1, main () at /home/pedro/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.threads/step-over-exec-execd.c:28
28 foo ();
(gdb) k
...
Thread 1 "gdb" received signal SIGSEGV, Segmentation fault.
0x000055555574444c in thread_info::has_pending_waitstatus (this=0x0) at ../../src/gdb/gdbthread.h:393
393 return m_suspend.waitstatus_pending_p;
(top-gdb) bt
#0 0x000055555574444c in thread_info::has_pending_waitstatus (this=0x0) at ../../src/gdb/gdbthread.h:393
#1 0x0000555555a884d1 in get_pending_child_status (lp=0x5555579b8230, ws=0x7fffffffd130) at ../../src/gdb/linux-nat.c:1345
#2 0x0000555555a8e5e6 in kill_unfollowed_child_callback (lp=0x5555579b8230) at ../../src/gdb/linux-nat.c:3564
#3 0x0000555555a92a26 in gdb::function_view<int (lwp_info*)>::bind<int, lwp_info*>(int (*)(lwp_info*))::{lambda(gdb::fv_detail::erased_callable, lwp_info*)#1}::operator()(gdb::fv_detail::erased_callable, lwp_info*) const (this=0x0, ecall=..., args#0=0x5555579b8230) at ../../src/gdb/../gdbsupport/function-view.h:284
#4 0x0000555555a92a51 in gdb::function_view<int (lwp_info*)>::bind<int, lwp_info*>(int (*)(lwp_info*))::{lambda(gdb::fv_detail::erased_callable, lwp_info*)#1}::_FUN(gdb::fv_detail::erased_callable, lwp_info*) () at ../../src/gdb/../gdbsupport/function-view.h:278
#5 0x0000555555a91f84 in gdb::function_view<int (lwp_info*)>::operator()(lwp_info*) const (this=0x7fffffffd210, args#0=0x5555579b8230) at ../../src/gdb/../gdbsupport/function-view.h:247
#6 0x0000555555a87072 in iterate_over_lwps(ptid_t, gdb::function_view<int (lwp_info*)>) (filter=..., callback=...) at ../../src/gdb/linux-nat.c:864
#7 0x0000555555a8e732 in linux_nat_target::kill (this=0x55555653af40 <the_amd64_linux_nat_target>) at ../../src/gdb/linux-nat.c:3590
#8 0x0000555555cfdc11 in target_kill () at ../../src/gdb/target.c:911
...
The root of the problem is that when a non-leader LWP execs, it just
changes its tid to the tgid, replacing the pre-exec leader thread,
becoming the new leader. There's no thread exit event for the execing
thread. It's as if the old pre-exec LWP vanishes without trace. The
ptrace man page says:
"PTRACE_O_TRACEEXEC (since Linux 2.5.46)
Stop the tracee at the next execve(2). A waitpid(2) by the
tracer will return a status value such that
status>>8 == (SIGTRAP | (PTRACE_EVENT_EXEC<<8))
If the execing thread is not a thread group leader, the thread
ID is reset to thread group leader's ID before this stop.
Since Linux 3.0, the former thread ID can be retrieved with
PTRACE_GETEVENTMSG."
When the core of GDB processes an exec events, it deletes all the
threads of the inferior. But, that is too late -- deleting the thread
does not delete the corresponding LWP, so we end leaving the pre-exec
non-leader LWP stale in the LWP list. That's what leads to the crash
above -- linux_nat_target::kill iterates over all LWPs, and after the
patch in question, that code will look for the corresponding
thread_info for each LWP. For the pre-exec non-leader LWP still
listed, won't find one.
This patch fixes it, by deleting the pre-exec non-leader LWP (and
thread) from the LWP/thread lists as seen as we get an exec event out
of ptrace.
GDBserver does not need an equivalent fix, because it is already doing
this, as side effect of mourning the pre-exec process, in
gdbserver/linux-low.cc:
else if (event == PTRACE_EVENT_EXEC && cs.report_exec_events)
{
...
/* Delete the execing process and all its threads. */
mourn (proc);
switch_to_thread (nullptr);
Pedro Alves [Fri, 12 Nov 2021 20:50:29 +0000 (20:50 +0000)]
linux-nat: introduce pending_status_str
I noticed that some debug log output printing an lwp's pending status
wasn't considering lp->waitstatus. This fixes it, by introducing a
new pending_status_str function.
Pedro Alves [Tue, 22 Jun 2021 14:42:51 +0000 (15:42 +0100)]
displaced step: pass down target_waitstatus instead of gdb_signal
This commit tweaks displaced_step_finish & friends to pass down a
target_waitstatus instead of a gdb_signal. This needed because a
patch later in the series will want to make
displaced_step_buffers::finish handle TARGET_WAITKIND_THREAD_EXITED.
It also helps with the TARGET_WAITKIND_THREAD_CLONED patch.
Tom Tromey [Fri, 17 Jun 2022 13:41:37 +0000 (07:41 -0600)]
Remove array typedef assumption for Ada
Currently the Ada code assumes that it can distinguish between a
multi-dimensional array and an array of arrays by looking for an
intervening typedef -- that is, for an array of arrays, there will be
a typedef wrapping the innermost array type.
A recent compiler change removes this typedef, which causes a gdb
failure in the internal AdaCore test suite.
This patch handles this case by checking whether the array type in
question has a name.
Tom Tromey [Thu, 23 Jun 2022 17:05:39 +0000 (11:05 -0600)]
Remove ui_register_input_event_handler
This patch removes ui_register_input_event_handler and
ui_unregister_input_event_handler, replacing them with methods on
'ui'. It also changes gdb to use these methods everywhere, rather
than sometimes reaching in to the ui to manage the file descriptor
directly.
opcodes/arc: Implement style support in the disassembler
Update the ARC disassembler to supply style information to the
disassembler output. The output formatting remains unchanged.
opcodes/ChangeLog:
* disassemble.c (disassemble_init_for_target): Set
created_styled_output for ARC based targets.
* arc-dis.c (find_format_from_table): Use fprintf_styled_ftype
instead of fprintf_ftype throughout.
(find_format): Likewise.
(print_flags): Likewise.
(print_insn_arc): Likewise.
Tom de Vries [Mon, 18 Jul 2022 10:48:39 +0000 (12:48 +0200)]
[gdb/testsuite] Remove duplicate of supports_gnuc
In commit 9d9dd861e98 ("[gdb/testsuite] Fix regression in
step-indirect-call-thunk.exp with gcc 7") I accidentally committed a duplicate
of supports_gnuc, which caused:
...
DUPLICATE: gdb.base/gdb-caching-proc.exp: supports_gnuc: consistency
...
Andrew Burgess [Mon, 4 Jul 2022 15:40:05 +0000 (16:40 +0100)]
gdb/python: look for python, then python 3 at configure time
It is possible that a system might have a python3 executable, but no
python executable. For example, on my Fedora system the python2
package provides /usr/bin/python2, the python3 package provides
/usr/bin/python3, and the python-unversioned-command package provides
/usr/bin/python, which picks between python2 and python3.
It is quite possible to only have python3 available on a system.
Currently, when GDB configures, it looks for a 'python' executable.
If non is found then GDB will be built without python support. Or the
user needs to configure using --with-python=/usr/bin/python3.
This commit updates GDB's configure.ac script to first look for
'python', and then 'python3'. Now, on a system that only has a
python3 executable, GDB will automatically find, and use that in order
to provide python support, no user supplied configure arguments are
needed.
I've tested this on my local machine by removing the
python-unversioned-command package, confirming that there is no longer
a 'python' executable in my $PATH, and then rebuilding GDB from
scratch. GDB with this patch has python support.
Jan Beulich [Mon, 18 Jul 2022 09:20:44 +0000 (11:20 +0200)]
x86: correct VMOVSH attributes
Both forms were missing VexW0 (thus allowing Evex.W=1 to be encoded by
suitable means, which would cause #UD). The memory operand form further
was using the wrong Masking value, thus allowing zeroing-masking to be
encoded for the store form (which would again cause #UD).
Jan Beulich [Mon, 18 Jul 2022 09:19:58 +0000 (11:19 +0200)]
x86: re-order insn template fields
This saves quite a number of shift instructions: The "operands" field
can now be retrieved by just masking (no shift), and extracting the
"extension_opcode" field now only requires a (signed) right shift, with
no prereq left one. (Of course there may be architectures where, in a
cross build, there might be no difference at all, e.g. when there are
suitable bitfield extraction insns.)
Tom de Vries [Mon, 18 Jul 2022 06:34:06 +0000 (08:34 +0200)]
[gdbsupport] Improve thread scheduling in parallel_for_each
When running a task using parallel_for_each, we get the following
distribution:
...
Parallel for: n_elements: 7271
Parallel for: minimum elements per thread: 10
Parallel for: elts_per_thread: 1817
Parallel for: elements on worker thread 0 : 1817
Parallel for: elements on worker thread 1 : 1817
Parallel for: elements on worker thread 2 : 1817
Parallel for: elements on worker thread 3 : 0
Parallel for: elements on main thread : 1820
...
Note that there are 4 active threads, and scheduling elts_per_thread on each
of those handles 4 * 1817 == 7268, leaving 3 "left over" elements.
These leftovers are currently handled in the main thread.
That doesn't seem to matter much for this example, but for say 10 threads and
99 elements, you'd have 9 threads handling 9 elements and 1 thread handling 18
elements.
Instead, distribute the left over elements over the worker threads, such that
we have:
...
Parallel for: elements on worker thread 0 : 1818
Parallel for: elements on worker thread 1 : 1818
Parallel for: elements on worker thread 2 : 1818
Parallel for: elements on worker thread 3 : 0
Parallel for: elements on main thread : 1817
...
Tom de Vries [Mon, 18 Jul 2022 04:20:38 +0000 (06:20 +0200)]
[gdb/testsuite] Allow override of ASAN_OPTIONS in lib/gdb.exp
Use set_sanitizer_default for ASAN_OPTIONS in lib/gdb.exp.
This allows us to override the default detect_leaks=0 setting, by manually
doing:
...
$ export ASAN_OPTIONS=detect_leaks=1
$ make check
...
Tested on x86_64-linux, by building with -fsanitize=address and running
test-case gdb.dwarf2/gdb-add-index.exp with and without
"export ASAN_OPTIONS=detect_leaks=1".
Tom de Vries [Mon, 18 Jul 2022 04:13:45 +0000 (06:13 +0200)]
[gdb/testsuite] Fix regression in step-indirect-call-thunk.exp with gcc 7
Since commit 43127ae5714 ("Fix gdb.base/step-indirect-call-thunk.exp") I run
into:
...
gdb compile failed, gcc: error: unrecognized command line option \
'-fcf-protection=none'; did you mean '-flto-partition=none'?
UNTESTED: gdb.base/step-indirect-call-thunk.exp: failed to prepare
...
The problem is that -fcf-protection is supported starting gcc 8, but I'm using
system gcc 7.5.0.
Fix this by only adding -fcf-protection=none for gcc 8 and later.
Tested on x86_64-linux, with gcc 7.5.0, 8.2.1 and 12.1.1.
Tom de Vries [Mon, 18 Jul 2022 03:34:01 +0000 (05:34 +0200)]
[gdbsupport] Add parallel_for_each_debug
Add a parallel_for_each_debug variable, set to false by default.
With an a.out compiled from hello world, we get with
parallel_for_each_debug == true:
...
$ gdb -q -batch a.out -ex start
...
Parallel for: n_elements: 7271
Parallel for: minimum elements per thread: 10
Parallel for: elts_per_thread: 1817
Parallel for: elements on worker thread 0 : 1817
Parallel for: elements on worker thread 1 : 1817
Parallel for: elements on worker thread 2 : 1817
Parallel for: elements on worker thread 3 : 0
Parallel for: elements on main thread : 1820
Temporary breakpoint 1, main () at /home/vries/hello.c:6
6 printf ("hello\n");
...
gdb-add-index always generates an error when libdebuginfod wasn't compiled in
gdb-add-index runs gdb with -iex 'set debuginfod enabled off'. If gdb
is not compiled against libdebuginfod this causes an unnecessary error
message to be printed to stderr indicating that gdb was not built with
debuginfod support.
Fix this by changing the 'set debuginfod enabled off' command to a
no-op when gdb isn't built with libdebuginfod.
Bruno Larsen [Tue, 28 Jun 2022 18:36:45 +0000 (15:36 -0300)]
gdb/testsuite: modernize gdb.base/maint.exp
gdb.base/maint.exp was using several gdb_expect statements, probably
because this test case predates the existance of gdb_test_multiple. This
commit updates the test case to use gdb_test_multiple, making it more
resilient to internal errors and such.
The only gdb_expect left in the testcase is one that specifically looks
for an internal error being triggered as a PASS.
Tom Tromey [Fri, 15 Jul 2022 15:38:32 +0000 (09:38 -0600)]
Add 'nibbles' to gdb.print_options
When I rebased and updated the print_options patch, I forgot to update
print_options to add the new 'nibbles' feature to the result. This
patch fixes the oversight. I'm checking this in.
Carl Love [Fri, 15 Jul 2022 15:30:43 +0000 (15:30 +0000)]
PowerPC: Add support for IEEE 128-bit format.
The test gdb.base/infcall-nested-structs-c.exp fails on a gdb assert
in function ppc64_sysv_abi_return_value in file gdb/ppc-sysv-tdep.c. The
assert is due to the missing IEEE 128-bit support in file
gdb/ppc-sysv-tdep.c.
The IBM long double was the initial float 128-bit support added by IBM
The IEEE 128-bit support, which is similar IBM long double support, was
made the default starting with GCC 12. The floating point format
differences include the number of bits used to encode the exponent
and significand. Also, IBM long double values are passed in a pair of
floating point registers. The IEEE 128-bit value is passed in a single
vector register.
This patch fixes the gdb_assert (ok); in function
ppc64_sysv_abi_return_value in gdb/ppc-sysv-tdep.c by adding IEEE FLOAT
128-bit type support for PowerPC.
The patch has been tested on Power 10, ELFv2. It fixes the following list
of regression failures on Power 10:
Tom Tromey [Tue, 7 Jun 2022 13:05:02 +0000 (07:05 -0600)]
Add 'summary' mode to Value.format_string
This adds a 'summary' mode to Value.format_string and to
gdb.print_options. For the former, it lets Python code format values
using this mode. For the latter, it lets a printer potentially detect
if it is being called in a backtrace with 'set print frame-arguments'
set to 'scalars'.
I considered adding a new mode here to let a pretty-printer see
whether it was being called in a 'backtrace' context at all, but I'm
not sure if this is really desirable.
Tom Tromey [Mon, 6 Jun 2022 15:54:45 +0000 (09:54 -0600)]
Expose current 'print' settings to Python
PR python/17291 asks for access to the current print options. While I
think this need is largely satisfied by the existence of
Value.format_string, it seemed to me that a bit more could be done.
First, while Value.format_string uses the user's settings, it does not
react to temporary settings such as "print/x". This patch changes
this.
Second, there is no good way to examine the current settings (in
particular the temporary ones in effect for just a single "print").
This patch adds this as well.
Carl Love [Fri, 15 Jul 2022 15:17:34 +0000 (15:17 +0000)]
PowerPC: fix for gdb.base/eh_return.exp
Disable the Traceback Table generation on PowerPC for this test. The
Traceback Table consists of a series of bit fields to indicate things like
the Traceback Table version, language, and specific information about the
function. The Traceback Table is generated following the end of the code
for every function by default. The Traceback Table is defined in the
PowerPC ELF ABI and is intended to support debuggers and exception
handlers. The Traceback Table is displayed in the disassembly of functions
by default and is part of the function length. The table is typically
interpreted by the disassembler as data represented by .long xxx entries.
Generation of the Traceback Table is disabled in this test using the
PowerPC specific gcc compiler option -mtraceback=no, the xlc option
additional_flags-qtable=none and the clang optons
-mllvm -xcoff-traceback-table=false. Disabling the Traceback Table
generation in this test results in the gdb_test_multiple statement
correctly locating the address of the bclr instruction before the statement
"End of assembler dump." in the disassembly output.
Tom de Vries [Thu, 14 Jul 2022 18:47:54 +0000 (20:47 +0200)]
[gdb/symtab] Fix data race in cooked_index_functions::expand_symtabs_matching
When building gdb with -fsanitize-threads and running test-case
gdb.ada/char_enum_unicode.exp, I run into:
...
WARNING: ThreadSanitizer: data race (pid=21301)^M
Write of size 8 at 0x7b2000008080 by main thread:^M
#0 free <null> (libtsan.so.2+0x4c5e2)^M
#1 _dl_close_worker <null> (ld-linux-x86-64.so.2+0x4b7b)^M
#2 convert_between_encodings() charset.c:584^M
...
#21 cooked_index_functions::expand_symtabs_matching() read.c:18606
...
This is fixed by making cooked_index_functions::expand_symtabs_matching wait
for the cooked index finalization to be done.
Tom de Vries [Thu, 14 Jul 2022 08:20:16 +0000 (10:20 +0200)]
[gdb/build] Fix gdb build with gcc 4.8.5
When building gdb with gcc 4.8.5, we run into:
...
In file included from /usr/include/c++/4.8/future:43:0,
from gdbsupport/thread-pool.h:30,
from gdb/dwarf2/cooked-index.h:33,
from gdb/dwarf2/read.h:26,
from gdb/dwarf2/abbrev-cache.c:21:
/usr/include/c++/4.8/atomic: In instantiation of \
‘_Tp std::atomic<_Tp>::load(std::memory_order) const [with _Tp = \
packed<dwarf_unit_type, 1ul>; std::memory_order = std::memory_order]’:
gdb/dwarf2/read.h:332:44: required from here
/usr/include/c++/4.8/atomic:208:13: error: no matching function for call to \
‘packed<dwarf_unit_type, 1ul>::packed()’
_Tp tmp;
^
...
Fix this by adding the default constructor for packed.
Tested on x86_64-linux, with gdb build with gcc 4.8.5.
Tom de Vries [Thu, 14 Jul 2022 06:19:00 +0000 (08:19 +0200)]
[gdb/symtab] Make per_cu->m_lang atomic
When building gdb with -fsanitize=thread and running test-case
gdb.dwarf2/inlined_subroutine-inheritance.exp, we run into a data race
between:
...
Read of size 1 at 0x7b2000003010 by thread T4:
#0 packed<language, 1ul>::operator language() const packed.h:54
#1 dwarf2_per_cu_data::set_lang(language) read.h:363
...
and:
...
Previous write of size 1 at 0x7b2000003010 by main thread:
#0 dwarf2_per_cu_data::set_lang(language) read.h:365
...
Tom de Vries [Thu, 14 Jul 2022 06:19:00 +0000 (08:19 +0200)]
[gdb/symtab] Make per_cu->unit_type atomic
With gdb with -fsanitize=thread and test-case gdb.ada/array_bounds.exp, I run
into a data race between:
...
Read of size 1 at 0x7b2000025f0f by main thread:
#0 packed<dwarf_unit_type, 1ul>::operator dwarf_unit_type() const packed.h:54
#1 dwarf2_per_cu_data::set_unit_type(dwarf_unit_type) read.h:339
...
and:
...
Previous write of size 1 at 0x7b2000025f0f by thread T3:
#0 dwarf2_per_cu_data::set_unit_type(dwarf_unit_type) read.h:341
...
Tom de Vries [Thu, 14 Jul 2022 06:19:00 +0000 (08:19 +0200)]
[gdb/symtab] Fix data race in ~charset_vector
When doing:
...
$ gdb ./outputs/gdb.ada/char_enum_unicode/foo -batch -ex "break foo.adb:26"
...
with a gdb build with -fsanitize=thread I run into a data race:
...
WARNING: ThreadSanitizer: data race (pid=30917)
Write of size 8 at 0x7b0400004070 by main thread:
#0 free <null> (libtsan.so.2+0x4c5e2)
#1 xfree<char> gdbsupport/gdb-xfree.h:37 (gdb+0x650f17)
#2 charset_vector::clear() gdb/charset.c:703 (gdb+0x651354)
#3 charset_vector::~charset_vector() gdb/charset.c:697 (gdb+0x6512d3)
#4 <null> <null> (libtsan.so.2+0x32643)
#5 captured_main_1 gdb/main.c:1310 (gdb+0xa3975a)
...
The problem is that we're freeing the charset_vector elements in the destructor,
which may still be used by a worker thread.
Fix this by not freeing the charset_vector elements in the destructor.
Alan Modra [Tue, 12 Jul 2022 01:40:08 +0000 (11:10 +0930)]
PowerPC: implement md_operand to parse register names
Allows register names to appear in symbol assignments, so for example
tocp = %r2
mr %r3,tocp
now assembles.
* gas/config/tc-ppc.c (REG_NAME_CNT): Delete, replace uses with
ARRAY_SIZE.
(register_name): Rename to..
(md_operand): ..this. Only handle %reg.
(cr_names): Rename to..
(cr_cond): ..this. Just keep conditions.
(ppc_parse_name): Add mode param. Search both cr_cond and
pre_defined_registers. Handle absolute and register symbol
values here rather than in expr.c:operand().
(md_assemble): Don't special case register name matching in
operands, except to set cr_operand as appropriate.
* gas/config/tc-ppc.h (md_operand): Don't define.
(md_parse_name, ppc_parse_name): Update.
* read.c (pseudo_set): Copy over entire O_register value.
* testsuite/gas/ppc/regsyms.d.
* testsuite/gas/ppc/regsyms.s: New test.
* testsuite/gas/ppc/ppc.exp: Run it.
A WIP version of a patch
(https://sourceware.org/pipermail/gdb-patches/2022-June/190202.html)
resulted in a bug that went unnoticed by the testuite, like so:
(gdb) PASS: gdb.threads/no-unwaited-for-left.exp: enable scheduler-locking, for main thread
continue
Continuing.
[New Thread 1251861.1251861]
No unwaited-for children left.
(gdb) PASS: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits
info threads
Id Target Id Frame
3 Thread 1251861.1251863 "no-unwaited-for" __pthread_clockjoin_ex (threadid=140737351558976, thread_return=0x0, clockid=<optimized out>, abstime=<optimized out>, block=<optimized out>) at pthread_join_common.c:145
4 Thread 1251861.1251861 "no-unwaited-for" <unavailable> in ?? ()
The current thread <Thread ID 1> has terminated. See `help thread'.
(gdb) PASS: gdb.threads/no-unwaited-for-left.exp: only thread 3 left, main thread terminated
Somehow, above, GDB re-added the zombie leader back before printing
"No unwaited-for children left.". The "only thread 3 left, main
thread terminated" test should have caught this, but didn't. That is
because the test's regexp has a ".*" after the part that matches
thread 3. This commit tightens that regexp to catch such a bug. It
also tightens the "only main thread left, thread 2 terminated" test's
regexp in the same way.
Carl Love [Wed, 13 Jul 2022 15:09:33 +0000 (15:09 +0000)]
Fix gdb.base/step-indirect-call-thunk.exp
Due to recent changes in the default value of -fcf-protection for gcc, the
test gdb.base/step-indirect-call-thunk.exp fails on Intel X86-64 with the
error:
Executing on host: gcc -fno-stack-protector -fdiagnostics-color=never
-mindirect-branch=thunk -mfunction-return=thunk -c -g
-o /.../gdb/testsuite/outputs/gdb.base/step-indirect-call-thunk/step-indirect-call-thunk0.o
/.../gdb/testsuite/gdb.base/step-indirect-call-thunk.c
(timeout = 300) builtin_spawn -ignore SIGHUP gcc -fno-stack-protector
-fdiagnostics-color=never -mindirect-branch=thunk -mfunction-return=thunk -c
-g -o /.../gdb/testsuite/outputs/gdb.base/step-indirect-call-thunk/step-indirect-call-thunk0.o
/.../binutils-gdb-current/gdb/testsuite/gdb.base/step-indirect-call-thunk.c
/.../gdb/testsuite/gdb.base/step-indirect-call-thunk.c:
In function 'inc': /.../gdb/testsuite/gdb.base/step-indirect-call-thunk.c:
22:1: error: '-mindirect-branch' and '-fcf-protection' are not compatible
22 | { /* inc.1 */
As stated in the error message the default "-fcf-protection" and
"-mindirect-branch' are in compatible. The fcf-protection argument needs
to be "-fcf-protection=none" for the test to compile on Intel.
The gcc command line "-mindirect-branch' is an Intel specific and will give
an error on other platforms. A check for X86 is added so the test will
only run on X86 platforms.
The patch has been tested and verified on Power 10 and Intel X86-64 systems
with no regressions.
Pedro Alves [Tue, 12 Jul 2022 21:14:37 +0000 (22:14 +0100)]
Fix "until LINE" in main, when "until" runs into longjmp
With a test like this:
1 #include <dlfcn.h>
2 int
3 main ()
4 {
5 dlsym (RTLD_DEFAULT, "FOO");
6 return 0;
7 }
and then "start" followed by "until 6", GDB currently incorrectly
stops inside the runtime loader, instead of line 6. Vis:
...
Temporary breakpoint 1, main () at until.c:5
4 {
(gdb) until 6
0x00007ffff7f0a90d in __GI__dl_catch_exception (exception=exception@entry=0x7fffffffdb00, operate=<optimized out>, args=0x7ffff7f0a90d <__GI__dl_catch_exception+109>) at dl-error-skeleton.c:206
206 dl-error-skeleton.c: No such file or directory.
(gdb)
The problem is related to longjmp handling -- dlsym internally
longjmps on error. The testcase can be reduced to this:
1 #include <setjmp.h>
2 void func () {
3 jmp_buf buf;
4 if (setjmp (buf) == 0)
5 longjmp (buf, 1);
6 }
7
8 int main () {
9 func ();
10 return 0; /* until to here */
11 }
and then with "start" followed by "until 10", GDB currently
incorrectly stops at line 4 (returning from setjmp), instead of line
10.
The problem is that the BPSTAT_WHAT_CLEAR_LONGJMP_RESUME code in
infrun.c fails to find the initiating frame, and so infrun thinks that
the longjmp jumped somewhere outer to "until"'s originating frame.
Here:
case BPSTAT_WHAT_CLEAR_LONGJMP_RESUME:
{
struct frame_info *init_frame;
/* There are several cases to consider.
1. The initiating frame no longer exists. In this case we
must stop, because the exception or longjmp has gone too
far.
/* For Cases 1 and 2, remove the step-resume breakpoint, if it
exists. */
delete_step_resume_breakpoint (ecs->event_thread);
end_stepping_range (ecs); // case 1., so we stop.
}
The initiating frame is set by until_break_command ->
set_longjmp_breakpoint. The initiating frame is supposed to be the
frame that is selected when the command was issued, but
until_break_command instead passes the frame id of the _caller_ frame
by mistake. When the "until LINE" command is issued from main, the
caller frame is the caller of main. When later infrun tries to find
that frame by id, it fails to find it, because frame_find_by_id
doesn't unwind past main.
The bug is that we passed the caller frame's id to
set_longjmp_breakpoint. We should have passed the selected frame's id
instead.
Enze Li [Mon, 11 Jul 2022 12:53:48 +0000 (20:53 +0800)]
gdbserver: remove unused variable
When building with clang 15, I got this error:
CXX server.o
server.cc:2985:10: error: variable 'new_argc' set but not used [-Werror,-Wunused-but-set-variable]
int i, new_argc;
^
Remove the unused variable to eliminate the error.
Tested by rebuilding on x86_64-linux with clang 15.
Tom de Vries [Wed, 13 Jul 2022 10:20:53 +0000 (12:20 +0200)]
[gdb/symtab] Make per_cu->set_lang more strict
We have in per_cu->set_lang this comment:
...
void set_lang (enum language lang)
{
/* We'd like to be more strict here, similar to what is done in
set_unit_type, but currently a partial unit can go from unknown to
minimal to ada to c. */
...
Fix this by not setting m_lang for partial units.
This requires us to move the m_unit_type initialization to ensure that
m_unit_type is initialized before per_cu->m_lang.
Tested on x86_64-linux, with native and target board cc-with-dwz-m.
Tom de Vries [Tue, 12 Jul 2022 15:12:17 +0000 (17:12 +0200)]
[gdb/symtab] Add dwarf2_cu::lang ()
The cu->per_cu->lang field was added to carry information from the initial
partial symtabs phase to the symtab expansion phase, for the benefit of a
particular optimization in process_imported_unit_die.
Other uses have been added, but since the first phase now has been
parallelized, those have become problematic and sources of race conditions.
Fix this by adding dwarf2_cu::lang () and using it where we can to replace
cu->per_cu->lang () with cu->lang ().
Also assert in dwarf2_cu::lang () that we're not returning language_unknown.
Tom de Vries [Tue, 12 Jul 2022 11:58:31 +0000 (13:58 +0200)]
[gdb/testsuite] Run two test-cases with ASAN_OPTIONS=verify_asan_link_order=0
When building gdb with -fsanitize=address we run into:
...
builtin_spawn gdb -nw -nx -iex set height 0 -iex set width 0 -data-directory \
build/gdb/data-directory^M
==10637==ASan runtime does not come first in initial library list; you \
should either link runtime to your application or manually preload it with \
LD_PRELOAD.^M
ERROR: GDB process no longer exists
...
Prevent the ASan runtime error by using
ASAN_OPTIONS=verify_asan_link_order=0. This makes both test-cases pass.
Tom de Vries [Tue, 12 Jul 2022 11:36:57 +0000 (13:36 +0200)]
[gdb/build] Fix build with gcc 4.8.5
When building gdb with gcc 4.8.5, we run into problems due to unconditionally
using:
...
gdb_static_assert (std::is_trivially_copyable<packed>::value);
...
in gdbsupport/packed.h.
Fix this by guarding the usage with HAVE_IS_TRIVIALLY_COPYABLE.
Tom de Vries [Tue, 12 Jul 2022 10:20:13 +0000 (12:20 +0200)]
Fix -fsanitize=thread for per_cu fields
When building gdb with -fsanitize=thread and gcc 12, and running test-case
gdb.dwarf2/dwz.exp, we run into a data race between:
...
Read of size 1 at 0x7b200000300d by thread T2:^M
#0 cutu_reader::cutu_reader(dwarf2_per_cu_data*, dwarf2_per_objfile*, \
abbrev_table*, dwarf2_cu*, bool, abbrev_cache*) gdb/dwarf2/read.c:6164 \
(gdb+0x82ec95)^M
...
and:
...
Previous write of size 1 at 0x7b200000300d by main thread:^M
#0 prepare_one_comp_unit gdb/dwarf2/read.c:23588 (gdb+0x86f973)^M
...
In other words, between:
...
if (this_cu->reading_dwo_directly)
...
and:
...
cu->per_cu->lang = pretend_language;
...
Likewise, we run into a data race between:
...
Write of size 1 at 0x7b200000300e by thread T4:
#0 process_psymtab_comp_unit gdb/dwarf2/read.c:6789 (gdb+0x830720)
...
and:
...
Previous read of size 1 at 0x7b200000300e by main thread:
#0 cutu_reader::cutu_reader(dwarf2_per_cu_data*, dwarf2_per_objfile*, \
abbrev_table*, dwarf2_cu*, bool, abbrev_cache*) gdb/dwarf2/read.c:6164 \
(gdb+0x82edab)
...
In other words, between:
...
this_cu->unit_type = DW_UT_partial;
...
and:
...
if (this_cu->reading_dwo_directly)
...
Likewise for the write to addresses_seen in cooked_indexer::check_bounds and a
read from is_dwz in dwarf2_find_containing_comp_unit for test-case
gdb.dwarf2/dw2-dir-file-name.exp and target board cc-with-dwz-m.
The problem is that the written fields are part of the same memory location as
the read fields, so executing a read and write in different threads is
undefined behavour.
Making the written fields separate memory locations, using the new
struct packed template fixes this.
The set of fields has been established experimentally to be the
minimal set to get rid of this type of -fsanitize=thread errors, but
more fields might require the same treatment.
Looking at the properties of the lang field, unlike dwarf_version it's
not available in the unit header, so it will be set the first time
during the parallel cooked index reading. The same holds for
unit_type, and likewise for addresses_seen.
dwarf2_per_cu_data::addresses_seen is moved so that the bitfields that
currently follow it can be merged in the same memory location as the
bitfields that currently precede it, for better packing.
Tested on x86_64-linux.
Co-Authored-By: Pedro Alves <pedro@palves.net>
Change-Id: Ifa94f0a2cebfae5e8f6ddc73265f05e7fd9e1532
Pedro Alves [Tue, 12 Jul 2022 10:20:13 +0000 (12:20 +0200)]
Introduce struct packed template
When building gdb with -fsanitize=thread and gcc 12, and running test-case
gdb.dwarf2/dwz.exp, we run into a few data races. For example, between:
...
Write of size 1 at 0x7b200000300e by thread T4:
#0 process_psymtab_comp_unit gdb/dwarf2/read.c:6789 (gdb+0x830720)
...
and:
...
Previous read of size 1 at 0x7b200000300e by main thread:
#0 cutu_reader::cutu_reader(dwarf2_per_cu_data*, dwarf2_per_objfile*, \
abbrev_table*, dwarf2_cu*, bool, abbrev_cache*) gdb/dwarf2/read.c:6164 \
(gdb+0x82edab)
...
In other words, between:
...
this_cu->unit_type = DW_UT_partial;
...
and:
...
if (this_cu->reading_dwo_directly)
...
The problem is that the written fields are part of the same memory
location as the read fields, so executing a read and write in
different threads is undefined behavour.
Making the written fields separate memory locations, like this:
fixes it, however that also increases the size of struct
dwarf2_per_cu_data, because it introduces padding due to alignment of
these new structs, which align on the natural alignment of the
specified type of their fields. We can fix that with
__attribute__((packed)), like so:
but to avoid having to write that in several places and add suitable
comments explaining how that concoction works, introduce a new struct
packed template that wraps/hides this. Instead of the above, we'll be
able to write:
packed<dwarf_unit_type, 1> unit_type;
Note that we can't change the type of dwarf_unit_type, as that is
defined in include/, and shared with other projects, some of those
written in C.
This patch just adds the struct packed type. Following patches will
make use of it. One of those patches will want to wrap a struct
packed in an std::atomic, like:
std::atomic<std::packed<language, 1>> m_lang;
so the new gdbsupport/packed.h header adds some operators to make
comparisions between that std::atomic and the type that the wrapped
struct packed wraps work, like in:
if (m_lang == language_c)
It would be possible to implement struct packed without using
__attribute__((packed)), by having it store an array of bytes of the
appropriate size instead, however that would make it less convenient
to debug GDB. The way it's implemented, printing a struct packed
variable just prints its field using its natural type, which is
particularly useful if the type is an enum. I believe that
__attribute__((packed)) is supported by all compilers that are able to
build GDB. Even a few BFD headers use on ATTRIBUTE_PACKED on external
types:
It is not possible to build GDB with MSVC today, but if it could, that
would be one compiler that doesn't support this attribute. However,
it supports packing via pragmas, so there's a way to cross that bridge
if we ever get to it. I believe any compiler worth its salt supports
some way of packing.
In any case, the worse that happens without the attribute is that some
types become larger than ideal. Regardless, I've added a couple
static assertions to catch such compilers in action:
/* Ensure size and aligment are what we expect. */
gdb_static_assert (sizeof (packed) == Bytes);
gdb_static_assert (alignof (packed) == 1);
Alan Modra [Tue, 12 Jul 2022 01:21:52 +0000 (10:51 +0930)]
PR29355, ld segfaults with -r/-q and custom-named section .rela*
The bug testcase uses an output section named .rel or .rela which has
input .data sections mapped to it. The input .data section has
relocations. When counting output relocations SHT_REL and SHT_RELA
section reloc_count is ignored, with the justification that reloc
sections themselves can't have relocations and some backends use
reloc_count in reloc sections. However, the test wrongly used the
output section type (which normally would match input section type).
Fix that. Note that it is arguably wrong for ld to leave the output
.rel/.rela section type as SHT_REL/SHT_RELA when non-empty non-reloc
sections are written to it, but I'm not going to change that since it
might be useful to hand-craft relocs in a data section that is then
written to a SHT_REL/SHT_RELA output section.
PR 29355
* elflink.c (bfd_elf_final_link): Use input section type rather
than output section type to determine whether to exclude using
reloc_count from that section.
Jiangshuai Li [Tue, 12 Jul 2022 01:54:58 +0000 (09:54 +0800)]
gdb/csky complete csky_dwarf_reg_to_regnum
For csky arch, the correspondence between Dwarf registers and GDB
registers are as follows:
dwarf regnos 0~31 ==> gdb regs r0~r31
dwarf regno CSKY_HI_REGNUM(36) ==> gdb reg hi
dwarf regno CSKY_LO_REGNUM(37) ==> gdb reg hi
dwarf regno CSKY_PC_REGNUM(72) ==> gdb reg pc
dwarf regnos FV_PSEUDO_REGNO_FIRST(74)~FV_PSEUDO_REGNO_LAST(201)
==>
gdb regs s0~s127 (pseudo regs for float and vector regs)
other dwarf regnos have no corresponding gdb regs to them.
Pedro Alves [Wed, 22 Jun 2022 17:10:00 +0000 (18:10 +0100)]
Always emit =thread-exited notifications, even if silent
[Note: the testcased added by this commit depends on
https://sourceware.org/pipermail/gdb-patches/2022-June/190259.html,
otherwise GDB just crashes when detaching the core]
Currently, in MI, =thread-created are always emitted, like:
but on teardown, if the target uses exit_inferior_silent, then you'll
only see the inferior exit notification (thread-group-exited), no
notification for threads.
The core target is one of the few targets that use
exit_inferior_silent. Here's an example session:
This imbalance of emitting =thread-created but then not =thread-exited
seems off to me. (And, it complicates changes I want to do to
centralize emitting thread exit notifications for the CLI, which is
why I'm looking at this.)
And then, since most other targets use exit_inferior instead of
exit_inferior_silent, MI is already emitting =thread-exited
notifications when tearing down an inferior, for most targets.
This commit makes MI always emit the =thread-exited notifications,
even for exit_inferior_silent.
Afterwards, when debugging a core, MI outputs:
(gdb)
-target-detach
=thread-exited,id="1",group-id="i1" << new line
=thread-group-exited,id="i1"
^done
(gdb)
Surprisingly, there's no MI testcase debugging a core file. This
commit adds the first.
After loading a core file, you're supposed to be able to use "detach"
to unload the core file. That unfortunately regressed starting with
GDB 11, with these commits:
1192f124a308 - gdb: generalize commit_resume, avoid commit-resuming when threads have pending statuses 408f66864a1a - detach in all-stop with threads running
resulting in a GDB crash:
...
Thread 1 "gdb" received signal SIGSEGV, Segmentation fault.
0x0000555555e842bf in maybe_set_commit_resumed_all_targets () at ../../src/gdb/infrun.c:2899
2899 if (proc_target->commit_resumed_state)
(top-gdb) bt
#0 0x0000555555e842bf in maybe_set_commit_resumed_all_targets () at ../../src/gdb/infrun.c:2899
#1 0x0000555555e848bf in scoped_disable_commit_resumed::reset (this=0x7fffffffd440) at ../../src/gdb/infrun.c:3023
#2 0x0000555555e84a0c in scoped_disable_commit_resumed::reset_and_commit (this=0x7fffffffd440) at ../../src/gdb/infrun.c:3049
#3 0x0000555555e739cd in detach_command (args=0x0, from_tty=1) at ../../src/gdb/infcmd.c:2791
#4 0x0000555555c0ba46 in do_simple_func (args=0x0, from_tty=1, c=0x55555662a600) at ../../src/gdb/cli/cli-decode.c:95
#5 0x0000555555c112b0 in cmd_func (cmd=0x55555662a600, args=0x0, from_tty=1) at ../../src/gdb/cli/cli-decode.c:2514
#6 0x0000555556173b1f in execute_command (p=0x5555565c5916 "", from_tty=1) at ../../src/gdb/top.c:699
if (proc_target->commit_resumed_state)
^^^^^^^^^^^
With 'proc_target' above being null. all_non_exited_inferiors filters
out inferiors that have pid==0. We get here at the end of
detach_command, after core_target::detach has already run, at which
point the inferior _should_ have pid==0 and no process target. It is
clear it no longer has a process target, but, it still has a pid!=0
somehow.
The reason the inferior still has pid!=0, is that core_target::detach
just unpushes, and relies on core_target::close to actually do the
getting rid of the core and exiting the inferior. The problem with
that is that detach_command grabs an extra strong reference to the
process stratum target, so the unpush_target inside
core_target::detach doesn't actually result in a call to
core_target::close.
Fix this my moving the cleaning up the core inferior to a shared
routine called by both core_target::close and core_target::detach. We
still need to cleanup the inferior from within core_file::close
because there are paths to it that want to get rid of the core without
going through detach. E.g., "core-file" -> "run".
This commit includes a new test added to gdb.base/corefile.exp to
cover the "core-file core" -> "detach" scenario.
Pedro Alves [Mon, 11 Jul 2022 15:05:00 +0000 (16:05 +0100)]
Fix non-existent "@var{thread-id}" in stop reply descriptions
In the description of stop replies, where the "thread" register is
explained, and where the fork and vfork stop reasons are detailed,
there are references to "@var{thread-id}", but such variable does not
actually exist. This commit fixes it.
Luis Machado [Thu, 30 Jun 2022 14:23:56 +0000 (15:23 +0100)]
Try a couple PAuth compilation flags for gdb.arch/aarch64-pauth.exp
The -msign-return-address switch has been dropped from GCC, but some
older compiler may still support it. Make sure we try both
-msign-return-address and -mbranch-protection before bailing out when
running gdb.arch/aarch64-pauth.exp.
Andrew Burgess [Mon, 14 Feb 2022 14:40:52 +0000 (14:40 +0000)]
gdb: add support for disassembler styling using libopcodes
This commit extends GDB to make use of libopcodes styling support
where available, currently this is just i386 based architectures, and
RISC-V.
For architectures that don't support styling using libopcodes GDB will
fall back to using the Python Pygments package, when the package is
available.
The new libopcodes based styling has the disassembler identify parts
of the disassembled instruction, e.g. registers, immediates,
mnemonics, etc, and can style these components differently.
Additionally, as the styling is now done in GDB we can add settings to
allow the user to configure which colours are used right from the GDB
CLI.
There's some new maintenance commands:
maintenance set libopcodes-styling enabled on|off
maintenance show libopcodes-styling
These can be used to manually disable use of libopcodes styling. This
is a maintenance command as it's not anticipated that a user should
need to do this. But, this could be useful for testing, or, in some
rare cases, a user might want to override the Python hook used for
disassembler styling, and then disable libopcode styling so that GDB
falls back to using Python. Right now I would consider this second
use case a rare situation, which is why I think a maintenance command
is appropriate.
When libopcodes is being used for styling then the user can make use
of the following new styles:
The disassembler also makes use of the 'address' and 'function'
styles to style some parts of the disassembler output. I have also
added the following aliases though:
set/show style disassembler address
set/show style disassembler symbol
these are aliases for:
set/show style address
set/show style function
respectively, and exist to make it easier for users to discover
disassembler related style settings. The 'address' style is used to
style numeric addresses in the disassembler output, while the 'symbol'
or 'function' style is used to style the names of symbols in
disassembler output.
As not every architecture supports libopcodes styling, the maintenance
setting 'libopcodes-styling enabled' has an "auto-off" type behaviour.
Consider this GDB session:
(gdb) show architecture
The target architecture is set to "auto" (currently "i386:x86-64").
(gdb) maintenance show libopcodes-styling enabled
Use of libopcodes styling support is "on".
the setting defaults to "on" for architectures that support libopcodes
based styling.
(gdb) set architecture sparc
The target architecture is set to "sparc".
(gdb) maintenance show libopcodes-styling enabled
Use of libopcodes styling support is "off" (not supported on architecture "sparc")
the setting will show as "off" if the user switches to an architecture
that doesn't support libopcodes styling. The underlying setting is
still "on" at this point though, if the user switches back to
i386:x86-64 then the setting would go back to being "on".
(gdb) maintenance set libopcodes-styling enabled off
(gdb) maintenance show libopcodes-styling enabled
Use of libopcodes styling support is "off".
now the setting is "off" for everyone, even if the user switches back
to i386:x86-64 the setting will still show as "off".
(gdb) maintenance set libopcodes-styling enabled on
Use of libopcodes styling not supported on architecture "sparc".
(gdb) maintenance show libopcodes-styling enabled
Use of libopcodes styling support is "off".
attempting to switch the setting "on" for an unsupported architecture
will give an error, and the setting will remain "off".
(gdb) set architecture auto
The target architecture is set to "auto" (currently "i386:x86-64").
(gdb) maintenance show libopcodes-styling enabled
Use of libopcodes styling support is "off".
(gdb) maintenance set libopcodes-styling enabled on
(gdb) maintenance show libopcodes-styling enabled
Use of libopcodes styling support is "on".
the user will need to switch back to a supported architecture before
they can one again turn this setting "on".
Andrew Burgess [Tue, 5 Apr 2022 10:06:16 +0000 (11:06 +0100)]
gdb: have gdb_disassemble_info carry 'this' in its stream pointer
The gdb_disassemble_info class is a wrapper around the libopcodes
disassemble_info struct. The 'stream' field of disassemble_info is
passed as an argument to the fprintf_func and fprintf_styled_func
callbacks when the disassembler wants to print anything.
Previously, GDB would store a pointer to a ui_file object in the
'stream' field, then, when the disassembler wanted to print anything,
the content would be written to the ui_file object. An example of an
fprintf_func callback, from gdb/disasm.c is:
int
gdb_disassembler::dis_asm_fprintf (void *stream, const char *format, ...)
{
/* Write output to STREAM here. */
}
This is fine, but has one limitation, within the print callbacks we
only have access to STREAM, we can't access any additional state
stored within the gdb_disassemble_info object.
Right now this isn't a problem, but in a future commit this will
become an issue, how we style the output being written to STREAM will
depend on the state of the gdb_disassemble_info object, and this state
might need to be updated, depending on what is being printed.
In this commit I propose changing the 'stream' field of the
disassemble_info to carry a pointer to the gdb_disassemble_info
sub-class, rather than the stream itself.
We then have the two sub-classes of gdb_disassemble_info to consider,
the gdb_non_printing_disassembler class never cared about the stream,
previously, for this class, the stream was nullptr. With the change
to make stream be a gdb_disassemble_info pointer, no further updates
are needed for gdb_non_printing_disassembler.
The other sub-class is gdb_printing_disassembler. In this case the
sub-class now carries around a pointer to the stream object. The
print callbacks are updated to cast the incoming stream object back to
a gdb_printing_disassembler, and then extract the stream.
This is purely a refactoring commit. A later commit will add
additional state to the gdb_printing_disassembler, and update the
print callbacks to access this state.
There should be no user visible changes after this commit.
Tom de Vries [Mon, 11 Jul 2022 09:36:54 +0000 (11:36 +0200)]
[gdb/symtab] Fix data race in per_cu->length
With gdb build with -fsanitize=thread and test-case
gdb.dwarf2/dw4-sig-types.exp and target board cc-with-dwz-m we run into a data
race between:
...
Write of size 4 at 0x7b2800002268 by thread T4:^M
#0 cutu_reader::cutu_reader(dwarf2_per_cu_data*, dwarf2_per_objfile*, \
abbrev_table*, dwarf2_cu*, bool, abbrev_cache*) gdb/dwarf2/read.c:6236 \
(gdb+0x82f525)^M
...
and this read:
...
Previous read of size 4 at 0x7b2800002268 by thread T1:^M
#0 dwarf2_find_containing_comp_unit gdb/dwarf2/read.c:23444 \
(gdb+0x86e22e)^M
...
In other words, between this write:
...
this_cu->length = cu->header.get_length ();
...
and this read:
...
&& mid_cu->sect_off + mid_cu->length > sect_off))
...
of per_cu->length.
Fix this similar to the per_cu->dwarf_version case, by only setting it if
needed, and otherwise verifying that the same value is used. [ Note that the
same code is already present in the other cutu_reader constructor. ]
Move this logic into into a member function set_length to make sure it's used
consistenly, and make the field private in order to enforce access through the
member functions, and rename it to m_length.
This exposes (running test-case gdb.dwarf2/fission-reread.exp) that in
fill_in_sig_entry_from_dwo_entry, the m_length field is overwritten.
For now, allow for that exception.
While we're at it, make sure that the length is set before read.
Tom de Vries [Mon, 11 Jul 2022 09:36:54 +0000 (11:36 +0200)]
[gdb/symtab] Use comp_unit_head::get_length
There's a spot in read_comp_units_from_section where we explictly use
initial_length_size to get the total length:
...
this_cu->length = cu_header.length + cu_header.initial_length_size;
...