This is already how the parent is reported in the vfork/fork events,
and is actually what the fix made gdbserver do. Following the
documentation change, the event would have been reported like this
instead:
Pedro Alves [Tue, 15 Sep 2015 16:45:26 +0000 (17:45 +0100)]
PR remote/18965: vforkdone stop reply should indicate parent PID
The vforkdone stop reply misses indicating the thread ID of the vfork
parent which the event relates to:
@cindex vfork events, remote reply
@item vfork
The packet indicates that @code{vfork} was called, and @var{r}
is the thread ID of the new child process. Refer to
@ref{thread-id syntax} for the format of the @var{thread-id}
field. This packet is only applicable to targets that support
vfork events.
@cindex vforkdone events, remote reply
@item vforkdone
The packet indicates that a child process created by a vfork
has either called @code{exec} or terminated, so that the
address spaces of the parent and child process are no longer
shared. The @var{r} part is ignored. This packet is only
applicable to targets that support vforkdone events.
Unfortunately, this is not just a documentation issue. GDBserver
is really not specifying the thread ID. I noticed because
in non-stop mode, gdb complains:
[Thread 6089.6089] #1 stopped.
#0 0x0000003615a011f0 in ?? ()
0x0000003615a011f0 in ?? ()
(gdb) set debug remote 1
(gdb) c
Continuing.
Sending packet: $QPassSignals:e;10;14;17;1a;1b;1c;21;24;25;2c;4c;#5f...Packet received: OK
Sending packet: $vCont;c:p17c9.17c9#88...Packet received: OK
Notification received: Stop:T05vfork:p17ce.17ce;06:40d7ffffff7f0000;07:30d7ffffff7f0000;10:e4c9eb1536000000;thread:p17c9.17c9;core:2;
Sending packet: $vStopped#55...Packet received: OK
Sending packet: $D;17ce#af...Packet received: OK
Sending packet: $vCont;c:p17c9.17c9#88...Packet received: OK
Notification received: Stop:T05vforkdone:;
No process or thread specified in stop reply: T05vforkdone:;
(gdb)
This is not non-stop-mode-specific, however. Consider e.g., that in
all-stop, you may be debugging more than one process at the same time.
You continue, and both processes vfork. So when you next get a
T05vforkdone, there's no way to tell which of the parent processes is
done with the vfork.
Tests will be added later.
Tested on x86_64 Fedora 20.
gdb/gdbserver/ChangeLog:
2015-09-15 Pedro Alves <palves@redhat.com>
PR remote/18965
* remote-utils.c (prepare_resume_reply): Merge
TARGET_WAITKIND_VFORK_DONE switch case with the
TARGET_WAITKIND_FORKED case.
gdb/doc/ChangeLog:
2015-09-15 Pedro Alves <palves@redhat.com>
PR remote/18965
* gdb.texinfo (Stop Reply Packets): Explain that vforkdone's 'r'
part indicates the thread ID of the parent process.
Gary Benson [Mon, 14 Sep 2015 10:02:06 +0000 (11:02 +0100)]
Fix build issue with nat/linux-namespaces.c
This commit fixes a build issue on systems with a prototype for setns
in their header files but no working setns is detected by configure.
gdb/ChangeLog:
PR gdb/18957
* nat/linux-namespaces.c (setns): Rename from this ...
(do_setns): ... to this. Support calling setns if it exists.
(mnsh_handle_setns): Call do_setns.
Jan Kratochvil [Tue, 4 Aug 2015 11:40:44 +0000 (13:40 +0200)]
ASAN attach crash - 7.9 regression
-fsanitize=address
gdb.base/attach-pie-noexec.exp
==32586==ERROR: AddressSanitizer: heap-use-after-free on address 0x60200004ed90 at pc 0x48ad50 bp 0x7ffceb3aef50 sp 0x7ffceb3aef20
READ of size 2 at 0x60200004ed90 thread T0
#0 0x48ad4f in __interceptor_strlen (/home/jkratoch/redhat/gdb-test-asan/gdb/gdb+0x48ad4f)
#1 0xeafe5c in xstrdup xstrdup.c:33
#2 0x85e024 in attach_command /home/jkratoch/redhat/gdb-test-asan/gdb/infcmd.c:2680
regressed by:
commit 6c4486e63f7583ed85a0c72841f6ccceebbf858e
Author: Pedro Alves <palves@redhat.com>
Date: Fri Oct 17 13:31:26 2014 +0100
PR gdb/17471: Repeating a background command makes it foreground
gdb/ChangeLog
2015-08-04 Jan Kratochvil <jan.kratochvil@redhat.com>
PR gdb/18767
* infcmd.c (attach_command): Move ARGS_CHAIN cleanup after last ARGS
use.
Pedro Alves [Tue, 25 Aug 2015 15:19:56 +0000 (16:19 +0100)]
remote: allow aborting long operations (e.g., file transfers)
Currently, when remote debugging, if you type Ctrl-C just while the
target stopped for an internal event, and GDB is busy doing something
that takes a while (e.g., fetching chunks of a shared library off of
the target, with vFile, to process ELF headers and debug info), the
Ctrl-C is lost.
The patch hooks up the QUIT macro to a new target method that lets the
target react to the double-Ctrl-C before the event loop is reached,
which allows reacting to a double-Ctrl-C even when GDB is busy doing
some long operation and not waiting for a stop reply. That end result
is:
(gdb) c
Continuing.
^C
^C
Interrupted while waiting for the program.
Give up waiting? (y or n) y
Quit
(gdb) info threads
Id Target Id Frame
* 1 Thread 11673 0x00007ffff7deb240 in _dl_debug_state () from target:/lib64/ld-linux-x86-64.so.2
(gdb)
If, however, GDB is waiting for a stop reply (because the target has
been resumed, with e.g., vCont;c), but the target isn't responding, we
now get:
(gdb) c
Continuing.
^C
^C
The target is not responding to interrupt requests.
Stop debugging it? (y or n) y
Disconnected from target.
(gdb) info threads
No threads.
This offers to disconnect, because when we're waiting for a stop
reply, there's nothing else we can send the target other than an
interrupt request. And if that doesn't work, there's nothing else we
can do.
The Ctrl-C is presently lost because until we get to a user-visible
stop, the SIGINT handler that is installed is the one that forwards
the interrupt to the remote side, with the \003 "packet" [1]. But,
gdbserver ignores an interrupt request if the program is stopped.
Still, even if it didn't, the server can only report back a
stop-because-of-SIGINT when the program is next resumed. And it may
take a while to actually re-resume the target.
[1] - In the old sync days, the remote target would react to a
double-Ctrl-C by asking users whether they wanted to give up waiting
and disconnect. The code is still there, but it it isn't reacheable
on most hosts, which support serial connections in async mode
(probably only DJGPP doesn't). Even then, in sync mode, remote.c's
SIGINT handler is only installed while the target is resumed, and is
removed as soon as the target sends back a stop reply. That means
that a Ctrl-C just while GDB is processing an internal event can end
up with an odd "Quit" at the prompt instead of "Program stopped by
SIGINT". In contrast, in async mode, remote.c's SIGINT handler is set
up as long as target_terminal_inferior or
target_terminal_ours_for_output are in effect (IOW, until we get a
user-visible stop and call target_terminal_ours), so the user
shouldn't get back a spurious Quit. However, it's still desirable to
be able to interrupt a long-running GDB operation, if GDB takes a
while to re-resume the target or get back to the event loop.
Tested on x86_64 Fedora 20.
gdb/ChangeLog:
2015-08-24 Pedro Alves <palves@redhat.com>
PR gdb/18804
* defs.h (maybe_quit): Declare.
(QUIT): Now calls maybe_quit.
* event-loop.c (clear_async_signal_handler)
(async_signal_handler_is_marked): New functions.
* event-loop.h (async_signal_handler_is_marked)
(clear_async_signal_handler): New declarations.
* remote.c (remote_check_pending_interrupt): New function.
(interrupt_query): Use make_cleanup_restore_target_terminal. No
longer check whether the target is async. If waiting for a stop
reply, and a Ctrl-C as been sent to the target, offer to
disconnect, and throw TARGET_CLOSE_ERROR instead of a quit.
Otherwise do not disconnect and throw a quit.
(_initialize_remote): Install remote_check_pending_interrupt as
to_check_pending_interrupt.
* target.c (target_check_pending_interrupt): New function.
* target.h (struct target_ops) <to_check_pending_interrupt>: New
field.
(target_check_pending_interrupt): New declaration.
* utils.c (maybe_quit): New function.
* target-delegates.c: Regenerate.
Gary Benson [Fri, 21 Aug 2015 16:09:20 +0000 (17:09 +0100)]
Warn when accessing binaries from remote targets
GDB provides no indicator of progress during file operations, and can
appear to have locked up during slow remote transfers. This commit
updates GDB to print a warning each time a file is accessed over RSP.
An additional message detailing how to avoid remote transfers is
printed for the first transfer only.
gdb/ChangeLog:
* target.h (struct target_ops) <to_fileio_open>: New argument
warn_if_slow. Update comment. All implementations updated.
(target_fileio_open_warn_if_slow): New declaration.
* target.c (target_fileio_open): Renamed as...
(target_fileio_open_1): ...this. New argument warn_if_slow.
Pass warn_if_slow to implementation. Update debug printing.
(target_fileio_open): New function.
(target_fileio_open_warn_if_slow): Likewise.
* gdb_bfd.c (gdb_bfd_iovec_fileio_open): Use new function
target_fileio_open_warn_if_slow.
gdb/testsuite/ChangeLog:
* gdb.trace/pending.exp: Cope with remote transfer warnings.
Pedro Alves [Fri, 21 Aug 2015 09:25:43 +0000 (10:25 +0100)]
Add readahead cache to gdb's vFile:pread
This patch almost halves the time it takes to "target remote + run to
main" on a higher-latency connection.
E.g., I've got a ping time of ~85ms to an x86-64 machine on the gcc
compile farm (almost 2000km away from me), and I'm behind a ~16Mbit
ADSL. When I connect to a gdbserver debugging itself on that machine
and run to main, it takes almost 55 seconds:
Pristine gdb 7.10.50.20150820-cvs gets us:
...
Remote debugging using :9999
Reading symbols from target:/home/palves/gdb/build/gdb/gdbserver/gdbserver...done.
Reading symbols from target:/lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
0x00007ffff7ddd190 in ?? () from target:/lib64/ld-linux-x86-64.so.2
Breakpoint 1 at 0x41200c: file ../../../src/gdb/gdbserver/server.c, line 3635.
Continuing.
Breakpoint 1, main (argc=1, argv=0x7fffffffe3d8) at ../../../src/gdb/gdbserver/server.c:3635
3635 ../../../src/gdb/gdbserver/server.c: No such file or directory.
/home/palves/gdb/build/gdb/gdbserver/gdbserver: No such file or directory.
real 0m54.803s
user 0m0.329s
sys 0m0.064s
While with the readahead cache added by this patch, it drops to:
real 0m29.462s
user 0m0.454s
sys 0m0.054s
I added a few counters to show cache hit/miss, and got:
readahead cache miss 142
readahead cache hit 310
Tested on x86_64 Fedora 20.
gdb/ChangeLog:
2015-08-21 Pedro Alves <palves@redhat.com>
* remote.c (struct readahead_cache): New.
(struct remote_state) <readahead_cache>: New field.
(remote_open_1): Invalidate the cache.
(readahead_cache_invalidate, readahead_cache_invalidate_fd): New
functions.
(remote_hostio_pwrite): Invalidate the readahead cache.
(remote_hostio_pread): Rename to ...
(remote_hostio_pread_vFile): ... this.
(remote_hostio_pread_from_cache): New function.
(remote_hostio_pread): Reimplement.
(remote_hostio_close): Invalidate the readahead cache.
Gary Benson [Wed, 19 Aug 2015 13:00:50 +0000 (14:00 +0100)]
Prelimit number of bytes to read in "vFile:pread:"
While handling "vFile:pread:" packets, gdbserver would read the
number of bytes requested regardless of whether this would fit
into the reply packet. gdbserver would then return a packet's
worth of data and discard the remainder. When accessing large
binaries GDB (via BFD) routinely makes large "vFile:pread:"
requests, resulting in gdbserver allocating large unnecessary
buffers and reading some portions of the file many times over.
This commit causes gdbserver to limit the number of bytes to be
read to a sensible maximum prior to allocating buffers and reading
data.
gdb/gdbserver/ChangeLog:
* hostio.c (handle_pread): Do not attempt to read more data
than hostio_reply_with_data can fit in a packet.
Yao Qi [Wed, 29 Jul 2015 11:43:10 +0000 (12:43 +0100)]
PR record/18691: Fix fails in solib-precsave.exp
We see the following regressions in testing on x86_64-linux,
reverse-step^M
Cannot access memory at address 0x2aaaaaed26c0^M
(gdb) FAIL: gdb.reverse/solib-precsave.exp: reverse-step into solib function one
when GDB reverse step into a function, GDB wants to skip prologue so
it requests TARGET_OBJECT_CODE_MEMORY to read some code memory in
memory_xfer_partial_1. However in dcache_read_memory_partial, the object
becomes TARGET_OBJECT_MEMORY
in reverse debugging, ops->to_xfer_partial is record_full_core_xfer_partial
and it will return TARGET_XFER_E_IO because it can't find any records.
The test fails.
At this moment, the delegate relationship is like
dcache -> record-core -> core -> exec
and we want to GDB read memory across targets, which means if the
requested memory isn't found in record-core, GDB can read memory from
core, and exec even further if needed. I find raw_memory_xfer_partial
is exactly what I want.
gdb:
2015-08-18 Yao Qi <yao.qi@linaro.org>
PR record/18691
* dcache.c (dcache_read_memory_partial): Call
raw_memory_xfer_partial.
* target.c (raw_memory_xfer_partial): Make it non-static.
* target.h (raw_memory_xfer_partial): Declare.
Keith Seitz [Thu, 13 Aug 2015 01:31:11 +0000 (18:31 -0700)]
gdb.base/dso2dso.exp sometimes broken
Keith reported that gdb.base/dso2dso.exp is broken, with the following
error:
| $ make check RUNTESTFLAGS=dso2dso.exp
| [snip]
| Running ../../../src/gdb/testsuite/gdb.base/dso2dso.exp ...
| ERROR: tcl error sourcing ../../../src/gdb/testsuite/gdb.base/dso2dso.exp.
| ERROR: couldn't open
| "../../../src/gdb/testsuite/gdb.base/../../../src/gdb/testsuite/gdb.base/dso2dso-dso1.c":
| no such file or directory
| while executing
| "error "$message""
| (procedure "gdb_get_line_number" line 14)
| invoked from within
| "gdb_get_line_number "STOP HERE" $srcfile_libdso1"
| (file "../../../src/gdb/testsuite/gdb.base/dso2dso.exp" line 60)
| invoked from within
| "source ../../../src/gdb/testsuite/gdb.base/dso2dso.exp"
| ("uplevel" body line 1)
| invoked from within
| "uplevel #0 source ../../../src/gdb/testsuite/gdb.base/dso2dso.exp"
| invoked from within
| "catch "uplevel #0 source $test_file_name""
This happens because gdb_get_line_number will prepend $srcdir/$subdir
if the given filename does not start with "/", and this happens when
GDB was configured using a relative path to the configure script.
When using an absolute path like I do, we avoid the pre-pending that
Keith is seeing.
gdb/testsuite/ChangeLog:
Keith Seitz <keiths@redhat.com>:
* gdb.base/dso2dso.exp: Pass basename of source file in call
to gdb_get_line_number.
Joel Brobecker [Wed, 12 Aug 2015 16:33:19 +0000 (09:33 -0700)]
[amd64] Invalid return address after displaced stepping
Making all-stop run on top of non-stop caused a small regression
in behavior. This was observed on x86_64-linux. The attached testcase
is in C whereas the investigation was done with an Ada program,
but it's the same scenario, and using a C testcase allows wider testing.
Basically: I am debugging a single-threaded program, and currently
stopped inside a function provided by a shared-library, at a line
calling a subprogram provided by a second shared library, and trying
to "next" over that function call.
Before we changed the default all-stop behavior, we had:
7 Impl_Initialize; -- Stop here and try "next" over this line
(gdb) n
8 return 5; <<-- OK
But now, "next" just stops much earlier:
(gdb) n
0x00007ffff7bd8560 in impl.initialize@plt () from /[...]/lib/libpck.so
What happens is that next stops at a call instruction, which calls
the function's PLT, and GDB fails to notice that the inferior stepped
into a subroutine, and so decides that we're done. We can see another
symptom of the same issue by looking at the backtrace at the point
GDB stopped:
(gdb) bt
#0 0x00007ffff7bd8560 in impl.initialize@plt ()
from /[...]/lib/libpck.so
#1 0x00000000f7bd86f9 in ?? ()
#2 0x00007fffffffdf50 in ?? ()
#3 0x0000000000401893 in a () at /[...]/a.adb:7
Backtrace stopped: frame did not save the PC
With a functioning GDB, the backtrace looks like the following instead:
#0 0x00007ffff7bd8560 in impl.initialize@plt ()
from /[...]/lib/libpck.so
#1 0x00007ffff7bd86f9 in sub () at /[...]/pck.adb:7
#2 0x0000000000401893 in a () at /[...]/a.adb:7
Note how, for frame #1, the address looks quite similar, except
for the high-order bits not being set:
#1 0x00007ffff7bd86f9 in sub () at /[...]/pck.adb:7 <<<-- OK
#1 0x00000000f7bd86f9 in ?? () <<<-- WRONG
^^^^
||||
Wrong
Investigating this further led me to displaced stepping.
As we are "next"-ing from a location where a breakpoint is inserted,
we need to step out of it, and since we're on non-stop mode, we need
to do it using displaced stepping. And looking at
amd64-tdep.c:amd64_displaced_step_fixup, I found the code that handles
the return address:
The mask used to compute retaddr looks wrong to me, keeping only
4 bytes instead of 8, and explains why the high order bits of
the backtrace are unset. What happens is that, after the displaced
stepping has completed, GDB restores that return address at the location
where the program expects it. But because the top half bits of
the address have been masked out, the return address is now invalid.
The incorrect behavior of the "next" command and the backtrace at
that location are the first symptoms of that. Another symptom is
that this actually alters the behavior of the program, where a "cont"
from there soon leads to a SEGV when the inferior tries to jump back
to that incorrect return address:
(gdb) c
Continuing.
Program received signal SIGSEGV, Segmentation fault.
0x00000000f7bd86f9 in ?? ()
^^^^^^^^^^^^^^^^^^
This patch fixes the issue by using a mask that seems more appropriate
for this architecture.
gdb/ChangeLog:
* amd64-tdep.c (amd64_displaced_step_fixup): Fix the mask used to
compute RETADDR.
gdb/testsuite/ChangeLog:
* gdb.base/dso2dso-dso2.c, gdb.base/dso2dso-dso2.h,
gdb.base/dso2dso-dso1.c, gdb.base/dso2dso-dso1.h, gdb.base/dso2dso.c,
gdb.base/dso2dso.exp: New files.