Tom Tromey [Thu, 22 Jun 2023 15:00:13 +0000 (09:00 -0600)]
Handle typedefs in no-op pretty printers
The no-op pretty-printers that were introduced for DAP have a classic
gdb bug: they neglect to call check_typedef. This will cause some
strange behavior; for example not showing the children of a variable
whose type is a typedef of a structure type. This patch fixes the
oversight.
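As a rough illustration of the bug class (not the actual GDB code), the
sketch below uses a made-up ToyType class to show why children go missing
when the typedef is not resolved first. All names here are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyType:
    """Hypothetical stand-in for a debugger type; not gdb.Type."""
    kind: str                           # "struct", "int" or "typedef"
    fields: list = field(default_factory=list)
    target: Optional["ToyType"] = None  # what a typedef refers to


def strip_typedefs(t):
    # Follow typedef links until a concrete type is reached.
    while t.kind == "typedef" and t.target is not None:
        t = t.target
    return t


def children_naive(t):
    # Buggy: only looks at the outermost type.
    return t.fields if t.kind == "struct" else []


def children_fixed(t):
    # Resolve typedefs first, then inspect the real type.
    real = strip_typedefs(t)
    return real.fields if real.kind == "struct" else []


point = ToyType("struct", fields=["x", "y"])
point_t = ToyType("typedef", target=point)
print(children_naive(point_t))   # []
print(children_fixed(point_t))   # ['x', 'y']
```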
Tom Tromey [Wed, 14 Jun 2023 12:09:23 +0000 (06:09 -0600)]
Reimplement DAP stack traces using frame filters
This reimplements DAP stack traces using frame filters. This slightly
simplifies the code, because frame filters and DAP were already doing
some similar work. This also renames RegisterReference and
ScopeReference to make it clear that these are private (and so changes
don't have to worry about other files).
Tom Tromey [Wed, 14 Jun 2023 12:54:13 +0000 (06:54 -0600)]
Add new interface to frame filter iteration
This patch adds a new function, frame_iterator, that wraps the
existing code to find and execute the frame filters. However, unlike
execute_frame_filters, it will always return an iterator -- whereas
execute_frame_filters will return None if no frame filters apply.
Nothing uses this new function yet, but it will be used by a subsequent
DAP patch.
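The difference in return contract can be sketched as follows. This is a
simplified model only: the function names carry a _sketch suffix because
their signatures are assumptions and do not match the real functions.

```python
def execute_filters_sketch(frames, filters):
    """Return None when no filter applies, else a filtered iterator."""
    if not filters:
        return None
    it = iter(frames)
    for f in filters:
        it = f(it)
    return it


def frame_iterator_sketch(frames, filters):
    """Always return an iterator, even when no filter applies."""
    result = execute_filters_sketch(frames, filters)
    return iter(frames) if result is None else result


frames = ["main", "helper", "leaf"]


def upper(it):
    # A toy "filter" that decorates each frame name.
    return (name.upper() for name in it)


print(execute_filters_sketch(frames, []))          # None
print(list(frame_iterator_sketch(frames, [])))     # ['main', 'helper', 'leaf']
print(list(frame_iterator_sketch(frames, [upper])))
```

A caller of the second form never needs a separate "no filters" code
path, which is presumably why it is convenient for DAP.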
Tom Tromey [Wed, 14 Jun 2023 12:27:49 +0000 (06:27 -0600)]
Fix execute_frame_filters doc string
When reading the doc string for execute_frame_filters, I wasn't sure
if the ranges were inclusive or exclusive. This patch updates the doc
string to reflect my findings, and also fixes an existing typo.
Tom Tromey [Sun, 26 Jun 2022 15:19:46 +0000 (09:19 -0600)]
Move definition of ctf_target type
This moves the definition of the ctf_target type into the
HAVE_LIBBABELTRACE block. This type is only used in this block, so it
makes sense to only define it there.
Tom Tromey [Tue, 20 Jun 2023 21:18:23 +0000 (15:18 -0600)]
Avoid crash with absolute symbol
A user supplied an executable and a remote logfile that could be used
to crash gdb. The problem is that the BFD section for a particular
symbol was null, because the section was not marked "allocated".
Digging deeper, the problem was that elfread.c dropped the section for
absolute symbols. This patch fixes the crash.
gdbserver: allow agent expressions to fail with invalid memory access
Now that agent expressions might fail with the error
expr_eval_invalid_memory_access, we might overflow the
eval_result_names array in tracepoint.cc. This is because the
eval_result_names array does not include a string for either
expr_eval_invalid_goto or expr_eval_invalid_memory_access.
I don't know if having expr_eval_invalid_goto missing is also a
problem, but it feels like eval_result_names should just include a
string for every possible error.
I could just add two more strings into the array, but I figure that a
more robust solution will be to move all of the error types, and their
associated strings, into a new ax-result-types.def file, and to then
include this file in both ax.h and tracepoint.cc in order to build
the enum eval_result_type and the eval_result_names string array.
Doing this means it is impossible to have a missing error string in
the future.
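The actual change uses a C preprocessor .def file; the Python sketch
below only illustrates the same "single source of truth" idea. The first
table entry and all of the description strings are placeholders invented
for illustration; only the two expr_eval_* names come from the text above.

```python
from enum import IntEnum

# Single table playing the role of the .def file.  Each row yields both
# an enum member and its string, so the two cannot drift apart.
AX_RESULT_TYPES = [
    ("example_no_error", "placeholder success entry"),       # hypothetical
    ("expr_eval_invalid_goto", "invalid goto"),               # text is made up
    ("expr_eval_invalid_memory_access", "invalid memory access"),
]

# Build the enum from the table (values start at 0, like a C enum).
EvalResult = IntEnum("EvalResult", [name for name, _ in AX_RESULT_TYPES], start=0)

# Build the string array from the same table.
eval_result_names = [text for _, text in AX_RESULT_TYPES]


def result_name(code):
    # Safe by construction: every enum value has a matching string.
    return eval_result_names[int(code)]


print(result_name(EvalResult.expr_eval_invalid_memory_access))
```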
gdb/testsuite: add test for core file with a 0 pid
A new test gdb.arch/core-file-pid0.exp was added. This test includes
a pre-generated core file for x86-64 and for other architectures the
test reports 'unsupported'.
However, after reporting 'unsupported' the test failed to perform an
early return, so the test would then carry on and try to actually
perform the test, which resulted in some TCL errors.
Fix this by returning after reporting the test unsupported.
gdb: include breakpoint number in testing condition error message
The earlier commit extended the error message:
Error in testing breakpoint condition:
to include the breakpoint number, e.g.:
Error in testing breakpoint condition 3:
This commit takes this further, and includes the location
number if the breakpoint has multiple locations, so we might now see:
Error in testing breakpoint condition 3.2:
Just as with how GDB reports a normal breakpoint stop, if a breakpoint
only has a single location then the location number is not included;
this keeps things nice and consistent.
I've extended one of the tests to cover the new functionality.
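The formatting rule described above can be sketched like this. The
function and parameter names are hypothetical; only the message text is
taken from the examples in this commit message.

```python
def condition_error_prefix(bp_number, loc_number, loc_count):
    """Build the error prefix; add the location only when there are several."""
    if loc_count > 1:
        return f"Error in testing breakpoint condition {bp_number}.{loc_number}:"
    return f"Error in testing breakpoint condition {bp_number}:"


print(condition_error_prefix(3, 1, 1))  # Error in testing breakpoint condition 3:
print(condition_error_prefix(3, 2, 4))  # Error in testing breakpoint condition 3.2:
```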
Richard Bunt [Mon, 10 Jul 2023 07:43:59 +0000 (08:43 +0100)]
gdb/testsuite: Testing with the nvfortran compiler
Currently, the Fortran test suite does not run with NVIDIA's Fortran
compiler (nvfortran).
The goal here is to get the tests running and prevent further
regressions during future work. This change does not do anything to fix
existing failures.
Teach the compiler detection about nvfortran. There is no underlying
information about whether this compiler is related to flang classic or
flang, so we cannot reuse the main and type definitions. Therefore, we
explicitly record the main method and type information observed when
using nvfortran.
The main name was extracted by trying to set breakpoints on both MAIN_
and MAIN__.
The following mapping of test to type names was used to extract how
nvfortran reports types.
logical.exp: fortran_character1. Ran ptype on "c".
Types defined as fortran_complex16 do not compile with nvfortran, so it
was left unset.
gdb.fortran regression tests run with GNU, Intel, Intel LLVM and ACfL.
No regressions detected.
The gdb.fortran test results with nvfortran 23.3 are as follows.
Before:
# of expected passes 523
# of unexpected failures 107
# of known failures 2
# of unresolved testcases 1
# of untested testcases 7
# of duplicate test names 2
After:
# of expected passes 5696
# of unexpected failures 271
# of known failures 12
# of untested testcases 9
# of unsupported tests 5
As can be seen from the above, there are now considerably more passing
assertions.
Fangrui Song [Sun, 9 Jul 2023 17:57:19 +0000 (10:57 -0700)]
PR30592 objcopy: allow --set-section-flags to add or remove SHF_X86_64_LARGE
For example, objcopy --set-section-flags .data=alloc,large will add
SHF_X86_64_LARGE to the .data section. Omitting "large" will drop the
SHF_X86_64_LARGE flag.
The bfd_section flag is named generically, SEC_ELF_LARGE, in case other
processors want to follow SHF_X86_64_LARGE. SEC_ELF_LARGE has the same
value as SEC_TIC54X_BLOCK used by coff.
gdb/cp-namespace.c: Fix assert failure caused by malformed user input
When debugging C++ programs, it is possible to trigger a spurious assert
failure when attempting to set a breakpoint on a malformed symbol name.
Names of the form 'A>::B' and 'A)::B' trigger this assert failure in
cp_lookup_bare_symbol:
$ gdb gdb
[...]
(gdb) br test>::assert
Function "test>::assert" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (test>::assert) pending.
(gdb) start
[...]
cp-namespace.c:181: internal-error: cp_lookup_bare_symbol: Assertion `strstr (name, "::") == NULL' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
----- Backtrace -----
0x5217e2 gdb_internal_backtrace_1
/home/amerey/binutils-gdb/gdb/bt-utils.c:122
0x521885 _Z22gdb_internal_backtracev
/home/amerey/binutils-gdb/gdb/bt-utils.c:168
0xaf8303 internal_vproblem
/home/amerey/binutils-gdb/gdb/utils.c:396
0xaf86be _Z15internal_verrorPKciS0_P13__va_list_tag
/home/amerey/binutils-gdb/gdb/utils.c:476
0xccdb3f _Z18internal_error_locPKciS0_z
/home/amerey/binutils-gdb/gdbsupport/errors.cc:58
0x5dded9 cp_lookup_bare_symbol
/home/amerey/binutils-gdb/gdb/cp-namespace.c:181
0x5de39d cp_lookup_symbol_in_namespace
/home/amerey/binutils-gdb/gdb/cp-namespace.c:328
[...]
Currently this assert is skipped if the symbol name contains '<' or '('.
Fix this spurious failure by also skipping the assert when the symbol
name contains '>' or ')'.
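A small model of the check, assuming the assertion is guarded by a simple
"does the name contain one of these characters" test. The helper below is
hypothetical and is not the code in cp_lookup_bare_symbol.

```python
def assertion_would_fire(name, skip_chars):
    """Model of the guarded assert: True means an internal error."""
    if any(ch in name for ch in skip_chars):
        return False          # assertion is skipped for this name
    return "::" in name       # assertion fails if "::" is present


# Before the fix only '<' and '(' skip the assertion.
print(assertion_would_fire("test>::assert", "<("))    # True
# After the fix '>' and ')' skip it as well.
print(assertion_would_fire("test>::assert", "<(>)"))  # False
print(assertion_would_fire("A)::B", "<(>)"))          # False
```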
Tom Tromey [Wed, 21 Jun 2023 12:23:20 +0000 (06:23 -0600)]
Fix result of DAP setExpression
A co-worker, Andry, noticed that the DAP setExpression implementation
returned the wrong fields -- it used "result" rather than "value", and
included "memoryReference", which isn't in the spec (an odd oversight,
IMO).
Tom Tromey [Thu, 8 Jun 2023 20:06:45 +0000 (14:06 -0600)]
Remove unchecked casts to mi_interp
Simon noticed a crash that could be caused via the new Python
gdb.execute_mi function. Looking into this, I found a few unchecked
casts to mi_interp, like:
This patch replaces all such casts with safer variants.
For -gdb-exit and mi_load_progress, I chose to have the functions
simply not generate any output. It didn't seem useful to do so.
Some casts I eliminated by adding a parameter to a function. Then, in
mi_execute_command, I changed the code to use
gdb::checked_static_cast. This is appropriate because this particular
overload can only be called by the MI interpreter.
There does not seem to be a very good way to test -gdb-exit.
Andrew Burgess [Wed, 31 May 2023 20:41:48 +0000 (21:41 +0100)]
gdb: check max-value-size when reading strings for printf
I noticed that the printf code for strings, printf_c_string and
printf_wide_c_string, don't take max-value-size into account, but do
load a complete string from the inferior into a GDB buffer.
As such it would be possible for a badly behaved inferior to cause
GDB to try and allocate an excessively large buffer, potentially
crashing GDB, or at least causing GDB to swap lots, which isn't
great.
We already have a setting to protect against this sort of thing, the
'max-value-size'. So this commit updates the two functions mentioned
above to check the max-value-size and give an error if the
max-value-size is exceeded.
If the max-value-size is exceeded, I chose to continue reading
inferior memory to figure out how long the string actually is; we just
don't store the results. The benefit of this is that when we give the
user an error we can tell the user how big the string actually is,
which would allow them to correctly adjust max-value-size, if that's
what they choose to do.
The default for max-value-size is 64k so there should be no user
visible changes after this commit, unless the user was previously
printing very large strings. If that is the case then the user will
now need to increase max-value-size.
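The "keep measuring but stop storing" approach might look roughly like
the sketch below. It models inferior memory as a bytes object; the
function name, the chunk size, and the error text are all assumptions
made for illustration.

```python
class ValueTooLarge(Exception):
    """Raised when a string exceeds the configured limit."""


def read_c_string(memory, addr, max_value_size, chunk=4):
    stored = bytearray()
    length = 0
    while True:
        piece = memory[addr + length: addr + length + chunk]
        if not piece:
            raise ValueError("ran off the end of memory without a NUL")
        nul = piece.find(b"\0")
        if nul >= 0:
            piece = piece[:nul]
        # Only keep the data while we are still within the limit.
        if length + len(piece) <= max_value_size:
            stored += piece
        # Always keep counting, so the error can report the real size.
        length += len(piece)
        if nul >= 0:
            break
    if length > max_value_size:
        raise ValueTooLarge(
            f"string is {length} bytes, max-value-size is {max_value_size}")
    return bytes(stored)


mem = b"hello world\0junk"
print(read_c_string(mem, 0, 64))   # b'hello world'
try:
    read_c_string(mem, 0, 4)
except ValueTooLarge as e:
    print(e)                       # string is 11 bytes, max-value-size is 4
```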
However, this change was not included in that original series.
The original series received push back because it was thought that
replacing alloca with a C++ container type would introduce unnecessary
malloc/free overhead.
However, in this case we are building a string, and (at least for
GCC), the std::string type has a small string optimisation, where
small strings are stored on the stack.
And in this case we are building what will usually be a very small
string, we're just constructing a printf format specifier for a hex
value, so it'll be something like '%#x' -- though it could also have a
width in there too -- but still, it should normally fit within GCC's
small string buffer.
So, in this commit, I propose replacing the use of alloca with a
std::string. This shouldn't result (normally) in any additional
malloc or free calls, so should be similar in performance to the
original approach.
There should be no user visible differences after this commit.
however, there was push back on that thread due to it adding extra
dynamic allocation, i.e. moving the memory buffers off the stack on to
the heap.
However, of all the patches originally proposed, I think in these two
cases moving off the stack is the correct thing to do. Unlike all the
other patches in the original series, where the data being read
was (mostly) small in size, a register, or a couple of registers, in
this case we are reading an arbitrary string from the inferior. This
could be any size, and so should not be placed on the stack.
So in this commit I replace the use of alloca with std::byte_vector
and simplify the logic a little (I think) to take advantage of the
ability of std::byte_vector to dynamically grow in size.
Of course, really, we should probably be checking the max-value-size
setting as we load the string to stop GDB crashing if a corrupted
inferior causes GDB to try to read a stupidly large amount of
memory... but I'm leaving that for a follow on patch.
There should be no user visible changes after this commit.
Andrew Burgess [Wed, 31 May 2023 15:14:47 +0000 (16:14 +0100)]
gdb: fix printf of wchar_t early in a gdb session
Given this test program:
#include <wchar.h>
const wchar_t wide_str[] = L"wide string";
int
main (void)
{
return 0;
}
I observed this GDB behaviour:
$ gdb -q /tmp/printf-wchar_t
Reading symbols from /tmp/printf-wchar_t...
(gdb) start
Temporary breakpoint 1 at 0x40110a: file /tmp/printf-wchar_t.c, line 8.
Starting program: /tmp/printf-wchar_t
Temporary breakpoint 1, main () at /tmp/printf-wchar_t.c:8
25 return 0;
(gdb) printf "%ls\n", wide_str
(gdb)
Notice that the printf results in a blank line rather than the
expected 'wide string' output.
I tracked the problem down to printf_wide_c_string (in printcmd.c). In
this function we do this:
struct type *wctype = lookup_typename (current_language,
"wchar_t", NULL, 0);
int wcwidth = wctype->length ();
The problem here is that 'wchar_t' is a typedef. If we look at the
comment on type::length() we see this:
/* Note that if thistype is a TYPEDEF type, you have to call check_typedef.
But check_typedef does set the TYPE_LENGTH of the TYPEDEF type,
so you only have to call check_typedef once. Since value::allocate
calls check_typedef, X->type ()->length () is safe. */
What this means is that after calling lookup_typename we should call
check_typedef in order to ensure that the length of the typedef has
been setup correctly. We are not doing this in printf_wide_c_string,
and so wcwidth is incorrectly calculated as 0. This is what leads GDB
to print an empty string.
We can see in c_string_operation::evaluate (in c-lang.c) an example of
calling check_typedef specifically to fix this exact issue.
Initially I did fix this problem by adding a check_typedef call into
printf_wide_c_string, but then I figured why not move the
check_typedef call up into lookup_typename itself, that feels like it
should be harmless when looking up a non-typedef type, but will avoid
bugs like this when looking up a typedef. So that's what I did.
I can then remove the extra check_typedef call from c-lang.c, I don't
see any other places where we had extra check_typedef calls. This
doesn't mean we definitely had bugs -- so long as we never checked the
length, or, if we knew that check_typedef had already been called,
then we would be fine.
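To make the zero-length problem concrete, here is a toy model. ToyType
and both lookup functions are hypothetical; in particular, whether the
real lookup_typename returns the typedef or the resolved type is not
something this sketch tries to reproduce.

```python
class ToyType:
    """Hypothetical stand-in for a debugger type; not GDB's struct type."""

    def __init__(self, name, length=0, target=None):
        self.name = name
        self.length = length   # a typedef starts out with length 0
        self.target = target   # the type a typedef refers to, if any


def check_typedef(t):
    # Resolve the typedef chain and cache the real length on the typedef.
    real = t
    while real.target is not None:
        real = real.target
    t.length = real.length
    return real


def make_table():
    base = ToyType("int", length=4)
    return {"int": base, "wchar_t": ToyType("wchar_t", target=base)}


def lookup_typename_old(table, name):
    return table[name]          # length may still be 0 for a typedef


def lookup_typename_new(table, name):
    t = table[name]
    check_typedef(t)            # harmless for non-typedefs
    return t


print(lookup_typename_old(make_table(), "wchar_t").length)  # 0
print(lookup_typename_new(make_table(), "wchar_t").length)  # 4
```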
I don't see any test regressions after this change, and my new test
case is now passing.
Jan Beulich [Fri, 7 Jul 2023 12:10:21 +0000 (14:10 +0200)]
ld: fix build with old glibc / gcc
"rename" conflicts with a function of that name, which gcc from that
same timeframe then complains about. Use a name matching that of
struct input_remap's respective field.
The ARC HS5x and ARC HS6x processors are based on the new ARCv3 ISA
that implements a full range of 32-bit and 64-bit instructions. These
processors feature a high-speed 10-stage, dual-issue pipeline that
offers increased utilization of functional units with a limited
increase in power and area. The HS5x processors feature a 32-bit
pipeline that can execute all ARCv3 32-bit instructions, while the
HS6x processors feature a full 64-bit pipeline and register file that
can execute both 32-bit and 64-bit instructions. In addition, the ARC
HS6x supports 64-bit virtual and 52-bit physical address spaces to
enable direct addressing of current and future large memories, as well
as 128-bit loads and stores for efficient data movement.
This readelf patch updates/adds Synopsys ARCv3 machine name fields and
supported relocations.
- GPLv2 instead of GPLv3,
- Use the FSF postal address rather than their URL.
Nobody else has touched the file since I merged it, so I don't believe
there are any problems with me changing the license, this commit does
just that.
Pedro Alves [Wed, 7 Jun 2023 09:38:14 +0000 (10:38 +0100)]
Linux: Avoid pread64/pwrite64 for high memory addresses (PR gdb/30525)
Since commit 05c06f318fd9 ("Linux: Access memory even if threads are
running"), GDB prefers pread64/pwrite64 to access inferior memory
instead of ptrace. That change broke reading shared libraries on
SPARC64 Linux, as reported by PR gdb/30525 ("gdb cannot read shared
libraries on SPARC64").
On SPARC64 Linux, surprisingly (to me), userspace shared libraries are
mapped at high 64-bit addresses:
(gdb) info sharedlibrary
Cannot access memory at address 0xfff80001002011e0
Cannot access memory at address 0xfff80001002011d8
Cannot access memory at address 0xfff80001002011d8
From To Syms Read Shared Object Library
0xfff80001000010a0 0xfff8000100021f80 Yes (*) /lib64/ld-linux.so.2
(*): Shared library is missing debugging information.
Those addresses are 64-bit addresses with the high bits set. When
interpreted as signed, they're negative.
The Linux kernel rejects pread64/pwrite64 if the offset argument of
type off_t (a signed type) is negative, which happens if the memory
address we're accessing has its high bit set. See
linux/fs/read_write.c sys_pread64 and sys_pwrite64 in Linux.
Thankfully, lseek does not fail in that situation. So the fix is to
use the 'lseek + read|write' path if the offset would be negative.
Fix this in both native GDB and GDBserver.
Tested on a SPARC64 GNU/Linux and x86-64 GNU/Linux.
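The decision can be modelled in a few lines. This is only a sketch of
the signedness test; the function names and the returned labels are
made up, and no real system calls are issued.

```python
def as_signed_64(addr):
    """Reinterpret a 64-bit address as a signed 64-bit offset."""
    addr &= (1 << 64) - 1
    return addr - (1 << 64) if addr >= (1 << 63) else addr


def pick_access_path(addr):
    # A negative offset would be rejected by pread64/pwrite64,
    # so fall back to lseek followed by read/write.
    if as_signed_64(addr) < 0:
        return "lseek+read/write"
    return "pread64/pwrite64"


print(pick_access_path(0x40110a))            # pread64/pwrite64
print(pick_access_path(0xfff80001002011e0))  # lseek+read/write
```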
Branislav Brzak [Tue, 20 Jun 2023 14:19:55 +0000 (16:19 +0200)]
riscv: Ensure LE instruction fetching
Currently the riscv gdb code looks at the arch byte order
when fetching instructions. This works when the
target is LE, but on a BE arch it will byte swap the
instruction, while the riscv spec defines that all
instructions are LE encoded regardless of
system memory endianness.
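A minimal illustration of the difference, using the canonical nop
(addi x0, x0, 0, encoded as 0x00000013). The two helper names are
hypothetical.

```python
def fetch_insn_arch_order(buf, big_endian_target):
    # Old behaviour: decode using the target's data byte order.
    return int.from_bytes(buf, "big" if big_endian_target else "little")


def fetch_insn_le(buf):
    # Fixed behaviour: instructions are always little-endian.
    return int.from_bytes(buf, "little")


raw = bytes([0x13, 0x00, 0x00, 0x00])   # nop as it sits in memory

print(hex(fetch_insn_arch_order(raw, big_endian_target=True)))  # 0x13000000
print(hex(fetch_insn_le(raw)))                                  # 0x13
```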
Pedro Alves [Thu, 6 Jul 2023 14:05:11 +0000 (15:05 +0100)]
Fix Solaris regression (PR tdep/30252)
PR tdep/30252 reports that using GDB on Solaris fails an assertion in
target_resume:
target.c:2648: internal-error: target_resume: Assertion `inferior_ptid != null_ptid' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n)
The backtrace, after running it through c++filt, looks like:
The problem is that the procfs backend, while inside target_wait,
called target_resume without switching to the leader thread of that
resumption.
The target_resume interface is:
/* Resume execution (or prepare for execution) of the current thread
(INFERIOR_PTID), while optionally letting other threads of the
current process or all processes run free.
...
Thus calling target_resume with inferior_ptid == null_ptid is bogus.
target_wait (which leads to procfs_target::wait on Solaris) is called
with inferior_ptid == null_ptid on entry exactly to help catch such
bogus uses.
From the backtrace, it seems that the relevant line in question is
procfs.c:2187:
2186 /* How to keep going without returning to wfi: */
2187 target_continue_no_signal (ptid);
2188 goto wait_again;
target_continue_no_signal is a small wrapper around target_resume,
which would make sense.
The fix is to not call target_resume or go via the target stack at
all. Instead, factor out a new proc_resume function out of
procfs_target::resume, and call that. The new function does not rely
on inferior_ptid.
I've not been able to test it myself, but Petr confirmed it fixes the
assertion failure with his test case, and Marcel Telka also confirmed
it solves the problem.
YunQiang Su [Mon, 3 Jul 2023 04:43:21 +0000 (12:43 +0800)]
ld: fix plugin tests for MIPS PIC
On MIPS, for PIC objects, a symbol may be referenced twice:
once from the caller, and once from the GOT.
Thus ld may complain twice about "undefined reference".
So we add a new "#?" line to every affected test.
Alan Modra [Wed, 5 Jul 2023 13:53:51 +0000 (23:23 +0930)]
Use run_host_cmd to run $CC and other no-section-header test fixes
We should be using run_host_cmd everywhere we invoke a compiler in the
ld testsuite, if we want to use ld/ld-new just built. run_host_cmd
properly inserts $gcc_B_opt in cases where a user wants to test
binutils with a newly built compiler, i.e. when $CC specifies -B itself.
Also, it is not good practice to exclude tests when non-native except
of course those tests that run a target binary. Compiling and linking
often shows up problems.
* testsuite/ld-elf/no-section-header.exp (binutils_run_test):
Use run_host_cmd to invoke $CC_FOR_TARGET. Run all tests
non-native too, except for attempting to run the binaries.
Run tests for ELF in general, not just linux.
* testsuite/ld-elf/pr25617-1-no-sec-hdr.rd: Allow localentry
symbol decoration, and support either sorting of symbols.
* testsuite/ld-elf/pr25617-1a-no-sec-hdr.rd: Likewise.
* testsuite/ld-elf/pr25617-1a-sec-hdr.rd: Likewise.
* testsuite/ld-elf/pr25617-1a-no-sec-hdr.nd: Accept D function syms.
* testsuite/ld-elf/start-shared-noheader-sysv.rd: Accept
mips-sgi-irix symbol output.
* testsuite/ld-elf/start-shared-noheader.nd: Likewise.
YunQiang Su [Fri, 30 Jun 2023 05:14:51 +0000 (13:14 +0800)]
ld: Use [list ] syntax to define run_tests in indirect.exp
Currently, the variable run_tests is defined using {{}} syntax,
which prevents variable substitution.
Thus $NOPIE_CFLAGS and $NOPIE_LDFLAGS are passed to cmd as names
instead of values:
gcc ... $NOPIE_CFLAGS -c .../indirect5a.c -o tmpdir/indirect5a.o
Let's use [list [list ]] syntax instead.
ld/ChangeLog:
* testsuite/ld-elf/indirect.exp(run_tests): use [list [list]]
syntax instead of {{}}.
Jan Beulich [Tue, 4 Jul 2023 15:02:17 +0000 (17:02 +0200)]
x86: flag bad EVEX masking for miscellaneous insns
Masking is not permitted for certain further insns, not falling in any
of the earlier categories. Introduce the Y macro (not expanding to any
output) to flag such cases.
Note that in a few cases entries already covered otherwise are converted
as well, to continue to allow sharing of the string literals.
Jan Beulich [Tue, 4 Jul 2023 15:00:35 +0000 (17:00 +0200)]
x86: flag EVEX.z set when destination is a mask register
While only zeroing-masking is possible in this case, this still requires
EVEX.z to be clear. Introduce a "global" flag right here, to be re-used
by checks which need to live in specific operand handlers.
Jan Beulich [Tue, 4 Jul 2023 15:00:15 +0000 (17:00 +0200)]
x86: re-work EVEX-z-without-masking check
Rather than corrupting disassembly altogether, flag EVEX.z set as bad
when masking isn't in effect in the first place at the time the
destination operand is actually processed.
gdb: add __repr__() implementation to a few Python types
Only a few types in the Python API currently have __repr__()
implementations. This patch adds a few more of them. Specifically, it
adds __repr__() implementations to gdb.Symbol, gdb.Architecture,
gdb.Block, gdb.Breakpoint, gdb.BreakpointLocation, and gdb.Type.
This makes it easier to play around with the GDB Python API in the Python
interpreter session invoked with the 'pi' command in GDB, giving more
easily accessible type information to users.
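For readers unfamiliar with __repr__, the effect in an interactive
session is roughly as sketched below. ToySymbol is a made-up class and
its output format is an assumption; it is not the format GDB uses.

```python
class PlainSymbol:
    """No __repr__: the interpreter shows only a class name and address."""

    def __init__(self, name):
        self.name = name


class ToySymbol:
    """With __repr__: the interpreter shows something informative."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f"<ToySymbol name={self.name}>"


print(repr(ToySymbol("main")))                      # <ToySymbol name=main>
print(repr(PlainSymbol("main")).startswith("<"))    # True, but opaque
```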
Andrew Burgess [Fri, 19 May 2023 20:42:39 +0000 (21:42 +0100)]
gdb: have mdict_size always return a symbol count
In the next commit we would like to have mdict_size return the number
of symbols in the dictionary. Currently mdict_size is just a
heuristic: sometimes it returns the number of symbols, and sometimes
the number of buckets in a hashing dictionary (see size_hashed in
dictionary.c).
Currently this vague notion of size is good enough: the only place
mdict_size is used is in a maintenance command in order to print a
message containing the size of the dictionary ... so we don't really
care that the value isn't correct.
However, in the next commit we do want the size returned to be the
number of symbols in the dictionary, so this commit makes mdict_size
return the symbol count in all cases.
The new use is still not on a hot path -- it's going to be a Python
__repr__ method, so all I do in this commit is have size_hashed walk
the dictionary and count the entries. Obviously this could be slow if
we have a large number of symbols, but for now I'm not worrying about
that case. We could always store the symbol count if we wanted, but
that would increase the size of every dictionary for a use case that
isn't going to be hit that often.
I've updated the text in 'maint print symbols' so that we don't talk
about the size being 'syms/buckets', but just 'symbols' now.
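The difference between the two notions of size can be seen in a toy
hashed dictionary. The class and method names are hypothetical and the
layout is much simpler than the one in dictionary.c.

```python
class ToyHashedDict:
    def __init__(self, nbuckets):
        self.buckets = [[] for _ in range(nbuckets)]

    def add(self, name):
        self.buckets[hash(name) % len(self.buckets)].append(name)

    def size_buckets(self):
        # Old heuristic: cheap, but not a symbol count.
        return len(self.buckets)

    def size_symbols(self):
        # New behaviour: walk every bucket and count the entries.
        return sum(len(bucket) for bucket in self.buckets)


d = ToyHashedDict(8)
for sym in ("main", "helper", "leaf"):
    d.add(sym)

print(d.size_buckets())   # 8
print(d.size_symbols())   # 3
```

The walk is linear in the number of buckets plus symbols, which matches
the trade-off described above: slower, but no extra field per dictionary.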
Andreas Krebbel [Mon, 3 Jul 2023 17:51:51 +0000 (19:51 +0200)]
IBM Z: Fix pcrel relocs for symA-symB expressions
The code in md_apply_fix which tries to deduce from the operand type
which reloc to apply currently does the wrong thing for absolute
relocs which have been re-written by fixup_segment as pc-relative to
implement a subtraction of a local and an external symbol.
In all these cases we wrongly emit an absolute reloc because we ignore
the fx_pcrel flag in md_apply_fix. However, only for the last one we
actually support a pc relative relocation of the proper size and can
implement it accordingly. For the other 3 we have to issue an error.
foo:
cli 0(%r2),undef-foo
la %r2,undef-foo(%r2)
lay %r2,undef-foo(%r2)
lhi %r2,undef-foo
Tom Tromey [Thu, 29 Jun 2023 13:10:40 +0000 (07:10 -0600)]
Fix two Python calls that don't check for errors
PyModule_AddObject steals a reference on success, but not on error,
which is why we have gdb_pymodule_addobject. I found one spot still
calling the former, which could in theory leak memory on failure.
This patch fixes this.
In the same function I found an unchecked call to
PyDict_SetItemString. This patch fixes this as well.
Andrew Burgess [Tue, 23 May 2023 10:25:21 +0000 (11:25 +0100)]
gdb: handle core files with .reg/0 section names
The previous commit added the test gdb.arch/core-file-pid0.exp which
tests GDB's ability to load a core file containing threads with an
lwpid of 0, which is something GDB can encounter when loading a
vmcore file -- a core file generated by the Linux kernel. The threads
with an lwpid of 0 represent idle cores.
While the previous commit added the test, which confirms GDB doesn't
crash when confronted with such a core file, there are still some
problems with GDB's handling of these core files. These problems all
originate from the fact that the core file (once opened by bfd)
contains multiple sections called .reg/0. These sections all
represent different threads (cpu cores in the original vmcore dump),
but GDB gets confused and thinks these .reg/0 sections are all
referencing the same thread.
Here is a GDB session on an x86-64 machine which loads the core file
from the gdb.arch/core-file-pid0.exp, this core file contains two
threads, both of which have a pid of 0:
$ ./gdb/gdb --data-directory ./gdb/data-directory/ -q
(gdb) core-file /tmp/x86_64-pid0-core.core
[New process 1]
[New process 1]
Failed to read a valid object file image from memory.
Core was generated by `./segv-mt'.
Program terminated with signal SIGSEGV, Segmentation fault.
The current thread has terminated
(gdb) info threads
Id Target Id Frame
2 process 1 0x00000000004017c2 in ?? ()
The current thread <Thread ID 1> has terminated. See `help thread'.
(gdb) maintenance info sections
Core file: `/tmp/x86_64-pid0-core.core', file type elf64-x86-64.
[0] 0x00000000->0x000012d4 at 0x00000318: note0 READONLY HAS_CONTENTS
[1] 0x00000000->0x000000d8 at 0x0000039c: .reg/0 HAS_CONTENTS
[2] 0x00000000->0x000000d8 at 0x0000039c: .reg HAS_CONTENTS
[3] 0x00000000->0x00000080 at 0x0000052c: .note.linuxcore.siginfo/0 HAS_CONTENTS
[4] 0x00000000->0x00000080 at 0x0000052c: .note.linuxcore.siginfo HAS_CONTENTS
[5] 0x00000000->0x00000140 at 0x000005c0: .auxv HAS_CONTENTS
[6] 0x00000000->0x000000a4 at 0x00000714: .note.linuxcore.file/0 HAS_CONTENTS
[7] 0x00000000->0x000000a4 at 0x00000714: .note.linuxcore.file HAS_CONTENTS
[8] 0x00000000->0x00000200 at 0x000007cc: .reg2/0 HAS_CONTENTS
[9] 0x00000000->0x00000200 at 0x000007cc: .reg2 HAS_CONTENTS
[10] 0x00000000->0x00000440 at 0x000009e0: .reg-xstate/0 HAS_CONTENTS
[11] 0x00000000->0x00000440 at 0x000009e0: .reg-xstate HAS_CONTENTS
[12] 0x00000000->0x000000d8 at 0x00000ea4: .reg/0 HAS_CONTENTS
[13] 0x00000000->0x00000200 at 0x00000f98: .reg2/0 HAS_CONTENTS
[14] 0x00000000->0x00000440 at 0x000011ac: .reg-xstate/0 HAS_CONTENTS
[15] 0x00400000->0x00401000 at 0x00002000: load1 ALLOC LOAD READONLY HAS_CONTENTS
[16] 0x00401000->0x004b9000 at 0x00003000: load2 ALLOC READONLY CODE
[17] 0x004b9000->0x004e5000 at 0x00003000: load3 ALLOC READONLY
[18] 0x004e6000->0x004ec000 at 0x00003000: load4 ALLOC LOAD HAS_CONTENTS
[19] 0x004ec000->0x004f2000 at 0x00009000: load5 ALLOC LOAD HAS_CONTENTS
[20] 0x012a8000->0x012cb000 at 0x0000f000: load6 ALLOC LOAD HAS_CONTENTS
[21] 0x7fda77736000->0x7fda77737000 at 0x00032000: load7 ALLOC READONLY
[22] 0x7fda77737000->0x7fda77f37000 at 0x00032000: load8 ALLOC LOAD HAS_CONTENTS
[23] 0x7ffd55f65000->0x7ffd55f86000 at 0x00832000: load9 ALLOC LOAD HAS_CONTENTS
[24] 0x7ffd55fc3000->0x7ffd55fc7000 at 0x00853000: load10 ALLOC LOAD READONLY HAS_CONTENTS
[25] 0x7ffd55fc7000->0x7ffd55fc9000 at 0x00857000: load11 ALLOC LOAD READONLY CODE HAS_CONTENTS
[26] 0xffffffffff600000->0xffffffffff601000 at 0x00859000: load12 ALLOC LOAD READONLY CODE HAS_CONTENTS
(gdb)
Notice when the core file is first loaded we see two lines like:
[New process 1]
And GDB reports:
The current thread has terminated
Which isn't what we'd expect from a core file -- the core file should
only contain threads that are live at the point of the crash, one of
which should be the current thread. The above message is reported
because GDB has deleted what we think is the current thread!
And in the 'info threads' output we are only seeing a single thread,
again, this is because GDB has deleted one of the threads.
Finally, the 'maintenance info sections' output shows the cause of all
our problems, two sections named .reg/0. When GDB sees the first of
these it creates a new thread. But, when we see the second .reg/0 GDB
tries to create another new thread, but this thread has the same
ptid_t as the first thread, so GDB deletes the first thread and
creates the second thread in its place.
Because both these threads are created with an lwpid of 0 GDB reports
these are 'New process NN' rather than 'New LWP NN' which is what we
would normally expect.
The previous commit includes a little more of the history of GDB
support in this area, but these problems were discussed on the mailing
list a while ago in this thread:
In this commit I propose a solution to these problems.
What I propose is that GDB should spot when we have .reg/0 sections
and, when these are found, should rename these sections using some
unique non-zero lwpid.
Note in the above output we also have sections like .reg2/0 and
.reg-xstate/0, these are additional register sets, this commit also
renumbers these sections in line with their .reg section.
The user is warned that some section renumbering has been performed.
GDB takes care to ensure that the new numbers assigned are unique and
don't clash with any of the pid's that might already be in use --
remember, in a real vmcore file, 0 is used to indicate an idle core,
non-idle cores will have the pid of whichever process was running on
that core, so we don't want GDB to assign an lwpid that clashes with
an actual pid that is in use in the core file.
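The renumbering idea can be sketched as below. This is a model built
only from the section listing above: it assumes that each .reg/0 section
starts a new thread and that the other /0 sections which follow belong to
that same thread. The function name and that grouping rule are
assumptions, not a description of the actual implementation.

```python
def renumber_zero_lwp_sections(names):
    # Collect the non-zero ids already in use so we never clash with them.
    taken = set()
    for n in names:
        _base, sep, suffix = n.rpartition("/")
        if sep and suffix.isdigit() and int(suffix) != 0:
            taken.add(int(suffix))

    result = []
    current = None   # replacement id for the thread being processed
    next_id = 1
    for n in names:
        base, sep, suffix = n.rpartition("/")
        if sep and suffix == "0":
            if base == ".reg" or current is None:
                # A new .reg/0 section: pick a fresh, unused id.
                while next_id in taken:
                    next_id += 1
                current = next_id
                taken.add(current)
            result.append(f"{base}/{current}")
        else:
            result.append(n)
    return result


sections = [".reg/0", ".reg", ".reg2/0", ".reg2",
            ".reg-xstate/0", ".reg/0", ".reg2/0", ".reg-xstate/0"]
print(renumber_zero_lwp_sections(sections))
```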
After this commit here's the updated GDB session output:
$ ./gdb/gdb --data-directory ./gdb/data-directory/ -q
(gdb) core-file /tmp/x86_64-pid0-core.core
warning: found threads with pid 0, assigned replacement Target Ids: LWP 1, LWP 2
[New LWP 1]
[New LWP 2]
Failed to read a valid object file image from memory.
Core was generated by `./segv-mt'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00000000004017c2 in ?? ()
[Current thread is 1 (LWP 1)]
(gdb) info threads
Id Target Id Frame
* 1 LWP 1 0x00000000004017c2 in ?? ()
2 LWP 2 0x000000000040dda5 in ?? ()
(gdb) maintenance info sections
Core file: `/tmp/x86_64-pid0-core.core', file type elf64-x86-64.
[0] 0x00000000->0x000012d4 at 0x00000318: note0 READONLY HAS_CONTENTS
[1] 0x00000000->0x000000d8 at 0x0000039c: .reg/1 HAS_CONTENTS
[2] 0x00000000->0x000000d8 at 0x0000039c: .reg HAS_CONTENTS
[3] 0x00000000->0x00000080 at 0x0000052c: .note.linuxcore.siginfo/1 HAS_CONTENTS
[4] 0x00000000->0x00000080 at 0x0000052c: .note.linuxcore.siginfo HAS_CONTENTS
[5] 0x00000000->0x00000140 at 0x000005c0: .auxv HAS_CONTENTS
[6] 0x00000000->0x000000a4 at 0x00000714: .note.linuxcore.file/1 HAS_CONTENTS
[7] 0x00000000->0x000000a4 at 0x00000714: .note.linuxcore.file HAS_CONTENTS
[8] 0x00000000->0x00000200 at 0x000007cc: .reg2/1 HAS_CONTENTS
[9] 0x00000000->0x00000200 at 0x000007cc: .reg2 HAS_CONTENTS
[10] 0x00000000->0x00000440 at 0x000009e0: .reg-xstate/1 HAS_CONTENTS
[11] 0x00000000->0x00000440 at 0x000009e0: .reg-xstate HAS_CONTENTS
[12] 0x00000000->0x000000d8 at 0x00000ea4: .reg/2 HAS_CONTENTS
[13] 0x00000000->0x00000200 at 0x00000f98: .reg2/2 HAS_CONTENTS
[14] 0x00000000->0x00000440 at 0x000011ac: .reg-xstate/2 HAS_CONTENTS
[15] 0x00400000->0x00401000 at 0x00002000: load1 ALLOC LOAD READONLY HAS_CONTENTS
[16] 0x00401000->0x004b9000 at 0x00003000: load2 ALLOC READONLY CODE
[17] 0x004b9000->0x004e5000 at 0x00003000: load3 ALLOC READONLY
[18] 0x004e6000->0x004ec000 at 0x00003000: load4 ALLOC LOAD HAS_CONTENTS
[19] 0x004ec000->0x004f2000 at 0x00009000: load5 ALLOC LOAD HAS_CONTENTS
[20] 0x012a8000->0x012cb000 at 0x0000f000: load6 ALLOC LOAD HAS_CONTENTS
[21] 0x7fda77736000->0x7fda77737000 at 0x00032000: load7 ALLOC READONLY
[22] 0x7fda77737000->0x7fda77f37000 at 0x00032000: load8 ALLOC LOAD HAS_CONTENTS
[23] 0x7ffd55f65000->0x7ffd55f86000 at 0x00832000: load9 ALLOC LOAD HAS_CONTENTS
[24] 0x7ffd55fc3000->0x7ffd55fc7000 at 0x00853000: load10 ALLOC LOAD READONLY HAS_CONTENTS
[25] 0x7ffd55fc7000->0x7ffd55fc9000 at 0x00857000: load11 ALLOC LOAD READONLY CODE HAS_CONTENTS
[26] 0xffffffffff600000->0xffffffffff601000 at 0x00859000: load12 ALLOC LOAD READONLY CODE HAS_CONTENTS
(gdb)
Notice the new warning which is issued when the core file is being
loaded. The threads are announced as '[New LWP NN]', and we see two
threads in the 'info threads' output. The 'maintenance info sections'
output shows the result of the section renaming.
The gdb.arch/core-file-pid0.exp test has been updated to check for the
improved GDB output.
* thread.c (add_thread_silent): Use null_ptid instead of
minus_one_ptid while getting rid of stale inferior_ptid.
This is another test that has been carried in the Fedora GDB tree for
some time, and I thought that it would be worth merging to master. I
don't believe there is any test like this currently in the testsuite.
The problem was that when GDB was used to open a vmcore (core file)
image generated by the Linux kernel GDB would (sometimes) crash with
an assertion failure:
To understand what's going on we need some background; a vmcore file
represents each processor core in the same way that a standard
application core file represents threads. Thus, we might say, a
vmcore file represents cores as threads.
When writing a vmcore file, the kernel will store the pid of the
process currently running on that core as the thread's lwpid.
However, if a core is idle, with no process currently running on it,
then the lwpid for that thread is stored as 0 in the vmcore file. If
multiple cores are idle then multiple threads will have a lwpid of 0.
Back in 2010, the original issue reporter tried to change the kernel's
behaviour in this thread:
https://lkml.org/lkml/2010/8/3/75
This change was rejected by the kernel team; the current
behaviour (lwpid of 0) was considered correct. I've checked the
source of a recent kernel. The code mentioned in the lkml.org posting
has moved, it's now in the function crash_save_cpu in the file
kernel/kexec_core.c, but the general behaviour is unchanged, an idle
core will have an lwpid of 0, so I think GDB still needs to be able to
handle this case.
When GDB loads a vmcore file (which is handled just like any other
core file) the sections are processed in core_open to generate the
threads for the core file. The processing is done by calling
add_to_thread_list, a function which looks for sections named .reg/NN
where NN is the lwpid of the thread, GDB then builds a ptid_t for the
new thread and calls add_thread.
Remember, in our case the lwpid is 0. Now for the first thread this
is fine, if a little weird, 0 isn't usually a valid lwpid, but that's
OK, GDB creates a thread with lwpid of 0 and carries on.
When we find the next thread (core) with lwpid of 0, we attempt to
create another thread with an lwpid of 0. This of course clashes with
the previously created thread, they have the same ptid_t, so GDB tries
to delete the first thread.
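The aliasing can be illustrated with a small Python model. This is only a sketch, not GDB's implementation: the function name, the (pid, lwpid) tuple standing in for ptid_t, and the default pid of 1 are all assumptions made for the example.

```python
import re


def threads_from_sections(section_names, pid=1):
    """Model thread creation from '.reg/NN' section names.

    Returns the final thread table and the list of add/delete events.
    """
    threads = {}
    events = []
    for name in section_names:
        match = re.fullmatch(r"\.reg/(\d+)", name)
        if match is None:
            continue  # Not a per-thread general register section.
        ptid = (pid, int(match.group(1)))  # Stand-in for ptid_t.
        if ptid in threads:
            # Same ptid as an existing thread: the old thread is deleted
            # before the new one is added.
            events.append(("delete", ptid))
        threads[ptid] = name
        events.append(("add", ptid))
    return threads, events
```

With two .reg/0 sections as input the model ends up with a single thread, and the event list shows the first thread being deleted when the second is added.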
And it was within this thread delete code that we triggered a bug
which would then cause GDB to assert -- when deleting we tried to
switch to a thread with minus_one_ptid, this resulted in a call to
find_inferior_pid (passing in minus_one_ptid's pid, which is -1), the
find_inferior_pid call fails and returns NULL, which then triggered an
assert in switch_to_thread.
The actual details of why the assert triggered are really not
important. What's important (I think) is that a vmcore file might
have this interesting lwpid of 0 characteristic, which isn't something
we see in "normal" application core files, and it is this that I think
we should be testing.
Now, you might be thinking: isn't deleting the first thread the wrong
thing to do? If the vmcore file has two threads that represent two
cores, and both have an lwpid of 0 (indicating both cores are idle),
then surely GDB should still represent this as two threads? You're
not wrong. This was mentioned by Pedro in the original GDB mailing
list thread here:
This is indeed a problem, and this problem is still present in GDB
today. I plan to try and address this in a later commit, however,
this first commit is about getting a test in place to confirm that GDB
at a minimum doesn't crash when loading such a vmcore file.
And so, finally, what's in this commit?
This commit contains a new test. The test doesn't actually contain a
vmcore file. Instead I've created a standard application core file
that contains two threads, and then manually edited the core file to
set the lwpid of each thread to 0.
To further reduce the size of the core file (as it will be stored in
git), I've zeroed all of the LOAD-able segments in the core file.
This test really doesn't care about that part of the core file, we
only really care about loading the registers, this is enough to
confirm that GDB doesn't crash.
Obviously as the core file is pre-generated, this test is architecture
specific. There are already a few tests in gdb.arch/ that include
pre-generated core files. Just as those existing tests do, I've
compressed the core file with bzip2, which reduces it to just 750
bytes. I have structured the test so that if/when this patch is
merged I can add some additional core files for other architectures,
however, these are not included in this commit.
The test simply expands the core file, and then loads it into GDB.
One interesting thing to note is that GDB reports the core file
loading like this:
(gdb) core-file ./gdb/testsuite/outputs/gdb.arch/core-file-pid0/core-file-pid0.x86-64.core
[New process 1]
[New process 1]
Failed to read a valid object file image from memory.
Core was generated by `./segv-mt'.
Program terminated with signal SIGSEGV, Segmentation fault.
The current thread has terminated
(gdb)
There are two interesting things here: first, the repeated "New process
1" message. This is caused because linux_core_pid_to_str reports
anything with an lwpid of 0 as a process, rather than an LWP. And
second, the "The current thread has terminated" message. This is
because the first thread in the core file is the current thread, but
when GDB loads the second thread (which also has lwpid 0) this causes
the first thread to be deleted, as a result GDB thinks that the
current (first) thread has terminated.
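The naming rule described for linux_core_pid_to_str can be modelled in a few lines of Python. This is a sketch of the behaviour as described here, with a hypothetical function name and plain integers in place of a ptid_t.

```python
def core_pid_to_str(pid, lwpid):
    """Return the Target Id string for a core file thread."""
    if lwpid != 0:
        return f"LWP {lwpid}"
    # An lwpid of 0 is treated as "no LWP", so the thread is named
    # after its process instead.
    return f"process {pid}"
```

Under this rule both lwpid 0 threads are announced with the same "process" string, which matches the repeated message in the session above.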
As I said previously, both of these problems are a result of the lwpid
0 aliasing, which is not being fixed in this commit -- this commit is
just confirming that GDB doesn't crash when loading this core file.
Andrew Burgess [Thu, 1 Jun 2023 17:30:48 +0000 (18:30 +0100)]
gdb: split inferior and thread setup when opening a core file
I noticed that in corelow.c, when a core file is opened, both the
thread and inferior setup is done in add_to_thread_list. In this
patch I propose hoisting the inferior setup out of add_to_thread_list
into core_target_open.
The only thing about this change that gave me cause for concern is
that in add_to_thread_list, we only set up the inferior after finding
the first section with a name like ".reg/NN". If we find no such
section then the inferior will never be set up.
Is this important?
Well, I don't think so. Back in core_target_open, if there is no
current thread (which there will not be if no ".reg/NN" section was
found), then we look for a thread in the current inferior. If there
are no threads (which there will not be if no ".reg/NN" is found),
then we once again set up the current inferior.
What I think this means is that, in all cases, the current inferior
will end up being set up. By moving the inferior setup code earlier in
core_target_open and making it non-conditional, we can remove the
later code that sets up the inferior, as we now know this will always
have been done.
There should be no user visible changes after this commit.
RISC-V: Zvkh[a,b]: Remove individual instruction class
Currently we have three instruction classes defined for Zvknh[a,b]:
- INSN_CLASS_ZVKNHA
- INSN_CLASS_ZVKNHB
- INSN_CLASS_ZVKNHA_OR_ZVKNHB
The encodings of all instructions in Zvknh[a,b] are identical.
Therefore, we don't need the individual instruction classes
and can remove them.
This patch also adds the missing support of the combined instruction
class in riscv_multi_subset_supports_ext().
Fixes: 62edb233ef5 ("RISC-V: Add support for the Zvknh[a,b] ISA extensions")
Reported-By: Nelson Chu <nelson@rivosinc.com>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
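As a rough illustration of what a combined instruction class means, the check can be modelled in Python as below. This is a sketch only: the function name and the string used for the class are hypothetical, and it assumes the combined class is satisfied when either of the two extensions is enabled.

```python
def insn_class_supported(insn_class, enabled_extensions):
    """Return True if INSN_CLASS is usable with the enabled extensions."""
    if insn_class == "ZVKNHA_OR_ZVKNHB":
        # One combined class covers both extensions, since the
        # instruction encodings are identical.
        return ("zvknha" in enabled_extensions
                or "zvknhb" in enabled_extensions)
    raise ValueError(f"unknown instruction class: {insn_class}")
```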
WANG Xuerui [Sun, 2 Jul 2023 10:14:22 +0000 (18:14 +0800)]
LoongArch: gas: Fix shared builds
Formerly an include of libbfd.h was added in commit 56576f4a722
("LoongArch: gas: Add support for linker relaxation."), in order to
allow calling _bfd_read_unsigned_leb128 from gas, but doing so broke
shared builds. Commit d2fddb6d783 fixed this reference but did not
remove the now unnecessary inclusion of libbfd.h. The gas_assert macro
expands into a conditional call to abort(), but "abort" is re-defined to
_bfd_abort in libbfd.h, so the extra include breaks any gas_assert
usage, and should be removed.
gas/ChangeLog:
* config/tc-loongarch.c: Don't include libbfd.h.
Fixes: d2fddb6d783 ("LoongArch: Fix ld "undefined reference" error with --enable-shared")
Signed-off-by: WANG Xuerui <git@xen0n.name>
In our GUI project (https://savannah.gnu.org/projects/gprofng-gui), we use
the output of gprofng to display the data. Sometimes this data is corrupted.
gprofng/ChangeLog
2023-06-29 Vladimir Mezentsev <vladimir.mezentsev@oracle.com>
* src/ipc.cc (ipc_doWork): Fix data race.
* src/ipcio.cc (IPCresponse::print): Fix data race.
Remove unused variables and functions.
* src/ipcio.h: Declare two variables.
* src/StringBuilder.cc (StringBuilder::write): New function.
* src/StringBuilder.h: Likewise.
Certain extensions require two levels of implications. For example,
zvkng implies zvkn and zvkn implies zvkned. Enabling zvkng should also
enable zvkned.
This patch fixes this behavior.
bfd/ChangeLog:
* elfxx-riscv.c (riscv_parse_add_implicit_subsets): Allow nested
implications for extensions.
Signed-off-by: Nathan Huckleberry <nhuck@google.com>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
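Nested implications amount to taking a transitive closure over the implication table. The Python sketch below shows the idea with a worklist. It is not the code in elfxx-riscv.c, the function name is hypothetical, and the table only contains the two implications mentioned in this message.

```python
# Only the implications named in the commit message; the real table
# is larger.
IMPLICATIONS = {
    "zvkng": ["zvkn"],
    "zvkn": ["zvkned"],
}


def expand_implied(enabled, implications=IMPLICATIONS):
    """Return ENABLED plus everything it implies, directly or indirectly."""
    result = set(enabled)
    worklist = list(enabled)
    while worklist:
        ext = worklist.pop()
        for implied in implications.get(ext, ()):
            if implied not in result:
                result.add(implied)
                # Re-examine the newly added extension so that its own
                # implications are followed as well.
                worklist.append(implied)
    return result
```

Starting from just zvkng, the closure contains zvkng, zvkn and zvkned. A single pass over the table without the worklist would stop at zvkn.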
This extension adds the following instructions:
- vandn.[vv,vx]
- vbrev.v
- vbrev8.v
- vrev8.v
- vclz.v
- vctz.v
- vcpop.v
- vrol.[vv,vx]
- vror.[vv,vx,vi]
- vwsll.[vv,vx,vi]
bfd/ChangeLog:
* elfxx-riscv.c (riscv_multi_subset_supports): Add instruction
class support for Zvbb.
(riscv_multi_subset_supports_ext): Likewise.
gas/ChangeLog:
* config/tc-riscv.c (validate_riscv_insn): Add 'l' as new format
string directive.
(riscv_ip): Likewise.
* testsuite/gas/riscv/zvbb.d: New test.
* testsuite/gas/riscv/zvbb.s: New test.
Tom Tromey [Fri, 30 Jun 2023 01:38:10 +0000 (19:38 -0600)]
Fix regressions caused by agent expression C++-ification
Simon pointed out that my agent expression C++-ification patches
caused a regression with the native-gdbserver target board. The bug
is that append_const is supposed to write in big-endian order, but I
switched this by mistake.
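For reference, writing a constant in big-endian order means emitting the most significant byte first. A minimal Python sketch of that behaviour follows. The signature is an assumption made for the example and is not GDB's actual append_const.

```python
def append_const(buf, value, n):
    """Append the low N bytes of VALUE to BUF, most significant byte first."""
    for shift in range(8 * (n - 1), -1, -8):
        buf.append((value >> shift) & 0xFF)
```

With this ordering, appending 0x1234 as a two byte constant yields the bytes 0x12 then 0x34. The mistaken little-endian order would yield 0x34 then 0x12.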
Philipp Tomsich [Fri, 30 Jun 2023 14:02:11 +0000 (16:02 +0200)]
binutils: NEWS: announce new RISC-V extensions
We picked up support for a few new extensions over the last weeks
(this may need further updating prior to the next release), list them
in the NEWS file.
binutils/ChangeLog:
* binutils/NEWS: announce support for the new RISC-V
extensions (Zicond, Zfa, XVentanaCondOps).
Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
This patch adds support for the RISC-V Zfa extension,
which introduces additional floating-point instructions:
* fli (load-immediate) with pre-defined immediates
* fminm/fmaxm (like fmin/fmax but with different NaN behaviour)
* fround/froundmx (round to integer)
* fcvtmod.w.d (Modular Convert-to-Integer)
* fmv* to access high bits of FP registers in case XLEN < FLEN
* fleq/fltq (quiet comparison instructions)
Zfa defines its instructions in combination with the following
extensions:
* single-precision floating-point (F)
* double-precision floating-point (D)
* quad-precision floating-point (Q)
* half-precision floating-point (Zfh)
This patch is based on an earlier version from Tsukasa OI:
https://sourceware.org/pipermail/binutils/2022-September/122939.html
The most significant change to that commit is the switch from the rs1-field
value to the actual floating-point value in the last operand of the fli*
instructions. Everything that strtof() can parse is accepted and
the '%a' printf specifier is used to output hex floating-point literals
in the disassembly.
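The idea of accepting a floating-point literal and looking its value up in a table can be sketched in Python as below. This is only a model: the function names are hypothetical, the table is a small illustrative sample rather than the Zfa immediate table, the returned index is a position in that sample and not an rs1 encoding, and it assumes a value missing from the table is rejected. Python's float.hex() stands in for the '%a' specifier, and its output is not formatted identically to C's.

```python
# Illustrative sample only; not the real Zfa immediate table.
FLI_SAMPLE_VALUES = [-1.0, 0.25, 0.5, 1.0, 2.0]


def lookup_fli_value(text, table=FLI_SAMPLE_VALUES):
    """Parse TEXT as a float and return its index in TABLE, or None."""
    try:
        if "0x" in text.lower():
            value = float.fromhex(text)  # Hex literal such as 0x1p-1.
        else:
            value = float(text)
    except ValueError:
        return None
    for index, entry in enumerate(table):
        if entry == value:
            return index
    return None


def format_hex_float(value):
    """Return VALUE as a hex floating-point string."""
    return value.hex()
```

Decimal and hex spellings of the same value, such as 0.5 and 0x1p-1, resolve to the same table entry, which is the point of comparing values rather than strings.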
The Zfa specification is frozen (and has passed public review). It is
available as a chapter in "The RISC-V Instruction Set Manual: Volume 1":
https://github.com/riscv/riscv-isa-manual/releases
bfd/ChangeLog:
* elfxx-riscv.c (riscv_multi_subset_supports): Add instruction
class support for 'Zfa' extension.
(riscv_multi_subset_supports_ext): Likewise.
(riscv_implicit_subsets): Add 'Zfa' -> 'F' dependency.
gas/ChangeLog:
* config/tc-riscv.c (flt_lookup): New helper to lookup a float value
in an array.
(validate_riscv_insn): Add 'Wfv' as new format string directive.
(riscv_ip): Likewise.
* doc/c-riscv.texi: Add floating-point chapter and describe
limitations of the Zfa FP literal parsing.
* testsuite/gas/riscv/zfa-32.d: New test.
* testsuite/gas/riscv/zfa-32.s: New test.
* testsuite/gas/riscv/zfa-64.d: New test.
* testsuite/gas/riscv/zfa-64.s: New test.
* testsuite/gas/riscv/zfa-fail.d: New test.
* testsuite/gas/riscv/zfa-fail.l: New test.
* testsuite/gas/riscv/zfa-fail.s: New test.
* testsuite/gas/riscv/zfa.d: New test.
* testsuite/gas/riscv/zfa.s: New test.