Fix unwinding when restoring a register from one of a greater size
When debugging functions where a callee-saved register is moved to a
register of a larger size (e.g., a 64-bit general-purpose register to
a 128-bit vector register), GDB would crash when the user issued the
"return" command. For example:
    ldgr %f0, %r11            ; Move 64-bit general-purpose register (r11)
                              ; to 128-bit vector register (f0)
    .cfi_register r11, f0     ; DW_CFA_register: r11 is stored in f0
    ...
    lgdr %r11, %f0            ; Restore r11 from f0
    .cfi_restore r11          ; DW_CFA_restore: r11 is restored to its original
                              ; register
(This example uses instructions and registers for the s390x architecture,
where this bug was originally found.)
If GDB is stopped in the "..." section and the user issues the
"return" command, GDB crashes due to a buffer size mismatch during
unwinding. Specifically, in frame_register_unwind in frame.c, a
buffer the size of the original register (the 64-bit r11 in this
example) has been allocated, and GDB attempts to memcpy the contents
of the register where the original register was saved (the 128-bit
f0) into that buffer. Fortunately, GDB has an assertion which
prevents this overflow from happening:
This patch ensures that GDB uses the original register's type (e.g.,
r11's type) when unwinding, even if it was marked as saved to a differently
typed/sized register (e.g., f0) via .cfi_register (DW_CFA_register).
The fix adds a 'struct type *' parameter to value_of_register_lazy() to
explicitly track the original register's type. The function
frame_unwind_got_register is updated to pass the correct type for the
original register.
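The idea behind the fix can be sketched with a small standalone model.
This is not GDB code: reg_type, reg_value, toy_frame and the toy_*
functions are hypothetical stand-ins, and taking the leading bytes of
the wider register is a simplification (the real layout is
target-specific). The sketch only shows why passing the original
register's type keeps the unwound value no larger than the caller's
buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-ins for illustration; NOT GDB's real types.
struct reg_type { std::size_t length; };

struct reg_value
{
  const reg_type *type;             // Type the value is presented as.
  std::vector<std::uint8_t> bytes;  // Raw contents, type->length bytes.
};

// Toy register file, indexed by register number.
struct toy_frame
{
  std::vector<reg_type> types;
  std::vector<std::vector<std::uint8_t>> contents;
};

// Read register NEW_REGNUM, but present it with the caller-supplied
// TYPE rather than NEW_REGNUM's own (possibly wider) type.
reg_value
toy_value_of_register (const toy_frame &frame, int new_regnum,
                       const reg_type *type)
{
  const auto &raw = frame.contents[new_regnum];
  assert (type->length <= raw.size ());
  reg_value v { type, {} };
  v.bytes.assign (raw.begin (), raw.begin () + type->length);
  return v;
}

// REGNUM was saved in NEW_REGNUM: always pass REGNUM's type.
reg_value
toy_unwind_got_register (const toy_frame &frame, int regnum,
                         int new_regnum)
{
  return toy_value_of_register (frame, new_regnum,
                                &frame.types[regnum]);
}

// The caller's BUFFER is sized for REGNUM.  Returns false where the
// size check would fail instead of overflowing the buffer.
bool
toy_register_unwind (const toy_frame &frame, int regnum,
                     int new_regnum,
                     std::vector<std::uint8_t> &buffer)
{
  reg_value v = toy_unwind_got_register (frame, regnum, new_regnum);
  if (buffer.size () < v.type->length)
    return false;
  std::memcpy (buffer.data (), v.bytes.data (), v.type->length);
  return true;
}
```

In this model, unwinding an 8-byte register saved in a 16-byte one
yields an 8-byte value, so the copy into an 8-byte buffer succeeds.
Had the value carried the 16-byte type of the register it was saved
in, the size check would have failed, which corresponds to the
mismatch described above.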
The call chain from frame_register_unwind to frame_unwind_got_register
is shown by this backtrace:
  #0  frame_unwind_got_register (frame=..., regnum=13, new_regnum=128)
      at gdb/frame-unwind.c:300
  #1  0x000000000135d894 in dwarf2_frame_prev_register (this_frame=...,
      this_cache=0x2204528, regnum=13)
      at gdb/dwarf2/frame.c:1187
  #2  0x00000000014d9186 in frame_unwind_legacy::prev_register (
      this=0x211f428 <dwarf2_frame_unwind>, this_frame=...,
      this_prologue_cache=0x2204528, regnum=13) at gdb/frame-unwind.c:401
  #3  0x00000000014e1d12 in frame_unwind_register_value (next_frame=...,
      regnum=13) at gdb/frame.c:1263
  #4  0x00000000014e16b8 in frame_register_unwind (next_frame=..., regnum=13,
      optimizedp=0x3ffffff813c, unavailablep=0x3ffffff8138,
      lvalp=0x3ffffff8134, addrp=0x3ffffff8128, realnump=0x3ffffff8124,
      buffer=...) at gdb/frame.c:1189
The register numbers shown above are for s390x. On s390x,
S390_R11_REGNUM has value 13. Vector registers (like f0) are numbered
differently from floating-point registers of the same name, leading to
regnum 128 for f0 despite S390_F0_REGNUM being assigned a different
value in s390-tdep.h.
New test cases for aarch64 and x86_64 check for this bug on more
widely used architectures, without depending on a particular compiler
to generate an unusual prologue in which a general-purpose register is
moved to a vector register. In both cases, the test simulates the bug
found on s390x, where a 64-bit frame pointer was moved to a much wider
vector register. Without this fix, these test cases cause an internal
error on their respective architectures; with the fix in place, they
pass.
When tested on s390x Linux (native), this change fixes 59 GDB internal
errors and around 200 failures overall. This is the list of internal
errors that no longer occur on s390x:
I have tested this commit on Fedora Linux, with architectures s390x,
x86_64, x86_64/-m32, aarch64, ppc64le, and riscv64, with no
regressions found.
Version 2 of this patch makes some changes suggested by Andrew
Burgess: it adds an assert to frame_unwind_got_register() and always
passes the type of REGNUM to value_of_register_lazy(). It also updates
the comment in value.h describing value_of_register_lazy().
In his approval message, Andrew requested some changes to the tests.
Those have been made exactly as requested.