git.ipfire.org Git - thirdparty/binutils-gdb.git/log

Improve vRun error reporting

After the previous commit, if starting the inferior process with "run"
(vRun packet) fails, GDBserver reports an error using the "E." textual
error packet.  On the GDB side, however, GDB doesn't yet do anything
with the textual error string.  This commit improves that.

This makes remote debugging output the same as native output, when
possible, another small step in the "local/remote parity" project.

E.g., before, against GNU/Linux GDBserver:

  (gdb) run
  Starting program: .../gdb.base/run-fail-twice/run-fail-twice.nox
  Running ".../gdb.base/run-fail-twice/run-fail-twice.nox" on the remote target failed

After, against GNU/Linux GDBserver (same as native):

  (gdb) run
  Starting program: .../gdb.base/run-fail-twice/run-fail-twice.nox
  During startup program exited with code 126.

To know whether we have a textual error message, extend packet_result
to carry that information.  While at it, convert packet_result to use
factory methods, and change its std::string parameter to a plain const
char *, as that is always what we have handy to pass to it.
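
As a rough illustration of the factory-method shape described above, a
simplified stand-in could look like the following.  The class layout,
the enum, and the method names are assumptions made for this sketch,
not a copy of the code in GDB.

```cpp
#include <string>

/* Hypothetical, simplified stand-in for GDB's packet_result; names
   and layout are assumptions for illustration only.  */
enum class packet_status { ok, error, unknown };

class packet_result
{
public:
  /* "E NN"-style error: MSG is just the numeric code text.  */
  static packet_result make_numeric_error (const char *msg)
  { return packet_result (packet_status::error, msg, false); }

  /* "E.MESSAGE"-style error: MSG is meant to be shown to the user.  */
  static packet_result make_textual_error (const char *msg)
  { return packet_result (packet_status::error, msg, true); }

  static packet_result make_ok ()
  { return packet_result (packet_status::ok, "", false); }

  static packet_result make_unknown ()
  { return packet_result (packet_status::unknown, "", false); }

  packet_status status () const { return m_status; }
  const char *err_msg () const { return m_err_msg.c_str (); }

  /* True if err_msg () holds a textual message from the server.  */
  bool textual_err_msg () const { return m_textual; }

private:
  packet_result (packet_status status, const char *msg, bool textual)
    : m_status (status), m_err_msg (msg), m_textual (textual)
  {}

  packet_status m_status;
  std::string m_err_msg;
  bool m_textual;
};
```

A caller can then print err_msg () verbatim when textual_err_msg () is
true, and fall back to a generic message otherwise.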

Change-Id: Ib386f267522603f554b52a885b15229c9639e870
Approved-By: Tom Tromey <tom@tromey.com>

Fix "run" failure handling with GDBserver

If starting the inferior process with "run" (vRun packet) fails,
GDBserver throws an error that escapes all the way to the top level.
When an error escapes all the way like that, GDBserver interprets it
as a disconnection, and either goes back to waiting for a new GDB
connection, or exits, if --once was specified.

E.g., with the testcase program added by this commit, we see:

On GDB side:

...
(gdb) tar extended-remote :9999
...
Remote debugging using :9999
(gdb) r
Starting program:
Running ".../gdb.base/run-fail-twice/run-fail-twice.nox" on the remote target failed
(gdb)

On GDBserver side:

$ gdbserver --once --multi :9999
Remote debugging from host 127.0.0.1, port 34344
bash: line 1: .../gdb.base/run-fail-twice/run-fail-twice.nox: Permission denied
bash: line 1: exec: .../gdb.base/run-fail-twice/run-fail-twice.nox: cannot execute: Permission denied
gdbserver: During startup program exited with code 126.
$ # gdbserver exited

This is wrong, as we've connected with extended-remote/--multi.
GDBserver should just report an error to vRun, and remain connected
to GDB, waiting for other commands.

This commit fixes GDBserver by catching the error locally in
handle_v_run.
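
The idea of catching the error locally can be sketched as below.  This
is a toy model, not GDBserver code: start_inferior, handle_run_request
and the "S05" success reply are placeholders invented for the example,
and std::exception stands in for whatever exception type is really
thrown.

```cpp
#include <stdexcept>
#include <string>

/* Hypothetical stand-in for starting the program; throws on
   failure.  */
static void
start_inferior (bool startup_ok)
{
  if (!startup_ok)
    throw std::runtime_error
      ("During startup program exited with code 126.");
}

/* Sketch of a vRun-like handler.  The error is caught here and turned
   into an "E." reply, so it never escapes to the top level where it
   would be treated as a disconnection.  "S05" is a placeholder stop
   reply chosen for this example.  */
static std::string
handle_run_request (bool startup_ok)
{
  try
    {
      start_inferior (startup_ok);
    }
  catch (const std::exception &ex)
    {
      return std::string ("E.") + ex.what ();
    }

  return "S05";
}
```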

Change-Id: Ib386f267522603f554b52a885b15229c9639e870
Approved-By: Tom Tromey <tom@tromey.com>

Windows: Fix run/attach hang after bad run/attach

On Cygwin, gdb.base/attach.exp exposes that an "attach" after a
previously failed "attach" hangs:

(gdb) PASS: gdb.base/attach.exp: do_attach_failure_tests: attach to digits-starting nonsense is prohibited
attach 0
Can't attach to process 0 (error 2: The system cannot find the file specified.)
(gdb) PASS: gdb.base/attach.exp: do_attach_failure_tests: attach to nonexistent process is prohibited
attach 10644
FAIL: gdb.base/attach.exp: do_attach_failure_tests: first attach (timeout)

The problem is that windows_nat_target::attach always returns success
even if the attach fails.  When we return success, the helper thread
begins waiting for events (which will never come), and thus the next
attach deadlocks on the do_synchronously call within
windows_nat_target::attach.

"run" has the same problem, which is exposed by the new
gdb.base/run-fail-twice.exp testcase added in a following patch:

(gdb) run
Starting program: .../gdb.base/run-fail-twice/run-fail-twice.nox
Error creating process .../gdb.base/run-fail-twice/run-fail-twice.nox, (error 6: The handle is invalid.)
(gdb) PASS: gdb.base/run-fail-twice.exp: test: bad run 1
run
Starting program: .../gdb.base/run-fail-twice/run-fail-twice.nox
FAIL: gdb.base/run-fail-twice.exp: test: bad run 2 (timeout)

The problem here is the same, except that this time it is
windows_nat_target::create_inferior that returns the incorrect result.

This commit fixes both the "attach" and "run" paths, and for the
latter, both the Cygwin and MinGW paths.  The tests mentioned above
now pass on Cygwin.  Confirmed the fixes manually for MinGW GDB.

Change-Id: I15ec9fa279aff269d4982b00f4ea7c25ae917239
Approved-By: Tom Tromey <tom@tromey.com>

Document "E.MESSAGE" RSP errors

For many years, GDB has accepted a "E.MESSAGE" error response, in
addition to "E NN".  For many packets, GDB strips the "E." before
giving the error message to the user.  For others, GDB does not strip
the "E.", but still understands that it is an error, as it starts with
"E", and either prints the whole string, or ignores it and just
mentions an error occurred (same as for "E NN").
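
To make the reply shapes concrete, here is a small classifier sketch.
The enum, struct, and function names are invented for this example and
do not mirror GDB's own parsing code; it only assumes that "E NN"
means 'E' followed by exactly two hex digits, and that an empty reply
means the packet is unsupported.

```cpp
#include <cctype>
#include <string>

/* Hypothetical names, for illustration only.  */
enum class reply_kind { unsupported, numeric_error, textual_error, other };

struct classified_reply
{
  reply_kind kind;
  std::string detail;
};

static classified_reply
classify_reply (const std::string &reply)
{
  /* Empty reply: the packet is not supported.  */
  if (reply.empty ())
    return { reply_kind::unsupported, "" };

  /* "E NN": 'E' followed by exactly two hex digits.  */
  if (reply.size () == 3 && reply[0] == 'E'
      && std::isxdigit (static_cast<unsigned char> (reply[1]))
      && std::isxdigit (static_cast<unsigned char> (reply[2])))
    return { reply_kind::numeric_error, reply.substr (1) };

  /* "E.MESSAGE": strip the "E." and keep the text.  */
  if (reply.size () >= 2 && reply[0] == 'E' && reply[1] == '.')
    return { reply_kind::textual_error, reply.substr (2) };

  return { reply_kind::other, reply };
}
```

Note that a message may itself contain further "." characters; only
the leading "E." is stripped here.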

This has been the case for as long as I remember.  Now that I check, I
see that it's been there since 2006 (commit a76d924dffcb, also here:
https://sourceware.org/pipermail/gdb-patches/2006-September/047286.html).
All along, I actually thought it was documented.  Turns out it wasn't.

This commit documents it, in the new "Standard Replies" section, near
where we document "E NN".

The original version of this 3-patch documentation series was a single
CodeSourcery patch that documented the textual error as
"E.NAME.MESSAGE", with MESSAGE being 8-bit binary encoded.  But I
think the ship has sailed for that.  GDBserver has been sending error
messages with more than one "." for a long while, and with no binary
encoding.  Still, I've preserved the "Co-Authored-By" list of the
original larger patch.

The 'qRcmd' and 'm' commands are exceptions and do not accept this
reply format.  The top of the "Standard Replies" section already says:

  "All commands support these, except as noted in the individual
  command descriptions."

So this adds a note to the description of 'qRcmd' and 'm', explicitly
stating that they do not support this error reply format.

Change-Id: Ie4fee3d00d82ede39e439bf162e8cb7485532fd8
Co-Authored-By: Jim Blandy <jimb@codesourcery.com>
Co-Authored-By: Mike Wrighton <mike_wrighton@mentor.com>
Co-Authored-By: Nathan Sidwell <nathan@codesourcery.com>
Co-Authored-By: Hafiz Abid Qadeer <abidh@codesourcery.com>
Approved-By: Eli Zaretskii <eliz@gnu.org>

Centralize documentation of error and empty RSP responses

Currently, for each packet, we document the "E NN" response (error),
and the empty response (unsupported).  This patch centralizes that in
a new "Standard Replies" section.

In the "Packets", "General Query Packets", and "Tracepoint Packets"
sections, remove explicit mention of empty and error replies, except
when they provide detail not covered in Standard Replies.

Note this hunk:

-@item E @var{NN}
-@var{NN} is errno

and this one:

-@item E00
-The request was malformed, or @var{annex} was invalid.
-
-@item E @var{nn}
-The offset was invalid, or there was an error encountered reading the data.
-The @var{nn} part is a hex-encoded @code{errno} value.

were really documenting things that don't really work that way.

The first is the documentation of the "m" packet.  GDB does _not_
interpret the NN as an errno.  It can't, in fact, because the
remote/target errno numbers have nothing to do with GDB/host errno
numbers in a cross debugging scenario.

The second hunk above is from the documentation of qXfer.  Again, GDB
does not give any interpretation to the NN error code at all.  Nor
does GDBserver.  And again, an errno number can't be interpreted in a
cross debugging scenario.

Change-Id: I973695c80809cdb5a5e8d5be8b78ba4d1ecdb513
Co-Authored-By: Jim Blandy <jimb@codesourcery.com>
Co-Authored-By: Mike Wrighton <mike_wrighton@mentor.com>
Co-Authored-By: Nathan Sidwell <nathan@codesourcery.com>
Co-Authored-By: Hafiz Abid Qadeer <abidh@codesourcery.com>
Approved-By: Eli Zaretskii <eliz@gnu.org>

Document conventions for describing packet syntax

This commit documents conventions for describing packet syntax in the
Overview section.

Change-Id: I96198592601b24c983da563d143666137e4d0a4e
Co-Authored-By: Jim Blandy <jimb@codesourcery.com>
Co-Authored-By: Mike Wrighton <mike_wrighton@mentor.com>
Co-Authored-By: Nathan Sidwell <nathan@codesourcery.com>
Co-Authored-By: Hafiz Abid Qadeer <abidh@codesourcery.com>
Approved-By: Eli Zaretskii <eliz@gnu.org>

Remove unnecessary get_current_frame calls from infrun.c

Since the frame variable is now a frame_info_ptr, the issue
with the dangling frame pointer is apparently no longer there.

So remove the re-fetch code and the corresponding comments, which
have become misleading in the meantime.

Approved-By: Tom Tromey <tom@tromey.com>

gdb: Add a SECURITY.txt document for GDB

This commit adds a SECURITY document to GDB.  The idea behind this
document is to define what security expectations a user can reasonably
have when using GDB.  In addition the document specifies which bugs
GDB developers consider a security bug, and which are just "normal"
bugs.

Discussion for the creation of this initial version can be found here:

  https://inbox.sourceware.org/gdb-patches/877cmvui64.fsf@redhat.com/

Like any part of GDB, this is not intended as the absolute final
version, instead this is a living document, and this is just a
reasonable starting point from which we can iterate.

For now I've added this document as a text file but I am considering
merging this document into the manual at a later date, and having the
SECURITY.txt file just say "Read the manual".

Approved-By: Tom Tromey <tom@tromey.com>

gdb: specify sh pointer register types

This patch fixes a pretty funny issue on sh targets that occurred
because $pc (and similar registers) were typed as int. When $pc is in
the upper half of the address space (i.e. kernel code on sh), `x/i $pc'
would resolve to a negative value. At least in the case of a remote
target with an Xfer memory map, this leads to a spurious "cannot access
memory" error as negative addresses are out of bounds.

(gdb) x/i $pc
0x8c202c04: Cannot access memory at address 0x8c202c04
(gdb) x/i 0x8c202c04
=> 0x8c202c04 <gintctl_gint_gdb+304>: mov.l @r1,r10

The issue is fixed by specifying pointer types for pc and other pointer
registers. Code pointer registers on sh include pc, pr (return address
of a call), vbr (interrupt handler) and spc (return address after
interrupt). Data pointers include r15 (stack pointer) and gbr (base
register for a few specific addressing modes).
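
The effect of the register type can be shown with a small sketch.  The
helper names are made up, and the code only models how a raw 32-bit
register value widens under each interpretation; it assumes the usual
two's complement conversion.

```cpp
#include <cstdint>

/* Register typed as int: widening sign-extends, so values with the
   top bit set come out negative.  */
static std::int64_t
as_int_register (std::uint32_t raw)
{
  return static_cast<std::int32_t> (raw);
}

/* Register typed as a pointer: widening zero-extends, so the address
   is preserved.  */
static std::uint64_t
as_pointer_register (std::uint32_t raw)
{
  return raw;
}
```

With the $pc value from the session above, 0x8c202c04, the first
helper yields a negative number while the second yields 0x8c202c04.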

Change-Id: I043a058f7cbc6494f380dc0461616a9f3e0d87e0
Approved-By: Simon Marchi <simon.marchi@efficios.com>

objcopy: check input flavor before setting PE/COFF section alignment

coff_section_data() and elf_section_data() use the same underlying
field. The pointer being non-NULL therefore isn't sufficient to know
that pei_section_data() can validly be used on the incoming object.
Apparently in 64-bit-host builds the resulting memory corruption is
benign, whereas in 32-bit-host builds a segmentation fault occurs upon
de-referencing pei_section_data()'s return value.

Automatic date update in version.in

Fix end_sequence addresses for dw2-lines.exp

The patch:

  From f0d556d14b1d1c3f8e2f9c13b08adca22e1b8c9c Mon Sep 17 00:00:00 2001
  From: Tom de Vries <tdevries@suse.de>
  Date: Wed, 17 Apr 2024 12:55:00 +0200
  Subject: [PATCH] [gdb/testsuite] Fix end_sequence addresses

  I noticed in test-case gdb.reverse/map-to-same-line.exp, that the end of main:
  ...
  00000000004102c4 <end_of_sequence>:
    4102c4:       52800000        mov     w0, #0x0                        // #0
    4102c8:       9100c3ff        add     sp, sp, #0x30
    4102cc:       d65f03c0        ret
  ...
  is not described by the line table:
  ...

  <snip>

The regression failure on PowerPC is due to the change in file
dw2-lines.exp,

-               DW_LNE_set_address bar_label_5
+               DW_LNE_set_address "$main_start + $main_len"

The label bar_label_5 is in function bar, not function main.  The new
set address should have been $bar_start + $bar_len.

bpf: fix calculation when deciding to relax branch

In certain cases we were calculating the jump displacement incorrectly
when deciding whether to relax a branch.  This meant for some branches,
such as a very long backwards conditional branch, relaxation was not
done when it should have been.  The result was to error later, because
the actual jump displacement was too large to fit in the original
instruction.

This patch fixes up the displacement calculation so that those branches
are correctly relaxed and no longer result in an error.  In addition, it
changes md_convert_frag to install fixups for the JAL instructions in
the resulting relaxations rather than encoding the displacement value
directly.
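
For readers unfamiliar with the constraint, the check that drives
relaxation can be sketched as follows.  This is not the code in
tc-bpf.c; it assumes the standard BPF encoding, where a jump carries a
signed 16-bit offset counted in 8-byte instructions relative to the
instruction after the jump, and the function name is invented.

```cpp
#include <cstdint>
#include <limits>

/* Return true if a jump at BRANCH_ADDR to TARGET_ADDR (both byte
   addresses, multiples of 8) fits the 16-bit offset field.  If it
   does not, the branch has to be relaxed to a longer sequence.  */
static bool
fits_in_16bit_jump (std::int64_t branch_addr, std::int64_t target_addr)
{
  /* Displacement in instructions, relative to the next instruction.  */
  std::int64_t disp = (target_addr - (branch_addr + 8)) / 8;

  return (disp >= std::numeric_limits<std::int16_t>::min ()
          && disp <= std::numeric_limits<std::int16_t>::max ());
}
```

A backwards branch over 40000 instructions, for instance, does not
fit and must be relaxed.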

gas/
* config/tc-bpf.c (relaxed_branch_length): Correct displacement
calculation when relaxing.
(md_convert_frag): Likewise.  Install fixups for JAL
instructions resulting from relaxation.
* testsuite/gas/bpf/jump-relax-ja-be.d: Correct and expand test.
* testsuite/gas/bpf/jump-relax-ja.d: Likewise.
* testsuite/gas/bpf/jump-relax-ja.s: Likewise.
* testsuite/gas/bpf/jump-relax-jump-be.d: Likewise.
* testsuite/gas/bpf/jump-relax-jump.d: Likewise.
* testsuite/gas/bpf/jump-relax-jump.s: Likewise.

gdb: add type annotations to ada-unicode.py

Add type annotations to ada-unicode.py, just enough to make pyright
happy:

    $ pyright --version
    pyright 1.1.359
    $ pyright ada-unicode.py
    0 errors, 0 warnings, 0 informations

Introduce a `Range` class instead of using separate variables and
tuples, to make the code and type annotations a bit cleaner.

When running ada-unicode.py, I get a diff for ada-casefold.h, but I get
the same diff before and after this patch, so that is a separate issue.

Change-Id: I0d8975a57f9fb115703178ae197dc6b6b8b4eb7a
Approved-By: Tom Tromey <tom@tromey.com>

gdb: remove gdbcmd.h

Most files including gdbcmd.h currently rely on it to access things
actually declared in cli/cli-cmds.h (setlist, showlist, etc). To make
things easy, replace all includes of gdbcmd.h with includes of
cli/cli-cmds.h. This might lead to some unused includes of
cli/cli-cmds.h, but it's harmless, and much faster than going through
the 170 or so files by hand.

Change-Id: I11f884d4d616c12c05f395c98bbc2892950fb00f
Approved-By: Tom Tromey <tom@tromey.com>

gdb: move style_set_list/style_show_list declarations to cli/cli-style.h

They are defined in cli/cli-style.c.

Change-Id: Ic478a3985ff0fd773bd7ba85bb144c6e914d0be6
Approved-By: Tom Tromey <tom@tromey.com>

gdb: remove unused print_command_line and print_command_lines declarations

There is no corresponding definition for print_command_line.

There is already a declaration for print_command_lines in
cli/cli-script.h (the implementation is in cli/cli-script.c).

Change-Id: Ic9e67ed04703306d614383ead14e2b2b059b2a8e
Approved-By: Tom Tromey <tom@tromey.com>

gdb: move execute function declarations from gdbcmd.h to top.h

These functions are implemented in top.c, move their declarations to
top.h.

Change-Id: I8893ef91d955156a6530734fefe8002d78c3e5fc
Approved-By: Tom Tromey <tom@tromey.com>

LoongArch: gas: Simplify relocations in sections without code flag

Gas should not emit ADD/SUB relocation pairs for label differences
if they are in the same section without code flag, even when relaxation
is enabled.  The real value is not affected by relaxation and can be
computed at the assembly stage.  Thus, correct
`TC_FORCE_RELOCATION_SUB_SAME` so that label differences in the same
section without code flag can be resolved in fixup_segment().

LoongArch: Add bad static relocation check and output more information to user

Absolute address symbols cannot be used with -shared.
We output more information to the user than just BFD_ASSERT.

LoongArch: The symbol got type can only be obtained after initialization

When scanning relocations and determining whether TLS type transition is
possible, it will try to obtain the symbol got type. If the symbol got
type record has not yet been allocated space and initialized, it will
cause ld to crash. So when uninitialized, the symbol is set to GOT_UNKNOWN.

Automatic date update in version.in

gdb/testsuite: Add libc_has_debug_info require helper

Factor the test for libc debug info out of gdb.base/relativedebug.exp to
a new procedure.

Also, change the "info sharedlibrary" test to explicitly detect when
libc has debug info.

Approved-by: Kevin Buettner <kevinb@redhat.com>

gdb/doc: Fix incorrect information in RSP doc

The 'PacketSize' attribute of the qSupported packet was
documented to be the maximum size of the packet including
the frame and checksum bytes, however this is not how it
was treated in the code. In reality, PacketSize is the
maximum size of the data in the RSP packets, not including
the framing or checksum bytes.

For instance, GDB's remote.c treats it as the maximum
number of data bytes. See remote_read_bytes_1, where the
size of the request is capped at PacketSize/2 (for
hex-encoding).

Also see gdbserver's server.cc, where the internal buffer
is sized as PBUFSIZ and PBUFSIZ-1 is used as PacketSize.
In gdbserver's case, the buffer is not used for any of the
framing or checksum characters. (I am not certain where the -1
comes from. I think it comes from back when there were no
binary packets, so packets were treated as strings with
null terminators).

It also seems like gdbservers in the wild treat it in
this way:

Embecosm doc:
https://www.embecosm.com/appnotes/ean4/embecosm-howto-rsp-server-ean4-issue-2.html#id3078000

A quick glance over openocd's gdb_server.c gdb_put_packet_inner()
function shows that the internal buffer also excludes the framing
and checksum.

Likewise, QEMU's gdbstub.c allocates PacketSize bytes for
the internal packet contents, and PacketSize+4 for the
full frame.
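
A minimal sketch of the distinction, assuming the usual RSP framing of
'$', payload, '#', and a two-digit hex checksum (the modulo-256 sum of
the payload bytes).  The function names are made up for this example,
and escaping of binary payloads is left out.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

/* Wrap PAYLOAD in RSP framing.  The result is always 4 bytes longer
   than the payload; those 4 bytes are what PacketSize excludes.  */
static std::string
frame_packet (const std::string &payload)
{
  unsigned char sum = 0;
  for (unsigned char c : payload)
    sum = static_cast<unsigned char> (sum + c);

  char trailer[4];
  std::snprintf (trailer, sizeof trailer, "#%02x", sum);

  return "$" + payload + trailer;
}

/* With hex encoding each memory byte takes two payload characters,
   so a read request is capped at PacketSize / 2 bytes.  */
static std::size_t
max_memory_read (std::size_t packet_size)
{
  return packet_size / 2;
}
```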

Reviewed-By: Eli Zaretskii <eliz@gnu.org>
Approved-By: Pedro Alves <pedro@palves.net>

Handle two-linetable function in find_epilogue_using_linetable

Consider the following test-case:
...
$ cat hello.c
int main()
{
  printf("hello ");
  #include "world.inc"
$ cat world.inc
  printf("world\n");
  return 0;
}
$ gcc -g hello.c
...

The line table for the compilation unit, consisting just of
function main, is translated into these two gdb line tables, one for hello.c
and one for world.inc:
...
compunit_symtab: hello.c
symtab: hello.c
INDEX  LINE   REL-ADDRESS UNREL-ADDRESS IS-STMT PROLOGUE-END EPILOGUE-BEGIN
0      3      0x400557    0x400557      Y
1      4      0x40055b    0x40055b      Y
2      END    0x40056a    0x40056a      Y

compunit_symtab: hello.c
symtab: world.inc
INDEX  LINE   REL-ADDRESS UNREL-ADDRESS IS-STMT PROLOGUE-END EPILOGUE-BEGIN
0      1      0x40056a    0x40056a      Y
1      2      0x400574    0x400574      Y
2      3      0x400579    0x400579      Y
3      END    0x40057b    0x40057b      Y
...

The epilogue of main starts at 0x400579:
...
  400579: 5d                    pop    %rbp
  40057a: c3                    ret
...

Now, say we have an epilogue_begin marker in the line table at 0x400579.

We won't find it using find_epilogue_using_linetable, because it does:
...
  const struct symtab_and_line sal = find_pc_line (start_pc, 0);
...
which gets us the line table for hello.c.

Fix this by using "find_pc_line (end_pc - 1, 0)" instead.
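
The difference between the two lookups can be modelled with a toy
version of the tables above.  The struct and function here are
hypothetical and much simpler than GDB's symtab machinery; each table
is reduced to the address range it covers.

```cpp
#include <cstdint>
#include <string>
#include <vector>

/* Hypothetical model: each line table covers [start, end).  */
struct table_range
{
  std::string file;
  std::uint64_t start;
  std::uint64_t end;
};

/* Return the file whose table covers PC, or "" if none does.  */
static std::string
file_for_pc (const std::vector<table_range> &tables, std::uint64_t pc)
{
  for (const table_range &t : tables)
    if (pc >= t.start && pc < t.end)
      return t.file;

  return "";
}

/* The two line tables from the example above.  */
static const std::vector<table_range> example_tables = {
  { "hello.c", 0x400557, 0x40056a },
  { "world.inc", 0x40056a, 0x40057b },
};
```

Looking up the start of main (0x400557) selects hello.c, whereas
looking up end_pc - 1 (0x40057a) selects world.inc, which is the table
that holds the epilogue.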

Tested on x86_64-linux.

Co-Authored-By: Tom de Vries <tdevries@suse.de>
PR symtab/31622
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31622

Fix an out of bounds array access in find_epilogue_using_linetable

An out of bounds array access in find_epilogue_using_linetable causes random
test failures like these:

FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: $fba_value == $fn_fba
FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: check frame-id matches
FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: bt 2
FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: up
FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: $sp_value == $::main_sp
FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: $fba_value == $::main_fba
FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: [string equal $fid $::main_fid]

Here the read happens below the first element of the line
table, and the test failure depends on the value that is
read from there.

It can also happen that std::lower_bound returns a pointer exactly at the
upper bound of the line table.  The value read there is undefined as well;
this happens in this test:

FAIL: gdb.dwarf2/dw2-epilogue-begin.exp: confirm watchpoint doesn't trigger
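
Both failure modes come down to using the iterator returned by
std::lower_bound without checking it against the ends of the table.
The sketch below shows the two guards on a simplified table; the
struct and function names are invented and this is not the actual fix.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <optional>
#include <vector>

/* Hypothetical, simplified line table entry.  */
struct line_entry
{
  std::uint64_t addr;
  int line;
};

using table_iter = std::vector<line_entry>::const_iterator;

/* First entry whose address is not below PC (may be end ()).  */
static table_iter
first_not_below (const std::vector<line_entry> &table, std::uint64_t pc)
{
  return std::lower_bound (table.begin (), table.end (), pc,
                           [] (const line_entry &e, std::uint64_t addr)
                           { return e.addr < addr; });
}

/* Line of the last entry strictly below PC, if there is one.  */
static std::optional<int>
line_before (const std::vector<line_entry> &table, std::uint64_t pc)
{
  table_iter it = first_not_below (table, pc);
  if (it == table.begin ())
    return std::nullopt;	/* Stepping back would leave the table.  */

  return std::prev (it)->line;
}

/* Line of the first entry at or above PC, if there is one.  */
static std::optional<int>
line_at_or_after (const std::vector<line_entry> &table, std::uint64_t pc)
{
  table_iter it = first_not_below (table, pc);
  if (it == table.end ())
    return std::nullopt;	/* Dereferencing end () is undefined.  */

  return it->line;
}
```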

Fixes: 528b729be1a2 ("gdb/dwarf2: Add support for DW_LNS_set_epilogue_begin in line-table")
Co-Authored-By: Tom de Vries <tdevries@suse.de>
PR symtab/31268
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31268

[gdb/testsuite] Fix gdb.threads/threadcrash.exp for remote host

With test-case gdb.threads/threadcrash.exp using host board local-remote-host
and target board remote-gdbserver-on-localhost I run into:
...
(gdb) PASS: gdb.threads/threadcrash.exp: test_gcore: continue to crash
gcore $outputs/gdb.threads/threadcrash/threadcrash.gcore^M
Failed to open '$outputs/gdb.threads/threadcrash/threadcrash.gcore' for output.^M
(gdb) FAIL: gdb.threads/threadcrash.exp: test_gcore: saving gcore
UNSUPPORTED: gdb.threads/threadcrash.exp: test_gcore: couldn't generate gcore file
...

The problem is that the gcore command tries to save a file on a remote host,
but the filename is a location on build.

Fix this by using host_standard_output_file.

Tested on x86_64-linux.

[gdb/testsuite] Fix gdb.threads/threadcrash.exp with glibc debuginfo

After installing glibc debuginfo, I ran into:
...
FAIL: gdb.threads/threadcrash.exp: test_live_inferior: \
  $thread_count == [llength $test_list]
...

This happens because the clause:
...
-re "^\r\n${hs}main$hs$eol" {
...
which is intended to match only:
...
#1  <hex> in main () at threadcrash.c:423^M
...
also matches "remaining" in:
...
#1  <hex> in __GI___nanosleep (requested_time=<hex>, remaining=<hex>) at \
   nanosleep.c:27^M
...

Fix this by checking for "in main" instead.
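
The false match is easy to reproduce outside the testsuite.  The
helper below is a plain substring search written for this example; it
stands in for the Tcl regexp only in spirit.

```cpp
#include <string>

/* Return true if NEEDLE occurs anywhere in LINE.  */
static bool
mentions (const std::string &line, const std::string &needle)
{
  return line.find (needle) != std::string::npos;
}

/* The two backtrace lines quoted above.  */
static const std::string main_frame
  = "#1  <hex> in main () at threadcrash.c:423";
static const std::string nanosleep_frame
  = "#1  <hex> in __GI___nanosleep (requested_time=<hex>, "
    "remaining=<hex>) at nanosleep.c:27";
```

Searching for "main" matches both lines, because "remaining" contains
it, while searching for "in main" matches only the first.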

Tested on x86_64-linux.

Update readelf's display of RELR sections to include the number of locations relocated

gdb: include extract-store-integer.h in charset.c when PHONY_ICONV

When building on a system where "phony iconv" is used (NetBSD in this
case, not sure why), I get:

      CXX    charset.o
    /home/smarchi/src/binutils-gdb/gdb/charset.c: In function 'size_t phony_iconv(int, const char**, size_t*, char**, size_t*)':
    /home/smarchi/src/binutils-gdb/gdb/charset.c:140:8: error: 'extract_unsigned_integer' was not declared in this scope
          = extract_unsigned_integer ((const gdb_byte *)*inbuf, 4, endian);
            ^~~~~~~~~~~~~~~~~~~~~~~~
    /home/smarchi/src/binutils-gdb/gdb/charset.c:140:8: note: suggested alternative: 'btrace_insn_number'
          = extract_unsigned_integer ((const gdb_byte *)*inbuf, 4, endian);
            ^~~~~~~~~~~~~~~~~~~~~~~~
            btrace_insn_number

Add the necessary include.

Change-Id: I10b967584645961c86167a8395d88929a42bef03

PPC maintainers

I'm retiring from IBM, and Geoff hasn't been active for a very long
time.

* MAINTAINERS (ppc): Remove myself and Geoff Keating. Add
Geoff to past maintainers.

buffer overflow in libctf tests

       * testsuite/libctf-regression/gzrewrite.c (main): Don't overflow
       "a" buffer in "after adding types" check.
       * testsuite/libctf-regression/zrewrite.c (main): Likewise.

Automatic date update in version.in

gdb: adjust copyright years of extract-store-integer.{c,h}

The contents of these files were copied from defs.h and findvar.c.  Copy
over the copyright years (1986-2024).

Change-Id: Idfb0f255fbcfda7e107e9a82804cece3d81ed5fc

arm: Fix MVE vmla encoding

bfd: Remove duplicate word in elf-vxworks.c

PR ld/31652
* elf-vxworks.c (elf_vxworks_emit_relocs): Drop duplicate word.

objcopy.c: Fix bfd_copy_private_symbol_data on 32-bit hosts

Use long with bfd_copy_private_symbol_data to fix

.../binutils/objcopy.c: In function ‘copy_object’:
.../binutils/objcopy.c:3383:17: error: comparison of integer expressions of different signedness: ‘unsigned int’ and ‘long int’ [-Werror=sign-compare]
3383 | for (i = 0; i < symcount; i++)
| ^

on 32-bit hosts.

PR binutils/14493
* objcopy.c (copy_object): Use long with
bfd_copy_private_symbol_data.

gdb: move symbol_file_command declaration to symfile.h

Move it out of defs.h, the corresponding definition is in symfile.c.

Change-Id: I984666c3bcd213f8574e9ec91462e1d61f77f16b
Approved-By: Tom Tromey <tom@tromey.com>

gdb: remove enum precision_type

It is unused.

Change-Id: Ic49a3ef03c21b209594cd567ae80b5441606bef6
Approved-By: Tom Tromey <tom@tromey.com>

gdb: move annotation_level declaration/definition to annotate.{h,c}

The declaration of annotation_level is currently in defs.h, while the
definition is in stack.c. I don't really understand why that variable
would live in stack.c, it seems completely unrelated. Move it to
annotate.c, and move the declaration to annotate.h.

Change-Id: I6cf8e9bd20e83959bdf5ad58dd008b6e1187d7d8
Approved-By: Tom Tromey <tom@tromey.com>

gdb: move a bunch of quit-related things to event-top.{c,h}

Move some declarations related to the "quit" machinery from defs.h to
event-top.h.  Most of the definitions associated to these declarations
are in event-top.c.  The exceptions are `quit()` and `maybe_quit()`,
that are defined in utils.c.  For consistency, move these two
definitions to event-top.c.

Include "event-top.h" in many files that use these things.

Change-Id: I6594f6df9047a9a480e7b9934275d186afb14378
Approved-By: Tom Tromey <tom@tromey.com>

gdb: change type of quit_flag to bool

Change-Id: I7dc5189ee172e82ef5b2c4a739c011f43a84258b
Approved-By: Tom Tromey <tom@tromey.com>

gdb: change return type of check_quit_flag to bool

Change the return type of the check_quit_flag function to bool. Update
a few related spots.

Change-Id: I9d3a15d3f8651efb02c7d211f06222a592bd4184
Approved-By: Tom Tromey <tom@tromey.com>

gdb: move declarations of check_quit_flag and set_quit_flag to extension.h

Move them out of defs.h, to extension.h, since the implementations are
in extension.c.

Change-Id: Ie7321468bd7fecc684d70b09f72c3ee8ac75d8f4
Approved-By: Tom Tromey <tom@tromey.com>

gdb: remove unused include in infrun.c

Remove the gdbcmd.h include, which is reported as unused by clangd. Add
cli/cli-cmds.h instead, to get access to `cmdlist` and friends.

Change-Id: Ic0c60d2f6d3618f1bd9fd80b95ffd7c33c692a04

objdump: Round ASCII art lines in jump visualization

gdb/dwarf2/read.c: remove pessimizing std::move

When building with this clang:

    $ c++ --version
    FreeBSD clang version 16.0.6 (https://github.com/llvm/llvm-project.git llvmorg-16.0.6-0-g7cbf1a259152)

I see:

    $ gmake
      CXX    dwarf2/read.o
    /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:4890:6: error: moving a temporary object prevents copy elision [-Werror,-Wpessimizing-move]
                                            std::move (thread_storage.release_parent_map ()));
                                            ^
    /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:4890:6: note: remove std::move call here
                                            std::move (thread_storage.release_parent_map ()));
                                            ^~~~~~~~~~~                                    ~

The compiler seems right, there is no need to std::move the result of
`release_parent_map ()`, it's already going to be an rvalue.  Remove the
std::move.
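
A reduced illustration of the pattern, with made-up types standing in
for the ones in read.c:

```cpp
#include <cstddef>
#include <utility>
#include <vector>

/* Hypothetical stand-in for the storage object in the diagnostic.  */
class map_storage
{
public:
  explicit map_storage (std::vector<int> map)
    : m_map (std::move (map))
  {}

  /* Returns by value, so a call expression is already an rvalue.  */
  std::vector<int> release_map ()
  { return std::exchange (m_map, {}); }

private:
  std::vector<int> m_map;
};

static std::size_t
consume (std::vector<int> map)
{
  return map.size ();
}

static std::size_t
example ()
{
  std::vector<int> values = { 1, 2, 3 };
  map_storage storage (values);

  /* The kind of call the warning is about:
       consume (std::move (storage.release_map ()));
     The plain call below needs no std::move.  */
  return consume (storage.release_map ());
}
```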

The issue isn't FreeBSD-specific, I see it on Linux as well when
building with clang, I just noticed it on a FreeBSD build first.

Change-Id: I7aa20a4db56c799f20d838ad08099a01653bba19
Approved-By: Tom Tromey <tom@tromey.com>

gdb: bump black version to 24.4.0

Run `pre-commit autoupdate`, this is the outcome. There is no change in
formatting of Python files.

Change-Id: I977781fa6cc924c398cc3b9d9954dc0fbb95d082

PR31667, objcopy/strip corrupts solaris binaries

Using want_p_paddr_set_to_zero in commit 45d92439aebd was wrong. Even
solaris targets don't have want_p_paddr_set_to_zero, but we should
handle them at least somewhat reasonably.

PR 31667
* elf.c (IS_SECTION_IN_INPUT_SEGMENT): Remove bed arg, add
paddr_valid. Don't use bed->want_p_paddr_set_to_zero.
(INCLUDE_SECTION_IN_SEGMENT): Likewise.
(rewrite_elf_program_header): Adjust to suit.

ignore some symbols in elf.c:swap_out_syms

The reason behind this patch was noticing that generic ELF targets
fail to remove "bar" in the recently committed ld-elf/undefweak-1
test.  (Despite that, those targets pass the test due to it being too
strict when matching symbols.  "bar" gets turned into a local weak
defined absolute symbol.)

swap_out_syms currently drops local section syms that are defined in
discarded sections.  Extend that to also drop other symbols in
discarded sections, even global symbols.  The linker goes to quite
a lot of effort to ensure globals in discarded sections take a
definition from the kept linkonce or comdat group section.  So the
global sym change should only affect cases where something is quite
wrong about the set of linkonce or comdat group sections.  However
that change to elf_map_symbols meant we dropped _DYNAMIC_LINK /
_DYNAMIC_LINKING for mips, a global absolute symbol given STT_SECTION
type for some reason.  That problem is fixed by reverting the pr14493
change which is no longer needed due to a) BSF_SECTION_SYM_USED on
x86, and b) fixing objcopy to use copy_private_symbol_data.

bfd/
PR 14493
* elf.c (ignore_sym): Rename from ignore_section_sym.  Return
true for any symbol without a section or in a discarded section.
Revert pr14493 change.
(elf_map_symbols): Tidy.  Use ignore_sym on all symbols.
(swap_out_syms): Tidy.
ld/
* testsuite/ld-elf/undefweak-1.rd: Match any "bar".

xfail undefweak-1 test for alpha

".set" has a different meaning on alpha. Changing it to ".equ" runs
into ".equ" having a different meaning on hppa, and changing it to "="
runs into trouble on bfin.

* testsuite/ld-elf/elf.exp (undefweak-1): xfail on alpha,
don't xfail for genelf.

use copy_private_symbol_data in objcopy

osympp appearing twice here is not a bug.

PR 14493
* objcopy.c (copy_object): Run the symbols through
bfd_copy_private_symbol_data.

copy_private_symbol_data

bfd_copy_private_symbol_data is a bfd function that appeared in
commit 89665c8562da a long time ago, but seemingly wasn't used
anywhere until Jan added it to gas/symbols.c in commit 6a2b6326c21e.

The function is used to modify ELF symbol st_shndx for symbols defined
in odd sections like .symtab, so that they get the corresponding
section st_shndx in an output file. This patch fixes some bitrot in
the function. After commit c03551323c04 which introduced
output_elf_obj_tdata, elf_strtab_sec and elf_shstrtab_sec will
segfault if used on an input bfd.

PR 14493
* elf.c (_bfd_elf_copy_private_symbol_data): Don't use
elf_strtab_sec and elf_shstrtab_sec.

gdb: don't include gdbsupport/array-view.h in defs.h

Nothing in defs.h actually uses this. Everything that I (and the
buildbot) can compile still compiles, so I guess that all users of
array_view already include it one way or another. Worst case, if this
causes some build failure, the fix will be one #include away.

Change-Id: I981be98b0653cc18c929d85e9afd8732332efd15
Approved-By: John Baldwin <jhb@FreeBSD.org>

gdb: don't include hashtab.h in defs.h

Nothing in defs.h actually uses this.

Add some includes for some spots using things from hashtab.h. Note that
if the GDB build doesn't use libxxhash, hashtab.h is included by
gdbsupport/common-utils.h, so all files still see hashtab.h. It puzzled
me for some time why I didn't see build failures in my build (which
didn't use libxxhash) but the buildbot gave build failures (it uses
libxxhash).

Change-Id: I8efd68decdaf579f048941c7537cd689885caa2a
Approved-By: John Baldwin <jhb@FreeBSD.org>

gdb: move RequireLongest to gdbsupport/traits.h

Move it out of defs.h.

Change-Id: Ie1743d41a57f81667650048563e66073c72230cf
Approved-By: John Baldwin <jhb@FreeBSD.org>

gdb: move store/extract integer functions to extract-store-integer.{c,h}

Move the declarations out of defs.h, and the implementations out of
findvar.c.

I opted for a new file, because this functionality of converting
integers to bytes and vice-versa seems a bit too generic to live in
findvar.c.

Change-Id: I524858fca33901ee2150c582bac16042148d2251
Approved-By: John Baldwin <jhb@FreeBSD.org>
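
For readers unfamiliar with this kind of helper, here is a minimal sketch
of converting between integers and target-order bytes.  The names
(`byte_order_sketch`, `extract_unsigned_sketch`, `store_unsigned_sketch`)
and signatures are hypothetical, not GDB's actual declarations, and the
sketch only handles unsigned values up to 8 bytes wide.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a target byte order; not GDB's enum.
enum class byte_order_sketch { little, big };

// Read LEN bytes from BUF as an unsigned integer in byte order ORDER.
uint64_t
extract_unsigned_sketch (const unsigned char *buf, size_t len,
                         byte_order_sketch order)
{
  assert (len <= sizeof (uint64_t));
  uint64_t value = 0;
  for (size_t i = 0; i < len; i++)
    {
      // Always consume the most significant byte first.
      size_t idx = order == byte_order_sketch::big ? i : len - 1 - i;
      value = (value << 8) | buf[idx];
    }
  return value;
}

// Write the low LEN bytes of VALUE into BUF in byte order ORDER.
void
store_unsigned_sketch (unsigned char *buf, size_t len,
                       byte_order_sketch order, uint64_t value)
{
  assert (len <= sizeof (uint64_t));
  for (size_t i = 0; i < len; i++)
    {
      // Always emit the least significant byte first.
      size_t idx = order == byte_order_sketch::big ? len - 1 - i : i;
      buf[idx] = value & 0xff;
      value >>= 8;
    }
}
```

Nothing here depends on symbol or frame lookup, which is the sense in
which such code is generic.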

gdb: remove extract_long_unsigned_integer

It is unused.

Change-Id: I5d4091368c4dfc29752b12061e38f1df8353ba74
Approved-By: John Baldwin <jhb@FreeBSD.org>

gdb: move `enum compile_i_scope_types` to compile/compile.h

Move it out of defs.h, adjust the includes here and there.

Change-Id: I11901fdce55d54f5e51723e123cef154cfb1bbc5
Approved-By: John Baldwin <jhb@FreeBSD.org>

gdb: move two declarations out of defs.h

Move declarations of initialize_progspace and initialize_inferiors to
progspace.h and inferior.h, respectively.

Change-Id: I62292ffda429861b9f27d8c836a56d161dfa548d
Approved-By: John Baldwin <jhb@FreeBSD.org>

Automatic date update in version.in

gdb/testsuite: Use default gdb_expect timeout in runto

runto uses a hard-coded timeout of 30s in its invocation of gdb_expect.
This is normally fine, but for a very slow system (e.g., an emulator) it
may not be enough time for GDB to reach the intended breakpoint.

gdb_expect can obtain a timeout value from user-configurable variables
when it's not given one explicitly, so use that mechanism instead since
the user will have already adjusted the timeout variable to account for
the slow system.

Approved-By: Tom Tromey <tom@tromey.com>

gdb: fix unknown variable typo in c-exp.y

Fix 'val' -> 'value' typo in c-exp.y which was breaking the build.
Introduced in commit:

  commit e6375bc8ebbbc177c79f08e9616eb0b131229f65
  Date:   Wed Apr 17 16:17:33 2024 -0600

      Remove some alloca uses

aarch64: Fix coding style issue in `aarch64-dis.c'

Fix integer value being returned from boolean function, as introduced
in `aarch64: Remove asserts from operand qualifier decoders [PR31595]'.

Use std::vector in event-loop.cc

In my occasional and continuing campaign against realloc, this patch
changes event-loop.cc to use std::vector to keep track of pollfd
objects. Regression tested on x86-64 Fedora 38.

Approved-By: Simon Marchi <simon.marchi@efficios.com>
Approved-By: John Baldwin <jhb@FreeBSD.org>
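
As an illustration only (this is not the event-loop.cc code), a set of
pollfd objects kept in a std::vector might look like the sketch below.
The `poll_set_sketch` type and its methods are hypothetical; the point is
that the vector manages growth itself and its storage can be passed
directly to poll.

```cpp
#include <cassert>
#include <poll.h>
#include <vector>

// Hypothetical wrapper: the descriptors to poll live in a std::vector
// instead of a realloc-managed array.
struct poll_set_sketch
{
  std::vector<struct pollfd> fds;

  // Start watching FD for EVENTS.
  void add (int fd, short events)
  {
    assert (fd >= 0);
    struct pollfd entry = {};
    entry.fd = fd;
    entry.events = events;
    fds.push_back (entry);	// the vector grows itself; no realloc call
  }

  // Stop watching FD.  Returns true if an entry was erased.
  bool remove (int fd)
  {
    for (auto it = fds.begin (); it != fds.end (); ++it)
      if (it->fd == fd)
        {
          fds.erase (it);
          return true;
        }
    return false;
  }

  // The contiguous storage can be handed straight to poll.
  int wait (int timeout_ms)
  {
    return poll (fds.data (), fds.size (), timeout_ms);
  }
};
```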

x86/APX: Add invalid check for APX EVEX.X4.

gas/ChangeLog:

        * config/tc-i386.c (build_apx_evex_prefix): Added invalid check for APX
        X4.
        * testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d: Added invalid
        testcase.
        * testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s: Ditto.

opcodes/ChangeLog:

        * i386-dis.c (get_valid_dis386): Added invalid check for APX X4.

Automatic date update in version.in

Remove a couple of VLAs

I found a couple of spots where VLAs are in use but where they can
easily be removed.

In one spot, adding 'const' is enough -- and is already done in
similar code elsewhere in the file.

In another spot, one of two arrays will be used, so making the buffer
large enough for both works.

Approved-By: John Baldwin <jhb@FreeBSD.org>
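
The two cases can be sketched as follows.  The byte patterns and function
names are made up for illustration and are not the arrays touched by the
patch; callers are assumed to pass at least as many readable bytes as the
chosen pattern is long.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical byte patterns, for illustration only.
static const unsigned char short_pattern_sketch[] = { 0x0f, 0x05 };
static const unsigned char long_pattern_sketch[] = { 0x0f, 0x01, 0xf9, 0x90 };

// Case 1: with 'const', LEN is a constant expression, so BUF is an
// ordinary fixed-size array rather than a VLA.
bool
starts_with_short_sketch (const unsigned char *mem)
{
  assert (mem != nullptr);
  const size_t len = sizeof short_pattern_sketch;
  unsigned char buf[len];
  std::memcpy (buf, mem, len);
  return std::memcmp (buf, short_pattern_sketch, len) == 0;
}

// Case 2: only one of the two patterns is used per call, so size the
// buffer for the larger one and use just the first LEN bytes of it.
bool
starts_with_sketch (const unsigned char *mem, bool use_long)
{
  assert (mem != nullptr);
  constexpr size_t max_len
    = (sizeof long_pattern_sketch > sizeof short_pattern_sketch
       ? sizeof long_pattern_sketch : sizeof short_pattern_sketch);
  unsigned char buf[max_len];

  const unsigned char *pattern
    = use_long ? long_pattern_sketch : short_pattern_sketch;
  const size_t len
    = use_long ? sizeof long_pattern_sketch : sizeof short_pattern_sketch;

  std::memcpy (buf, mem, len);
  return std::memcmp (buf, pattern, len) == 0;
}
```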

Remove some alloca uses

A few spots (mostly in the parsers) use alloca to ensure that a string
is terminated before passing it to a printf-like function (mostly
'error'). However, this isn't needed as the "%.*s" format can be used
instead.

This patch makes this change.

In one spot the alloca is dead code and is simply removed.

Regression tested on x86-64 Fedora 38.

Approved-By: John Baldwin <jhb@FreeBSD.org>
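
A minimal sketch of the "%.*s" technique, using snprintf as a stand-in
for a printf-like function such as 'error'.  The helper name and message
text are hypothetical.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: build a message about a token that is given as a
// pointer plus length and is not NUL-terminated at that length.
std::string
format_bad_token_sketch (const char *start, int length)
{
  assert (start != nullptr && length >= 0);
  char buf[128];
  // "%.*s" takes the precision as an int argument and prints at most
  // that many bytes, so no terminated copy of the token is needed.
  std::snprintf (buf, sizeof buf, "Invalid token \"%.*s\".", length, start);
  return buf;
}
```

For example, format_bad_token_sketch ("foo+bar", 3) only consumes the
first three bytes of its argument.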

Automatic date update in version.in

LoongArch: Add -mignore-start-align option

Ignoring .align at the start of a section may result in misalignment when
partial linking. Add the -mignore-start-align option manually when not
partial linking.

GCC's -falign-functions adds .align 5 to the start of a section, which causes
some error messages to mismatch. Set these testcases to xfail on LoongArch
targets.

Error compiling libctf-regression test

Seen on 64-bit targets.
ERROR: compilation of lookup program .../libctf-regression/gzrewrite.c failed

* testsuite/libctf-regression/gzrewrite.c (main): Use %zu to
print size_t values.
* testsuite/libctf-regression/zrewrite.c (main): Likewise.
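
For reference, a small sketch of printing a size_t with %zu.  The helper
is hypothetical and is not taken from the tests; the conversion the tests
used before this fix is not shown in this log.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: format a size_t portably.
std::string
describe_size_sketch (size_t nbytes)
{
  char buf[64];
  // %zu is the conversion for size_t, so it matches the argument on
  // both 32-bit and 64-bit targets.
  int len = std::snprintf (buf, sizeof buf, "wrote %zu bytes", nbytes);
  assert (len > 0 && static_cast<size_t> (len) < sizeof buf);
  return buf;
}
```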

Automatic date update in version.in

gdb: add target_debug_printf and target_debug_printf_nofunc

Add the `target_debug_printf` and `target_debug_printf_nofunc` macros
and use them when outputting debug messages depending on `targetdebug`.
I opted for `target_debug_printf_nofunc` to follow the current style
where the function name is already printed, along with the arguments.

Modify the debug printfs in the `debug_target` methods (generated by
`make-target-delegates.py`) to use `target_debug_printf_nofunc` as well.

This makes the "target" debug prints integrate nicely with the other
debug prints that use the "new" debug print system:

    [infrun] proceed: enter
      [infrun] follow_fork: enter
        [target] -> multi-thread->record_will_replay (...)
        [target] <- multi-thread->record_will_replay (-1, 0) = false
        [target] -> multi-thread->supports_multi_process (...)
        [target] <- multi-thread->supports_multi_process () = true
      [infrun] follow_fork: exit
      ...

Change-Id: Ide3c8c1b8a30e6d4c353a29cba911c7192de29ac
Approved-By: Tom Tromey <tom@tromey.com>

gdb: make regcache::debug_print_register return a string

Rename the method to `register_debug_string`.

This makes it easier to introduce `target_debug_printf` in a subsequent
patch.

Change-Id: I5bb2d49476d17940d503e66f40762e3f1e3baabc
Approved-By: Tom Tromey <tom@tromey.com>

gdb: make debug_target use one-liners

Turn the debug prints in debug_target's methods into one-liners.  For
instance, change this:

    gdb_printf (gdb_stdlog, "<- %s->wait (", this->beneath ()->shortname ());
    gdb_puts (target_debug_print_ptid_t (arg0), gdb_stdlog);
    gdb_puts (", ", gdb_stdlog);
    gdb_puts (target_debug_print_target_waitstatus_p (arg1), gdb_stdlog);
    gdb_puts (", ", gdb_stdlog);
    gdb_puts (target_debug_print_target_wait_flags (arg2), gdb_stdlog);
    gdb_puts (") = ", gdb_stdlog);
    gdb_puts (target_debug_print_ptid_t (result), gdb_stdlog);
    gdb_puts ("\n", gdb_stdlog);

into this:

    gdb_printf (gdb_stdlog,
               "<- %s->wait (%s, %s, %s) = %s\n",
               this->beneath ()->shortname (),
               target_debug_print_ptid_t (arg0).c_str (),
               target_debug_print_target_waitstatus_p (arg1).c_str (),
               target_debug_print_target_wait_flags (arg2).c_str (),
               target_debug_print_ptid_t (result).c_str ());

This makes it possible for a subsequent patch to turn this gdb_printf
call into a `target_debug_printf` call.

Change-Id: I808202438972fac1bba2f8ccb63e66a4fcef20c9
Approved-By: Tom Tromey <tom@tromey.com>

gdb: make target debug functions return std::string

Change the functions in target-debug.h to return string representations
in an std::string, such that they don't need to know how the printing
part is done. This also helps the following patch that makes the debug
prints in debug_target one-liners.

Update target-delegates.c (through make-target-delegates.py) to do the
printing.

Add an overload of gdb_puts to avoid using `.c_str ()`.

Change-Id: I55cbff1c1b03a3b24a81740e34c6ad41ac4f8453
Approved-By: Tom Tromey <tom@tromey.com>
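
The separation between formatting and printing can be sketched like this.
All names below are hypothetical and much simpler than the functions in
target-debug.h: the formatters only build strings, and the caller decides
how to assemble and emit the final line.

```cpp
#include <cassert>
#include <string>

// Hypothetical formatters: they return a representation and know
// nothing about where or how it gets printed.
std::string
debug_string_int_sketch (int value)
{
  return std::to_string (value);
}

std::string
debug_string_bool_sketch (bool value)
{
  return value ? "true" : "false";
}

// Hypothetical caller: assemble one complete line from the pieces.
std::string
format_call_sketch (const char *target, const char *method,
                    const std::string &args, const std::string &result)
{
  assert (target != nullptr && method != nullptr);
  return (std::string ("<- ") + target + "->" + method
          + " (" + args + ") = " + result);
}
```

Because each piece is a value rather than a side effect, the whole line
can be produced by a single print call.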

gdb: fix include for gdb_signal in target/waitstatus.h

clangd tells me that the gdb_signals.h include in target/waitstatus.h is
unused.  This include was probably to give access to `enum gdb_signal`,
but this is in fact defined in gdb/signals.h.  Change the include to
gdb/signals.h.  Include gdbsupport/gdb_signals.h in some files that were
relying on the transitive include.

Change-Id: I6f4361b3d801394bf29abe8c1393aff110aa0ad6

gdb: convert target debug macros to functions

Convert all the macros to static functions. Some macros were unused,
and now that they are functions, got flagged by the compiler, so I
omitted them.

No behavior change expected.

Change-Id: Ia88e61d95e29a0378901c71aa50df7c37d16bebe
Approved-By: Tom Tromey <tom@tromey.com>

gdb: add includes in target-debug.h

Editing target-debug.h with clangd shows a bunch of errors. Add some
includes to fix that (make target-debug.h include what it uses).

Change-Id: I49075a171e6875fa516d6b2ce56b4a03ac7b3376

libctf: do not include undefined functions in libctf.ver

libctf's version script is applied to two libraries: libctf.so,
and libctf-nobfd.so. The latter library is a subset of the former
which does not link to libbfd and does not include a few public
entry points that use it (found in libctf-open-bfd.c). This means
that some of the symbols in this version script only exist in one
of the libraries it's applied to.

A number of linkers dislike this: before now, only Solaris's linker
caused serious problems, introducing NOTYPE-typed symbols when such
things were found, but now LLD has started to complain as well:

ld: error: version script assignment of 'LIBCTF_1.0' to symbol 'ctf_arc_open' failed: symbol not defined
ld: error: version script assignment of 'LIBCTF_1.0' to symbol 'ctf_fdopen' failed: symbol not defined
ld: error: version script assignment of 'LIBCTF_1.0' to symbol 'ctf_open' failed: symbol not defined
ld: error: version script assignment of 'LIBCTF_1.0' to symbol 'ctf_bfdopen' failed: symbol not defined
ld: error: version script assignment of 'LIBCTF_1.0' to symbol 'ctf_bfdopen_ctfsect' failed: symbol not defined

Rather than adding more and more whack-a-mole fixes for every
linker we encounter that does this, simply exclude such symbols
unconditionally, using the same trick we used to use for Solaris.
(Well, unconditionally if we can use version scripts with this
linker at all, which is not always the case.)

Thanks to Nicholas Vinson for the original report and a fix very
similar to this one (but not quite identical).

libctf/

* configure.ac: Always exclude libctf symbols from
libctf-nobfd's version script.
* configure: Regenerated.

libctf: Remove undefined functions from ver. map

Starting with ld.lld-17, ld.lld is invoked with the option
--no-undefined-version enabled by default. Furthermore, the functions
ctf_label_set() and ctf_label_get() are not defined. Their inclusion in
libctf/libctf.ver causes ld.lld-17 to fail, emitting the following error
messages:

ld.lld: error: version script assignment of 'LIBCTF_1.0' to symbol 'ctf_label_set' failed: symbol not defined
ld.lld: error: version script assignment of 'LIBCTF_1.0' to symbol 'ctf_label_get' failed: symbol not defined

This patch fixes the issue by removing the symbol names from
libctf/libctf.ver.

[nca: fused in later commit that marked ctf_arc_open as libctf
only as well. Added ChangeLog entry.]

Signed-off-by: Nicholas Vinson <nvinson234@gmail.com>
libctf/
* libctf.ver: drop nonexistent label functions: mark
ctf_arc_open as libctf-only.

libctf: don't pass errno into ctf_err_warn so often

The libctf-internal warning function ctf_err_warn() can be passed a libctf
errno as a parameter, and will add its textual errmsg form to the passed-in
error message. But if there is an error on the fp already, and this is
specifically an error and not a warning, ctf_err_warn() will print the error
out regardless: there's no need to pass in anything but 0.

There are still a lot of places where we do

ctf_err_warn (fp, 0, EFOO, ...);
return ctf_set_errno (fp, EFOO);

I've left all of those alone, because fixing it makes the code a bit longer:
but fixing the cases where no return is involved and the error has just been
set on the fp itself costs nothing and reduces redundancy a bit.

libctf/

* ctf-dedup.c (ctf_dedup_walk_output_mapping): Drop the errno arg.
(ctf_dedup_emit): Likewise.
(ctf_dedup_type_mapping): Likewise.
* ctf-link.c (ctf_create_per_cu): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating_one_symtypetab): Likewise.
(ctf_link_deduplicating_per_cu): Likewise.
* ctf-lookup.c (ctf_lookup_symbol_idx): Likewise.
* ctf-subr.c (ctf_assert_fail_internal): Likewise.

libctf: fix leak in test

This purely serves to make it easier to interpret valgrind output.
No functional effect.

libctf/
* testsuite/libctf-lookup/conflicting-type-syms.c: Free everything.

libctf: add rewriting tests

Now there's a chance of it actually working, we can add more tests for
the long-broken dict read-and-rewrite cases. This is the first ever
test for the (rarely-used, unpleasant, and until recently completely
broken) ctf_gzwrite function.

libctf/

* testsuite/libctf-regression/gzrewrite*: New test.
* testsuite/libctf-regression/zrewrite*: Likewise.

libctf: fix a debugging typo

libctf/

* ctf-lookup.c (ctf_symidx_sort): Fix a debugging typo.

libctf: make ctf_lookup of symbols by name work in more cases

In particular, we don't need a symbol table if we're looking up a
symbol by name and that type of symbol has an indexed symtypetab,
since in that case we get the name from the symtypetab index, not
from the symbol table.

This lets you do symbol lookups in unlinked object files and unlinked
dicts written out via libctf's writeout functions.

libctf/

* ctf-lookup.c (ctf_lookup_by_sym_or_name): Allow lookups
by index even when there is no symtab.

libctf: improve handling of type dumping errors

When dumping a type fails with an error, we want to emit a warning noting
this: a warning because it's not fatal and we can continue. But warnings
don't automatically print out the ctf_errno (because not all cases causing
warnings set the errno at all), so we must do it at warning-emission time or
lose track of what's gone wrong.

libctf/

* ctf-dump.c (ctf_dump_format_type): Dump the underlying error on
type dump failure.

libctf: fix tiny dumping error

Without this, you might get things like this in the output:

Flags: 0xa (CTF_F_NEWFUNCINFO, , CTF_F_DYNSTR)

Note the spurious comma.

libctf/
* ctf-dump.c (ctf_dump_header): Fix comma emission.
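
One way to avoid this class of bug, sketched with a hypothetical table
(this is not the ctf-dump.c code): emit the separator only when a name is
actually appended after an earlier one.  The bit values 0x2 and 0x8 are
chosen so that 0xa reproduces the example above; the other two entries
and their names are placeholders.

```cpp
#include <cassert>
#include <string>

struct flag_name_sketch
{
  unsigned bit;
  const char *name;
};

// Placeholder table, assumed for illustration only.
static const flag_name_sketch flag_table_sketch[] = {
  { 0x1, "FLAG_1_SKETCH" },
  { 0x2, "CTF_F_NEWFUNCINFO" },
  { 0x4, "FLAG_4_SKETCH" },
  { 0x8, "CTF_F_DYNSTR" },
};

// Return the names of the flags set in FLAGS, separated by ", ".
std::string
flag_names_sketch (unsigned flags)
{
  std::string out;
  for (const flag_name_sketch &entry : flag_table_sketch)
    {
      assert (entry.name != nullptr);
      if ((flags & entry.bit) == 0)
        continue;		// unset flag: no name and no separator
      if (!out.empty ())
        out += ", ";
      out += entry.name;
    }
  return out;
}
```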

libctf: make ctf_serialize() actually serialize

ctf_serialize() evolved from the old ctf_update(), which mutated the
in-memory CTF dict to make all the dynamic in-memory types into static,
unchanging written-to-the-dict types (by deserializing and reserializing
it): back in the days when you could only do type lookups on static types,
this meant you could see all the types you added recently, at the small,
small cost of making it impossible to change those older types ever again
and inducing an amortized O(n^2) cost if you actually wanted to add
references to types you added at arbitrary times to later types.

It also reset things so that ctf_discard() would throw away only types you
added after the most recent ctf_update() call.

Some time ago this was all changed so that you could look up dynamic types
just as easily as static types: ctf_update() changed so that only its
visible side-effect of affecting ctf_discard() remained: the old
ctf_update() was renamed to ctf_serialize(), made internal to libctf, and
called from the various functions that wrote files out.

... but it was still working by serializing and deserializing the entire
dict, swapping out its guts with the newly-serialized copy in an invasive
and horrible fashion that coupled ctf_serialize() to almost every field in
the ctf_dict_t.  This is totally useless, and fixing it is easy: just rip
all that code out and have ctf_serialize return a serialized representation,
and let everything use that directly.  This simplifies most of its callers
significantly.

(It also points up another bug: ctf_gzwrite() failed to call ctf_serialize()
at all, so it would only ever work for a dict you just ctf_write_mem()ed
yourself, just for its invisible side-effect of serializing the dict!)

This lets us simplify away a bunch of internal-only open-side functionality
for overriding the syn_ext_strtab and some just-added functionality for
forcing in an existing atoms table, without loss of functionality, and lets
us lift the restriction on reserializing a dict that was ctf_open()ed rather
than being ctf_create()d: it's now perfectly OK to open a dict, modify it
(except for adding members to existing structs, unions, or enums, which
fails with -ECTF_RDONLY), and write it out again, just as one would expect.

libctf/

* ctf-serialize.c (ctf_symtypetab_sect_sizes): Fix typos.
(ctf_type_sect_size): Add static type sizes too.
(ctf_serialize): Return the new dict rather than updating the
existing dict.  No longer fail for dicts with static types;
copy them onto the start of the new types table.
(ctf_gzwrite): Actually serialize before gzwriting.
(ctf_write_mem): Improve forced (test-mode) endian-flipping:
flip dicts even if they are too small to be compressed.
Improve confusing variable naming.
* ctf-archive.c (arc_write_one_ctf): Don't bother to call
ctf_serialize: both the functions we call do so.
* ctf-string.c (ctf_str_create_atoms): Drop serializing case
(atoms arg).
* ctf-open.c (ctf_simple_open): Call ctf_bufopen directly.
(ctf_simple_open_internal): Delete.
(ctf_bufopen_internal): Delete/rename to ctf_bufopen: no
longer bother with syn_ext_strtab or forced atoms table,
serialization no longer needs them.
* ctf-create.c (ctf_create): Call ctf_bufopen directly.
* ctf-impl.h (ctf_str_create_atoms): Drop atoms arg.
(ctf_simple_open_internal): Delete.
(ctf_bufopen_internal): Likewise.
(ctf_serialize): Adjust.
* testsuite/libctf-lookup/add-to-opened.c: Adjust now that
this is supposed to work.

libctf: rethink strtab writeout

This commit finally adjusts strtab writeout so that repeated writeouts, or
writeouts of a dict that was read in earlier, only sort the portion of the
strtab that was newly added.

There are three intertwined changes here:

- pull the contents of strtabs from newly ctf_bufopened dicts into the
   atoms table, so that future additions will reuse the existing offset etc
   rather than adding new identical strings
- allow the internal ctf_bufopen done by serialization to contribute its
   existing atoms table, so that existing atoms can be used for the
   remainder of the open process (like name table construction): this atoms
   table currently gets thrown away in the mass reassignment done later in
   ctf_serialize in any case, but it needs to be there during the open.
- rewrite ctf_str_write_strtab so that a) it uses iterators rather than
   ctf_*_iter, reducing pointless structures which serve no other purpose
   than to implement ordinary variable scope, but more clunkily, and b)
   retains the existing strtab on the front of the new one, with its sort
   retained, rather than resorting, so all existing already-written strtab
   offsets remain valid across the call.

This latter change finally permits repeated serializations, and
reserializations of ctf_open()ed dicts, to work, but for now we keep the
code that prevents that because serialization is about to change again in a
way that will make it more obvious that doing such things is safe, and we
can take it out then.

(There are also some smaller changes like moving the purge of the refs table
into ctf_str_write_strtab(), since that's where the changes happen that
invalidate it, rather than doing it in ctf_serialize().  We also prohibit
something that has never worked, opening a dict and then reporting symbols
to it via ctf_link_add_strtab() et al: you must do that to newly-created
dicts which have had stuff ctf_link()ed into them.  This is very unlikely
ever to be a problem in practice: linkers just don't do that sort of thing.)

libctf/

* ctf-create.c (ctf_create): Add (temporary) atoms arg.
* ctf-impl.h (struct ctf_dict.ctf_dynstrtab): New.
(ctf_str_create_atoms): Adjust.
(ctf_str_write_strtab): Likewise.
(ctf_simple_open_internal): Likewise.
* ctf-open.c (ctf_simple_open_internal): Add atoms arg.
(ctf_bufopen): Likewise.
(ctf_bufopen_internal): Initialize just enough of an
atoms table: pre-init from the atoms arg if supplied.
(ctf_simple_open): Adjust.
* ctf-serialize.c (ctf_serialize): Constify the strtab.
Move ref list purging into ctf_str_write_strtab.
Initialize the new dict with the old dict's atoms table.
Accept the new strtab from ctf_str_write_strtab.
Adjust for addition of ctf_dynstrtab.
* ctf-string.c (ctf_strraw_explicit): Improve comments.
(ctf_str_create_atoms): Prepopulate from an existing atoms table,
or alternatively pull in all strings from the strtab and turn
them into atoms.
(ctf_str_free_atoms): Free the dynstrtab and its strtab.
(struct ctf_strtab_write_state): Remove.
(ctf_str_count_strtab): Fold this...
(ctf_str_populate_sorttab): ... and this...
(ctf_str_write_strtab): ... into this.  Prepend existing strings
to the strtab rather than resorting them (and wrecking their
offsets).  Keep the dynstrtab updated.  Update refs for all
atoms with refs, whether or not they are strings newly added
to the strtab.
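
The core idea (keep the existing strtab as a prefix, sort only what is
new) can be sketched as below.  This is a simplified model with
hypothetical names, not the ctf_str_write_strtab implementation: it
ignores atoms, refs and deduplication, and represents the strtab as a
std::string holding consecutive NUL-terminated strings.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Append NEW_STRINGS to STRTAB, sorting only the new strings.  Returns
// the offsets of the new strings, in their sorted order.  Bytes already
// in STRTAB are never moved, so offsets handed out earlier stay valid.
std::vector<size_t>
append_sorted_sketch (std::string &strtab,
                      std::vector<std::string> new_strings)
{
  assert (strtab.empty () || strtab.back () == '\0');
  std::sort (new_strings.begin (), new_strings.end ());

  std::vector<size_t> offsets;
  for (const std::string &str : new_strings)
    {
      offsets.push_back (strtab.size ());
      strtab.append (str);
      strtab.push_back ('\0');
    }
  return offsets;
}
```

The cost is that the table as a whole is no longer globally sorted, only
each appended portion.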

libctf: replace 'pending refs' abstraction

A few years ago we introduced a 'pending refs' abstraction to fix one
problem: serializing a dict, then changing it would tend to corrupt the dict
because the strtab sort we do on strtab writeout (to improve compression
efficiency) would modify the offset of any strings that sorted
lexicographically earlier in the strtab: so we added a new restriction that
all strings are added only at serialization time, and maintained a set of
'pending' refs that were added earlier, whose offsets we could update (like
other refs) at writeout time.

This was in hindsight seriously problematic for maintenance (because
serialization has to traverse all strings in all datatypes in the entire
dict), and has become impossible to sustain now that we can read in existing
dicts, modify them, and reserialize them again.  We really don't want to
have to dig through the entire dict we just read in just in order to dig out
all its strtab offsets, then *change* it, just for the sake of a sort that
adds a frankly trivial amount of compression efficiency.

Sorting *is* still worthwhile -- but it sacrifices very little to only sort
newly-added portions of the strtab, reusing older portions as necessary.
As a first stage in this, discard the whole "pending refs" abstraction and
replace it with "movable" refs, which are exactly like all other refs
(addresses containing the strtab offset of some string, which are updated
with the final strtab offset on serialization) except that we track them in
a reverse dict so that we can move the refs around (which we do whenever we
realloc() a buffer containing a bunch of structure members or something when
we add members to the structure).

libctf/

* ctf-create.c (ctf_add_enumerator): Call ctf_str_move_refs; add
        a movable ref.
(ctf_add_member_offset): Likewise.
* ctf-util.c (ctf_realloc): Delete.
* ctf-serialize.c (ctf_serialize): No longer use it.  Adjust to
new fields.
* ctf-string.c (ctf_str_purge_atom_refs): Purge movable refs.
(ctf_str_free_atom): Free freeable atoms' strings.
(ctf_str_create_atoms): Create the movable refs dynhash if needed.
(ctf_str_free_atoms): Destroy it.
(CTF_STR_MOVABLE): Switch (back) from ints to flags (see previous
reversion).  Add new flag.
(aref_create):  New, populate movable refs if need be.
(ctf_str_add_ref_internal): Switch back to flags, update refs
directly for nonprovisional strings (with already-known fixed offsets);
create refs via aref_create.  Allocate strings only if not within an
mmapped strtab.
(ctf_str_add_movable_ref): New.
(ctf_str_add): Adjust to CTF_STR_* reintroduction.
(ctf_str_add_external): Likewise.
(ctf_str_move_refs): New, move refs via ctf_str_movable_refs
backpointer.
(ctf_str_purge_refs): Drop ctf_str_num_refs.
(ctf_str_update_refs): Fix indentation.
* ctf-impl.h (struct ctf_str_atom_movable): New.
(struct ctf_dict.ctf_str_num_refs): Drop.
(struct ctf_dict.ctf_str_movable_refs): New.
(ctf_str_add_movable_ref): Declare.
(ctf_str_move_refs): Likewise.
(ctf_realloc): Drop.
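
A much-simplified model of movable refs, for illustration.  The types and
method names are hypothetical and use standard containers rather than
libctf's dynhash; the sketch assumes the old and new buffers do not
overlap, as is the case after a realloc() that moved the block.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// An atom: one string plus the addresses that must receive its offset.
struct atom_sketch
{
  std::vector<uint32_t *> refs;
};

struct atoms_table_sketch
{
  std::map<std::string, atom_sketch> atoms;
  // Reverse mapping from ref address to atom, so refs can be found by
  // address when the memory containing them moves.
  std::map<uint32_t *, atom_sketch *> movable_refs;

  void add_movable_ref (const std::string &str, uint32_t *ref)
  {
    atom_sketch &atom = atoms[str];
    atom.refs.push_back (ref);
    movable_refs[ref] = &atom;
  }

  // A buffer of LEN bytes at SRC has been copied to DEST: repoint every
  // movable ref that lived inside the old buffer.
  void move_refs (void *src, size_t len, void *dest)
  {
    if (src == dest)
      return;
    const uintptr_t base = reinterpret_cast<uintptr_t> (src);
    std::vector<std::pair<uint32_t *, atom_sketch *>> moving;
    for (const auto &entry : movable_refs)
      {
        uintptr_t addr = reinterpret_cast<uintptr_t> (entry.first);
        if (addr >= base && addr - base < len)
          moving.emplace_back (entry.first, entry.second);
      }
    for (const auto &entry : moving)
      {
        size_t delta = reinterpret_cast<uintptr_t> (entry.first) - base;
        uint32_t *moved = reinterpret_cast<uint32_t *>
          (static_cast<char *> (dest) + delta);
        for (uint32_t *&ref : entry.second->refs)
          if (ref == entry.first)
            ref = moved;
        movable_refs.erase (entry.first);
        movable_refs[moved] = entry.second;
      }
  }

  // At writeout time, store the final strtab offset through every ref.
  void update_refs (const std::string &str, uint32_t offset)
  {
    auto it = atoms.find (str);
    assert (it != atoms.end ());
    for (uint32_t *ref : it->second.refs)
      *ref = offset;
  }
};
```

Without the reverse mapping, moving a buffer would leave refs pointing
into the old storage, and the later offset update would write there.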

Revert "libctf: do not corrupt strings across ctf_serialize"

This reverts commit 986e9e3aa03f854bedacef7fac38fe8f009a416c.

(We do not revert the testcase -- it remains valid -- but we are
taking a different, less complex and more robust approach.)

This also deletes the pending refs abstraction without (yet)
replacing it, so some tests will fail for a commit or two.

libctf: rename ctf_dict.ctf_{symtab,strtab}

These two fields are constantly confusing because CTF dicts contain both
a symtypetab and strtab, but these fields are not that: they are the
symtab and strtab from the ELF file. We have enough string tables now
(internal, external, synthetic external, dynamic) that we need to at
least name them better than this to avoid getting totally confused.
Rename them to ctf_ext_symtab and ctf_ext_strtab.

libctf/

* ctf-dump.c (ctf_dump_objts): Rename ctf_symtab -> ctf_ext_symtab.
* ctf-impl.h (struct ctf_dict.ctf_symtab): Rename to...
(struct ctf_dict.ctf_ext_symtab): ... this.
(struct ctf_dict.ctf_strtab): Rename to...
(struct ctf_dict.ctf_ext_strtab): ... this.
* ctf-lookup.c (ctf_lookup_symbol_name): Adapt.
(ctf_lookup_symbol_idx): Adapt.
(ctf_lookup_by_sym_or_name): Adapt.
* ctf-open.c (ctf_bufopen_internal): Adapt.
(ctf_dict_close): Adapt.
(ctf_getsymsect): Adapt.
(ctf_getstrsect): Adapt.
(ctf_symsect_endianness): Adapt.

libctf: fix a comment typo

ctf_update has been called ctf_serialize for years now.

libctf/

* ctf-impl.h: Fix comment typo.

libctf: delete LCTF_DIRTY

This flag was meant as an optimization to avoid reserializing dicts
unnecessarily.  It was critically necessary back when serialization was
done by ctf_update() and you had to call that every time you wanted any
new modifications to the type table to be usable by other types, but
that has been unnecessary for years now, and serialization is only done
once when writing out, which one would naturally assume would always
serialize the dict.  Worse, it never really worked: it only tracked
newly-added types, not things like added symbols which might equally
well require reserialization, and it gets in the way of an upcoming
change.  Delete entirely.

libctf/

* ctf-create.c (ctf_create): Drop LCTF_DIRTY.
(ctf_discard): Likewise.
(ctf_rollback): Likewise.
(ctf_add_generic): Likewise.
(ctf_set_array): Likewise.
(ctf_add_enumerator): Likewise.
(ctf_add_member_offset): Likewise.
(ctf_add_variable_forced): Likewise.
* ctf-link.c (ctf_link_intern_extern_string): Likewise.
(ctf_link_add_strtab): Likewise.
* ctf-serialize.c (ctf_serialize): Likewise.
* ctf-impl.h (LCTF_DIRTY): Likewise.
(LCTF_LINKING): Renumber.

libctf: fix a comment

A mistaken "not" in ctf_err_warn made it seem like we only extracted
error messages if this was not an error.

libctf/

* ctf-subr.c (ctf_err_warn): Fix comment.

libctf: support addition of types to dicts read via ctf_open()

libctf has long declared deserialized dictionaries (out of files or ELF
sections or memory buffers or whatever) to be read-only: back in the
furthest prehistory this was not the case, in that you could add a
few sorts of type to such dicts, but attempting to do so often caused
horrible memory corruption, so I banned the lot.

But it turns out real consumers want it (notably DTrace, which
synthesises pointers to types that don't have them and adds them to the
ctf_open()ed dicts if it needs them). Let's bring it back again, but
without the memory corruption and without the massive code duplication
required in days of yore to distinguish between static and dynamic
types: the representation of both types has been identical for a few
years, with the only difference being that types as a whole are stored in
a big buffer for types read in via ctf_open and per-type hashtables for
newly-added types.

So we discard the internally-visible concept of "readonly dictionaries"
in favour of declaring the *range of types* that were already present
when the dict was read in to be read-only: you can't modify them (say,
by adding members to them if they're structs, or calling ctf_set_array
on them), but you can add more types and point to them.  (The API
remains the same, with calls sometimes returning ECTF_RDONLY, but now
they do so less often.)

This is a fairly invasive change, mostly because code written since the
ban was introduced didn't take the possibility of a static/dynamic split
into account.  Some of these irregularities were hard to define as
anything but bugs.

Notably:

- The symbol handling was assuming that symbols only needed to be
   looked for in dynamic hashtabs or static linker-laid-out indexed/
   nonindexed layouts, but now we want to check both in case people
   added more symbols to a dict they opened.

- The code that handles type additions wasn't checking to see if types
   with the same name existed *at all* (so you could do
   ctf_add_typedef (fp, "foo", bar) repeatedly without error).  This
   seems reasonable for types you just added, but we probably *do* want
   to ban addition of types with names that override names we already
   used in the ctf_open()ed portion, since that would probably corrupt
   existing type relationships.  (Doing things this way also avoids
   causing new errors for any existing code that was doing this sort of
   thing.)

- ctf_lookup_variable entirely failed to work for variables just added
   by ctf_add_variable: you had to write the dict out and read it back
   in again before they appeared.

- The symbol handling remembered what symbols you looked up but didn't
   remember their types, so you could look up an object symbol and then
   find it popping up when you asked for function symbols, which seems
   less than ideal.  Since we had to rejig things enough to be able to
   distinguish function and object symbols internally anyway (in order
   to give suitable errors if you try to add a symbol with a name that
   already existed in the ctf_open()ed dict), this bug suddenly became
   more visible and was easily fixed.

We do not (yet) support writing out dicts that have been previously read
in via ctf_open() or other deserializer (you can look things up in them,
but not write them out a second time).  This never worked, so there is
no incompatibility; if it is needed at a later date, the serializer is a
little bit closer to having it work now (the only table we don't deal
with is the types table, and that's because the upcoming CTFv4 changes
are likely to make major changes to the way that table is represented
internally, so adding more code that depends on its current form seems
like a bad idea).

There is a new testcase that tests much of this, in particular that
modification of existing types is still banned and that you can add new
ones and chase them without error.

libctf/

* ctf-impl.h (struct ctf_dict.ctf_symhash): Split into...
(ctf_dict.ctf_symhash_func): ... this and...
(ctf_dict.ctf_symhash_objt): ... this.
(ctf_dict.ctf_stypes): New, counts static types.
(LCTF_INDEX_TO_TYPEPTR): Use it instead of LCTF_RDWR.
(LCTF_RDWR): Deleted.
(LCTF_DIRTY): Renumbered.
(LCTF_LINKING): Likewise.
(ctf_lookup_variable_here): New.
(ctf_lookup_by_sym_or_name): Likewise.
(ctf_symbol_next_static): Likewise.
(ctf_add_variable_forced): Likewise.
(ctf_add_funcobjt_sym_forced): Likewise.
(ctf_simple_open_internal): Adjust.
(ctf_bufopen_internal): Likewise.
* ctf-create.c (ctf_grow_ptrtab): Adjust a lot to start with.
(ctf_create): Migrate a bunch of initializations into bufopen.
Force recreation of name tables.  Do not forcibly override the
model, let ctf_bufopen do it.
(ctf_static_type): New.
(ctf_update): Drop LCTF_RDWR check.
(ctf_dynamic_type): Likewise.
(ctf_add_function): Likewise.
(ctf_add_type_internal): Likewise.
(ctf_rollback): Check ctf_stypes, not LCTF_RDWR.
(ctf_set_array): Likewise.
(ctf_add_struct_sized): Likewise.
(ctf_add_union_sized): Likewise.
(ctf_add_enum): Likewise.
(ctf_add_enumerator): Likewise (only on the target dict).
(ctf_add_member_offset): Likewise.
(ctf_add_generic): Drop LCTF_RDWR check.  Ban addition of types
with colliding names.
(ctf_add_forward): Note safety under the new rules.
(ctf_add_variable): Split all but the existence check into...
(ctf_add_variable_forced): ... this new function.
(ctf_add_funcobjt_sym): Likewise...
(ctf_add_funcobjt_sym_forced): ... for this new function.
* ctf-link.c (ctf_link_add_linker_symbol): Ban calling on dicts
with any stypes.
(ctf_link_add_strtab): Likewise.
(ctf_link_shuffle_syms): Likewise.
(ctf_link_intern_extern_string): Note pre-existing prohibition.
* ctf-lookup.c (ctf_lookup_by_id): Drop LCTF_RDWR check.
(ctf_lookup_variable): Split out looking in a dict but not
its parent into...
(ctf_lookup_variable_here): ... this new function.
(ctf_lookup_symbol_idx): Track whether looking up a function or
object: cache them separately.
(ctf_symbol_next): Split out looking in non-dynamic symtypetab
entries to...
(ctf_symbol_next_static): ... this new function.  Don't get confused
by the simultaneous presence of static and dynamic symtypetab entries.
(ctf_try_lookup_indexed): Don't waste time looking up symbols by
index before there can be any idea how symbols are numbered.
(ctf_lookup_by_sym_or_name): Distinguish between function and
data object lookups.  Drop LCTF_RDWR.
(ctf_lookup_by_symbol): Adjust.
(ctf_lookup_by_symbol_name): Likewise.
* ctf-open.c (init_types): Rename to...
(init_static_types): ... this.  Drop LCTF_RDWR.  Populate ctf_stypes.
(ctf_simple_open): Drop writable arg.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen): Likewise.
(ctf_bufopen_internal): Populate fields only used for writable dicts.
Drop LCTF_RDWR.
(ctf_dict_close): Cater for symhash cache split.
* ctf-serialize.c (ctf_serialize): Use ctf_stypes, not LCTF_RDWR.
* ctf-types.c (ctf_variable_next): Drop LCTF_RDWR.
* testsuite/libctf-lookup/add-to-opened*: New test.

libctf: fix name lookup in dicts containing base-type bitfields

The intent of the name lookup code was for lookups to yield non-bitfield
basic types, returning a bitfield type only if no non-bitfield type with
the given name exists. Unfortunately, the code as written only does this
if the base type has a type ID higher than all bitfield types, which is
most unlikely (the opposite is almost always the case).

Adjust it so that what ends up in the name table is the highest-width
zero-offset type with a given name, if any such exist, and failing that
the first type with that name we see, no matter its offset. (We don't
define *which* bitfield type you get, after all, so we might as well
just stuff in the first we find.)
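
The selection rule above can be sketched as follows.  This is only an
illustration under stated assumptions: struct toy_type and
toy_pick_named are hypothetical and do not reflect how init_types or
the libctf name tables are actually implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of a named basic type.  */
struct toy_type
{
  const char *name;
  unsigned long id;
  unsigned int bits;    /* Width in bits.  */
  unsigned int offset;  /* Bit offset.  */
};

/* Return the type that would be recorded in the name table for NAME:
   the widest zero-offset type with that name if any exists, otherwise
   the first type with that name, whatever its offset.  Returns NULL if
   no type has that name.  */
static const struct toy_type *
toy_pick_named (const struct toy_type *types, size_t ntypes,
                const char *name)
{
  const struct toy_type *best = NULL;
  size_t i;

  assert (types != NULL || ntypes == 0);

  for (i = 0; i < ntypes; i++)
    {
      const struct toy_type *t = &types[i];

      if (strcmp (t->name, name) != 0)
        continue;

      if (best == NULL)
        best = t;  /* First match of any kind.  */
      else if (t->offset == 0
               && (best->offset != 0 || t->bits > best->bits))
        best = t;  /* Zero offset beats nonzero; wider beats narrower.  */
    }

  return best;
}
```

Given "int" entries of 3 bits at offset 0, 5 bits at offset 4, and
32 bits at offset 0 (in that order), the 32-bit entry is chosen even
though it has the highest ID.  If only nonzero-offset entries exist for
a name, the first one seen is chosen.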

Reported by Stephen Brennan <stephen.brennan@oracle.com>.

libctf/

* ctf-open.c (init_types): Modify to allow some lookups during open;
detect bitfield name reuse and prefer less bitfieldy types.
* testsuite/libctf-writable/libctf-bitfield-name-lookup.*: New test.

libctf: remove static/dynamic name lookup distinction

libctf internally maintains a set of hash tables for type name lookups,
one for each valid C type namespace (struct, union, enum, and everything
else).

Or, rather, it maintains *two* sets of hash tables: one, a ctf_hash *,
is meant for lookups in ctf_(buf)open()ed dicts with fixed content; the
other, a ctf_dynhash *, is meant for lookups in ctf_create()d dicts.

This distinction was somewhat valuable in the far pre-binutils past when
two different hashtable implementations were used (one expanding, the
other fixed-size), but those days are long gone: the hash table
implementations are almost identical, both wrappers around the libiberty
hashtab. The ctf_dynhash has many more capabilities than the ctf_hash
(iteration, deletion, and so on) and has no downsides other than starting
at a fixed, arbitrary small size.

That limitation is easy to lift (via a new ctf_dynhash_create_sized()),
following which we can throw away nearly all the ctf_hash
implementation, and all the code to choose between readable and writable
hashtabs; the few convenience functions that are still useful (for
insertion of name -> type mappings) can also be generalized a bit so
that the extra string verification they do is potentially available to
other string lookups as well.
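
The shape of that change, reimplementing the default constructor in
terms of a sized one, can be sketched like this.  The toy_table type,
its functions, and the default size are all hypothetical; the real
ctf_dynhash_create_sized() wraps the libiberty hashtab and takes
different arguments, which are not reproduced here.

```c
#include <assert.h>
#include <stdlib.h>

/* Arbitrary small default, standing in for the fixed initial size.  */
#define TOY_DEFAULT_BUCKETS 7

/* Hypothetical minimal table: only the allocation is modelled.  */
struct toy_table
{
  size_t nbuckets;
  void **buckets;
};

/* Create a table with a caller-chosen initial size.  Returns NULL on
   allocation failure or if NBUCKETS is zero.  */
static struct toy_table *
toy_table_create_sized (size_t nbuckets)
{
  struct toy_table *t;

  if (nbuckets == 0)
    return NULL;

  if ((t = malloc (sizeof (*t))) == NULL)
    return NULL;

  if ((t->buckets = calloc (nbuckets, sizeof (*t->buckets))) == NULL)
    {
      free (t);
      return NULL;
    }

  t->nbuckets = nbuckets;
  return t;
}

/* The old entry point becomes a thin wrapper around the sized one.  */
static struct toy_table *
toy_table_create (void)
{
  return toy_table_create_sized (TOY_DEFAULT_BUCKETS);
}

/* Report the current number of buckets.  */
static size_t
toy_table_size (const struct toy_table *t)
{
  assert (t != NULL);
  return t->nbuckets;
}

static void
toy_table_destroy (struct toy_table *t)
{
  if (t == NULL)
    return;
  free (t->buckets);
  free (t);
}
```

A caller that knows roughly how many names it will insert can pass that
count to the sized constructor and avoid repeated expansion, while
existing callers of the unsized constructor are unaffected.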

(libctf still has two hashtable implementations: ctf_dynhash, above,
and ctf_dynset, which is a key-only hashtab that can avoid a great many
malloc()s, used for high-volume applications in the deduplicator.)

libctf/

* ctf-create.c (ctf_create): Eliminate ctn_writable.
(ctf_dtd_insert): Likewise.
(ctf_dtd_delete): Likewise.
(ctf_rollback): Likewise.
(ctf_name_table): Eliminate ctf_names_t.
* ctf-hash.c (ctf_dynhash_create): Comment update.
Reimplement in terms of...
(ctf_dynhash_create_sized): ... this new function.
(ctf_hash_create): Remove.
(ctf_hash_size): Remove.
(ctf_hash_define_type): Remove.
(ctf_hash_destroy): Remove.
(ctf_hash_lookup_type): Rename to...
(ctf_dynhash_lookup_type): ... this.
(ctf_hash_insert_type): Rename to...
(ctf_dynhash_insert_type): ... this, moving validation to...
* ctf-string.c (ctf_strptr_validate): ... this new function.
* ctf-impl.h (struct ctf_names): Extirpate.
(struct ctf_lookup.ctl_hash): Now a ctf_dynhash_t.
(struct ctf_dict): All ctf_names_t fields are now ctf_dynhash_t.
(ctf_name_table): Now returns a ctf_dynhash_t.
(ctf_lookup_by_rawhash): Remove.
(ctf_hash_create): Likewise.
(ctf_hash_insert_type): Likewise.
(ctf_hash_define_type): Likewise.
(ctf_hash_lookup_type): Likewise.
(ctf_hash_size): Likewise.
(ctf_hash_destroy): Likewise.
(ctf_dynhash_create_sized): New.
(ctf_dynhash_insert_type): New.
(ctf_dynhash_lookup_type): New.
(ctf_strptr_validate): New.
* ctf-lookup.c (ctf_lookup_by_name_internal): Adapt.
* ctf-open.c (init_types): Adapt.
(ctf_set_ctl_hashes): Adapt.
(ctf_dict_close): Adapt.
* ctf-serialize.c (ctf_serialize): Adapt.
* ctf-types.c (ctf_lookup_by_rawhash): Remove.