Mark Wielaard [Wed, 21 Aug 2024 13:52:20 +0000 (15:52 +0200)]
tests: When BUILD_STATIC always link against libeu
libeu is a static library with internal helper functions normally
included in all shared libraries. But when linking static (with
--enable-gcov) we need to explicitly link it into the test binaries.
* tests/Makefile.am (libelf): Add $(libeu) when BUILD_STATIC.
Mark Wielaard [Wed, 21 Aug 2024 13:32:59 +0000 (15:32 +0200)]
debuginfod: Make sure crypto and jsonc are also included in static link
When doing a --enable-gcov build we link all binaries static.
libdebuginfod.so now depends on crypto and jsonc. So also add those
when linking against libdebuginfod.a
debuginfod/Makefile.am (libdebuginfod): Add $(crypto_LIBS)
$(jsonc_LIBS) when BUILD_STATIC.
Alfred Wingate [Wed, 14 Aug 2024 16:14:38 +0000 (12:14 -0400)]
Avoid overriding libcxx system header
Replace -I with -iquote to avoid overriding stack system header from libcxx-18
with the previously built stack binary. Override DEFAULT_INCLUDES because m4
adds -I. by default.
Andreas Schwab [Wed, 31 Jul 2024 13:03:35 +0000 (15:03 +0200)]
backends/riscv: Remove unused relocations
None of these relocations were ever part of any object file. The
GNU_VTINHERIT and GNU_VTENTRY relocations were part of the obsolete
--gc-sections support which was never implemented for RISC-V. The other
relocations are only used internally by libbfd during the relaxation pass
and eliminated before writing the object file.
Since the schema change adding _r_seekable was done in a backward
compatible way, seekable archives that were previously scanned will not
be in _r_seekable. Whenever an archive is going to be extracted to
satisfy a request, check if it is seekable. If so, populate _r_seekable
while extracting it so that future requests use the optimized path.
The next time that BUILDIDS is bumped, all archives will be checked at
scan time. At that point, checking again will be unnecessary and this
commit (including the test case modification) can be reverted.
Whenever a new archive is scanned, check if it is seekable with a little
liblzma magic, and populate _r_seekable if so. With this, newly scanned
seekable archives will use the optimized extraction path added in the
previous commit. Also add a test case using some artificial packages.
debuginfod: optimize extraction from seekable xz archives
The kernel debuginfo packages on Fedora, Debian, and Ubuntu, and many of
their downstreams, are all compressed with xz in multi-threaded mode,
which allows random access. We can use this to bypass the full archive
extraction and dramatically speed up kernel debuginfo requests (from ~50
seconds in the worst case to < 0.25 seconds).
This works because multi-threaded xz compression splits up the stream
into many independently compressed blocks. The stream ends with an
index of blocks. So, to seek to an offset, we find the block containing
that offset in the index and then decompress and throw away data until
we reach the offset within the block. We can then decompress the
desired amount of data, possibly from subsequent blocks. There's no
high-level API in liblzma to do this, but we can do it by stitching
together a few low-level APIs.
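The seek step described above can be illustrated without liblzma. This is a minimal sketch, assuming a hypothetical in-memory index of blocks sorted by uncompressed offset. The struct and function names are illustrative only and are not debuginfod's or liblzma's.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical index entry: one independently compressed block.  */
struct block_entry
{
  uint64_t uncompressed_offset; /* Where the block starts in the stream.  */
  uint64_t uncompressed_size;   /* How many bytes it decompresses to.  */
};

/* Binary-search BLOCKS (sorted and contiguous) for the block containing
   OFFSET.  On success return its index and store in *SKIP how many
   decompressed bytes must be thrown away before OFFSET is reached.
   Return -1 if OFFSET lies past the end of the stream.  */
long
find_block (const struct block_entry *blocks, size_t nblocks,
            uint64_t offset, uint64_t *skip)
{
  size_t lo = 0, hi = nblocks;
  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      const struct block_entry *b = &blocks[mid];
      if (offset < b->uncompressed_offset)
        hi = mid;
      else if (offset - b->uncompressed_offset >= b->uncompressed_size)
        lo = mid + 1;
      else
        {
          *skip = offset - b->uncompressed_offset;
          return (long) mid;
        }
    }
  return -1;
}
```

A caller would decompress the returned block, discard the first *SKIP bytes, and keep decompressing into following blocks until the requested length is satisfied.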
We need to pass down the file ids then look up the size, uncompressed
offset, and mtime in the _r_seekable table. Note that this table is not
yet populated, so this commit has no functional change on its own.
debuginfod: add new table and views for seekable archives
In order to extract a file from a seekable archive, we need to know
where in the uncompressed archive the file data starts and its size.
Additionally, in order to populate the response headers, we need the
file modification time (since we won't be able to get it from the
archive metadata). Add a new table, _r_seekable, keyed on the archive
file id and entry file id and containing the size, offset, and mtime.
It also contains the compression type just in case new seekable formats
are supported in the future.
In order to search this table when we get a request, we need the file
ids available. Add the ids to the _query_d and _query_e views, and
rename them to _query_d2 and _query_e2.
This schema change is backward compatible and doesn't require
reindexing. _query_d2 and _query_e2 can be renamed back the next time
BUILDIDS needs to be bumped.
Before this change, the database for a single kernel debuginfo RPM
(kernel-debuginfo-6.9.6-200.fc40.x86_64.rpm) was about 15MB. This
change increases that by about 70kB, only a 0.5% increase.
debuginfod: factor out common code for responding from an archive
handle_buildid_r_match has two very similar branches where it optionally
extracts a section and then creates a microhttpd response. In
preparation for adding a third one, factor it out into a function.
Since commit acd9525e93d7 ("PR31265 - rework debuginfod archive-extract
fdcache"), the fdcache limit is only applied when a new file is interned
and it has been at least 10 seconds since the limit was last applied.
This means that the fdcache can go over the limit temporarily.
run-debuginfod-fd-prefetch-caches.sh happens to avoid tripping over this
because of lucky sizes of the files used in the test. However, adding
new files for an upcoming test exposed this failure.
dwarf_extract_source_paths explicitly skips source files that equal
"<built-in>", but dwarf_filesrc may return a path like "dir/<built-in>".
Check for and skip that case, too.
In particular, the test debuginfod RPMs have paths like this. However,
the test cases didn't catch this because they have a bug, too: they
follow symlinks, which results in double-counting every file. Fix that,
too.
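The extra check can be sketched as a suffix test. This is an illustrative helper with a hypothetical name, not the code in dwarf_extract_source_paths.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if PATH is "<built-in>" or ends in "/<built-in>".  */
bool
is_builtin_path (const char *path)
{
  static const char builtin[] = "<built-in>";
  size_t blen = sizeof builtin - 1;
  size_t plen = strlen (path);
  if (plen < blen)
    return false;
  const char *tail = path + (plen - blen);
  if (strcmp (tail, builtin) != 0)
    return false;
  /* Either the whole path matched, or the match is a full last component.  */
  return tail == path || tail[-1] == '/';
}
```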
backends: allocate enough space for null terminator
`gcc-15` added a new warning in https://gcc.gnu.org/PR115185:
i386_regs.c:88:11: error: initializer-string for array of 'char' is too long [-Werror=unterminated-string-initialization]
88 | "ax", "cx", "dx", "bx", "sp", "bp", "si", "di", "ip"
| ^~~~
`elfutils` does not need to store '\0'. We could either initialize the
arrays with individual bytes or allocate an extra byte for the null terminator.
This change initializes the array bytewise.
* backends/i386_regs.c (i386_register_info): Initialize the
array bytewise to fix gcc-15 warning.
* backends/x86_64_regs.c (x86_64_register_info): Ditto.
Signed-off-by: Sergei Trofimovich <slyich@gmail.com>
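The bytewise approach can be sketched as follows. The table layout and the helper name are assumptions made for illustration; the real arrays in backends/i386_regs.c are organized differently.

```c
#include <stddef.h>
#include <string.h>

/* Two-byte register names stored back to back with no NUL terminators.
   Spelling out the bytes avoids the initializer form that gcc-15 flags,
   e.g. a char[2] initialized from the string literal "ax".  */
static const char reg_names[][2] = {
  { 'a', 'x' }, { 'c', 'x' }, { 'd', 'x' }, { 'b', 'x' },
  { 's', 'p' }, { 'b', 'p' }, { 's', 'i' }, { 'd', 'i' }, { 'i', 'p' },
};

/* Copy the name of register REGNO into BUF as a NUL-terminated string.
   Return the name length, or -1 if REGNO is out of range or BUF is too
   small to hold the name plus the terminator.  */
int
reg_name (size_t regno, char *buf, size_t buflen)
{
  size_t nregs = sizeof reg_names / sizeof reg_names[0];
  if (regno >= nregs || buflen < sizeof reg_names[0] + 1)
    return -1;
  memcpy (buf, reg_names[regno], sizeof reg_names[0]);
  buf[sizeof reg_names[0]] = '\0';
  return (int) sizeof reg_names[0];
}
```

The terminator is added only when a name is copied out, so the table itself never needs to store it.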
Aleksei Vetrov [Thu, 11 Jul 2024 20:35:21 +0000 (20:35 +0000)]
libdwfl: Make dwfl_report_offline_memory work with ELF_C_READ_MMAP
elf_memory open mode recently changed from ELF_C_READ to
ELF_C_READ_MMAP. This broke dwfl_report_offline_memory, which changes
mode to ELF_C_READ_MMAP_PRIVATE to be compatible with subsequent
elf_begin on embedded ELF files.
The proper implementation of dwfl_report_offline_memory doesn't change
open mode and subsequent elf_begin invocations simply use cmd from the
reference Elf*.
Add tests to exercise Elf* to trigger the bug caused by incorrect cmd
set to Elf*.
* libdwfl/offline.c (process_archive): Use archive->cmd
instead of hardcoded ELF_C_READ_MMAP_PRIVATE.
* libdwfl/open.c (libdw_open_elf): Use elf->cmd instead of
hardcoded ELF_C_READ_MMAP_PRIVATE.
(__libdw_open_elf_memory): Don't override (*elfp)->cmd.
* tests/Makefile.am (dwfl_report_offline_memory): Add libelf
as dependency.
* tests/dwfl-report-offline-memory.c: Add count_sections to
exercise Elf* from dwfl_report_offline_memory.
* tests/run-dwfl-report-offline-memory.sh: Add expected number
of sections to test invocations.
tests/run-sysroot.sh: Avoid testing output that depends on LZMA support
run-sysroot.sh checks whether a backtrace generated by eu-stack contains
symbol names found in binaries under a test sysroot. Two frames in
the backtrace contain symbol names that must be read from .gnu_debugdata.
However this section can only be read if elfutils was built with LZMA
support. If not, then the symbol names will be absent from the
backtrace.
Test the eu-stack output with these 2 frames removed in order to prevent
a test failure when LZMA support is missing.
Luke Diamand [Tue, 2 Jul 2024 17:30:58 +0000 (19:30 +0200)]
libdwfl: specify optional sysroot to search for shared libraries and binaries
When searching the list of modules in a core file, if the core was
generated on a different system to the current one, we need to look
in a sysroot for the various shared objects.
For example, we might be looking at a core file from an ARM system
using elfutils running on an x86 host.
This change adds a new function, dwfl_set_sysroot(), which then
gets used when searching for libraries and binaries.
Signed-off-by: Luke Diamand <ldiamand@roku.com>
Signed-off-by: Michal Sekletar <msekleta@redhat.com>
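The effect of a sysroot on a lookup can be sketched as plain path prefixing. This is a hypothetical helper for illustration only. It does not call dwfl_set_sysroot and makes no claim about how libdwfl joins the paths internally.

```c
#include <stdio.h>

/* Build SYSROOT + PATH in BUF.  PATH is expected to be absolute, as it
   would be recorded in a core file.  A NULL SYSROOT leaves the path
   unchanged.  Return 0 on success, or -1 if PATH is not absolute or the
   result does not fit in BUF.  */
int
sysroot_path (const char *sysroot, const char *path, char *buf, size_t buflen)
{
  if (path == NULL || path[0] != '/')
    return -1;
  if (sysroot == NULL)
    sysroot = "";
  int n = snprintf (buf, buflen, "%s%s", sysroot, path);
  return (n < 0 || (size_t) n >= buflen) ? -1 : 0;
}
```

With a sysroot of /opt/arm-root (an example directory), a module recorded as /lib/libc.so.6 would be looked up at /opt/arm-root/lib/libc.so.6.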
Mark Wielaard [Sat, 22 Jun 2024 23:29:42 +0000 (01:29 +0200)]
ar, ranlib: Don't double close file descriptors
Found by GCC14 -Wanalyzer-fd-double-close.
close always closes the given file descriptor even on error. So don't
try to close a file descriptor again on error (even on EINTR). This
could be bad in a multi-threaded environment.
* src/ar.c (do_oper_extract): Call close and set newfd to -1.
(do_oper_delete): Likewise.
(do_oper_insert): Likewise.
* src/ranlib.c (handle_file): Likewise.
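The pattern named in the ChangeLog (call close, then set the descriptor to -1) can be sketched as a small helper. The helper name is hypothetical; ar.c and ranlib.c apply the pattern inline.

```c
#include <unistd.h>

/* Close *FD at most once.  The descriptor is marked invalid whether or
   not close reports an error, so no later cleanup path retries it.  */
int
close_once (int *fd)
{
  if (*fd < 0)
    return 0; /* Already closed, or never opened.  */
  int result = close (*fd);
  *fd = -1;
  return result;
}
```

Because the variable is reset unconditionally, an error path that runs later sees -1 and cannot close a descriptor number that another thread may have reused in the meantime.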
Mark Wielaard [Sat, 22 Jun 2024 23:22:54 +0000 (01:22 +0200)]
debuginfod-client: Don't leak id/version with duplicate os-release entries
Found by GCC14 -Wanalyzer-double-free.
If the os-release file would contain multiple ID or VERSION_ID entries
we would leak the originally parsed one. Fix by seeing whether id or
version is already set and ignore any future entries.
* debuginfod/debuginfod-client.c (add_default_headers): Check
whether id or version is already set before resetting them.
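The first-entry-wins rule can be sketched as below. This is a simplified parser under stated assumptions: it reads from a string rather than a file, it does not strip quotes, and the struct and function names are hypothetical rather than taken from add_default_headers.

```c
#include <stdlib.h>
#include <string.h>

struct os_info
{
  char *id;      /* First ID= value seen, or NULL.  */
  char *version; /* First VERSION_ID= value seen, or NULL.  */
};

/* Duplicate the LEN bytes at S into a fresh NUL-terminated string.  */
static char *
dup_range (const char *s, size_t len)
{
  char *copy = malloc (len + 1);
  if (copy != NULL)
    {
      memcpy (copy, s, len);
      copy[len] = '\0';
    }
  return copy;
}

/* Scan os-release style TEXT ("KEY=value" lines).  Only the first ID
   and VERSION_ID entries are kept.  Later duplicates are ignored, so an
   earlier allocation is never overwritten and leaked.  */
void
parse_os_release (const char *text, struct os_info *info)
{
  info->id = info->version = NULL;
  while (*text != '\0')
    {
      size_t len = strcspn (text, "\n");
      if (len > 3 && strncmp (text, "ID=", 3) == 0 && info->id == NULL)
        info->id = dup_range (text + 3, len - 3);
      else if (len > 11 && strncmp (text, "VERSION_ID=", 11) == 0
               && info->version == NULL)
        info->version = dup_range (text + 11, len - 11);
      text += len;
      if (*text == '\n')
        text++;
    }
}
```

The caller owns both strings and frees them once.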
Mark Wielaard [Sat, 22 Jun 2024 22:52:06 +0000 (00:52 +0200)]
libdwfl: Make sure mapped is always set in unzip
Found by GCC14 -Wanalyzer-null-argument.
When unzip is called with mapped NULL, but *_whole not NULL, *_whole
contains the first part of the input. But we check against mapped to
make sure the MAGIC bytes are there.
This only worked because this code path was never taken, unzip is
currently always called with *_whole being NULL.
* libdwfl/gzip.c (unzip): Set mapped = state.input_buffer
when *whole is not NULL.
Tweak configure.ac to support $subject again. Also tweak
debuginfod-find.c to not unconditionally include json-c.h, since it
may just be compiled in "dummy" mode, sans such prerequisites. It
turns out debuginfod-find fails at run time very early in such a
configuration, long before it gets to jsonic activities.
Rework the top level configure.ac to systematize the
debuginfod-related checks, inferences, rejections,
and configuration outputs.
Tested by hand on a F39 machine, installing/uninstalling
the various dependencies one at a time, and rerunning
the configury with / without --enable-*debuginfod* flags.
Frank Ch. Eigler [Mon, 31 Oct 2022 21:40:01 +0000 (17:40 -0400)]
PR29472: debuginfod: add metadata query webapi, C api, client
This patch extends the debuginfod API with a "metadata query"
operation. It allows clients to request an enumeration of file names
known to debuginfod servers, returning a JSON response including the
matching buildids. This lets clients later download debuginfo for a
range of versions of the same named binaries, in case they need to do
prospective work (like systemtap-based live-patching). It also lets
server operators implement prefetch triggering operations for popular
but slow debuginfo slivers like kernel vdso.debug files on fedora.
Implementation requires a modern enough json-c library, namely 0.11,
which dates from 2014. Without that, debuginfod client/server bits
will refuse to build.
Refactored several functions in debuginfod-client.c, because the
metadata search logic is different for multiple servers (merge all
responses instead of first responder wins).
Documentation and testing are included.
Signed-off-by: Ryan Goldberg <rgoldber@redhat.com>
Signed-off-by: Frank Ch. Eigler <fche@redhat.com>
Luca Boccassi [Fri, 10 May 2024 21:58:02 +0000 (22:58 +0100)]
readelf: add pretty printing for FDO Dlopen Metadata note
The note ID and the string format are now fixed. Even if the content
of the string changes, it will still be a string.
* libebl/eblobjnote.c (ebl_object_note): Handle both type
being NT_FDO_PACKAGING_METADATA or NT_FDO_DLOPEN_METADATA when
name is "FDO".
* libebl/eblobjnotetypename.c (ebl_object_note_type_name): Handle
"FDO" name and type NT_FDO_DLOPEN_METADATA.
Mark Wielaard [Sat, 4 May 2024 21:34:24 +0000 (23:34 +0200)]
readelf: Fix printing of DW_FORM_strx and DW_MACRO parsing
print_form_data didn't take the offset_len (4 or 8 bytes) into account
causing the wrong entry to be read from .debug_str_offsets.
print_debug_macro_section did sanity checking before calling
print_form_data, which does sanity checking itself. The sanity check
for DW_FORM_strx was wrong in print_debug_macro_section (but correct
in print_form_data).
Add a new testfile for run-readelf-macro.sh, this one compiled with
clang -gdwarf-5 -fdebug-macro.
* src/readelf.c (print_form_data): Multiply by offset_len for
strx_val.
(print_debug_macro_section): Remove sanity checks before calling
print_form_data.
* tests/testfileclangmacro.bz2: New testfile.
* tests/Makefile.am (EXTRA_DIST): Add testfileclangmacro.bz2.
* tests/run-readelf-macro.sh: Add testfileclangmacro output.
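The offset_len fix amounts to scaling the index before reading. A minimal sketch follows, assuming a table pointer that already points past any section header and entries in host byte order. The function name is hypothetical and this is not the readelf code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fetch entry INDEX from a .debug_str_offsets-like table.  TABLE points
   at the first entry, SIZE is the table size in bytes, and OFFSET_LEN is
   4 (32-bit DWARF) or 8 (64-bit DWARF).  */
bool
read_str_offset (const unsigned char *table, size_t size,
                 unsigned int offset_len, uint64_t index, uint64_t *result)
{
  if (offset_len != 4 && offset_len != 8)
    return false;
  /* Bounds check by entry count, which also avoids multiply overflow.  */
  if (index >= size / offset_len)
    return false;
  /* The entry lives at byte INDEX * OFFSET_LEN, not at byte INDEX.  */
  const unsigned char *entry = table + index * offset_len;
  if (offset_len == 4)
    {
      uint32_t v;
      memcpy (&v, entry, sizeof v);
      *result = v;
    }
  else
    memcpy (result, entry, sizeof *result);
  return true;
}
```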
Mark Wielaard [Tue, 30 Apr 2024 14:39:17 +0000 (16:39 +0200)]
ar: Replace one alloca use by xmalloc
This alloca use is inside a lexical block and is used to replace one
element of argv. Use a function local variable, xmalloc and free to
make the memory usage pattern clearer.
* src/ar.c (main): Move newp char pointer declaration up.
Use xmalloc to allocate space. free at end of main.
Aaron Merey [Fri, 10 May 2024 21:46:24 +0000 (17:46 -0400)]
config/Makefile.am: Modify profile.fish in all-local
Fish shell scripts do not support bracketed variables.
config/Makefile.am removes brackets from a variable in
config/profile.fish in order to prevent an error when running the
script.
Currently the brackets are removed during make install. This causes
testsuite failures if make check is run before make install.
Fix this by removing the brackets in all-local instead of
install-data-local.
Ryan Goldberg [Mon, 14 Aug 2023 17:51:00 +0000 (13:51 -0400)]
debuginfod: PR28204 - RPM IMA per-file signature verification
Recent versions of Fedora/RHEL include per-file cryptographic
signatures in RPMs, not just an overall RPM signature. This work
extends debuginfod client & server to extract, transfer, and verify
those signatures. These allow clients to assure users that the
downloaded files have not been corrupted since their original
packaging. Downloads that fail the test are rejected.
Clients may select a desired level of enforcement for sets of URLs in
the DEBUGINFOD_URLS by inserting special markers ahead of them:
ima:ignore pay no attention to absence or presence of signatures
ima:enforcing require every file to be correctly signed
The default is ima:ignore mode. In ima:enforcing mode, section
queries are forced to be entire-file downloads, as it is not
possible to crypto-verify just sections.
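How a marker applies to the URLs after it can be sketched as a single pass over the list. This sketch assumes that entries are separated by spaces and that a marker stays in effect until the next marker. The function names and the example URLs in the note below are hypothetical, and this is not the client's parser.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if the token at P of length LEN equals the string S.  */
static bool
token_is (const char *p, size_t len, const char *s)
{
  return strlen (s) == len && strncmp (p, s, len) == 0;
}

/* Scan a space-separated URLS list.  An "ima:..." marker sets the policy
   for the URLs that follow it.  Return true if TARGET appears in the
   list under ima:enforcing, false if it is under ima:ignore or absent.  */
bool
url_is_enforcing (const char *urls, const char *target)
{
  bool enforcing = false; /* The default is ima:ignore.  */
  const char *p = urls + strspn (urls, " ");
  while (*p != '\0')
    {
      size_t len = strcspn (p, " ");
      if (token_is (p, len, "ima:enforcing"))
        enforcing = true;
      else if (token_is (p, len, "ima:ignore"))
        enforcing = false;
      else if (token_is (p, len, target))
        return enforcing;
      p += len;
      p += strspn (p, " ");
    }
  return false;
}
```

For a list such as "https://a.example/ ima:enforcing https://b.example/", only the second URL would require valid signatures under these assumptions.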
IMA signatures are verified against a set of signing certificates.
These are normally published by distributions. The environment
variable $DEBUGINFOD_IMA_CERT_PATH contains a colon-separated path for
finding DER or PEM formatted certificates / public keys. These
certificates are assumed trusted. The profile.d scripts transcribe
/etc/debuginfod/*.certdir files into that variable.
* config/Makefile.am: Install defaults into /etc files.
* config/profile.{csh,sh}.in: Process defaults into env variables.
* config/elfutils.spec.in: Add more buildrequires.
* debuginfod/debuginfod.cxx (handle_buildid_r_match): Added extraction of the
per-file IMA signature for the queried file and store in http header.
(find_globbed_koji_filepath): New function.
(parse_opt): New flag --koji-sigcache.
* debuginfod/debuginfod-client.c (debuginfod_query_server): Added policy for
validating IMA signatures
(debuginfod_validate_imasig): New function, with friends.
* debuginfod/debuginfod.h.in: Added DEBUGINFOD_IMA_CERT_PATH_ENV_VAR.
* debuginfod/Makefile.am: Add linker flags for rpm and crypto.
config/profile.fish.in: Prevent bracketed variables and unmatched wildcard errors
Fish does not support bracketed variables in scripts. Remove brackets
from the variable ${prefix} in profile.fish before installation to
prevent this error.
Fish also raises an error for unmatched wildcards, except for special
cases like the set command. Use a wildcard to match .urls files using
the set command instead of cat to prevent an unmatched wildcard error
when no .urls files are found.
Aaron Merey [Mon, 25 Mar 2024 19:57:25 +0000 (15:57 -0400)]
libdw: dwarf_getsrcfiles should not imply dwarf_getsrclines
dwarf_getsrcfiles causes line data to be read in addition to file data.
This is wasteful for programs which only need file or directory names.
Debuginfod server is one such example.
Fix this by moving the srcfile reading in read_srclines into a separate
function read_srcfiles. This change improves debuginfod server's max
resident set size by up to 75% during rpm indexing.
* libdw/dwarf_getsrcfiles.c (dwarf_getsrcfiles): Replace
dwarf_getsrclines and __libdw_getsrclines with
__libdw_getsrcfiles.
* libdw/dwarf_getsrclines.c (read_line_header): New function.
(read_srcfiles): New function.
(read_srclines): Move srcfile reading into read_srcfiles.
Add parameter to use cached srcfiles if available.
Also merge srcfiles with any files from DW_LNE_define_file.
(__libdw_getsrclines): Changed to call get_lines_or_files.
(__libdw_getsrcfiles): New function. Calls get_lines_or_files.
(get_lines_or_files): New function based on the old
__libdw_getsrclines. Call read_srcfiles if linesp is NULL,
otherwise call read_srclines. Pass previously read srcfiles
to read_srclines if available.
* libdw/dwarf_macro_getsrcfiles.c (dwarf_macro_getsrcfiles):
Replace __libdw_getsrclines with __libdw_getsrcfiles.
* libdw/libdwP.h (__libdw_getsrcfiles): New declaration.
* tests/.gitignore: Add new test binary.
* tests/get-files.c: Verify that dwarf_getsrcfiles does
not cause srclines to be read.
* tests/get-lines.c: Verify that srclines can be read
after srcfiles have been read.
* tests/Makefile.am: Add new testfiles.
* tests/get-files-define-file.c: Print file names before
and after reading DW_LNE_define_file.
* tests/run-get-files.sh: Add get-files-define-file test.
* tests/testfile-define-file.bz2: New testfile. Copy of
testfile36.debug but with a line program consisting of two
DW_LNE_define_file opcodes.
This implements initial support for the Hexagon architecture. The
Hexagon ABI spec can be seen at
https://lists.llvm.org/pipermail/llvm-dev/attachments/20190916/21516a52/attachment-0001.pdf
might fail when there isn't an *.urls file, because the first command in
the pipe fails (the 2>/dev/null is there to hide that failure).
This can be fixed by adding || : at the end.
This works because : always succeeds and produces no output which is
what the script expects when the command would fail.
Also add a new testcase that runs profile.sh with both set -e
and set -o pipefail.
* config/profile.sh.in: Add || : at end of pipe.
* tests/run-debuginfod-client-profile.sh: New test.
* tests/Makefile.am (TESTS): Add run-debuginfod-client-profile.sh.
(EXTRA_DIST): Likewise.
Add support for setting $DEBUGINFOD_URLS automatically in the fish shell
similar to the profile scripts for POSIX and csh shells.
Makefile is set to install this into fish’s $XDG_DATA_DIRS vendor
directory instead of under /etc:
https://fishshell.com/docs/current/language.html#configuration-files
* config/profile.fish.in: Set $DEBUGINFOD_URLS in fish shells.
* configure.ac, config/Makefile.am: Include profile.fish in
install and uninstall targets.
Signed-off-by: Frederik “Freso” S. Olesen <freso.dk@gmail.com>
Mark Wielaard [Tue, 19 Mar 2024 22:43:10 +0000 (22:43 +0000)]
riscv: Partial implementation of flatten_aggregate
dwfl_module_return_value_location would fail on riscv for functions
which return a (small) struct. This patch implements the simplest
cases of flatten_aggregate in backends/riscv_retval.c. It just handles
structs containing one or two members of the same base type which fit
completely or in pieces in one or two general or floating point
registers.
It also adds a specific test case run-funcretval-struct.sh containing
small structs of ints, longs, floats and doubles. All these testcases
now work for riscv. There is already a slightly more extensive
testcase for this in tests/run-funcretval.sh but that only has a
testcase for aarch64.
* backends/riscv_retval.c (flatten_aggregate_arg): Implement
for the simple cases where we have a struct with one or two
members of the same base type.
(pass_by_flattened_arg): Likewise. Call either
pass_in_gpr_lp64 or pass_in_fpr_lp64d.
(riscv_return_value_location_lp64ifd): Call
flatten_aggregate_arg including size.
* tests/Makefile.am (TESTS): Add run-funcretval-struct.sh
and run-funcretval-struct-native.sh.
(check_PROGRAMS): Add funcretval_test_struct.
(funcretval_test_struct_SOURCES): New.
(EXTRA_DIST): Add run-funcretval-struct.sh,
funcretval_test_struct_riscv.bz2 and
run-funcretval-struct-native.sh.
* tests/funcretval_test_struct_riscv.bz2: New test binary.
* tests/run-funcretval-struct-native.sh: New test.
* tests/run-funcretval-struct.sh: Likewise.
Add malloc_trim() for releasing memory which is allocated for
temporary purposes, e.g. answering queries, adding data to the
database during scans. This patch just adds one call after the groom
cycle, but others could be added around webapi query handling or
scanning ops too.
Khem Raj [Sat, 9 Mar 2024 23:54:35 +0000 (15:54 -0800)]
debuginfod: Remove unused variable
Recent commit acd9525e9 removed all references to max_fds, so remove
the field too. This also fixes a clang 18 build error:
| ../../elfutils-0.191/debuginfod/debuginfod.cxx:1448:8: error: private field 'max_fds' is not used [-Werror,-Wunused-private-field]
| 1448 | long max_fds;
| | ^
| 1 error generated.
Mark Wielaard [Sat, 2 Mar 2024 23:45:34 +0000 (00:45 +0100)]
libdw: Don't use INTUSE in libdwP.h str_offsets_base_off
readelf.c cheats and includes libdwP.h, which is an internal only
header of libdw. It really shouldn't do that, but there are some
internals that readelf currently needs. The str_offsets_base_off
function used by readelf uses INTUSE when calling dwarf_get_units.
This is a micro optimization useful inside libdw so a public
function can be called directly, skipping a PLT call. This can
cause issues linking readelf since it might not be able to call
the internal function, since readelf.c isn't part of libdw itself.
Just drop the INTUSE.
* libdw/libdwP.h (str_offsets_base_off): Don't use INTUSE
when calling dwarf_get_units.
Mark Wielaard [Fri, 1 Mar 2024 16:05:16 +0000 (17:05 +0100)]
libdw: Initialize tu_offset in __libdw_package_index
dwarf_cu_dwp_section_info.c: In function ‘__libdw_package_index’:
dwarf_cu_dwp_section_info.c:306:25: error: ‘tu_offset’ may be used uninitialized [-Werror=maybe-uninitialized]
306 | tu_offset += tu_index->section_count * 4;
| ~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
dwarf_cu_dwp_section_info.c:268:28: note: ‘tu_offset’ was declared here
268 | const unsigned char *tu_offset;
| ^~~~~~~~~
Which is the same issue we thought to have fixed by checking for
tu_index != NULL but not all gcc versions seem able to see that.
So just explicitly initialize tu_offset to NULL. We keep the older
check, so the NULL pointer should never be used.
Omar Sandoval [Mon, 26 Feb 2024 19:32:51 +0000 (11:32 -0800)]
libdw: Handle overflowed DW_SECT_INFO offsets in DWARF package file indexes
Meta uses DWARF package files for our large, statically-linked C++
applications. Some of our largest applications have more than 4GB in
.debug_info.dwo, but the section offsets in .debug_cu_index and
.debug_tu_index are 32 bits; see the discussion here [1]. We
implemented a workaround/extension for this in LLVM. Implement the
equivalent in libdw.
To test this, we need files with more than 4GB in .debug_info.dwo. I
created these artificially by editing GCC's assembly output. They
compress down to 6KB. I test them from run-large-elf-file.sh to take
advantage of the existing checks for large file support.
* libdw/dwarf_end.c (dwarf_package_index_free): New function.
* tests/testfile-dwp-4-cu-index-overflow.bz2: New test file.
* tests/testfile-dwp-4-cu-index-overflow.dwp.bz2: New test file.
* tests/testfile-dwp-5-cu-index-overflow.bz2: New test file.
* tests/testfile-dwp-5-cu-index-overflow.dwp.bz2: New test file.
* tests/testfile-dwp-cu-index-overflow.source: New file.
* tests/run-large-elf-file.sh: Check
testfile-dwp-5-cu-index-overflow and
testfile-dwp-4-cu-index-overflow.
Aaron Merey [Fri, 1 Mar 2024 00:46:09 +0000 (19:46 -0500)]
tests/run-getsrc-die.sh: Skip tests if objcopy fails
run-getsrc-die.sh uses objcopy to remove .debug_aranges from testfiles.
Depending on how objcopy is built, it may fail to recognize the format of
the testfiles. Skip the remaining tests if objcopy fails.
Omar Sandoval [Mon, 26 Feb 2024 19:32:50 +0000 (11:32 -0800)]
libdw: Apply DWARF package file section offsets where appropriate
The final piece of DWARF package file support is that offsets have to be
interpreted relative to the section offset from the package index.
.debug_abbrev.dwo is already covered, so sprinkle around calls to
dwarf_cu_dwp_section_info for the remaining sections: .debug_line.dwo,
.debug_loclists.dwo/.debug_loc.dwo, .debug_str_offsets.dwo,
.debug_macro.dwo/.debug_macinfo.dwo, and .debug_rnglists.dwo. With all
of that in place, we can finally test various libdw functions on dwp
files.
* libdw/dwarf_getlocation.c (initial_offset): Call
dwarf_cu_dwp_section_info and add offset to start_offset.
* libdw/dwarf_getmacros.c (get_macinfo_table): Call
dwarf_cu_dwp_section_info and add offset to line_offset.
(get_offset_from): Call dwarf_cu_dwp_section_info and add offset
to *retp.
* libdw/dwarf_getsrcfiles.c (dwarf_getsrcfiles): Call
dwarf_cu_dwp_section_info and pass offset to
__libdw_getsrclines.
* libdw/dwarf_next_lines.c (dwarf_next_lines): Call
dwarf_cu_dwp_section_info and add offset to stmt_off.
* libdw/libdwP.h (str_offsets_base_off): Call
dwarf_cu_dwp_section_info and add offset.
(__libdw_cu_ranges_base): Ditto.
(__libdw_cu_locs_base): Ditto.
* tests/run-all-dwarf-ranges.sh: Check testfile-dwp-5 and
testfile-dwp-4.
* tests/run-declfiles.sh: Ditto.
* tests/run-get-lines.sh: Ditto.
* tests/run-next-lines.sh: Ditto.
* tests/run-varlocs.sh: Ditto.
* tests/run-get-files.sh: Check testfile-dwp-5,
testfile-dwp-5.dwp, testfile-dwp-4, and testfile-dwp-4.dwp
* tests/run-next-files.sh: Ditto.
* tests/run-dwarf-getmacros.sh: Check testfile-dwp-5 and
testfile-dwp-4-strict.
* tests/run-get-units-split.sh: Ditto.
Omar Sandoval [Mon, 26 Feb 2024 19:32:49 +0000 (11:32 -0800)]
libdw: Refactor dwarf_next_lines and fix skipped CU
dwarf_next_lines has two loops over CUs: one from the CU after the given
CU to the end, and one from the first CU up to _but not including_ the
given CU. This means that the given CU is never checked.
This is unlikely to matter in practice since CUs usually correspond 1:1
with line number tables in the same order, but let's fix it anyways.
Refactoring it to one loop fixes the problem and simplifies the next
change to support DWARF package files.
* libdw/dwarf_next_lines.c (dwarf_next_lines): Refactor loops
over CUs into one loop.
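The single-loop shape can be sketched with plain indices standing in for CUs. libdw walks units through its own iteration functions, so this only illustrates the visiting order, and the function name is hypothetical.

```c
#include <stddef.h>

/* Fill ORDER with the visiting order for N items when the search starts
   just after item START (which must be less than N) and wraps around.
   One loop with modular indexing visits every item exactly once, with
   START itself visited last instead of being skipped.  */
void
wraparound_order (size_t n, size_t start, size_t *order)
{
  for (size_t step = 1; step <= n; step++)
    order[step - 1] = (start + step) % n;
}
```

With two separate loops, the second one stops just before START, which is how the given CU came to be skipped.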
Aaron Merey [Mon, 26 Feb 2024 14:58:39 +0000 (09:58 -0500)]
Add __libdw_getdieranges
__libdw_getdieranges builds an aranges list by iterating over each
CU and recording each address range.
This function is an alternative to dwarf_getaranges. dwarf_getaranges
attempts to read address ranges from .debug_aranges, which might be
absent or incomplete.
This patch replaces dwarf_getaranges with __libdw_getdieranges in
dwarf_addrdie and dwfl_module_addrdie. The existing tests in
run-getsrc-die.sh are also rerun with .debug_aranges removed from
the testfiles.
This is because dwarf_decl_file calls dwarf_getsrclines to populate
cu->files. For normal units, cu->files is cached by dwarf_getsrclines
when it parses the line number information. However, for split units,
the line number information is parsed for the skeleton unit, then copied
to the split unit's cu->lines. Split units have their own file name
table, so cu->files is not copied.
The obvious solution is to use dwarf_getsrcfiles instead of relying on
implicit caching.
Also add a test case for dwarf_decl_file.
* libdw/dwarf_decl_file.c (dwarf_decl_file): Use
dwarf_getsrcfiles instead of dwarf_getsrclines.
* tests/Makefile.am (check_PROGRAMS): Add declfiles.
(TESTS): Add run-declfiles.sh.
(EXTRA_DIST): Add run-declfiles.sh.
(declfiles_LDADD): New variable.
* tests/declfiles.c: New test.
* tests/run-declfiles.sh: New test.
Mark Wielaard [Wed, 21 Feb 2024 21:19:32 +0000 (22:19 +0100)]
readelf: Use unsigned loop variables in handle_verneed and handle_verdef
Prevent signed underflow by changing loop variables to unsigned and
doing count checks before decrementing. This isn't really a bug, but
prevents UB detected by ubsan on fuzzed input. The bad (fuzzed) input
data does get detected anyway.
* src/readelf.c (handle_verneed): Use unsigned cnt, cnt2.
(handle_verdef): Likewise.
Annobin address ranges were always printed as if they were 64bit wide
because addr_size was set to twice the size. This was done because the
note description size should contain two addresses. Fix this by setting
the address size to just one address and then check that descsz is
twice that.
* libebl/eblobjnote.c (ebl_object_note): Set addr_size to one
ELF_T_ADDR. Check descsz equals two times addr_size.
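The corrected check can be sketched as follows: addr_size describes one address, and the description must be exactly two of them. The function names are hypothetical and host byte order is assumed for brevity; this is not the code in ebl_object_note.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode a note description holding a start/end address pair.
   ADDR_SIZE is the size of ONE address (4 or 8 bytes), and DESCSZ must
   be exactly twice that.  */
bool
decode_addr_range (const unsigned char *desc, size_t descsz,
                   size_t addr_size, uint64_t *start, uint64_t *end)
{
  if ((addr_size != 4 && addr_size != 8) || descsz != 2 * addr_size)
    return false;
  if (addr_size == 4)
    {
      uint32_t pair[2];
      memcpy (pair, desc, sizeof pair);
      *start = pair[0];
      *end = pair[1];
    }
  else
    {
      uint64_t pair[2];
      memcpy (pair, desc, sizeof pair);
      *start = pair[0];
      *end = pair[1];
    }
  return true;
}

/* Number of hex digits needed to print one address of ADDR_SIZE bytes.  */
int
addr_hex_width (size_t addr_size)
{
  return (int) (addr_size * 2);
}
```

Keeping addr_size at the size of one address is what lets a 32-bit range print with 8 hex digits instead of 16.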
Update the documentation of dwarf_cu_dwp_section_info to make clear
that the function only returns an error if the DWARF package file data
couldn't be read or an unknown section constant is provided. Missing
DWP information for a given CU isn't an error and will set both OFFSET
and SIZE to zero. It also makes sure the documentation is < 76 chars
wide.
Omar Sandoval [Wed, 6 Dec 2023 09:22:17 +0000 (01:22 -0800)]
libdw: Try .dwp file in __libdw_find_split_unit()
Try opening the file in the location suggested by the standard (the
skeleton file name + ".dwp") and looking up the unit in the package
index. The rest is similar to .dwo files, with slightly different
cleanup since a single Dwarf handle is shared.
* libdw/libdw_find_split_unit.c (try_dwp_file): New function.
(__libdw_find_split_unit): Call try_dwp_file.
* libdw/libdwP.h (Dwarf): Add dwp_dwarf and dwp_fd.
(__libdw_dwp_findcu_id): New declaration.
(__libdw_link_skel_split): Handle .debug_addr for dwp.
* libdw/libdw_begin_elf.c (dwarf_begin_elf): Initialize
result->dwp_fd.
* libdw/dwarf_end.c (dwarf_end): Free dwarf->dwp_dwarf and close
dwarf->dwp_fd.
(cu_free): Don't free split dbg if it is dwp_dwarf.
Omar Sandoval [Wed, 6 Dec 2023 09:22:16 +0000 (01:22 -0800)]
libdw: Parse DWARF package file index sections
The .debug_cu_index and .debug_tu_index sections in DWARF package files
are basically hash tables mapping a unit's 8 byte signature to an offset
and size in each section used by that unit [1]. Add support for parsing
and doing lookups in the index sections.
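The lookup can be sketched from the open-addressing scheme that the DWARF 5 specification describes for these sections. The arrays here are assumed to be already decoded into host integers, and the function name is hypothetical; this is not libdw's implementation.

```c
#include <stdint.h>

/* Look up SIGNATURE in a package index hash table with NSLOTS slots
   (a power of two).  SIGNATURES and ROW_INDEXES are the two parallel
   arrays; an unused slot has row index 0.  Return the 1-based row
   number, or 0 if the unit is not present.  */
uint32_t
dwp_lookup (const uint64_t *signatures, const uint32_t *row_indexes,
            uint32_t nslots, uint64_t signature)
{
  if (nslots == 0)
    return 0;
  uint64_t mask = nslots - 1;
  uint64_t slot = signature & mask;
  uint64_t stride = ((signature >> 32) & mask) | 1;
  /* The stride is odd and NSLOTS is a power of two, so the probe
     sequence reaches every slot.  Bound it anyway against bad input.  */
  for (uint32_t tries = 0; tries < nslots; tries++)
    {
      if (row_indexes[slot] == 0)
        return 0;
      if (signatures[slot] == signature)
        return row_indexes[slot];
      slot = (slot + stride) & mask;
    }
  return 0;
}
```

The returned row number selects the row of per-section offsets and sizes for that unit.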
We look up a unit in the index when we intern it and cache its hash
table row in Dwarf_CU. Then, a new function, dwarf_cu_dwp_section_info,
can be used to look up the section offsets and sizes for a unit. This
will mostly be used internally in libdw, but it will also be needed in
static inline functions shared with eu-readelf. Additionally, making it
public makes dwp support much easier for external tools that do their
own low-level parsing of DWARF information, like drgn [2].
* libdw/dwarf.h: Add DW_SECT_TYPES.
* libdw/libdwP.h (Dwarf): Add cu_index and tu_index.
(Dwarf_CU): Add dwp_row.
(Dwarf_Package_Index): New type.
(__libdw_dwp_find_unit): New declaration.
(dwarf_cu_dwp_section_info): New INTDECL.
Add DWARF_E_UNKNOWN_SECTION.
* libdw/Makefile.am (libdw_a_SOURCES): Add
dwarf_cu_dwp_section_info.c.
* libdw/dwarf_end.c (dwarf_end): Free dwarf->cu_index and
dwarf->tu_index.
* libdw/dwarf_error.c (errmsgs): Add DWARF_E_UNKNOWN_SECTION.
* libdw/libdw.h (dwarf_cu_dwp_section_info): New declaration.
* libdw/libdw.map (ELFUTILS_0.190): Add
dwarf_cu_dwp_section_info.
* libdw/libdw_findcu.c (__libdw_intern_next_unit): Call
__libdw_dwp_find_unit, and use it to adjust abbrev_offset and
assign newp->dwp_row.
* libdw/dwarf_cu_dwp_section_info.c: New file.
* tests/Makefile.am (check_PROGRAMS): Add cu-dwp-section-info.
(TESTS): Add run-cu-dwp-section-info.sh
(EXTRA_DIST): Add run-cu-dwp-section-info.sh and new test files.
(cu_dwp_section_info_LDADD): New variable.
* tests/cu-dwp-section-info.c: New test.
* tests/run-cu-dwp-section-info.sh: New test.
* tests/testfile-dwp-4-strict.bz2: New test file.
* tests/testfile-dwp-4-strict.dwp.bz2: New test file.
* tests/testfile-dwp-4.bz2: New test file.
* tests/testfile-dwp-4.dwp.bz2: New test file.
* tests/testfile-dwp-5.bz2: New test file.
* tests/testfile-dwp-5.dwp.bz2: New test file.
* tests/testfile-dwp.source: New file.
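The index lookup described in this commit can be sketched as an open-addressing probe in the style the DWARF 5 package file format describes: the low bits of the signature select the first slot and the high 32 bits give an odd probe step. This is a simplified illustration over an already decoded table, not the libdw implementation; struct pkg_index and pkg_index_find_row are hypothetical names, and an unused slot is assumed to be marked by a row index of 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory view of a decoded index section.  */
struct pkg_index
{
  uint32_t slot_count;        /* Number of hash slots, a power of two.  */
  const uint64_t *signatures; /* slot_count unit signatures.  */
  const uint32_t *rows;       /* slot_count 1-based rows, 0 = unused.  */
};

/* Return the 1-based row for SIGNATURE, or 0 if it is not present.  */
uint32_t
pkg_index_find_row (const struct pkg_index *index, uint64_t signature)
{
  uint32_t m = index->slot_count;
  if (m == 0)
    return 0;
  assert ((m & (m - 1)) == 0);

  uint32_t mask = m - 1;
  uint32_t hash = (uint32_t) signature & mask;
  /* Forcing the step to be odd makes the probe visit every slot.  */
  uint32_t step = ((uint32_t) (signature >> 32) & mask) | 1;

  for (uint32_t probes = 0; probes < m; probes++)
    {
      if (index->rows[hash] == 0)
        return 0;               /* Hit an unused slot: not in table.  */
      if (index->signatures[hash] == signature)
        return index->rows[hash];
      hash = (hash + step) & mask;
    }
  return 0;
}
```

The returned row would then select the per-section offsets and sizes for the unit, which is the information dwarf_cu_dwp_section_info exposes.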
Completely replace the "fdcache" algorithms in debuginfod, which
manages files extracted from archives. Previous logic was a LRU queue
for files requested by users, and a separate LRU queue for prefetched
files found nearby the ones users requested. The code did not handle
annoying edge cases like infrequently accessed but very costly
extraction of files like fedora kernels' vdso.debug. In addition, the
queue was searched linearly for normal lookups. It was also
unceremoniously dropped at each groom cycle.
New code replaces this with an indexed data structure for quick
lookups, and extra metadata for use during eviction decisions. Each
entry tracks size and such, but now also tracks how recently and how
many times it was requested, and how long it took to originally extract.
The new code combines these quantities in a score, by which eviction
eligibility is ranked. Intuitively, the effect is to prefer to hoard
small / slow-to-access files, and prefer to jettison large / fast /
never accessed ones.
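One way to picture such a score is sketched below. The formula, the struct cache_entry fields and both function names are hypothetical and chosen only to show the intended direction of each input; the actual weighting in debuginfod is not given in this message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-entry metadata.  */
struct cache_entry
{
  double size_bytes;            /* Disk space the entry occupies.  */
  double extract_seconds;       /* Time the original extraction took.  */
  double seconds_since_request; /* Age of the most recent request.  */
  unsigned request_count;       /* Number of requests so far.  */
};

/* Higher score = more worth keeping.  Slow extraction and frequent
   requests raise the score; large size and long idle time lower it.  */
double
retention_score (const struct cache_entry *e)
{
  double benefit = e->extract_seconds * (1.0 + e->request_count);
  double burden = (1.0 + e->size_bytes) * (1.0 + e->seconds_since_request);
  return benefit / burden;
}

/* Return the index of the entry to evict first (lowest score),
   or (size_t) -1 if there are no entries.  */
size_t
pick_victim (const struct cache_entry *entries, size_t n)
{
  assert (entries != NULL || n == 0);
  if (n == 0)
    return (size_t) -1;

  size_t victim = 0;
  for (size_t i = 1; i < n; i++)
    if (retention_score (&entries[i]) < retention_score (&entries[victim]))
      victim = i;
  return victim;
}
```

Under this sketch a small file that took 30 seconds to extract and was requested several times outranks a large, quickly extracted file that was never requested, so the latter is evicted first.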
It's a tricky thing to balance. The parameters in this configuration
were tested by timing-accurate replaying a few days' worth of actual
traffic of the main fedora debuginfod server. The peer
debuginfod.stg.fedoraproject.org runs the new code. It shows good
performance, excellent use of the cache storage, and strong preference
to hold onto those vdso.debug files. But who knows, it might need
tweaking later. The new code adds more prometheus metrics to make it
possible to grok the effectiveness of the few remaining
fdcache-related options.
Patch includes doc updates and NEWS. The changes are invisible to the
testsuite (except with respect to the new metrics). Code changes are
focused on all the member functions of class libarchive_fdcache, and
their callers. Unused parameters are removed, with previous command
line options hidden/accepted/ignored. Some other minor error-path
tempfile-gc was fixed in the extraction paths.
Aaron Merey [Sat, 10 Feb 2024 02:10:19 +0000 (21:10 -0500)]
Handle DW_AT_decl_file 0
Modify dwarf_decl_file to support DW_AT_decl_file with value 0.
Because of inconsistencies in the DWARF 5 spec, it is ambiguous whether
DW_AT_decl_file value 0 is a valid .debug_line file table index for the
main source file or if it means that there is no source file specified.
dwarf_decl_file interprets DW_AT_decl_file 0 as meaning no source file
is specified. This works with DWARF 5 produced by gcc, which duplicates
the main source file name at index 0 and 1 of the file table and avoids
using DW_AT_decl_file 0.
However, clang uses DW_AT_decl_file 0 for the main source index with no
duplication at another index. In this case dwarf_decl_file will be
unable to find the file name of the main file.
This patch changes dwarf_decl_file to treat DW_AT_decl_file 0 as a normal
index into the file table, allowing it to work with DWARF 5 debuginfo
produced by clang.
As for earlier DWARF versions which exclusively use DW_AT_decl_file 0
to indicate that no source file is specified, dwarf_decl_file will now
return the name "???" if called on a DIE with DW_AT_decl_file 0.
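The resulting behaviour can be summarised in a small sketch. It models the file table as a plain array and is not the libdw code; the function decl_file_name and its parameters are hypothetical, and the pre-DWARF 5 table is assumed to leave index 0 unused.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the lookup: FILES is the .debug_line file
   table with NFILES entries, DECL_FILE the DW_AT_decl_file value.
   Returns NULL for an out-of-range index.  */
const char *
decl_file_name (unsigned dwarf_version, const char *const *files,
                size_t nfiles, size_t decl_file)
{
  assert (files != NULL);

  /* Before DWARF 5, index 0 means "no source file specified".  */
  if (dwarf_version < 5 && decl_file == 0)
    return "???";

  /* From DWARF 5 on, 0 is treated as a normal table index.  */
  if (decl_file >= nfiles)
    return NULL;
  return files[decl_file];
}
```

With a clang-style DWARF 5 table whose entry 0 is the main source file, DW_AT_decl_file 0 now resolves to that file name, while a DWARF 4 DIE with DW_AT_decl_file 0 yields "???".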
Frank Ch. Eigler [Mon, 12 Feb 2024 15:03:02 +0000 (10:03 -0500)]
debuginfod.8 man page: tweak -U explanation
In Debian bug #1063768, smcv noted that the man page was
out of date with respect to the tool debuginfod actually
uses for -U. Update the man page to fix the mismatch.
Reported-By: Simon McVittie <smcv@collabora.com>
Signed-off-By: Frank Ch. Eigler <fche@redhat.com>
Aaron Merey [Mon, 22 Jan 2024 00:44:34 +0000 (19:44 -0500)]
unstrip: Call adjust_relocs no more than once per section.
During symtab merging, adjust_relocs might be called multiple times on
some SHT_REL/SHT_RELA sections. In these cases it is possible for a
relocation's symbol index to be correctly mapped from X to Y during the
first call to adjust_relocs but then wrongly remapped from Y to Z during
the second call.
Fix this by adjusting relocation symbol indices just once per section.
Also add stable sorting for symbols during symtab merging so that the
symbol order in the output file's symtab does not depend on undefined
behaviour in qsort.
Note that adjust_relocs still might be called a second time on a section
during add_new_section_symbols. However since add_new_section_symbols
generates its own distinct symbol index map, this should not trigger the
bug described above.
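Both halves of the fix can be illustrated with a small sketch. It is not the unstrip code: the names remap_symbols, struct sym and sort_syms_stable are hypothetical, and sorting by value alone is an assumed, simplified key.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Rewrite each relocation's symbol index through MAP (old -> new).
   Running this twice on the same section maps X to Y and then Y to
   Z, which is the bug described above.  */
void
remap_symbols (size_t *rel_syms, size_t nrels,
               const size_t *map, size_t nmap)
{
  for (size_t i = 0; i < nrels; i++)
    {
      assert (rel_syms[i] < nmap);
      rel_syms[i] = map[rel_syms[i]];
    }
}

struct sym
{
  const char *name;
  unsigned long value;
  size_t orig_index;    /* Position before sorting, used as tie-break.  */
};

static int
compare_syms (const void *a, const void *b)
{
  const struct sym *s1 = a;
  const struct sym *s2 = b;
  if (s1->value != s2->value)
    return s1->value < s2->value ? -1 : 1;
  /* qsort does not promise stability, so equal keys fall back to the
     original order to keep the output deterministic.  */
  return (s1->orig_index > s2->orig_index)
         - (s1->orig_index < s2->orig_index);
}

void
sort_syms_stable (struct sym *syms, size_t n)
{
  for (size_t i = 0; i < n; i++)
    syms[i].orig_index = i;
  qsort (syms, n, sizeof syms[0], compare_syms);
}
```

With the map { 0, 2, 3, 1 }, a relocation against symbol 1 becomes 2 after one pass, which is correct, but 3 after a second pass, which is why each section must be adjusted only once.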
Mark Wielaard [Tue, 6 Feb 2024 11:34:51 +0000 (12:34 +0100)]
srcfiles: Fix --enable-gcov (BUILD_STATIC) build
When configuring with --enable-gcov we build most things static,
including libdebuginfod. The src Makefile was only set up for a
shared library build of libdebuginfod.so. Fix this by providing
a static libdebuginfod in case of BUILD_STATIC.
This fixes the builder.sourceware.org elfutils-snapshots-coverage
and provides fresh coverage reports again at
https://snapshots.sourceware.org/elfutils/coverage/latest/
* Makefile.am (BUILD_STATIC): Provide libdebuginfod.a