Nick Alcock [Wed, 3 Dec 2025 15:05:56 +0000 (15:05 +0000)]
libctf: error code reduction (still underway)
This shows the fruits of the recent ctf_err rework: the fusion of a
bunch of error codes for which we had different error codes solely so
that users could see different output in the ctf_errmsg(). Now that we
have an error-handling scheme which can produce the function name and
arbitrary extra printf()ed strings, we can merge a whole bunch of these
codes together, making the codes purely semantic and much less granular,
since we only really care that an error code is unambiguous with respect
to any given API function (e.g. if ctf_add_forward() and
ctf_member_count() both return ECTF_WRONGTYPE, you know perfectly well
what it means in each case, even though it means something quite
different: you can't make a forward to nonforwardable types like ints,
and you can't count the members of a basic type.)
There is one tiny semantic change of sorts: ctf_member_info(),
ctf_enum_name() and ctf_enum_value() used to set an error if the name
specified wasn't found. This is a fairly expected case, so we change it
to a warning. A message is still logged on the error/warning stream, and
ECTF_NONAME is set as the ctf_errno; if the caller handles it, they can
use ctf_errwarning_remove() to remove the most recent ECTF_NONAME from the
stream once they've done so.
(Among all the removed ECTF_NOT* error codes, we have kept one,
ECTF_NOTREF, because it is actually useful to be able to distinguish
"you passed in the outright wrong kind" from "this type just doesn't
point to another one", and some existing callers do make this
distinction.)
Error code count: 78 -> 63: 15 removed. More will go.
We introduce new macro-assisted functions to handle internal error
reporting. Instead of the old
ctf_err_warn (fp, err/warning, errno, fmt, ...)
we now have a family of macros that wrap functions:
ctf_ret_t ctf_err (locus, err, format, ...)
void ctf_warn (locus, err, format, ...)
and ctf_typed_err, which is the same as ctf_err except that it returns a
ctf_id_t.
So ctf_err et al can be used like ctf_set_errno(): this is because if
you pass the err value, they now set an error as well as reporting
it. But set it on what? On the locus argument, which is set by a family
of macros pretending to be functions:
err_locus (fp)
This is the basic form: if an error is found on the fp (NULL == global
we're-opening-a-new-fp stream), and the err is passed as zero, its
errmsg will be printed after the format string, and the cuname of the fp
will be derived and printed too. The function the err_locus() call
is located in (always the same, and usually the same line, as the
ctf_err() call) is also printed:
ctf_blah ("bar/baz.o"): format string: errmsg output
type_err_locus (fp, type)
The given type ID is also printed:
ctf_blah ("bar/baz.o", 0x472): format string: errmsg output
link_err_locus (err_fp, type_fp, input_num)
Used when linking some input TYPE_FP into ERR_FP: the error is extracted
from TYPE_FP and placed onto ERR_FP (if not explicitly specified), and
the INPUT_NUM (if > -1) is also printed.
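The locus idea above can be sketched outside libctf. This is only a toy
under stated assumptions: toy_locus_t, toy_err_locus, toy_type_err_locus
and toy_err are hypothetical names, not the real macros or their
signatures; the point is just that a macro captures __func__ at the call
site so the reporting function can print it.

```c
#include <stdio.h>
#include <stdarg.h>
#include <string.h>

/* Hypothetical stand-ins: not the real libctf types or macros.  */
typedef struct toy_locus
{
  const char *func;    /* Function the locus macro was expanded in.  */
  const char *cuname;  /* CU name to report, or NULL.  */
  unsigned long type;  /* Type ID, or 0 if none.  */
} toy_locus_t;

/* Macros "pretending to be functions": __func__ is captured at the
   call site, not inside the reporting function.  */
#define toy_err_locus(cuname) \
  ((toy_locus_t) { __func__, (cuname), 0 })
#define toy_type_err_locus(cuname, type) \
  ((toy_locus_t) { __func__, (cuname), (type) })

/* Format 'func ("cuname"[, 0xTYPE]): message' into BUF.  Always returns
   -1, so callers can write "return toy_err (...)".  */
static int
toy_err (char *buf, size_t len, toy_locus_t locus, const char *fmt, ...)
{
  const char *cu = locus.cuname ? locus.cuname : "(none)";
  va_list ap;
  int n;

  if (locus.type != 0)
    n = snprintf (buf, len, "%s (\"%s\", 0x%lx): ", locus.func, cu,
                  locus.type);
  else
    n = snprintf (buf, len, "%s (\"%s\"): ", locus.func, cu);

  if (n >= 0 && (size_t) n < len)
    {
      va_start (ap, fmt);
      vsnprintf (buf + n, len - (size_t) n, fmt, ap);
      va_end (ap);
    }
  return -1;
}

/* Example caller: the function name in the message is "toy_blah".  */
static int
toy_blah (char *buf, size_t len)
{
  return toy_err (buf, len, toy_type_err_locus ("bar/baz.o", 0x472),
                  "cannot frob type: %s", "wrong kind");
}
```

Unlike the real thing, this toy writes into a caller-provided buffer
rather than onto an error/warning stream, and sets no errno.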
Nick Alcock [Fri, 28 Nov 2025 20:45:44 +0000 (20:45 +0000)]
libctf: partial internal ctf_ret_tification
This is not a complete sweep of everything returning int, but much of
ctf-archive.c, ctf-dedup.c, ctf-link.c and ctf-serialize.c is now
ctf_ret_t-correct.
Nick Alcock [Wed, 19 Nov 2025 14:13:45 +0000 (14:13 +0000)]
libctf: API review: add ctf_errwarning_remove
Now that we have the error code in the error/warning stream, and are
preparing for pervasive use of the error stream to capture error content
for human consumption, we are faced with another problem: if a returned
error is not actually erroneous, and is handled, how do we discard it
without potentially losing other errors too? The problem is that some
errors might be followed by other errors for unrelated reasons: a
consumer (usually inside but potentially also outside libctf) might want
to remove the error it's handled from the stream without touching later
ones.
This is what ctf_errwarning_remove does. (As with ctf_errwarning_next,
it can take a NULL fp to indicate the special stream used for errors not
associated with any dictionary, like archive errors and open-time
errors.)
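A toy model of the problem, assuming removal is keyed on the error code
and picks the most recent match (all names here are hypothetical and the
real ctf_errwarning_remove signature is not shown in this log): each
queued message carries its code, and removing one handled entry leaves
later, unrelated entries alone.

```c
#include <stdlib.h>
#include <string.h>

/* Toy model only: none of these names are libctf's.  */
typedef struct toy_msg
{
  struct toy_msg *next;
  int code;    /* Error code stored alongside the text.  */
  char *text;
} toy_msg_t;

/* Append a message to the stream rooted at *HEAD.  */
static int
toy_push (toy_msg_t **head, int code, const char *text)
{
  size_t len = strlen (text) + 1;
  toy_msg_t *m = malloc (sizeof (*m));

  if (m == NULL)
    return -1;
  if ((m->text = malloc (len)) == NULL)
    {
      free (m);
      return -1;
    }
  memcpy (m->text, text, len);
  m->code = code;
  m->next = NULL;

  while (*head != NULL)
    head = &(*head)->next;
  *head = m;
  return 0;
}

/* Remove the most recently pushed entry with the given CODE, leaving
   everything else (including later entries) untouched.  Returns 0 if
   an entry was removed, -1 if there was none.  */
static int
toy_remove (toy_msg_t **head, int code)
{
  toy_msg_t **last = NULL;
  toy_msg_t **p;
  toy_msg_t *victim;

  for (p = head; *p != NULL; p = &(*p)->next)
    if ((*p)->code == code)
      last = p;

  if (last == NULL)
    return -1;

  victim = *last;
  *last = victim->next;
  free (victim->text);
  free (victim);
  return 0;
}

/* Number of messages still queued.  */
static size_t
toy_count (const toy_msg_t *head)
{
  size_t n = 0;

  for (; head != NULL; head = head->next)
    n++;
  return n;
}
```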
Nick Alcock [Wed, 19 Nov 2025 13:58:14 +0000 (13:58 +0000)]
libctf: API review: report error code in ctf_errwarning_next stream
The ctf_err_warn internal function that pushes new errors onto the
ctf_errwarning_next stream takes a ctf_error_t as well as an error
string. It doesn't put this onto the error it saves, but only
debugging-printf()s it. This means that it throws away the error code
and relies on the textual error redundantly transmitting the same
information, which is quite undesirable, particularly if the caller
wants to take note of the errors programmatically.
So store each error code with the corresponding error string, and
report it via the error pointer argument that ctf_errwarning_next
is already being provided (and which has never been initialized
unless ctf_errwarning_next itself encountered an error and returned
NULL, so this isn't even an API break).
Nick Alcock [Wed, 19 Nov 2025 12:32:23 +0000 (12:32 +0000)]
fixup! libctf: move string deduplication into ctf-archive
ctf_err_copy is an obscure function used during archive generation,
propagating the errwarning stream from archives beyond the first (the
children) into the first one (the parent): this means that the rest of
libctf can ignore the possibility that archives are being generated when
emitting errors, and the archive writer will compensate.
This code has been emitting format string errors ever since it was
written, and I've been ignoring them for just as long because of course
we know it must be a false positive because this is a ctf_err_warn()
call that is propagating something that is already stored in an
errwarning string and thus has already been passed through vasprintf().
It isn't. If for some reason we ended up emitting a %% into the
errwarning stream, this second invocation would treat it as a conversion
specification and might do all the horrible things that can happen
then. Though we do no such thing in any ctf_err_warn() calls in libctf,
this is still a real risk because we print human-readable text derived
from CTF dicts on error paths. These are all type names, so can't
usually contain % characters, but crafted CTF dict inputs might, and
then all bets are off. We are mostly saved by the only real caller of
this code being the linker, but even then, one crafted object file in
the link triggering an error during archive preserialization and
suddenly we are printf()ing using an attacker-controlled format string.
Fix this by manually propagating the errors, avoiding a second
ctf_err_warn() call.
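The hazard is easy to reproduce in miniature. In this sketch (toy_log
and toy_propagate are hypothetical names, not the libctf functions), a
message is formatted once, and then propagated by copying the bytes
rather than by feeding the stored string back in as a format:

```c
#include <stdio.h>
#include <stdarg.h>
#include <string.h>

/* Hypothetical printf-style reporter: formats a message into BUF.  */
static void
toy_log (char *buf, size_t len, const char *fmt, ...)
{
  va_list ap;

  va_start (ap, fmt);
  vsnprintf (buf, len, fmt, ap);
  va_end (ap);
}

/* Propagate an already-formatted message by copying it.  The buggy
   alternative, toy_log (dst, len, stored), would reinterpret any '%'
   in STORED as a conversion specification.  */
static void
toy_propagate (char *dst, size_t len, const char *stored)
{
  size_t n = strlen (stored);

  if (len == 0)
    return;
  if (n >= len)
    n = len - 1;   /* Truncate rather than overrun.  */
  memcpy (dst, stored, n);
  dst[n] = '\0';
}
```

If a type name like "evil%n%s" reaches the first toy_log() call as a %s
argument, it is harmless there; it only becomes dangerous if the
resulting string is later used as a format itself.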
This function only exists on the ctfv4 branch: no released binutils are
impacted, nor is binutils trunk.
Nick Alcock [Tue, 18 Nov 2025 16:02:43 +0000 (16:02 +0000)]
libctf, include: delete ctf_add_* flag arg and CTF_ADD_*.
Finally we can remove the poorly-named "flag" argument and rely on using
ctf_type_set_conflicting to make (non-root-visible) conflicting types.
This is an API break for essentially anything that adds any types at
all, but probably won't require more than deleting CTF_ADD_ROOT
parameters everywhere, and possibly turning CTF_ADD_NONROOT into calls
to ctf_type_set_conflicting.
Nick Alcock [Tue, 18 Nov 2025 12:52:30 +0000 (12:52 +0000)]
libctf: call ctf_type_set_conflicting earlier in dedup
Now that we can use ctf_type_set_conflicting to indicate that a type
that will be emitted later should be marked conflicting, do that, and
exploit this in normal links so that should you ever do a normal link
with conflicted types in it, the CU name on the output conflict marking
is going to be one of those on the inputs (we make no attempt to
pick a consistent one, though).
Nick Alcock [Tue, 18 Nov 2025 12:03:47 +0000 (12:03 +0000)]
libctf, include: API review: extend ctf_type_set_conflicting
We'd like the libctf API to make common things easy and not to be
cluttered up with rarely-needed things except when those things are
actually required. The current API's ctf_add_*() calls are all far from
that ideal because every single one of them takes a badly-named "flag"
which indicates whether this type is root-visible or not.
Almost all types are root-visible: C doesn't even have a concept of a
non-root-visible type, and BTF doesn't implement it at all. CTFv4 uses
the concept solely for conflicting types that have nonetheless ended up
in the same dict (usually because several dicts have been merged into
one via the CU-mapping mechanism). Normal links cannot do this, so
almost no ctf_add_*() callers are ever going to want a non-root-visible
type. So we should definitely drop the flag...
... but if we do, how does code that does want a conflicting,
non-root-visible type indicate it? It can't pass CTF_ADD_NONROOT if that
is deleted, but it also can't use ctf_type_set_conflicting because that
can only be called on existing types, but if a type is conflicting its
name is going to clash with an existing type, so we're going to want to
mark it as such from the moment it's added, before its type ID is even
assigned (at the time name-clash detection is carried out).
Fix this by extending ctf_type_set_conflicting to let you say that type
0 is conflicting, which is taken to mean that the next
successfully-added type in this dict is marked conflicting as soon as it
is added, just like passing CTF_ADD_NONROOT currently does. Type 0 has
no physical existence (BTF uses it for void), so it cannot be
conflicting and so is free in this API for this obscure edge-case use.
(This interface would be horrible if it was meant for frequent use, but
only deduplicators are likely to use it, and not most of those.)
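The "type 0 means the next type added" convention can be modelled like
this. It is a sketch of the semantics described above, not libctf code:
toy_dict_t, toy_set_conflicting and toy_add_type are made-up names, and
the real function also deals with CU names, which this ignores.

```c
/* Toy dict: not libctf's ctf_dict_t.  */
#define TOY_MAX_TYPES 16

typedef struct toy_dict
{
  unsigned long ntypes;                /* IDs 1..ntypes are in use.  */
  int conflicting[TOY_MAX_TYPES + 1];  /* Indexed by type ID.  */
  int next_conflicting;                /* Pending mark set via type 0.  */
} toy_dict_t;

/* Mark TYPE as conflicting.  Type 0 never exists, so it is reused to
   mean "the next successfully-added type".  */
static int
toy_set_conflicting (toy_dict_t *d, unsigned long type)
{
  if (type == 0)
    {
      d->next_conflicting = 1;
      return 0;
    }
  if (type > d->ntypes)
    return -1;   /* No such type.  */
  d->conflicting[type] = 1;
  return 0;
}

/* Add a type, returning its ID, or 0 on failure.  A pending mark is
   consumed only by a successful addition.  */
static unsigned long
toy_add_type (toy_dict_t *d)
{
  unsigned long id;

  if (d->ntypes >= TOY_MAX_TYPES)
    return 0;
  id = ++d->ntypes;
  d->conflicting[id] = d->next_conflicting;
  d->next_conflicting = 0;
  return id;
}
```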
Nick Alcock [Tue, 18 Nov 2025 10:25:23 +0000 (10:25 +0000)]
libctf, include: API review: ctf_add_qualifier
We don't need three separate functions just to add cvr-quals. So replace
ctf_add_const, ctf_add_restrict and ctf_add_volatile with one new
function, ctf_add_qualifier, and use it in the deduplicator rather than
its current use of the libctf-internal ctf_add_reftype. This also lets
us move various pointer-only stuff like ptrtab updating into
ctf_add_pointer, where it belongs.
(We don't export ctf_add_reftype because pointers are too conceptually
different from qualifiers to add with the same function.)
Nick Alcock [Tue, 11 Nov 2025 17:33:56 +0000 (17:33 +0000)]
libctf: API review: delete deprecated functions
ctf_file_close, ctf_arc_open_by_name and the struct ctf_file (but not
the ctf_file_t typedef) have almost no remaining external users and
can be deleted easily.
Nick Alcock [Tue, 11 Nov 2025 17:11:08 +0000 (17:11 +0000)]
libctf: API review: delete ctf_gzwrite
This function is absolutely unused, does something nearly useless
(writing out the entire dict compressed to a zlib stream, which is not a
thing ctf_open can even open), has its own dedicated testcase, and
didn't work for years and nobody noticed.
Delete it.
(ctf_compress_write is also unused, but seems somewhat less useless.
Holding off rationalizing that until after the archives are redone, but
it will probably go too in favour of just letting you compress with a
threshold.)
Nick Alcock [Tue, 11 Nov 2025 17:00:31 +0000 (17:00 +0000)]
libctf: API review: delete ctf_update, ctf_discard
There is a surprising amount of infrastructure for these horrible
deprecated functions: two distinct pieces of state in the ctf_dict_,
and their own dedicated error... which never worked because we failed to
update one of those pieces of state properly.
Delete the lot. There are a lot of callers in older code, but they are
all obsolete and can mostly be either deleted (if ctf_update() was there
simply to arrange to be able to access newly-added types) or replaced
with calls to ctf_snapshot/ctf_rollback.
Nick Alcock [Tue, 11 Nov 2025 16:51:59 +0000 (16:51 +0000)]
libctf, gdb, include: API review: merge ctf_add_{objt,func}_sym: more enums
This merges together ctf_add_objt_sym and ctf_add_func_sym into one
ctf_add_funcobjt_sym (which is what is used under the covers anyway),
with a new enum instead of the 'is this a function' enum (to allow for
future expansion into other symbol types, and for self-documentation:
CTF_STT_OBJT is just clearer than an unadorned 0).
Nick Alcock [Tue, 11 Nov 2025 16:40:22 +0000 (16:40 +0000)]
libctf: API review: deprecate slice addition
Slices are generally deprecated and present only for v3 dict
compatibility: adding new ones is almost certainly a mistake. Deprecate
ctf_add_slice accordingly.
Nick Alcock [Tue, 11 Nov 2025 16:37:31 +0000 (16:37 +0000)]
libctf: API review: delete ctf_getdebug
While setting debugging programmatically is incredibly useful for
conditional debugging, getting it is just a recipe for disaster. We
don't want callers to ever change behaviour merely because debugging is
on!
Nick Alcock [Tue, 11 Nov 2025 16:29:27 +0000 (16:29 +0000)]
libctf: API review: C++ compatibility for enumized flags
A number of the recently-enumized flags are ones for which we want
the caller to be able to specify "no flags set" by just passing 0.
This is, alas, not C++-compatible: C++ only defines implicit conversions
from enum to int, not the other way round, not even for 0. So make them
all typedefs to int on C++. (Does not affect name mangling: this is all
in extern "C". It does mean you'll get fewer warnings in C++ code than
in C code because the compiler can no longer do enum coverage analysis,
but in most cases this is harmless, except for ctf_type_kind_t.)
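The pattern looks roughly like the following. The flag names are
invented for illustration and this is not the actual header text; it
only shows how a flag type can be an enum in C while degrading to int
in C++ so that a literal 0 still converts.

```c
/* Hypothetical flag set: not a real libctf type.  */
#ifdef __cplusplus
extern "C" {
/* C++: a plain int, so callers can pass 0 for "no flags".  */
typedef int toy_flags_t;
enum { TOY_FLAG_A = 1, TOY_FLAG_B = 2 };
#else
/* C: a real enum, so the compiler can do coverage analysis.  */
typedef enum toy_flags { TOY_FLAG_A = 1, TOY_FLAG_B = 2 } toy_flags_t;
#endif

/* Count how many flag bits are set.  */
static int
toy_flag_count (toy_flags_t flags)
{
  unsigned int bits = (unsigned int) flags;
  int n = 0;

  for (; bits != 0; bits >>= 1)
    n += (int) (bits & 1u);
  return n;
}

#ifdef __cplusplus
}
#endif
```

Both toy_flag_count (0) and toy_flag_count (TOY_FLAG_A | TOY_FLAG_B)
compile in either language with this arrangement.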
Nick Alcock [Tue, 11 Nov 2025 16:09:36 +0000 (16:09 +0000)]
libctf, gdb, include: API review: zap ctf_lookup_variable, ctf_variable_next
These functions are totally redundant now that we have
ctf_lookup_by_kind and ctf_type_kind_next: the most you might have to do
is a ctf_type_reference to get from the CTF_K_VAR to its type.
The names of these functions are historical and inconsistent with other
functions in the API: they relate to dict-wide properties but are not in
the ctf_dict_* namespace, and some of them take ints that could be
enums.
Nick Alcock [Mon, 10 Nov 2025 17:56:42 +0000 (17:56 +0000)]
libctf, include, gdb: API review: rethink function type APIs
The existing interfaces for function type lookup were as follows:
- ctf_func_type_info, which stuffed a bit of info about the function
(return type, varargs, number of args) into a structure
- ctf_func_type_args, which took a big-enough structure and put arg
info into it
- ctf_func_type_arg_names (new in CTFv4), which did the same for a
const char * array of arg names.
(and the ctf_func_info/args/arg_names by-symbol stuff we just deleted.)
This interface was always clunky, but with the addition of arg names it
became fearfully inconvenient, requiring the user to hand-allocate
things over and over again, and forcing the filling out of tiny
structures where simple parameters would do just as well.
Redo this completely, ripping out ctf_funcinfo_t and all three
ctf_func_type_* functions, replacing them with these:
/* Given a type ID relating to a function type or function linkage type, return
an array of arg types, and optionally in the RET argument the return type
too. Vararg functions set CTF_FUNC_VARARG in the optional FLAG argument. The
NARGS arg gives the length of the array (optional because it will always
have the same value when you call both ctf_func_type and ctf_func_arg_names,
so you only need to get it from one of them). */
Using these is *radically* easier because of the reduction of manual
allocation overhead, with only one tiny cost: if you only want to know
how many args a function has, but not anything about them, you have to
free the args ctf_func_type() returns now (and pay the cost of
populating it). If this really becomes a problem, we can fix it when
that happens.
(Look at the change to ctf_dedup_rhash_type for an example of the amount
of code reduction you can often expect.)
We also make the function flags into an enum while we're at it.
Nick Alcock [Mon, 10 Nov 2025 16:04:35 +0000 (16:04 +0000)]
include, libctf: API review: ctf_dict_set_flag, ctf_dict_get_flag
Turn the int this takes into an enum, and rename ctf_dict_get_flag to
ctf_dict_flag because querying functions in this library do not usually
use _get in their names.
This function is only useful for looking up enumerators in dictionaries
where multiple enumerators can have the same value. C cannot define
such dictionaries, and as of 2024 the deduplicator will not produce
them, so they should get rarer over time.
Adjust the deprecation attribute so that it's more specifically
libctf-named, and arrange to turn it off when compiling libctf itself,
because other functions (notably ctf_arc_lookup_enumerator_next) use
ctf_lookup_enumerator_next internally, and we don't want deprecation
warnings for internal uses by libctf of its own functionality.
Nick Alcock [Mon, 10 Nov 2025 15:37:43 +0000 (15:37 +0000)]
libctf, gdb, include: API review: enum values become ctf_enum_value_t
A ctf_enum_value is a discriminated union of a signed and unsigned
int64, and a ctf_encoding_t giving the encoding of the corresponding
enum, so you can tell whether it's signed or not by looking up
encoding.cte_format & CTF_INT_SIGNED. Because the union is at the
start, you can usually just cast the entire structure to an int64_t or
whatever if you already know the signedness of the enum values.
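A sketch of the layout this describes. The struct and field names are
guesses for illustration (only the "union first, encoding after" shape
and the signedness test come from the description above), and
TOY_INT_SIGNED stands in for CTF_INT_SIGNED.

```c
#include <stdint.h>

#define TOY_INT_SIGNED 0x01  /* Stand-in for CTF_INT_SIGNED.  */

/* Hypothetical encoding: only the format field matters here.  */
typedef struct toy_encoding
{
  uint32_t format;
  uint32_t offset;
  uint32_t bits;
} toy_encoding_t;

/* Hypothetical enum value: the union comes first, so a pointer to the
   whole struct is also a pointer to the 64-bit value.  */
typedef struct toy_enum_value
{
  union
  {
    int64_t s;
    uint64_t u;
  } val;
  toy_encoding_t encoding;
} toy_enum_value_t;

static int
toy_enum_is_signed (const toy_enum_value_t *v)
{
  return (v->encoding.format & TOY_INT_SIGNED) != 0;
}

/* Three-way compare of two values of the same enum, honouring its
   signedness.  Returns <0, 0 or >0.  */
static int
toy_enum_value_cmp (const toy_enum_value_t *a, const toy_enum_value_t *b)
{
  if (toy_enum_is_signed (a))
    return (a->val.s > b->val.s) - (a->val.s < b->val.s);
  return (a->val.u > b->val.u) - (a->val.u < b->val.u);
}
```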
Changing every function that takes or receives an enum value
(ctf_enum_value, ctf_enum_name, ctf_enum_next, ctf_lookup_enumerator,
ctf_lookup_enumerator_next, ctf_arc_lookup_enumerator_next) to take a
ctf_enum_value_t instead means that we can drop ctf_enum_unsigned and
ctf_enum_unsigned_value and finally treat signed and unsigned enums, and
enums and enum64s, the same in most end-user code.
Nick Alcock [Mon, 10 Nov 2025 13:35:49 +0000 (13:35 +0000)]
include, libctf: API review: ctf_add_enum* rationalization
Similarly to ctf_add_struct, we can convert four enum-addition functions
into one by unconditionally adding the encoding parameter, but allowing
it to be NULL, and adding a type kind that can be UNKNOWN (which in this
case means "enum64", since narrow enums are obsolescent):
Uses that don't care about encodings can pass NULL: uses that just want
unsigned enums can pass a ctf_encoding_t initialized to all-0 encoding
(since 0 is !CTF_INT_SIGNED): only people that really want slices need
to pass anything else:
if (ctf_add_enum (fp, CTF_ADD_ROOT, "an_enum", 0, 0) == 0)
Uses that don't want narrow enums can just pass 0 as the type kind; uses
that want an enum like some other enum can pass its kind and encoding in:
Nick Alcock [Sun, 9 Nov 2025 15:43:01 +0000 (15:43 +0000)]
libctf, include: API review: ctf_add_member rationalization
ctf_add_member*() is a historical mess. There were four functions:
- ctf_add_member, which added a member to a struct or union, but didn't
let you say where: it picked the next offset based on the natural
alignment of the type. This was the only function available in
Solaris libctf.
- ctf_add_member_offset, which let you specify an offset, but in order
to emulate ctf_add_member (or just C-style "stick a member on the
end, I really don't care what the offset is") let you pass in -1 to
indicate that. Alas, this meant that it took an ssize_t, which
forced all the iteration functions returning offsets to also return
an ssize_t, even though structure offsets can easily exceed an
ssize_t's range: the structure used by ctf_member_info had a size_t
offset!
- ctf_add_member_bitfield: a CTFv4 addition to add the new
    BTF-compatible bitfield members. This is the actual underlying
function that all the others are implemented in terms of. It lets
you add members at any offset higher than any you have previously
specified, even too high for BTF to encode (CTFv4 CTF_K_BIG
structures are used in this case). The offset here was an unsigned
long (?!).
- ctf_add_member_encoded, which used to be a utility function to add
slices without needing to realise they weren't old-style encoded base
    types, but now serves to automatically introduce slices if you try to
add a bitfield to a non-bitfield-capable struct.
Rationalize this giant mess by eliminating ctf_add_member_offset, adding
an offset to ctf_add_member, and dropping ctf_add_member_encoded
entirely on the grounds that anyone trying to add a bitfield to a
non-bitfield-capable struct is introducing a bug, and why are we adding
functions to cater to *that*?
We also change all the offsets to size_t, including in the return value
of ctf_member_next.
But this adds another piece of pain. ctf_member_next represents its
error return by returning -1, and callers usually used >= 0 to check for
end-of-iteration -- but if offsets are unsigned, suddenly the common
case of end-of-iteration checking means an ugly typecast! The same is
true for adding members of don't-care offset.
Make this cleaner with two new #defines, both of which expand to the
same (size_t) -1 typecast.
CTF_MEMBER_ERR lets you do this with ctf_member_next:
if (ctf_add_member (fp, type, "wombat", 0, CTF_NEXT_MEMBER) == 0)
The similarity of names between ctf_member_next() and CTF_NEXT_MEMBER
seems like it should be confusing, but not really, as they have similar
names because they are similar: ctf_member_next() gives you the next
member, and CTF_NEXT_MEMBER means "just make this member the next
member, I don't care what the offset is".
Maybe CTF_NEXT_MEMBER_OFFSET would be clearer, but it's a bit long...
CTF_NEXT_MEMB_OFF? Ew.
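To make the two sentinels concrete, here is a toy struct builder. It is
an illustration under assumptions: the TOY_* names and functions are
invented, alignment is ignored (the "next" offset is simply the end of
the last member), and only the (size_t) -1 trick and the ascending-order
rule are taken from the text.

```c
#include <stddef.h>

/* Both expand to the same value, as described above.  */
#define TOY_MEMBER_ERR  ((size_t) -1)
#define TOY_NEXT_MEMBER ((size_t) -1)
#define TOY_MAX_MEMBERS 8

/* Toy struct type: member offsets and widths, in bits.  */
typedef struct toy_struct
{
  size_t nmembers;
  size_t offset[TOY_MAX_MEMBERS];
  size_t bits[TOY_MAX_MEMBERS];
} toy_struct_t;

/* Add a member BITS wide at OFFSET, or directly after the previous
   member if OFFSET is TOY_NEXT_MEMBER.  Offsets lower than the last
   member's are rejected.  Returns 0, or -1 on error.  */
static int
toy_add_member (toy_struct_t *s, size_t bits, size_t offset)
{
  size_t n = s->nmembers;

  if (n >= TOY_MAX_MEMBERS)
    return -1;
  if (offset == TOY_NEXT_MEMBER)
    offset = n > 0 ? s->offset[n - 1] + s->bits[n - 1] : 0;
  else if (n > 0 && offset < s->offset[n - 1])
    return -1;   /* Descending offset.  */

  s->offset[n] = offset;
  s->bits[n] = bits;
  s->nmembers++;
  return 0;
}

/* Return the offset of member *I and advance, or TOY_MEMBER_ERR once
   iteration is over.  */
static size_t
toy_member_next (const toy_struct_t *s, size_t *i)
{
  if (*i >= s->nmembers)
    return TOY_MEMBER_ERR;
  return s->offset[(*i)++];
}
```

A loop then reads as "while ((off = toy_member_next (&s, &i)) !=
TOY_MEMBER_ERR)", with no cast in sight.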
Nick Alcock [Sun, 9 Nov 2025 15:41:26 +0000 (15:41 +0000)]
libctf, include: API review: delete ctf_add_member_encoded
This function's only use over simply calling ctf_add_member_bitfield is
to automatically add a slice if the encoding passed specifies a bitfield
but this structure is not bitfield-capable.
We're trying to deprecate slices, and this use adds nothing. Delete it,
and its only caller (a test devoted entirely to testing it).
Nick Alcock [Sun, 9 Nov 2025 14:41:42 +0000 (14:41 +0000)]
libctf, include: fold ctf_add_struct*, ctf_add_union* into ctf_add_struct
Rather than having a whole family of functions to add structs and
unions, we now have just one, that you can call like this (full
generality, adding a union of known size with bitfields in it):
or like this ("work out if it's a struct yourself", right now, always a
struct: in future, adding overlapping members may change it into a
union, plus shorthand for "no bitfields"):
(the first 0 is technically CTF_K_UNKNOWN, but it means you can just
call ctf_add_struct() with a lot of zero parameters to add a normal,
boring, ordinary C struct, and you don't need to remember what each zero
individually means.)
Nick Alcock [Sun, 9 Nov 2025 14:39:03 +0000 (14:39 +0000)]
include: struct members must be added in ascending order by offset
We've thrown ECTF_DESCENDING for some time if this wasn't true. (In
theory this restriction could be lifted later on if necessary, but in
practice since this means adding the members in a different order from
their C declaration, this seems unlikely ever to be something anyone
would want to do.)
include/
* ctf-api.h (ctf_add_member): You cannot add struct members
backwards any more.
Nick Alcock [Fri, 7 Nov 2025 14:58:09 +0000 (14:58 +0000)]
include, libctf: API review: ctf_linkages_t enum
This moves all the (somewhat redundant) CTF names for BTF linkage flags
into their own enum. Alas, ctf_linkage_t is already taken for the
BTF/CTF structure actually holding the linkages... but this name will
do.
Nick Alcock [Thu, 6 Nov 2025 20:27:45 +0000 (20:27 +0000)]
libctf, include, gdb: introduce ctf_kind_t enum for type kinds
This lets us validate that switches on type kinds cover all possible
options, and serves as documentation that another class of mysterious
ints in the public API is a type kind rather than a random integer.
Nick Alcock [Thu, 6 Nov 2025 19:52:58 +0000 (19:52 +0000)]
libctf: create: improve handling of root flag in ctf_add_type
The result of the internal *CTF_INFO_ISROOT macros is not necessarily
convertible to the "flag" argument of the ctf_add_* functions, though as
it happens right now they are (CTF_ADD_ROOT == 1).
Use an explicit conditional, and split the two up.
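In miniature, the change amounts to something like this. The macro
definitions are hypothetical (in particular, this TOY_INFO_ISROOT
deliberately yields a raw masked bit rather than 0/1, to show why
passing it straight through as a flag would be fragile):

```c
/* Hypothetical definitions: not the real libctf macros.  */
#define TOY_INFO_ISROOT(info) ((info) & 0x0400u)  /* Nonzero if root.  */
#define TOY_ADD_NONROOT 0
#define TOY_ADD_ROOT 1

/* Translate the info word's root bit into an add-function flag with
   an explicit conditional, not by relying on the values coinciding.  */
static int
toy_root_flag (unsigned int info)
{
  return TOY_INFO_ISROOT (info) ? TOY_ADD_ROOT : TOY_ADD_NONROOT;
}
```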
Nick Alcock [Thu, 6 Nov 2025 19:04:58 +0000 (19:04 +0000)]
libctf, include: API review: ctf_type_name changes
ctf_type_aname_raw is implementable as ctf_type_aname() and a trivial
strdup call: delete as useless.
ctf_type_lname has no users and an odious API: delete.
ctf_type_name squats on a really useful function name, has an odious
API, corrupts its return value via truncation and reeks of non-GNU
ancient Unix practices. It's widely used, unfortunately, so to
discourage use and force auditing rename it to ctf_type_sname, add a
*len parameter so it can also supplant ctf_type_lname, and mark it
deprecated.
Nick Alcock [Thu, 6 Nov 2025 17:26:07 +0000 (17:26 +0000)]
libctf: API review: _next iterators are freed/annulled on error
The existing pattern for _next iterators had them automatically freed
and set to NULL on successful iteration ("error" return ECTF_NEXT_END),
but on other errors, the iterator was not freed, on the grounds that you
might in theory be able to recover, and continue iteration.
While theoretically more expressive, this is in practice fiercely
annoying: even I, as its author, hardly ever remember to free on
non-ECTF_NEXT_END error returns, and if I forget everyone else is sure
to. Further, in practice this source of leaks gains us nothing:
essentially no errors are recoverable except for ECTF_NOPARENT
(ctf_import and try again), and ECTF_NOPARENT is about to be removed
when ctf_import becomes implicit at open time. (Also, since it always
happens at start of iteration, you could always have simply re-done the
iteration in any case.)
Move to implicitly destroying and annulling iterators on all iteration
errors. No callers need adjusting: ctf_next_destroy() on the null
pointer that it will get after this change is a NOP, just like free().
Also do an audit of all iterator uses for error-checking and lack-of-
necessary-ctf_next_destroy()-on-early-exit bugs, and fix all found.
(There were even a few cases where we were failing to check for errors
and end-of-iteration!)
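The new contract can be shown with a toy iterator (hypothetical names
throughout; TOY_NEXT_END plays the role of ECTF_NEXT_END): whatever the
reason iteration stops, the iterator is freed and the caller's pointer
is set to NULL.

```c
#include <stdlib.h>

#define TOY_NEXT_END 1  /* Stand-in for ECTF_NEXT_END.  */
#define TOY_CORRUPT  2  /* Some other, "real" error.  */
#define TOY_NOMEM    3

typedef struct toy_next
{
  size_t pos;
} toy_next_t;

/* Safe on NULL, just like free().  */
static void
toy_next_destroy (toy_next_t *it)
{
  free (it);
}

/* Yield VALS[0..N-1] in turn; a negative element counts as corrupt
   input.  On end-of-iteration or on any error, free *IT, set it to
   NULL, set *ERR and return -1.  */
static int
toy_next (const int *vals, size_t n, toy_next_t **it, int *err)
{
  toy_next_t *i = *it;

  if (i == NULL)
    {
      if ((i = calloc (1, sizeof (*i))) == NULL)
        {
          *err = TOY_NOMEM;
          return -1;
        }
      *it = i;
    }

  if (i->pos >= n)
    *err = TOY_NEXT_END;
  else if (vals[i->pos] < 0)
    *err = TOY_CORRUPT;
  else
    return vals[i->pos++];

  toy_next_destroy (i);
  *it = NULL;
  return -1;
}
```

A caller that still calls toy_next_destroy() after a failed loop is
harmless, since by then the pointer is NULL.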
Nick Alcock [Thu, 6 Nov 2025 16:30:23 +0000 (16:30 +0000)]
libctf: ctf_dynhash_next_remove: fix crash on error case
If this is called on an errored-out or freed iterator, it's meant to do
nothing, just like free(). Instead, it dereferences a null pointer due
to a couple of early tests being done in the wrong order.
Fix that.
libctf/
* ctf-hash.c (ctf_dynhash_next_remove): Check for null iterator
before dereferencing it.
Nick Alcock [Thu, 6 Nov 2025 15:34:12 +0000 (15:34 +0000)]
libctf, include: make ctf_lookup_by_symbol and ctf_lookup_by_symbol_name private
These two functions allow you to look up symbols by name and symbol
index in single CTF dicts. Now that ctf_dict_arc can always give you
back an archive for any dict (whether opened from an archive or not),
they are strictly redundant to ctf_arc_lookup_symbol and
ctf_arc_lookup_symbol_name, given that symbols by definition are
per-ELF-object and thus there can be only one type for a given symbol
across an entire archive.
So drop both these functions from the public symbol set, and fix up all
callers (trivial).
Nick Alcock [Thu, 6 Nov 2025 15:09:47 +0000 (15:09 +0000)]
libctf, include, gdb: API review: ctf_get_arc -> ctf_dict_arc, endianness
libctf lets you specify the endianness of the symbol table, for those
cases where the symbol table is not derived from an ELF file. It has
not one but two functions to do this, ctf_symsect_endianness and
ctf_arc_symsect_endianness.
This is such an obscure use case that having two functions seems like
overkill. We can reduce it to only one by saying that ctf_get_arc
can synthesise an archive if one doesn't already exist: then all dual
ctf_* and ctf_arc_* functions can be reduced to just the ctf_arc_*
form. While we're at it, rename ctf_get_arc to ctf_dict_arc, like all
other functions that take a dict and return something else.
We do need to add a parameter to ctf_dict_arc, to allow for one use
case. Normally, if you call it and it synthesises an archive for you,
you don't want to be bothered with it: ctf_dict_close frees it
automatically and you can ignore its lifetime. But if you are
specifically calling ctf_dict_arc because you know you opened the
archive yourself (it's not a synthetic one) and you need to close
it explicitly, you absolutely do not want one to be synthesised: this
will lead to a double-free when you do a ctf_dict_close() followed by a
ctf_close().
So add a "do not synthesise an archive, just return the real thing"
parameter to ctf_dict_arc accordingly.
Nick Alcock [Thu, 6 Nov 2025 14:43:05 +0000 (14:43 +0000)]
libctf, include: API review: merge ctf_get*sect into ctf_get_elf_sect
These functions all have the same purpose: to return some ELF section
related to CTF, directly from the internal ctf_dict_t state, so that
it can be freed after close (for those cases where the dict was opened
via ctf_bufopen et al, and now the caller wants those sections back
so it can free them).
Merge them into one function, rename it so it's clearly about ELF
sections, not CTF sections, and pass in a new enum ctf_elfsect_names_t
to indicate which section is wanted.
Nick Alcock [Thu, 6 Nov 2025 14:27:30 +0000 (14:27 +0000)]
libctf, include: API review: delete ctf_dict_open_sections et al
This function has no practical use; it lets you open an archive but also
associate it with a string and symbol table different from that in the
ELF file it was linked to. It might in theory be useful if standalone
archives not in ELF sections but associated with ELF string and symbol
tables existed, but they don't, and why would they? Standalone archives
are by definition not associated with an ELF file, so are hardly going
to reference some ELF file's strtab and symtab.
Delete the obsolete alias ctf_arc_open_by_name_sections too.
This lets us fold ctf_dict_open_sections into its only remaining caller.
Nick Alcock [Thu, 6 Nov 2025 14:21:12 +0000 (14:21 +0000)]
libctf: API review: delete ctf_ref
This function was meant to be used to let you bump the refcount on dicts
so that you could use them if you were passed an opened dict by someone
else who went on to close it. The only instance of this in the libctf
API was ctf_archive_iter(), which has been deleted: and in any case,
ctf_archive_next() and therefore also ctf_archive_iter() these days
cache opened dicts, so it was entirely safe to just hang on to them
and keep using them after the ctf_archive_iter() iteration function
returned.
Nick Alcock [Thu, 6 Nov 2025 14:16:19 +0000 (14:16 +0000)]
libctf, include: API review: restrict ctf_simple_open to the testsuite
This convenience function seems to have no practical use outside of
testcases and the guts of the ctf_open() code: for now, move it into a
non-installed header. Real-world users either seem to call ctf_bufopen
or CTF archive opening functions, or have an ELF executable and call one
of the ctf_open() ELF-or-BFD functions instead.
Nick Alcock [Thu, 6 Nov 2025 14:04:15 +0000 (14:04 +0000)]
libctf, include: API review: ctf_archive_raw_iter -> ctf_archive_raw_next
This disposes of the last _iter function, replacing it with a _next
implementation instead. (It's very obscure, used only by archiving
tools for CTF archives, none of which are upstreamed anywhere: hence
the lack of a _next before now.)
Nick Alcock [Thu, 6 Nov 2025 13:43:50 +0000 (13:43 +0000)]
libctf, include, gdb: API review: delete ctf_*_iter
These functions are all redundant to the corresponding *_next function,
are usually less pleasant to use, and are convertible into trivial while
loops with very little effort. It's time to drop them.
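For anyone converting callers, the shape of the change is as follows.
This is a generic sketch with invented names, not any specific libctf
function pair: the callback version needs a separate function and a
void * to smuggle state through, while the iterator version is a plain
while loop with its state in local variables.

```c
#include <stddef.h>

typedef int toy_visit_f (int value, void *arg);

/* Callback style: call FN on each element, stopping early if it
   returns nonzero.  */
static int
toy_iter (const int *vals, size_t n, toy_visit_f *fn, void *arg)
{
  size_t i;
  int rc;

  for (i = 0; i < n; i++)
    if ((rc = fn (vals[i], arg)) != 0)
      return rc;
  return 0;
}

/* Iterator style: store the next element in *VALUE and return 1, or
   return 0 at end of iteration.  */
static int
toy_next (const int *vals, size_t n, size_t *cursor, int *value)
{
  if (*cursor >= n)
    return 0;
  *value = vals[(*cursor)++];
  return 1;
}

/* Summing with the callback API needs a helper and a cast...  */
static int
toy_add_cb (int value, void *arg)
{
  *(int *) arg += value;
  return 0;
}

static int
toy_sum_iter (const int *vals, size_t n)
{
  int sum = 0;

  toy_iter (vals, n, toy_add_cb, &sum);
  return sum;
}

/* ... while with the iterator API it is a trivial while loop.  */
static int
toy_sum_next (const int *vals, size_t n)
{
  size_t cursor = 0;
  int value, sum = 0;

  while (toy_next (vals, n, &cursor, &value))
    sum += value;
  return sum;
}
```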
Nick Alcock [Wed, 5 Nov 2025 13:58:00 +0000 (13:58 +0000)]
libctf, ld, binutils, include: API review: ctf_error_t/ctf_ret_t/ctf_bool_t
This first round of the libctf API review changes a whole pile of
apparently-identical int return types and function parameters into
typedefs that more clearly denote their purpose: ctf_error_t is a
positive ECTF_* or errno error value, ctf_ret_t is 0 or -1 on error, and
ctf_bool_t is 0 or 1 (false or true), or -1 for error.
(Doing this teased multiple bugs out of libctf itself where we were
accidentally returning the wrong things in obscure error paths, so just
using ints for all three cases was clearly too hard even for the
libctf authors, let alone for users.)
This is ABI-compatible, but it still requires (trivial) source changes
in callers to avoid -Wincompatible-pointer-types warnings.
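
A minimal sketch of the three conventions follows. The toy_* typedefs and
functions are stand-ins invented for this note (the real typedefs are
libctf's own); only the value conventions are taken from the description
above.

```c
#include <errno.h>

/* Stand-ins mirroring the three conventions described above.  */
typedef int toy_error_t;	/* A positive error value.  */
typedef int toy_ret_t;		/* 0 on success, -1 on error.  */
typedef int toy_bool_t;		/* 0 or 1, or -1 on error.  */

static toy_error_t toy_last_error;

/* A toy_ret_t function: success or failure only.  */
static toy_ret_t
toy_set_limit (int *limit, int value)
{
  if (value < 0)
    {
      toy_last_error = EINVAL;
      return -1;
    }
  *limit = value;
  return 0;
}

/* A toy_bool_t function: three possible results, so callers must not
   test it with a bare "if (toy_is_even (x))", which would treat the
   -1 error return as true.  */
static toy_bool_t
toy_is_even (int value)
{
  if (value < 0)
    {
      toy_last_error = ERANGE;
      return -1;
    }
  return (value % 2) == 0;
}
```

Giving the three cases distinct names makes it visible at the prototype which
of them a caller has to handle.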
Nick Alcock [Wed, 5 Nov 2025 13:55:03 +0000 (13:55 +0000)]
gdb: ctfread: adjust for API changes in libctf v4
These are all pretty simple: the most invasive modification is that
because struct members that are bitfields now carry their bit-width
with them, GDB no longer needs to figure this out from the base type.
Nick Alcock [Tue, 8 Jul 2025 14:10:43 +0000 (15:10 +0100)]
libctf: more archive-related field renamings
This one is definitely beneficial: ctf_archive_modent's ctf_offset
is just a bad name now it can be used as the offset of CTF, BTF,
or property values. Just call it 'contents' instead (and drop
the 'offset' from the name_offset field as well.)
Nick Alcock [Wed, 28 May 2025 14:53:52 +0000 (15:53 +0100)]
libctf: archive: endianness-flipping and range-checking
This does endianness-flipping just like CTF dicts, flipping aggressively on
open, taking advantage of the archive's mmapped nature to flip all the size
words before each archive member as well.
The range checking verifies non-overlappingness of archive sections and
non-overrunning: it does not verify that archive members don't overlap,
because any such overlap would almost certainly fail at open time anyway
(due to the prefixed size word if nothing else).
This dug up a bug in v1 archives, where the size word included the length
of *the size word itself*: we correspondingly reduce that size if v1
archives are encountered (and fail if the result underflows).
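
The v1 correction amounts to something like the following sketch. The
function name is hypothetical and the real code is structured differently;
this only shows the subtraction and the underflow check.

```c
#include <stdint.h>

/* Hypothetical helper: given the size word stored before a v1 archive
   member (which wrongly counts the size word itself), compute the real
   member length.  Returns 0 on success, -1 if the result would
   underflow.  */
static int
toy_v1_member_size (uint64_t stored, uint64_t *actual)
{
  if (stored < sizeof (uint64_t))
    return -1;			/* Corrupt: smaller than its own size word.  */
  *actual = stored - sizeof (uint64_t);
  return 0;
}
```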
Nick Alcock [Wed, 28 May 2025 12:42:11 +0000 (13:42 +0100)]
libctf: archive: format v2
This commit does a bunch of things, all tangled together tightly enough that
disentangling them seemed not to be worth doing.
The biggest is a new archive format, v2, identified by a magic number which
is one higher than the v1 format's magic number. As usual with libctf we
can only write out the new format, but can still read the old one.
The new format has multiple improvements over the old:
- It is written native-endian and aggressively endian-swapped at open time,
just like CTF and BTF dicts; format v1 was little-endian, necessitating
byteswapping all over the place at read and write time rather than
localized in one pair of functions at read time.
- The modent array of name-offset -> archive-offset mappings for the CTF
archives is explicitly pointed at via a new ctfa_modents header member
rather than just starting after the end of the header.
- The length that prepends each archive member actually indicates its
length rather than always being sizeof (uint64_t) bytes too high (this
was an outright bug).
- There is a new shared properties table which in future we may be able to
use to unify common values from the constituent CTF headers, reducing the
size overhead of these (repeated, uncompressed) entities. Right now it
only contains one value, parent_name, which is the parent dict name if
one is common across all dicts in the archive (always true for any
archives derived from ctf_link()). This is used to let
ctf_archive_next() et al reliably open dicts in the archive even if they
are child BTF dicts (which do not contain a header name).
The properties table shares its property names with the CTF members,
and uses the same format (and shared code) for the property values as for
CTF archive members: length-prepended. The archive members and
name->value table ("modents") use distinct tables for properties and CTF
dicts, to ensure they are spatially separated in the file, to maximize
compressibility if we end up with a lot of properties and people compress
the whole thing.
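
To illustrate the "written native-endian, swapped at open time" approach, here
is a minimal sketch. The header layout, field names and magic value are all
invented for this example and do not match the real archive header: the point
is that a foreign-endian file is recognized by its byte-swapped magic number
and flipped once, so later code can read the fields directly.

```c
#include <stdint.h>

/* Hypothetical magic number and header layout, for illustration only.  */
#define TOY_MAGIC UINT64_C (0x0123456789abcdef)

struct toy_hdr
{
  uint64_t magic;
  uint64_t nmembers;
  uint64_t names_off;
};

/* Portable 64-bit byte swap.  */
static uint64_t
toy_bswap64 (uint64_t v)
{
  uint64_t r = 0;

  for (int i = 0; i < 8; i++)
    {
      r = (r << 8) | (v & 0xff);
      v >>= 8;
    }
  return r;
}

/* Flip the header in place if it was written on a machine of the other
   endianness.  Returns 0 if it was already native, 1 if it was flipped,
   -1 if the magic number is not recognized either way round.  */
static int
toy_hdr_open (struct toy_hdr *h)
{
  if (h->magic == TOY_MAGIC)
    return 0;
  if (toy_bswap64 (h->magic) != TOY_MAGIC)
    return -1;

  h->magic = toy_bswap64 (h->magic);
  h->nmembers = toy_bswap64 (h->nmembers);
  h->names_off = toy_bswap64 (h->names_off);
  return 1;
}
```

Doing all the flipping in one place at open time is what lets the rest of the
reader ignore endianness entirely.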
We can also restrict various old bug-workaround kludges that only apply to
dicts found in v1 archives: in particular, we needed to dig out the preamble
of some CTF dicts without opening them to figure out whether they used the
.dynstr or .strtab sections: this whole bug workaround is now unnecessary
for v2 and above.
There are other changes for readability and consistency:
- The archive wrapper data structure, known outside ctf-archive.c as
ctf_archive_t, is now consistently referred to inside ctf-archive.c as
'struct ctf_archive_internal' and given the parameter name 'arci' rather
than sometimes using ctf_archive_t and sometimes using 'wrapper' or 'arc'
as parameter names. The archive itself is always called 'struct
ctf_archive' to emphasise that it is *not* a ctf_archive_t.
ctf_archive_t remains the public typedef: the fact that it's not actually
the same thing as the archive file format is an internal implementation
detail.
- We keep the archive header around in a new ctfi_hdr member, distinct
from the actual archive itself, to make upgrading from v1 and cross-
endianness support easier. The archive itself is now kept as a char *
and used only to root pointer arithmetic.
Nick Alcock [Fri, 25 Apr 2025 20:54:28 +0000 (21:54 +0100)]
libctf: API change documentation (NOT FOR UPSTREAMING)
These probably need to be turned into libctf/NEWS content once we decide (if
we decide) that these changes are good. (I do hope we don't make too many
changes because it'll be horribly disruptive, but I wouldn't be surprised to
see a few...)
Nick Alcock [Fri, 25 Apr 2025 20:53:18 +0000 (21:53 +0100)]
ld: testsuite: a tiny start on ld.ctf test adjustments
This is just allowing for changes in objdump itself -- the actual
test results cannot be adjusted until either CTFv4 is emitted or
back-compatibility upgrading is implemented (or, preferably, both,
plus testing of a subset of them with -gbtf as well).
Nick Alcock [Fri, 25 Apr 2025 20:49:22 +0000 (21:49 +0100)]
libctf: run_lookup_test: force BTF emission (NOT FOR UPSTREAMING)
Pro tem as a hack until GCC supports -gctf for v4, or v3 upgrading
is supported, or direct CTF-then-BTF tests are written, just emit
BTF for test purposes.
Nick Alcock [Fri, 25 Apr 2025 20:45:45 +0000 (21:45 +0100)]
binutils: objdump, readelf: BTF dumping support
objdump and readelf's --ctf option can now dump BTF as well (in CTF dumping
format, which is quite high-level and C-like compared to bpftool btf dump:
both have their uses).
Nick Alcock [Fri, 25 Apr 2025 20:33:53 +0000 (21:33 +0100)]
ld: BTF deduplication
Figuring out what to do when a mix of BTF and CTF sections is supplied is a
little magical. We hunt down all the .BTF and .ctf sections in the link,
but just because all the sections we found were .BTF doesn't mean that the
output isn't going to be CTF: if deduplication results in any conflicting
types, we'll need a CTF section to encode them (BTF cannot yet represent
such things).
So if we find that we've got nothing but .BTF sections, we do as we do for
.ctf and mark all but one of them as excluded from the link (with the intent
of creating the deduplicated output in the remaining one): but we also
create a provisional linker-created .ctf section, just in case we need it
later (we can't tell at this stage, before deduplication).
After deduplication, one of these sections is unneeded. How we remove it
depends on when lang_write_ctf is called: depending on the emulation, it may be called
either early (long before bfd_elf_final_link) or late (after final link and
indeed symtab and strtab writeout). (ELF calls it late). If called early,
we can figure out what format was emitted by ctf_link and freely remove the
unwanted section by flipping on its SEC_EXCLUDE flag.
But that leaves late calls. We add a new ctf_remove_section hook to the
bfd_link_callbacks, which is invoked at the very start of
bfd_elf_final_link, when removal of sections is still permitted. We invoke
section removal for the .ctf-or-.BTF section via this hook if lang_write_ctf
is called late: this then does the extra trickery with section count
adjustment etc needed to remove sections so late, and communicates the fact
that it's removed sections to bfd_elf_final_link, which can then decide to
call _bfd_fix_excluded_sec_syms (which is quite expensive, so we do it only
once for all sections removed at this stage, by whatever means).
Nick Alcock [Fri, 25 Apr 2025 20:28:25 +0000 (21:28 +0100)]
bfd, ld: allow the disabling of CTF deduplication; BTF linking
This first half of ld support for BTF deduplication adds a facility to GNU
ld and BFD to entirely disable CTF or BTF deduplication via the new
--disable-ctf-dedup linker option, and (slightly entangled with it) modifies
the existing BFD CTF support so that it can also deduplicate BTF.
Determining whether deduplication is disabled when all you have is a section
requires a bit of digging around and the proxying of _bfd_get_link_info
into bfd-elf.h via a new _bfd_elf_get_link_info, to dodge include ordering
problems.
(Note that BTF deduplication support is not yet complete: in particular,
relocs into BTF sections don't get handled at all yet.)
Nick Alcock [Fri, 25 Apr 2025 20:18:29 +0000 (21:18 +0100)]
libctf: dump: dump the header; dump enum64s; adapt to API changes
A bunch of dumper changes. Most importantly, adapt to the changes in the _f
iteration function prototypes by no longer carrying around our own cds_fp
dict pointer everywhere but just using the one we are given by the iteration
function.
But also, dump the v3 and v4/BTF headers separately, using the stored
original v3-pre-upgrade header copy if present. The v3 dumper is not tested
yet, of course, but is more or less unchanged from the old code, so probably
nearly works. The v4 dumper is tested.
Add enum64 support (basically just a bit of extra code to print the
signedness of enums).
Nick Alcock [Fri, 25 Apr 2025 20:09:34 +0000 (21:09 +0100)]
libctf: archive: allow opening BTF dicts in archives (not for upstreaming)
BTF dicts are normally suppressed in archives, but it is possible
to create them with enough cunning. If such an archive is
encountered, the BTF dicts in it have no parent name, which
means that ctf_arc_import_parent (used by ctf_dict_open_cached,
ctf_archive_next, and all the ctf_arc_lookup functions) fails
to figure out what parent to import, and fails.
Kludge around it by relying on our secret knowledge that ctf_link_write
always emits the parent dict into the archive first. If no name is set,
import the parent dict for now. (Before upstreaming, a new archive format
with a dedicated parent dict field will turn up, obviating this kludge.)
Nick Alcock [Fri, 25 Apr 2025 20:06:46 +0000 (21:06 +0100)]
libctf: link: improve BTF child dict naming
BTF dicts don't have a cuname, which means that when the deduplicator runs
over them any child dicts that result from conflicted types found in those
CUs end up with no name either. Detect such unnamed dicts and propagate
in the name the linker gave them at input time instead. (There is always
*some* such name, even if it's something totally useless like "#1"; usually
it's much more useful.)
Nick Alcock [Fri, 25 Apr 2025 19:49:26 +0000 (20:49 +0100)]
libctf: dedup: conflicting CU names and merging into the parent
The last two dedup changes are, firstly, to use ctf_add_conflicting() to
arrange that conflicting types that are hidden because they are added to the
same dict as the types they conflict with (e.g. conflicting types in
modules) are properly marked with the CU name that the type comes from.
This could of course not be done with the old non-root flag, but now that we
have proper prefix types, we can record it, and consumers can find out what
CU any type comes from via ctf_type_conflicting (or, for non-kernel CTF
generated by GNU ld, via the ctf_cuname of the per-cu dict).
Secondly, we add a new kind of CU mapping for cu-mapped (two-stage) links
(as a reminder, these carry out a second stage of dedupping in which they
squash specific CUs down to a named set of child dicts, fusing named inputs
into particular named outputs: the kernel linker uses this to make child
dicts that represent modules rather than translation units). You can now map
any CU name to "" (the null string). This indicates that types that would
land in the CU in question should not be emitted into any sort of per-module
dict but should instead just be emitted into the shared dict, possibly being
marked conflicting as they do so. The usual popcount mechanism will be used
to pick the type which is left unhidden. The usual forwarding stubs you
would expect to find for conflicting structs and unions will not be emitted:
instead, real structs and unions will take their place. Consumers must take
care when chasing parent types that point to tagged structs to make sure
that there isn't a correspondingly-named struct in the child they're looking
at (but this is generally a problem with type chasing in children anyway,
which I have a TODO open to find some sort of solution to: this should be
being done automatically, and isn't).
Nick Alcock [Fri, 25 Apr 2025 19:41:14 +0000 (20:41 +0100)]
libctf: dedup: decl tag support.
Decl tags to types and to functions and function arguments are relatively
straightforward, as are decl tags to structures as a whole or to members of
untagged structures; but decl tags to specific members of tagged structs and
unions have two separate nasty problems, entirely down to the use of tagged
structures to break cycles in the type graph.
The first is that we have to mark decl tags conflicting if their associated
struct is conflicting, but traversal from types to their parents halts at
tagged structs and unions, because the type graph is sharded via stubs at
those points and conflictedness ceases. But we don't want to do that here:
a decl_tag to member 10 of some struct is only valid if that struct *has*
ten members, and if the struct is conflicted, some may have only one. The
decl tag is only valid for the specific struct-with-ten-members it was
originally pointing at, anyway: other structs-with-ten-members may have
entirely different members there, which are not tagged or which are tagged
with something else.
So we track this by keeping track of the only thing that is knowable about
struct/union stubs: their decorated name. The citers graph gains mappings
from decorated SoU names to decl tags (where the decl tag has a
component_idx), and conflictedness marking chases that and marks
accordingly, via the new ctf_dedup_mark_conflicting_hash_citers.
The second problem is that we have to emit decl tags to struct members of
all kinds after the members are emitted, but the members are emitted later
than core type deduplication because they might refer to any types in the
dict, including types added after the struct was added. So we need to
accumulate decl tags to struct members in a new hashtab
(cd_emission_struct_decl_tags) and add yet *another* pass that traverses
that and emits all the decl tags in it. (If it turns out that decl tags to
other things can similarly appear before the type they refer to, we'll
either have to sort them earlier or emit them at the end as well -- but this
seems unlikely.)
None of this complexity is properly tested, because we're not yet emitting
decl tags (as far as I know). But at least it doesn't break anything else,
and it's somewhere to start.
Nick Alcock [Fri, 25 Apr 2025 19:32:58 +0000 (20:32 +0100)]
libctf: dedup: type tags
Another trivial case: they're just like pointers except that they have a
name (and we don't need to care about that, because names are hashed in, if
present, anyway).
Nick Alcock [Fri, 25 Apr 2025 18:22:42 +0000 (19:22 +0100)]
libctf: dedup: datasecs and vars
These are a bit trickier than previous things. Datasecs are unusual: the
content they contain for a given variable is conceptually part of that
variable, in that a variable can only appear in one datasec: so if two TUs
have different datasec values for a variable, you'll want to emit two
conflicting variables with different datasec entries. Equally, if they
have entries in different datasecs, they're conflicting. But the *index*
of a variable in a datasec has nothing to do with the variable: it's just
a property of how many other variables are in the datasec.
So we turn the type graph upside down for them. We track the variable ->
datasec mappings for every variable we are dedupping, and use this to hash
variables with datasec entries *twice*: firstly, as purely variable type,
name, and promoted-to-non-extern linkage, and secondly with all of that plus
the datasec name, offset and size: we indicate that the non-extern hash
*replaces* the extern one, and use this later on. The datasec itself is not
hashed at all! We skip it at both hashing and emission time (without
breaking anything else, because nothing points at datasecs, so nothing will
ever recurse down into one).
The popcount code (used to find the "most popular" type, the one to put in
the shared dict) changes to say that the popcounts of replaced types (extern
vars) are added to the counts of the types that replace them (the corresponding
non-extern vars).
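
The popcount adjustment can be pictured with a toy model like the one below.
The structure and function names are hypothetical, and the real deduplicator
works on type hashes in hash tables rather than array indexes; skipping
replaced entries when picking a winner is an assumption of this sketch.

```c
#include <stddef.h>

/* Toy model: each candidate has a popularity count and, if it is an
   extern variable replaced by a non-extern one, the index of its
   replacement (otherwise -1).  */
struct toy_type
{
  long popcount;
  int replaced_by;
};

/* Credit each replaced type's popularity to the type that replaces it.  */
static void
toy_fold_replaced (struct toy_type *t, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (t[i].replaced_by >= 0)
      t[t[i].replaced_by].popcount += t[i].popcount;
}

/* Pick the most popular type that has not itself been replaced.
   Returns its index, or -1 if there is none.  */
static int
toy_most_popular (const struct toy_type *t, size_t n)
{
  int best = -1;

  for (size_t i = 0; i < n; i++)
    {
      if (t[i].replaced_by >= 0)
	continue;
      if (best < 0 || t[i].popcount > t[best].popcount)
	best = (int) i;
    }
  return best;
}
```

Without the fold, a non-extern definition cited by few TUs could lose to a
conflicting definition even though many more TUs refer to it via its extern
declaration.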
At emission time, replaced variables (extern variables) are skipped,
ensuring that extern vars with non-conflicting non-extern counterparts are
skipped in favour of the non-extern ones. ctf_add_section_variable then
takes care of emitting both the var and its corresponding datasec for us.
Nick Alcock [Fri, 25 Apr 2025 18:14:02 +0000 (19:14 +0100)]
libctf: dedup: structs with bitfields, BTF floats
The last two trivial cases. Hash in the bitfieldness of structs and the
bit-width of members (their bit-offset is already being hashed in), and emit
them accordingly.
BTF floats hardly have any state: emitting them is even easier.
These are all fairly simple and are handled together because some of the
diffs are annoyingly entwined.
enum and enum64 are trivial: it's just like enums used to be, except that we
hash in the unsignedness value, and emit signed or unsigned enums or enum64s
appropriately. (The signedness stuff on the emission side is fairly
invisible: it's automatically handled for us by ctf_type_encoding and
ctf_add_enum*_encoded, via the CTF_INT_SIGNED encoding.)
Functions are also fairly simple: we hash in all the parameter names as well
as the args, and emit them accordingly.
Linkage is more difficult. We want to deduplicate extern and non-extern
declarations together, while leaving static ones separate. We do this by
promoting extern linkage to global at hashing time, and maintaining a
cd_linkages hashmap which maps from type hash values of func linkages (and
vars) to the best linkage known so far, then updating it if a better one
("less extern") comes along (relying on the fact that we are already
unifying the hashes of otherwise-identical extern and non-extern types). At
emission time, we use this hashtab to figure out what linkage to emit.
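
A minimal sketch of the linkage tracking described above, assuming a single
slot per type hash in place of the real cd_linkages hashmap. The enum and
function names are invented for this example.

```c
/* Hypothetical linkage values for this sketch.  */
enum toy_linkage { TOY_STATIC, TOY_GLOBAL, TOY_EXTERN };

/* Stand-in for one entry in a hash -> best-linkage map.  */
struct toy_best
{
  int known;
  enum toy_linkage linkage;
};

/* At hashing time, extern is promoted to global so that extern and
   non-extern declarations hash alike; static stays distinct.  */
static enum toy_linkage
toy_linkage_for_hash (enum toy_linkage l)
{
  return l == TOY_EXTERN ? TOY_GLOBAL : l;
}

/* Record the linkage actually seen, keeping the "least extern" one.  */
static void
toy_note_linkage (struct toy_best *slot, enum toy_linkage seen)
{
  if (!slot->known
      || (slot->linkage == TOY_EXTERN && seen != TOY_EXTERN))
    {
      slot->known = 1;
      slot->linkage = seen;
    }
}
```

At emission time the slot's final value is what gets written out, so a
definition seen in any one TU wins over extern declarations in all the others.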
Nick Alcock [Fri, 25 Apr 2025 17:42:55 +0000 (18:42 +0100)]
libctf: dedup: fix a broken error path in string dedup
If we run out of memory updating the string counts, set the right errno:
ctf_dynhash_insert returns a *negative* error value, and we want a positive
one in the ctf_errno.
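
The shape of the fix is roughly as follows. toy_insert and toy_errno are
stand-ins for ctf_dynhash_insert and the dict's error state; the only point
being illustrated is the sign flip.

```c
#include <errno.h>

static int toy_errno;		/* Stand-in for the dict's ctf_errno.  */

/* Stand-in for an insert that returns a *negative* errno on failure.  */
static int
toy_insert (int fail)
{
  return fail ? -ENOMEM : 0;
}

/* Convert the negative return into the positive value the error state
   expects, and report failure to the caller as -1.  */
static int
toy_count_string (int fail)
{
  int err = toy_insert (fail);

  if (err < 0)
    {
      toy_errno = -err;
      return -1;
    }
  return 0;
}
```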
Nick Alcock [Fri, 25 Apr 2025 17:40:39 +0000 (18:40 +0100)]
libctf: dedup: chase API changes: use the public API more
To get ready for the deduplicator changes, we chase the API changes to
things like ctf_member_next, and add support for prefix types (using the
suffix where appropriate, etc). We use the ctf-types API for things like
forward lookup, using the private _tp functions to reduce overhead while
centralizing knowledge of things like the encoding of enum forwards outside
the deduplicator.
Nick Alcock [Fri, 25 Apr 2025 17:26:45 +0000 (18:26 +0100)]
libctf: open-bfd: open BTF dicts
Teaching ctf_open and ctf_fdopen to open BTF dicts if passed is quite
simple: we just need to check the magic number and allow BTF dicts
into the lower-level ctf_simple_open machinery (which ultimately
calls ctf_bufopen).
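
For illustration, a magic-number check of this kind might look like the
sketch below. The function is hypothetical and is not how libctf structures
its check; the only external fact used is that BTF's magic number is 0xeB9F,
which may appear byte-swapped in a file written on a machine of the other
endianness.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_BTF_MAGIC 0xeb9f	/* Same value as BTF's magic number.  */

/* Return 1 if BUF starts with the BTF magic number in either byte
   order, 0 otherwise.  */
static int
toy_looks_like_btf (const void *buf, size_t len)
{
  uint16_t magic;
  uint16_t swapped = (uint16_t) ((TOY_BTF_MAGIC >> 8) | (TOY_BTF_MAGIC << 8));

  if (len < sizeof (magic))
    return 0;

  /* memcpy avoids unaligned access on the raw buffer.  */
  memcpy (&magic, buf, sizeof (magic));
  return magic == TOY_BTF_MAGIC || magic == swapped;
}
```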
Nick Alcock [Fri, 25 Apr 2025 17:29:06 +0000 (18:29 +0100)]
libctf: link: drop unnecessary back-compatibility code
We no longer need to ensure that inputs have a new-format func info
section: no such sections exist in CTFv4 (and the v3 compatibility
code will throw away old-format sections).
The idea here is that callers can call ctf_link_output_is_btf on a
ctf_link()ed (deduplicated) dict to tell whether a link will yield
BTF-compatible output before actually generating that output, so
they can e.g. decide whether to avoid trying to compress the dict
if they know it would be BTF otherwise (since compressing a dict
renders it non-BTF-compatible).
ctf_link_write() gains an optional is_btf output parameter that
reports whether the dict that was finally generated is actually BTF
after all, perhaps because the caller didn't call
ctf_link_output_is_btf or wants to be robust against possible future
changes that may add other reasons why a written-out dict can't be BTF
at the last minute.
These are simple wrappers around already-existing machinery earlier in
this series.
Nick Alcock [Fri, 25 Apr 2025 17:17:33 +0000 (18:17 +0100)]
libctf: strings: don't check for non-deduplicable atoms in the parent
Callers of ctf_str_add_no_dedup_ref are indicating that they would like the
string they have added a reference to to appear in the current dict and not
be deduplicated into the parent. This is true even if the string already
exists in the parent, so we should not check for strings in the parent and
reuse them in this case.
Nick Alcock [Fri, 25 Apr 2025 17:12:47 +0000 (18:12 +0100)]
libctf: serialize: finish off the serializer
The only remaining part of serialization that needs fixing up is
ctf_preserialize, which despite its name does nearly all the work of
serialization: the only bit it doesn't do is write the string tables
(since that has to happen across dicts after all the dicts have otherwise
been laid out, in order to deduplicate the strtabs).
As usual in this series, there's adjustment for various field name changes
(maxtypes -> ntypes, the move into ctf_serialize, etc), and extra work to
figure out whether we're emitting BTF or not and to handle the distinction
between CTF and BTF headers, and not try to emit CTF-only stuff like the
symtypetabs into BTF dicts; we can also throw out a bunch of old code that
sets compatibility flags, everything to do with forcing variables into the
dynamic state in case they changed (we're going to handle that more
generally for everything in the types table at a later date, outside
serialization), and everything to do with special handling of variables in
general.
But much of that is only a couple of lines each, and most of the changes are
mechanical: this is probably the simplest serialization commit in this
series.
Nick Alcock [Fri, 25 Apr 2025 17:09:02 +0000 (18:09 +0100)]
libctf: open: fix closing of children with imported parents
Closing a parent dict for the last time erases all its types and strings,
which makes type and string lookups in any surviving children impossible
from then on. Since children hold a reference to their parent, this can
only happen in ctf_dict_close of the last child, after the parent has
been closed by the caller as well. Since DTD deletion now involves
doing type and string lookups in order to clean out the name tables,
close the parent only after the child DTDs have been deleted.
Nick Alcock [Fri, 25 Apr 2025 16:59:31 +0000 (17:59 +0100)]
libctf: open, types: ctf_import for BTF
ctf_import needs a bunch of fixes to work with pure BTF dicts -- and, for
that matter, importing newly-created parent dicts that have never been
written out, which may have a bunch of nonprovisional types (if types were
added to it before any imports were done) or may not (if at least one
ctf_import into it was done before any types were added).
So we adjust things so that the values that are checked against are the
nonprovisional-types values: the header revisions actually changed the name
of cth_parent_typemax to cth_parent_ntypes to make this clearer, so catch up
with that. In the parent, we have to use ctf_idmax, not ctf_typemax.
One thing we must prohibit is that you cannot add a bunch of types to a
child and then import a parent into it: the type IDs will all be wrong
and the string offsets more so. This was partly prohibited: prohibit it
entirely (excepting only that the not-actually-written-out void type
we might add to new BTF dicts does not influence this check).
Since BTF children don't have a cth_parent_ntypes or a cth_parent_strlen, we
cannot check this stuff, but just set them and hope.
Nick Alcock [Fri, 25 Apr 2025 16:54:48 +0000 (17:54 +0100)]
libctf: serialize: handle CTF-versus-BTF output format checks
The internal function ctf_serialize_output_format centralizes all the checks
for BTF-versus-CTF, checking to see if the type section, active
suppressions, and BTF-emission mode permit BTF emission, setting
ctf_serialize.cs_is_btf if we are actually BTF, and raising ECTF_NOTBTF if
we are requiring BTF emission but the type section is such that we can't
emit it.
(There is a forcing parameter in place, as with most of these serialization
functions, to allow for the caller to force CTF emission if it knows the
output will be compressed or will be part of multi-member archives or
something else external to the type section that BTF does not support.)
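
The decision described above can be sketched as a small pure function. The
mode enum, the parameter names and the TOY_ENOTBTF value are assumptions made
for this note (the real function works on the dict and its ctf_serialize
state), and treating a forced-CTF request under a BTF-required mode as an
error is a guess rather than documented behaviour.

```c
/* Hypothetical emission modes and error value for this sketch.  */
enum toy_btf_mode
{
  TOY_MODE_CTF,			/* Always emit CTF.  */
  TOY_MODE_BTF_IF_POSSIBLE,	/* Emit BTF when nothing prevents it.  */
  TOY_MODE_BTF_REQUIRED		/* Fail rather than emit CTF.  */
};

#define TOY_ENOTBTF 1		/* Stand-in for ECTF_NOTBTF.  */

/* Decide the output format.  Returns 0 and sets *IS_BTF on success;
   returns -1 and sets *ERR if BTF is required but cannot be emitted.  */
static int
toy_output_format (int types_fit_btf, int suppressions_allow_btf,
		   enum toy_btf_mode mode, int force_ctf,
		   int *is_btf, int *err)
{
  int can_btf = types_fit_btf && suppressions_allow_btf && !force_ctf;

  if (mode == TOY_MODE_BTF_REQUIRED && !can_btf)
    {
      *is_btf = 0;
      *err = TOY_ENOTBTF;
      return -1;
    }

  *is_btf = can_btf && mode != TOY_MODE_CTF;
  return 0;
}
```

Keeping every BTF-versus-CTF check in one function like this means the later
serialization stages only need to consult the resulting flag.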