gdb/gdbserver: pass inferior arguments as a single string
GDB holds the inferior arguments as a single string. Currently when
GDB needs to pass the inferior arguments to a remote target as part of
a vRun packet, this is done by splitting the single argument string
into its component arguments by calling gdb::remote_args::split, which
uses the gdb_argv class to split the arguments for us.
The same gdb_argv class is used when the user has asked GDB/gdbserver
to start the inferior without first invoking a shell; the gdb_argv
class is used to split the argument string into it component
arguments, and each is passed as a separate argument to the execve
call which spawns the inferior.
There is however, a problem with using gdb_argv to split the arguments
before passing them to a remote target. To understand this problem we
must first understand how gdb_argv is used when invoking an inferior
without a shell.
And to understand how gdb_argv is used to start an inferior without a
shell, I feel we need to first look at an example of starting an
inferior with a shell.
Consider these two cases:
(a) (gdb) set args \$VAR
(b) (gdb) set args $VAR
When starting with a shell, in case (a) the user expects the inferior
to receive a literal '$VAR' string as an argument, while in case (b)
the user expects to see the shell expanded value of the variable $VAR.
If the user does 'set startup-with-shell off', then in (a) GDB will
strip the '\' while splitting the arguments, and the inferior will be
passed a literal '$VAR'. In (b) there is no '\' to strip, so also in
this case the inferior will receive a literal '$VAR', remember
startup-with-shell is off, so there is no shell that can ever expand
$VAR.
Notice, that when startup-with-shell is off, we end up with a many to
one mapping, both (a) and (b) result in the literal string $VAR being
passed to the inferior. I think this is the correct behaviour in this
case.
However, as we use gdb_argv to split the remote arguments we have the
same many to one mapping within the vRun packet. But the vRun packet
will be used when startup-with-shell is both on and off. What this
means is that when gdbserver receives a vRun packet containing '$VAR'
it doesn't know if GDB actually had '$VAR', or if GDB had '\$VAR'.
And this is a huge problem.
We can address this by making the argument splitting for remote
targets smarter, and I do have patches that try to do this in this
series:
https://inbox.sourceware.org/gdb-patches/cover.
1730731085.git.aburgess@redhat.com
That series was pretty long, and wasn't getting reviewed, so I'm
pulling the individual patches out and posting them separately.
This patch doesn't try to improve remote argument splitting. I think
that splitting and then joining the arguments is a mistake which can
only introduce problems. The patch in the above series which tries to
make the splitting and joining "smarter" handles unquoted, single
quoted, and double quoted strings. But that doesn't really address
parameter substitution, command substitution, or arithmetic expansion.
And even if we did try to address these cases, what rules exactly
would we implement? Probably POSIX shell rules, but what if the
remote target doesn't have a POSIX shell? The only reason we're
talking about which shell rules to follow is because the splitting and
joining logic needs to mirror those rules. If we stop splitting and
joining then we no longer need to care about the target's shell.
Clearly, for backward compatibility we need to maintain some degree of
argument splitting and joining as we currently have; and that's why I
have a later patch (see the series above) that tries to improve that
splitting and joining a little. But I think, what we should really
do, is add a new feature flag (as used by the qSupported packet) and,
if GDB and the remote target agree, we should pass the inferior
arguments as a single string.
This solves all our problems. In the startup with shell case, we no
longer need to worry about splitting at all. The arguments are passed
unmodified to the remote target, that can then pass the arguments to
the shell directly.
In the 'startup-with-shell off' case it is now up to the remote target
to split the arguments, though in gdbserver we already did this, so
nothing really changes in this case. And if the remote target doesn't
have a POSIX shell, well GDB just doesn't need to worry about it!
Something similar to this was originally suggested in this series:
https://inbox.sourceware.org/gdb-patches/
20211022071933.
3478427-1-m.weghorn@posteo.de/
though this series didn't try to maintain backward compatibility,
which I think is an issue that my patch solves. Additionally, this
series only passed the arguments as a single string in some cases,
I've simplified this so that, when GDB and the remote agree, the
arguments are always passed as a single string. I think this is a
little cleaner.
I've also added documentation and some tests with this commit,
including ensuring that we test both the new single string approach,
and the fallback split/join approach.
I've credited the author of the referenced series as co-author as they
did come to a similar conclusion, though I think my implementation is
different enough that I'm happy to list myself as primary author.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28392
Co-Authored-By: Michael Weghorn <m.weghorn@posteo.de>
Reviewed-By: Eli Zaretskii <eliz@gnu.org>
Tested-By: Guinevere Larsen <guinevere@redhat.com>
Approved-by: Kevin Buettner <kevinb@redhat.com>