Rework user/mount namespace handling and tools tree
The biggest change is that bwrap() is no longer responsible for
mounting the tools tree; we now mount it ourselves before we build/boot
each image. The same goes for remounting the top-level directories
read-only: instead of leaving it to bwrap(), we do it once at the
start of run_verb(). Because we now modify the host system mounts
ourselves again, we also go back to unconditionally unsharing a mount
namespace, even when running as root.
With the above out of the way, there's no real reason left to run
regular executables with bwrap(), so those are moved back to be
executed using run(). The above changes also remove the need for
bwrap_cmd(), so it is merged back with bwrap() again.
One nasty caveat of overmounting /usr ourselves at the start of
execution is that some Python modules are loaded dynamically, so we
need to make sure they have been imported before we start overmounting
/usr.
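
As an illustration of the preloading caveat, here is a minimal sketch of
importing modules eagerly before /usr is replaced. The helper name
preload_modules() and the module names in the usage note are hypothetical
and not taken from this commit; which modules actually need preloading
depends on the program.

```python
import importlib
import sys


def preload_modules(names):
    """Import each named module now and return the names that ended up loaded.

    Once a module is in sys.modules, a later `import` statement reuses it
    instead of reading from disk, so it no longer matters what /usr points
    to at that time.
    """
    loaded = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Hypothetical policy: skip modules that are not available.
            continue
        if name in sys.modules:
            loaded.append(name)
    return loaded
```

Calling something like preload_modules(["json", "encodings.idna"]) before
the first overmount would be one way to apply it.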
Finally, this commit also gets rid of running the image build in a
subprocess. Instead, after doing the build and the final tools tree
mount for the image we're going to boot/qemu/ssh into, we change
uid/gid to the invoking user if we're going to do an unprivileged
operation. This is more or less the same as running these operations
unprivileged outside of the user namespace.
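
The uid/gid switch described above could look roughly like the sketch
below. It is Linux-only and illustrative: drop_privileges() is a
hypothetical name, and resetting the supplementary groups to just the
target gid is an assumption, not necessarily what this commit does.

```python
import os


def drop_privileges(uid: int, gid: int) -> None:
    """Switch the real, effective and saved uid/gid to the given user."""
    if os.geteuid() == 0:
        try:
            # Assumption: reduce supplementary groups to the target gid.
            os.setgroups([gid])
        except PermissionError:
            # setgroups() can be denied inside a user namespace.
            pass
    # Change the gid first: after the uid is dropped we may no longer be
    # allowed to change it.
    os.setresgid(gid, gid, gid)
    os.setresuid(uid, uid, uid)
```

Because all three ids are set, the process cannot switch back to root
afterwards, which is why this has to happen after the last privileged
mount.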
boot/shell only run privileged, so we check beforehand that we're
running as root. This doesn't change after become_root(), so we're
root the whole time and there's no need to run the image build in a
subprocess.
To keep ssh working, we have to trick it into recognizing our user in
the user namespace by overmounting /etc/passwd with a file containing
an entry for the mapped uid.
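
For reference, a sketch of what such a passwd entry could look like. The
helper passwd_entry() and the values in the usage note are hypothetical;
only the seven-field passwd(5) line format is standard. Bind-mounting the
resulting file over /etc/passwd is not shown.

```python
def passwd_entry(name, uid, gid, home, shell="/bin/sh"):
    """Return one passwd(5) line: name:password:UID:GID:GECOS:home:shell."""
    # "x" means the password lives elsewhere; GECOS is left empty.
    return f"{name}:x:{uid}:{gid}::{home}:{shell}\n"
```

For example, passwd_entry("alice", 1000, 1000, "/home/alice") yields
"alice:x:1000:1000::/home/alice:/bin/sh" followed by a newline.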
We also unify more of the uid/gid handling in run_verb() in general.