journal-file: port journal_file_open() to openat_report_new()
We so far had some magic logic in place that files we open for write
with size zero are freshly created. That of course is a bogus
assumption, in particular as this code deals with corrupted file systems
which oftentimes contain zero size inodes from left-over runs.
Let's fix this properly, and actually let the kernel tell us whether it
create the file or not.
fs-util: add openat_report_new() wrapper around openat()
This is a wrapper around openat(). It works mostly the same, except for
one thing: it race-freely reports whether we just created the indicated
file in case O_CREAT is passed without O_EXCL.
basic/strv: avoid potential UB with references to array[-1]
"""
Given an array a[N] of N elements of type T:
- Forming a pointer &a[i] (or a + i) with 0 ≤ i ≤ N is safe.
- Forming a pointer &a[i] with i < 0 or i > N causes undefined behavior.
- Dereferencing a pointer &a[i] with 0 ≤ i < N is safe.
- Dereferencing a pointer &a[i] with i < 0 or i ≥ N causes undefined behavior.
"""
As pointed by by @medhefgo, here we were forming a pointer to a[-1]. a itself
wasn't NULL, so a > 0, and a-1 was also >= 0, and this didn't seem to cause any
problems. But it's better to be formally correct, especially if we move the
code to src/fundamental/ later on and compile it differently.
Compilation shows no size change (with -O0 -g) on build/systemd, so this should
have no effect whatsoever.
With my crappy home network the test takes 29.5s usually. But with any
tiny slowdown, it goes above the 30s limit and fails. Let's bump the
timeout to avoid spurious failures.
"meson test" uses a test name generated from the file name and those long names
cause the test log output to exceed terminal width which looks bad. Let's replace
some long names with more-meaningful names that actually say something about
the tests.
These unit (if enabled) will try to update the OS in regular intervals.
Moreover, every day in the early morning this will attempt to reboot the
system if there's a newer version installed than running.
timesyncd: improve log message whe getting a reply from server
The message is misleading: it's not about synchronization but about
successful communicaiton. And it's not about "initial", but only about
first contact since we siwtched to this server.
timesyncd: generate a structure log message the first time we set the clock correctly
Usecase: later on we can use this to retroactively adjust log output in
journalctl or similar on systems lacking an RTC: we just have to search
for this sructured log message that indicates the first sync point and
can then retroactively adjust the incorrect timestamps collected before
that.
timesyncd: move stuff that is not about setting the clock out of manager_adjust_clock()
Let's make sure manager_adjust_clock() is purely about setting the
clock, and nothing else.
Let's clean up logging this way. manager_adjust_clock() now won#t log
about errors, but the caller can safely do that, and do with the right
log message string.
shared: split out ESP/XBOOTLDR search stuff from bootspec.c
The code is quite different from the rest of bootspec.c, with different
deps and stuff. There's even a /***/ line to separate the two parts.
Given how large the file already is, let#s just split it into two.
docs: add new "sort-key" field to boot loader spec
This allows snippet generators to explicitly order entries: any string
can be set as an entry's "sort key". If set, sd-boot will use it to sort
entries on display.
New logic is hence (ignore the boot counting logic)
sort-key is set → primary sort key: sort-key (lexicographically increasing order)
→ secondary sort key: machine-id (also increasing order)
→ tertiary sort key: version (lexicographically decreasing order!)
sort-key is not set → primary sort key: entry filename (aka id), lexicographically increasing order)
With this scheme we can order OSes by their names from A-Z but then put
within the same OS still the newest version first. This should clean up
the order to match expectations more.
shared/install: do not print aliases longer than UNIT_NAME_MAX
065364920281e1cf59cab989e17aff21790505c4 did the conversion to install_path_printf().
But IIUC, here we are just looking at a unit file name, not the full
path.
shared/install: consistently use 'lp' as the name for the LookupPaths instance
Most of the codebase does this. Here we were using 'p' or 'paths'
instead. Those names are very generic and not good for a "global-like"
object like the LookupPaths instance. And we also have 'path' variable,
and it's confusing to have 'path' and 'paths' in the same function that
are unrelated.
Also pass down LookupPaths* lower in the call stack, in preparation for
future changes.
homed: permit inodes owned by UID_MAPPED_ROOT to be created in $HOME
If people use nspawn in their $HOME we should allow this inodes owned by
this special UID to be created temporarily, so that UID mapped nspawn
containers just work.
nspawn: make sure host root can write to the uidmapped mounts we prepare for the container payload
When using user namespaces in conjunction with uidmapped mounts, nspawn
so far set up two uidmappings:
1. One that is used for the uidmapped mount and that maps the UID range
0…65535 on the backing fs to some high UID range X…X+65535 on the
uidmapped fs. (Let's call this mapping the "mount mapping")
2. One that is used for the userns namespace the container payload
processes run in, that maps X…X+65535 back to 0…65535. (Let's call
this one the "process mapping").
These mappings hence are pretty much identical, one just moves things up
and one back down. (Reminder: we do all this so that the processes can
run under high UIDs while running off file systems that require no
recursive chown()ing, i.e. we want processes with high UID range but
files with low UID range.)
This creates one problem, i.e. issue #20989: if nspawn (which runs as
host root, i.e. host UID 0) wants to add inodes to the uidmapped mount
it can't do that, since host UID 0 is not defined in the mount mapping
(only the X…X+65536 range is, after all, and X > 0), and processes whose
UID is not mapped in a uidmapped fs cannot create inodes in it since
those would be owned by an unmapped UID, which then triggers
the famous EOVERFLOW error.
Let's fix this, by explicitly including an entry for the host UID 0 in
the mount mapping. Specifically, we'll extend the mount mapping to map
UID 2147483646 (which is INT32_MAX-1, see code for an explanation why I
picked this one) of the backing fs to UID 0 on the uidmapped fs. This
way nspawn can creates inode on the uidmapped as it likes (which will
then actually be owned by UID 2147483646 on the backing fs), and as it
always did. Note that we do *not* create a similar entry in the process
mapping. Thus any files created by nspawn that way (and not chown()ed to
something better) will appear as unmapped (i.e. as overflowuid/"nobody")
in the container payload. And that's good. Of course, the latter is
mostly theoretic, as nspawn should generally chown() the inodes it
creates to UID ranges that actually make sense for the container (and we
generally already do this correctly), but it#s good to know that we are
safe here, given we might accidentally forget to chown() some inodes we
create.
Net effect: the two mappings will not be identical anymore. The mount
mapping has one entry more, and the only reason it exists is so that
nspawn can access the uidmapped fs reasonably independently from any
process mapping.
Yu Watanabe [Wed, 16 Mar 2022 11:46:49 +0000 (20:46 +0900)]
udev: run the main process, workers, and spawned commands in /udev subcgroup
And enable cgroup delegation for udevd.
Then, processes invoked through ExecReload= are assigned .control
subcgroup, and they are not killed by cg_kill().