resolved: synthesize NODATA cache results when we find matching NSEC RRs
If we have a precisely matching NSEC RR for a name, we can use its type
bit field to synthesize NODATA cache lookup results for all types not
mentioned in there.
This is useful for mDNS where NSEC RRs are used to indicate missing RRs
for a specific type, but is beneficial in other cases too.
This adds most basic operation for doing DNSSEC validation on the
client side. However, it does not actually add the verification logic to
the resolver. Specifically, this patch only includes:
- Verifying DNSKEY RRs against a DS RRs
- Verifying RRSets against a combination of RRSIG and DNSKEY RRs
- Matching up RRSIG RRs and DNSKEY RRs
- Matching up RR keys and RRSIG RRs
- Calculating the DNSSEC key tag from a DNSKEY RR
All currently used DNSSEC combinations of SHA and RSA are implemented. Support
for MD5 hashing and DSA or EC cyphers are not. MD5 and DSA are probably
obsolete, and shouldn't be added. EC should probably be added
eventually, if it actually is deployed on the Internet.
resolved: port ResolveRecord() bus call implementation to dns_resource_record_to_wire_format()
Now that we have dns_resource_record_to_wire_format() we can generate
the RR serialization we return to bus clients in ResolveRecord() with
it. We pass the RR data along in the original form, not the DNSSEC
canonical form, since that would mean we'd lose RR name casing, which is
however important to keep for DNS-SD services and similar.
resolved: add code to generate the wire format for a single RR
This adds dns_resource_record_to_wire_format() that generates the raw
wire-format of a single DnsResourceRecord object, and caches it in the
object, optionally in DNSSEC canonical form. This call is used later to
generate the RR serialization of RRs to verify.
This adds four new fields to DnsResourceRecord objects:
- wire_format points to the buffer with the wire-format version of the
RR
- wire_format_size stores the size of that buffer
- wire_format_rdata_offset specifies the index into the buffer where the
RDATA of the RR begins (i.e. the size of the key part of the RR).
- wire_format_canonical is a boolean that stores whether the cached wire
format is in DNSSEC canonical form or not.
Note that this patch adds a mode where a DnsPacket is allocated on the
stack (instead of on the heap), so that it is cheaper to reuse the
DnsPacket object for generating this wire format. After all we reuse the
DnsPacket object for this, since it comes with all the dynamic memory
management, and serialization calls we need anyway.
resolved: store DNSKEY fields flags+protocol as-is
When verifying signatures we need to be able to verify the original
data we got for an RR set, and that means we cannot simply drop flags
bits or consider RRs invalid too eagerly. Hence, instead of parsing the
DNSKEY flags store them as-is. Similar, accept the protocol field as it
is, and don't consider it a parsing error if it is not 3.
Of course, this means that the DNSKEY handling code later on needs to
check explicit for protocol != 3.
Let's merge access_init() and mac_selinux_access_init(), and only call
mac_selinux_use() once, inside the merged function, instead of multiple
times, including in the caller.
Make sure dns_name_normalize(), dns_name_concat(), dns_name_is_valid()
do not accept/generate invalidly long hostnames, i.e. longer than 253
characters.
dns-domain: be more strict when encoding/decoding labels
Labels of zero length are not OK, refuse them early on. The concept of a
"zero-length label" doesn't exist, a zero-length full domain name
however does (representing the root domain). See RFC 2181, Section 11.
This way, directories created later for containers or for
journald-remote, will be readable by adm & wheel groups by default,
similarly to /var/log/journal/%m itself.
When we have non-owner user or group entries, we need the mask
for the acl to be valid. But acl_calc_mask() calculates the mask
to include all permissions, even those that were masked before.
Apparently this happens when we inherit *:r-x permissions from
a parent directory — the kernel sets *:r-x, mask:r--, effectively
masking the executable bit. acl_calc_mask() would set the mask:r-x,
effectively enabling the bit. To avoid this, be more conservative when
to add the mask entry: first iterate over all entries, and do nothing
if a mask.
This returns the code closer to J.A.Steffens' original version
in v204-90-g23ad4dd884.
Should fix https://github.com/systemd/systemd/issues/1977.
For now, only add_acls_for_user is tested. When run under root, it
actually sets the acls. When run under non-root, it sets the acls for
the user, which does nothing, but at least calls the functions.
journal: move the gist of server_fix_perms to acl-util.[hc]
Most of the function is moved to acl-util.c to make it possible to
add tests in subsequent commit.
Setting of the mode in server_fix_perms is removed:
- we either just created the file ourselves, and the permission be better right,
- or the file was already there, and we should not modify the permissions.
server_fix_perms is renamed to server_fix_acls to better reflect new
meaning, and made static because it is only used in one file.
libsystemd: make sure we prefix even the dirty secrets in our API with "_sd_"
This renames __useless_struct_to_allow_trailing_semicolon__ everywhere
to _sd_useless_struct_to_allow_trailing_semicolon_, to follow our usual
rule of prefixing stuff from public headers that should be considered
internal with "_sd_".
While we are at it, also to be safe: when the struct is used in the C++
protector macros make sure to use two different names depending on
whether it appears in the C++ or C side of things. After all, there
might be compilers that don't consider C++ and C structs the same.
See https://github.com/systemd/systemd/pull/2052#discussion_r46067059
selinux: split up mac_selinux_have() from mac_selinux_use()
Let's distuingish the cases where our code takes an active role in
selinux management, or just passively reports whatever selinux
properties are set.
mac_selinux_have() now checks whether selinux is around for the passive
stuff, and mac_selinux_use() for the active stuff. The latter checks the
former, plus also checks UID == 0, under the assumption that only when
we run priviliged selinux management really makes sense.
The header file defines some helpers for GLIBC NSS and doesn't include
anything else but glibc headers, hence there's little reason to keep it
in shared/.
tree-wide: expose "p"-suffix unref calls in public APIs to make gcc cleanup easy
GLIB has recently started to officially support the gcc cleanup
attribute in its public API, hence let's do the same for our APIs.
With this patch we'll define an xyz_unrefp() call for each public
xyz_unref() call, to make it easy to use inside a
__attribute__((cleanup())) expression. Then, all code is ported over to
make use of this.
The new calls are also documented in the man pages, with examples how to
use them (well, I only added docs where the _unref() call itself already
had docs, and the examples, only cover sd_bus_unrefp() and
sd_event_unrefp()).
This also renames sd_lldp_free() to sd_lldp_unref(), since that's how we
tend to call our destructors these days.
Note that this defines no public macro that wraps gcc's attribute and
makes it easier to use. While I think it's our duty in the library to
make our stuff easy to use, I figure it's not our duty to make gcc's own
features easy to use on its own. Most likely, client code which wants to
make use of this should define its own:
Tom Gundersen [Mon, 6 Jul 2015 14:48:24 +0000 (16:48 +0200)]
resolved: announce support for large UDP packets
This is often needed for proper DNSSEC support, and even to handle AAAA records
without falling back to TCP.
If the path between the client and server is fully compliant, this should always
work, however, that is not the case, and overlarge packets will get mysteriously
lost in some cases.
For that reason, we use a similar fallback mechanism as we do for palin EDNS0,
EDNS0+DO, etc.:
The large UDP size feature is different from the other supported feature, as we
cannot simply verify that it works based on receiving a reply (as the server
will usually send us much smaller packets than what we claim to support, so
simply receiving a reply does not mean much).
For that reason, we keep track of the largest UDP packet we ever received, as this
is the smallest known good size (defaulting to the standard 512 bytes). If
announcing the default large size of 4096 fails (in the same way as the other
features), we fall back to the known good size. The same logic of retrying after a
grace-period applies.
Tom Gundersen [Wed, 24 Jun 2015 13:08:40 +0000 (15:08 +0200)]
resolved: set the DNSSEC OK (DO) flag
This indicates that we can handle DNSSEC records (per RFC3225), even if
all we do is silently drop them. This feature requires EDNS0 support.
As we do not yet support larger UDP packets, this feature increases the
risk of getting truncated packets.
Similarly to how we fall back to plain UDP if EDNS0 fails, we will fall
back to plain EDNS0 if EDNS0+DO fails (with the same logic of remembering
success and retrying after a grace period after failure).
Tom Gundersen [Tue, 23 Jun 2015 21:06:09 +0000 (23:06 +0200)]
resolved: implement minimal EDNS0 support
This is a minimal implementation of RFC6891. Only default values
are used, so in reality this will be a noop.
EDNS0 support is dependent on the current server's feature level,
so appending the OPT pseudo RR is done when the packet is emitted,
rather than when it is assembled. To handle different feature
levels on retransmission, we strip off the OPT RR again after
sending the packet.
Similarly, to how we fall back to TCP if UDP fails, we fall back
to plain UDP if EDNS0 fails (but if EDNS0 ever succeeded we never
fall back again, and after a timeout we will retry EDNS0).
Tom Gundersen [Thu, 16 Jul 2015 12:39:55 +0000 (14:39 +0200)]
resolved: degrade the feature level on explicit failure
Previously, we would only degrade on packet loss, but when adding EDNS0 support,
we also have to handle the case where the server replies with an explicit error.
Tom Gundersen [Mon, 6 Jul 2015 06:15:25 +0000 (08:15 +0200)]
resolved: fallback to TCP if UDP fails
This is inspired by the logic in BIND [0], follow-up patches
will implement the reset of that scheme.
If we get a server error back, or if after several attempts we don't
get a reply at all, we switch from UDP to TCP for the given
server for the current and all subsequent requests. However, if
we ever successfully received a reply over UDP, we never fall
back to TCP, and once a grace-period has passed, we try to upgrade
again to using UDP. The grace-period starts off at five minutes
after the current feature level was verified and then grows
exponentially to six hours. This is to mitigate problems due
to temporary lack of network connectivity, but at the same time
avoid flooding the network with retries when the feature attempted
feature level genuinely does not work.
Note that UDP is likely much more commonly supported than TCP,
but depending on the path between the client and the server, we
may have more luck with TCP in case something is wrong. We really
do prefer UDP though, as that is much more lightweight, that is
why TCP is only the last resort.
If /etc/resolv.conf is missing, this should not result in the server
list to be cleared, after all the native data from resolved.conf
shouldn't be flushed out then. Hence flush out the data only if
/etc/resolv.conf exists, but we cannot read it for some reason.
resolved: don't follow the global search list on local scopes
It probably doesn't make sense to mix local and global configuration.
Applying global search lists to local DNS servers appears unnecessary
and creates problems because we'll traverse the search domains
non-simultaneously on multiple scopes.
resolved: handle properly if there are multiple transactions for the same key per scope
When the zone probing code looks for a transaction to reuse it will
refuse to look at transactions that have been answered from cache or the
zone itself, but insist on the network. This has the effect that there
might be multiple transactions around for the same key on the same
scope. Previously we'd track all transactions in a hashmap, indexed by
the key, which implied that there would be only one transaction per key,
per scope. With this change the hashmap will only store the most recent
transaction per key, and a linked list will be used to track all
transactions per scope, allowing multiple per-key per-scope.
Note that the linked list fields for this actually already existed in
the DnsTransaction structure, but were previously unused.
resolved: for a transaction, keep track where the answer data came from
Let's track where the data came from: from the network, the cache or the
local zone. This is not only useful for debugging purposes, but is also
useful when the zone probing wants to ensure it's not reusing
transactions that were answered from the cache or the zone itself.
resolved: store just the DnsAnswer instead of a DnsPacket as answer in DnsTransaction objects
Previously we'd only store the DnsPacket in the DnsTransaction, and the
DnsQuery would then take the DnsPacket's DnsAnswer and return it. With
this change we already pull the DnsAnswer out inside the transaction.
We still store the DnsPacket in the transaction, if we have it, since we
still need to determine from which peer a response originates, to
implement caching properly. However, the DnsQuery logic doesn't care
anymore for the packet, it now only looks at answers and rcodes from the
successfuly candidate.
This also has the benefit of unifying how we propagate incoming packets,
data from the local zone or the local cache.
Let's use a more useful way to write the flags. Also, leave some space
in the middle for the mDNS flags. After all, these flags are exposed on
the bus, and we should really make sure to expose flags that are going
to be stable, hence allow some room here...
(Not that the room really mattered, except to be nice to one's OCD)
The name RandomSec is too generic: "Sec" just specifies the default
unit type, and "Random" by itself is not enough. Rename to something
that should give the user general idea what the setting does without
looking at documentation.
Tom Gundersen [Thu, 26 Nov 2015 02:58:08 +0000 (03:58 +0100)]
resolved: bus - follow CNAME chains when resolving addresses
It may be unexpected to find a CNAME record when doing a reverse lookup, as we
expect to find a PTR record directly. However, it is explicitly supported
according to <https://tools.ietf.org/html/rfc2181#section-10.2>, and there
seems to be no benefit to not supporting it.
Tom Gundersen [Wed, 25 Nov 2015 21:22:38 +0000 (22:22 +0100)]
resolved: do not reject NSEC records with empty bitmaps
The assumption that no NSEC bitmap could be empty due to the presence of the bit representing
the record itself turns out to be flawed. See (the admittedly experimental) RFC4956 for a
counter example.
dns-domain: rework dns_label_escape() to not imply memory allocation
The new dns_label_escape() call now operates on a buffer passed in,
similar to dns_label_unescape(). This should make decoding a bit faster,
and nicer.
dns-domain: simplify dns_name_is_root() and dns_name_is_single_label()
Let's change the return value to bool. If we encounter an error while
parsing, return "false" instead of the actual parsing error, after all
the specified hostname does not qualify for what the function is
supposed to test.
Dealing with the additional error codes was always cumbersome, and
easily misused, like for example in the DHCP code.
Let's also rename the functions from dns_name_root() to
dns_name_is_root(), to indicate that this function checks something and
returns a bool. Similar for dns_name_is_signal_label().