- allow multiple signal handlers per signal?
- document chaining of signal handler for SIGCHLD and child handlers
- define more intervals where we will shift wakeup intervals around in, 1h, 6h, 24h, ...
- - generate a failure of a default event loop is executed out-of-thread
* investigate endianness issues of UUID vs. GUID
resolver is not capable of authenticating the server, so it is
vulnerable to "man-in-the-middle" attacks.</para>
+ <para>Server Name Indication (SNI) can be used when opening a TLS connection.
+ Entries in <varname>DNS=</varname> should be in format <literal>address#server_name</literal>.</para>
+
<para>In addition to this global DNSOverTLS setting
<citerefentry><refentrytitle>systemd-networkd.service</refentrytitle><manvolnum>8</manvolnum></citerefentry>
also maintains per-link DNSOverTLS settings. For system DNS
['sd_bus_wait', '3', [], ''],
['sd_event_add_child',
'3',
- ['sd_event_child_handler_t', 'sd_event_source_get_child_pid'],
+ ['sd_event_add_child_pidfd',
+ 'sd_event_child_handler_t',
+ 'sd_event_source_get_child_pid',
+ 'sd_event_source_get_child_pidfd',
+ 'sd_event_source_get_child_pidfd_own',
+ 'sd_event_source_get_child_process_own',
+ 'sd_event_source_send_child_signal',
+ 'sd_event_source_set_child_pidfd_own',
+ 'sd_event_source_set_child_process_own'],
''],
['sd_event_add_defer',
'3',
<refnamediv>
<refname>sd_event_add_child</refname>
+ <refname>sd_event_add_child_pidfd</refname>
<refname>sd_event_source_get_child_pid</refname>
+ <refname>sd_event_source_get_child_pidfd</refname>
+ <refname>sd_event_source_get_child_pidfd_own</refname>
+ <refname>sd_event_source_set_child_pidfd_own</refname>
+ <refname>sd_event_source_get_child_process_own</refname>
+ <refname>sd_event_source_set_child_process_own</refname>
+ <refname>sd_event_source_send_child_signal</refname>
<refname>sd_event_child_handler_t</refname>
<refpurpose>Add a child process state change event source to an event loop</refpurpose>
<paramdef>void *<parameter>userdata</parameter></paramdef>
</funcprototype>
+ <funcprototype>
+ <funcdef>int <function>sd_event_add_child_pidfd</function></funcdef>
+ <paramdef>sd_event *<parameter>event</parameter></paramdef>
+ <paramdef>sd_event_source **<parameter>source</parameter></paramdef>
+ <paramdef>int <parameter>pidfd</parameter></paramdef>
+ <paramdef>int <parameter>options</parameter></paramdef>
+ <paramdef>sd_event_child_handler_t <parameter>handler</parameter></paramdef>
+ <paramdef>void *<parameter>userdata</parameter></paramdef>
+ </funcprototype>
+
<funcprototype>
<funcdef>int <function>sd_event_source_get_child_pid</function></funcdef>
<paramdef>sd_event_source *<parameter>source</parameter></paramdef>
<paramdef>pid_t *<parameter>pid</parameter></paramdef>
</funcprototype>
+ <funcprototype>
+ <funcdef>int <function>sd_event_source_get_child_pidfd</function></funcdef>
+ <paramdef>sd_event_source *<parameter>source</parameter></paramdef>
+ </funcprototype>
+
+ <funcprototype>
+ <funcdef>int <function>sd_event_source_get_child_pidfd_own</function></funcdef>
+ <paramdef>sd_event_source *<parameter>source</parameter></paramdef>
+ </funcprototype>
+
+ <funcprototype>
+ <funcdef>int <function>sd_event_source_set_child_pidfd_own</function></funcdef>
+ <paramdef>sd_event_source *<parameter>source</parameter></paramdef>
+ <paramdef>int <parameter>own</parameter></paramdef>
+ </funcprototype>
+
+ <funcprototype>
+ <funcdef>int <function>sd_event_source_get_child_process_own</function></funcdef>
+ <paramdef>sd_event_source *<parameter>source</parameter></paramdef>
+ </funcprototype>
+
+ <funcprototype>
+ <funcdef>int <function>sd_event_source_set_child_process_own</function></funcdef>
+ <paramdef>sd_event_source *<parameter>source</parameter></paramdef>
+ <paramdef>int <parameter>own</parameter></paramdef>
+ </funcprototype>
+
+ <funcprototype>
+ <funcdef>int <function>sd_event_source_send_child_signal</function></funcdef>
+ <paramdef>sd_event_source *<parameter>source</parameter></paramdef>
+ <paramdef>int <parameter>sig</parameter></paramdef>
+ <paramdef>const siginfo_t *<parameter>info</parameter></paramdef>
+ <paramdef>unsigned <parameter>flags</parameter></paramdef>
+ </funcprototype>
+
</funcsynopsis>
</refsynopsisdiv>
<refsect1>
<title>Description</title>
- <para><function>sd_event_add_child()</function> adds a new child
- process state change event source to an event loop. The event loop
- object is specified in the <parameter>event</parameter> parameter,
- the event source object is returned in the
- <parameter>source</parameter> parameter. The
- <parameter>pid</parameter> parameter specifies the PID of the
- process to watch. The <parameter>handler</parameter> must
- reference a function to call when the process changes state. The
- handler function will be passed the
- <parameter>userdata</parameter> pointer, which may be chosen
- freely by the caller. The handler also receives a pointer to a
- <structname>siginfo_t</structname> structure containing
- information about the child process event. The
- <parameter>options</parameter> parameter determines which state
- changes will be watched for. It must contain an OR-ed mask of
- <constant>WEXITED</constant> (watch for the child process
- terminating), <constant>WSTOPPED</constant> (watch for the child
- process being stopped by a signal), and
- <constant>WCONTINUED</constant> (watch for the child process being
- resumed by a signal). See <citerefentry
- project='man-pages'><refentrytitle>waitid</refentrytitle><manvolnum>2</manvolnum></citerefentry>
- for further information.</para>
+ <para><function>sd_event_add_child()</function> adds a new child process state change event source to an
+ event loop. The event loop object is specified in the <parameter>event</parameter> parameter, the event
+ source object is returned in the <parameter>source</parameter> parameter. The <parameter>pid</parameter>
+ parameter specifies the PID of the process to watch, which must be a direct child process of the invoking
+ process. The <parameter>handler</parameter> must reference a function to call when the process changes
+ state. The handler function will be passed the <parameter>userdata</parameter> pointer, which may be
+ chosen freely by the caller. The handler also receives a pointer to a <structname>siginfo_t</structname>
+ structure containing information about the child process event. The <parameter>options</parameter>
+ parameter determines which state changes will be watched for. It must contain an OR-ed mask of
+ <constant>WEXITED</constant> (watch for the child process terminating), <constant>WSTOPPED</constant>
+ (watch for the child process being stopped by a signal), and <constant>WCONTINUED</constant> (watch for
+ the child process being resumed by a signal). See <citerefentry
+ project='man-pages'><refentrytitle>waitid</refentrytitle><manvolnum>2</manvolnum></citerefentry> for
+ further information.</para>
<para>Only a single handler may be installed for a specific
child process. The handler is enabled for a single event
<constant>SD_EVENT_OFF</constant> with
<citerefentry><refentrytitle>sd_event_source_set_enabled</refentrytitle><manvolnum>3</manvolnum></citerefentry>.</para>
+ <para>The <constant>SIGCHLD</constant> signal must be blocked in all threads before this function is
+ called (using <citerefentry
+ project='man-pages'><refentrytitle>sigprocmask</refentrytitle><manvolnum>2</manvolnum></citerefentry> or
+ <citerefentry
+ project='man-pages'><refentrytitle>pthread_sigmask</refentrytitle><manvolnum>3</manvolnum></citerefentry>).</para>
+
<para>If the second parameter of
<function>sd_event_add_child()</function> is passed as NULL no
reference to the event source object is returned. In this case the
processed first, it should leave the child processes for which
child process state change event sources are installed unreaped.</para>
+ <para><function>sd_event_add_child_pidfd()</function> is similar to
+ <function>sd_event_add_child()</function> but takes a file descriptor referencing the process ("pidfd")
+ instead of the numeric PID. A suitable file descriptor may be acquired via <citerefentry
+ project='man-pages'><refentrytitle>pidfd_open</refentrytitle><manvolnum>2</manvolnum></citerefentry> and
+ related calls. The passed file descriptor is not closed when the event source is freed again, unless
+ <function>sd_event_source_set_child_pidfd_own()</function> is used to turn this behaviour on. Note that
+ regardless of which of <function>sd_event_add_child()</function> and
+ <function>sd_event_add_child_pidfd()</function> is used for allocating an event source, the watched
+ process has to be a direct child process of the invoking process. Also in both cases
+ <constant>SIGCHLD</constant> has to be blocked in the invoking process.</para>
+
<para><function>sd_event_source_get_child_pid()</function>
retrieves the configured PID of a child process state change event
source created previously with
pointer to a <type>pid_t</type> variable to return the process ID
in.
</para>
+
+ <para><function>sd_event_source_get_child_pidfd()</function> retrieves the file descriptor referencing
+ the watched process ("pidfd") if this functionality is available. On kernels that support the concept the
+ event loop will make use of pidfds to watch child processes, regardless of whether the individual event sources
+ are allocated via <function>sd_event_add_child()</function> or
+ <function>sd_event_add_child_pidfd()</function>. If the latter call was used to allocate the event
+ source, this function returns the file descriptor used for allocation. On kernels that do not support the
+ pidfd concept this function will fail with <constant>EOPNOTSUPP</constant>. This call takes the event
+ source object as the <parameter>source</parameter> parameter and returns the numeric file descriptor.
+ </para>
+
+ <para><function>sd_event_source_get_child_pidfd_own()</function> may be used to query whether the pidfd
+ the event source encapsulates shall be closed when the event source is freed. This function returns zero
+ if the pidfd shall be left open, and positive if it shall be closed automatically. This setting
+ defaults to on if the event source was allocated via <function>sd_event_add_child()</function>
+ and off if it was allocated via <function>sd_event_add_child_pidfd()</function>. The
+ <function>sd_event_source_set_child_pidfd_own()</function> function may be used to change the setting and
+ takes a boolean parameter with the new setting.</para>
+
+ <para><function>sd_event_source_get_child_process_own()</function> may be used to query whether the
+ process the event source watches shall be killed (with <constant>SIGKILL</constant>) and reaped when the
+ event source is freed. This function returns zero if the process shall be left running, and positive if
+ it shall be killed and reaped automatically. This setting defaults to off. The
+ <function>sd_event_source_set_child_process_own()</function> function may be used to change the setting
+ and takes a boolean parameter with the new setting. Note that currently if the calling process is
+ terminated abnormally the watched process might survive even though the event source ceases to
+ exist. This behaviour might change eventually.</para>
+
+ <para><function>sd_event_source_send_child_signal()</function> may be used to send a UNIX signal to the
+ watched process. If the pidfd concept is supported in the kernel, this is implemented via <citerefentry
+ project='man-pages'><refentrytitle>pidfd_send_signal</refentrytitle><manvolnum>2</manvolnum></citerefentry>
+ and otherwise via <citerefentry
+ project='man-pages'><refentrytitle>rt_sigqueueinfo</refentrytitle><manvolnum>2</manvolnum></citerefentry>
+ (or via <citerefentry
+ project='man-pages'><refentrytitle>kill</refentrytitle><manvolnum>2</manvolnum></citerefentry> in case
+ <parameter>info</parameter> is <constant>NULL</constant>). The specified parameters match those of these
+ underlying system calls, except that the <parameter>info</parameter> is never modified (and is thus
+ declared constant). As with the underlying system calls, the <parameter>flags</parameter> parameter
+ currently must be zero.</para>
</refsect1>
<refsect1>
<varlistentry>
<term><constant>-EBUSY</constant></term>
- <listitem><para>A handler is already installed for this
- child process.</para></listitem>
+ <listitem><para>A handler is already installed for this child process, or
+ <constant>SIGCHLD</constant> is not blocked.</para></listitem>
</varlistentry>
<listitem><para>The passed event source is not a child process event source.</para></listitem>
</varlistentry>
+ <varlistentry>
+ <term><constant>-EOPNOTSUPP</constant></term>
+
+ <listitem><para>A pidfd was requested but the kernel does not support this concept.</para></listitem>
+ </varlistentry>
+
</variablelist>
</refsect2>
</refsect1>
<citerefentry><refentrytitle>sd_event_source_set_userdata</refentrytitle><manvolnum>3</manvolnum></citerefentry>,
<citerefentry><refentrytitle>sd_event_source_set_description</refentrytitle><manvolnum>3</manvolnum></citerefentry>,
<citerefentry><refentrytitle>sd_event_source_set_floating</refentrytitle><manvolnum>3</manvolnum></citerefentry>,
- <citerefentry project='man-pages'><refentrytitle>waitid</refentrytitle><manvolnum>2</manvolnum></citerefentry>
+ <citerefentry project='man-pages'><refentrytitle>waitid</refentrytitle><manvolnum>2</manvolnum></citerefentry>,
+ <citerefentry project='man-pages'><refentrytitle>sigprocmask</refentrytitle><manvolnum>2</manvolnum></citerefentry>,
+ <citerefentry project='man-pages'><refentrytitle>pthread_sigmask</refentrytitle><manvolnum>3</manvolnum></citerefentry>,
+ <citerefentry project='man-pages'><refentrytitle>pidfd_open</refentrytitle><manvolnum>2</manvolnum></citerefentry>,
+ <citerefentry project='man-pages'><refentrytitle>pidfd_send_signal</refentrytitle><manvolnum>2</manvolnum></citerefentry>,
+ <citerefentry project='man-pages'><refentrytitle>rt_sigqueueinfo</refentrytitle><manvolnum>2</manvolnum></citerefentry>,
+ <citerefentry project='man-pages'><refentrytitle>kill</refentrytitle><manvolnum>2</manvolnum></citerefentry>
</para>
</refsect1>
project='man-pages'><refentrytitle>signalfd</refentrytitle><manvolnum>2</manvolnum></citerefentry>
for further information.</para>
- <para>Only a single handler may be installed for a specific
- signal. The signal will be unblocked by this call, and must be
- blocked before this function is called in all threads (using
+ <para>Only a single handler may be installed for a specific signal. The signal must be blocked in all
+ threads before this function is called (using <citerefentry
+ project='man-pages'><refentrytitle>sigprocmask</refentrytitle><manvolnum>2</manvolnum></citerefentry> or
<citerefentry
- project='man-pages'><refentrytitle>sigprocmask</refentrytitle><manvolnum>2</manvolnum></citerefentry>). If
- the handler is not specified (<parameter>handler</parameter> is
- <constant>NULL</constant>), a default handler which causes the
- program to exit cleanly will be used.</para>
+ project='man-pages'><refentrytitle>pthread_sigmask</refentrytitle><manvolnum>3</manvolnum></citerefentry>). If
+ the handler is not specified (<parameter>handler</parameter> is <constant>NULL</constant>), a default
+ handler which causes the program to exit cleanly will be used.</para>
<para>By default, the event source is enabled permanently
(<constant>SD_EVENT_ON</constant>), but this may be changed with
<citerefentry><refentrytitle>sd_event_source_set_userdata</refentrytitle><manvolnum>3</manvolnum></citerefentry>,
<citerefentry><refentrytitle>sd_event_source_set_floating</refentrytitle><manvolnum>3</manvolnum></citerefentry>,
<citerefentry project='man-pages'><refentrytitle>signal</refentrytitle><manvolnum>7</manvolnum></citerefentry>,
- <citerefentry project='man-pages'><refentrytitle>signalfd</refentrytitle><manvolnum>2</manvolnum></citerefentry>
+ <citerefentry project='man-pages'><refentrytitle>signalfd</refentrytitle><manvolnum>2</manvolnum></citerefentry>,
+ <citerefentry project='man-pages'><refentrytitle>sigprocmask</refentrytitle><manvolnum>2</manvolnum></citerefentry>,
+ <citerefentry project='man-pages'><refentrytitle>pthread_sigmask</refentrytitle><manvolnum>3</manvolnum></citerefentry>
</para>
</refsect1>
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><varname>TokenBufferFilterLatencySec=</varname></term>
+ <listitem>
+ <para>Specifies the latency parameter, that is, the maximum amount of time a
+ packet can sit in the Token Buffer Filter (TBF). Defaults to unset.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>TokenBufferFilterBurst=</varname></term>
+ <listitem>
+ <para>Specifies the size of the bucket. This is the maximum number of bytes for which tokens
+ can be available instantaneously. When the size is suffixed with K, M, or G, it is
+ parsed as Kilobytes, Megabytes, or Gigabytes, respectively, to the base of 1000. Defaults to
+ unset.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>TokenBufferFilterRate=</varname></term>
+ <listitem>
+ <para>Specifies the device specific bandwidth. When suffixed with K, M, or G, the specified
+ bandwidth is parsed as Kilobytes, Megabytes, or Gigabytes, respectively, to the base of 1000.
+ Defaults to unset.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>StochasticFairnessQueueingPerturbPeriodSec=</varname></term>
+ <listitem>
+ <para>Specifies the interval in seconds for queue algorithm perturbation. Defaults to unset.</para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</refsect1>
#include <unistd.h>'''],
['get_mempolicy', '''#include <stdlib.h>
#include <unistd.h>'''],
+ ['pidfd_send_signal', '''#include <stdlib.h>
+ #include <unistd.h>
+ #include <signal.h>
+ #include <sys/wait.h>'''],
+ ['pidfd_open', '''#include <stdlib.h>
+ #include <unistd.h>
+ #include <signal.h>
+ #include <sys/wait.h>'''],
+ ['rt_sigqueueinfo', '''#include <stdlib.h>
+ #include <unistd.h>
+ #include <signal.h>
+ #include <sys/wait.h>'''],
]
have = cc.has_function(ident[0], prefix : ident[1], args : '-D_GNU_SOURCE')
EACCES,
EPERM);
}
+
+/* Three different errors for "not enough disk space" */
+static inline bool ERRNO_IS_DISK_SPACE(int r) {
+ return IN_SET(abs(r),
+ ENOSPC,
+ EDQUOT,
+ EFBIG);
+}
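
The helper above treats three errno values as the same "out of disk space" condition and accepts both positive errno values and negative return codes. A minimal standalone sketch of the same check, written without systemd's IN_SET() macro; the name errno_is_disk_space() is hypothetical and not part of the patch:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical standalone equivalent of ERRNO_IS_DISK_SPACE(): accepts either a
 * positive errno value or a negative errno-style return code. */
static bool errno_is_disk_space(int r) {
        switch (abs(r)) {
        case ENOSPC: /* no space left on device */
        case EDQUOT: /* disk quota exceeded */
        case EFBIG:  /* file too large */
                return true;
        default:
                return false;
        }
}
```

A caller would typically use it on a negative return value, for example `if (r < 0 && errno_is_disk_space(r))`, to emit one common diagnostic for all three cases.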
void *explicit_bzero_safe(void *p, size_t l);
#endif
-static inline void erase_and_freep(void *p) {
- void *ptr = *(void**) p;
+static inline void* erase_and_free(void *p) {
+ size_t l;
+
+ if (!p)
+ return NULL;
+
+ l = malloc_usable_size(p);
+ explicit_bzero_safe(p, l);
+ free(p);
- if (ptr) {
- size_t l = malloc_usable_size(ptr);
- explicit_bzero_safe(ptr, l);
- free(ptr);
- }
+ return NULL;
+}
+
+static inline void erase_and_freep(void *p) {
+ erase_and_free(*(void**) p);
}
/* Use with _cleanup_ to erase a single 'char' when leaving scope */
#include <errno.h>
#include <fcntl.h>
+#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
+#include <sys/wait.h>
#include <unistd.h>
#ifdef ARCH_MIPS
#define get_mempolicy missing_get_mempolicy
#endif
+
+#if !HAVE_PIDFD_OPEN
+/* may be (invalid) negative number due to libseccomp, see PR 13319 */
+# if ! (defined __NR_pidfd_open && __NR_pidfd_open > 0)
+# if defined __NR_pidfd_open
+# undef __NR_pidfd_open
+# endif
+# define __NR_pidfd_open 434
+#endif
+static inline int pidfd_open(pid_t pid, unsigned flags) {
+#ifdef __NR_pidfd_open
+ return syscall(__NR_pidfd_open, pid, flags);
+#else
+ errno = ENOSYS;
+ return -1;
+#endif
+}
+#endif
+
+#if !HAVE_PIDFD_SEND_SIGNAL
+/* may be (invalid) negative number due to libseccomp, see PR 13319 */
+# if ! (defined __NR_pidfd_send_signal && __NR_pidfd_send_signal > 0)
+# if defined __NR_pidfd_send_signal
+# undef __NR_pidfd_send_signal
+# endif
+# define __NR_pidfd_send_signal 424
+#endif
+static inline int pidfd_send_signal(int fd, int sig, siginfo_t *info, unsigned flags) {
+#ifdef __NR_pidfd_send_signal
+ return syscall(__NR_pidfd_send_signal, fd, sig, info, flags);
+#else
+ errno = ENOSYS;
+ return -1;
+#endif
+}
+#endif
+
+#if !HAVE_RT_SIGQUEUEINFO
+static inline int rt_sigqueueinfo(pid_t tgid, int sig, siginfo_t *info) {
+ return syscall(__NR_rt_sigqueueinfo, tgid, sig, info);
+}
+#endif
return ordered_hashmap_remove((OrderedHashmap*) s, p);
}
+static inline void* ordered_set_first(OrderedSet *s) {
+ return ordered_hashmap_first((OrderedHashmap*) s);
+}
+
static inline void* ordered_set_steal_first(OrderedSet *s) {
return ordered_hashmap_steal_first((OrderedHashmap*) s);
}
unsigned long l;
assert(s);
- assert(ret_u);
assert(base <= 16);
/* strtoul() is happy to parse negative values, and silently
if ((unsigned long) (unsigned) l != l)
return -ERANGE;
- *ret_u = (unsigned) l;
+ if (ret_u)
+ *ret_u = (unsigned) l;
+
return 0;
}
long l;
assert(s);
- assert(ret_i);
errno = 0;
l = strtol(s, &x, 0);
if ((long) (int) l != l)
return -ERANGE;
- *ret_i = (int) l;
+ if (ret_i)
+ *ret_i = (int) l;
+
return 0;
}
unsigned long long l;
assert(s);
- assert(ret_llu);
s += strspn(s, WHITESPACE);
if (*s == '-')
return -ERANGE;
- *ret_llu = l;
+ if (ret_llu)
+ *ret_llu = l;
+
return 0;
}
long long l;
assert(s);
- assert(ret_lli);
errno = 0;
l = strtoll(s, &x, 0);
if (!x || x == s || *x != 0)
return -EINVAL;
- *ret_lli = l;
+ if (ret_lli)
+ *ret_lli = l;
+
return 0;
}
unsigned long l;
assert(s);
- assert(ret);
s += strspn(s, WHITESPACE);
if ((unsigned long) (uint8_t) l != l)
return -ERANGE;
- *ret = (uint8_t) l;
+ if (ret)
+ *ret = (uint8_t) l;
return 0;
}
long l;
assert(s);
- assert(ret);
errno = 0;
l = strtol(s, &x, 0);
if ((long) (int16_t) l != l)
return -ERANGE;
- *ret = (int16_t) l;
+ if (ret)
+ *ret = (int16_t) l;
+
return 0;
}
double d = 0;
assert(s);
- assert(ret_d);
loc = newlocale(LC_NUMERIC_MASK, "C", (locale_t) 0);
if (loc == (locale_t) 0)
if (!x || x == s || *x != 0)
return -EINVAL;
- *ret_d = (double) d;
+ if (ret_d)
+ *ret_d = (double) d;
+
return 0;
}
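
The parser hunks above make the output parameter optional, so a caller can pass NULL to validate a string without storing the result. A simplified sketch of that pattern, modelled on the signed integer variant; parse_int_sketch() is a hypothetical name and omits details of the real helpers:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the integer parser in the patch: returns 0 on
 * success or a negative errno-style code. ret_i may be NULL if the caller only wants
 * to know whether the string is a valid int. */
static int parse_int_sketch(const char *s, int *ret_i) {
        char *x = NULL;
        long l;

        assert(s);

        errno = 0;
        l = strtol(s, &x, 0);
        if (errno > 0)
                return -errno;
        if (!x || x == s || *x != 0)
                return -EINVAL;      /* empty string or trailing garbage */
        if ((long) (int) l != l)
                return -ERANGE;      /* fits in long but not in int */

        if (ret_i)                   /* only store the value if the caller asked for it */
                *ret_i = (int) l;

        return 0;
}
```

With this shape, `parse_int_sketch(arg, NULL) >= 0` works as a pure syntax check.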
#include "rlimit-util.h"
#include "signal-util.h"
#include "stat-util.h"
+#include "stdio-util.h"
#include "string-table.h"
#include "string-util.h"
#include "terminal-util.h"
log_full_errno(prio, r, "Failed to connect stdin/stdout to /dev/null: %m");
_exit(EXIT_FAILURE);
}
+
+ } else if (flags & FORK_STDOUT_TO_STDERR) {
+
+ if (dup2(STDERR_FILENO, STDOUT_FILENO) < 0) {
+ log_full_errno(prio, errno, "Failed to connect stdout to stderr: %m");
+ _exit(EXIT_FAILURE);
+ }
}
if (flags & FORK_RLIMIT_NOFILE_SAFE) {
WRITE_STRING_FILE_VERIFY_ON_FAILURE|WRITE_STRING_FILE_DISABLE_BUFFER);
}
+int pidfd_get_pid(int fd, pid_t *ret) {
+ char path[STRLEN("/proc/self/fdinfo/") + DECIMAL_STR_MAX(int)];
+ _cleanup_free_ char *fdinfo = NULL;
+ char *p;
+ int r;
+
+ if (fd < 0)
+ return -EBADF;
+
+ xsprintf(path, "/proc/self/fdinfo/%i", fd);
+
+ r = read_full_file(path, &fdinfo, NULL);
+ if (r == -ENOENT) /* if fdinfo doesn't exist we assume the process does not exist */
+ return -ESRCH;
+ if (r < 0)
+ return r;
+
+ p = startswith(fdinfo, "Pid:");
+ if (!p) {
+ p = strstr(fdinfo, "\nPid:");
+ if (!p)
+ return -ENOTTY; /* not a pidfd? */
+
+ p += 5;
+ }
+
+ p += strspn(p, WHITESPACE);
+ p[strcspn(p, WHITESPACE)] = 0;
+
+ return parse_pid(p, ret);
+}
+
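
The field extraction in pidfd_get_pid() can be tried on an in-memory buffer. The sketch below assumes fdinfo-style text with a "Pid:" line and reproduces only the string handling; fdinfo_extract_pid() is a hypothetical name, and it uses plain strtol() instead of parse_pid(), so it does not apply any PID range checks:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define WS " \t\n\r"

/* Hypothetical helper: finds the "Pid:" field in fdinfo-style text and parses its value.
 * The buffer is modified in place (the field is NUL-terminated), as in the patch.
 * Returns 0 on success or a negative errno-style code. */
static int fdinfo_extract_pid(char *fdinfo, long *ret) {
        char *p, *end = NULL;
        long l;

        assert(fdinfo);

        if (strncmp(fdinfo, "Pid:", 4) == 0)
                p = fdinfo + 4;          /* field is on the very first line */
        else {
                p = strstr(fdinfo, "\nPid:");
                if (!p)
                        return -ENOTTY;  /* no such field, probably not a pidfd */
                p += 5;                  /* skip the newline and "Pid:" */
        }

        p += strspn(p, WS);              /* skip separator whitespace */
        p[strcspn(p, WS)] = 0;           /* cut at the end of the value */

        errno = 0;
        l = strtol(p, &end, 10);
        if (errno > 0)
                return -errno;
        if (end == p || *end != 0)
                return -EINVAL;

        if (ret)
                *ret = l;
        return 0;
}
```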
static const char *const ioprio_class_table[] = {
[IOPRIO_CLASS_NONE] = "none",
[IOPRIO_CLASS_RT] = "realtime",
int must_be_root(void);
typedef enum ForkFlags {
- FORK_RESET_SIGNALS = 1 << 0, /* Reset all signal handlers and signal mask */
- FORK_CLOSE_ALL_FDS = 1 << 1, /* Close all open file descriptors in the child, except for 0,1,2 */
- FORK_DEATHSIG = 1 << 2, /* Set PR_DEATHSIG in the child */
- FORK_NULL_STDIO = 1 << 3, /* Connect 0,1,2 to /dev/null */
- FORK_REOPEN_LOG = 1 << 4, /* Reopen log connection */
- FORK_LOG = 1 << 5, /* Log above LOG_DEBUG log level about failures */
- FORK_WAIT = 1 << 6, /* Wait until child exited */
- FORK_NEW_MOUNTNS = 1 << 7, /* Run child in its own mount namespace */
- FORK_MOUNTNS_SLAVE = 1 << 8, /* Make child's mount namespace MS_SLAVE */
- FORK_RLIMIT_NOFILE_SAFE = 1 << 9, /* Set RLIMIT_NOFILE soft limit to 1K for select() compat */
+ FORK_RESET_SIGNALS = 1 << 0, /* Reset all signal handlers and signal mask */
+ FORK_CLOSE_ALL_FDS = 1 << 1, /* Close all open file descriptors in the child, except for 0,1,2 */
+ FORK_DEATHSIG = 1 << 2, /* Set PR_DEATHSIG in the child */
+ FORK_NULL_STDIO = 1 << 3, /* Connect 0,1,2 to /dev/null */
+ FORK_REOPEN_LOG = 1 << 4, /* Reopen log connection */
+ FORK_LOG = 1 << 5, /* Log above LOG_DEBUG log level about failures */
+ FORK_WAIT = 1 << 6, /* Wait until child exited */
+ FORK_NEW_MOUNTNS = 1 << 7, /* Run child in its own mount namespace */
+ FORK_MOUNTNS_SLAVE = 1 << 8, /* Make child's mount namespace MS_SLAVE */
+ FORK_RLIMIT_NOFILE_SAFE = 1 << 9, /* Set RLIMIT_NOFILE soft limit to 1K for select() compat */
+ FORK_STDOUT_TO_STDERR = 1 << 10, /* Make stdout a copy of stderr */
} ForkFlags;
int safe_fork_full(const char *name, const int except_fds[], size_t n_except_fds, ForkFlags flags, pid_t *ret_pid);
(pid) = 0; \
_pid_; \
})
+
+int pidfd_get_pid(int fd, pid_t *ret);
void nop_signal_handler(int sig) {
/* nothing here */
}
+
+int signal_is_blocked(int sig) {
+ sigset_t ss;
+ int r;
+
+ r = pthread_sigmask(SIG_SETMASK, NULL, &ss);
+ if (r != 0)
+ return -r;
+
+ r = sigismember(&ss, sig);
+ if (r < 0)
+ return -errno;
+
+ return r;
+}
return signal_to_string(n);
}
+
+int signal_is_blocked(int sig);
return true;
}
+
+char* string_erase(char *x) {
+ if (!x)
+ return NULL;
+
+ /* A delicious drop of snake-oil! To be called on memory where we stored passphrases or so, after we
+ * used them. */
+ explicit_bzero_safe(x, strlen(x));
+ return x;
+}
return (*p = t);
}
+
+char* string_erase(char *x);
#include "tmpfile-util.h"
#include "umask-util.h"
-int fopen_temporary(const char *path, FILE **_f, char **_temp_path) {
- FILE *f;
- char *t;
- int r, fd;
+int fopen_temporary(const char *path, FILE **ret_f, char **ret_temp_path) {
+ _cleanup_fclose_ FILE *f = NULL;
+ _cleanup_free_ char *t = NULL;
+ _cleanup_close_ int fd = -1;
+ int r;
- assert(path);
- assert(_f);
- assert(_temp_path);
+ if (path) {
+ r = tempfn_xxxxxx(path, NULL, &t);
+ if (r < 0)
+ return r;
+ } else {
+ const char *d;
- r = tempfn_xxxxxx(path, NULL, &t);
- if (r < 0)
- return r;
+ r = tmp_dir(&d);
+ if (r < 0)
+ return r;
+
+ t = path_join(d, "XXXXXX");
+ if (!t)
+ return -ENOMEM;
+ }
fd = mkostemp_safe(t);
- if (fd < 0) {
- free(t);
+ if (fd < 0)
return -errno;
- }
/* This assumes that returned FILE object is short-lived and used within the same single-threaded
* context and never shared externally, hence locking is not necessary. */
r = fdopen_unlocked(fd, "w", &f);
if (r < 0) {
- unlink(t);
- free(t);
- safe_close(fd);
+ (void) unlink(t);
return r;
}
- *_f = f;
- *_temp_path = t;
+ TAKE_FD(fd);
+
+ if (ret_f)
+ *ret_f = TAKE_PTR(f);
+
+ if (ret_temp_path)
+ *ret_temp_path = TAKE_PTR(t);
return 0;
}
/* This is much like mkostemp() but is subject to umask(). */
int mkostemp_safe(char *pattern) {
- _unused_ _cleanup_umask_ mode_t u = umask(0077);
int fd;
assert(pattern);
- fd = mkostemp(pattern, O_CLOEXEC);
+ RUN_WITH_UMASK(0077)
+ fd = mkostemp(pattern, O_CLOEXEC);
if (fd < 0)
return -errno;
return uid_to_name(getuid());
}
-static bool is_nologin_shell(const char *shell) {
+bool is_nologin_shell(const char *shell) {
return PATH_IN_SET(shell,
/* 'nologin' is the friendliest way to disable logins for a user account. It prints a nice
#define ETC_PASSWD_LOCK_PATH "/etc/.pwd.lock"
+static inline bool uid_is_system(uid_t uid) {
+ return uid <= SYSTEM_UID_MAX;
+}
+
+static inline bool gid_is_system(gid_t gid) {
+ return gid <= SYSTEM_GID_MAX;
+}
+
static inline bool uid_is_dynamic(uid_t uid) {
return DYNAMIC_UID_MIN <= uid && uid <= DYNAMIC_UID_MAX;
}
return uid_is_dynamic((uid_t) gid);
}
-static inline bool uid_is_system(uid_t uid) {
- return uid <= SYSTEM_UID_MAX;
+static inline bool uid_is_container(uid_t uid) {
+ return CONTAINER_UID_BASE_MIN <= uid && uid <= CONTAINER_UID_BASE_MAX;
}
-static inline bool gid_is_system(gid_t gid) {
- return gid <= SYSTEM_GID_MAX;
+static inline bool gid_is_container(gid_t gid) {
+ return uid_is_container((uid_t) gid);
}
/* The following macros add 1 when converting things, since UID 0 is a valid UID, while the pointer
#endif
int make_salt(char **ret);
+
+bool is_nologin_shell(const char *shell);
};
typedef struct {
- CHAR16 *id; /* The identifier for this entry (note that this id is not necessarily unique though!) */
+ CHAR16 *id; /* The unique identifier for this entry */
CHAR16 *title_show;
CHAR16 *title;
CHAR16 *version;
CHAR8 *line;
UINTN pos = 0;
CHAR8 *key, *value;
- UINTN len;
EFI_STATUS err;
EFI_FILE_HANDLE handle;
_cleanup_freepool_ CHAR16 *initrd = NULL;
entry->device = device;
entry->id = StrDuplicate(file);
- len = StrLen(entry->id);
- /* remove ".conf" */
- if (len > 5)
- entry->id[len - 5] = '\0';
StrLwr(entry->id);
config_add_entry(config, entry);
CHAR16 *id,
CHAR16 key,
CHAR16 *title,
- CHAR16 *loader) {
+ CHAR16 *loader,
+ CHAR16 *version) {
ConfigEntry *entry;
*entry = (ConfigEntry) {
.type = type,
.title = StrDuplicate(title),
+ .version = StrDuplicate(version),
.device = device,
.loader = StrDuplicate(loader),
.id = StrDuplicate(id),
return FALSE;
uefi_call_wrapper(handle->Close, 1, handle);
- entry = config_entry_add_loader(config, device, LOADER_UNDEFINED, id, key, title, loader);
+ entry = config_entry_add_loader(config, device, LOADER_UNDEFINED, id, key, title, loader, NULL);
if (!entry)
return FALSE;
CHAR8 *line;
UINTN pos = 0;
CHAR8 *key, *value;
+ CHAR16 *os_name_pretty = NULL;
CHAR16 *os_name = NULL;
CHAR16 *os_id = NULL;
CHAR16 *os_version = NULL;
- CHAR16 *os_build = NULL;
+ CHAR16 *os_version_id = NULL;
+ CHAR16 *os_build_id = NULL;
err = uefi_call_wrapper(linux_dir->Read, 3, linux_dir, &bufsize, buf);
if (bufsize == 0 || EFI_ERROR(err))
continue;
if (StriCmp(f->FileName + len - 4, L".efi") != 0)
continue;
+ if (StrnCmp(f->FileName, L"auto-", 5) == 0)
+ continue;
/* look for .osrel and .cmdline sections in the .efi binary */
err = pe_file_locate_sections(linux_dir, f->FileName, sections, addrs, offs, szs);
/* read properties from the embedded os-release file */
while ((line = line_get_key_value(content, (CHAR8 *)"=", &pos, &key, &value))) {
if (strcmpa((CHAR8 *)"PRETTY_NAME", key) == 0) {
+ FreePool(os_name_pretty);
+ os_name_pretty = stra_to_str(value);
+ continue;
+ }
+
+ if (strcmpa((CHAR8 *)"NAME", key) == 0) {
FreePool(os_name);
os_name = stra_to_str(value);
continue;
continue;
}
+ if (strcmpa((CHAR8 *)"VERSION_ID", key) == 0) {
+ FreePool(os_version_id);
+ os_version_id = stra_to_str(value);
+ continue;
+ }
+
if (strcmpa((CHAR8 *)"BUILD_ID", key) == 0) {
- FreePool(os_build);
- os_build = stra_to_str(value);
+ FreePool(os_build_id);
+ os_build_id = stra_to_str(value);
continue;
}
}
- if (os_name && os_id && (os_version || os_build)) {
- _cleanup_freepool_ CHAR16 *conf = NULL, *path = NULL;
+ if ((os_name_pretty || os_name) && os_id && (os_version || os_version_id || os_build_id)) {
+ _cleanup_freepool_ CHAR16 *path = NULL;
- conf = PoolPrint(L"%s-%s", os_id, os_version ? : os_build);
path = PoolPrint(L"\\EFI\\Linux\\%s", f->FileName);
- entry = config_entry_add_loader(config, device, LOADER_LINUX, conf, 'l', os_name, path);
+ entry = config_entry_add_loader(config, device, LOADER_LINUX, f->FileName, 'l',
+ os_name_pretty ? : (os_name ? : os_id), path,
+ os_version ? : (os_version_id ? : os_build_id));
FreePool(content);
content = NULL;
config_entry_parse_tries(entry, L"\\EFI\\Linux", f->FileName, L".efi");
}
+ FreePool(os_name_pretty);
FreePool(os_name);
FreePool(os_id);
FreePool(os_version);
- FreePool(os_build);
+ FreePool(os_version_id);
+ FreePool(os_build_id);
FreePool(content);
}
/* SPDX-License-Identifier: LGPL-2.1+ */
#include <errno.h>
+#include <linux/loop.h>
#include <sched.h>
#include <stdio.h>
#include <sys/mount.h>
r = loop_device_make_by_path(root_image,
dissect_image_flags & DISSECT_IMAGE_READ_ONLY ? O_RDONLY : O_RDWR,
+ LO_FLAGS_PARTSCAN,
&loop_device);
if (r < 0)
return log_debug_errno(r, "Failed to create loop device for root image: %m");
/* SPDX-License-Identifier: LGPL-2.1+ */
#include <fcntl.h>
-#include <stdio.h>
#include <getopt.h>
+#include <linux/loop.h>
+#include <stdio.h>
#include "architecture.h"
#include "dissect-image.h"
if (r <= 0)
return r;
- r = loop_device_make_by_path(arg_image, (arg_flags & DISSECT_IMAGE_READ_ONLY) ? O_RDONLY : O_RDWR, &d);
+ r = loop_device_make_by_path(arg_image, (arg_flags & DISSECT_IMAGE_READ_ONLY) ? O_RDONLY : O_RDWR, LO_FLAGS_PARTSCAN, &d);
if (r < 0)
return log_error_errno(r, "Failed to set up loopback device: %m");
sd_bus_object_vtable_format;
sd_event_source_disable_unref;
} LIBSYSTEMD_241;
+
+LIBSYSTEMD_245 {
+global:
+ sd_event_add_child_pidfd;
+ sd_event_source_get_child_pidfd;
+ sd_event_source_get_child_pidfd_own;
+ sd_event_source_set_child_pidfd_own;
+ sd_event_source_get_child_process_own;
+ sd_event_source_set_child_process_own;
+ sd_event_source_send_child_signal;
+} LIBSYSTEMD_243;
* we know how to dispatch it */
typedef enum WakeupType {
WAKEUP_NONE,
- WAKEUP_EVENT_SOURCE,
+ WAKEUP_EVENT_SOURCE, /* either I/O or pidfd wakeup */
WAKEUP_CLOCK_DATA,
WAKEUP_SIGNAL_DATA,
WAKEUP_INOTIFY_DATA,
siginfo_t siginfo;
pid_t pid;
int options;
+ int pidfd;
+ bool registered:1; /* whether the pidfd is registered in the epoll */
+ bool pidfd_owned:1; /* close pidfd when event source is freed */
+ bool process_owned:1; /* kill+reap process when event source is freed */
+ bool exited:1; /* true if process exited (i.e. if there's no value in SIGKILLing it if we want to get rid of it) */
+ bool waited:1; /* true if process was waited for (i.e. if there's no value in waitid(P_PID)'ing it if we want to get rid of it) */
} child;
struct {
sd_event_handler_t callback;
#include "sd-id128.h"
#include "alloc-util.h"
+#include "env-util.h"
#include "event-source.h"
#include "fd-util.h"
#include "fs-util.h"
#define DEFAULT_ACCURACY_USEC (250 * USEC_PER_MSEC)
+static bool EVENT_SOURCE_WATCH_PIDFD(sd_event_source *s) {
+ /* Returns true if this is a PID event source and can be implemented by watching EPOLLIN */
+ return s &&
+ s->type == SOURCE_CHILD &&
+ s->child.pidfd >= 0 &&
+ s->child.options == WEXITED;
+}
+
static const char* const event_source_type_table[_SOURCE_EVENT_SOURCE_TYPE_MAX] = {
[SOURCE_IO] = "io",
[SOURCE_TIME_REALTIME] = "realtime",
}
static void source_io_unregister(sd_event_source *s) {
- int r;
-
assert(s);
assert(s->type == SOURCE_IO);
if (!s->io.registered)
return;
- r = epoll_ctl(s->event->epoll_fd, EPOLL_CTL_DEL, s->io.fd, NULL);
- if (r < 0)
+ if (epoll_ctl(s->event->epoll_fd, EPOLL_CTL_DEL, s->io.fd, NULL) < 0)
log_debug_errno(errno, "Failed to remove source %s (type %s) from epoll: %m",
strna(s->description), event_source_type_to_string(s->type));
return 0;
}
+static void source_child_pidfd_unregister(sd_event_source *s) {
+ assert(s);
+ assert(s->type == SOURCE_CHILD);
+
+ if (event_pid_changed(s->event))
+ return;
+
+ if (!s->child.registered)
+ return;
+
+ if (EVENT_SOURCE_WATCH_PIDFD(s))
+ if (epoll_ctl(s->event->epoll_fd, EPOLL_CTL_DEL, s->child.pidfd, NULL) < 0)
+ log_debug_errno(errno, "Failed to remove source %s (type %s) from epoll: %m",
+ strna(s->description), event_source_type_to_string(s->type));
+
+ s->child.registered = false;
+}
+
+static int source_child_pidfd_register(sd_event_source *s, int enabled) {
+ int r;
+
+ assert(s);
+ assert(s->type == SOURCE_CHILD);
+ assert(enabled != SD_EVENT_OFF);
+
+ if (EVENT_SOURCE_WATCH_PIDFD(s)) {
+ struct epoll_event ev;
+
+ ev = (struct epoll_event) {
+ .events = EPOLLIN | (enabled == SD_EVENT_ONESHOT ? EPOLLONESHOT : 0),
+ .data.ptr = s,
+ };
+
+ if (s->child.registered)
+ r = epoll_ctl(s->event->epoll_fd, EPOLL_CTL_MOD, s->child.pidfd, &ev);
+ else
+ r = epoll_ctl(s->event->epoll_fd, EPOLL_CTL_ADD, s->child.pidfd, &ev);
+ if (r < 0)
+ return -errno;
+ }
+
+ s->child.registered = true;
+ return 0;
+}
+
static clockid_t event_source_type_to_clock(EventSourceType t) {
switch (t) {
assert(e);
- /* Rechecks if the specified signal is still something we are
- * interested in. If not, we'll unmask it, and possibly drop
- * the signalfd for it. */
+ /* Rechecks if the specified signal is still something we are interested in. If not, we'll unmask it,
+ * and possibly drop the signalfd for it. */
if (sig == SIGCHLD &&
e->n_enabled_child_sources > 0)
}
(void) hashmap_remove(s->event->child_sources, PID_TO_PTR(s->child.pid));
- event_gc_signal_data(s->event, &s->priority, SIGCHLD);
}
+ if (EVENT_SOURCE_WATCH_PIDFD(s))
+ source_child_pidfd_unregister(s);
+ else
+ event_gc_signal_data(s->event, &s->priority, SIGCHLD);
+
break;
case SOURCE_DEFER:
if (s->type == SOURCE_IO && s->io.owned)
s->io.fd = safe_close(s->io.fd);
+ if (s->type == SOURCE_CHILD) {
+ /* Eventually the kernel will do this automatically for us, but for now let's emulate this (unreliably) in userspace. */
+
+ if (s->child.process_owned) {
+
+ if (!s->child.exited) {
+ bool sent = false;
+
+ if (s->child.pidfd >= 0) {
+ if (pidfd_send_signal(s->child.pidfd, SIGKILL, NULL, 0) < 0) {
+ if (errno == ESRCH) /* Already dead */
+ sent = true;
+ else if (!ERRNO_IS_NOT_SUPPORTED(errno))
+ log_debug_errno(errno, "Failed to kill process " PID_FMT " via pidfd_send_signal(), re-trying via kill(): %m",
+ s->child.pid);
+ } else
+ sent = true;
+ }
+
+ if (!sent)
+ if (kill(s->child.pid, SIGKILL) < 0)
+ if (errno != ESRCH) /* Already dead */
+ log_debug_errno(errno, "Failed to kill process " PID_FMT " via kill(), ignoring: %m",
+ s->child.pid);
+ }
+
+ if (!s->child.waited) {
+ siginfo_t si = {};
+
+ /* Reap the child if we can */
+ (void) waitid(P_PID, s->child.pid, &si, WEXITED);
+ }
+ }
+
+ if (s->child.pidfd_owned)
+ s->child.pidfd = safe_close(s->child.pidfd);
+ }
+
if (s->destroy_callback)
s->destroy_callback(s->userdata);
_cleanup_(source_freep) sd_event_source *s = NULL;
struct signal_data *d;
- sigset_t ss;
int r;
assert_return(e, -EINVAL);
if (!callback)
callback = signal_exit_callback;
- r = pthread_sigmask(SIG_SETMASK, NULL, &ss);
- if (r != 0)
- return -r;
-
- if (!sigismember(&ss, sig))
+ r = signal_is_blocked(sig);
+ if (r < 0)
+ return r;
+ if (r == 0)
return -EBUSY;
if (!e->signal_sources) {
return 0;
}
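The check that replaces the open-coded mask query above can be sketched as a small standalone helper. This is an assumption-laden illustration, not the actual `signal_is_blocked()` from the tree: the name `is_signal_blocked()` is hypothetical, and it uses `sigprocmask()` for a single-threaded program where the removed lines used `pthread_sigmask()`.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if sig is blocked, 0 if it is not,
 * and a negative errno-style value on failure. */
static int is_signal_blocked(int sig) {
        sigset_t ss;
        int r;

        /* With a NULL new set, sigprocmask() only reports the current mask. */
        if (sigprocmask(SIG_SETMASK, NULL, &ss) < 0)
                return -errno;

        r = sigismember(&ss, sig);
        if (r < 0)
                return -errno; /* e.g. invalid signal number */

        return r;
}
```

A caller that requires the signal to be blocked would treat a return of 0 as an error, the way the patch returns -EBUSY in that case.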
+static bool shall_use_pidfd(void) {
+ /* Mostly relevant for debugging, i.e. this is used in test-event.c to test the event loop once with and once without pidfd */
+ return getenv_bool_secure("SYSTEMD_PIDFD") != 0;
+}
+
_public_ int sd_event_add_child(
sd_event *e,
sd_event_source **ret,
assert_return(e->state != SD_EVENT_FINISHED, -ESTALE);
assert_return(!event_pid_changed(e), -ECHILD);
+ if (e->n_enabled_child_sources == 0) {
+ /* Caller must block SIGCHLD before using us to watch children, even if pidfd is available,
+ * for compatibility with pre-pidfd and because we don't want to reap the child processes
+ * ourselves, i.e. call waitid(), and don't want Linux' default internal logic for that to
+ * take effect.
+ *
+ * (As an optimization we only do this check on the first child event source created.) */
+ r = signal_is_blocked(SIGCHLD);
+ if (r < 0)
+ return r;
+ if (r == 0)
+ return -EBUSY;
+ }
+
r = hashmap_ensure_allocated(&e->child_sources, NULL);
if (r < 0)
return r;
if (!s)
return -ENOMEM;
+ s->wakeup = WAKEUP_EVENT_SOURCE;
s->child.pid = pid;
s->child.options = options;
s->child.callback = callback;
s->userdata = userdata;
s->enabled = SD_EVENT_ONESHOT;
+ /* We always take a pidfd here if we can, even if we wait for anything other than WEXITED, so that we
+ * pin the PID, and make regular waitid() handling race-free. */
+
+ if (shall_use_pidfd()) {
+ s->child.pidfd = pidfd_open(s->child.pid, 0);
+ if (s->child.pidfd < 0) {
+ /* Propagate errors unless the syscall is not supported or blocked */
+ if (!ERRNO_IS_NOT_SUPPORTED(errno) && !ERRNO_IS_PRIVILEGE(errno))
+ return -errno;
+ } else
+ s->child.pidfd_owned = true; /* If we allocate the pidfd we own it by default */
+ } else
+ s->child.pidfd = -1;
+
r = hashmap_put(e->child_sources, PID_TO_PTR(pid), s);
if (r < 0)
return r;
e->n_enabled_child_sources++;
- r = event_make_signal_data(e, SIGCHLD, NULL);
- if (r < 0) {
- e->n_enabled_child_sources--;
- return r;
- }
+ if (EVENT_SOURCE_WATCH_PIDFD(s)) {
+ /* We have a pidfd and we only want to watch for exit */
+
+ r = source_child_pidfd_register(s, s->enabled);
+ if (r < 0) {
+ e->n_enabled_child_sources--;
+ return r;
+ }
+ } else {
+ /* We have no pidfd or we shall wait for some other event than WEXITED */
+
+ r = event_make_signal_data(e, SIGCHLD, NULL);
+ if (r < 0) {
+ e->n_enabled_child_sources--;
+ return r;
+ }
- e->need_process_child = true;
+ e->need_process_child = true;
+ }
if (ret)
*ret = s;
+
TAKE_PTR(s);
+ return 0;
+}
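The pidfd path added above (open a pidfd, register it for EPOLLIN, then collect the status with a non-reaping waitid()) can be sketched outside the event loop. This is a minimal sketch under assumptions: `watch_child_exit()` is a hypothetical name, the raw `SYS_pidfd_open` syscall is used because a libc wrapper may not be available, and unlike sd-event the sketch blocks in `epoll_wait()` instead of integrating with other sources.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the exit was observed through a pidfd,
 * 0 if pidfd_open() is unsupported or blocked (the caller should then fall
 * back to SIGCHLD), and a negative errno value on other failures. */
static int watch_child_exit(pid_t pid, siginfo_t *si) {
        struct epoll_event ev = { .events = EPOLLIN }, out;
        int pidfd = -1, epfd = -1, r;

#ifdef SYS_pidfd_open
        pidfd = (int) syscall(SYS_pidfd_open, pid, 0);
#else
        errno = ENOSYS;
#endif
        if (pidfd < 0)
                return (errno == ENOSYS || errno == EPERM || errno == EACCES) ? 0 : -errno;

        epfd = epoll_create1(EPOLL_CLOEXEC);
        if (epfd < 0 || epoll_ctl(epfd, EPOLL_CTL_ADD, pidfd, &ev) < 0) {
                r = -errno;
                goto finish;
        }

        /* A pidfd becomes readable once the process has terminated. */
        do
                r = epoll_wait(epfd, &out, 1, -1);
        while (r < 0 && errno == EINTR);
        if (r < 0) {
                r = -errno;
                goto finish;
        }

        /* WNOWAIT leaves the zombie in place so the caller can still reap it. */
        memset(si, 0, sizeof(*si));
        if (waitid(P_PID, (id_t) pid, si, WEXITED | WNOHANG | WNOWAIT) < 0)
                r = -errno;
        else
                r = 1;

finish:
        if (epfd >= 0)
                close(epfd);
        close(pidfd);
        return r;
}
```

After a return of 1 the child is still a zombie, so the caller reaps it with a final `waitpid()` or `waitid()` without `WNOWAIT`, which mirrors the "reap the PID for good" step in the dispatch code.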
+
+_public_ int sd_event_add_child_pidfd(
+ sd_event *e,
+ sd_event_source **ret,
+ int pidfd,
+ int options,
+ sd_event_child_handler_t callback,
+ void *userdata) {
+
+ _cleanup_(source_freep) sd_event_source *s = NULL;
+ pid_t pid;
+ int r;
+
+ assert_return(e, -EINVAL);
+ assert_return(e = event_resolve(e), -ENOPKG);
+ assert_return(pidfd >= 0, -EBADF);
+ assert_return(!(options & ~(WEXITED|WSTOPPED|WCONTINUED)), -EINVAL);
+ assert_return(options != 0, -EINVAL);
+ assert_return(callback, -EINVAL);
+ assert_return(e->state != SD_EVENT_FINISHED, -ESTALE);
+ assert_return(!event_pid_changed(e), -ECHILD);
+
+ if (e->n_enabled_child_sources == 0) {
+ r = signal_is_blocked(SIGCHLD);
+ if (r < 0)
+ return r;
+ if (r == 0)
+ return -EBUSY;
+ }
+
+ r = hashmap_ensure_allocated(&e->child_sources, NULL);
+ if (r < 0)
+ return r;
+
+ r = pidfd_get_pid(pidfd, &pid);
+ if (r < 0)
+ return r;
+ if (hashmap_contains(e->child_sources, PID_TO_PTR(pid)))
+ return -EBUSY;
+
+ s = source_new(e, !ret, SOURCE_CHILD);
+ if (!s)
+ return -ENOMEM;
+
+ s->wakeup = WAKEUP_EVENT_SOURCE;
+ s->child.pidfd = pidfd;
+ s->child.pid = pid;
+ s->child.options = options;
+ s->child.callback = callback;
+ s->child.pidfd_owned = false; /* If we got the pidfd passed in we don't own it by default (similar to the IO fd case) */
+ s->userdata = userdata;
+ s->enabled = SD_EVENT_ONESHOT;
+
+ r = hashmap_put(e->child_sources, PID_TO_PTR(pid), s);
+ if (r < 0)
+ return r;
+
+ e->n_enabled_child_sources++;
+
+ if (EVENT_SOURCE_WATCH_PIDFD(s)) {
+ /* We only want to watch for WEXITED */
+
+ r = source_child_pidfd_register(s, s->enabled);
+ if (r < 0) {
+ e->n_enabled_child_sources--;
+ return r;
+ }
+ } else {
+ /* We shall wait for some other event than WEXITED */
+
+ r = event_make_signal_data(e, SIGCHLD, NULL);
+ if (r < 0) {
+ e->n_enabled_child_sources--;
+ return r;
+ }
+
+ e->need_process_child = true;
+ }
+
+ if (ret)
+ *ret = s;
+
+ TAKE_PTR(s);
return 0;
}
return r;
}
- epoll_ctl(s->event->epoll_fd, EPOLL_CTL_DEL, saved_fd, NULL);
+ (void) epoll_ctl(s->event->epoll_fd, EPOLL_CTL_DEL, saved_fd, NULL);
}
return 0;
assert(s->event->n_enabled_child_sources > 0);
s->event->n_enabled_child_sources--;
- event_gc_signal_data(s->event, &s->priority, SIGCHLD);
+ if (EVENT_SOURCE_WATCH_PIDFD(s))
+ source_child_pidfd_unregister(s);
+ else
+ event_gc_signal_data(s->event, &s->priority, SIGCHLD);
+
break;
case SOURCE_EXIT:
s->enabled = m;
- r = event_make_signal_data(s->event, SIGCHLD, NULL);
- if (r < 0) {
- s->enabled = SD_EVENT_OFF;
- s->event->n_enabled_child_sources--;
- event_gc_signal_data(s->event, &s->priority, SIGCHLD);
- return r;
+ if (EVENT_SOURCE_WATCH_PIDFD(s)) {
+ /* yes, we have pidfd */
+
+ r = source_child_pidfd_register(s, s->enabled);
+ if (r < 0) {
+ s->enabled = SD_EVENT_OFF;
+ s->event->n_enabled_child_sources--;
+ return r;
+ }
+ } else {
+ /* no pidfd, or something other to watch for than WEXITED */
+
+ r = event_make_signal_data(s->event, SIGCHLD, NULL);
+ if (r < 0) {
+ s->enabled = SD_EVENT_OFF;
+ s->event->n_enabled_child_sources--;
+ event_gc_signal_data(s->event, &s->priority, SIGCHLD);
+ return r;
+ }
}
break;
return 0;
}
+_public_ int sd_event_source_get_child_pidfd(sd_event_source *s) {
+ assert_return(s, -EINVAL);
+ assert_return(s->type == SOURCE_CHILD, -EDOM);
+ assert_return(!event_pid_changed(s->event), -ECHILD);
+
+ if (s->child.pidfd < 0)
+ return -EOPNOTSUPP;
+
+ return s->child.pidfd;
+}
+
+_public_ int sd_event_source_send_child_signal(sd_event_source *s, int sig, const siginfo_t *si, unsigned flags) {
+ assert_return(s, -EINVAL);
+ assert_return(s->type == SOURCE_CHILD, -EDOM);
+ assert_return(!event_pid_changed(s->event), -ECHILD);
+ assert_return(SIGNAL_VALID(sig), -EINVAL);
+
+ /* If we have already seen an indication that the process exited, refuse to send a signal early. This way we
+ * can be sure we don't accidentally kill the wrong process on PID reuse when pidfds are not
+ * available. */
+ if (s->child.exited)
+ return -ESRCH;
+
+ if (s->child.pidfd >= 0) {
+ siginfo_t copy;
+
+ /* pidfd_send_signal() changes the siginfo_t argument. This is weird, let's hence copy the
+ * structure here */
+ if (si)
+ copy = *si;
+
+ if (pidfd_send_signal(s->child.pidfd, sig, si ? &copy : NULL, 0) < 0) {
+ /* Propagate the error unless the system call is not implemented or is prohibited */
+ if (!ERRNO_IS_NOT_SUPPORTED(errno) && !ERRNO_IS_PRIVILEGE(errno))
+ return -errno;
+ } else
+ return 0;
+ }
+
+ /* Flags are only supported for pidfd_send_signal(), not for rt_sigqueueinfo(), hence let's refuse
+ * this here. */
+ if (flags != 0)
+ return -EOPNOTSUPP;
+
+ if (si) {
+ /* We use rt_sigqueueinfo() only if siginfo_t is specified. */
+ siginfo_t copy = *si;
+
+ if (rt_sigqueueinfo(s->child.pid, sig, &copy) < 0)
+ return -errno;
+ } else if (kill(s->child.pid, sig) < 0)
+ return -errno;
+
+ return 0;
+}
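The siginfo_t fallback above queues a signal together with a payload. A minimal way to see that round trip is to queue a signal to the current process and read it back. This sketch makes several assumptions: `queue_and_receive()` is a hypothetical name, it uses the portable `sigqueue()` wrapper instead of the raw `rt_sigqueueinfo()` call in the patch, and it assumes a single-threaded process so that the blocked signal stays pending for the calling thread.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: queues sig with an integer payload to ourselves and
 * reads it back synchronously. Returns the payload, or a negative errno. */
static int queue_and_receive(int sig, int payload) {
        union sigval v = { .sival_int = payload };
        sigset_t ss, old;
        siginfo_t si;
        int r;

        sigemptyset(&ss);
        sigaddset(&ss, sig);

        /* Block the signal first, so it stays pending instead of being delivered. */
        if (sigprocmask(SIG_BLOCK, &ss, &old) < 0)
                return -errno;

        if (sigqueue(getpid(), sig, v) < 0) {
                r = -errno;
                goto restore;
        }

        if (sigwaitinfo(&ss, &si) < 0) {
                r = -errno;
                goto restore;
        }

        r = si.si_value.sival_int;

restore:
        (void) sigprocmask(SIG_SETMASK, &old, NULL);
        return r;
}
```

The test further down in this patch does the same across a fork: the parent sends SIGUSR2 with the value 4711 and the child picks it up with `sigwaitinfo()`.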
+
+_public_ int sd_event_source_get_child_pidfd_own(sd_event_source *s) {
+ assert_return(s, -EINVAL);
+ assert_return(s->type == SOURCE_CHILD, -EDOM);
+
+ if (s->child.pidfd < 0)
+ return -EOPNOTSUPP;
+
+ return s->child.pidfd_owned;
+}
+
+_public_ int sd_event_source_set_child_pidfd_own(sd_event_source *s, int own) {
+ assert_return(s, -EINVAL);
+ assert_return(s->type == SOURCE_CHILD, -EDOM);
+
+ if (s->child.pidfd < 0)
+ return -EOPNOTSUPP;
+
+ s->child.pidfd_owned = own;
+ return 0;
+}
+
+_public_ int sd_event_source_get_child_process_own(sd_event_source *s) {
+ assert_return(s, -EINVAL);
+ assert_return(s->type == SOURCE_CHILD, -EDOM);
+
+ return s->child.process_owned;
+}
+
+_public_ int sd_event_source_set_child_process_own(sd_event_source *s, int own) {
+ assert_return(s, -EINVAL);
+ assert_return(s->type == SOURCE_CHILD, -EDOM);
+
+ s->child.process_owned = own;
+ return 0;
+}
+
_public_ int sd_event_source_get_inotify_mask(sd_event_source *s, uint32_t *mask) {
assert_return(s, -EINVAL);
assert_return(mask, -EINVAL);
if (s->enabled == SD_EVENT_OFF)
continue;
+ if (s->child.exited)
+ continue;
+
+ if (EVENT_SOURCE_WATCH_PIDFD(s)) /* There's a usable pidfd known for this event source? then don't waitid() for it here */
+ continue;
+
zero(s->child.siginfo);
r = waitid(P_PID, s->child.pid, &s->child.siginfo,
WNOHANG | (s->child.options & WEXITED ? WNOWAIT : 0) | s->child.options);
if (s->child.siginfo.si_pid != 0) {
bool zombie = IN_SET(s->child.siginfo.si_code, CLD_EXITED, CLD_KILLED, CLD_DUMPED);
+ if (zombie)
+ s->child.exited = true;
+
if (!zombie && (s->child.options & WEXITED)) {
/* If the child isn't dead then let's
* immediately remove the state change
return 0;
}
+static int process_pidfd(sd_event *e, sd_event_source *s, uint32_t revents) {
+ assert(e);
+ assert(s);
+ assert(s->type == SOURCE_CHILD);
+
+ if (s->pending)
+ return 0;
+
+ if (s->enabled == SD_EVENT_OFF)
+ return 0;
+
+ if (!EVENT_SOURCE_WATCH_PIDFD(s))
+ return 0;
+
+ zero(s->child.siginfo);
+ if (waitid(P_PID, s->child.pid, &s->child.siginfo, WNOHANG | WNOWAIT | s->child.options) < 0)
+ return -errno;
+
+ if (s->child.siginfo.si_pid == 0)
+ return 0;
+
+ if (IN_SET(s->child.siginfo.si_code, CLD_EXITED, CLD_KILLED, CLD_DUMPED))
+ s->child.exited = true;
+
+ return source_set_pending(s, true);
+}
+
static int process_signal(sd_event *e, struct signal_data *d, uint32_t events) {
bool read_one = false;
int r;
r = s->child.callback(s, &s->child.siginfo, s->userdata);
/* Now, reap the PID for good. */
- if (zombie)
+ if (zombie) {
(void) waitid(P_PID, s->child.pid, &s->child.siginfo, WNOHANG|WEXITED);
+ s->child.waited = true;
+ }
break;
}
assert_return(e->state != SD_EVENT_FINISHED, -ESTALE);
assert_return(e->state == SD_EVENT_INITIAL, -EBUSY);
+ /* Let's check that if we are a default event loop we are executed in the correct thread. We only do
+ * this check here once, since gettid() is typically not cached, and we thus want to minimize
+ * syscalls */
+ assert_return(!e->default_event_ptr || e->tid == gettid(), -EREMOTEIO);
+
if (e->exit_requested)
goto pending;
switch (*t) {
- case WAKEUP_EVENT_SOURCE:
- r = process_io(e, ev_queue[i].data.ptr, ev_queue[i].events);
+ case WAKEUP_EVENT_SOURCE: {
+ sd_event_source *s = ev_queue[i].data.ptr;
+
+ assert(s);
+
+ switch (s->type) {
+
+ case SOURCE_IO:
+ r = process_io(e, s, ev_queue[i].events);
+ break;
+
+ case SOURCE_CHILD:
+ r = process_pidfd(e, s, ev_queue[i].events);
+ break;
+
+ default:
+ assert_not_reached("Unexpected event source type");
+ }
+
break;
+ }
case WAKEUP_CLOCK_DATA: {
struct clock_data *d = ev_queue[i].data.ptr;
+
+ assert(d);
+
r = flush_timer(e, d->fd, ev_queue[i].events, &d->next);
break;
}
} else {
if (e->watchdog_fd >= 0) {
- epoll_ctl(e->epoll_fd, EPOLL_CTL_DEL, e->watchdog_fd, NULL);
+ (void) epoll_ctl(e->epoll_fd, EPOLL_CTL_DEL, e->watchdog_fd, NULL);
e->watchdog_fd = safe_close(e->watchdog_fd);
}
}
#include "fs-util.h"
#include "log.h"
#include "macro.h"
+#include "missing_syscall.h"
#include "parse-util.h"
#include "path-util.h"
#include "process-util.h"
assert_se(s);
assert_se(si);
+ assert_se(si->si_uid == getuid());
+ assert_se(si->si_signo == SIGCHLD);
+ assert_se(si->si_code == CLD_EXITED);
+ assert_se(si->si_status == 78);
+
log_info("got child on %c", PTR_TO_INT(userdata));
assert_se(userdata == INT_TO_PTR('f'));
static int signal_handler(sd_event_source *s, const struct signalfd_siginfo *si, void *userdata) {
sd_event_source *p = NULL;
pid_t pid;
+ siginfo_t plain_si;
assert_se(s);
assert_se(si);
assert_se(userdata == INT_TO_PTR('e'));
- assert_se(sigprocmask_many(SIG_BLOCK, NULL, SIGCHLD, -1) >= 0);
+ assert_se(sigprocmask_many(SIG_BLOCK, NULL, SIGCHLD, SIGUSR2, -1) >= 0);
pid = fork();
assert_se(pid >= 0);
- if (pid == 0)
- _exit(EXIT_SUCCESS);
+ if (pid == 0) {
+ sigset_t ss;
+
+ assert_se(sigemptyset(&ss) >= 0);
+ assert_se(sigaddset(&ss, SIGUSR2) >= 0);
+
+ zero(plain_si);
+ assert_se(sigwaitinfo(&ss, &plain_si) >= 0);
+
+ assert_se(plain_si.si_signo == SIGUSR2);
+ assert_se(plain_si.si_value.sival_int == 4711);
+
+ _exit(78);
+ }
assert_se(sd_event_add_child(sd_event_source_get_event(s), &p, pid, WEXITED, child_handler, INT_TO_PTR('f')) >= 0);
assert_se(sd_event_source_set_enabled(p, SD_EVENT_ONESHOT) >= 0);
+ assert_se(sd_event_source_set_child_process_own(p, true) >= 0);
+
+ /* We can't use structured initialization here, since the structure contains various unions and these
+ * fields lie in overlapping (carefully aligned) unions that LLVM is allergic to allow assignments
+ * to */
+ zero(plain_si);
+ plain_si.si_signo = SIGUSR2;
+ plain_si.si_code = SI_QUEUE;
+ plain_si.si_pid = getpid();
+ plain_si.si_uid = getuid();
+ plain_si.si_value.sival_int = 4711;
+
+ assert_se(sd_event_source_send_child_signal(p, SIGUSR2, &plain_si, 0) >= 0);
sd_event_source_unref(s);
return 1;
}
-static bool do_quit = false;
+static bool do_quit;
static int time_handler(sd_event_source *s, uint64_t usec, void *userdata) {
log_info("got timer on %c", PTR_TO_INT(userdata));
return 2;
}
-static void test_basic(void) {
+static void test_basic(bool with_pidfd) {
sd_event *e = NULL;
sd_event_source *w = NULL, *x = NULL, *y = NULL, *z = NULL, *q = NULL, *t = NULL;
static const char ch = 'x';
uint64_t event_now;
int64_t priority;
+ assert_se(setenv("SYSTEMD_PIDFD", yes_no(with_pidfd), 1) >= 0);
+
assert_se(pipe(a) >= 0);
assert_se(pipe(b) >= 0);
assert_se(pipe(d) >= 0);
assert_se(sd_event_add_io(e, &x, a[0], EPOLLIN, io_handler, INT_TO_PTR('a')) >= 0);
assert_se(sd_event_add_io(e, &y, b[0], EPOLLIN, io_handler, INT_TO_PTR('b')) >= 0);
+
+ do_quit = false;
assert_se(sd_event_add_time(e, &z, CLOCK_MONOTONIC, 0, 0, time_handler, INT_TO_PTR('c')) >= 0);
assert_se(sd_event_add_exit(e, &q, exit_handler, INT_TO_PTR('g')) >= 0);
safe_close_pair(b);
safe_close_pair(d);
safe_close_pair(k);
+
+ assert_se(unsetenv("SYSTEMD_PIDFD") >= 0);
}
static void test_sd_event_now(void) {
sd_event_unref(e);
}
+static int pidfd_handler(sd_event_source *s, const siginfo_t *si, void *userdata) {
+ assert_se(s);
+ assert_se(si);
+
+ assert_se(si->si_uid == getuid());
+ assert_se(si->si_signo == SIGCHLD);
+ assert_se(si->si_code == CLD_EXITED);
+ assert_se(si->si_status == 66);
+
+ log_info("got pidfd on %c", PTR_TO_INT(userdata));
+
+ assert_se(userdata == INT_TO_PTR('p'));
+
+ assert_se(sd_event_exit(sd_event_source_get_event(s), 0) >= 0);
+ sd_event_source_unref(s);
+
+ return 0;
+}
+
+static void test_pidfd(void) {
+ sd_event_source *s = NULL, *t = NULL;
+ sd_event *e = NULL;
+ int pidfd;
+ pid_t pid, pid2;
+
+ assert_se(sigprocmask_many(SIG_BLOCK, NULL, SIGCHLD, -1) >= 0);
+
+ pid = fork();
+ if (pid == 0) {
+ /* child */
+ _exit(66);
+ }
+
+ assert_se(pid > 1);
+
+ pidfd = pidfd_open(pid, 0);
+ if (pidfd < 0) {
+ /* No pidfd_open() supported or blocked? */
+ assert_se(ERRNO_IS_NOT_SUPPORTED(errno) || ERRNO_IS_PRIVILEGE(errno));
+ (void) wait_for_terminate(pid, NULL);
+ return;
+ }
+
+ pid2 = fork();
+ if (pid2 == 0)
+ freeze();
+
+ assert_se(pid2 > 1);
+
+ assert_se(sd_event_default(&e) >= 0);
+ assert_se(sd_event_add_child_pidfd(e, &s, pidfd, WEXITED, pidfd_handler, INT_TO_PTR('p')) >= 0);
+ assert_se(sd_event_source_set_child_pidfd_own(s, true) >= 0);
+
+ /* This one should never trigger, since our second child lives forever */
+ assert_se(sd_event_add_child(e, &t, pid2, WEXITED, pidfd_handler, INT_TO_PTR('q')) >= 0);
+ assert_se(sd_event_source_set_child_process_own(t, true) >= 0);
+
+ assert_se(sd_event_loop(e) >= 0);
+
+ /* Child should still be alive */
+ assert_se(kill(pid2, 0) >= 0);
+
+ t = sd_event_source_unref(t);
+
+ /* Child should now be dead, since we dropped the ref */
+ assert_se(kill(pid2, 0) < 0 && errno == ESRCH);
+
+ sd_event_unref(e);
+}
+
int main(int argc, char *argv[]) {
test_setup_logging(LOG_INFO);
- test_basic();
+ test_basic(true); /* test with pidfd */
+ test_basic(false); /* test without pidfd */
+
test_sd_event_now();
test_rtqueue();
test_inotify(100); /* should work without overflow */
test_inotify(33000); /* should trigger a q overflow */
+ test_pidfd();
+
return 0;
}
assert_return(m, -EINVAL);
assert_return(!m->sealed, -EPERM);
- assert_return(m->n_containers > 0, -EINVAL);
r = add_rtattr(m, type | NLA_F_NESTED, NULL, 0);
if (r < 0)
tc/netem.h
tc/qdisc.c
tc/qdisc.h
+ tc/sfq.c
+ tc/sfq.h
+ tc/tbf.c
+ tc/tbf.h
tc/tc-util.c
tc/tc-util.h
'''.split())
.object_size = sizeof(Bond),
.init = bond_init,
.done = bond_done,
- .sections = "Match\0NetDev\0Bond\0",
+ .sections = NETDEV_COMMON_SECTIONS "Bond\0",
.fill_message_create = netdev_bond_fill_message_create,
.create_type = NETDEV_CREATE_MASTER,
.generate_mac = true,
const NetDevVTable bridge_vtable = {
.object_size = sizeof(Bridge),
.init = bridge_init,
- .sections = "Match\0NetDev\0Bridge\0",
+ .sections = NETDEV_COMMON_SECTIONS "Bridge\0",
.post_create = netdev_bridge_post_create,
.create_type = NETDEV_CREATE_MASTER,
};
const NetDevVTable dummy_vtable = {
.object_size = sizeof(Dummy),
- .sections = "Match\0NetDev\0",
+ .sections = NETDEV_COMMON_SECTIONS,
.create_type = NETDEV_CREATE_INDEPENDENT,
.generate_mac = true,
};
const NetDevVTable foutnl_vtable = {
.object_size = sizeof(FouTunnel),
.init = fou_tunnel_init,
- .sections = "Match\0NetDev\0FooOverUDP\0",
+ .sections = NETDEV_COMMON_SECTIONS "FooOverUDP\0",
.create = netdev_fou_tunnel_create,
.create_type = NETDEV_CREATE_INDEPENDENT,
.config_verify = netdev_fou_tunnel_verify,
const NetDevVTable geneve_vtable = {
.object_size = sizeof(Geneve),
.init = geneve_init,
- .sections = "Match\0NetDev\0GENEVE\0",
+ .sections = NETDEV_COMMON_SECTIONS "GENEVE\0",
.create = netdev_geneve_create,
.create_type = NETDEV_CREATE_INDEPENDENT,
.config_verify = netdev_geneve_verify,
const NetDevVTable ipvlan_vtable = {
.object_size = sizeof(IPVlan),
.init = ipvlan_init,
- .sections = "Match\0NetDev\0IPVLAN\0",
+ .sections = NETDEV_COMMON_SECTIONS "IPVLAN\0",
.fill_message_create = netdev_ipvlan_fill_message_create,
.create_type = NETDEV_CREATE_STACKED,
.generate_mac = true,
const NetDevVTable ipvtap_vtable = {
.object_size = sizeof(IPVlan),
.init = ipvlan_init,
- .sections = "Match\0NetDev\0IPVTAP\0",
+ .sections = NETDEV_COMMON_SECTIONS "IPVTAP\0",
.fill_message_create = netdev_ipvlan_fill_message_create,
.create_type = NETDEV_CREATE_STACKED,
.generate_mac = true,
const NetDevVTable l2tptnl_vtable = {
.object_size = sizeof(L2tpTunnel),
.init = l2tp_tunnel_init,
- .sections = "Match\0NetDev\0L2TP\0L2TPSession\0",
+ .sections = NETDEV_COMMON_SECTIONS "L2TP\0L2TPSession\0",
.create_after_configured = l2tp_create_tunnel,
.done = l2tp_tunnel_done,
.create_type = NETDEV_CREATE_AFTER_CONFIGURED,
const NetDevVTable macsec_vtable = {
.object_size = sizeof(MACsec),
.init = macsec_init,
- .sections = "Match\0NetDev\0MACsec\0MACsecReceiveChannel\0MACsecTransmitAssociation\0MACsecReceiveAssociation\0",
+ .sections = NETDEV_COMMON_SECTIONS "MACsec\0MACsecReceiveChannel\0MACsecTransmitAssociation\0MACsecReceiveAssociation\0",
.fill_message_create = netdev_macsec_fill_message_create,
.post_create = netdev_macsec_configure,
.done = macsec_done,
const NetDevVTable macvtap_vtable = {
.object_size = sizeof(MacVlan),
.init = macvlan_init,
- .sections = "Match\0NetDev\0MACVTAP\0",
+ .sections = NETDEV_COMMON_SECTIONS "MACVTAP\0",
.fill_message_create = netdev_macvlan_fill_message_create,
.create_type = NETDEV_CREATE_STACKED,
.generate_mac = true,
const NetDevVTable macvlan_vtable = {
.object_size = sizeof(MacVlan),
.init = macvlan_init,
- .sections = "Match\0NetDev\0MACVLAN\0",
+ .sections = NETDEV_COMMON_SECTIONS "MACVLAN\0",
.fill_message_create = netdev_macvlan_fill_message_create,
.create_type = NETDEV_CREATE_STACKED,
.generate_mac = true,
dropin_dirname = strjoina(basename(filename), ".d");
r = config_parse_many(filename, NETWORK_DIRS, dropin_dirname,
- "Match\0NetDev\0",
+ NETDEV_COMMON_SECTIONS NETDEV_OTHER_SECTIONS,
config_item_perf_lookup, network_netdev_gperf_lookup,
- CONFIG_PARSE_WARN|CONFIG_PARSE_RELAXED, netdev_raw);
+ CONFIG_PARSE_WARN, netdev_raw);
if (r < 0)
return r;
#include "networkd-link.h"
#include "time-util.h"
+#define NETDEV_COMMON_SECTIONS "Match\0NetDev\0"
+/* This is the list of known sections. We need to ignore them in the initial parsing phase. */
+#define NETDEV_OTHER_SECTIONS \
+ "-Bond\0" \
+ "-Bridge\0" \
+ "-FooOverUDP\0" \
+ "-GENEVE\0" \
+ "-IPVLAN\0" \
+ "-IPVTAP\0" \
+ "-L2TP\0" \
+ "-L2TPSession\0" \
+ "-MACsec\0" \
+ "-MACsecReceiveChannel\0" \
+ "-MACsecTransmitAssociation\0" \
+ "-MACsecReceiveAssociation\0" \
+ "-MACVTAP\0" \
+ "-MACVLAN\0" \
+ "-Tunnel\0" \
+ "-Tun\0" \
+ "-Tap\0" \
+ "-Peer\0" \
+ "-VLAN\0" \
+ "-VRF\0" \
+ "-VXCAN\0" \
+ "-VXLAN\0" \
+ "-WireGuard\0" \
+ "-WireGuardPeer\0" \
+ "-Xfrm\0"
+
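The section lists above are NUL-separated string lists: each entry ends in `\0`, and the implicit terminator of the string literal ends the list with an empty entry. A small walker shows how such a list can be classified. This is a sketch under assumptions: `classify_section()` and the enum are hypothetical names, and the reading that a leading `-` marks a section that is known but skipped in this pass is inferred from the comment above, not from the parser itself.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical result type for the sketch. */
enum section_status { SECTION_UNKNOWN, SECTION_PARSED, SECTION_IGNORED };

/* Walks a NUL-separated list; an empty entry marks the end of the list. */
static enum section_status classify_section(const char *nulstr, const char *name) {
        for (const char *p = nulstr; *p; p += strlen(p) + 1) {
                if (strcmp(p, name) == 0)
                        return SECTION_PARSED;

                /* Assumed convention: "-Name" means "known, but skip it here". */
                if (p[0] == '-' && strcmp(p + 1, name) == 0)
                        return SECTION_IGNORED;
        }

        return SECTION_UNKNOWN;
}
```

With a list built like `NETDEV_COMMON_SECTIONS NETDEV_OTHER_SECTIONS`, `NetDev` would be parsed, `Bond` would be skipped silently, and a misspelled section name would come back as unknown so that a warning can be logged.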
typedef struct netdev_join_callback netdev_join_callback;
struct netdev_join_callback {
const NetDevVTable netdevsim_vtable = {
.object_size = sizeof(NetDevSim),
- .sections = "Match\0NetDev\0",
+ .sections = NETDEV_COMMON_SECTIONS,
.create_type = NETDEV_CREATE_INDEPENDENT,
.generate_mac = true,
};
const NetDevVTable nlmon_vtable = {
.object_size = sizeof(NLMon),
- .sections = "Match\0NetDev\0",
+ .sections = NETDEV_COMMON_SECTIONS,
.create_type = NETDEV_CREATE_INDEPENDENT,
.config_verify = netdev_nlmon_verify,
};
const NetDevVTable ipip_vtable = {
.object_size = sizeof(Tunnel),
.init = ipip_sit_init,
- .sections = "Match\0NetDev\0Tunnel\0",
+ .sections = NETDEV_COMMON_SECTIONS "Tunnel\0",
.fill_message_create = netdev_ipip_sit_fill_message_create,
.create_type = NETDEV_CREATE_STACKED,
.config_verify = netdev_tunnel_verify,
const NetDevVTable sit_vtable = {
.object_size = sizeof(Tunnel),
.init = ipip_sit_init,
- .sections = "Match\0NetDev\0Tunnel\0",
+ .sections = NETDEV_COMMON_SECTIONS "Tunnel\0",
.fill_message_create = netdev_ipip_sit_fill_message_create,
.create_type = NETDEV_CREATE_STACKED,
.config_verify = netdev_tunnel_verify,
const NetDevVTable vti_vtable = {
.object_size = sizeof(Tunnel),
.init = vti_init,
- .sections = "Match\0NetDev\0Tunnel\0",
+ .sections = NETDEV_COMMON_SECTIONS "Tunnel\0",
.fill_message_create = netdev_vti_fill_message_create,
.create_type = NETDEV_CREATE_STACKED,
.config_verify = netdev_tunnel_verify,
const NetDevVTable vti6_vtable = {
.object_size = sizeof(Tunnel),
.init = vti_init,
- .sections = "Match\0NetDev\0Tunnel\0",
+ .sections = NETDEV_COMMON_SECTIONS "Tunnel\0",
.fill_message_create = netdev_vti_fill_message_create,
.create_type = NETDEV_CREATE_STACKED,
.config_verify = netdev_tunnel_verify,
const NetDevVTable gre_vtable = {
.object_size = sizeof(Tunnel),
.init = gre_erspan_init,
- .sections = "Match\0NetDev\0Tunnel\0",
+ .sections = NETDEV_COMMON_SECTIONS "Tunnel\0",
.fill_message_create = netdev_gre_erspan_fill_message_create,
.create_type = NETDEV_CREATE_STACKED,
.config_verify = netdev_tunnel_verify,
const NetDevVTable gretap_vtable = {
.object_size = sizeof(Tunnel),
.init = gre_erspan_init,
- .sections = "Match\0NetDev\0Tunnel\0",
+ .sections = NETDEV_COMMON_SECTIONS "Tunnel\0",
.fill_message_create = netdev_gre_erspan_fill_message_create,
.create_type = NETDEV_CREATE_STACKED,
.config_verify = netdev_tunnel_verify,
const NetDevVTable ip6gre_vtable = {
.object_size = sizeof(Tunnel),
.init = ip6gre_init,
- .sections = "Match\0NetDev\0Tunnel\0",
+ .sections = NETDEV_COMMON_SECTIONS "Tunnel\0",
.fill_message_create = netdev_ip6gre_fill_message_create,
.create_type = NETDEV_CREATE_STACKED,
.config_verify = netdev_tunnel_verify,
const NetDevVTable ip6gretap_vtable = {
.object_size = sizeof(Tunnel),
.init = ip6gre_init,
- .sections = "Match\0NetDev\0Tunnel\0",
+ .sections = NETDEV_COMMON_SECTIONS "Tunnel\0",
.fill_message_create = netdev_ip6gre_fill_message_create,
.create_type = NETDEV_CREATE_STACKED,
.config_verify = netdev_tunnel_verify,
const NetDevVTable ip6tnl_vtable = {
.object_size = sizeof(Tunnel),
.init = ip6tnl_init,
- .sections = "Match\0NetDev\0Tunnel\0",
+ .sections = NETDEV_COMMON_SECTIONS "Tunnel\0",
.fill_message_create = netdev_ip6tnl_fill_message_create,
.create_type = NETDEV_CREATE_STACKED,
.config_verify = netdev_tunnel_verify,
const NetDevVTable erspan_vtable = {
.object_size = sizeof(Tunnel),
.init = gre_erspan_init,
- .sections = "Match\0NetDev\0Tunnel\0",
+ .sections = NETDEV_COMMON_SECTIONS "Tunnel\0",
.fill_message_create = netdev_gre_erspan_fill_message_create,
.create_type = NETDEV_CREATE_STACKED,
.config_verify = netdev_tunnel_verify,
const NetDevVTable tun_vtable = {
.object_size = sizeof(TunTap),
- .sections = "Match\0NetDev\0Tun\0",
+ .sections = NETDEV_COMMON_SECTIONS "Tun\0",
.config_verify = tuntap_verify,
.done = tuntap_done,
.create = netdev_create_tuntap,
const NetDevVTable tap_vtable = {
.object_size = sizeof(TunTap),
- .sections = "Match\0NetDev\0Tap\0",
+ .sections = NETDEV_COMMON_SECTIONS "Tap\0",
.config_verify = tuntap_verify,
.done = tuntap_done,
.create = netdev_create_tuntap,
const NetDevVTable vcan_vtable = {
.object_size = sizeof(VCan),
- .sections = "Match\0NetDev\0",
+ .sections = NETDEV_COMMON_SECTIONS,
.create_type = NETDEV_CREATE_INDEPENDENT,
.generate_mac = true,
};
const NetDevVTable veth_vtable = {
.object_size = sizeof(Veth),
- .sections = "Match\0NetDev\0Peer\0",
+ .sections = NETDEV_COMMON_SECTIONS "Peer\0",
.done = veth_done,
.fill_message_create = netdev_veth_fill_message_create,
.create_type = NETDEV_CREATE_INDEPENDENT,
const NetDevVTable vlan_vtable = {
.object_size = sizeof(VLan),
.init = vlan_init,
- .sections = "Match\0NetDev\0VLAN\0",
+ .sections = NETDEV_COMMON_SECTIONS "VLAN\0",
.fill_message_create = netdev_vlan_fill_message_create,
.create_type = NETDEV_CREATE_STACKED,
.config_verify = netdev_vlan_verify,
const NetDevVTable vrf_vtable = {
.object_size = sizeof(Vrf),
- .sections = "Match\0NetDev\0VRF\0",
+ .sections = NETDEV_COMMON_SECTIONS "VRF\0",
.fill_message_create = netdev_vrf_fill_message_create,
.create_type = NETDEV_CREATE_MASTER,
.generate_mac = true,
const NetDevVTable vxcan_vtable = {
.object_size = sizeof(VxCan),
- .sections = "Match\0NetDev\0VXCAN\0",
+ .sections = NETDEV_COMMON_SECTIONS "VXCAN\0",
.done = vxcan_done,
.fill_message_create = netdev_vxcan_fill_message_create,
.create_type = NETDEV_CREATE_INDEPENDENT,
const NetDevVTable vxlan_vtable = {
.object_size = sizeof(VxLan),
.init = vxlan_init,
- .sections = "Match\0NetDev\0VXLAN\0",
+ .sections = NETDEV_COMMON_SECTIONS "VXLAN\0",
.fill_message_create = netdev_vxlan_fill_message_create,
.create_type = NETDEV_CREATE_STACKED,
.config_verify = netdev_vxlan_verify,
const NetDevVTable wireguard_vtable = {
.object_size = sizeof(Wireguard),
- .sections = "Match\0NetDev\0WireGuard\0WireGuardPeer\0",
+ .sections = NETDEV_COMMON_SECTIONS "WireGuard\0WireGuardPeer\0",
.post_create = netdev_wireguard_post_create,
.init = wireguard_init,
.done = wireguard_done,
const NetDevVTable xfrm_vtable = {
.object_size = sizeof(Xfrm),
- .sections = "Match\0NetDev\0Xfrm\0",
+ .sections = NETDEV_COMMON_SECTIONS "Xfrm\0",
.fill_message_create = xfrm_fill_message_create,
.create_type = NETDEV_CREATE_STACKED
};
}
static int link_configure_qdiscs(Link *link) {
- QDiscs *qdisc;
+ QDisc *qdisc;
Iterator i;
int r;
if (link->qdisc_messages == 0)
link->qdiscs_configured = true;
else
- log_link_debug(link, "Configuring QDiscs");
+ log_link_debug(link, "Configuring queuing discipline (qdisc)");
return 0;
}
CAN.SamplePoint, config_parse_permille, 0, offsetof(Network, can_sample_point)
CAN.RestartSec, config_parse_sec, 0, offsetof(Network, can_restart_us)
CAN.TripleSampling, config_parse_tristate, 0, offsetof(Network, can_triple_sampling)
-TrafficControlQueueingDiscipline.Parent, config_parse_tc_qdiscs_parent, 0, 0
-TrafficControlQueueingDiscipline.NetworkEmulatorDelaySec, config_parse_tc_network_emulator_delay, 0, 0
-TrafficControlQueueingDiscipline.NetworkEmulatorDelayJitterSec, config_parse_tc_network_emulator_delay, 0, 0
-TrafficControlQueueingDiscipline.NetworkEmulatorLossRate, config_parse_tc_network_emulator_rate, 0, 0
-TrafficControlQueueingDiscipline.NetworkEmulatorDuplicateRate, config_parse_tc_network_emulator_rate, 0, 0
-TrafficControlQueueingDiscipline.NetworkEmulatorPacketLimit, config_parse_tc_network_emulator_packet_limit, 0, 0
+TrafficControlQueueingDiscipline.Parent, config_parse_tc_qdiscs_parent, 0, 0
+TrafficControlQueueingDiscipline.NetworkEmulatorDelaySec, config_parse_tc_network_emulator_delay, 0, 0
+TrafficControlQueueingDiscipline.NetworkEmulatorDelayJitterSec, config_parse_tc_network_emulator_delay, 0, 0
+TrafficControlQueueingDiscipline.NetworkEmulatorLossRate, config_parse_tc_network_emulator_rate, 0, 0
+TrafficControlQueueingDiscipline.NetworkEmulatorDuplicateRate, config_parse_tc_network_emulator_rate, 0, 0
+TrafficControlQueueingDiscipline.NetworkEmulatorPacketLimit, config_parse_tc_network_emulator_packet_limit, 0, 0
+TrafficControlQueueingDiscipline.TokenBufferFilterRate, config_parse_tc_token_buffer_filter_size, 0, 0
+TrafficControlQueueingDiscipline.TokenBufferFilterBurst, config_parse_tc_token_buffer_filter_size, 0, 0
+TrafficControlQueueingDiscipline.TokenBufferFilterLatencySec, config_parse_tc_token_buffer_filter_latency, 0, 0
+TrafficControlQueueingDiscipline.StochasticFairnessQueueingPerturbPeriodSec, config_parse_tc_stochastic_fairness_queueing_perturb_period, 0, 0
/* backwards compatibility: do not add new entries to this section */
Network.IPv4LL, config_parse_ipv4ll, 0, offsetof(Network, link_local)
DHCP.ClientIdentifier, config_parse_dhcp_client_identifier, 0, offsetof(Network, dhcp_client_identifier)
Prefix *prefix, *prefix_next;
Route *route, *route_next;
FdbEntry *fdb, *fdb_next;
+ QDisc *qdisc;
+ Iterator i;
assert(network);
assert(network->filename);
if (routing_policy_rule_section_verify(rule) < 0)
routing_policy_rule_free(rule);
+ bool has_root = false, has_clsact = false;
+ ORDERED_HASHMAP_FOREACH(qdisc, network->qdiscs_by_section, i)
+ if (qdisc_section_verify(qdisc, &has_root, &has_clsact) < 0)
+ qdisc_free(qdisc);
+
return 0;
}
* Copyright © 2019 VMware, Inc. */
#include <linux/pkt_sched.h>
-#include <math.h>
#include "alloc-util.h"
#include "conf-parser.h"
-#include "hashmap.h"
-#include "in-addr-util.h"
#include "netem.h"
#include "netlink-util.h"
#include "networkd-manager.h"
#include "qdisc.h"
#include "string-util.h"
#include "tc-util.h"
-#include "util.h"
int network_emulator_new(NetworkEmulator **ret) {
NetworkEmulator *ne = NULL;
return 0;
}
-int network_emulator_fill_message(Link *link, QDiscs *qdisc, sd_netlink_message *req) {
+int network_emulator_fill_message(Link *link, const NetworkEmulator *ne, sd_netlink_message *req) {
struct tc_netem_qopt opt = {
.limit = 1000,
};
int r;
assert(link);
- assert(qdisc);
+ assert(ne);
assert(req);
- if (qdisc->ne.limit > 0)
- opt.limit = qdisc->ne.limit;
+ if (ne->limit > 0)
+ opt.limit = ne->limit;
- if (qdisc->ne.loss > 0)
- opt.loss = qdisc->ne.loss;
+ if (ne->loss > 0)
+ opt.loss = ne->loss;
- if (qdisc->ne.duplicate > 0)
- opt.duplicate = qdisc->ne.duplicate;
+ if (ne->duplicate > 0)
+ opt.duplicate = ne->duplicate;
- if (qdisc->ne.delay != USEC_INFINITY) {
- r = tc_time_to_tick(qdisc->ne.delay, &opt.latency);
+ if (ne->delay != USEC_INFINITY) {
+ r = tc_time_to_tick(ne->delay, &opt.latency);
if (r < 0)
return log_link_error_errno(link, r, "Failed to calculate latency in TCA_OPTION: %m");
}
- if (qdisc->ne.jitter != USEC_INFINITY) {
- r = tc_time_to_tick(qdisc->ne.jitter, &opt.jitter);
+ if (ne->jitter != USEC_INFINITY) {
+ r = tc_time_to_tick(ne->jitter, &opt.jitter);
if (r < 0)
return log_link_error_errno(link, r, "Failed to calculate jitter in TCA_OPTION: %m");
}
void *data,
void *userdata) {
- _cleanup_(qdisc_free_or_set_invalidp) QDiscs *qdisc = NULL;
+ _cleanup_(qdisc_free_or_set_invalidp) QDisc *qdisc = NULL;
Network *network = data;
usec_t u;
int r;
void *data,
void *userdata) {
- _cleanup_(qdisc_free_or_set_invalidp) QDiscs *qdisc = NULL;
+ _cleanup_(qdisc_free_or_set_invalidp) QDisc *qdisc = NULL;
Network *network = data;
uint32_t rate;
int r;
void *data,
void *userdata) {
- _cleanup_(qdisc_free_or_set_invalidp) QDiscs *qdisc = NULL;
+ _cleanup_(qdisc_free_or_set_invalidp) QDisc *qdisc = NULL;
Network *network = data;
int r;
#include "networkd-link.h"
#include "time-util.h"
-typedef struct QDiscs QDiscs;
-
typedef struct NetworkEmulator {
usec_t delay;
usec_t jitter;
} NetworkEmulator;
int network_emulator_new(NetworkEmulator **ret);
-int network_emulator_fill_message(Link *link, QDiscs *qdisc, sd_netlink_message *req);
+int network_emulator_fill_message(Link *link, const NetworkEmulator *ne, sd_netlink_message *req);
CONFIG_PARSER_PROTOTYPE(config_parse_tc_network_emulator_delay);
CONFIG_PARSER_PROTOTYPE(config_parse_tc_network_emulator_rate);
#include "qdisc.h"
#include "set.h"
#include "string-util.h"
-#include "util.h"
-static int qdisc_new(QDiscs **ret) {
- QDiscs *qdisc;
+static int qdisc_new(QDisc **ret) {
+ QDisc *qdisc;
- qdisc = new(QDiscs, 1);
+ qdisc = new(QDisc, 1);
if (!qdisc)
return -ENOMEM;
- *qdisc = (QDiscs) {
+ *qdisc = (QDisc) {
.family = AF_UNSPEC,
.parent = TC_H_ROOT,
};
return 0;
}
-int qdisc_new_static(Network *network, const char *filename, unsigned section_line, QDiscs **ret) {
+int qdisc_new_static(Network *network, const char *filename, unsigned section_line, QDisc **ret) {
_cleanup_(network_config_section_freep) NetworkConfigSection *n = NULL;
- _cleanup_(qdisc_freep) QDiscs *qdisc = NULL;
+ _cleanup_(qdisc_freep) QDisc *qdisc = NULL;
int r;
assert(network);
return 0;
}
-void qdisc_free(QDiscs *qdisc) {
+void qdisc_free(QDisc *qdisc) {
if (!qdisc)
return;
}
if (link->route_messages == 0) {
- log_link_debug(link, "QDiscs configured");
+ log_link_debug(link, "QDisc configured");
link->qdiscs_configured = true;
link_check_ready(link);
}
return 1;
}
-int qdisc_configure(Link *link, QDiscs *qdisc) {
+int qdisc_configure(Link *link, QDisc *qdisc) {
_cleanup_(sd_netlink_message_unrefp) sd_netlink_message *req = NULL;
_cleanup_free_ char *tca_kind = NULL;
int r;
if (r < 0)
return log_oom();
- r = network_emulator_fill_message(link, qdisc, req);
+ r = network_emulator_fill_message(link, &qdisc->ne, req);
+ if (r < 0)
+ return r;
+ }
+
+ if (qdisc->has_token_buffer_filter) {
+ r = free_and_strdup(&tca_kind, "tbf");
+ if (r < 0)
+ return log_oom();
+
+ r = token_buffer_filter_fill_message(link, &qdisc->tbf, req);
+ if (r < 0)
+ return r;
+ }
+
+ if (qdisc->has_stochastic_fairness_queueing) {
+ r = free_and_strdup(&tca_kind, "sfq");
+ if (r < 0)
+ return log_oom();
+
+ r = stochastic_fairness_queueing_fill_message(link, &qdisc->sfq, req);
if (r < 0)
return r;
}
return 0;
}
+int qdisc_section_verify(QDisc *qdisc, bool *has_root, bool *has_clsact) {
+ unsigned i;
+
+ assert(qdisc);
+ assert(has_root);
+ assert(has_clsact);
+
+ if (section_is_invalid(qdisc->section))
+ return -EINVAL;
+
+ i = qdisc->has_network_emulator + qdisc->has_token_buffer_filter + qdisc->has_stochastic_fairness_queueing;
+ if (i > 1)
+ return log_warning_errno(SYNTHETIC_ERRNO(EINVAL),
+ "%s: TrafficControlQueueingDiscipline section has more than one type of discipline. "
+ "Ignoring [TrafficControlQueueingDiscipline] section from line %u.",
+ qdisc->section->filename, qdisc->section->line);
+
+ if (qdisc->parent == TC_H_ROOT) {
+ if (*has_root)
+ return log_warning_errno(SYNTHETIC_ERRNO(EINVAL),
+ "%s: More than one root TrafficControlQueueingDiscipline sections are defined. "
+ "Ignoring [TrafficControlQueueingDiscipline] section from line %u.",
+ qdisc->section->filename, qdisc->section->line);
+ *has_root = true;
+ } else if (qdisc->parent == TC_H_CLSACT) {
+ if (*has_clsact)
+ return log_warning_errno(SYNTHETIC_ERRNO(EINVAL),
+ "%s: More than one clsact TrafficControlQueueingDiscipline sections are defined. "
+ "Ignoring [TrafficControlQueueingDiscipline] section from line %u.",
+ qdisc->section->filename, qdisc->section->line);
+ *has_clsact = true;
+ }
+
+ return 0;
+}
+
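Review note: the checks in qdisc_section_verify() can be sketched in isolation as follows. This is a minimal illustration, not the networkd code: `struct qdisc_sketch`, the `SKETCH_PARENT_*` constants and `qdisc_sketch_verify()` are hypothetical names, and logging is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for TC_H_ROOT, TC_H_CLSACT and any other parent handle. */
enum { SKETCH_PARENT_ROOT, SKETCH_PARENT_CLSACT, SKETCH_PARENT_OTHER };

/* Hypothetical reduced view of QDisc: only the fields the verification looks at. */
struct qdisc_sketch {
        int parent;
        bool has_netem, has_tbf, has_sfq;
};

/* Returns 0 if the section is acceptable, -1 if it should be dropped.
 * has_root / has_clsact carry state across all sections of one .network file. */
int qdisc_sketch_verify(const struct qdisc_sketch *q, bool *has_root, bool *has_clsact) {
        assert(q);
        assert(has_root);
        assert(has_clsact);

        /* A section may configure at most one kind of discipline. */
        if (q->has_netem + q->has_tbf + q->has_sfq > 1)
                return -1;

        if (q->parent == SKETCH_PARENT_ROOT) {
                if (*has_root)
                        return -1; /* a second root qdisc is rejected */
                *has_root = true;
        } else if (q->parent == SKETCH_PARENT_CLSACT) {
                if (*has_clsact)
                        return -1; /* a second clsact qdisc is rejected */
                *has_clsact = true;
        }

        return 0;
}
```

The two flags are owned by the caller so that the first root (or clsact) section wins and later duplicates are dropped, which appears to be the intent of the loop added to the network verification path.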
int config_parse_tc_qdiscs_parent(
const char *unit,
const char *filename,
void *data,
void *userdata) {
- _cleanup_(qdisc_free_or_set_invalidp) QDiscs *qdisc = NULL;
+ _cleanup_(qdisc_free_or_set_invalidp) QDisc *qdisc = NULL;
Network *network = data;
int r;
#include "networkd-link.h"
#include "networkd-network.h"
#include "networkd-util.h"
+#include "sfq.h"
+#include "tbf.h"
-typedef struct QDiscs {
+typedef struct QDisc {
NetworkConfigSection *section;
Network *network;
uint32_t parent;
bool has_network_emulator:1;
+ bool has_token_buffer_filter:1;
+ bool has_stochastic_fairness_queueing:1;
NetworkEmulator ne;
-} QDiscs;
+ TokenBufferFilter tbf;
+ StochasticFairnessQueueing sfq;
+} QDisc;
-void qdisc_free(QDiscs *qdisc);
-int qdisc_new_static(Network *network, const char *filename, unsigned section_line, QDiscs **ret);
+void qdisc_free(QDisc *qdisc);
+int qdisc_new_static(Network *network, const char *filename, unsigned section_line, QDisc **ret);
-int qdisc_configure(Link *link, QDiscs *qdisc);
+int qdisc_configure(Link *link, QDisc *qdisc);
-DEFINE_NETWORK_SECTION_FUNCTIONS(QDiscs, qdisc_free);
+int qdisc_section_verify(QDisc *qdisc, bool *has_root, bool *has_clsact);
+
+DEFINE_NETWORK_SECTION_FUNCTIONS(QDisc, qdisc_free);
CONFIG_PARSER_PROTOTYPE(config_parse_tc_qdiscs_parent);
--- /dev/null
+/* SPDX-License-Identifier: LGPL-2.1+
+ * Copyright © 2019 VMware, Inc. */
+
+#include <linux/pkt_sched.h>
+
+#include "alloc-util.h"
+#include "conf-parser.h"
+#include "netlink-util.h"
+#include "parse-util.h"
+#include "qdisc.h"
+#include "sfq.h"
+#include "string-util.h"
+
+int stochastic_fairness_queueing_new(StochasticFairnessQueueing **ret) {
+ StochasticFairnessQueueing *sfq = NULL;
+
+ sfq = new0(StochasticFairnessQueueing, 1);
+ if (!sfq)
+ return -ENOMEM;
+
+ *ret = TAKE_PTR(sfq);
+
+ return 0;
+}
+
+int stochastic_fairness_queueing_fill_message(Link *link, const StochasticFairnessQueueing *sfq, sd_netlink_message *req) {
+ struct tc_sfq_qopt_v1 opt = {};
+ int r;
+
+ assert(link);
+ assert(sfq);
+ assert(req);
+
+ opt.v0.perturb_period = sfq->perturb_period / USEC_PER_SEC;
+
+ r = sd_netlink_message_append_data(req, TCA_OPTIONS, &opt, sizeof(struct tc_sfq_qopt_v1));
+ if (r < 0)
+ return log_link_error_errno(link, r, "Could not append TCA_OPTIONS attribute: %m");
+
+ return 0;
+}
+
+int config_parse_tc_stochastic_fairness_queueing_perturb_period(
+ const char *unit,
+ const char *filename,
+ unsigned line,
+ const char *section,
+ unsigned section_line,
+ const char *lvalue,
+ int ltype,
+ const char *rvalue,
+ void *data,
+ void *userdata) {
+
+ _cleanup_(qdisc_free_or_set_invalidp) QDisc *qdisc = NULL;
+ Network *network = data;
+ int r;
+
+ assert(filename);
+ assert(lvalue);
+ assert(rvalue);
+ assert(data);
+
+ r = qdisc_new_static(network, filename, section_line, &qdisc);
+ if (r < 0)
+ return r;
+
+ if (isempty(rvalue)) {
+ qdisc->sfq.perturb_period = 0;
+
+ qdisc = NULL;
+ return 0;
+ }
+
+ r = parse_sec(rvalue, &qdisc->sfq.perturb_period);
+ if (r < 0) {
+ log_syntax(unit, LOG_ERR, filename, line, r,
+ "Failed to parse '%s=', ignoring assignment: %s",
+ lvalue, rvalue);
+ return 0;
+ }
+
+ qdisc->has_stochastic_fairness_queueing = true;
+ qdisc = NULL;
+
+ return 0;
+}
--- /dev/null
+/* SPDX-License-Identifier: LGPL-2.1+
+ * Copyright © 2019 VMware, Inc. */
+#pragma once
+
+#include "sd-netlink.h"
+
+#include "conf-parser.h"
+#include "networkd-link.h"
+
+typedef struct StochasticFairnessQueueing {
+ usec_t perturb_period;
+} StochasticFairnessQueueing;
+
+int stochastic_fairness_queueing_new(StochasticFairnessQueueing **ret);
+int stochastic_fairness_queueing_fill_message(Link *link, const StochasticFairnessQueueing *sfq, sd_netlink_message *req);
+
+CONFIG_PARSER_PROTOTYPE(config_parse_tc_stochastic_fairness_queueing_perturb_period);
--- /dev/null
+/* SPDX-License-Identifier: LGPL-2.1+
+ * Copyright © 2019 VMware, Inc. */
+
+#include <linux/pkt_sched.h>
+#include <math.h>
+
+#include "alloc-util.h"
+#include "conf-parser.h"
+#include "netem.h"
+#include "netlink-util.h"
+#include "networkd-manager.h"
+#include "parse-util.h"
+#include "qdisc.h"
+#include "string-util.h"
+#include "util.h"
+
+int token_buffer_filter_new(TokenBufferFilter **ret) {
+ TokenBufferFilter *ne = NULL;
+
+ ne = new0(TokenBufferFilter, 1);
+ if (!ne)
+ return -ENOMEM;
+
+ *ret = TAKE_PTR(ne);
+
+ return 0;
+}
+
+int token_buffer_filter_fill_message(Link *link, const TokenBufferFilter *tbf, sd_netlink_message *req) {
+ struct tc_tbf_qopt opt = {};
+ int r;
+
+ assert(link);
+ assert(tbf);
+ assert(req);
+
+ opt.rate.rate = tbf->rate >= (1ULL << 32) ? ~0U : tbf->rate;
+ opt.limit = tbf->rate * (double) tbf->latency / USEC_PER_SEC + tbf->burst;
+
+ r = sd_netlink_message_open_array(req, TCA_OPTIONS);
+ if (r < 0)
+ return log_link_error_errno(link, r, "Could not open container TCA_OPTIONS: %m");
+
+ r = sd_netlink_message_append_data(req, TCA_TBF_PARMS, &opt, sizeof(struct tc_tbf_qopt));
+ if (r < 0)
+ return log_link_error_errno(link, r, "Could not append TCA_TBF_PARMS attribute: %m");
+
+ r = sd_netlink_message_append_data(req, TCA_TBF_BURST, &tbf->burst, sizeof(tbf->burst));
+ if (r < 0)
+ return log_link_error_errno(link, r, "Could not append TCA_TBF_BURST attribute: %m");
+
+ if (tbf->rate >= (1ULL << 32)) {
+ r = sd_netlink_message_append_data(req, TCA_TBF_RATE64, &tbf->rate, sizeof(tbf->rate));
+ if (r < 0)
+ return log_link_error_errno(link, r, "Could not append TCA_TBF_RATE64 attribute: %m");
+ }
+
+ r = sd_netlink_message_close_container(req);
+ if (r < 0)
+ return log_link_error_errno(link, r, "Could not close container TCA_OPTIONS: %m");
+
+ return 0;
+}
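Review note: the arithmetic in token_buffer_filter_fill_message() and the size parser can be checked with a small standalone sketch. The function names and `SKETCH_USEC_PER_SEC` are hypothetical, and the sketch assumes the rate is given in bits per second and the latency in microseconds, as the surrounding code suggests.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_USEC_PER_SEC 1000000ULL

/* Config values are in bits per second; the kernel wants bytes per second (the parser's k / 8). */
uint64_t tbf_rate_bytes(uint64_t bits_per_sec) {
        return bits_per_sec / 8;
}

/* The legacy 32-bit rate field saturates; the full value then travels in TCA_TBF_RATE64. */
uint32_t tbf_rate32(uint64_t rate) {
        return rate >= (1ULL << 32) ? UINT32_MAX : (uint32_t) rate;
}

/* Queue limit in bytes: what may arrive during the latency window at 'rate', plus the burst bucket. */
uint32_t tbf_limit(uint64_t rate, uint32_t latency_usec, uint32_t burst) {
        return (uint32_t) (rate * (double) latency_usec / SKETCH_USEC_PER_SEC + burst);
}
```

For example, 1 Mbit/s is 125000 bytes/s; with a 50 ms latency and a 5000 byte burst, `tbf_limit(125000, 50000, 5000)` evaluates to 11250.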
+
+int config_parse_tc_token_buffer_filter_size(
+ const char *unit,
+ const char *filename,
+ unsigned line,
+ const char *section,
+ unsigned section_line,
+ const char *lvalue,
+ int ltype,
+ const char *rvalue,
+ void *data,
+ void *userdata) {
+
+ _cleanup_(qdisc_free_or_set_invalidp) QDisc *qdisc = NULL;
+ Network *network = data;
+ uint64_t k;
+ int r;
+
+ assert(filename);
+ assert(lvalue);
+ assert(rvalue);
+ assert(data);
+
+ r = qdisc_new_static(network, filename, section_line, &qdisc);
+ if (r < 0)
+ return r;
+
+ if (isempty(rvalue)) {
+ if (streq(lvalue, "TokenBufferFilterRate"))
+ qdisc->tbf.rate = 0;
+ else if (streq(lvalue, "TokenBufferFilterBurst"))
+ qdisc->tbf.burst = 0;
+
+ qdisc = NULL;
+ return 0;
+ }
+
+ r = parse_size(rvalue, 1000, &k);
+ if (r < 0) {
+ log_syntax(unit, LOG_ERR, filename, line, r,
+ "Failed to parse '%s=', ignoring assignment: %s",
+ lvalue, rvalue);
+ return 0;
+ }
+
+ if (streq(lvalue, "TokenBufferFilterRate"))
+ qdisc->tbf.rate = k / 8;
+ else if (streq(lvalue, "TokenBufferFilterBurst"))
+ qdisc->tbf.burst = k;
+
+ qdisc->has_token_buffer_filter = true;
+ qdisc = NULL;
+
+ return 0;
+}
+
+int config_parse_tc_token_buffer_filter_latency(
+ const char *unit,
+ const char *filename,
+ unsigned line,
+ const char *section,
+ unsigned section_line,
+ const char *lvalue,
+ int ltype,
+ const char *rvalue,
+ void *data,
+ void *userdata) {
+
+ _cleanup_(qdisc_free_or_set_invalidp) QDisc *qdisc = NULL;
+ Network *network = data;
+ usec_t u;
+ int r;
+
+ assert(filename);
+ assert(lvalue);
+ assert(rvalue);
+ assert(data);
+
+ r = qdisc_new_static(network, filename, section_line, &qdisc);
+ if (r < 0)
+ return r;
+
+ if (isempty(rvalue)) {
+ qdisc->tbf.latency = 0;
+
+ qdisc = NULL;
+ return 0;
+ }
+
+ r = parse_sec(rvalue, &u);
+ if (r < 0) {
+ log_syntax(unit, LOG_ERR, filename, line, r,
+ "Failed to parse '%s=', ignoring assignment: %s",
+ lvalue, rvalue);
+ return 0;
+ }
+
+ qdisc->tbf.latency = u;
+
+ qdisc->has_token_buffer_filter = true;
+ qdisc = NULL;
+
+ return 0;
+}
--- /dev/null
+/* SPDX-License-Identifier: LGPL-2.1+
+ * Copyright © 2019 VMware, Inc. */
+#pragma once
+
+#include "sd-netlink.h"
+
+#include "conf-parser.h"
+#include "networkd-link.h"
+
+typedef struct TokenBufferFilter {
+ uint64_t rate;
+
+ uint32_t burst;
+ uint32_t latency;
+} TokenBufferFilter;
+
+int token_buffer_filter_new(TokenBufferFilter **ret);
+int token_buffer_filter_fill_message(Link *link, const TokenBufferFilter *tbf, sd_netlink_message *req);
+
+CONFIG_PARSER_PROTOTYPE(config_parse_tc_token_buffer_filter_latency);
+CONFIG_PARSER_PROTOTYPE(config_parse_tc_token_buffer_filter_size);
goto finish;
}
- r = loop_device_make_by_path(arg_image, arg_read_only ? O_RDONLY : O_RDWR, &loop);
+ r = loop_device_make_by_path(arg_image, arg_read_only ? O_RDONLY : O_RDWR, LO_FLAGS_PARTSCAN, &loop);
if (r < 0) {
log_error_errno(r, "Failed to set up loopback block device: %m");
goto finish;
/* SPDX-License-Identifier: LGPL-2.1+ */
+#include <linux/loop.h>
+
#include "bus-common-errors.h"
#include "bus-error.h"
#include "conf-files.h"
assert(path);
- r = loop_device_make_by_path(path, O_RDONLY, &d);
+ r = loop_device_make_by_path(path, O_RDONLY, LO_FLAGS_PARTSCAN, &d);
if (r == -EISDIR) {
/* We can't turn this into a loop-back block device, and this returns EISDIR? Then this is a directory
* tree and not a raw device. It's easy then. */
resolved-etc-hosts.h
resolved-etc-hosts.c
resolved-dnstls.h
+ resolved-util.c
+ resolved-util.h
'''.split())
resolvectl_sources = files('''
[],
[],
'ENABLE_RESOLVE', 'manual'],
+
+ [['src/resolve/test-resolved-util.c',
+ 'src/resolve/resolved-util.c',
+ 'src/resolve/resolved-util.h'],
+ [],
+ []],
]
#include "parse-util.h"
#include "resolved-conf.h"
#include "resolved-dnssd.h"
+#include "resolved-util.h"
#include "specifier.h"
#include "string-table.h"
#include "string-util.h"
union in_addr_union address;
int family, r, ifindex = 0;
DnsServer *s;
+ _cleanup_free_ char *server_name = NULL;
assert(m);
assert(word);
- r = in_addr_ifindex_from_string_auto(word, &family, &address, &ifindex);
+ r = in_addr_ifindex_name_from_string_auto(word, &family, &address, &ifindex, &server_name);
if (r < 0)
return r;
return 0;
}
- return dns_server_new(m, NULL, type, NULL, family, &address, ifindex);
+ return dns_server_new(m, NULL, type, NULL, family, &address, ifindex, server_name);
}
int manager_parse_dns_server_string_and_warn(Manager *m, DnsServerType type, const char *string) {
Link *l,
int family,
const union in_addr_union *in_addr,
- int ifindex) {
+ int ifindex,
+ const char *server_name) {
+ _cleanup_free_ char *name = NULL;
DnsServer *s;
assert(m);
return -E2BIG;
}
+ if (server_name) {
+ name = strdup(server_name);
+ if (!name)
+ return -ENOMEM;
+ }
+
s = new(DnsServer, 1);
if (!s)
return -ENOMEM;
.family = family,
.address = *in_addr,
.ifindex = ifindex,
+ .server_name = TAKE_PTR(name),
};
dns_server_reset_features(s);
#endif
free(s->server_string);
+ free(s->server_name);
return mfree(s);
}
char *server_string;
+ char *server_name;
+
/* The long-lived stream towards this server. */
DnsStream *stream;
Link *link,
int family,
const union in_addr_union *address,
- int ifindex);
+ int ifindex,
+ const char *server_string);
DnsServer* dns_server_ref(DnsServer *s);
DnsServer* dns_server_unref(DnsServer *s);
gnutls_session_set_verify_cert2(gs, &stream->dnstls_data.validation, 1, 0);
}
+ if (server->server_name) {
+ r = gnutls_server_name_set(gs, GNUTLS_NAME_DNS, server->server_name, strlen(server->server_name));
+ if (r < 0)
+ return log_debug_errno(SYNTHETIC_ERRNO(EINVAL), "Failed to set server name: %s", gnutls_strerror(r));
+ }
+
gnutls_handshake_set_timeout(gs, GNUTLS_DEFAULT_HANDSHAKE_TIMEOUT);
gnutls_transport_set_ptr2(gs, (gnutls_transport_ptr_t) (long) stream->fd, stream);
return -ECONNREFUSED;
}
+ if (server->server_name) {
+ r = SSL_set_tlsext_host_name(s, server->server_name);
+ if (r <= 0) {
+ char errbuf[256];
+
+ error = ERR_get_error();
+ ERR_error_string_n(error, errbuf, sizeof(errbuf));
+ return log_debug_errno(SYNTHETIC_ERRNO(EINVAL), "Failed to set server name: %s", errbuf);
+ }
+ }
+
ERR_clear_error();
stream->dnstls_data.handshake = SSL_do_handshake(s);
if (stream->dnstls_data.handshake <= 0) {
if (s)
dns_server_move_back_and_unmark(s);
else {
- r = dns_server_new(l->manager, NULL, DNS_SERVER_LINK, l, dns[i].family, &dns[i].address, 0);
+ r = dns_server_new(l->manager, NULL, DNS_SERVER_LINK, l, dns[i].family, &dns[i].address, 0, NULL);
if (r < 0)
goto clear;
}
return 0;
}
- return dns_server_new(l->manager, NULL, DNS_SERVER_LINK, l, family, &a, 0);
+ return dns_server_new(l->manager, NULL, DNS_SERVER_LINK, l, family, &a, 0, NULL);
}
static int link_update_dns_servers(Link *l) {
--- /dev/null
+/* SPDX-License-Identifier: LGPL-2.1+ */
+
+#include "alloc-util.h"
+#include "in-addr-util.h"
+#include "macro.h"
+#include "resolved-util.h"
+
+int in_addr_ifindex_name_from_string_auto(const char *s, int *family, union in_addr_union *ret, int *ifindex, char **server_name) {
+ _cleanup_free_ char *buf = NULL, *name = NULL;
+ const char *m;
+ int r;
+
+ assert(s);
+
+ m = strchr(s, '#');
+ if (m) {
+ name = strdup(m+1);
+ if (!name)
+ return -ENOMEM;
+
+ buf = strndup(s, m - s);
+ if (!buf)
+ return -ENOMEM;
+
+ s = buf;
+ }
+
+ r = in_addr_ifindex_from_string_auto(s, family, ret, ifindex);
+ if (r < 0)
+ return r;
+
+ if (server_name)
+ *server_name = TAKE_PTR(name);
+
+ return r;
+}
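Review note: the `address#server_name` split performed above can be sketched without systemd's helpers. This is an illustration only: `copy_n()` and `split_dns_entry()` are hypothetical names, the address part is not validated here (the real function hands it to in_addr_ifindex_from_string_auto()), and plain `free()` replaces the `_cleanup_free_` attribute.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy n bytes of s into a fresh NUL-terminated string. */
char *copy_n(const char *s, size_t n) {
        char *p = malloc(n + 1);
        if (!p)
                return NULL;
        memcpy(p, s, n);
        p[n] = '\0';
        return p;
}

/* Split "address#server_name" at the first '#'.
 * *ret_name is set to NULL when the entry carries no server name.
 * Returns 0 on success, -1 on allocation failure; the caller frees both strings. */
int split_dns_entry(const char *s, char **ret_address, char **ret_name) {
        char *address, *name = NULL;
        const char *m;

        assert(s);
        assert(ret_address);
        assert(ret_name);

        m = strchr(s, '#');
        if (m) {
                name = copy_n(m + 1, strlen(m + 1));
                if (!name)
                        return -1;
                address = copy_n(s, (size_t) (m - s));
        } else
                address = copy_n(s, strlen(s));

        if (!address) {
                free(name);
                return -1;
        }

        *ret_address = address;
        *ret_name = name;
        return 0;
}
```

Because the split happens at the first `#`, an interface suffix such as `%19` stays with the address part, which matches the `fe80::18%19#another.test.com` case in the test file.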
--- /dev/null
+/* SPDX-License-Identifier: LGPL-2.1+ */
+#pragma once
+
+#include "in-addr-util.h"
+
+int in_addr_ifindex_name_from_string_auto(const char *s, int *family, union in_addr_union *ret, int *ifindex, char **server_name);
--- /dev/null
+/* SPDX-License-Identifier: LGPL-2.1+ */
+
+#include "log.h"
+#include "resolved-util.h"
+#include "string-util.h"
+#include "tests.h"
+
+
+static void test_in_addr_ifindex_name_from_string_auto_one(const char *a, const char *expected) {
+ int family, ifindex;
+ union in_addr_union ua;
+ _cleanup_free_ char *server_name = NULL;
+
+ assert_se(in_addr_ifindex_name_from_string_auto(a, &family, &ua, &ifindex, &server_name) >= 0);
+ assert_se(streq_ptr(server_name, expected));
+}
+
+static void test_in_addr_ifindex_name_from_string_auto(void) {
+ log_info("/* %s */", __func__);
+
+ test_in_addr_ifindex_name_from_string_auto_one("192.168.0.1", NULL);
+ test_in_addr_ifindex_name_from_string_auto_one("192.168.0.1#test.com", "test.com");
+ test_in_addr_ifindex_name_from_string_auto_one("fe80::18%19", NULL);
+ test_in_addr_ifindex_name_from_string_auto_one("fe80::18%19#another.test.com", "another.test.com");
+}
+
+int main(int argc, char **argv) {
+ test_setup_logging(LOG_DEBUG);
+
+ test_in_addr_ifindex_name_from_string_auto();
+ return 0;
+}
assert(entry);
free(entry->id);
+ free(entry->id_old);
free(entry->path);
free(entry->root);
free(entry->title);
return log_error_errno(SYNTHETIC_ERRNO(EINVAL), "Invalid loader entry file suffix: %s", path);
b = basename(path);
- tmp.id = strndup(b, c - b);
- if (!tmp.id)
+ tmp.id = strdup(b);
+ tmp.id_old = strndup(b, c - b);
+ if (!tmp.id || !tmp.id_old)
return log_oom();
if (!efi_loader_entry_name_valid(tmp.id))
- return log_error_errno(SYNTHETIC_ERRNO(EINVAL), "Invalid loader entry filename: %s", path);
+ return log_error_errno(SYNTHETIC_ERRNO(EINVAL), "Invalid loader entry: %s", tmp.id);
tmp.path = strdup(path);
if (!tmp.path)
};
_cleanup_fclose_ FILE *f = NULL;
const char *k;
+ char *b;
int r;
assert(root);
if (!os_pretty_name || !os_id || !(version_id || build_id))
return log_error_errno(SYNTHETIC_ERRNO(EBADMSG), "Missing fields in os-release data from unified kernel image %s, refusing.", path);
- tmp.id = strjoin(os_id, "-", version_id ?: build_id);
- if (!tmp.id)
+ b = basename(path);
+ tmp.id = strdup(b);
+ tmp.id_old = strjoin(os_id, "-", version_id ?: build_id);
+ if (!tmp.id || !tmp.id_old)
return log_oom();
if (!efi_loader_entry_name_valid(tmp.id))
typedef struct BootEntry {
BootEntryType type;
char *id; /* This is the file basename without extension */
+ char *id_old; /* Old-style ID, for deduplication purposes. */
char *path; /* This is the full path to the drop-in file */
char *root; /* The root path in which the drop-in was found, i.e. to which 'kernel', 'efi' and 'initrd' are relative */
char *title;
static inline bool boot_config_has_entry(BootConfig *config, const char *id) {
size_t j;
- for (j = 0; j < config->n_entries; j++)
- if (streq(config->entries[j].id, id))
+ for (j = 0; j < config->n_entries; j++) {
+ const char* entry_id_old = config->entries[j].id_old;
+ if (streq(config->entries[j].id, id) ||
+ (entry_id_old && streq(entry_id_old, id)))
return true;
+ }
return false;
}
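Review note: the lookup above now accepts either the new file-name based ID or the old-style ID. A reduced sketch of that matching rule, with a hypothetical `struct entry_ids` standing in for BootEntry and `strcmp()` in place of `streq()`:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reduced BootEntry: only the two identifiers used for matching. */
struct entry_ids {
        const char *id;     /* new-style ID, always set */
        const char *id_old; /* old-style ID, may be NULL */
};

/* True if 'id' matches the new-style or the old-style ID of any entry. */
bool has_entry(const struct entry_ids *entries, size_t n, const char *id) {
        assert(entries || n == 0);
        assert(id);

        for (size_t j = 0; j < n; j++) {
                if (strcmp(entries[j].id, id) == 0)
                        return true;
                if (entries[j].id_old && strcmp(entries[j].id_old, id) == 0)
                        return true;
        }

        return false;
}
```

Matching on both identifiers is what lets entries reported under the old naming scheme be recognized as duplicates of entries found on disk.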
return -ENOMEM;
if (sections && !nulstr_contains(sections, n)) {
+ bool ignore = flags & CONFIG_PARSE_RELAXED;
+ const char *t;
- if (!(flags & CONFIG_PARSE_RELAXED) && !startswith(n, "X-"))
+ ignore = ignore || startswith(n, "X-");
+
+ if (!ignore)
+ NULSTR_FOREACH(t, sections)
+ if (streq_ptr(n, startswith(t, "-"))) {
+ ignore = true;
+ break;
+ }
+
+ if (!ignore)
log_syntax(unit, LOG_WARNING, filename, line, 0, "Unknown section '%s'. Ignoring.", n);
free(n);
_cleanup_free_ char *section = NULL, *continuation = NULL;
_cleanup_fclose_ FILE *ours = NULL;
unsigned line = 0, section_line = 0;
- bool section_ignored = false;
+ bool section_ignored = false, bom_seen = false;
int r;
assert(filename);
continue;
l = buf;
- if (!(flags & CONFIG_PARSE_REFUSE_BOM)) {
+ if (!bom_seen) {
char *q;
q = startswith(buf, UTF8_BYTE_ORDER_MARK);
if (q) {
l = q;
- flags |= CONFIG_PARSE_REFUSE_BOM;
+ bom_seen = true;
}
}
/* An abstract parser for simple, line based, shallow configuration files consisting of variable assignments only. */
typedef enum ConfigParseFlags {
- CONFIG_PARSE_RELAXED = 1 << 0,
- CONFIG_PARSE_ALLOW_INCLUDE = 1 << 1,
- CONFIG_PARSE_WARN = 1 << 2,
- CONFIG_PARSE_REFUSE_BOM = 1 << 3,
+ CONFIG_PARSE_RELAXED = 1 << 0, /* Do not warn about unknown non-extension fields */
+ CONFIG_PARSE_ALLOW_INCLUDE = 1 << 1, /* Allow the deprecated .include stanza */
+ CONFIG_PARSE_WARN = 1 << 2, /* Emit non-debug messages */
} ConfigParseFlags;
/* Argument list for parsers of specific configuration settings. */
/* SPDX-License-Identifier: LGPL-2.1+ */
+#if HAVE_VALGRIND_MEMCHECK_H
+#include <valgrind/memcheck.h>
+#endif
+
#include <linux/dm-ioctl.h>
#include <linux/loop.h>
#include <sys/mount.h>
* an explicit recognizable error about this, so that callers can generate a
* proper message explaining the situation. */
- if (ioctl(fd, LOOP_GET_STATUS64, &info) >= 0 && (info.lo_flags & LO_FLAGS_PARTSCAN) == 0) {
- log_debug("Device is a loop device and partition scanning is off!");
- return -EPROTONOSUPPORT;
+ if (ioctl(fd, LOOP_GET_STATUS64, &info) >= 0) {
+#if HAVE_VALGRIND_MEMCHECK_H
+ /* Valgrind currently doesn't know LOOP_GET_STATUS64. Remove this once it does */
+ VALGRIND_MAKE_MEM_DEFINED(&info, sizeof(info));
+#endif
+
+ if ((info.lo_flags & LO_FLAGS_PARTSCAN) == 0)
+ return log_debug_errno(EPROTONOSUPPORT,
+ "Device is a loop device and partition scanning is off!");
}
}
if (r != -EBUSY)
assert(c);
r = config_parse(info->name, path, f,
- NULL,
+ "Install\0"
+ "-Unit\0"
+ "-Automount\0"
+ "-Device\0"
+ "-Mount\0"
+ "-Path\0"
+ "-Scope\0"
+ "-Service\0"
+ "-Slice\0"
+ "-Socket\0"
+ "-Swap\0"
+ "-Target\0"
+ "-Timer\0",
config_item_table_lookup, items,
- CONFIG_PARSE_RELAXED|CONFIG_PARSE_ALLOW_INCLUDE, info);
+ CONFIG_PARSE_ALLOW_INCLUDE, info);
if (r < 0)
return log_debug_errno(r, "Failed to parse %s: %m", info->name);
/* SPDX-License-Identifier: LGPL-2.1+ */
+#if HAVE_VALGRIND_MEMCHECK_H
+#include <valgrind/memcheck.h>
+#endif
+
#include <errno.h>
#include <fcntl.h>
+#include <linux/blkpg.h>
+#include <linux/fs.h>
#include <linux/loop.h>
+#include <sys/file.h>
#include <sys/ioctl.h>
#include "alloc-util.h"
#include "fd-util.h"
+#include "fileio.h"
#include "loop-util.h"
+#include "parse-util.h"
#include "stat-util.h"
+#include "stdio-util.h"
-int loop_device_make(int fd, int open_flags, LoopDevice **ret) {
- const struct loop_info64 info = {
- .lo_flags = LO_FLAGS_AUTOCLEAR|LO_FLAGS_PARTSCAN|(open_flags == O_RDONLY ? LO_FLAGS_READ_ONLY : 0),
- };
+int loop_device_make_full(
+ int fd,
+ int open_flags,
+ uint64_t offset,
+ uint64_t size,
+ uint32_t loop_flags,
+ LoopDevice **ret) {
_cleanup_close_ int control = -1, loop = -1;
_cleanup_free_ char *loopdev = NULL;
unsigned n_attempts = 0;
+ struct loop_info64 info;
+ LoopDevice *d = NULL;
struct stat st;
- LoopDevice *d;
- int nr, r;
+ int nr = -1, r;
assert(fd >= 0);
assert(ret);
return -errno;
if (S_ISBLK(st.st_mode)) {
- int copy;
+ if (ioctl(fd, LOOP_GET_STATUS64, &info) >= 0) {
+ /* Oh! This is a loopback device? That's interesting! */
- /* If this is already a block device, store a copy of the fd as it is */
+#if HAVE_VALGRIND_MEMCHECK_H
+ /* Valgrind currently doesn't know LOOP_GET_STATUS64. Remove this once it does */
+ VALGRIND_MAKE_MEM_DEFINED(&info, sizeof(info));
+#endif
+ nr = info.lo_number;
- copy = fcntl(fd, F_DUPFD_CLOEXEC, 3);
- if (copy < 0)
- return -errno;
+ if (asprintf(&loopdev, "/dev/loop%i", nr) < 0)
+ return -ENOMEM;
+ }
- d = new0(LoopDevice, 1);
- if (!d)
- return -ENOMEM;
+ if (offset == 0 && IN_SET(size, 0, UINT64_MAX)) {
+ int copy;
- *d = (LoopDevice) {
- .fd = copy,
- .nr = -1,
- .relinquished = true, /* It's not allocated by us, don't destroy it when this object is freed */
- };
+ /* If this is already a block device, store a copy of the fd as it is */
- *ret = d;
- return d->fd;
- }
+ copy = fcntl(fd, F_DUPFD_CLOEXEC, 3);
+ if (copy < 0)
+ return -errno;
- r = stat_verify_regular(&st);
- if (r < 0)
- return r;
+ d = new(LoopDevice, 1);
+ if (!d)
+ return -ENOMEM;
+
+ *d = (LoopDevice) {
+ .fd = copy,
+ .nr = nr,
+ .node = TAKE_PTR(loopdev),
+ .relinquished = true, /* It's not allocated by us, don't destroy it when this object is freed */
+ };
+
+ *ret = d;
+ return d->fd;
+ }
+ } else {
+ r = stat_verify_regular(&st);
+ if (r < 0)
+ return r;
+ }
control = open("/dev/loop-control", O_RDWR|O_CLOEXEC|O_NOCTTY|O_NONBLOCK);
if (control < 0)
loop = safe_close(loop);
}
- if (ioctl(loop, LOOP_SET_STATUS64, &info) < 0)
- return -errno;
+ info = (struct loop_info64) {
+ /* Use the specified flags, but configure the read-only flag from the open flags, and force autoclear */
+ .lo_flags = (loop_flags & ~LO_FLAGS_READ_ONLY) | ((open_flags & O_ACCMODE) == O_RDONLY ? LO_FLAGS_READ_ONLY : 0) | LO_FLAGS_AUTOCLEAR,
+ .lo_offset = offset,
+ .lo_sizelimit = size == UINT64_MAX ? 0 : size,
+ };
+
+ if (ioctl(loop, LOOP_SET_STATUS64, &info) < 0) {
+ r = -errno;
+ goto fail;
+ }
d = new(LoopDevice, 1);
- if (!d)
- return -ENOMEM;
+ if (!d) {
+ r = -ENOMEM;
+ goto fail;
+ }
*d = (LoopDevice) {
.fd = TAKE_FD(loop),
*ret = d;
return d->fd;
+
+fail:
+ if (loop >= 0)
+ (void) ioctl(loop, LOOP_CLR_FD);
+ if (d && d->fd >= 0)
+ (void) ioctl(d->fd, LOOP_CLR_FD);
+
+ return r;
}
-int loop_device_make_by_path(const char *path, int open_flags, LoopDevice **ret) {
+int loop_device_make_by_path(const char *path, int open_flags, uint32_t loop_flags, LoopDevice **ret) {
_cleanup_close_ int fd = -1;
assert(path);
if (fd < 0)
return -errno;
- return loop_device_make(fd, open_flags, ret);
+ return loop_device_make(fd, open_flags, loop_flags, ret);
}
LoopDevice* loop_device_unref(LoopDevice *d) {
d->relinquished = true;
}
+
+int loop_device_open(const char *loop_path, int open_flags, LoopDevice **ret) {
+ _cleanup_close_ int loop_fd = -1;
+ _cleanup_free_ char *p = NULL;
+ struct loop_info64 info;
+ struct stat st;
+ LoopDevice *d;
+ int nr;
+
+ assert(loop_path);
+ assert(ret);
+
+ loop_fd = open(loop_path, O_CLOEXEC|O_NONBLOCK|O_NOCTTY|open_flags);
+ if (loop_fd < 0)
+ return -errno;
+
+ if (fstat(loop_fd, &st) < 0)
+ return -errno;
+ if (!S_ISBLK(st.st_mode))
+ return -ENOTBLK;
+
+ if (ioctl(loop_fd, LOOP_GET_STATUS64, &info) >= 0) {
+#if HAVE_VALGRIND_MEMCHECK_H
+ /* Valgrind currently doesn't know LOOP_GET_STATUS64. Remove this once it does */
+ VALGRIND_MAKE_MEM_DEFINED(&info, sizeof(info));
+#endif
+ nr = info.lo_number;
+ } else
+ nr = -1;
+
+ p = strdup(loop_path);
+ if (!p)
+ return -ENOMEM;
+
+ d = new(LoopDevice, 1);
+ if (!d)
+ return -ENOMEM;
+
+ *d = (LoopDevice) {
+ .fd = TAKE_FD(loop_fd),
+ .nr = nr,
+ .node = TAKE_PTR(p),
+ .relinquished = true, /* It's not ours, don't try to destroy it when this object is freed */
+ };
+
+ *ret = d;
+ return d->fd;
+}
+
+static int resize_partition(int partition_fd, uint64_t offset, uint64_t size) {
+ char sysfs[STRLEN("/sys/dev/block/:/partition") + 2*DECIMAL_STR_MAX(dev_t) + 1];
+ _cleanup_free_ char *whole = NULL, *buffer = NULL;
+ uint64_t current_offset, current_size, partno;
+ _cleanup_close_ int whole_fd = -1;
+ struct stat st;
+ dev_t devno;
+ int r;
+
+ assert(partition_fd >= 0);
+
+ /* Resizes the partition the loopback device refers to (assuming it refers to one instead of an actual
+ * loopback device), and changes the offset, if needed. This is a fancy wrapper around
+ * BLKPG_RESIZE_PARTITION. */
+
+ if (fstat(partition_fd, &st) < 0)
+ return -errno;
+
+ assert(S_ISBLK(st.st_mode));
+
+ xsprintf(sysfs, "/sys/dev/block/%u:%u/partition", major(st.st_rdev), minor(st.st_rdev));
+ r = read_one_line_file(sysfs, &buffer);
+ if (r == -ENOENT) /* not a partition, cannot resize */
+ return -ENOTTY;
+ if (r < 0)
+ return r;
+ r = safe_atou64(buffer, &partno);
+ if (r < 0)
+ return r;
+
+ xsprintf(sysfs, "/sys/dev/block/%u:%u/start", major(st.st_rdev), minor(st.st_rdev));
+
+ buffer = mfree(buffer);
+ r = read_one_line_file(sysfs, &buffer);
+ if (r < 0)
+ return r;
+ r = safe_atou64(buffer, &current_offset);
+ if (r < 0)
+ return r;
+ if (current_offset > UINT64_MAX/512U)
+ return -EINVAL;
+ current_offset *= 512U;
+
+ if (ioctl(partition_fd, BLKGETSIZE64, &current_size) < 0)
+ return -EINVAL;
+
+ if (size == UINT64_MAX && offset == UINT64_MAX)
+ return 0;
+ if (current_size == size && current_offset == offset)
+ return 0;
+
+ xsprintf(sysfs, "/sys/dev/block/%u:%u/../dev", major(st.st_rdev), minor(st.st_rdev));
+
+ buffer = mfree(buffer);
+ r = read_one_line_file(sysfs, &buffer);
+ if (r < 0)
+ return r;
+ r = parse_dev(buffer, &devno);
+ if (r < 0)
+ return r;
+
+ r = device_path_make_major_minor(S_IFBLK, devno, &whole);
+ if (r < 0)
+ return r;
+
+ whole_fd = open(whole, O_RDWR|O_CLOEXEC|O_NONBLOCK|O_NOCTTY);
+ if (whole_fd < 0)
+ return -errno;
+
+ struct blkpg_partition bp = {
+ .pno = partno,
+ .start = offset == UINT64_MAX ? current_offset : offset,
+ .length = size == UINT64_MAX ? current_size : size,
+ };
+
+ struct blkpg_ioctl_arg ba = {
+ .op = BLKPG_RESIZE_PARTITION,
+ .data = &bp,
+ .datalen = sizeof(bp),
+ };
+
+ if (ioctl(whole_fd, BLKPG, &ba) < 0)
+ return -errno;
+
+ return 0;
+}
+
+int loop_device_refresh_size(LoopDevice *d, uint64_t offset, uint64_t size) {
+ struct loop_info64 info;
+ assert(d);
+
+ /* Changes the offset/start of the loop device relative to the beginning of the underlying file or
+ * block device. If this loop device actually refers to a partition and not a loopback device, we'll
+ * try to adjust the partition offsets instead.
+ *
+ * If either offset or size is UINT64_MAX we won't change that parameter. */
+
+ if (d->fd < 0)
+ return -EBADF;
+
+ if (d->nr < 0) /* not a loopback device */
+ return resize_partition(d->fd, offset, size);
+
+ if (ioctl(d->fd, LOOP_GET_STATUS64, &info) < 0)
+ return -errno;
+
+#if HAVE_VALGRIND_MEMCHECK_H
+ /* Valgrind currently doesn't know LOOP_GET_STATUS64. Remove this once it does */
+ VALGRIND_MAKE_MEM_DEFINED(&info, sizeof(info));
+#endif
+
+ if (size == UINT64_MAX && offset == UINT64_MAX)
+ return 0;
+ if (info.lo_sizelimit == size && info.lo_offset == offset)
+ return 0;
+
+ if (size != UINT64_MAX)
+ info.lo_sizelimit = size;
+ if (offset != UINT64_MAX)
+ info.lo_offset = offset;
+
+ if (ioctl(d->fd, LOOP_SET_STATUS64, &info) < 0)
+ return -errno;
+
+ return 0;
+}
+
+int loop_device_flock(LoopDevice *d, int operation) {
+ assert(d);
+
+ if (d->fd < 0)
+ return -EBADF;
+
+ if (flock(d->fd, operation) < 0)
+ return -errno;
+
+ return 0;
+}
bool relinquished;
};
-int loop_device_make(int fd, int open_flags, LoopDevice **ret);
-int loop_device_make_by_path(const char *path, int open_flags, LoopDevice **ret);
+int loop_device_make_full(int fd, int open_flags, uint64_t offset, uint64_t size, uint32_t loop_flags, LoopDevice **ret);
+static inline int loop_device_make(int fd, int open_flags, uint32_t loop_flags, LoopDevice **ret) {
+ return loop_device_make_full(fd, open_flags, 0, 0, loop_flags, ret);
+}
+
+int loop_device_make_by_path(const char *path, int open_flags, uint32_t loop_flags, LoopDevice **ret);
+int loop_device_open(const char *loop_path, int open_flags, LoopDevice **ret);
LoopDevice* loop_device_unref(LoopDevice *d);
DEFINE_TRIVIAL_CLEANUP_FUNC(LoopDevice*, loop_device_unref);
void loop_device_relinquish(LoopDevice *d);
+
+int loop_device_refresh_size(LoopDevice *d, uint64_t offset, uint64_t size);
+
+int loop_device_flock(LoopDevice *d, int operation);
#include <errno.h>
#include <fcntl.h>
+#include <linux/fs.h>
+#include <linux/loop.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/file.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <unistd.h>
-#include <linux/fs.h>
#include "alloc-util.h"
#include "btrfs-util.h"
_cleanup_(loop_device_unrefp) LoopDevice *d = NULL;
_cleanup_(dissected_image_unrefp) DissectedImage *m = NULL;
- r = loop_device_make_by_path(i->path, O_RDONLY, &d);
+ r = loop_device_make_by_path(i->path, O_RDONLY, LO_FLAGS_PARTSCAN, &d);
if (r < 0)
return r;
#include <stdlib.h>
+#include "sd-daemon.h"
+
#include "pager.h"
#include "selinux-util.h"
#include "spawn-ask-password-agent.h"
save_argc_argv(argc, argv); \
intro; \
r = impl; \
+ if (r < 0) \
+ (void) sd_notifyf(0, "ERRNO=%i", -r); \
ask_password_agent_close(); \
polkit_agent_close(); \
pager_close(); \
#include <sys/inotify.h>
#include <sys/signalfd.h>
#include <sys/types.h>
+#include <sys/wait.h>
#include <time.h>
#include "_sd-common.h"
int sd_event_add_time(sd_event *e, sd_event_source **s, clockid_t clock, uint64_t usec, uint64_t accuracy, sd_event_time_handler_t callback, void *userdata);
int sd_event_add_signal(sd_event *e, sd_event_source **s, int sig, sd_event_signal_handler_t callback, void *userdata);
int sd_event_add_child(sd_event *e, sd_event_source **s, pid_t pid, int options, sd_event_child_handler_t callback, void *userdata);
+int sd_event_add_child_pidfd(sd_event *e, sd_event_source **s, int pidfd, int options, sd_event_child_handler_t callback, void *userdata);
int sd_event_add_inotify(sd_event *e, sd_event_source **s, const char *path, uint32_t mask, sd_event_inotify_handler_t callback, void *userdata);
int sd_event_add_defer(sd_event *e, sd_event_source **s, sd_event_handler_t callback, void *userdata);
int sd_event_add_post(sd_event *e, sd_event_source **s, sd_event_handler_t callback, void *userdata);
int sd_event_source_get_time_clock(sd_event_source *s, clockid_t *clock);
int sd_event_source_get_signal(sd_event_source *s);
int sd_event_source_get_child_pid(sd_event_source *s, pid_t *pid);
+int sd_event_source_get_child_pidfd(sd_event_source *s);
+int sd_event_source_get_child_pidfd_own(sd_event_source *s);
+int sd_event_source_set_child_pidfd_own(sd_event_source *s, int own);
+int sd_event_source_get_child_process_own(sd_event_source *s);
+int sd_event_source_set_child_process_own(sd_event_source *s, int own);
+#if defined _GNU_SOURCE || (defined _POSIX_C_SOURCE && _POSIX_C_SOURCE >= 199309L)
+int sd_event_source_send_child_signal(sd_event_source *s, int sig, const siginfo_t *si, unsigned flags);
+#else
+int sd_event_source_send_child_signal(sd_event_source *s, int sig, const void *si, unsigned flags);
+#endif
int sd_event_source_get_inotify_mask(sd_event_source *s, uint32_t *ret);
int sd_event_source_set_destroy_callback(sd_event_source *s, sd_event_destroy_t callback);
int sd_event_source_get_destroy_callback(sd_event_source *s, sd_event_destroy_t *ret);
"[Section]\n"
"setting1=" /* many continuation lines, together above the limit */
x1000(x1000("x") x10("abcde") "\\\n") "xxx",
+
+ "[Section]\n"
+ "setting1=2\n"
+ "[NoWarnSection]\n"
+ "setting1=3\n"
+ "[WarnSection]\n"
+ "setting1=3\n"
+ "[X-Section]\n"
+ "setting1=3\n",
};
static void test_config_parse(unsigned i, const char *s) {
const char *sections,
ConfigItemLookup lookup,
const void *table,
- bool relaxed,
- bool allow_include,
- bool warn,
+ ConfigParseFlags flags,
void *userdata)
*/
r = config_parse(NULL, name, f,
- "Section\0",
+ "Section\0-NoWarnSection\0",
config_item_table_lookup, items,
CONFIG_PARSE_WARN, NULL);
assert_se(r == -ENOBUFS);
assert_se(setting1 == NULL);
break;
+
+ case 17:
+ assert_se(r == 0);
+ assert_se(streq(setting1, "2"));
+ break;
}
}
/* SPDX-License-Identifier: LGPL-2.1+ */
#include <fcntl.h>
+#include <linux/loop.h>
#include <stdio.h>
#include "dissect-image.h"
return EXIT_FAILURE;
}
- r = loop_device_make_by_path(argv[1], O_RDONLY, &d);
+ r = loop_device_make_by_path(argv[1], O_RDONLY, LO_FLAGS_PARTSCAN, &d);
if (r < 0) {
log_error_errno(r, "Failed to set up loopback device: %m");
return EXIT_FAILURE;
#include "utf8.h"
#include "util.h"
+static void test_string_erase(void) {
+ char *x;
+
+ x = strdupa("");
+ assert_se(streq(string_erase(x), ""));
+
+ x = strdupa("1");
+ assert_se(streq(string_erase(x), ""));
+
+ x = strdupa("123456789");
+ assert_se(streq(string_erase(x), ""));
+
+ assert_se(x[1] == '\0');
+ assert_se(x[2] == '\0');
+ assert_se(x[3] == '\0');
+ assert_se(x[4] == '\0');
+ assert_se(x[5] == '\0');
+ assert_se(x[6] == '\0');
+ assert_se(x[7] == '\0');
+ assert_se(x[8] == '\0');
+ assert_se(x[9] == '\0');
+}
+
static void test_free_and_strndup_one(char **t, const char *src, size_t l, const char *expected, bool change) {
int r;
int main(int argc, char *argv[]) {
test_setup_logging(LOG_DEBUG);
+ test_string_erase();
test_free_and_strndup();
test_ascii_strcasecmp_n();
test_ascii_strcasecmp_nn();
NetworkEmulatorLossRate=
NetworkEmulatorDuplicateRate=
NetworkEmulatorPacketLimit=
+TokenBufferFilterRate=
+TokenBufferFilterBurst=
+TokenBufferFilterLatencySec=
+StochasticFairnessQueueingPerturbPeriodSec=
--- /dev/null
+[Match]
+Name=test1
+
+[Network]
+IPv6AcceptRA=no
+Address=10.1.2.4/16
+
+[TrafficControlQueueingDiscipline]
+Parent=root
+TokenBufferFilterRate=0.5M
+TokenBufferFilterBurst=5K
+TokenBufferFilterLatencySec=70msec
+
+[TrafficControlQueueingDiscipline]
+Parent=clsact
+StochasticFairnessQueueingPerturbPeriodSec=5sec
'25-neighbor-ip-dummy.network',
'25-neighbor-ip.network',
'25-nexthop.network',
- '25-qdisc.network',
+ '25-qdisc-netem.network',
+ '25-qdisc-tbf-and-sfq.network',
'25-route-ipv6-src.network',
'25-route-static.network',
'25-gateway-static.network',
self.assertRegex(output, '192.168.5.1')
def test_qdisc(self):
- copy_unit_to_networkd_unit_path('25-qdisc.network', '12-dummy.netdev')
+ copy_unit_to_networkd_unit_path('25-qdisc-netem.network', '12-dummy.netdev',
+ '25-qdisc-tbf-and-sfq.network', '11-dummy.netdev')
start_networkd()
- self.wait_online(['dummy98:routable'])
+ self.wait_online(['dummy98:routable', 'test1:routable'])
output = check_output('tc qdisc show dev dummy98')
print(output)
+ self.assertRegex(output, 'qdisc netem')
self.assertRegex(output, 'limit 100 delay 50.0ms 10.0ms loss 20%')
self.assertRegex(output, 'limit 200 delay 100.0ms 13.0ms loss 20.5%')
+ output = check_output('tc qdisc show dev test1')
+ print(output)
+ self.assertRegex(output, 'qdisc tbf')
+ self.assertRegex(output, 'rate 500Kbit burst 5000b lat 70.0ms')
+ self.assertRegex(output, 'qdisc sfq')
+ self.assertRegex(output, 'perturb 5sec')
class NetworkdStateFileTests(unittest.TestCase, Utilities):
links = [