--- /dev/null
+# Binary IO API
+
+The binary input / output (bio) API is intended to abstract a wide
+range of issues related to network IO. Historically (v3) we just
+"wrote the code until it worked", which meant that the same piece of
+code handled network transport issues (e.g. TCP), protocol issues
+(e.g. RADIUS), connection issues (up / down / reconnect), and eventing
+issues (socket readable / blocked).
+
+This style of programming led to complex interconnected state
+machines which were difficult to write, to maintain, and to debug.
+
+v4 is better, with many of these functions split out into separate
+APIs, such as connections, trunking, etc. However, the input
+listeners and output client modules (e.g. rlm_radius and radclient)
+still have the transport and protocol states intermixed. This makes
+the read / write routines complex, and difficult to extend.
+
+For these reasons and more, as of early 2024, v4 does not have input
+TLS listeners, or output TCP or TLS for RADIUS proxying. We then have
+a horrid mess of dynamic clients, haproxy connections, network source IP
+filtering, UDP vs TCP issues, and connected vs unconnected sockets,
+and finally TLS. It is essentially impossible to write code which
+handles all of these issues simultaneously.
+
+The issues addressed by bios include the following items:
+
+* abstracting TCP versus UDP socket IO
+
+* allowing packet-based reads and writes, instead of byte-based
+ * i.e. so that application protocol state machines do not have to
+   deal with partial packets.
+
+* Use protocol-agnostic memory buffers to track partial reads and
+ partial writes.
+
+* allowing "written" data to be cancelled or "unwritten". Packets
+ which have been written to the bio, but not yet to the network can
+ be cancelled at any time. The data then disappears from the bio,
+ and is never written to the network.
+
+* allowing chaining, so that an application can write RADIUS packets
+ to a bio, and then have those packets go through a TLS
+ transformation, and then out a TCP socket.
+
+* Chaining also allows applications to selectively add per-chain
+ functionality, without affecting the producer or consumer of data.
+
+* allowing unchaining, so that we can have a bio say "I'm done, and no
+ longer needed". This happens for example when we have a connection
+ from haproxy. The first ~128 bytes of a TCP connection are the
+ original src/dst ip/port. The data after that is just the TLS
+ transport. The haproxy layer needs to be able to intercept and read
+ that data, and then remove itself from the chain of bios.
+
+* abstraction, so that the application can be handed a bio, and use
+ it. The underlying bio might be UDP, TCP, TLS, etc. The
+ application does not know, and can behave identically for all
+ situations. There are some limitations, of course. Something has
+ to create the bios and their respective chains. But once a "RADIUS"
+ bio has been created, the RADIUS application can read and write
+ packets to it without worrying about underlying issues of UDP vs
+ TCP, TLS vs clear-text, dedup, etc.
+
+* simplicity. Any transport-specific function knows only about that
+ transport, and its own bio. It does not need to know about other
+ bios (unless it needs them, as with TLS -> TCP). The function does
+ not know about packets or protocols. We should be able to use the
+ same basic UDP/TCP network bios for most protocols. Or if we
+ cannot, the duplicated code should be trivial, and little more than
+ `read()` and some checks for error conditions (EOF, blocked, etc.)
+
+* If the caller needs to do something with a particular bio, that bio
+ will expose an API specific to that bio. There is no reason to copy
+ that status back up the bio chain. This also means that the caller
+ often needs to cache the multiple bios, which is fine.
+
+* asynchronous at its core. Anything can block at any time. There
+ are callbacks if necessary.
+
+* no run-time memory allocations for bio operations. Everything
+ operates on pre-allocated structures.
+
+* O(1) operations where possible.
+
+* each bio in large part runs as its own state machine. It does what
+ it needs to do. It exposes APIs for the caller (who must know what
+ it is). It has its own callbacks to modify its operation.
+
+* not thread-safe. Use locks, people.
+
+There are explicit _non-goals_ for the bio API. These non-goals are
+issues which are outside of the scope of bios, such as:
+
+* As an outcome of simplicity, there are no bio-specific wrappers for
+ modifying file descriptors. An application is free to cache the FD,
+ associate it with the application layer, and call eventing functions
+ to get "readable" or "writable" callbacks. The application can also
+ get / set socket information manually, such as "get IP" or "bind to
+ particular port".
+
+* configuration. The bios expose configuration structures (static
+ input used to create a bio), and run-time informational structures
+ (dynamic information about the state of the bio). The API is small,
+ and all uses of get/set member functions should be avoided. We
+ presume that the caller is smart enough to not muck with the current
+ state of the bio.
+
+* eventing and timers. The bios can allow an underlying file
+ descriptor to be used, but the bio layer itself runs nothing more
+ than state-specific callbacks, defined on a per-bio basis.
+
+* decoding / encoding packet contents. This is handled by dbuffs,
+ which are bounds checkers around memory buffers. i.e. they check
+ and enforce nested bounds on packets, nested attributes, etc. But
+ dbuffs have no concept of multiple packets, deduplication, file
+ descriptors, etc.
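
The chaining and abstraction goals listed above can be illustrated with a small standalone sketch. It is not the bio API itself: `demo_bio_t`, `demo_sink_write()`, `demo_flip_write()` and `demo_app_send()` are hypothetical names invented for this example, the chain uses a plain `next` pointer rather than a dlist, and a trivial bit flip stands in for a real transformation such as TLS. The point is only that the application calls the top of the chain and does not know what sits below it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical stand-in for a bio. This is NOT the real fr_bio_t. */
typedef struct demo_bio_s demo_bio_t;
typedef ssize_t (*demo_write_t)(demo_bio_t *bio, void const *buffer, size_t size);

struct demo_bio_s {
	demo_write_t	write;		/* this layer's write function */
	demo_bio_t	*next;		/* next bio in the chain, NULL at the bottom */
	unsigned char	out[64];	/* captured output, used only by the sink */
	size_t		out_len;
};

/* Bottom layer: stands in for a socket by appending to out[]. */
ssize_t demo_sink_write(demo_bio_t *bio, void const *buffer, size_t size)
{
	if (size > sizeof(bio->out) - bio->out_len) return -1;

	memcpy(bio->out + bio->out_len, buffer, size);
	bio->out_len += size;
	return (ssize_t) size;
}

/* Middle layer: stands in for a transformation by flipping bit 0x20 of each byte. */
ssize_t demo_flip_write(demo_bio_t *bio, void const *buffer, size_t size)
{
	unsigned char tmp[64];
	unsigned char const *in = buffer;
	size_t i;

	if (size > sizeof(tmp)) return -1;

	for (i = 0; i < size; i++) tmp[i] = in[i] ^ 0x20;

	/* Hand the transformed data to the next layer down. */
	return bio->next->write(bio->next, tmp, size);
}

/* Application entry point: it only ever sees the top of the chain. */
ssize_t demo_app_send(demo_bio_t *top, char const *packet)
{
	return top->write(top, packet, strlen(packet));
}
```

In this sketch the caller of `demo_app_send()` behaves identically whether `top` is the sink itself or a transforming layer chained in front of it, which is the property the design notes ask for.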
--- /dev/null
+SUBMAKEFILES := libfreeradius-bio.mk
+
+# bio_tests.mk
--- /dev/null
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or (at
+ * your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
+ */
+
+/**
+ * $Id$
+ * @file lib/bio/base.c
+ * @brief Binary IO abstractions.
+ *
+ * @copyright 2024 Network RADIUS SAS (legal@networkradius.com)
+ */
+
+#include <freeradius-devel/bio/bio_priv.h>
+#include <freeradius-devel/bio/null.h>
+
+/** Free this bio.
+ *
+ * The bio can only be freed if it is not in any chain.
+ */
+int fr_bio_destructor(fr_bio_t *bio)
+{
+ fr_assert(!fr_bio_prev(bio));
+ fr_assert(!fr_bio_next(bio));
+
+ /*
+ * It's safe to free this bio.
+ */
+ return 0;
+}
+
+/** Always returns EOF on fr_bio_read()
+ *
+ */
+ssize_t fr_bio_eof_read(UNUSED fr_bio_t *bio, UNUSED void *packet_ctx, UNUSED void *buffer, UNUSED size_t size)
+{
+ return fr_bio_error(EOF);
+}
+
+/** Internal bio function which just reads from the "next" bio.
+ *
+ * It is mainly used when the current bio needs to modify the write
+ * path, but does not need to do anything on the read path.
+ */
+ssize_t fr_bio_next_read(fr_bio_t *bio, void *packet_ctx, void *buffer, size_t size)
+{
+ ssize_t rcode;
+ fr_bio_t *next;
+
+ next = fr_bio_next(bio);
+ fr_assert(next != NULL);
+
+ rcode = next->read(next, packet_ctx, buffer, size);
+ if (rcode >= 0) return rcode;
+
+ if (rcode == fr_bio_error(IO_WOULD_BLOCK)) return rcode;
+
+ bio->read = fr_bio_eof_read;
+ bio->write = fr_bio_null_write;
+ return rcode;
+}
+
+/** Internal bio function which just writes to the "next" bio.
+ *
+ * It is mainly used when the current bio needs to modify the read
+ * path, but does not need to do anything on the write path.
+ */
+ssize_t fr_bio_next_write(fr_bio_t *bio, void *packet_ctx, void const *buffer, size_t size)
+{
+ ssize_t rcode;
+ fr_bio_t *next;
+
+ next = fr_bio_next(bio);
+ fr_assert(next != NULL);
+
+ rcode = next->write(next, packet_ctx, buffer, size);
+ if (rcode >= 0) return rcode;
+
+ if (rcode == fr_bio_error(IO_WOULD_BLOCK)) return rcode;
+
+ bio->read = fr_bio_eof_read;
+ bio->write = fr_bio_null_write;
+ return rcode;
+}
+
+/** Free this bio, and everything it calls.
+ *
+ * We unlink the bio chain, and then free each bio individually. If there's an error, the bio chain is relinked.
+ * That way the error can be addressed (somehow) and this function can be called again.
+ *
+ * Note that we do not support talloc_free() for the bio chain. Each individual bio has to be unlinked from
+ * the chain before the destructor will allow it to be freed. This functionality is by design.
+ *
+ * We want to have an API where bios are created "bottom up", so that it is impossible for an application to
+ * create an incorrect chain. However, creating the chain bottom up means that the lower bios are not parented
+ * from the higher bios, and therefore talloc_free() won't free them. As a result, we need an explicit
+ * bio_free() function.
+ */
+int fr_bio_free(fr_bio_t *bio)
+{
+ fr_bio_t *next = fr_bio_next(bio);
+
+ /*
+ * We cannot free a bio in the middle of a chain. It has to be unlinked first.
+ */
+ if (fr_bio_prev(bio)) return -1;
+
+ /*
+ * Unlink our bio, and recurse to free the next one. If we can't free it, re-chain it, but reset
+ * the read/write functions to do nothing.
+ */
+ if (next) {
+ fr_bio_unchain(bio);
+ if (fr_bio_free(next) < 0) {
+ fr_bio_chain(bio, next);
+ bio->read = fr_bio_eof_read;
+ bio->write = fr_bio_null_write;
+ return -1;
+ }
+ }
+
+ /*
+ * It's now safe to free this bio.
+ */
+ return talloc_free(bio);
+}
+
+/** Shut down a set of BIOs
+ *
+ * Must be called from the top-most bio.
+ *
+ * Will shut down the bios from the bottom-up.
+ *
+ * The shutdown function MUST be callable multiple times without breaking.
+ */
+int fr_bio_shutdown(fr_bio_t *bio)
+{
+ fr_bio_t *last;
+
+ fr_assert(!fr_bio_prev(bio));
+
+ /*
+ * Find the last bio in the chain.
+ */
+ for (last = bio; fr_bio_next(last) != NULL; last = fr_bio_next(last)) {
+ /* nothing */
+ }
+
+ /*
+ * Walk back up the chain, calling the shutdown functions.
+ */
+ do {
+ int rcode;
+ fr_bio_common_t *my = (fr_bio_common_t *) last;
+
+ /*
+ * Call user shutdown before the bio shutdown.
+ */
+ if (my->cb.shutdown && ((rcode = my->cb.shutdown(last)) < 0)) return rcode;
+
+ last = fr_bio_prev(last);
+ } while (last);
+
+ return 0;
+}
+
+/** Like fr_bio_shutdown(), but can be called by anyone in the chain.
+ *
+ */
+int fr_bio_shutdown_intermediate(fr_bio_t *bio)
+{
+ fr_bio_common_t *prev = (fr_bio_common_t *) fr_bio_prev(bio);
+
+ while ((prev = (fr_bio_common_t *) fr_bio_prev(bio)) != NULL) {
+ bio = (fr_bio_t *) prev;
+ }
+
+ return fr_bio_shutdown(bio);
+}
--- /dev/null
+#pragma once
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
+ */
+
+/**
+ * $Id$
+ * @file lib/bio/base.h
+ * @brief Binary IO abstractions.
+ *
+ * Create abstract binary input / output buffers.
+ *
+ * @copyright 2024 Network RADIUS SAS (legal@networkradius.com)
+ */
+RCSIDH(lib_bio_base_h, "$Id$")
+
+#include <freeradius-devel/util/debug.h>
+#include <freeradius-devel/util/dlist.h>
+
+#ifdef NDEBUG
+#define XDEBUG(fmt, ...)
+#else
+#define XDEBUG(fmt, ...) fprintf(stderr, fmt, ## __VA_ARGS__)
+#endif
+
+#ifdef _CONST
+# error _CONST can only be defined in the local header
+#endif
+#ifndef _BIO_PRIVATE
+# define _CONST const
+#else
+# define _CONST
+#endif
+
+typedef enum {
+ FR_BIO_ERROR_NONE = 0,
+ FR_BIO_ERROR_IO_WOULD_BLOCK, //!< IO would block
+
+ FR_BIO_ERROR_IO, //!< IO error - check errno
+ FR_BIO_ERROR_GENERIC, //!< generic "failed" error - check fr_strerror()
+ FR_BIO_ERROR_VERIFY, //!< some packet verification error
+ FR_BIO_ERROR_BUFFER_FULL, //!< the buffer is full
+ FR_BIO_ERROR_BUFFER_TOO_SMALL, //!< the output buffer is too small for the data
+
+ FR_BIO_ERROR_EOF, //!< at EOF
+} fr_bio_error_type_t;
+
+typedef struct fr_bio_s fr_bio_t;
+
+/** Do a raw read from a socket, or other data source
+ *
+ * These functions should be careful about packet_ctx. This handling depends on a number of factors. Note
+ * that the packet_ctx may be NULL!
+ *
+ * Stream sockets will generally ignore packet_ctx.
+ *
+ * Datagram sockets generally write src/dst IP/port to the packet context. This same packet_ctx is then
+ * passed to bio->write(), which can use it to send the data to the correct destination.
+ *
+ * @param bio the binary IO handler
+ * @param packet_ctx where the function can store per-packet information, such as src/dst IP/port for datagram sockets
+ * @param buffer where the function should store data it reads
+ * @param size the maximum amount of data to read.
+ * @return
+ * - <0 for error
+ * - 0 for "no data available". Note that this does NOT mean EOF! It could mean "we do not have a full packet"
+ * - >0 for amount of data which was read.
+ */
+typedef ssize_t (*fr_bio_read_t)(fr_bio_t *bio, void *packet_ctx, void *buffer, size_t size);
+typedef ssize_t (*fr_bio_write_t)(fr_bio_t *bio, void *packet_ctx, const void *buffer, size_t size);
+
+typedef int (*fr_bio_callback_t)(fr_bio_t *bio); /* activate / shutdown callbacks */
+
+typedef struct {
+ fr_bio_callback_t activate;
+ fr_bio_callback_t shutdown;
+} fr_bio_cb_funcs_t;
+
+/** Accept a new connection on a bio
+ *
+ * @param bio the binary IO handler
+ * @param ctx the talloc ctx for the new bio.
+ * @param[out] accepted the accepted bio
+ * @return
+ * - <0 on error
+ * - 0 for "we did nothing, and there is no new bio available"
+ * - 1 for "the accepted bio is available"
+ */
+typedef int (*fr_bio_accept_t)(fr_bio_t *bio, TALLOC_CTX *ctx, fr_bio_t **accepted);
+
+struct fr_bio_s {
+ void *uctx; //!< user ctx, caller can manually set it.
+
+ fr_bio_read_t _CONST read; //!< read from the underlying bio
+ fr_bio_write_t _CONST write; //!< write to the underlying bio
+
+ fr_dlist_t _CONST entry; //!< in the linked list of multiple bios
+};
+
+static inline CC_HINT(nonnull) fr_bio_t *fr_bio_prev(fr_bio_t *bio)
+{
+ fr_dlist_t *prev = bio->entry.prev;
+
+ if (!prev) return NULL;
+
+ return fr_dlist_entry_to_item(offsetof(fr_bio_t, entry), prev);
+}
+
+static inline CC_HINT(nonnull) fr_bio_t *fr_bio_next(fr_bio_t *bio)
+{
+ fr_dlist_t *next = bio->entry.next;
+
+ if (!next) return NULL;
+
+ return fr_dlist_entry_to_item(offsetof(fr_bio_t, entry), next);
+}
+
+static inline ssize_t CC_HINT(nonnull(1,3)) fr_bio_read(fr_bio_t *bio, void *packet_ctx, void *buffer, size_t size)
+{
+ if (size == 0) return 0;
+
+ /*
+ * We cannot read from the middle of a chain.
+ */
+ fr_assert(!fr_bio_next(bio));
+
+ return bio->read(bio, packet_ctx, buffer, size);
+}
+
+static inline ssize_t CC_HINT(nonnull(1)) fr_bio_write(fr_bio_t *bio, void *packet_ctx, void const *buffer, size_t size)
+{
+ if (size == 0) return 0;
+
+ /*
+ * We cannot write to the middle of a chain.
+ */
+ fr_assert(!fr_bio_prev(bio));
+
+ return bio->write(bio, packet_ctx, buffer, size);
+}
+
+int fr_bio_shutdown_intermediate(fr_bio_t *bio) CC_HINT(nonnull);
+
+#ifndef NDEBUG
+int fr_bio_destructor(fr_bio_t *bio) CC_HINT(nonnull);
+#else
+#define fr_bio_destructor (NULL)
+#endif
+
+#define fr_bio_error(_x) (-(FR_BIO_ERROR_ ## _x))
+
+ssize_t fr_bio_eof_read(fr_bio_t *bio, void *packet_ctx, void *buffer, size_t size);
+
+ssize_t fr_bio_next_read(fr_bio_t *bio, void *packet_ctx, void *buffer, size_t size);
+
+ssize_t fr_bio_next_write(fr_bio_t *bio, void *packet_ctx, void const *buffer, size_t size);
+
+int fr_bio_shutdown(fr_bio_t *bio) CC_HINT(nonnull);
+
+int fr_bio_free(fr_bio_t *bio) CC_HINT(nonnull);
--- /dev/null
+#pragma once
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
+ */
+
+/**
+ * $Id$
+ * @file lib/bio/bio_priv.h
+ * @brief Binary IO private functions
+ *
+ * Create abstract binary input / output buffers.
+ *
+ * @copyright 2024 Network RADIUS SAS (legal@networkradius.com)
+ */
+RCSIDH(lib_bio_bio_priv_h, "$Id$")
+
+#define _BIO_PRIVATE 1
+#include <freeradius-devel/bio/base.h>
+
+typedef int (*fr_bio_shutdown_t)(fr_bio_t *bio);
+
+typedef struct fr_bio_common_s fr_bio_common_t;
+
+/** Common elements at the start of each private #fr_bio_t
+ *
+ */
+#define FR_BIO_COMMON \
+ fr_bio_t bio; \
+ fr_bio_cb_funcs_t cb
+
+struct fr_bio_common_s {
+ FR_BIO_COMMON;
+};
+
+/** Chain one bio after another.
+ *
+ * @todo - this likely needs to be public
+ */
+static inline void CC_HINT(nonnull) fr_bio_chain(fr_bio_t *first, fr_bio_t *second)
+{
+ fr_dlist_entry_link_after(&first->entry, &second->entry);
+}
+
+/** Remove a bio from a chain
+ *
+ * And reset prev/next ptrs to NULL.
+ *
+ * @todo - this likely needs to be public
+ */
+static inline void CC_HINT(nonnull) fr_bio_unchain(fr_bio_t *bio)
+{
+ fr_assert(fr_bio_prev(bio) != NULL);
+ fr_assert(fr_bio_next(bio) != NULL);
+
+ fr_dlist_entry_unlink(&bio->entry);
+ bio->entry.prev = bio->entry.next = NULL;
+}
--- /dev/null
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or (at
+ * your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
+ */
+
+/**
+ * $Id$
+ * @file lib/bio/buf.c
+ * @brief BIO abstractions for buffers
+ *
+ * @copyright 2024 Network RADIUS SAS (legal@networkradius.com)
+ */
+
+#include <freeradius-devel/util/debug.h>
+#include <freeradius-devel/bio/buf.h>
+
+size_t fr_bio_buf_make_room(fr_bio_buf_t *bio_buf)
+{
+ size_t used;
+
+ if (bio_buf->read == bio_buf->start) return fr_bio_buf_write_room(bio_buf);
+
+ used = bio_buf->write - bio_buf->read;
+ if (!used) return fr_bio_buf_write_room(bio_buf);
+
+ memmove(bio_buf->start, bio_buf->read, used);
+
+ bio_buf->read = bio_buf->start;
+ bio_buf->write = bio_buf->read + used;
+
+ return fr_bio_buf_write_room(bio_buf);
+}
+
+size_t fr_bio_buf_read(fr_bio_buf_t *bio_buf, void *buffer, size_t size)
+{
+ size_t used;
+
+ fr_bio_buf_verify(bio_buf);
+
+ used = bio_buf->write - bio_buf->read;
+ if (!used || !size) return 0;
+
+ /*
+ * Clamp the amount of data to read to how much data is in the buffer.
+ */
+ if (size > used) size = used;
+
+ if (buffer) memcpy(buffer, bio_buf->read, size);
+
+ bio_buf->read += size;
+ if (bio_buf->read == bio_buf->write) {
+ fr_bio_buf_reset(bio_buf);
+
+ } else if ((bio_buf->end - bio_buf->read) < (bio_buf->read - bio_buf->start)) {
+ /*
+ * The "read" pointer is closer to the end of the
+ * buffer than to the start. Shift the data
+ * around to give more room for reading.
+ *
+ * @todo - change the check instead to "(end - write) < min_room"
+ *
+ * @todo - what about pending packets which point to the buffer?
+ */
+ fr_bio_buf_make_room(bio_buf);
+ }
+
+ return size;
+}
+
+ssize_t fr_bio_buf_write(fr_bio_buf_t *bio_buf, const void *buffer, size_t size)
+{
+ size_t room;
+
+ fr_bio_buf_verify(bio_buf);
+
+ room = fr_bio_buf_write_room(bio_buf);
+
+ if (room < size) {
+ return -room; /* not enough room: return the negated amount of room which is available */
+ }
+
+ /*
+ * The data might already be in the buffer, in which case we can skip the memcpy().
+ *
+ * But the data MUST be at the current "write" position. i.e. we can't have overlapping /
+ * conflicting writes.
+ *
+ * @todo - if it's after the current write position, maybe still allow it? That's so
+ * fr_bio_mem_write() and friends can write partial packets into the buffer. Maybe add a
+ * fr_bio_buf_write_partial() API, which takes (packet, already_written, size), and then does the
+ * right thing. If the packet is not within the buffer, then it devolves to fr_bio_buf_write(),
+ * otherwise it moves the write ptr in the buffer to after the packet.
+ */
+ if (buffer != bio_buf->write) {
+ fr_assert(!fr_bio_buf_contains(bio_buf, buffer));
+ memcpy(bio_buf->write, buffer, size);
+ }
+ bio_buf->write += size;
+
+ return size;
+}
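
The read / write pointer handling in this file can be hard to picture from the code alone, so here is a minimal standalone sketch of the same idea. It is a simplified model, not the real implementation: `demo_buf_t` and the `demo_buf_*()` functions are hypothetical names, there are no assertions or "data already in the buffer" shortcuts, and compaction is only done when the caller asks for it, whereas `fr_bio_buf_read()` above may compact on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical miniature of a read/write-pointer buffer. NOT the real fr_bio_buf_t. */
typedef struct {
	uint8_t	*start, *end;	/* bounds of the storage */
	uint8_t	*read, *write;	/* invariant: start <= read <= write <= end */
} demo_buf_t;

void demo_buf_init(demo_buf_t *b, uint8_t *storage, size_t size)
{
	b->start = b->read = b->write = storage;
	b->end = storage + size;
}

size_t demo_buf_room(demo_buf_t const *b) { return (size_t) (b->end - b->write); }
size_t demo_buf_used(demo_buf_t const *b) { return (size_t) (b->write - b->read); }

/* Slide unread data to the start, so that all free space is contiguous at the end. */
size_t demo_buf_make_room(demo_buf_t *b)
{
	size_t used = demo_buf_used(b);

	if (b->read != b->start) {
		if (used) memmove(b->start, b->read, used);
		b->read = b->start;
		b->write = b->start + used;
	}

	return demo_buf_room(b);
}

/* Append data. Returns 0 on success, -1 if there is not enough room. */
int demo_buf_write(demo_buf_t *b, void const *data, size_t size)
{
	if (demo_buf_room(b) < size) return -1;

	memcpy(b->write, data, size);
	b->write += size;
	return 0;
}

/* Consume up to "size" bytes. Returns how many bytes were copied out. */
size_t demo_buf_read(demo_buf_t *b, void *out, size_t size)
{
	size_t used = demo_buf_used(b);

	if (size > used) size = used;

	memcpy(out, b->read, size);
	b->read += size;
	return size;
}
```

With an 8-byte buffer, writing 6 bytes and then reading 4 leaves only 2 bytes of room at the end even though 6 bytes are free in total. Calling `demo_buf_make_room()` moves the 2 unread bytes to the start and makes all 6 free bytes usable again.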
--- /dev/null
+#pragma once
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
+ */
+
+/**
+ * $Id$
+ * @file lib/bio/buf.h
+ * @brief Binary IO abstractions for buffers
+ *
+ * The #fr_bio_buf_t allows readers and writers to use a shared buffer, without overflow.
+ *
+ * @copyright 2024 Network RADIUS SAS (legal@networkradius.com)
+ */
+RCSIDH(lib_bio_buf_h, "$Id$")
+
+typedef struct {
+ uint8_t *start; //!< start of the buffer
+ uint8_t *end; //!< end of the buffer
+
+ uint8_t *read; //!< where in the buffer reads are taken from
+ uint8_t *write; //!< where in the buffer writes are sent to
+} fr_bio_buf_t;
+
+static inline void fr_bio_buf_init(fr_bio_buf_t *bio_buf, uint8_t *buffer, size_t size)
+{
+ bio_buf->start = bio_buf->read = bio_buf->write = buffer;
+ bio_buf->end = buffer + size;
+}
+
+int fr_bio_buf_alloc(TALLOC_CTX *ctx, fr_bio_buf_t *bio_buf, size_t size) CC_HINT(nonnull);
+
+int fr_bio_buf_resize(fr_bio_buf_t *bio_buf, uint8_t *buffer, size_t size) CC_HINT(nonnull);
+
+size_t fr_bio_buf_make_room(fr_bio_buf_t *bio_buf);
+
+size_t fr_bio_buf_read(fr_bio_buf_t *bio_buf, void *buffer, size_t size) CC_HINT(nonnull(1));
+ssize_t fr_bio_buf_write(fr_bio_buf_t *bio_buf, const void *buffer, size_t size) CC_HINT(nonnull);
+
+
+static inline void CC_HINT(nonnull) fr_bio_buf_verify(fr_bio_buf_t const *bio_buf)
+{
+ fr_assert(bio_buf->start != NULL);
+ fr_assert(bio_buf->start <= bio_buf->read);
+ fr_assert(bio_buf->read <= bio_buf->write);
+ fr_assert(bio_buf->write <= bio_buf->end);
+}
+
+static inline void CC_HINT(nonnull) fr_bio_buf_reset(fr_bio_buf_t *bio_buf)
+{
+ fr_bio_buf_verify(bio_buf);
+
+ bio_buf->read = bio_buf->write = bio_buf->start;
+}
+
+static inline bool CC_HINT(nonnull) fr_bio_buf_initialized(fr_bio_buf_t const *bio_buf)
+{
+ return (bio_buf->start != NULL);
+}
+
+static inline size_t CC_HINT(nonnull) fr_bio_buf_used(fr_bio_buf_t const *bio_buf)
+{
+ if (!fr_bio_buf_initialized(bio_buf)) return 0;
+
+ fr_bio_buf_verify(bio_buf);
+
+ return (bio_buf->write - bio_buf->read);
+}
+
+static inline size_t CC_HINT(nonnull) fr_bio_buf_write_room(fr_bio_buf_t const *bio_buf)
+{
+ fr_bio_buf_verify(bio_buf);
+
+ return bio_buf->end - bio_buf->write;
+}
+
+static inline uint8_t *CC_HINT(nonnull) fr_bio_buf_write_reserve(fr_bio_buf_t *bio_buf, size_t size)
+{
+ fr_bio_buf_verify(bio_buf);
+
+ if (fr_bio_buf_write_room(bio_buf) < size) return NULL;
+
+ return bio_buf->write;
+}
+
+static inline int CC_HINT(nonnull) fr_bio_buf_write_alloc(fr_bio_buf_t *bio_buf, size_t size)
+{
+ fr_bio_buf_verify(bio_buf);
+
+ if (fr_bio_buf_write_room(bio_buf) < size) return -1;
+
+ bio_buf->write += size;
+
+ fr_bio_buf_verify(bio_buf);
+
+ return 0;
+}
+
+static inline void CC_HINT(nonnull) fr_bio_buf_write_undo(fr_bio_buf_t *bio_buf, size_t size)
+{
+ fr_bio_buf_verify(bio_buf);
+
+ fr_assert(bio_buf->read + size <= bio_buf->write);
+
+ bio_buf->write -= size;
+ fr_bio_buf_verify(bio_buf);
+
+ if (bio_buf->read == bio_buf->write) {
+ fr_bio_buf_reset(bio_buf);
+ }
+}
+
+static inline bool fr_bio_buf_contains(fr_bio_buf_t *bio_buf, void const *buffer)
+{
+ return ((uint8_t const *) buffer >= bio_buf->start) && ((uint8_t const *) buffer <= bio_buf->end);
+}
+
+static inline void CC_HINT(nonnull) fr_bio_buf_write_update(fr_bio_buf_t *bio_buf, void const *buffer, size_t size, size_t written)
+{
+ if (!fr_bio_buf_initialized(bio_buf)) return;
+
+ fr_bio_buf_verify(bio_buf);
+
+ if (bio_buf->read == buffer) {
+ fr_assert(fr_bio_buf_used(bio_buf) >= size);
+
+ (void) fr_bio_buf_read(bio_buf, NULL, written);
+ } else {
+ /*
+ * If we're not writing from the start of write_buffer, then the data to
+ * be written CANNOT appear anywhere in the buffer.
+ */
+ fr_assert(!fr_bio_buf_contains(bio_buf, buffer));
+ }
+}
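
The reserve / alloc / undo helpers in this header support the "written data can be cancelled" goal from the README. The following standalone sketch shows that pattern under simplifying assumptions. `sketch_buf_t`, `sketch_buf_reserve()`, `sketch_buf_commit()` and `sketch_buf_undo()` are hypothetical names chosen for the example, and errors are reported with return codes where the real helpers use assertions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical miniature buffer. NOT the real fr_bio_buf_t. */
typedef struct {
	uint8_t	*start, *end;	/* bounds of the storage */
	uint8_t	*read, *write;	/* invariant: start <= read <= write <= end */
} sketch_buf_t;

void sketch_buf_init(sketch_buf_t *b, uint8_t *storage, size_t size)
{
	b->start = b->read = b->write = storage;
	b->end = storage + size;
}

/* Return where "size" bytes may be written in place, or NULL if they do not fit.
 * The write pointer is not moved, so nothing is committed yet. */
uint8_t *sketch_buf_reserve(sketch_buf_t *b, size_t size)
{
	if ((size_t) (b->end - b->write) < size) return NULL;

	return b->write;
}

/* Commit "size" bytes which the caller placed at the reserved location. */
int sketch_buf_commit(sketch_buf_t *b, size_t size)
{
	if ((size_t) (b->end - b->write) < size) return -1;

	b->write += size;
	return 0;
}

/* Cancel the most recently committed "size" bytes, as long as they are still unread. */
int sketch_buf_undo(sketch_buf_t *b, size_t size)
{
	if ((size_t) (b->write - b->read) < size) return -1;

	b->write -= size;

	/* An empty buffer is reset so that the full storage is available again. */
	if (b->read == b->write) b->read = b->write = b->start;
	return 0;
}
```

A caller would reserve space, encode a packet directly into it, and commit. If the packet is cancelled before anything below has consumed it, undoing the same number of bytes makes it disappear without it ever reaching the network.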
--- /dev/null
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or (at
+ * your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
+ */
+
+/**
+ * $Id$
+ * @file lib/bio/fd.c
+ * @brief BIO abstractions for file descriptors
+ *
+ * @copyright 2024 Network RADIUS SAS (legal@networkradius.com)
+ */
+
+#ifdef __linux__
+/*
+ * for accept4()
+ */
+#define _GNU_SOURCE
+#endif
+
+#include <freeradius-devel/bio/fd_priv.h>
+#include <freeradius-devel/bio/null.h>
+
+/*
+ * More portability idiocy
+ * Mac OSX Lion doesn't define SOL_IP. But IPPROTO_IP works.
+ */
+#ifndef SOL_IP
+# define SOL_IP IPPROTO_IP
+#endif
+
+/*
+ * glibc 2.4 and uClibc 0.9.29 introduce IPV6_RECVPKTINFO etc. and
+ * change IPV6_PKTINFO. This is only supported in Linux kernel >=
+ * 2.6.14
+ *
+ * This is only an approximation because the kernel version that libc
+ * was compiled against could be older or newer than the one being
+ * run. But this should not be a problem -- we just keep using the
+ * old kernel interface.
+ */
+#ifdef __linux__
+# ifdef IPV6_RECVPKTINFO
+# include <linux/version.h>
+# if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,14)
+# ifdef IPV6_2292PKTINFO
+# undef IPV6_RECVPKTINFO
+# undef IPV6_PKTINFO
+# define IPV6_RECVPKTINFO IPV6_2292PKTINFO
+# define IPV6_PKTINFO IPV6_2292PKTINFO
+# endif
+# endif
+/* Fall back to the legacy socket option if IPV6_RECVPKTINFO isn't defined */
+# elif defined(IPV6_2292PKTINFO)
+# define IPV6_RECVPKTINFO IPV6_2292PKTINFO
+# endif
+#else
+
+/*
+ * For everything that's not Linux we assume RFC 3542 compliance
+ * - setsockopt() takes IPV6_RECVPKTINFO
+ * - cmsg_type is IPV6_PKTINFO (in sendmsg, recvmsg)
+ *
+ * If we don't have IPV6_RECVPKTINFO defined but do have IPV6_PKTINFO
+ * defined, chances are the API is RFC2292 compliant and we need to use
+ * IPV6_PKTINFO for both.
+ */
+# if !defined(IPV6_RECVPKTINFO) && defined(IPV6_PKTINFO)
+# define IPV6_RECVPKTINFO IPV6_PKTINFO
+
+/*
+ * Ensure IPV6_RECVPKTINFO is not defined somehow if we
+ * don't have IPV6_PKTINFO.
+ */
+# elif !defined(IPV6_PKTINFO)
+# undef IPV6_RECVPKTINFO
+# endif
+#endif
+
+#define ADDR_INIT do { \
+ addr->when = fr_time(); \
+ addr->socket.type = my->info.socket.type; \
+ addr->socket.fd = -1; \
+ addr->socket.inet.ifindex = my->info.socket.inet.ifindex; \
+ } while (0)
+
+/*
+ * Close the descriptor and free the bio.
+ */
+static int fr_bio_fd_destructor(fr_bio_fd_t *my)
+{
+ /*
+ * The upstream bio must have unlinked it from the chain before calling talloc_free() on this
+ * bio.
+ */
+ fr_assert(!fr_bio_prev(&my->bio));
+ fr_assert(!fr_bio_next(&my->bio));
+
+ return fr_bio_fd_close(&my->bio);
+}
+
+/** Stream read.
+ *
+ */
+static ssize_t fr_bio_fd_read_stream(fr_bio_t *bio, UNUSED void *packet_ctx, void *buffer, size_t size)
+{
+ int tries = 0;
+ ssize_t rcode;
+ fr_bio_fd_t *my = talloc_get_type_abort(bio, fr_bio_fd_t);
+
+ my->info.read_blocked = false;
+
+retry:
+ rcode = read(my->info.socket.fd, buffer, size);
+ if (rcode > 0) return rcode;
+
+ if (rcode == 0) {
+ /*
+ * Stream sockets return 0 at EOF. However, we want to distinguish that from the case of datagram
+ * sockets, which return 0 when there's no data. So we over-ride the 0 value here, and instead
+ * return an EOF error.
+ */
+ bio->read = fr_bio_eof_read;
+ bio->write = fr_bio_null_write;
+ my->info.eof = true;
+
+ return fr_bio_error(EOF);
+ }
+
+#undef flag_blocked
+#define flag_blocked info.read_blocked
+#include "fd_errno.h"
+
+ return fr_bio_error(IO);
+}
+
+/** Connected datagram read.
+ *
+ * The difference between this and stream protocols is that for datagrams, a read of zero means "no packets",
+ * where a read of zero on a stream socket means "EOF".
+ */
+static ssize_t fr_bio_fd_read_connected_datagram(fr_bio_t *bio, void *packet_ctx, void *buffer, size_t size)
+{
+ int tries = 0;
+ ssize_t rcode;
+ fr_bio_fd_t *my = talloc_get_type_abort(bio, fr_bio_fd_t);
+
+ my->info.read_blocked = false;
+
+retry:
+ rcode = read(my->info.socket.fd, buffer, size);
+ if (rcode > 0) {
+ fr_bio_fd_packet_ctx_t *addr = fr_bio_fd_packet_ctx(my, packet_ctx);
+
+ ADDR_INIT;
+
+ addr->socket.inet.dst_ipaddr = my->info.socket.inet.src_ipaddr;
+ addr->socket.inet.dst_port = my->info.socket.inet.src_port;
+
+ addr->socket.inet.src_ipaddr = my->info.socket.inet.dst_ipaddr;
+ addr->socket.inet.src_port = my->info.socket.inet.dst_port;
+ return rcode;
+ }
+
+ if (rcode == 0) return rcode;
+
+#undef flag_blocked
+#define flag_blocked info.read_blocked
+#include "fd_errno.h"
+
+ return fr_bio_error(IO);
+}
+
+/** Read from a UDP socket where we know our IP
+ */
+static ssize_t fr_bio_fd_recvfrom(fr_bio_t *bio, void *packet_ctx, void *buffer, size_t size)
+{
+ int tries = 0;
+ ssize_t rcode;
+ fr_bio_fd_t *my = talloc_get_type_abort(bio, fr_bio_fd_t);
+ socklen_t salen;
+ struct sockaddr_storage sockaddr;
+
+ my->info.read_blocked = false;
+
+retry:
+ salen = sizeof(sockaddr);
+
+ rcode = recvfrom(my->info.socket.fd, buffer, size, 0, (struct sockaddr *) &sockaddr, &salen);
+ if (rcode > 0) {
+ fr_bio_fd_packet_ctx_t *addr = fr_bio_fd_packet_ctx(my, packet_ctx);
+
+ ADDR_INIT;
+
+ addr->socket.inet.dst_ipaddr = my->info.socket.inet.src_ipaddr;
+ addr->socket.inet.dst_port = my->info.socket.inet.src_port;
+
+ (void) fr_ipaddr_from_sockaddr(&addr->socket.inet.src_ipaddr, &addr->socket.inet.src_port,
+ &sockaddr, salen);
+ return rcode;
+ }
+
+ if (rcode == 0) return rcode;
+
+#undef flag_blocked
+#define flag_blocked info.read_blocked
+#include "fd_errno.h"
+
+ return fr_bio_error(IO);
+}
+
+
+/** Write to fd
+ *
+ */
+static ssize_t fr_bio_fd_write(fr_bio_t *bio, UNUSED void *packet_ctx, const void *buffer, size_t size)
+{
+ int tries = 0;
+ ssize_t rcode;
+ fr_bio_fd_t *my = talloc_get_type_abort(bio, fr_bio_fd_t);
+
+ /*
+ * FD bios do nothing on flush.
+ */
+ if (!buffer) return 0;
+
+ my->info.write_blocked = false;
+
+retry:
+ /*
+ * Note that we call send() and not write()! POSIX says:
+ *
+ * "A write was attempted on a socket that is shut down for writing, or is no longer
+ * connected. In the latter case, if the socket is of type SOCK_STREAM, a SIGPIPE signal shall
+ * also be sent to the thread."
+ *
+ * We can override this behavior by calling send(), and passing the special flag which says
+ * "don't do that!". The system call will then return EPIPE, which indicates that the socket is
+ * no longer usable.
+ */
+ rcode = send(my->info.socket.fd, buffer, size, MSG_NOSIGNAL);
+ if (rcode >= 0) return rcode;
+
+#undef flag_blocked
+#define flag_blocked info.write_blocked
+#include "fd_errno.h"
+
+ return fr_bio_error(IO);
+}
+
+/** Write to a UDP socket where we know our IP
+ *
+ */
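+static ssize_t fr_bio_fd_sendto(fr_bio_t *bio, void *packet_ctx, const void *buffer, size_t size)
+{
+ int tries = 0;
+ ssize_t rcode;
+ fr_bio_fd_t *my = talloc_get_type_abort(bio, fr_bio_fd_t);
+ fr_bio_fd_packet_ctx_t *addr = fr_bio_fd_packet_ctx(my, packet_ctx);
+ socklen_t salen;
+ struct sockaddr_storage sockaddr;
+
+ /*
+ * FD bios do nothing on flush.
+ */
+ if (!buffer) return 0;
+
+ my->info.write_blocked = false;
+
+ /*
+ * Convert the destination IP and port from the packet ctx into a sockaddr. Without this, sendto()
+ * would be passed an uninitialized address.
+ */
+ (void) fr_ipaddr_to_sockaddr(&sockaddr, &salen, &addr->socket.inet.dst_ipaddr, addr->socket.inet.dst_port);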
+
+retry:
+ rcode = sendto(my->info.socket.fd, buffer, size, 0, (struct sockaddr *) &sockaddr, salen);
+ if (rcode >= 0) return rcode;
+
+#undef flag_blocked
+#define flag_blocked info.write_blocked
+#include "fd_errno.h"
+
+ return fr_bio_error(IO);
+}
+
+
+#if defined(IP_PKTINFO) || defined(IP_RECVDSTADDR) || defined(IPV6_PKTINFO)
+static ssize_t fd_fd_recvfromto_common(fr_bio_fd_t *my, void *packet_ctx, void *buffer, size_t size)
+{
+ int tries = 0;
+ ssize_t rcode;
+ struct sockaddr_storage from;
+ fr_bio_fd_packet_ctx_t *addr = fr_bio_fd_packet_ctx(my, packet_ctx);
+
+ my->info.read_blocked = false;
+
+ memset(&my->cbuf, 0, sizeof(my->cbuf));
+ memset(&my->msgh, 0, sizeof(struct msghdr));
+
+ my->iov = (struct iovec) {
+ .iov_base = buffer,
+ .iov_len = size,
+ };
+
+ /*
+ * msg_namelen is a socklen_t, not a pointer. It holds the size of the "from" buffer on input, and
+ * recvmsg() updates it to the size of the address which was actually stored.
+ */
+ my->msgh = (struct msghdr) {
+ .msg_control = my->cbuf,
+ .msg_controllen = sizeof(my->cbuf),
+ .msg_name = &from,
+ .msg_namelen = sizeof(from),
+ .msg_iov = &my->iov,
+ .msg_iovlen = 1,
+ .msg_flags = 0,
+ };
+
+retry:
+ rcode = recvmsg(my->info.socket.fd, &my->msgh, 0);
+ if (rcode > 0) {
+ ADDR_INIT;
+
+ (void) fr_ipaddr_from_sockaddr(&addr->socket.inet.src_ipaddr, &addr->socket.inet.src_port,
+ &from, my->msgh.msg_namelen);
+
+ return rcode;
+ }
+
+ if (rcode == 0) return rcode;
+
+#undef flag_blocked
+#define flag_blocked info.read_blocked
+#include "fd_errno.h"
+
+ return fr_bio_error(IO);
+}
+#endif
+
+#if defined(IP_PKTINFO) || defined(IP_RECVDSTADDR)
+
+/** Read from a UDP socket where we can change our IP, IPv4 version.
+ */
+static ssize_t fr_bio_fd_recvfromto4(fr_bio_t *bio, void *packet_ctx, void *buffer, size_t size)
+{
+ ssize_t rcode;
+ struct cmsghdr *cmsg;
+ fr_time_t when = fr_time_wrap(0);
+ fr_bio_fd_t *my = talloc_get_type_abort(bio, fr_bio_fd_t);
+ fr_bio_fd_packet_ctx_t *addr = fr_bio_fd_packet_ctx(my, packet_ctx);
+
+ rcode = fd_fd_recvfromto_common(my, packet_ctx, buffer, size);
+ if (rcode <= 0) return rcode;
+
+DIAG_OFF(sign-compare)
+ /* Process auxiliary received data in msgh */
+ for (cmsg = CMSG_FIRSTHDR(&my->msgh);
+ cmsg != NULL;
+ cmsg = CMSG_NXTHDR(&my->msgh, cmsg)) {
+DIAG_ON(sign-compare)
+
+#ifdef IP_PKTINFO
+ if ((cmsg->cmsg_level == SOL_IP) &&
+ (cmsg->cmsg_type == IP_PKTINFO)) {
+ struct in_pktinfo *i = (struct in_pktinfo *) CMSG_DATA(cmsg);
+ struct sockaddr_in to;
+
+ to.sin_addr = i->ipi_addr;
+
+ (void) fr_ipaddr_from_sockaddr(&addr->socket.inet.dst_ipaddr, &addr->socket.inet.dst_port,
+ (struct sockaddr_storage *) &to, sizeof(struct sockaddr_in));
+ addr->socket.inet.ifindex = i->ipi_ifindex;
+ break;
+ }
+#endif
+
+#ifdef IP_RECVDSTADDR
+ if ((cmsg->cmsg_level == IPPROTO_IP) &&
+ (cmsg->cmsg_type == IP_RECVDSTADDR)) {
+ struct in_addr *i = (struct in_addr *) CMSG_DATA(cmsg);
+ struct sockaddr_in to;
+
+ to.sin_addr = *i;
+ (void) fr_ipaddr_from_sockaddr(&addr->socket.inet.dst_ipaddr, &addr->socket.inet.dst_port,
+ (struct sockaddr_storage *) &to, sizeof(struct sockaddr_in));
+ break;
+ }
+#endif
+
+#ifdef SO_TIMESTAMPNS
+ /*
+ * Timestamps are enabled and delivered at the socket level, so the level is SOL_SOCKET.
+ */
+ if ((cmsg->cmsg_level == SOL_SOCKET) && (cmsg->cmsg_type == SO_TIMESTAMPNS)) {
+ when = fr_time_from_timespec((struct timespec *)CMSG_DATA(cmsg));
+ }
+
+#elif defined(SO_TIMESTAMP)
+ if ((cmsg->cmsg_level == SOL_SOCKET) && (cmsg->cmsg_type == SO_TIMESTAMP)) {
+ when = fr_time_from_timeval((struct timeval *)CMSG_DATA(cmsg));
+ }
+#endif
+ }
+
+ if (fr_time_eq(when, fr_time_wrap(0))) when = fr_time();
+
+ addr->when = when;
+
+ return rcode;
+}
+
+/** Send to UDP socket where we can change our IP, IPv4 version.
+ */
+static ssize_t fr_bio_fd_sendfromto4(fr_bio_t *bio, void *packet_ctx, const void *buffer, size_t size)
+{
+ int tries = 0;
+ ssize_t rcode;
+ struct cmsghdr *cmsg;
+ struct sockaddr_storage to;
+ socklen_t to_len;
+ fr_bio_fd_t *my = talloc_get_type_abort(bio, fr_bio_fd_t);
+ fr_bio_fd_packet_ctx_t *addr = fr_bio_fd_packet_ctx(my, packet_ctx);
+
+ my->info.write_blocked = false;
+
+ memset(&my->cbuf, 0, sizeof(my->cbuf));
+ memset(&my->msgh, 0, sizeof(struct msghdr));
+
+ (void) fr_ipaddr_to_sockaddr(&to, &to_len, &addr->socket.inet.dst_ipaddr, addr->socket.inet.dst_port);
+
+ my->iov = (struct iovec) {
+ .iov_base = UNCONST(void *, buffer),
+ .iov_len = size,
+ };
+
+ my->msgh = (struct msghdr) {
+ .msg_control = my->cbuf,
+ // controllen is set below
+ .msg_name = &to,
+ .msg_namelen = to_len,
+ .msg_iov = &my->iov,
+ .msg_iovlen = 1,
+ .msg_flags = 0,
+ };
+
+ {
+#ifdef IP_PKTINFO
+ struct in_pktinfo *pkt;
+
+ my->msgh.msg_controllen = CMSG_SPACE(sizeof(*pkt));
+
+ /*
+ * CMSG_FIRSTHDR() returns NULL when msg_controllen is too small to hold a header, so it
+ * must be called after msg_controllen is set.
+ */
+ cmsg = CMSG_FIRSTHDR(&my->msgh);
+ cmsg->cmsg_level = SOL_IP;
+ cmsg->cmsg_type = IP_PKTINFO;
+ cmsg->cmsg_len = CMSG_LEN(sizeof(*pkt));
+
+ pkt = (struct in_pktinfo *) CMSG_DATA(cmsg);
+ memset(pkt, 0, sizeof(*pkt));
+ pkt->ipi_spec_dst = addr->socket.inet.src_ipaddr.addr.v4;
+ pkt->ipi_ifindex = addr->socket.inet.ifindex;
+
+#elif defined(IP_SENDSRCADDR)
+ struct in_addr *in;
+
+ my->msgh.msg_controllen = CMSG_SPACE(sizeof(*in));
+
+ /*
+ * As above, look up the first header only after msg_controllen is set.
+ */
+ cmsg = CMSG_FIRSTHDR(&my->msgh);
+ cmsg->cmsg_level = IPPROTO_IP;
+ cmsg->cmsg_type = IP_SENDSRCADDR;
+ cmsg->cmsg_len = CMSG_LEN(sizeof(*in));
+
+ in = (struct in_addr *) CMSG_DATA(cmsg);
+ *in = addr->socket.inet.src_ipaddr.addr.v4;
+#endif
+ }
+
+retry:
+ rcode = sendmsg(my->info.socket.fd, &my->msgh, 0);
+ if (rcode >= 0) return rcode;
+
+#undef flag_blocked
+#define flag_blocked info.write_blocked
+#include "fd_errno.h"
+
+ return fr_bio_error(IO);
+}
+
+static inline int fr_bio_fd_udpfromto_init4(int fd)
+{
+ int proto = 0, flag = 0, opt = 1;
+
+#ifdef IP_PKTINFO
+ /*
+ * Linux
+ */
+ proto = SOL_IP;
+ flag = IP_PKTINFO;
+
+#elif defined(IP_RECVDSTADDR)
+ /*
+ * Set the IP_RECVDSTADDR option (BSD). Note:
+ * IP_RECVDSTADDR == IP_SENDSRCADDR
+ */
+ proto = IPPROTO_IP;
+ flag = IP_RECVDSTADDR;
+#endif
+
+ return setsockopt(fd, proto, flag, &opt, sizeof(opt));
+}
+#endif
+
+#if defined(IPV6_PKTINFO)
+/** Read from a UDP socket where we can change our IP, IPv6 version.
+ */
+static ssize_t fr_bio_fd_recvfromto6(fr_bio_t *bio, void *packet_ctx, void *buffer, size_t size)
+{
+ ssize_t rcode;
+ struct cmsghdr *cmsg;
+ fr_time_t when = fr_time_wrap(0);
+ fr_bio_fd_t *my = talloc_get_type_abort(bio, fr_bio_fd_t);
+ fr_bio_fd_packet_ctx_t *addr = fr_bio_fd_packet_ctx(my, packet_ctx);
+
+ rcode = fd_fd_recvfromto_common(my, packet_ctx, buffer, size);
+ if (rcode <= 0) return rcode;
+
+DIAG_OFF(sign-compare)
+ /* Process auxiliary received data in msgh */
+ for (cmsg = CMSG_FIRSTHDR(&my->msgh);
+ cmsg != NULL;
+ cmsg = CMSG_NXTHDR(&my->msgh, cmsg)) {
+DIAG_ON(sign-compare)
+
+ if ((cmsg->cmsg_level == IPPROTO_IPV6) &&
+ (cmsg->cmsg_type == IPV6_PKTINFO)) {
+ struct in6_pktinfo *i = (struct in6_pktinfo *) CMSG_DATA(cmsg);
+ struct sockaddr_in6 to;
+
+ to.sin6_addr = i->ipi6_addr;
+
+ (void) fr_ipaddr_from_sockaddr(&addr->socket.inet.dst_ipaddr, &addr->socket.inet.dst_port,
+ (struct sockaddr_storage *) &to, sizeof(struct sockaddr_in6));
+ addr->socket.inet.ifindex = i->ipi6_ifindex;
+ break;
+ }
+
+#ifdef SO_TIMESTAMPNS
+ /*
+ * Timestamps are enabled and delivered at the socket level, so the level is SOL_SOCKET.
+ */
+ if ((cmsg->cmsg_level == SOL_SOCKET) && (cmsg->cmsg_type == SO_TIMESTAMPNS)) {
+ when = fr_time_from_timespec((struct timespec *)CMSG_DATA(cmsg));
+ }
+
+#elif defined(SO_TIMESTAMP)
+ if ((cmsg->cmsg_level == SOL_SOCKET) && (cmsg->cmsg_type == SO_TIMESTAMP)) {
+ when = fr_time_from_timeval((struct timeval *)CMSG_DATA(cmsg));
+ }
+#endif
+ }
+
+ if (fr_time_eq(when, fr_time_wrap(0))) when = fr_time();
+
+ addr->when = when;
+
+ return rcode;
+}
+
+/** Send to UDP socket where we can change our IP, IPv6 version.
+ */
+static ssize_t fr_bio_fd_sendfromto6(fr_bio_t *bio, void *packet_ctx, const void *buffer, size_t size)
+{
+ int tries = 0;
+ ssize_t rcode;
+ struct cmsghdr *cmsg;
+ struct sockaddr_storage to;
+ socklen_t to_len;
+ fr_bio_fd_t *my = talloc_get_type_abort(bio, fr_bio_fd_t);
+ fr_bio_fd_packet_ctx_t *addr = fr_bio_fd_packet_ctx(my, packet_ctx);
+
+ my->info.write_blocked = false;
+
+ memset(&my->cbuf, 0, sizeof(my->cbuf));
+ memset(&my->msgh, 0, sizeof(struct msghdr));
+
+ (void) fr_ipaddr_to_sockaddr(&to, &to_len, &addr->socket.inet.dst_ipaddr, addr->socket.inet.dst_port);
+
+ my->iov = (struct iovec) {
+ .iov_base = UNCONST(void *, buffer),
+ .iov_len = size,
+ };
+
+ my->msgh = (struct msghdr) {
+ .msg_control = my->cbuf,
+ // controllen is set below
+ .msg_name = &to,
+ .msg_namelen = to_len,
+ .msg_iov = &my->iov,
+ .msg_iovlen = 1,
+ .msg_flags = 0,
+ };
+
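+ {
+ struct in6_pktinfo *pkt;
+
+ my->msgh.msg_controllen = CMSG_SPACE(sizeof(*pkt));
+
+ /*
+ * CMSG_FIRSTHDR() returns NULL when msg_controllen is too small to hold a header, so it
+ * must be called after msg_controllen is set.
+ */
+ cmsg = CMSG_FIRSTHDR(&my->msgh);
+ cmsg->cmsg_level = IPPROTO_IPV6;
+ cmsg->cmsg_type = IPV6_PKTINFO;
+ cmsg->cmsg_len = CMSG_LEN(sizeof(*pkt));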
+
+ pkt = (struct in6_pktinfo *) CMSG_DATA(cmsg);
+ memset(pkt, 0, sizeof(*pkt));
+ pkt->ipi6_addr = addr->socket.inet.src_ipaddr.addr.v6;
+ pkt->ipi6_ifindex = addr->socket.inet.ifindex;
+ }
+
+retry:
+ rcode = sendmsg(my->info.socket.fd, &my->msgh, 0);
+ if (rcode >= 0) return rcode;
+
+#undef flag_blocked
+#define flag_blocked info.write_blocked
+#include "fd_errno.h"
+
+ return fr_bio_error(IO);
+}
+
+
+static inline int fr_bio_fd_udpfromto_init6(int fd)
+{
+ int opt = 1;
+
+ return setsockopt(fd, IPPROTO_IPV6, IPV6_RECVPKTINFO, &opt, sizeof(opt));
+}
+#endif
+
+int fr_filename_to_sockaddr(struct sockaddr_un *sun, socklen_t *sunlen, char const *filename)
+{
+ size_t len;
+
+ len = strlen(filename);
+ if (len >= sizeof(sun->sun_path)) {
+ fr_strerror_const("Failed parsing unix domain socket filename: Name is too long");
+ return -1;
+ }
+
+ sun->sun_family = AF_UNIX;
+ memcpy(sun->sun_path, filename, len + 1); /* SUN_LEN will do strlen */
+
+ *sunlen = SUN_LEN(sun);
+
+ return 0;
+}
+
+
+/** Try to connect().
+ *
+ * If connect is blocking, we either succeed or error immediately. Otherwise, the caller has to select the
+ * socket for writeability, and then call fr_bio_fd_connect() as soon as the socket is writeable.
+ */
+static ssize_t fr_bio_fd_try_connect(fr_bio_fd_t *my)
+{
+ int tries = 0;
+ int rcode;
+ socklen_t salen;
+ struct sockaddr_storage sockaddr;
+
+ if (my->info.socket.af != AF_UNIX) {
+ rcode = fr_ipaddr_to_sockaddr(&sockaddr, &salen, &my->info.socket.inet.dst_ipaddr, my->info.socket.inet.dst_port);
+ } else {
+ rcode = fr_filename_to_sockaddr((struct sockaddr_un *) &sockaddr, &salen, my->info.socket.unix.path);
+ }
+
+ if (rcode < 0) {
+ fr_bio_shutdown(&my->bio);
+ return fr_bio_error(GENERIC);
+ }
+
+ my->info.state = FR_BIO_FD_STATE_CONNECTING;
+
+retry:
+ if (connect(my->info.socket.fd, (struct sockaddr *) &sockaddr, salen) == 0) {
+ my->info.state = FR_BIO_FD_STATE_OPEN;
+
+ if (fr_bio_fd_init_common(my) < 0) goto fail;
+
+ return 0;
+ }
+
+ switch (errno) {
+ case EINTR:
+ tries++;
+ if (tries <= my->max_tries) goto retry;
+ FALL_THROUGH;
+
+ /*
+ * This shouldn't happen, but we'll allow it
+ */
+ case EALREADY:
+ FALL_THROUGH;
+
+ /*
+ * Once the socket is writable, it will be active, or in an error state. The caller has
+ * to call fr_bio_fd_connect() before calling write()
+ */
+ case EINPROGRESS:
+ my->info.write_blocked = true;
+ return fr_bio_error(IO_WOULD_BLOCK);
+
+ default:
+ break;
+ }
+
+fail:
+ fr_bio_shutdown(&my->bio);
+ return fr_bio_error(IO);
+}
+
+int fr_bio_fd_init_connected(fr_bio_fd_t *my)
+{
+ /*
+ * Connected datagrams must have real IPs
+ */
+ if (fr_ipaddr_is_inaddr_any(&my->info.socket.inet.src_ipaddr)) return -1;
+ if (fr_ipaddr_is_inaddr_any(&my->info.socket.inet.dst_ipaddr)) return -1;
+
+ /*
+ * Don't do any reads until we're connected.
+ */
+ my->bio.read = fr_bio_null_read;
+ my->bio.write = fr_bio_null_write;
+
+ my->info.eof = false;
+
+ /*
+ * The socket shouldn't be selected for read. But it should be selected for write.
+ */
+ my->info.read_blocked = false;
+ my->info.write_blocked = true;
+
+#ifdef SO_NOSIGPIPE
+ /*
+ * Although the server ignores SIGPIPE, some operating systems like BSD and OSX ignore the
+ * ignoring.
+ *
+ * Fortunately, those operating systems usually support SO_NOSIGPIPE. We set that to prevent
+ * them raising the signal in the first place.
+ */
+ {
+ int on = 1;
+
+ setsockopt(my->info.socket.fd, SOL_SOCKET, SO_NOSIGPIPE, &on, sizeof(on));
+ }
+#endif
+
+ return fr_bio_fd_try_connect(my);
+}
+
+int fr_bio_fd_init_common(fr_bio_fd_t *my)
+{
+ if (my->info.socket.type == SOCK_STREAM) { //!< stream socket
+ my->bio.read = fr_bio_fd_read_stream;
+ my->bio.write = fr_bio_fd_write;
+
+ } else if (my->info.type == FR_BIO_FD_CONNECTED) { //!< connected datagram
+ my->bio.read = fr_bio_fd_read_connected_datagram;
+ my->bio.write = fr_bio_fd_write;
+
+ } else if (!fr_ipaddr_is_inaddr_any(&my->info.socket.inet.src_ipaddr)) { //!< we know our IP address
+ my->bio.read = fr_bio_fd_recvfrom;
+ my->bio.write = fr_bio_fd_sendto;
+
+#if defined(IP_PKTINFO) || defined(IP_RECVDSTADDR)
+ } else if (my->info.socket.inet.src_ipaddr.af == AF_INET) { //!< we don't know our IPv4
+ if (fr_bio_fd_udpfromto_init4(my->info.socket.fd) < 0) return -1;
+
+ my->bio.read = fr_bio_fd_recvfromto4;
+ my->bio.write = fr_bio_fd_sendfromto4;
+#endif
+
+#if defined(IPV6_PKTINFO)
+ } else if (my->info.socket.inet.src_ipaddr.af == AF_INET6) { //!< we don't know our IPv6
+
+ if (fr_bio_fd_udpfromto_init6(my->info.socket.fd) < 0) return -1;
+
+ my->bio.read = fr_bio_fd_recvfromto6;
+ my->bio.write = fr_bio_fd_sendfromto6;
+#endif
+
+ } else {
+ fr_strerror_const("Failed initializing socket: cannot determine what to do");
+ return -1;
+ }
+
+ my->info.state = FR_BIO_FD_STATE_OPEN;
+ my->info.eof = false;
+ my->info.read_blocked = false;
+ my->info.write_blocked = false;
+
+ return 0;
+}
+
+/** Return an fd on read()
+ *
+ * With packet_ctx containing information about the socket.
+ */
+static ssize_t fr_bio_fd_read_accept(fr_bio_t *bio, void *packet_ctx, void *buffer, size_t size)
+{
+ int fd, tries = 0;
+ fr_bio_fd_t *my = talloc_get_type_abort(bio, fr_bio_fd_t);
+ socklen_t salen;
+ struct sockaddr_storage sockaddr;
+
+ if (size < sizeof(int)) return fr_bio_error(BUFFER_TOO_SMALL);
+
+ salen = sizeof(sockaddr);
+
+retry:
+#ifdef __linux__
+ /*
+ * Set these flags immediately on the new socket.
+ */
+ fd = accept4(my->info.socket.fd, (struct sockaddr *) &sockaddr, &salen, SOCK_NONBLOCK | SOCK_CLOEXEC);
+#else
+ fd = accept(my->info.socket.fd, (struct sockaddr *) &sockaddr, &salen);
+#endif
+ if (fd >= 0) {
+ fr_bio_fd_packet_ctx_t *addr = fr_bio_fd_packet_ctx(my, packet_ctx);
+
+ ADDR_INIT;
+
+ (void) fr_ipaddr_from_sockaddr(&addr->socket.inet.src_ipaddr, &addr->socket.inet.src_port,
+ &sockaddr, salen);
+
+ addr->socket.inet.dst_ipaddr = my->info.socket.inet.src_ipaddr;
+ addr->socket.inet.dst_port = my->info.socket.inet.src_port;
+ addr->socket.fd = fd; /* might as well! */
+
+ *(int *) buffer = fd;
+ return sizeof(int);
+ }
+
+ switch (errno) {
+ case EINTR:
+ /*
+ * Try a few times before giving up.
+ */
+ tries++;
+ if (tries <= my->max_tries) goto retry;
+ return 0;
+
+ /*
+ * We can ignore these errors.
+ */
+ case ECONNABORTED:
+#if defined(EWOULDBLOCK) && (EWOULDBLOCK != EAGAIN)
+ case EWOULDBLOCK:
+#endif
+ case EAGAIN:
+#ifdef EPERM
+ case EPERM:
+#endif
+#ifdef ETIMEDOUT
+ case ETIMEDOUT:
+#endif
+ return 0;
+
+ default:
+ /*
+ * Some other error, it's fatal.
+ */
+ fr_bio_shutdown(&my->bio);
+ break;
+ }
+
+ return fr_bio_error(IO);
+}
+
+
+int fr_bio_fd_init_accept(fr_bio_fd_t *my)
+{
+ my->info.state = FR_BIO_FD_STATE_OPEN;
+ my->info.eof = false;
+ my->info.read_blocked = true;
+ my->info.write_blocked = false; /* don't select() for write */
+
+ my->bio.read = fr_bio_fd_read_accept;
+ my->bio.write = fr_bio_null_write;
+
+ if (listen(my->info.socket.fd, 8) < 0) {
+ fr_strerror_printf("Failed calling listen(): %s", fr_syserror(errno));
+ return -1;
+ }
+
+ return 0;
+}
+
+
+/** Allocate a FD bio
+ *
+ * The caller is responsible for tracking the FD, and all associated management of it. The bio API is
+ * intended to be simple, and does not provide wrapper functions for various ioctls. The caller should
+ * instead do that work.
+ *
+ * Once the FD is given to the bio, its lifetime is "owned" by the bio. Calling talloc_free(bio) will close
+ * the FD.
+ *
+ * The caller can still manage the FD for being readable / writeable. However, the caller should not call
+ * this bio directly (unless it is the only one). Instead, the caller should read from / write to the
+ * previous bio which will then eventually call this one.
+ *
+ * Before updating any event handler readable / writeable callbacks, the caller should check
+ * fr_bio_fd_at_eof(). If true, then the handlers should not be inserted. The previous bios should still be
+ * called to process any pending data, until they return EOF.
+ *
+ * The main purpose of an FD bio is to wrap the FD in a bio container. That, and handling retries on read /
+ * write, along with returning EOF as an error instead of zero.
+ *
+ * Note that the read / write functions can return partial data. It is the caller's responsibility to ensure
+ * that any writes continue from where they left off (otherwise data will be missing). And any partial reads
+ * should go to a memory bio.
+ *
+ * If a read returns EOF, then the FD remains open until talloc_free(bio) or fr_bio_fd_close() is called.
+ *
+ * @param ctx the talloc ctx
+ * @param cb callbacks
+ * @param sock structure holding socket information
+ * src_ip is always *our* IP. dst_ip is always *their* IP.
+ * @param type type of the bio
+ * @param offset for datagram sockets, where #fr_bio_fd_packet_ctx_t is stored
+ * @return
+ * - NULL on error, memory allocation failed
+ * - !NULL the bio
+ */
+fr_bio_t *fr_bio_fd_alloc(TALLOC_CTX *ctx, fr_bio_cb_funcs_t *cb, fr_socket_t const *sock, fr_bio_fd_type_t type, size_t offset)
+{
+ fr_bio_fd_t *my;
+
+ my = talloc_zero(ctx, fr_bio_fd_t);
+ if (!my) return NULL;
+
+ if (cb) my->cb = *cb;
+ my->max_tries = 4;
+ my->offset = offset;
+
+ if (sock) {
+ my->info.type = type;
+ my->info.state = FR_BIO_FD_STATE_CLOSED;
+
+ if ((my->info.socket.fd >= 0) &&
+ (fr_bio_fd_init(&my->bio, sock) < 0)) {
+ talloc_free(my);
+ return NULL;
+ }
+ } else {
+ /*
+ * We can allocate a "place-holder" FD bio, and then later fill it in with
+ * fr_bio_fd_init().
+ *
+ * @todo - maybe just use fr_bio_fd_open() all of the time?
+ */
+ my->info = (fr_bio_fd_info_t) {
+ .socket = {
+ .af = AF_UNSPEC,
+ },
+ .type = type,
+ .read_blocked = true,
+ .write_blocked = true,
+ .eof = false,
+ .state = FR_BIO_FD_STATE_CLOSED,
+ };
+
+ my->bio.read = fr_bio_eof_read;
+ my->bio.write = fr_bio_null_write;
+ }
+
+ talloc_set_destructor(my, fr_bio_fd_destructor);
+ return (fr_bio_t *) my;
+}
+
+/** Close the FD, but leave the bio allocated and alive.
+ *
+ */
+int fr_bio_fd_close(fr_bio_t *bio)
+{
+ int rcode;
+ int tries = 0;
+ fr_bio_fd_t *my = talloc_get_type_abort(bio, fr_bio_fd_t);
+
+ if (my->info.state == FR_BIO_FD_STATE_CLOSED) return 0;
+
+ /*
+ * Shut the bio down cleanly.
+ */
+ rcode = fr_bio_shutdown(bio);
+ if (rcode < 0) return rcode;
+
+ my->bio.read = fr_bio_eof_read;
+ my->bio.write = fr_bio_null_write;
+
+ /*
+ * Shut down the connected socket. The only errors possible here are things we can't do anything
+ * about.
+ *
+ * shutdown() will close ALL versions of this file descriptor, even if it's (somehow) used in
+ * another process. shutdown() will also tell the kernel to gracefully close the connected
+ * socket, so that it can signal the other end, instead of having the connection disappear.
+ *
+ * This shouldn't strictly be necessary, as no other processes should be sharing this file
+ * descriptor. But it's the safe (and polite) thing to do.
+ */
+ if (my->info.type == FR_BIO_FD_CONNECTED) {
+ (void) shutdown(my->info.socket.fd, SHUT_RDWR);
+ }
+
+retry:
+ rcode = close(my->info.socket.fd);
+ if (rcode < 0) {
+ switch (errno) {
+ case EINTR:
+ case EIO:
+ tries++;
+ if (tries < my->max_tries) goto retry;
+ return -1;
+
+ default:
+ /*
+ * EBADF, or other unrecoverable error. We just call it closed, and continue.
+ */
+ break;
+ }
+ }
+
+ my->info.state = FR_BIO_FD_STATE_CLOSED;
+ my->info.read_blocked = true;
+ my->info.write_blocked = true;
+ my->info.eof = true;
+
+ return 0;
+}
+
+/** Re-open the bio
+ */
+int fr_bio_fd_init(fr_bio_t *bio, fr_socket_t const *sock)
+{
+ fr_bio_fd_t *my = talloc_get_type_abort(bio, fr_bio_fd_t);
+
+ fr_assert(my->info.socket.inet.src_ipaddr.af == my->info.socket.inet.dst_ipaddr.af);
+
+ /*
+ * The bio can't be open if we're re-initializing it.
+ */
+ if (my->info.state == FR_BIO_FD_STATE_OPEN) return -1;
+
+ my->info.socket = *sock;
+
+ switch (my->info.type) {
+ case FR_BIO_FD_UNCONNECTED:
+ return fr_bio_fd_init_common(my);
+
+ case FR_BIO_FD_CONNECTED:
+ return fr_bio_fd_init_connected(my);
+
+ case FR_BIO_FD_ACCEPT:
+ return fr_bio_fd_init_accept(my);
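+ }
+
+ /*
+ * Every known type returns from the switch above, so reaching here means the type is invalid.
+ */
+ fr_strerror_const("Failed initializing socket: invalid bio type");
+ return -1;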
+}
+
+/** Finalize a connect()
+ *
+ * connect() said "come back when the socket is writeable". It's now writeable, so we check if there was a
+ * connection error.
+ */
+int fr_bio_fd_connect(fr_bio_t *bio)
+{
+ int error;
+ socklen_t socklen = sizeof(error);
+ fr_bio_fd_t *my = talloc_get_type_abort(bio, fr_bio_fd_t);
+
+ if (my->info.state == FR_BIO_FD_STATE_OPEN) return 0;
+
+ if (my->info.state != FR_BIO_FD_STATE_CONNECTING) return fr_bio_error(GENERIC);
+
+ /*
+ * The socket is writeable. Let's see if there's an error.
+ *
+ * Unix Network Programming says:
+ *
+ * "If so_error is nonzero when the process calls write, -1 is returned with errno set to the
+ * value of SO_ERROR (p. 495 of TCPv2) and SO_ERROR is reset to 0."
+ *
+ * We have to check for the error, and if there's no error, set the state to "open".
+ *
+ * The same applies to connect(). If a non-blocking connect returns EINPROGRESS, the socket may
+ * later become writable. It will be writable even if the connection fails. Rather than writing
+ * some random application data, we query SO_ERROR, and get the underlying error.
+ */
+ if (getsockopt(my->info.socket.fd, SOL_SOCKET, SO_ERROR, (void *)&error, &socklen) < 0) {
+ fail:
+ fr_bio_shutdown(bio);
+ return fr_bio_error(IO);
+ }
+
+ /*
+ * getsockopt() succeeded, but the connect() itself may have failed.
+ */
+ if (error != 0) {
+ errno = error;
+ goto fail;
+ }
+
+ my->info.state = FR_BIO_FD_STATE_OPEN;
+
+ /*
+ * The socket is connected, so initialize the normal IO handlers.
+ */
+ if (fr_bio_fd_init_common(my) < 0) goto fail;
+
+ return 0;
+}
+
+/** Returns a pointer to the bio-specific information.
+ *
+ */
+fr_bio_fd_info_t const *fr_bio_fd_info(fr_bio_t *bio)
+{
+ fr_bio_fd_t *my = talloc_get_type_abort(bio, fr_bio_fd_t);
+
+ return &my->info;
+}
+
+
+/** Discard all reads from a UDP socket.
+ */
+static ssize_t fr_bio_fd_read_discard(fr_bio_t *bio, UNUSED void *packet_ctx, void *buffer, size_t size)
+{
+ int tries = 0;
+ ssize_t rcode;
+ fr_bio_fd_t *my = talloc_get_type_abort(bio, fr_bio_fd_t);
+
+ my->info.read_blocked = false;
+
+retry:
+ rcode = read(my->info.socket.fd, buffer, size);
+ if (rcode >= 0) return 0;
+
+#undef flag_blocked
+#define flag_blocked info.read_blocked
+#include "fd_errno.h"
+
+ return fr_bio_error(IO);
+}
+
+/** Mark a bio as write-only
+ *
+ */
+int fr_bio_fd_write_only(fr_bio_t *bio)
+{
+ fr_bio_fd_t *my = talloc_get_type_abort(bio, fr_bio_fd_t);
+
+ switch (my->info.type) {
+ case FR_BIO_FD_UNCONNECTED:
+ if (my->info.socket.type != SOCK_DGRAM) {
+ fr_strerror_const("Only datagram sockets can be marked 'write-only'");
+ return -1;
+ }
+ break;
+
+ case FR_BIO_FD_CONNECTED:
+ case FR_BIO_FD_ACCEPT:
+ fr_strerror_const("Only unconnected sockets can be marked 'write-only'");
+ return -1;
+ }
+
+ my->bio.read = fr_bio_fd_read_discard;
+ return 0;
+}
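The two-step connect performed by fr_bio_fd_try_connect() and fr_bio_fd_connect() can be illustrated outside of the bio API. The following is a minimal sketch in plain POSIX C, not the FreeRADIUS implementation. The helper name `example_connect_finish()` and its `timeout_ms` parameter are hypothetical and exist only for this illustration. It waits for the socket to become writable, then reads SO_ERROR to find out whether the connection actually succeeded.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>
#include <sys/socket.h>

/*
 * Hypothetical helper, not part of the bio API: finish a non-blocking
 * connect() on "fd". Returns 0 when connected, -1 with errno set otherwise.
 */
static int example_connect_finish(int fd, int timeout_ms)
{
	int error = 0;
	int rcode;
	socklen_t len = sizeof(error);
	struct pollfd pfd = { .fd = fd, .events = POLLOUT };

	assert(fd >= 0);

	/* Wait until the kernel reports the socket as writable. */
	do {
		rcode = poll(&pfd, 1, timeout_ms);
	} while ((rcode < 0) && (errno == EINTR));

	if (rcode < 0) return -1;
	if (rcode == 0) {
		errno = ETIMEDOUT;
		return -1;
	}

	/* Writable does not mean connected, so ask for the pending error. */
	if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &error, &len) < 0) return -1;

	if (error != 0) {
		errno = error;	/* e.g. ECONNREFUSED */
		return -1;
	}

	return 0;
}
```

A caller would typically use this only after connect() returned -1 with errno set to EINPROGRESS. In an event-driven server the poll() step is replaced by the event loop's "socket is writable" callback, which is where fr_bio_fd_connect() is meant to be called.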
--- /dev/null
+#pragma once
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
+ */
+
+/**
+ * $Id$
+ * @file lib/bio/fd.h
+ * @brief Binary IO abstractions for file descriptors
+ *
+ * Allow reads and writes from file descriptors.
+ *
+ * @copyright 2024 Network RADIUS SAS (legal@networkradius.com)
+ */
+RCSIDH(lib_bio_fd_h, "$Id$")
+
+#include <freeradius-devel/bio/base.h>
+#include <freeradius-devel/util/socket.h>
+
+/** Per-packet context
+ *
+ * For reading packets src_ip is *their* IP, and dst_ip is *our* IP.
+ *
+ * For writing packets, src_ip is *our* IP, and dst_ip is *their* IP.
+ *
+ * This context is returned only for datagram sockets. For stream sockets (TCP and Unix domain), it
+ * isn't used. The caller can look at the socket information to determine src/dst ip/port.
+ */
+typedef struct {
+ fr_time_t when; //!< when the packet was received
+ fr_socket_t socket; //!< socket information, including FD.
+} fr_bio_fd_packet_ctx_t;
+
+typedef enum {
+ FR_BIO_FD_STATE_INVALID = 0,
+ FR_BIO_FD_STATE_CLOSED,
+ FR_BIO_FD_STATE_OPEN, //!< error states must be before this
+ FR_BIO_FD_STATE_CONNECTING,
+} fr_bio_fd_state_t;
+
+typedef enum {
+ FR_BIO_FD_UNCONNECTED, //!< unconnected UDP / datagram only
+ // updates #fr_bio_fd_packet_ctx_t for reads,
+ // uses #fr_bio_fd_packet_ctx_t for writes
+ FR_BIO_FD_CONNECTED, //!< connected client sockets (UDP or TCP)
+ FR_BIO_FD_ACCEPT, //!< returns new fd in buffer on fr_bio_read()
+ // updates #fr_bio_fd_packet_ctx_t on successful FD read.
+} fr_bio_fd_type_t;
+
+/** Run-time status of the socket.
+ *
+ */
+typedef struct {
+ fr_socket_t socket; //!< as connected socket
+
+ fr_bio_fd_type_t type; //!< type of the socket
+
+ fr_bio_fd_state_t state; //!< connecting, open, closed, etc.
+
+ bool read_blocked; //!< did we block on read?
+ bool write_blocked; //!< did we block on write?
+ bool eof; //!< are we at EOF?
+
+} fr_bio_fd_info_t;
+
+/** Configuration for sockets
+ *
+ * Each piece of information is broken out into a separate field, so that the configuration file parser can
+ * parse each field independently.
+ *
+ * We also include more information here than we need in an #fr_socket_t.
+ */
+typedef struct {
+ fr_bio_fd_type_t type; //!< accept, connected, unconnected, etc.
+
+ int socket_type; //!< SOCK_STREAM or SOCK_DGRAM
+
+ fr_ipaddr_t src_ipaddr; //!< our IP address
+ fr_ipaddr_t dst_ipaddr; //!< their IP address
+
+ uint16_t src_port; //!< our port
+ uint16_t dst_port; //!< their port
+
+ char const *interface; //!< for binding to an interface
+
+ uint32_t recv_buff; //!< How big the kernel's receive buffer should be.
+ uint32_t send_buff; //!< How big the kernel's send buffer should be.
+
+ char const *path; //!< for Unix domain sockets
+ mode_t perm; //!< permissions for domain sockets
+ uid_t uid; //!< who owns the socket
+ gid_t gid; //!< who owns the socket
+
+ bool async; //!< is it async
+} fr_bio_fd_config_t;
+
+fr_bio_t *fr_bio_fd_alloc(TALLOC_CTX *ctx, fr_bio_cb_funcs_t *cb, fr_socket_t const *sock, fr_bio_fd_type_t type, size_t offset) CC_HINT(nonnull(1));
+
+int fr_bio_fd_close(fr_bio_t *bio) CC_HINT(nonnull);
+
+int fr_bio_fd_init(fr_bio_t *bio, fr_socket_t const *sock) CC_HINT(nonnull);
+
+int fr_bio_fd_connect(fr_bio_t *bio) CC_HINT(nonnull);
+
+fr_bio_fd_info_t const *fr_bio_fd_info(fr_bio_t *bio) CC_HINT(nonnull);
+
+int fr_bio_fd_socket_open(fr_bio_t *bio, fr_bio_fd_config_t const *cfg) CC_HINT(nonnull);
+
+int fr_bio_fd_write_only(fr_bio_t *bio);
--- /dev/null
+/*
+ * Code snippet to avoid duplication.
+ */
+switch (errno) {
+case EINTR:
+ /*
+ * Try a few times before giving up.
+ */
+ tries++;
+ if (tries <= my->max_tries) goto retry;
+ return 0;
+
+#if defined(EWOULDBLOCK) && (EWOULDBLOCK != EAGAIN)
+case EWOULDBLOCK:
+#endif
+case EAGAIN:
+ /*
+ * The operation would block, return that.
+ */
+ my->flag_blocked = true;
+ return fr_bio_error(IO_WOULD_BLOCK);
+
+default:
+ /*
+ * Some other error, it's fatal.
+ */
+ fr_bio_shutdown(&my->bio);
+ break;
+}
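The errno policy in the snippet above can be tried in isolation. The following is a minimal, self-contained sketch and not the FreeRADIUS code: `example_read_retry()`, `example_set_nonblock()`, `EXAMPLE_WOULD_BLOCK` and `EXAMPLE_IO_ERROR` are hypothetical names, where the two constants stand in for fr_bio_error(IO_WOULD_BLOCK) and fr_bio_error(IO), and the `max_tries` and `blocked` parameters stand in for my->max_tries and my->flag_blocked.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>
#include <sys/types.h>

#define EXAMPLE_IO_ERROR	(-1)	/* hypothetical stand-in for fr_bio_error(IO) */
#define EXAMPLE_WOULD_BLOCK	(-2)	/* hypothetical stand-in for fr_bio_error(IO_WOULD_BLOCK) */

/* Hypothetical helper: put a file descriptor into non-blocking mode. */
static int example_set_nonblock(int fd)
{
	int flags = fcntl(fd, F_GETFL, 0);

	if (flags < 0) return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Hypothetical helper: read() with the same errno policy as the snippet. */
static ssize_t example_read_retry(int fd, void *buffer, size_t size, int max_tries, bool *blocked)
{
	int tries = 0;
	ssize_t rcode;

	assert(blocked != NULL);
	*blocked = false;

retry:
	rcode = read(fd, buffer, size);
	if (rcode >= 0) return rcode;

	switch (errno) {
	case EINTR:
		/* Interrupted by a signal: try a few times, then report "no data". */
		tries++;
		if (tries <= max_tries) goto retry;
		return 0;

#if defined(EWOULDBLOCK) && (EWOULDBLOCK != EAGAIN)
	case EWOULDBLOCK:
#endif
	case EAGAIN:
		/* Nothing to read right now: the caller should wait for readability. */
		*blocked = true;
		return EXAMPLE_WOULD_BLOCK;

	default:
		/* Anything else is treated as fatal. */
		return EXAMPLE_IO_ERROR;
	}
}
```

Passing the "blocked" flag as a parameter is only a convenience for the sketch. The real snippet is textually included into each function, which is why it relies on the `flag_blocked` macro being redefined before every `#include "fd_errno.h"`.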
--- /dev/null
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or (at
+ * your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
+ */
+
+/**
+ * $Id$
+ * @file lib/bio/fd_open.c
+ * @brief BIO abstractions for opening file descriptors
+ *
+ * @copyright 2024 Network RADIUS SAS (legal@networkradius.com)
+ */
+
+#include <freeradius-devel/bio/fd_priv.h>
+#include <freeradius-devel/util/file.h>
+
+#include <sys/stat.h>
+#include <net/if.h>
+#include <fcntl.h>
+#include <libgen.h>
+
+/** Initialize common TCP information
+ *
+ */
+static int fr_bio_fd_common_tcp(int fd, UNUSED fr_socket_t const *sock, UNUSED fr_bio_fd_config_t const *cfg)
+{
+ int on = 1;
+
+#ifdef SO_KEEPALIVE
+ if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0) {
+ fr_strerror_printf("Failed setting SO_KEEPALIVE: %s", fr_syserror(errno));
+ return -1;
+ }
+#endif
+
+ return 0;
+}
+
+
+/** Initialize common datagram information
+ *
+ */
+static int fr_bio_fd_common_datagram(int fd, UNUSED fr_socket_t const *sock, fr_bio_fd_config_t const *cfg)
+{
+ int on = 1;
+
+#ifdef SO_TIMESTAMPNS
+ /*
+ * Enable receive timestamps, these should reflect
+ * when the packet was received, not when it was read
+ * from the socket.
+ */
+ if (setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPNS, &on, sizeof(int)) < 0) {
+ fr_strerror_printf("Failed setting SO_TIMESTAMPNS: %s", fr_syserror(errno));
+ return -1;
+ }
+
+#elif defined(SO_TIMESTAMP)
+ /*
+ * Enable receive timestamps, these should reflect
+ * when the packet was received, not when it was read
+ * from the socket.
+ */
+ if (setsockopt(fd, SOL_SOCKET, SO_TIMESTAMP, &on, sizeof(int)) < 0) {
+ fr_strerror_printf("Failed setting SO_TIMESTAMP: %s", fr_syserror(errno));
+ return -1;
+ }
+#endif
+
+#ifdef SO_RCVBUF
+ if (cfg->recv_buff) {
+ int opt = cfg->recv_buff;
+
+ if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &opt, sizeof(opt)) < 0) {
+ fr_strerror_printf("Failed setting SO_RCVBUF: %s", fr_syserror(errno));
+ return -1;
+ }
+ }
+#endif
+
+#ifdef SO_SNDBUF
+ if (cfg->send_buff) {
+ int opt = cfg->send_buff;
+
+ if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &opt, sizeof(opt)) < 0) {
+ fr_strerror_printf("Failed setting SO_SNDBUF: %s", fr_syserror(errno));
+ return -1;
+ }
+ }
+#endif
+
+ return 0;
+}
+
+/** Initialize a UDP server socket.
+ *
+ */
+static int fr_bio_fd_server_udp(int fd, fr_socket_t const *sock, fr_bio_fd_config_t const *cfg)
+{
+#ifdef SO_REUSEPORT
+ int on = 1;
+
+ /*
+ * Set SO_REUSEPORT before bind, so that all sockets can
+ * listen on the same destination IP address.
+ */
+ if (setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &on, sizeof(on)) < 0) {
+ fr_strerror_printf("Failed setting SO_REUSEPORT: %s", fr_syserror(errno));
+ return -1;
+ }
+#endif
+
+ return fr_bio_fd_common_datagram(fd, sock, cfg);
+}
+
+/** Initialize a TCP server socket.
+ *
+ */
+static int fr_bio_fd_server_tcp(int fd, UNUSED fr_socket_t const *sock)
+{
+ int on = 1;
+
+ if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0) {
+ fr_strerror_printf("Failed setting SO_REUSEADDR: %s", fr_syserror(errno));
+ return -1;
+ }
+
+ return 0;
+}
+
+/** Initialize an IPv4 server socket.
+ *
+ */
+static int fr_bio_fd_server_ipv4(int fd, fr_socket_t const *sock, fr_bio_fd_config_t const *cfg)
+{
+ int flag;
+
+#if defined(IP_MTU_DISCOVER) && defined(IP_PMTUDISC_DONT)
+ /*
+ * Disable PMTU discovery. On Linux, this also makes sure that the "don't
+ * fragment" flag is zero.
+ */
+ flag = IP_PMTUDISC_DONT;
+
+ if (setsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER, &flag, sizeof(flag)) < 0) {
+ fr_strerror_printf("Failed setting IP_MTU_DISCOVER: %s", fr_syserror(errno));
+ return -1;
+ }
+#endif
+
+#if defined(IP_DONTFRAG)
+ /*
+ * Ensure that the "don't fragment" flag is zero.
+ */
+ flag = 0;
+
+ if (setsockopt(fd, IPPROTO_IP, IP_DONTFRAG, &flag, sizeof(flag)) < 0) {
+ fr_strerror_printf("Failed setting IP_DONTFRAG: %s", fr_syserror(errno));
+ return -1;
+ }
+#endif
+
+ /*
+ * And set up any UDP / TCP specific information.
+ */
+ if (sock->type == SOCK_DGRAM) return fr_bio_fd_server_udp(fd, sock, cfg);
+
+ return fr_bio_fd_server_tcp(fd, sock);
+}
+
+/** Initialize an IPv6 server socket.
+ *
+ */
+static int fr_bio_fd_server_ipv6(int fd, fr_socket_t const *sock, fr_bio_fd_config_t const *cfg)
+{
+#ifdef IPV6_V6ONLY
+ /*
+ * Don't allow v4 packets on v6 connections.
+ */
+ if (IN6_IS_ADDR_UNSPECIFIED(UNCONST(struct in6_addr *, &sock->inet.src_ipaddr.addr.v6))) {
+ int on = 1;
+
+ if (setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, (char *)&on, sizeof(on)) < 0) {
+ fr_strerror_printf("Failed setting IPV6_V6ONLY: %s", fr_syserror(errno));
+ return -1;
+ }
+ }
+#endif /* IPV6_V6ONLY */
+
+ /*
+ * And set up any UDP / TCP specific information.
+ */
+ if (sock->type == SOCK_DGRAM) return fr_bio_fd_server_udp(fd, sock, cfg);
+
+ return fr_bio_fd_server_tcp(fd, sock);
+}
+
+/** Verify or clean up a pre-existing domain socket.
+ *
+ */
+static int fr_bio_fd_socket_unix_verify(int dirfd, char const *filename, fr_bio_fd_config_t const *cfg)
+{
+ int fd;
+ struct stat buf;
+
+ /*
+ * See if the socket exists. If there's an error opening it, that's an issue.
+ *
+ * If it doesn't exist, that's fine.
+ */
+ if (fstatat(dirfd, filename, &buf, AT_SYMLINK_NOFOLLOW) < 0) {
+ if (errno != ENOENT) {
+ fr_strerror_printf("Failed opening domain socket %s: %s", cfg->path, fr_syserror(errno));
+ return -1;
+ }
+
+ return 0;
+ }
+
+ /*
+ * If it exists, it must be a socket.
+ */
+ if (!S_ISSOCK(buf.st_mode)) {
+ fr_strerror_printf("Failed opening domain socket %s: it is not a socket", filename);
+ return -1;
+ }
+
+ /*
+ * Refuse to open sockets not owned by us. This prevents configurations from stomping on each
+ * other.
+ */
+ if (buf.st_uid != cfg->uid) {
+ fr_strerror_printf("Failed opening domain socket %s: incorrect UID", cfg->path);
+ return -1;
+ }
+
+ /*
+ * The file exists, and someone is listening. We can't claim it for ourselves.
+ *
+ * Note that this function calls connect(), but connect() always returns immediately for domain
+ * sockets.
+ *
+ * @todo - redo that function here, with separate checks for permission errors vs anything else.
+ */
+ fd = fr_socket_client_unix(cfg->path, false);
+ if (fd >= 0) {
+ close(fd);
+ fr_strerror_printf("Failed creating domain socket %s: It is currently active", cfg->path);
+ return -1;
+ }
+
+ /*
+ * It exists, but no one is listening. Delete it so that we can re-bind to it.
+ */
+ if (unlinkat(dirfd, filename, 0) < 0) {
+ fr_strerror_printf("Failed removing pre-existing domain socket %s: %s",
+ cfg->path, fr_syserror(errno));
+ return -1;
+ }
+
+ return 0;
+}
+
+/*
+ * We normally can't call fchmod() or fchown() on sockets, as they don't really exist in the file system.
+ * Instead, we enforce those permissions on the parent directory of the socket.
+ */
+static int fr_bio_fd_socket_unix_mkdir(int *dirfd, char const **filename, fr_bio_fd_config_t const *cfg)
+{
+ mode_t perm;
+ int parent_fd, fd;
+ char const *path = cfg->path;
+ char *dir, *p;
+ char *slashes[2];
+
+ perm = S_IREAD | S_IWRITE | S_IEXEC;
+ perm |= S_IRGRP | S_IWGRP | S_IXGRP;
+
+ /*
+ * The parent directory exists. Ensure that it has the correct ownership and permissions.
+ *
+ * If the parent directory exists, then it enforces access, and we can create the domain socket
+ * within it.
+ */
+ if (fr_dirfd(dirfd, filename, path) == 0) {
+ struct stat buf;
+
+ if (fstat(*dirfd, &buf) < 0) {
+ fr_strerror_printf("Failed reading parent directory for file %s: %s", path, fr_syserror(errno));
+ close(*dirfd);
+ return -1;
+ }
+
+ if (buf.st_uid != cfg->uid) {
+ fr_strerror_printf("Failed reading parent directory for file %s: Incorrect UID", path);
+ close(*dirfd);
+ return -1;
+ }
+
+ if (buf.st_gid != cfg->gid) {
+ fr_strerror_printf("Failed reading parent directory for file %s: Incorrect GID", path);
+ close(*dirfd);
+ return -1;
+ }
+
+ /*
+ * We don't have the correct permissions on the directory, so we fix them.
+ *
+ * @todo - allow for "other" to read/write if we do authentication on the socket?
+ */
+ if (fchmod(*dirfd, perm) < 0) {
+ fr_strerror_printf("Failed setting parent directory permissions for file %s: %s", path, fr_syserror(errno));
+ close(*dirfd);
+ return -1;
+ }
+
+ return 0;
+ }
+
+ dir = talloc_strdup(NULL, path);
+ if (!dir) return -1;
+
+ /*
+ * Find the last two directory separators.
+ */
+ slashes[0] = slashes[1] = NULL;
+ for (p = dir; *p != '\0'; p++) {
+ if (*p == '/') {
+ slashes[0] = slashes[1];
+ slashes[1] = p;
+ }
+ }
+
+ /*
+ * There's only one / in the path, we can't do anything.
+ *
+ * Opening 'foo/bar.sock' might be useful, but isn't normally a good idea.
+ */
+ if (!slashes[0]) {
+ fr_strerror_printf("Failed parsing filename %s: it is not absolute", path);
+ fail:
+ talloc_free(dir);
+ return -1;
+ }
+
+ /*
+ * Ensure that the grandparent directory exists.
+ *
+ * /var/run/radiusd/foo.sock
+ *
+ * slashes[0] points to the slash after 'run'.
+ *
+ * slashes[1] points to the slash after 'radiusd', which doesn't exist.
+ */
+ *slashes[0] = '\0';
+
+ /*
+ * If the grandparent doesn't exist, then we don't create it.
+ *
+ * These checks minimize the possibility that a misconfiguration by user "radiusd" can cause a
+ * suid-root binary to create a directory in the wrong place. These checks are only necessary
+ * if the unix domain socket is opened as root.
+ */
+ parent_fd = open(dir, O_DIRECTORY | O_NOFOLLOW);
+ if (parent_fd < 0) {
+ fr_strerror_printf("Failed opening directory %s: %s", dir, fr_syserror(errno));
+ goto fail;
+ }
+
+ /*
+ * Create the parent directory.
+ */
+ *slashes[0] = '/';
+ *slashes[1] = '\0';
+ if (mkdirat(parent_fd, dir, 0700) < 0) {
+ fr_strerror_printf("Failed creating directory %s: %s", dir, fr_syserror(errno));
+ close_parent:
+ close(parent_fd);
+ goto fail;
+ }
+
+ fd = openat(parent_fd, dir, O_DIRECTORY);
+ if (fd < 0) {
+ fr_strerror_printf("Failed opening directory %s: %s", dir, fr_syserror(errno));
+ goto close_parent;
+ }
+
+ if (fchmod(fd, perm) < 0) {
+ fr_strerror_printf("Failed changing permission for directory %s: %s", dir, fr_syserror(errno));
+ close_fd:
+ close(fd);
+ goto close_parent;
+ }
+
+ /*
+ * This is a NOOP if we're chowning a file owned by ourselves to our own UID / GID.
+ *
+ * Otherwise if we're running as root, it will set ownership to the correct user.
+ */
+ if (fchown(fd, cfg->uid, cfg->gid) < 0) {
+ fr_strerror_printf("Failed changing ownership for directory %s: %s", dir, fr_syserror(errno));
+ goto close_fd;
+ }
+
+ talloc_free(dir);
+ close(fd);
+ close(parent_fd);
+
+ return 0;
+}
+
+static int fr_bio_fd_unix_shutdown(fr_bio_t *bio)
+{
+ fr_bio_fd_t *my = talloc_get_type_abort(bio, fr_bio_fd_t);
+
+ /*
+ * The bio must be open in order to shut it down.
+ *
+ * Unix domain sockets are deleted when the bio is closed.
+ *
+ * Unix domain sockets are never in the "connecting" state, because connect() always returns
+ * immediately.
+ */
+ fr_assert(my->info.state == FR_BIO_FD_STATE_OPEN);
+
+ /*
+ * Run the user shutdown before we run ours.
+ */
+ if (my->user_shutdown) {
+ if (my->user_shutdown(bio) < 0) return -1;
+ }
+
+ return unlink(my->info.socket.unix.path);
+}
+
+/** Bind to a Unix domain socket.
+ *
+ * @todo - this function only does a tiny bit of what fr_server_domain_socket_peercred() and
+ * fr_server_domain_socket_perm() do. Those functions do a lot more sanity checks.
+ *
+ * The main question is whether or not those checks are useful. In many cases, fchmod() and fchown() are not
+ * possible on Unix sockets, so we shouldn't bother doing them.
+ *
+ * Note that the listeners generally call these functions with wrappers of fr_suid_up() and fr_suid_down().
+ * So these functions are running as "root", and will create files owned as "root".
+ */
+static int fr_bio_fd_socket_bind_unix(fr_bio_fd_t *my, fr_bio_fd_config_t const *cfg)
+{
+ int dirfd, rcode;
+ char const *filename, *p;
+ socklen_t sunlen;
+ struct sockaddr_un sun;
+
+ p = strrchr(my->info.socket.unix.path, '/');
+
+ /*
+ * The UID and GID should be taken automatically from the "user" and "group" settings in
+ * mainconfig. There is no reason to set them to anything else.
+ */
+ if (cfg->uid == (uid_t) -1) {
+ fr_strerror_printf("Failed opening domain socket %s: no UID specified", my->info.socket.unix.path);
+ return -1;
+ }
+
+ if (cfg->gid == (gid_t) -1) {
+ fr_strerror_printf("Failed opening domain socket %s: no GID specified", my->info.socket.unix.path);
+ return -1;
+ }
+
+ if (cfg->uid == 0) {
+ fr_strerror_printf("Failed opening domain socket %s: refusing to open as UID 0", my->info.socket.unix.path);
+ return -1;
+ }
+
+ if (cfg->gid == 0) {
+ fr_strerror_printf("Failed opening domain socket %s: refusing to open as GID 0", my->info.socket.unix.path);
+ return -1;
+ }
+
+ /*
+ * Opening 'foo.sock' is OK.
+ */
+ if (!p) {
+ dirfd = AT_FDCWD;
+ filename = my->info.socket.unix.path;
+
+ } else if (p == my->info.socket.unix.path) {
+ /*
+ * Opening '/foo.sock' is dumb.
+ */
+ fr_strerror_printf("Failed opening domain socket %s: cannot exist at file system root", p);
+ return -1;
+
+ } else if (fr_bio_fd_socket_unix_mkdir(&dirfd, &filename, cfg) < 0) {
+ return -1;
+ }
+
+ /*
+ * Verify and/or clean up the domain socket.
+ */
+ if (fr_bio_fd_socket_unix_verify(dirfd, filename, cfg) < 0) {
+ fail:
+ if (dirfd != AT_FDCWD) close(dirfd);
+ return -1;
+ }
+
+#ifdef HAVE_BINDAT
+ /*
+ * The best function to use here is bindat(), but only quite recent versions of FreeBSD actually
+ * have it, and it's definitely not POSIX.
+ *
+ * If we use bindat(), we pass a relative pathname.
+ */
+ if (fr_filename_to_sockaddr(&sun, &sunlen, filename) < 0) goto fail;
+
+ rcode = bindat(dirfd, my->info.socket.fd, (struct sockaddr *) &sun, sunlen);
+#else
+ /*
+ * For bind(), we pass the full path.
+ */
+ if (fr_filename_to_sockaddr(&sun, &sunlen, my->info.socket.unix.path) < 0) goto fail;
+
+ rcode = bind(my->info.socket.fd, (struct sockaddr *) &sun, sunlen);
+#endif
+ if (rcode < 0) {
+ /*
+ * @todo - if EADDRINUSE, then the socket exists. Try connect(), and if that fails,
+ * delete the socket and try again. This may be simpler than the checks above.
+ */
+ fr_strerror_printf("Failed binding to domain socket %s: %s", my->info.socket.unix.path, fr_syserror(errno));
+ goto fail;
+ }
+
+#ifdef __linux__
+ /*
+ * Linux supports chown && chmod for sockets.
+ */
+ if (fchmod(my->info.socket.fd, S_IREAD | S_IWRITE | S_IEXEC | S_IRGRP | S_IWGRP | S_IXGRP) < 0) {
+ fr_strerror_printf("Failed changing permission for domain socket %s: %s", my->info.socket.unix.path, fr_syserror(errno));
+ goto fail;
+ }
+
+ /*
+ * This is a NOOP if we're chowning a file owned by ourselves to our own UID / GID.
+ *
+ * Otherwise if we're running as root, it will set ownership to the correct user.
+ */
+ if (fchown(my->info.socket.fd, cfg->uid, cfg->gid) < 0) {
+ fr_strerror_printf("Failed changing ownership for domain socket %s: %s", my->info.socket.unix.path, fr_syserror(errno));
+ goto fail;
+ }
+
+#endif
+
+ /*
+ * Socket is open. We need to clean it up on shutdown.
+ */
+ if (my->cb.shutdown) my->user_shutdown = my->cb.shutdown;
+ my->cb.shutdown = fr_bio_fd_unix_shutdown;
+
+ return 0;
+}
+
+#ifdef SO_BINDTODEVICE
+/** Linux bind to device by name.
+ *
+ */
+static int fr_bio_fd_socket_bind_to_device(fr_bio_fd_t *my, fr_bio_fd_config_t const *cfg)
+{
+ char *ifname;
+ char buffer[IFNAMSIZ];
+
+ /*
+ * ifindex isn't set, do nothing.
+ */
+ if (!my->info.socket.inet.ifindex) return 0;
+
+ /*
+ * The internet hints that CAP_NET_RAW is required to use SO_BINDTODEVICE.
+ *
+ * This function also sets fr_strerror() on failure, which will be seen if the bind fails. If
+ * the bind succeeds, then we don't really care that the capability change has failed. We must
+ * already have that capability.
+ */
+#ifdef HAVE_CAPABILITY_H
+ (void)fr_cap_enable(CAP_NET_RAW, CAP_EFFECTIVE);
+#endif
+
+ if (setsockopt(my->info.socket.fd, SOL_SOCKET, SO_BINDTODEVICE, cfg->interface, strlen(cfg->interface)) < 0) {
+ fr_strerror_printf("Failed setting SO_BINDTODEVICE for %s: %s", cfg->interface, fr_syserror(errno));
+ return -1;
+ }
+
+ return 0;
+}
+
+#elif defined(IP_BOUND_IF) || defined(IPV6_BOUND_IF)
+/** *BSD bind to interface by index.
+ *
+ */
+static int fr_bio_fd_socket_bind_to_device(fr_bio_fd_t *my, UNUSED fr_bio_fd_config_t const *cfg)
+{
+ int opt, rcode;
+
+ if (!my->info.socket.inet.ifindex) return 0;
+
+ opt = my->info.socket.inet.ifindex;
+
+ switch (my->info.socket.af) {
+ case AF_INET:
+ rcode = setsockopt(my->info.socket.fd, IPPROTO_IP, IP_BOUND_IF, &opt, sizeof(opt));
+ break;
+
+ case AF_INET6:
+ rcode = setsockopt(my->info.socket.fd, IPPROTO_IPV6, IPV6_BOUND_IF, &opt, sizeof(opt));
+ break;
+
+ default:
+ rcode = -1;
+ errno = EAFNOSUPPORT;
+ break;
+ }
+
+ if (rcode < 0) fr_strerror_printf("Failed setting IP_BOUND_IF: %s", fr_syserror(errno));
+ return rcode;
+}
+#else
+
+#error This system is missing SO_BINDTODEVICE, IP_BOUND_IF, IPV6_BOUND_IF
+
+/** ??? Who knows?
+ *
+ */
+static int fr_bio_fd_socket_bind_to_device(fr_bio_fd_t *my, fr_bio_fd_config_t const *cfg)
+{
+ /*
+ * @todo - see fr_socket_bind(). Troll through the interfaces to see which interface has a name
+ * which matches the named interface. If so, copy over its IP to our src_ip, so long as src_ip
+ * is INADDR_ANY.
+ */
+
+ return -1;
+}
+
+/* bind to device */
+#endif
+
+static int fr_bio_fd_socket_bind(fr_bio_fd_t *my, fr_bio_fd_config_t const *cfg)
+{
+ socklen_t salen;
+ struct sockaddr_storage salocal;
+
+ if (my->info.socket.af == AF_UNIX) {
+ return fr_bio_fd_socket_bind_unix(my, cfg);
+ }
+
+#ifdef HAVE_CAPABILITY_H
+ /*
+ * If we're binding to a special port as non-root, then
+ * check capabilities. If we're root, we already have
+ * equivalent capabilities so we don't need to check.
+ */
+ if ((my->info.socket.inet.src_port < 1024) && (geteuid() != 0)) {
+ (void)fr_cap_enable(CAP_NET_BIND_SERVICE, CAP_EFFECTIVE);
+ }
+#endif
+
+ if (fr_bio_fd_socket_bind_to_device(my, cfg) < 0) return -1;
+
+ /*
+ * Bind to the IP + interface.
+ */
+ if (fr_ipaddr_to_sockaddr(&salocal, &salen, &my->info.socket.inet.src_ipaddr, my->info.socket.inet.src_port) < 0) return -1;
+
+ if (bind(my->info.socket.fd, (struct sockaddr *) &salocal, salen) < 0) {
+ fr_strerror_printf("Failed binding to socket: %s", fr_syserror(errno));
+ return -1;
+ }
+
+ /*
+ * FreeBSD jail issues. We bind to 0.0.0.0, but the
+ * kernel instead binds us to 1.2.3.4. So once the
+ * socket is bound, ask it what its IP address is.
+ */
+ salen = sizeof(salocal);
+ memset(&salocal, 0, salen);
+ if (getsockname(my->info.socket.fd, (struct sockaddr *) &salocal, &salen) < 0) {
+ fr_strerror_printf("Failed getting socket name: %s", fr_syserror(errno));
+ return -1;
+ }
+
+ if (fr_ipaddr_from_sockaddr(&my->info.socket.inet.src_ipaddr, &my->info.socket.inet.src_port, &salocal, salen) < 0) return -1;
+
+ return 0;
+}
+
+/** Opens a socket and updates sock->fd
+ *
+ * Note that it does not call connect()!
+ */
+int fr_bio_fd_socket_open(fr_bio_t *bio, fr_bio_fd_config_t const *cfg)
+{
+ int fd, protocol;
+ int rcode;
+ fr_bio_fd_t *my = talloc_get_type_abort(bio, fr_bio_fd_t);
+
+ fr_strerror_clear();
+
+ my->info.socket = (fr_socket_t) {};
+
+ if (cfg->path) {
+ my->info.socket.af = AF_UNIX;
+ } else {
+ my->info.socket.af = cfg->src_ipaddr.af;
+ }
+ my->info.socket.type = cfg->socket_type;
+
+ switch (my->info.socket.af) {
+ case AF_INET:
+ case AF_INET6:
+ my->info.socket.inet.src_ipaddr = cfg->src_ipaddr;
+ my->info.socket.inet.dst_ipaddr = cfg->dst_ipaddr;
+ my->info.socket.inet.src_port = cfg->src_port;
+ my->info.socket.inet.dst_port = cfg->dst_port;
+
+ if (cfg->socket_type == SOCK_STREAM) {
+ protocol = IPPROTO_TCP;
+ } else {
+ protocol = IPPROTO_UDP;
+ }
+
+ if (cfg->interface) {
+ my->info.socket.inet.ifindex = if_nametoindex(cfg->interface);
+
+ if (!my->info.socket.inet.ifindex) {
+ fr_strerror_printf_push("Failed finding interface %s: %s", cfg->interface, fr_syserror(errno));
+ return -1;
+ }
+ }
+ break;
+
+ case AF_UNIX:
+ my->info.socket.unix.path = cfg->path;
+ my->info.socket.type = SOCK_STREAM;
+ protocol = 0;
+ break;
+
+ default:
+ fr_strerror_const("Failed opening socket: unsupported address family");
+ return -1;
+ }
+
+ /*
+ * Open the socket.
+ */
+ fd = socket(my->info.socket.af, my->info.socket.type, protocol);
+ if (fd < 0) {
+ fr_strerror_printf("Failed opening socket: %s", fr_syserror(errno));
+ return -1;
+ }
+
+ /*
+ * Set it to be non-blocking if required.
+ */
+ if (cfg->async && (fr_nonblock(fd) < 0)) {
+ fr_strerror_printf("Failed setting O_NONBLOCK: %s", fr_syserror(errno));
+
+ fail:
+ my->info.socket.fd = -1;
+ my->info.state = FR_BIO_FD_STATE_CLOSED;
+ close(fd);
+ return -1;
+ }
+
+#ifdef FD_CLOEXEC
+ /*
+ * We don't want child processes inheriting these file descriptors.
+ */
+ rcode = fcntl(fd, F_GETFD);
+ if (rcode >= 0) {
+ if (fcntl(fd, F_SETFD, rcode | FD_CLOEXEC) < 0) {
+ fr_strerror_printf("Failed setting FD_CLOEXEC: %s", fr_syserror(errno));
+ goto fail;
+ }
+ }
+#endif
+
+ /*
+ * Initialize the bio information before calling the various setup functions.
+ */
+ my->info.state = (cfg->type == FR_BIO_FD_CONNECTED) ? FR_BIO_FD_STATE_CONNECTING : FR_BIO_FD_STATE_OPEN;
+
+ /*
+ * Set the FD so that the subsequent calls can use it.
+ */
+ my->info.socket.fd = fd;
+
+ /*
+ * Do sanity checks, bootstrap common socket options, bind to the socket, and initialize the read
+ * / write functions.
+ */
+ switch (cfg->type) {
+ /*
+ * Unconnected UDP or datagram AF_UNIX server sockets.
+ */
+ case FR_BIO_FD_UNCONNECTED:
+ if (my->info.socket.type != SOCK_DGRAM) {
+ fr_strerror_const("Failed configuring socket: unconnected sockets must be UDP");
+ goto fail;
+ }
+
+ if (my->info.socket.af == AF_UNIX) {
+ rcode = fr_bio_fd_common_datagram(fd, &my->info.socket, cfg);
+ } else {
+ rcode = fr_bio_fd_server_udp(fd, &my->info.socket, cfg); /* sets SO_REUSEPORT, too */
+ }
+ if (rcode < 0) goto fail;
+
+ if (fr_bio_fd_socket_bind(my, cfg) < 0) goto fail;
+
+ if (fr_bio_fd_init_common(my) < 0) goto fail;
+ break;
+
+ /*
+ * A connected client: UDP, TCP, or AF_UNIX.
+ */
+ case FR_BIO_FD_CONNECTED:
+ if (my->info.socket.type == SOCK_DGRAM) {
+ rcode = fr_bio_fd_common_datagram(fd, &my->info.socket, cfg); /* we don't use SO_REUSEPORT for clients */
+ if (rcode < 0) goto fail;
+
+ } else if (my->info.socket.af != AF_UNIX) {
+ rcode = fr_bio_fd_common_tcp(fd, &my->info.socket, cfg);
+ if (rcode < 0) goto fail;
+ }
+
+ if (fr_bio_fd_socket_bind(my, cfg) < 0) goto fail;
+
+ if (fr_bio_fd_init_connected(my) < 0) goto fail;
+ break;
+
+ /*
+ * Server socket which listens for new stream connections
+ */
+ case FR_BIO_FD_ACCEPT:
+ fr_assert(my->info.socket.type == SOCK_STREAM);
+
+ switch (my->info.socket.af) {
+ case AF_INET:
+ rcode = fr_bio_fd_server_ipv4(fd, &my->info.socket, cfg);
+ break;
+
+ case AF_INET6:
+ rcode = fr_bio_fd_server_ipv6(fd, &my->info.socket, cfg);
+ break;
+
+ case AF_UNIX:
+ rcode = 0;
+ break;
+
+ default:
+ rcode = -1;
+ errno = EAFNOSUPPORT;
+ break;
+ }
+ if (rcode < 0) goto fail;
+
+ if (fr_bio_fd_socket_bind(my, cfg) < 0) goto fail;
+
+ if (fr_bio_fd_init_accept(my) < 0) goto fail;
+ break;
+ }
+ return 0;
+}
--- /dev/null
+#pragma once
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
+ */
+
+/**
+ * $Id$
+ * @file lib/bio/fd_priv.h
+ * @brief Private binary IO abstractions for file descriptors
+ *
+ * @copyright 2024 Network RADIUS SAS (legal@networkradius.com)
+ */
+RCSIDH(lib_bio_fd_privh, "$Id$")
+
+#include <freeradius-devel/util/syserror.h>
+
+#include <freeradius-devel/bio/bio_priv.h>
+#include <freeradius-devel/bio/fd.h>
+
+/** Our FD bio structure.
+ *
+ */
+typedef struct fr_bio_fd_s {
+ FR_BIO_COMMON;
+ fr_bio_callback_t user_shutdown; //!< user shutdown
+
+ fr_bio_fd_info_t info;
+
+ int max_tries; //!< how many times we retry on EINTR
+ size_t offset; //!< where #fr_bio_fd_packet_ctx_t is stored
+
+#if defined(IP_PKTINFO) || defined(IP_RECVDSTADDR) || defined(IPV6_PKTINFO)
+ struct iovec iov; //!< for recvfromto
+ struct msghdr msgh; //!< for recvfromto
+ uint8_t cbuf[256]; //!< for recvfromto
+#endif
+} fr_bio_fd_t;
+
+#define fr_bio_fd_packet_ctx(_my, _packet_ctx) ((fr_bio_fd_packet_ctx_t *) (((uint8_t *) _packet_ctx) + _my->offset))
+
+int fr_filename_to_sockaddr(struct sockaddr_un *sun, socklen_t *sunlen, char const *filename) CC_HINT(nonnull);
+
+int fr_bio_fd_init_common(fr_bio_fd_t *my);
+
+int fr_bio_fd_init_connected(fr_bio_fd_t *my);
+
+int fr_bio_fd_init_accept(fr_bio_fd_t *my);
--- /dev/null
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or (at
+ * your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
+ */
+
+/**
+ * $Id$
+ * @file lib/bio/haproxy.c
+ * @brief BIO abstractions for HA proxy protocol interceptors
+ *
+ * @copyright 2024 Network RADIUS SAS (legal@networkradius.com)
+ */
+
+#include <freeradius-devel/bio/bio_priv.h>
+#include <freeradius-devel/bio/null.h>
+#include <freeradius-devel/bio/buf.h>
+
+#include <freeradius-devel/bio/haproxy.h>
+
+#define HAPROXY_HEADER_V1_SIZE (108)
+
+/** The haproxy bio
+ *
+ */
+typedef struct {
+ FR_BIO_COMMON;
+
+ fr_bio_haproxy_info_t info; //!< Information about the "real" client which has connected.
+ // @todo - for v2 of the haproxy protocol, add TLS parameters!
+
+ fr_bio_buf_t buffer; //!< intermediate buffer to read the haproxy header
+
+ bool available; //!< is the haproxy header available and done
+} fr_bio_haproxy_t;
+
+/** Parse the haproxy header, version 1.
+ *
+ */
+static ssize_t fr_bio_haproxy_v1(fr_bio_haproxy_t *my)
+{
+ int af, argc, port;
+ ssize_t rcode;
+ uint8_t *p, *end;
+ char *eos, *argv[5];
+
+ p = my->buffer.read;
+ end = my->buffer.write;
+
+ /*
+ * We only support v1, and only TCP.
+ */
+ if (memcmp(my->buffer.read, "PROXY TCP", 9) != 0) {
+ fail:
+ fr_bio_shutdown(&my->bio);
+ return fr_bio_error(VERIFY);
+ }
+ p += 9;
+
+ if (*p == '4') {
+ af = AF_INET;
+
+ } else if (*p == '6') {
+ af = AF_INET6;
+
+ } else {
+ goto fail;
+ }
+ p++;
+
+ if (*(p++) != ' ') goto fail;
+
+ argc = 0;
+ rcode = -1;
+ while (p < end) {
+ if (*p > ' ') {
+ if (argc > 4) goto fail;
+
+ argv[argc++] = (char *) p;
+
+ while ((p < end) && (*p > ' ')) p++;
+ continue;
+ }
+
+ if (*p < ' ') {
+ if ((end - p) < 2) goto fail;
+
+ if (memcmp(p, "\r\n", 2) != 0) goto fail;
+
+ *p = '\0';
+ end = p + 2;
+ rcode = 0;
+ break;
+ }
+
+ if (*p != ' ') goto fail;
+
+ *(p++) = '\0';
+ }
+
+ /*
+ * Didn't end with CRLF.
+ */
+ if (rcode < 0) goto fail;
+
+ /*
+ * We need the source IP, destination IP, source port, and destination port.
+ */
+ if (argc < 4) goto fail;
+
+ if (fr_inet_pton(&my->info.socket.inet.src_ipaddr, argv[0], -1, af, false, false) < 0) goto fail;
+ if (fr_inet_pton(&my->info.socket.inet.dst_ipaddr, argv[1], -1, af, false, false) < 0) goto fail;
+
+ port = strtoul(argv[2], &eos, 10);
+ if (port > 65535) goto fail;
+ if (*eos) goto fail;
+ my->info.socket.inet.src_port = port;
+
+ port = strtoul(argv[3], &eos, 10);
+ if (port > 65535) goto fail;
+ if (*eos) goto fail;
+ my->info.socket.inet.dst_port = port;
+
+ /*
+ * Return how many bytes we read. The remainder are for the application.
+ */
+ return (end - my->buffer.read);
+}
+
+/** Satisfy reads from the "next" bio
+ *
+ * The caveat is that there may be data left in our buffer which is needed for the application. We can't
+ * unchain ourselves until we've returned that data to the application, and emptied our buffer.
+ */
+static ssize_t fr_bio_haproxy_read_next(fr_bio_t *bio, UNUSED void *packet_ctx, void *buffer, size_t size)
+{
+ ssize_t rcode;
+ size_t used;
+ fr_bio_haproxy_t *my = talloc_get_type_abort(bio, fr_bio_haproxy_t);
+
+ my->available = true;
+
+ used = fr_bio_buf_used(&my->buffer);
+
+ /*
+ * Somehow (magically) we can satisfy the read from our buffer. Do so. Note that we do NOT run
+ * the activation callback, as there is still data in our buffer
+ */
+ if (size < used) {
+ (void) fr_bio_buf_read(&my->buffer, buffer, size);
+ return size;
+ }
+
+ /*
+ * We are asked to empty the buffer. Copy the data to the caller.
+ */
+ (void) fr_bio_buf_read(&my->buffer, buffer, used);
+
+ /*
+ * Call the user's activation function, which might remove us from the proxy chain.
+ */
+ if (my->cb.activate) {
+ rcode = my->cb.activate(bio);
+ if (rcode < 0) return rcode;
+ }
+
+ return used;
+}
+
+/** Read from the next bio, and determine if we have an haproxy header.
+ *
+ */
+static ssize_t fr_bio_haproxy_read(fr_bio_t *bio, void *packet_ctx, void *buffer, size_t size)
+{
+ ssize_t rcode;
+ fr_bio_haproxy_t *my = talloc_get_type_abort(bio, fr_bio_haproxy_t);
+ fr_bio_t *next;
+
+ next = fr_bio_next(&my->bio);
+ fr_assert(next != NULL);
+
+ fr_assert(fr_bio_buf_write_room(&my->buffer) > 0);
+
+ rcode = next->read(next, NULL, my->buffer.write, fr_bio_buf_write_room(&my->buffer));
+ if (rcode <= 0) return rcode;
+
+ /*
+ * Track the data we just read, so that fr_bio_buf_used() reflects it.
+ */
+ my->buffer.write += rcode;
+
+ /*
+ * Not enough data for a full v1 header, tell the caller
+ * that no data was read. The caller should call us
+ * again when the underlying FD is readable.
+ */
+ if (fr_bio_buf_used(&my->buffer) < 16) return 0;
+
+ /*
+ * Process haproxy protocol v1 header.
+ */
+ rcode = fr_bio_haproxy_v1(my);
+ if (rcode <= 0) return rcode;
+
+ /*
+ * We've read a number of bytes from our buffer. The remaining ones are for the application.
+ */
+ (void) fr_bio_buf_read(&my->buffer, NULL, rcode);
+ my->bio.read = fr_bio_haproxy_read_next;
+
+ return fr_bio_haproxy_read_next(bio, packet_ctx, buffer, size);
+}
+
+/** Allocate an haproxy bio.
+ *
+ */
+fr_bio_t *fr_bio_haproxy_alloc(TALLOC_CTX *ctx, fr_bio_cb_funcs_t *cb, fr_bio_t *next)
+{
+ fr_bio_haproxy_t *my;
+ uint8_t *data;
+
+ my = talloc_zero(ctx, fr_bio_haproxy_t);
+ if (!my) return NULL;
+
+ data = talloc_array(my, uint8_t, HAPROXY_HEADER_V1_SIZE);
+ if (!data) {
+ talloc_free(my);
+ return NULL;
+ }
+
+ fr_bio_buf_init(&my->buffer, data, HAPROXY_HEADER_V1_SIZE);
+
+ my->bio.read = fr_bio_haproxy_read;
+ my->bio.write = fr_bio_null_write; /* can't write to this bio */
+ my->cb = *cb;
+
+ fr_bio_chain(&my->bio, next);
+
+ talloc_set_destructor((fr_bio_t *) my, fr_bio_destructor);
+ return (fr_bio_t *) my;
+}
+
+/** Get client information from the haproxy bio.
+ *
+ */
+fr_bio_haproxy_info_t const *fr_bio_haproxy_info(fr_bio_t *bio)
+{
+ fr_bio_haproxy_t *my = talloc_get_type_abort(bio, fr_bio_haproxy_t);
+
+ if (!my->available) return NULL;
+
+ return &my->info;
+}
--- /dev/null
+#pragma once
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
+ */
+
+/**
+ * $Id$
+ * @file lib/bio/haproxy.h
+ * @brief Binary IO abstractions for HA proxy protocol interceptors
+ *
+ * The haproxy bio should be inserted before an FD bio. The caller
+ * can then read from it until the "activation" function is called.
+ * The activate callback should unchain the haproxy bio, and add the
+ * real top-level bio. Or, just use the FD bio as-is.
+ *
+ * This process means that the caller should manually cache pointers
+ * to the individual bios, so that they can be tracked and queried as
+ * necessary.
+ *
+ * @copyright 2024 Network RADIUS SAS (legal@networkradius.com)
+ */
+RCSIDH(lib_bio_haproxy_h, "$Id$")
+
+#include <freeradius-devel/util/socket.h>
+
+/** Data structure which describes the "real" client connection.
+ *
+ */
+typedef struct {
+ fr_socket_t socket;
+} fr_bio_haproxy_info_t;
+
+fr_bio_t *fr_bio_haproxy_alloc(TALLOC_CTX *ctx, fr_bio_cb_funcs_t *cb, fr_bio_t *next) CC_HINT(nonnull);
+
+fr_bio_haproxy_info_t const *fr_bio_haproxy_info(fr_bio_t *bio) CC_HINT(nonnull);
--- /dev/null
+TARGET := libfreeradius-bio$(L)
+
+SOURCES := \
+ base.c \
+ buf.c \
+ fd.c \
+ fd_open.c \
+ haproxy.c \
+ mem.c \
+ network.c \
+ null.c \
+ packet.c \
+ pipe.c
+
+TGT_PREREQS := libfreeradius-util$(L)
--- /dev/null
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or (at
+ * your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
+ */
+
+/**
+ * $Id$
+ * @file lib/bio/mem.c
+ * @brief BIO abstractions for memory buffers
+ *
+ * @copyright 2024 Network RADIUS SAS (legal@networkradius.com)
+ */
+
+#include <freeradius-devel/bio/bio_priv.h>
+#include <freeradius-devel/bio/null.h>
+#include <freeradius-devel/bio/buf.h>
+
+#include <freeradius-devel/bio/mem.h>
+
+/** The memory buffer bio
+ *
+ * It is used to buffer reads / writes to a streaming socket.
+ */
+typedef struct fr_bio_mem_s {
+ FR_BIO_COMMON;
+
+ fr_bio_verify_t verify; //!< verify data to see if we have a packet.
+
+ fr_bio_buf_t read_buffer; //!< buffering for reads
+ fr_bio_buf_t write_buffer; //!< buffering for writes
+} fr_bio_mem_t;
+
+static ssize_t fr_bio_mem_write_buffer(fr_bio_t *bio, void *packet_ctx, void const *buffer, size_t size);
+
+static int fr_bio_mem_verify_packet(fr_bio_t *bio, void *packet_ctx, size_t *size) CC_HINT(nonnull(1,3));
+
+/** At EOF, read data from the buffer until it is empty.
+ *
+ * When "next" bio returns EOF, there may still be pending data in the memory buffer. Return that until it's
+ * empty, and then EOF from then on.
+ */
+static ssize_t fr_bio_mem_read_eof(fr_bio_t *bio, UNUSED void *packet_ctx, void *buffer, size_t size)
+{
+ fr_bio_mem_t *my = talloc_get_type_abort(bio, fr_bio_mem_t);
+
+ /*
+ * No more data: return EOF from now on.
+ */
+ if (fr_bio_buf_used(&my->read_buffer) == 0) {
+ my->bio.read = fr_bio_eof_read;
+ return fr_bio_error(EOF);
+ }
+
+ /*
+ * Return whatever data we have available. Once the buffer is empty, the next read will get EOF.
+ */
+ return fr_bio_buf_read(&my->read_buffer, buffer, size);
+}
+
+/** Read from a memory BIO
+ *
+ * This bio reads as much data as possible into the memory buffer, on the theory that a few memcpy() or
+ * memmove() calls are much cheaper than a system call.
+ *
+ * If the read buffer has enough data to satisfy the read, then it is returned.
+ *
+ * Otherwise the next bio is called to re-fill the buffer. The next read call will try to get as much data
+ * as possible into the buffer, even if that results in reading more than "size" bytes.
+ *
+ * Once the next read has been done, then the data from the buffer is returned, even if it is less than
+ * "size".
+ */
+static ssize_t fr_bio_mem_read(fr_bio_t *bio, void *packet_ctx, void *buffer, size_t size)
+{
+ ssize_t rcode;
+ size_t used, room;
+ uint8_t *p;
+ fr_bio_mem_t *my = talloc_get_type_abort(bio, fr_bio_mem_t);
+ fr_bio_t *next;
+
+ /*
+ * We can satisfy the read from the memory buffer: do so.
+ */
+ used = fr_bio_buf_used(&my->read_buffer);
+ if (size <= used) {
+ return fr_bio_buf_read(&my->read_buffer, buffer, size);
+ }
+
+ /*
+ * There must be a next bio.
+ */
+ next = fr_bio_next(&my->bio);
+ fr_assert(next != NULL);
+
+ /*
+ * If there's no room to store more data in the buffer, just return whatever data we have in the
+ * buffer.
+ */
+ room = fr_bio_buf_write_room(&my->read_buffer);
+ if (!room) return fr_bio_buf_read(&my->read_buffer, buffer, size);
+
+ /*
+ * We try to fill the buffer as much as possible from the network, even if that means reading
+ * more than "size" amount of data.
+ */
+ p = fr_bio_buf_write_reserve(&my->read_buffer, room);
+ fr_assert(p != NULL); /* otherwise room would be zero */
+
+ rcode = next->read(next, packet_ctx, p, room);
+
+ /*
+ * Ensure that whatever data we have read is marked as "used" in the buffer, and then return
+ * whatever data is available back to the caller.
+ */
+ if (rcode >= 0) {
+ if (rcode > 0) (void) fr_bio_buf_write_alloc(&my->read_buffer, (size_t) rcode);
+
+ return fr_bio_buf_read(&my->read_buffer, buffer, size);
+ }
+
+ /*
+ * The next bio returned an error. Whatever it is, it's fatal. We can read from the memory
+ * buffer until it's empty, but we can no longer write to the memory buffer. Any data written to
+ * the buffer is lost.
+ */
+ bio->read = fr_bio_mem_read_eof;
+ bio->write = fr_bio_null_write;
+ return rcode;
+}
+
+/** Return data only if we have a complete packet.
+ *
+ */
+static ssize_t fr_bio_mem_read_packet(fr_bio_t *bio, void *packet_ctx, void *buffer, size_t size)
+{
+ ssize_t rcode;
+ size_t used, room, want;
+ uint8_t *p;
+ fr_bio_mem_t *my = talloc_get_type_abort(bio, fr_bio_mem_t);
+ fr_bio_t *next;
+
+ /*
+ * We may be able to satisfy the read from the memory buffer.
+ */
+ used = fr_bio_buf_used(&my->read_buffer);
+ if (used) {
+ /*
+ * See if there are valid packets in the buffer.
+ */
+ rcode = fr_bio_mem_verify_packet(bio, packet_ctx, &want);
+ if (rcode < 0) {
+ rcode = fr_bio_error(VERIFY);
+ goto fail;
+ }
+
+ /*
+ * There's at least one valid packet, return it.
+ */
+ if (rcode == 1) {
+ /*
+ * This isn't a fatal error. The caller should check how much room is needed by calling
+ * fr_bio_mem_verify_packet(), and retry.
+ *
+ * But in general, the caller should make sure that the output buffer has enough
+ * room for at least one packet. The verify() function should also ensure that
+ * the packet is no larger than our application maximum, even if the protocol
+ * allows for it to be larger.
+ */
+ if (want > size) return fr_bio_error(BUFFER_TOO_SMALL);
+
+ return fr_bio_buf_read(&my->read_buffer, buffer, want);
+ }
+
+ /*
+ * Else we need to read more data to have a complete packet.
+ */
+ }
+
+ /*
+ * There must be a next bio.
+ */
+ next = fr_bio_next(&my->bio);
+ fr_assert(next != NULL);
+
+ /*
+ * If there's no room to store more data in the buffer, try to make some room.
+ */
+ room = fr_bio_buf_write_room(&my->read_buffer);
+ if (!room) {
+ room = fr_bio_buf_make_room(&my->read_buffer);
+
+ /*
+ * We've tried to make room and failed. Which means that the buffer is full, AND there
+ * still isn't a complete packet in the buffer. This is therefore a fatal error. The
+ * application has not supplied us with enough read_buffer space to store a complete
+ * packet.
+ */
+ if (!room) {
+ rcode = fr_bio_error(BUFFER_FULL);
+ goto fail;
+ }
+ }
+
+ /*
+ * We try to fill the buffer as much as possible from the network. The theory is that a few
+ * extra memcpy() or memmove()s are cheaper than a system call for reading each packet.
+ */
+ p = fr_bio_buf_write_reserve(&my->read_buffer, room);
+ fr_assert(p != NULL); /* otherwise room would be zero */
+
+ rcode = next->read(next, packet_ctx, p, room);
+
+ /*
+ * The next bio returned some data. See if it's a valid packet.
+ */
+ if (rcode > 0) {
+ (void) fr_bio_buf_write_alloc(&my->read_buffer, (size_t) rcode);
+
+ /*
+ * See if there are valid packets in the buffer. The verify function sets "want".
+ */
+ rcode = fr_bio_mem_verify_packet(bio, packet_ctx, &want);
+ if (rcode < 0) {
+ rcode = fr_bio_error(VERIFY);
+ goto fail;
+ }
+
+ /*
+ * There's at least one valid packet, return it. As above, the packet has to fit into
+ * the caller's buffer.
+ */
+ if (rcode == 1) {
+ if (want > size) return fr_bio_error(BUFFER_TOO_SMALL);
+
+ return fr_bio_buf_read(&my->read_buffer, buffer, want);
+ }
+
+ /*
+ * No valid packets. The next call to read will call verify again, which will return a
+ * partial packet. And then it will try to fill the buffer from the next bio.
+ */
+ return 0;
+ }
+
+ /*
+ * No data was read from the next bio, we still don't have a packet. Return nothing.
+ */
+ if (rcode == 0) return 0;
+
+ /*
+ * The next bio returned an error. Whatever it is, it's fatal. We can read from the memory
+ * buffer until it's empty, but we can no longer write to the memory buffer. Any data written to
+ * the buffer is lost.
+ */
+fail:
+ bio->read = fr_bio_mem_read_eof;
+ bio->write = fr_bio_null_write;
+ return rcode;
+}
+
+/** Pass writes to the next BIO
+ *
+ * For speed, we try to bypass the memory buffer and write directly to the next bio. However, if the next
+ * bio returns EWOULDBLOCK, we write the data to the memory buffer, even if it is partial data.
+ */
+static ssize_t fr_bio_mem_write_next(fr_bio_t *bio, void *packet_ctx, void const *buffer, size_t size)
+{
+ ssize_t rcode;
+ size_t room, leftover;
+ fr_bio_mem_t *my = talloc_get_type_abort(bio, fr_bio_mem_t);
+ fr_bio_t *next;
+
+ /*
+ * We can't call the next bio if there's still cached data to flush.
+ *
+ * There must be a next bio.
+ */
+ fr_assert(fr_bio_buf_used(&my->write_buffer) == 0);
+
+ next = fr_bio_next(&my->bio);
+ fr_assert(next != NULL);
+
+ /*
+ * The next bio may write all of the data. If so, we return that.
+ */
+ rcode = next->write(next, packet_ctx, buffer, size);
+ if ((size_t) rcode == size) return rcode;
+
+ /*
+ * The next bio returned an error. Anything other than WOULD BLOCK is fatal. We can read from
+ * the memory buffer until it's empty, but we can no longer write to the memory buffer.
+ */
+ if ((rcode < 0) && (rcode != fr_bio_error(IO_WOULD_BLOCK))) {
+ bio->read = fr_bio_mem_read_eof;
+ bio->write = fr_bio_null_write;
+ return rcode;
+ }
+
+ /*
+ * We were flushing the buffer, return however much data we managed to write.
+ *
+ * Note that flushes can never block.
+ */
+ if (!buffer) {
+ fr_assert(rcode != fr_bio_error(IO_WOULD_BLOCK));
+ return rcode;
+ }
+
+ /*
+ * We had WOULD BLOCK, or wrote partial bytes. Save the data to the memory buffer, and ensure
+ * that future writes are ordered. i.e. they write to the memory buffer before writing to the
+ * next bio.
+ */
+ bio->write = fr_bio_mem_write_buffer;
+
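+ /*
+ * On WOULD BLOCK, nothing was written to the next bio, so all of the data is left over.
+ */
+ if (rcode < 0) rcode = 0;
+
+ /*
+ * Clamp the write to however much room is available in the buffer.
+ */
+ leftover = size - rcode;
+ room = fr_bio_buf_write_room(&my->write_buffer);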
+
+ /*
+ * If we have "used == 0" above, then we must also have "room > 0".
+ */
+ fr_assert(room > 0);
+
+ if (room < leftover) leftover = room;
+
+ /*
+ * Since we've clamped the write, this call can never fail.
+ */
+ (void) fr_bio_buf_write(&my->write_buffer, ((uint8_t const *) buffer) + rcode, leftover);
+
+ /*
+ * Some of the data has been written to the next bio, and some to our cache. The caller has to
+ * ensure that the first subsequent write will send over the rest of the data.
+ */
+ return rcode + leftover;
+}
+
+/** Flush the memory buffer.
+ *
+ */
+static ssize_t fr_bio_mem_write_flush(fr_bio_mem_t *my, size_t size)
+{
+ ssize_t rcode;
+ size_t used;
+ fr_bio_t *next;
+
+ /*
+ * Nothing to flush, don't do any writes.
+ *
+ * Instead, set the write function to write next, where data will be sent directly to the next
+ * bio, and will bypass the write buffer.
+ */
+ used = fr_bio_buf_used(&my->write_buffer);
+ if (!used) {
+ my->bio.write = fr_bio_mem_write_next;
+ return 0;
+ }
+
+ next = fr_bio_next(&my->bio);
+ fr_assert(next != NULL);
+
+ /*
+ * Clamp the amount of data written. If the caller wants to write everything, it should
+ * pass SIZE_MAX.
+ */
+ if (size < used) used = size;
+
+ /*
+ * Flush the buffer to the next bio in line. That function will write as much data as possible,
+ * but may return a partial write.
+ */
+ rcode = next->write(next, NULL, my->write_buffer.read, used);
+
+ /*
+ * The next bio returned an error. Anything other than WOULD BLOCK is fatal. We can read from
+ * the memory buffer until it's empty, but we can no longer write to the memory buffer.
+ */
+ if ((rcode < 0) && (rcode != fr_bio_error(IO_WOULD_BLOCK))) {
+ my->bio.read = fr_bio_mem_read_eof;
+ my->bio.write = fr_bio_null_write;
+ return rcode;
+ }
+
+ /*
+ * We didn't write anything, return that.
+ */
+ if ((rcode == 0) || (rcode == fr_bio_error(IO_WOULD_BLOCK))) return rcode;
+
+ /*
+ * Tell the buffer that we've read a certain amount of data from it.
+ */
+ (void) fr_bio_buf_read(&my->write_buffer, NULL, (size_t) rcode);
+
+ /*
+ * We haven't emptied the buffer, return the partial write.
+ */
+ if ((size_t) rcode < used) return rcode;
+
+ /*
+ * We've flushed all of the buffer. Revert back to "pass through" writing.
+ */
+ fr_assert(fr_bio_buf_used(&my->write_buffer) == 0);
+ my->bio.write = fr_bio_mem_write_next;
+ return rcode;
+}
+
+/** Write to the memory buffer.
+ *
+ * The special buffer pointer of NULL means flush(). On flush, we call next->write(), and if that succeeds,
+ * go back to "pass through" mode for the buffers.
+ */
+static ssize_t fr_bio_mem_write_buffer(fr_bio_t *bio, UNUSED void *packet_ctx, void const *buffer, size_t size)
+{
+ size_t room;
+ fr_bio_mem_t *my = talloc_get_type_abort(bio, fr_bio_mem_t);
+
+ /*
+ * Flush the output buffer.
+ */
+ if (unlikely(!buffer)) return fr_bio_mem_write_flush(my, size);
+
+ /*
+ * Clamp the write to however much room is available in the buffer.
+ */
+ room = fr_bio_buf_write_room(&my->write_buffer);
+
+ /*
+ * The buffer is full. We're now blocked.
+ */
+ if (!room) return fr_bio_error(IO_WOULD_BLOCK);
+
+ if (room < size) size = room;
+
+ /*
+ * As we have clamped the write, we know that this call must succeed.
+ */
+ return fr_bio_buf_write(&my->write_buffer, buffer, size);
+}
+
+/** Peek at the data in the read buffer
+ *
+ * Peeking at the data allows us to avoid many memory copies.
+ */
+uint8_t const *fr_bio_mem_read_peek(fr_bio_t *bio, size_t *size)
+{
+ size_t used;
+ fr_bio_mem_t *my = talloc_get_type_abort(bio, fr_bio_mem_t);
+
+ used = fr_bio_buf_used(&my->read_buffer);
+
+ if (!used) return NULL;
+
+ *size = used;
+ return my->read_buffer.read;
+}
+
+/** Discard data from the read buffer.
+ *
+ * Discarding allows the caller to silently omit packets, so that
+ * they are not passed up to previous bios.
+ */
+void fr_bio_mem_read_discard(fr_bio_t *bio, size_t size)
+{
+ fr_bio_mem_t *my = talloc_get_type_abort(bio, fr_bio_mem_t);
+
+ (void) fr_bio_buf_read(&my->read_buffer, NULL, size);
+}
+
+/** Verify that a packet is OK.
+ *
+ * @todo - have this as a parameter to the read routines, so that they only return complete packets?
+ *
+ * @param bio the #fr_bio_mem_t
+ * @param packet_ctx the packet ctx
+ * @param[out] size how big the verified packet is
+ * @return
+ * - <0 on error, the caller should close the bio.
+ * - 0 for "we have a partial packet", the size to read is in *size
+ * - 1 for "we have at least one good packet", the size of it is in *size
+ */
+static int fr_bio_mem_verify_packet(fr_bio_t *bio, void *packet_ctx, size_t *size)
+{
+ uint8_t *packet, *end;
+ fr_bio_mem_t *my = talloc_get_type_abort(bio, fr_bio_mem_t);
+
+ packet = my->read_buffer.read;
+ end = my->read_buffer.write;
+
+ while (packet < end) {
+ size_t want;
+#ifndef NDEBUG
+ size_t used;
+
+ used = end - packet;
+#endif
+
+ want = end - packet;
+
+ switch (my->verify((fr_bio_t *) my, packet_ctx, packet, &want)) {
+ /*
+ * The data in the buffer is exactly a packet. Return that.
+ *
+ * @todo - if there are multiple packets, return the total size of packets?
+ */
+ case FR_BIO_VERIFY_OK:
+ fr_assert(want <= used);
+ *size = want;
+ return 1;
+
+ /*
+ * The packet needs more data. Return how much data we need for one packet.
+ */
+ case FR_BIO_VERIFY_WANT_MORE:
+ fr_assert(want > used);
+ *size = want;
+ return 0;
+
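+ case FR_BIO_VERIFY_DISCARD:
+ /*
+ * We don't call fr_bio_buf_read(), because that will move the memory around, and
+ * we want to avoid that if at all possible.
+ */
+ fr_assert(want <= used);
+ fr_assert(packet == my->read_buffer.read);
+ my->read_buffer.read += want;
+ packet = my->read_buffer.read; /* continue verification after the discarded data */
+ continue;
+
+ /*
+ * Some kind of fatal validation error. Note that "break" would only leave the switch,
+ * and the loop would then call verify() again on the same data.
+ */
+ case FR_BIO_VERIFY_ERROR_CLOSE:
+ return -1;
+ }
+ }
+
+ /*
+ * All of the data was discarded. That isn't an error, we just need to read more data.
+ */
+ *size = 0;
+ return 0;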
+}
+
+/** Allocate a memory buffer bio for either reading or writing.
+ */
+static bool fr_bio_mem_buf_alloc(fr_bio_mem_t *my, fr_bio_buf_t *buf, size_t size)
+{
+ uint8_t *data;
+
+ if (size < 1024) size = 1024;
+ if (size > (1 << 20)) size = 1 << 20;
+
+ data = talloc_array(my, uint8_t, size);
+ if (!data) {
+ talloc_free(my);
+ return false;
+ }
+
+ fr_bio_buf_init(buf, data, size);
+ return true;
+}
+
+/** Allocate a memory buffer bio
+ *
+ * The "read buffer" will cache reads from the next bio in the chain. If the next bio returns more data than
+ * the caller asked for, the extra data is cached in the read buffer.
+ *
+ * The "write buffer" will buffer writes to the next bio in the chain. If the caller writes more data than
+ * the next bio can process, the extra data is cached in the write buffer.
+ *
+ * When the bio is closed (or freed) any pending data in the buffers is lost. The same happens if the next
+ * bio returns a fatal error.
+ *
+ * At some point during a read, the next bio may return EOF. When that happens, the caller should not rely
+ * on the next FD being readable or writable. Instead, it should keep reading from the memory bio until it
+ * returns EOF. See fr_bio_fd_eof() for details.
+ *
+ * @param ctx the talloc ctx
+ * @param read_size size of the read buffer. Must be non-zero; values are clamped to 1024..2^20
+ * @param write_size size of the write buffer. Must be non-zero; values are clamped to 1024..2^20
+ * @param next the next bio which will perform the underlying reads and writes.
+ * @return
+ * - NULL on error, a buffer size was zero or memory allocation failed
+ * - !NULL the bio
+ */
+fr_bio_t *fr_bio_mem_alloc(TALLOC_CTX *ctx, size_t read_size, size_t write_size, fr_bio_t *next)
+{
+ fr_bio_mem_t *my;
+
+ /*
+ * The caller has to state that the API is caching data both ways. Check this before
+ * allocating anything, so that we don't leak memory on error.
+ */
+ if (!read_size || !write_size) return NULL;
+
+ my = talloc_zero(ctx, fr_bio_mem_t);
+ if (!my) return NULL;
+
+ if (!fr_bio_mem_buf_alloc(my, &my->read_buffer, read_size)) return NULL;
+ if (!fr_bio_mem_buf_alloc(my, &my->write_buffer, write_size)) return NULL;
+
+ my->bio.read = fr_bio_mem_read;
+ my->bio.write = fr_bio_mem_write_next;
+
+ fr_bio_chain(&my->bio, next);
+
+ talloc_set_destructor((fr_bio_t *) my, fr_bio_destructor);
+ return (fr_bio_t *) my;
+}
+
+/** Only return verified packets.
+ *
+ * Like fr_bio_mem_alloc(), but only returns packets.
+ *
+ * Writes pass straight through to the next bio.
+ */
+fr_bio_t *fr_bio_mem_packet_alloc(TALLOC_CTX *ctx, size_t read_size, fr_bio_t *next,
+ fr_bio_verify_t verify, void *uctx)
+{
+ fr_bio_mem_t *my;
+
+ my = (fr_bio_mem_t *) fr_bio_mem_sink_alloc(ctx, read_size);
+ if (!my) return NULL;
+
+ my->verify = verify;
+ my->bio.read = fr_bio_mem_read_packet;
+ my->bio.write = fr_bio_next_write;
+
+ fr_bio_chain(&my->bio, next);
+
+ return (fr_bio_t *) my;
+}
+
+/** Allocate a memory buffer which sources data from the caller's application into the bio system.
+ *
+ * The caller writes data to the buffer, but never reads from it. This bio will call the "next" bio to sink
+ * the data.
+ */
+fr_bio_t *fr_bio_mem_source_alloc(TALLOC_CTX *ctx, size_t write_size, fr_bio_t *next)
+{
+ fr_bio_mem_t *my;
+
+ /*
+ * The caller has to state that the API is caching data. Check this before allocating
+ * anything, so that we don't leak memory on error.
+ */
+ if (!write_size) return NULL;
+
+ my = talloc_zero(ctx, fr_bio_mem_t);
+ if (!my) return NULL;
+
+ if (!fr_bio_mem_buf_alloc(my, &my->write_buffer, write_size)) return NULL;
+
+ my->bio.read = fr_bio_null_read; /* reading FROM this bio is not possible */
+ my->bio.write = fr_bio_mem_write_next;
+
+ fr_bio_chain(&my->bio, next);
+
+ talloc_set_destructor((fr_bio_t *) my, fr_bio_destructor);
+ return (fr_bio_t *) my;
+}
+
+/** Read from a buffer which a previous bio has filled.
+ *
+ * This function is called by the application which wants to read from a sink.
+ */
+static ssize_t fr_bio_mem_read_buffer(fr_bio_t *bio, UNUSED void *packet_ctx, void *buffer, size_t size)
+{
+ fr_bio_mem_t *my = talloc_get_type_abort(bio, fr_bio_mem_t);
+
+ return fr_bio_buf_read(&my->read_buffer, buffer, size);
+}
+
+/** Write to the read buffer.
+ *
+ * This function is called by an upstream function which writes into our local buffer.
+ */
+static ssize_t fr_bio_mem_write_read_buffer(fr_bio_t *bio, UNUSED void *packet_ctx, void const *buffer, size_t size)
+{
+ size_t room;
+ fr_bio_mem_t *my = talloc_get_type_abort(bio, fr_bio_mem_t);
+
+ /*
+ * Clamp the write to however much room is available in the buffer.
+ */
+ room = fr_bio_buf_write_room(&my->read_buffer);
+
+ /*
+ * The buffer is full. We're now blocked.
+ */
+ if (!room) return fr_bio_error(IO_WOULD_BLOCK);
+
+ if (room < size) size = room;
+
+ /*
+ * As we have clamped the write, we know that this call must succeed.
+ */
+ return fr_bio_buf_write(&my->read_buffer, buffer, size);
+}
+
+/** Allocate a memory buffer which sinks data from a bio system into the caller's application.
+ *
+ * The caller reads data from this bio, but never writes to it. Upstream bios will source the data.
+ */
+fr_bio_t *fr_bio_mem_sink_alloc(TALLOC_CTX *ctx, size_t read_size)
+{
+ fr_bio_mem_t *my;
+
+ /*
+ * The caller has to state that the API is caching data. Check this before allocating
+ * anything, so that we don't leak memory on error.
+ */
+ if (!read_size) return NULL;
+
+ my = talloc_zero(ctx, fr_bio_mem_t);
+ if (!my) return NULL;
+
+ if (!fr_bio_mem_buf_alloc(my, &my->read_buffer, read_size)) return NULL;
+ my->bio.read = fr_bio_mem_read_buffer;
+ my->bio.write = fr_bio_mem_write_read_buffer; /* the upstream will write to our read buffer */
+
+ talloc_set_destructor((fr_bio_t *) my, fr_bio_destructor);
+ return (fr_bio_t *) my;
+}
--- /dev/null
+#pragma once
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
+ */
+
+/**
+ * $Id$
+ * @file lib/bio/mem.h
+ * @brief Binary IO abstractions for memory buffers
+ *
+ * Allow reads and writes from memory buffers
+ *
+ * @copyright 2024 Network RADIUS SAS (legal@networkradius.com)
+ */
+RCSIDH(lib_bio_mem_h, "$Id$")
+
+/** Status returned by the verification callback.
+ *
+ */
+typedef enum {
+ FR_BIO_VERIFY_OK = 0, //!< packet is OK
+ FR_BIO_VERIFY_DISCARD, //!< the packet should be discarded
+ FR_BIO_VERIFY_WANT_MORE, //!< not enough data for one packet
+ FR_BIO_VERIFY_ERROR_CLOSE, //!< fatal error, the bio should be closed.
+} fr_bio_verify_action_t;
+
+/** Verifies the packet
+ *
+ * If the packet is a dup, then this function can return DISCARD, or
+ * update the packet_ctx to say "dup", and then return OK.
+ *
+ * @param bio the bio to read
+ * @param packet_ctx as passed in to fr_bio_read()
+ * @param buffer pointer to the raw data
+ * @param[in,out] size in: size of data in the buffer. out: size of the packet to return, or data to discard.
+ * @return action to take
+ */
+typedef fr_bio_verify_action_t (*fr_bio_verify_t)(fr_bio_t *bio, void *packet_ctx, const void *buffer, size_t *size);
+
+fr_bio_t *fr_bio_mem_alloc(TALLOC_CTX *ctx, size_t read_size, size_t write_size, fr_bio_t *next) CC_HINT(nonnull);
+
+fr_bio_t *fr_bio_mem_packet_alloc(TALLOC_CTX *ctx, size_t read_size, fr_bio_t *next,
+ fr_bio_verify_t verify, void *uctx) CC_HINT(nonnull(1,3,4));
+
+fr_bio_t *fr_bio_mem_source_alloc(TALLOC_CTX *ctx, size_t buffer_size, fr_bio_t *next) CC_HINT(nonnull);
+
+fr_bio_t *fr_bio_mem_sink_alloc(TALLOC_CTX *ctx, size_t buffer_size) CC_HINT(nonnull);
+
+uint8_t const *fr_bio_mem_read_peek(fr_bio_t *bio, size_t *size) CC_HINT(nonnull);
+
+void fr_bio_mem_read_discard(fr_bio_t *bio, size_t size) CC_HINT(nonnull);
--- /dev/null
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or (at
+ * your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
+ */
+
+/**
+ * $Id$
+ * @file lib/bio/network.c
+ * @brief BIO patricia trie filtering handlers
+ *
+ * @copyright 2024 Network RADIUS SAS (legal@networkradius.com)
+ */
+
+#include <freeradius-devel/util/value.h>
+#include <freeradius-devel/util/trie.h>
+
+#include <freeradius-devel/bio/bio_priv.h>
+#include <freeradius-devel/bio/fd_priv.h>
+
+#include <freeradius-devel/bio/network.h>
+
+/** The network filtering bio
+ */
+typedef struct {
+ FR_BIO_COMMON;
+
+ fr_bio_read_t discard; //!< callback to run when discarding a packet due to filtering
+
+ size_t offset; //!< where #fr_bio_fd_packet_ctx_t is stored
+
+ fr_trie_t const *trie; //!< patricia trie for filtering
+} fr_bio_network_t;
+
+/** Read a UDP packet, and only return packets from allowed sources.
+ *
+ */
+static ssize_t fr_bio_network_read(fr_bio_t *bio, void *packet_ctx, void *buffer, size_t size)
+{
+ ssize_t rcode;
+ bool *value;
+ fr_bio_network_t *my = talloc_get_type_abort(bio, fr_bio_network_t);
+ fr_bio_fd_packet_ctx_t *addr;
+ fr_bio_t *next;
+
+ next = fr_bio_next(&my->bio);
+ fr_assert(next != NULL);
+
+ rcode = next->read(next, packet_ctx, buffer, size);
+ if (rcode <= 0) return rcode;
+
+ if (!packet_ctx) return rcode;
+
+ addr = fr_bio_fd_packet_ctx(my, packet_ctx);
+
+ /*
+ * Look up this particular source. If it's not found, then we suppress this packet.
+ */
+ value = fr_trie_lookup_by_key(my->trie,
+ &addr->socket.inet.src_ipaddr.addr, addr->socket.inet.src_ipaddr.prefix);
+ if (value != FR_BIO_NETWORK_ALLOW) {
+ if (my->discard) return my->discard(bio, packet_ctx, buffer, rcode);
+ return 0;
+ }
+
+ return rcode;
+}
+
+
+/** Allocate a bio for filtering IP addresses
+ *
+ * This is used for unconnected UDP bios, where we filter packets based on source IP address.
+ *
+ * It is also used for accept bios, where we filter new connections based on source IP address. The caller
+ * should chain this bio to the next FD bio, and then fr_bio_read() from the top-level bio. The result will
+ * be filtered or "clean" FDs.
+ *
+ * A patricia trie (but not the bio) could also be used in an haproxy "activate" callback, where the callback
+ * gets the haproxy socket info, and then checks if the source is allowed. However, that patricia trie is a
+ * property of the main "accept" bio, and should be managed by the activate() callback for the haproxy bio.
+ */
+fr_bio_t *fr_bio_network_alloc(TALLOC_CTX *ctx, fr_ipaddr_t const *allow, fr_ipaddr_t const *deny,
+ fr_bio_read_t discard, fr_bio_t *next)
+{
+ fr_bio_network_t *my;
+ fr_bio_t *fd;
+ fr_bio_fd_info_t const *info;
+
+ /*
+ * We are only useable for FD bios. We need to get "offset" into the packet_ctx, and we don't
+ * want to have an API which allows for two different "offset" values to be passed to two
+ * different bios.
+ */
+ fd = next;
+
+ /*
+ * @todo - add an internal "type" to the bio?
+ */
+ while (fd && (strcmp(talloc_get_name(fd), "fr_bio_fd_t") != 0)) {
+ fd = fr_bio_next(fd);
+ }
+
+ if (!fd) return NULL;
+
+ info = fr_bio_fd_info(fd);
+ fr_assert(info != NULL);
+
+ /*
+ * We can only filter connections for IP address families.
+ *
+ * Unix domain sockets have to use a different method for filtering input connections.
+ */
+ if (!((info->socket.af == AF_INET) || (info->socket.af == AF_INET6))) return NULL;
+
+ /*
+ * We can only be used for accept() sockets, or unconnected UDP sockets.
+ */
+ switch (info->type) {
+ case FR_BIO_FD_UNCONNECTED:
+ break;
+
+ case FR_BIO_FD_CONNECTED:
+ return NULL;
+
+ case FR_BIO_FD_ACCEPT:
+ break;
+ }
+
+ my = talloc_zero(ctx, fr_bio_network_t);
+ if (!my) return NULL;
+
+ my->offset = ((fr_bio_fd_t *) fd)->offset;
+ my->discard = discard;
+
+ my->bio.write = fr_bio_next_write;
+ my->bio.read = fr_bio_network_read;
+
+ my->trie = fr_bio_network_trie_alloc(my, info->socket.af, allow, deny);
+ if (!my->trie) {
+ talloc_free(my);
+ return NULL;
+ }
+
+ fr_bio_chain(&my->bio, next);
+
+ return (fr_bio_t *) my;
+}
+
+/** Create a patricia trie for doing network filtering.
+ *
+ */
+fr_trie_t *fr_bio_network_trie_alloc(TALLOC_CTX *ctx, int af, fr_ipaddr_t const *allow, fr_ipaddr_t const *deny)
+{
+ size_t i, num;
+ fr_trie_t *trie;
+
+ trie = fr_trie_alloc(ctx, NULL, NULL);
+ if (!trie) return NULL;
+
+ num = talloc_array_length(allow);
+ fr_assert(num > 0);
+
+ for (i = 0; i < num; i++) {
+ bool *value;
+
+ /*
+ * Can't add v4 networks to a v6 socket, or vice versa.
+ */
+ if (allow[i].af != af) {
+ fr_strerror_printf("Address family in entry %zu - 'allow = %pV' "
+ "does not match 'ipaddr'", i + 1, fr_box_ipaddr(allow[i]));
+ fail:
+ talloc_free(trie);
+ return NULL;
+ }
+
+ /*
+ * Duplicates are bad.
+ */
+ value = fr_trie_match_by_key(trie, &allow[i].addr, allow[i].prefix);
+ if (value) {
+ fr_strerror_printf("Cannot add duplicate entry 'allow = %pV'",
+ fr_box_ipaddr(allow[i]));
+ goto fail;
+ }
+
+#if 0
+ /*
+ * Look for overlapping entries. i.e. the networks MUST be disjoint.
+ *
+ * Note that this catches 192.168.1/24 followed by 192.168/16, but NOT the other way
+ * around. The best fix is likely to add a flag to fr_trie_alloc() saying "we can only
+ * have terminal fr_trie_user_t nodes"
+ */
+ value = fr_trie_lookup_by_key(trie, &allow[i].addr, allow[i].prefix);
+ if (network && (network->prefix <= allow[i].prefix)) {
+ fr_strerror_printf("Cannot add overlapping entry 'allow = %pV'", fr_box_ipaddr(allow[i]));
+ fr_strerror_const("Entry is completely enclosed inside of a previously defined network.");
+ goto fail;
+ }
+#endif
+
+ /*
+ * Insert the network into the trie. Lookups will return a bool ptr of allow / deny.
+ */
+ if (fr_trie_insert_by_key(trie, &allow[i].addr, allow[i].prefix, FR_BIO_NETWORK_ALLOW) < 0) {
+ fr_strerror_printf("Failed adding 'allow = %pV' to filtering rules", fr_box_ipaddr(allow[i]));
+ goto fail;
+ }
+ }
+
+ /*
+ * And now check denied networks.
+ */
+ num = talloc_array_length(deny);
+ if (!num) return trie;
+
+ /*
+ * Since the default is to deny, you can only add a "deny" inside of a previous "allow".
+ */
+ for (i = 0; i < num; i++) {
+ bool *value;
+
+ /*
+ * Can't add v4 networks to a v6 socket, or vice versa.
+ */
+ if (deny[i].af != af) {
+ fr_strerror_printf("Address family in entry %zu - 'deny = %pV' "
+ "does not match 'ipaddr'", i + 1, fr_box_ipaddr(deny[i]));
+ goto fail;
+ }
+
+ /*
+ * Exact duplicates are forbidden.
+ */
+ value = fr_trie_match_by_key(trie, &deny[i].addr, deny[i].prefix);
+ if (value) {
+ fr_strerror_printf("Cannot add duplicate entry 'deny = %pV'", fr_box_ipaddr(deny[i]));
+ goto fail;
+ }
+
+ /*
+ * A "deny" can only be within a previous "allow".
+ */
+ value = fr_trie_lookup_by_key(trie, &deny[i].addr, deny[i].prefix);
+ if (!value) {
+ fr_strerror_printf("The network in entry %zu - 'deny = %pV' is not "
+ "contained within a previous 'allow'", i + 1, fr_box_ipaddr(deny[i]));
+ goto fail;
+ }
+
+ /*
+ * A "deny" cannot be within a previous "deny".
+ */
+ if (value == FR_BIO_NETWORK_DENY) {
+ fr_strerror_printf("The network in entry %zu - 'deny = %pV' overlaps "
+ "with another 'deny' rule", i + 1, fr_box_ipaddr(deny[i]));
+ goto fail;
+ }
+
+ /*
+ * Insert the rule into the trie.
+ */
+ if (fr_trie_insert_by_key(trie, &deny[i].addr, deny[i].prefix, FR_BIO_NETWORK_DENY) < 0) {
+ fr_strerror_printf("Failed adding 'deny = %pV' to filtering rules", fr_box_ipaddr(deny[i]));
+ goto fail;
+ }
+ }
+
+ return trie;
+}
--- /dev/null
+#pragma once
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
+ */
+
+/**
+ * $Id$
+ * @file lib/bio/network.h
+ * @brief BIO patricia trie filtering handlers
+ *
+ * @copyright 2024 Network RADIUS SAS (legal@networkradius.com)
+ */
+RCSIDH(lib_bio_network_h, "$Id$")
+
+#include <freeradius-devel/util/inet.h>
+
+fr_bio_t *fr_bio_network_alloc(TALLOC_CTX *ctx, fr_ipaddr_t const *allow, fr_ipaddr_t const *deny,
+ fr_bio_read_t discard, fr_bio_t *next) CC_HINT(nonnull(1,2,5));
+
+fr_trie_t *fr_bio_network_trie_alloc(TALLOC_CTX *ctx, int af, fr_ipaddr_t const *allow, fr_ipaddr_t const *deny);
+
+/*
+ * IP address lookups return one of these two magic pointers.
+ *
+ * NULL means "nothing matches", which should also be interpreted as "deny".
+ *
+ * The difference between "NULL" and "deny" is that NULL is an IP address which was never inserted into
+ * the trie. Whereas "deny" means that there is a parent "allow" range, and we are carving out a "deny"
+ * in the middle of that range.
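+ *
+ * A minimal sketch of how a caller might interpret a lookup result. This is an illustration only:
+ * "src" is a hypothetical fr_ipaddr_t holding the source address of a packet, and the lookup call
+ * mirrors the way fr_bio_network_trie_alloc() uses it.
+ *
+ *	void *value = fr_trie_lookup_by_key(trie, &src.addr, src.prefix);
+ *	bool allowed = (value == FR_BIO_NETWORK_ALLOW);	// NULL and FR_BIO_NETWORK_DENY both mean "deny"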
+ */
+#define FR_BIO_NETWORK_ALLOW ((void *) (-1))
+#define FR_BIO_NETWORK_DENY ((void *) (-2))
--- /dev/null
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or (at
+ * your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
+ */
+
+/**
+ * $Id$
+ * @file lib/bio/null.c
+ * @brief BIO NULL handlers
+ *
+ * @copyright 2024 Network RADIUS SAS (legal@networkradius.com)
+ */
+
+#include <freeradius-devel/bio/bio_priv.h>
+#include <freeradius-devel/bio/null.h>
+
+/** Always return 0 on read.
+ *
+ */
+ssize_t fr_bio_null_read(UNUSED fr_bio_t *bio, UNUSED void *packet_ctx, UNUSED void *buffer, UNUSED size_t size)
+{
+ return 0;
+}
+
+/** Always return 0 on write.
+ *
+ */
+ssize_t fr_bio_null_write(UNUSED fr_bio_t *bio, UNUSED void *packet_ctx, UNUSED void const *buffer, UNUSED size_t size)
+{
+ return 0;
+}
--- /dev/null
+#pragma once
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
+ */
+
+/**
+ * $Id$
+ * @file lib/bio/null.h
+ * @brief BIO null handlers.
+ *
+ * @copyright 2024 Network RADIUS SAS (legal@networkradius.com)
+ */
+RCSIDH(lib_bio_null_h, "$Id$")
+
+ssize_t fr_bio_null_read(fr_bio_t *bio, void *packet_ctx, void *buffer, size_t size);
+ssize_t fr_bio_null_write(fr_bio_t *bio, void *packet_ctx, void const *buffer, size_t size);
--- /dev/null
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or (at
+ * your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
+ */
+
+/**
+ * $Id$
+ * @file lib/bio/packet.c
+ * @brief Binary IO abstractions for packets in buffers
+ *
+ * @copyright 2024 Network RADIUS SAS (legal@networkradius.com)
+ */
+
+#include <freeradius-devel/bio/bio_priv.h>
+#include <freeradius-devel/bio/packet.h>
+#include <freeradius-devel/bio/null.h>
+#include <freeradius-devel/util/dlist.h>
+
+typedef struct fr_bio_packet_entry_s fr_bio_packet_entry_t;
+typedef struct fr_bio_packet_list_s fr_bio_packet_list_t;
+typedef struct fr_bio_packet_s fr_bio_packet_t;
+
+/*
+ * Define type-safe wrappers for head and entry definitions.
+ */
+FR_DLIST_TYPES(fr_bio_packet_list)
+
+/*
+ * For delayed writes.
+ *
+ * @todo - we can remove the "cancelled" field by setting packet_ctx == my?
+ */
+struct fr_bio_packet_entry_s {
+ void *packet_ctx;
+ void const *buffer;
+ size_t size;
+ size_t already_written;
+ bool cancelled;
+
+ fr_bio_packet_t *my;
+
+ FR_DLIST_ENTRY(fr_bio_packet_list) entry; //!< List entry.
+};
+
+struct fr_bio_packet_list_s {
+ FR_DLIST_HEAD(fr_bio_packet_list) saved;
+ FR_DLIST_HEAD(fr_bio_packet_list) free;
+};
+
+FR_DLIST_FUNCS(fr_bio_packet_list, fr_bio_packet_entry_t, entry)
+
+
+typedef struct fr_bio_packet_s {
+ FR_BIO_COMMON;
+
+ size_t max_saved;
+
+ fr_bio_packet_saved_t saved;
+ fr_bio_packet_callback_t sent;
+ fr_bio_packet_callback_t cancel;
+
+ FR_DLIST_HEAD(fr_bio_packet_list) pending;
+ FR_DLIST_HEAD(fr_bio_packet_list) free;
+
+ fr_bio_packet_entry_t array[];
+} fr_bio_packet_t;
+
+static ssize_t fr_bio_packet_write_buffer(fr_bio_t *bio, void *packet_ctx, void const *buffer, size_t size);
+
+/** Forcibly cancel all outstanding packets.
+ *
+ * Even partially written ones. This function is called from
+ * shutdown(), when the destructor is called, or on fatal read / write
+ * errors.
+ */
+static void fr_bio_packet_list_cancel(fr_bio_packet_t *my)
+{
+ fr_bio_packet_entry_t *item;
+
+ if (!my->cancel) return;
+
+ if (fr_bio_packet_list_num_elements(&my->pending) == 0) return;
+
+ /*
+ * Cancel any remaining saved items.
+ */
+ while ((item = fr_bio_packet_list_pop_head(&my->pending)) != NULL) {
+ my->cancel(&my->bio, item->packet_ctx, item->buffer, item->size);
+ item->cancelled = true;
+ fr_bio_packet_list_insert_head(&my->free, item);
+ }
+}
+
+static int fr_bio_packet_destructor(fr_bio_packet_t *my)
+{
+ fr_assert(my->cancel); /* otherwise it would be fr_bio_destructor */
+
+ my->bio.write = fr_bio_null_write;
+ fr_bio_packet_list_cancel(my);
+
+ return fr_bio_destructor(&my->bio);
+}
+
+/** Push a packet onto a list.
+ *
+ */
+static ssize_t fr_bio_packet_list_push(fr_bio_packet_t *my, void *packet_ctx, const void *buffer, size_t size, size_t already_written)
+{
+ fr_bio_packet_entry_t *item;
+
+ item = fr_bio_packet_list_pop_head(&my->free);
+ if (!item) return fr_bio_error(IO_WOULD_BLOCK);
+
+ /*
+ * If we're the first entry in the saved list, we can have a partially written packet.
+ *
+ * Otherwise, we're a subsequent entry, and we cannot have any data which is partially written.
+ */
+ fr_assert((fr_bio_packet_list_num_elements(&my->pending) == 0) ||
+ (already_written == 0));
+
+ item->packet_ctx = packet_ctx;
+ item->buffer = buffer;
+ item->size = size;
+ item->already_written = already_written;
+ item->cancelled = false;
+
+ fr_bio_packet_list_insert_tail(&my->pending, item);
+
+ if (my->saved) my->saved(&my->bio, packet_ctx, buffer, size, item);
+
+ return size;
+}
+
+/** Write one packet to the next bio.
+ *
+ * If it blocks, save the packet and return OK to the caller.
+ */
+static ssize_t fr_bio_packet_write_next(fr_bio_t *bio, void *packet_ctx, void const *buffer, size_t size)
+{
+ ssize_t rcode;
+ fr_bio_packet_t *my = talloc_get_type_abort(bio, fr_bio_packet_t);
+ fr_bio_t *next;
+
+ /*
+ * We can't call the next bio if there's still cached data to flush.
+ */
+ fr_assert(fr_bio_packet_list_num_elements(&my->pending) == 0);
+
+ next = fr_bio_next(&my->bio);
+ fr_assert(next != NULL);
+
+ /*
+ * Write the data out. If we write all of it, we're done.
+ */
+ rcode = next->write(next, packet_ctx, buffer, size);
+ if ((size_t) rcode == size) return rcode;
+
+ if (rcode < 0) {
+ /*
+ * A blocking error is not fatal: return it back up the chain.
+ */
+ if (rcode == fr_bio_error(IO_WOULD_BLOCK)) return rcode;
+
+ /*
+ * All other errors are fatal.
+ */
+ my->bio.read = fr_bio_eof_read;
+ my->bio.write = fr_bio_null_write;
+
+ fr_bio_packet_list_cancel(my);
+ return rcode;
+ }
+
+ /*
+ * We were flushing the next buffer, return any data which was written.
+ */
+ if (!buffer) return rcode;
+
+ /*
+ * The next bio wrote a partial packet. Save the entire packet, and swap the write function to
+ * save all future packets in the saved list.
+ */
+ bio->write = fr_bio_packet_write_buffer;
+
+ fr_assert(fr_bio_packet_list_num_elements(&my->free) > 0);
+
+ /*
+ * This can only error out if the free list has no more entries.
+ */
+ return fr_bio_packet_list_push(my, packet_ctx, buffer, size, (size_t) rcode);
+}
+
+/** Flush the packet list.
+ *
+ */
+static ssize_t fr_bio_packet_write_flush(fr_bio_packet_t *my, size_t size)
+{
+ size_t written;
+ fr_bio_t *next;
+
+ if (fr_bio_packet_list_num_elements(&my->pending) == 0) {
+ my->bio.write = fr_bio_packet_write_next;
+ return 0;
+ }
+
+ next = fr_bio_next(&my->bio);
+ fr_assert(next != NULL);
+
+ /*
+ * Loop over the saved packets, flushing them to the next bio.
+ */
+ written = 0;
+ while (written < size) {
+ ssize_t rcode;
+ fr_bio_packet_entry_t *item;
+
+ /*
+ * No more saved packets to write: stop.
+ */
+ item = fr_bio_packet_list_head(&my->pending);
+ if (!item) break;
+
+ /*
+ * A cancelled item must be partially written. A cancelled item which has zero bytes
+ * written should not be in the saved list.
+ */
+ fr_assert(!item->cancelled || (item->already_written > 0));
+
+ /*
+ * Push out however much data we can to the next bio.
+ */
+ rcode = next->write(next, item->packet_ctx, ((uint8_t const *) item->buffer) + item->already_written, item->size - item->already_written);
+ if (rcode == 0) break;
+
+ if (rcode < 0) {
+ if (rcode == fr_bio_error(IO_WOULD_BLOCK)) break;
+
+ return rcode;
+ }
+
+ /*
+ * Update the written count.
+ */
+ written += rcode;
+ item->already_written += rcode;
+
+ if (item->already_written < item->size) break;
+
+ /*
+ * We don't run "sent" callbacks for cancelled items.
+ */
+ if (item->cancelled) {
+ if (my->cancel) my->cancel(&my->bio, item->packet_ctx, item->buffer, item->size);
+ } else {
+ if (my->sent) my->sent(&my->bio, item->packet_ctx, item->buffer, item->size);
+ }
+
+ (void) fr_bio_packet_list_pop_head(&my->pending);
+#ifndef NDEBUG
+ item->buffer = NULL;
+ item->packet_ctx = NULL;
+ item->size = 0;
+ item->already_written = 0;
+#endif
+ item->cancelled = true;
+
+ fr_bio_packet_list_insert_head(&my->free, item);
+ }
+
+ /*
+ * If we've written all of the saved packets, go back to writing to the "next" bio.
+ */
+ if (!fr_bio_packet_list_head(&my->pending)) my->bio.write = fr_bio_packet_write_next;
+
+ return written;
+}
+
+/** Write to the packet list buffer.
+ *
+ * The special buffer pointer of NULL means flush(). On flush, we call next->write(), and if that succeeds,
+ * go back to "pass through" mode for the buffers.
+ */
+static ssize_t fr_bio_packet_write_buffer(fr_bio_t *bio, void *packet_ctx, void const *buffer, size_t size)
+{
+ fr_bio_packet_t *my = talloc_get_type_abort(bio, fr_bio_packet_t);
+
+ if (!buffer) return fr_bio_packet_write_flush(my, size);
+
+ /*
+ * This can only error out if the free list has no more entries.
+ */
+ return fr_bio_packet_list_push(my, packet_ctx, buffer, size, 0);
+}
+
+/** Read one packet from next bio.
+ *
+ * This function does NOT respect packet boundaries. The caller should use other APIs to determine how big
+ * the "next" packet is.
+ *
+ * The caller may buffer the output data itself, or it may use other APIs to do checking.
+ */
+static ssize_t fr_bio_packet_read(fr_bio_t *bio, void *packet_ctx, void *buffer, size_t size)
+{
+ ssize_t rcode;
+ fr_bio_packet_t *my = talloc_get_type_abort(bio, fr_bio_packet_t);
+ fr_bio_t *next;
+
+ next = fr_bio_next(&my->bio);
+ fr_assert(next != NULL);
+
+ rcode = next->read(next, packet_ctx, buffer, size);
+ if (rcode >= 0) return rcode;
+
+ /*
+ * We didn't read anything, return that.
+ */
+ if (rcode == fr_bio_error(IO_WOULD_BLOCK)) return rcode;
+
+ /*
+ * Error reading, which means that we can't write to it, either. We don't care if the error is
+ * EOF or anything else. We just cancel the outstanding packets, and shut ourselves down.
+ */
+ my->bio.read = fr_bio_eof_read;
+ my->bio.write = fr_bio_null_write;
+
+ fr_bio_packet_list_cancel(my);
+ return rcode;
+}
+
+/** Shutdown
+ *
+ * Cancel / close has to be called before re-init.
+ */
+static int fr_bio_packet_shutdown(fr_bio_t *bio)
+{
+ fr_bio_packet_t *my = talloc_get_type_abort(bio, fr_bio_packet_t);
+
+ fr_bio_packet_list_cancel(my);
+
+ my->bio.read = fr_bio_packet_read;
+ my->bio.write = fr_bio_packet_write_next;
+
+ return 0;
+}
+
+
+/** Allocate a packet-based bio.
+ *
+ * This bio assumes that each call to fr_bio_write() is for one packet, and only one packet. If the next bio
+ * returns a partial write, or WOULD BLOCK, then information about the packet is cached. Subsequent writes
+ * will write the partial data first, and then continue with subsequent writes.
+ *
+ * The caller is responsible for not freeing the packet ctx or the packet buffer until either the write has
+ * been performed, or the write has been cancelled.
+ *
+ * The read() API makes no provisions for reading complete packets. It simply returns whatever the next bio
+ * allows. If instead there is a need to read only complete packets, then the next bio should be
+ * fr_bio_mem_packet_alloc().
+ *
+ * The read() API may return 0. There may have been data read from an underlying FD, but that data did not
+ * make it through the filters of the "next" bios. e.g. Any underlying FD should be put into a "wait for
+ * readable" state.
+ *
+ * The write() API will return a full write, even if the next layer is blocked. Any underlying FD
+ * should be put into a "wait for writeable" state. The packet which was supposed to be written has been
+ * cached, and cannot be cancelled as it is partially written. The caller should likely start using another
+ * bio for writes. If the caller continues to use the bio, then any subsequent writes will *always* cache
+ * the packets. @todo - we need to mark up the bio as "blocked", and then have a write_blocked() API? ugh.
+ * or just add `bool blocked` and `bool eof` to both read/write bios
+ *
+ * Once the underlying FD has become writeable, the caller should call fr_bio_write(bio, NULL, NULL, SIZE_MAX);
+ * That will cause the pending packets to be flushed.
+ *
+ * The write() API may return that it's written a full packet, in which case it's either completely written to
+ * the next bio, or to the pending queue.
+ *
+ * The read / write APIs can return WOULD_BLOCK, in which case nothing was done. Any underlying FD should be
+ * put into a "wait for writeable" state. Other errors from bios "further down" the chain can also be
+ * returned.
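+ *
+ * A minimal usage sketch, not a definitive recipe. It assumes the caller already has a lower-level
+ * bio (the hypothetical "fd_bio"), an event loop which reports when the FD is writeable, and three
+ * hypothetical callbacks my_saved(), my_sent(), and my_cancel() matching the typedefs in packet.h.
+ *
+ * @code{.c}
+ * fr_bio_t *bio = fr_bio_packet_alloc(ctx, 16, my_saved, my_sent, my_cancel, fd_bio);
+ * if (!bio) return -1;
+ *
+ * // One packet per call. A return of "size" means the packet was written, or was queued.
+ * rcode = fr_bio_write(bio, packet_ctx, buffer, size);
+ *
+ * // Later, once the event loop says the FD is writeable, flush the pending queue.
+ * rcode = fr_bio_write(bio, NULL, NULL, SIZE_MAX);
+ * @endcode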
+ *
+ * @param ctx the talloc ctx
+ * @param max_saved Maximum number of packets to cache. Must be 1..2^17
+ * @param saved callback to run when a packet is saved in the pending queue
+ * @param sent callback to run when a packet is sent.
+ * @param cancel callback to run when a packet is cancelled.
+ * @param next the next bio which will perform the underlying reads and writes.
+ * @return
+ * - NULL on error, memory allocation failed
+ * - !NULL the bio
+ */
+fr_bio_t *fr_bio_packet_alloc(TALLOC_CTX *ctx, size_t max_saved,
+ fr_bio_packet_saved_t saved,
+ fr_bio_packet_callback_t sent,
+ fr_bio_packet_callback_t cancel,
+ fr_bio_t *next)
+{
+ size_t i;
+ fr_bio_packet_t *my;
+
+ if (!max_saved) max_saved = 1;
+ if (max_saved > (1 << 17)) max_saved = 1 << 17;
+
+ my = (fr_bio_packet_t *) talloc_zero_array(ctx, uint8_t, sizeof(fr_bio_packet_t) +
+ sizeof(fr_bio_packet_entry_t) * max_saved);
+ if (!my) return NULL;
+
+ talloc_set_type(my, fr_bio_packet_t);
+
+ my->max_saved = max_saved;
+
+ fr_bio_packet_list_init(&my->pending);
+ fr_bio_packet_list_init(&my->free);
+
+ my->saved = saved;
+ my->sent = sent;
+ my->cancel = cancel;
+
+ for (i = 0; i < max_saved; i++) {
+ my->array[i].my = my;
+ my->array[i].cancelled = true;
+ fr_bio_packet_list_insert_tail(&my->free, &my->array[i]);
+ }
+
+ my->bio.read = fr_bio_packet_read;
+ my->bio.write = fr_bio_packet_write_next;
+ my->cb.shutdown = fr_bio_packet_shutdown;
+
+ fr_bio_chain(&my->bio, next);
+
+ if (my->cancel) {
+ talloc_set_destructor(my, fr_bio_packet_destructor);
+ } else {
+ talloc_set_destructor((fr_bio_t *) my, fr_bio_destructor);
+ }
+
+ return (fr_bio_t *) my;
+}
+
+/** Cancel the write for a packet.
+ *
+ * Cancel one of the saved packets, and call the cancel() routine if it exists.
+ *
+ * There is no way to cancel all packets. The caller must find the lowest bio in the chain, and shut it down.
+ * e.g. by closing the socket via fr_bio_fd_close(). That function will take care of walking back up the
+ * chain, and shutting down each bio.
+ *
+ * @param bio the #fr_bio_packet_t
+ * @param ctx The context returned from #fr_bio_packet_saved_t
+ * @return
+ * - <0 no such packet was found in the list of saved packets, OR the packet cannot be cancelled.
+ * - 0 the packet was cancelled.
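+ *
+ * A minimal sketch of how the handle reaches this function, assuming a hypothetical my_packet_t
+ * which the caller passes as packet_ctx, and which has a "void *cancel_ctx" field:
+ *
+ * @code{.c}
+ * static void my_saved(UNUSED fr_bio_t *bio, void *packet_ctx, UNUSED const void *buffer, UNUSED size_t size, void *ctx)
+ * {
+ *	my_packet_t *p = packet_ctx;
+ *
+ *	p->cancel_ctx = ctx;	// remember the handle for a later fr_bio_packet_cancel()
+ * }
+ *
+ * // Elsewhere, when the packet is no longer wanted:
+ * if (fr_bio_packet_cancel(bio, p->cancel_ctx) < 0) {
+ *	// the handle was invalid, or a partially written packet could not be finished
+ * }
+ * @endcode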
+ */
+int fr_bio_packet_cancel(fr_bio_t *bio, void *ctx)
+{
+ fr_bio_packet_t *my = talloc_get_type_abort(bio, fr_bio_packet_t);
+ fr_bio_packet_entry_t *item = ctx;
+
+ if (!((item >= &my->array[0]) && (item < &my->array[my->max_saved]))) {
+ return -1;
+ }
+
+ /*
+ * Already cancelled, that's a NOOP.
+ */
+ if (item->cancelled) return 0;
+
+ /*
+ * If the item has been partially written, AND we have a working write function, see if we can
+ * cancel it.
+ */
+ if (item->already_written && (my->bio.write != fr_bio_null_write)) {
+ ssize_t rcode;
+ fr_bio_t *next;
+
+ next = fr_bio_next(bio);
+ fr_assert(next != NULL);
+
+ /*
+ * If the write fails or returns nothing, the item can't be cancelled.
+ */
+ rcode = next->write(next, item->packet_ctx, ((uint8_t const *) item->buffer) + item->already_written, item->size - item->already_written);
+ if (rcode <= 0) return -1;
+
+ /*
+ * If we haven't written the full item, it can't be cancelled.
+ */
+ item->already_written += rcode;
+ if (item->already_written < item->size) return -1;
+
+ /*
+ * Else the item has been fully written, it can be safely cancelled.
+ */
+ }
+
+
+ /*
+ * Remove it from the saved list, and run the cancellation callback.
+ */
+ (void) fr_bio_packet_list_remove(&my->pending, item);
+ fr_bio_packet_list_insert_head(&my->free, item);
+ item->cancelled = true;
+
+ if (my->cancel) my->cancel(bio, item->packet_ctx, item->buffer, item->size);
+ return 0;
+}
--- /dev/null
+#pragma once
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
+ */
+
+/**
+ * $Id$
+ * @file lib/bio/packet.h
+ * @brief Binary IO abstractions for packets in buffers
+ *
+ * Write packets of data to bios. If a packet is partially
+ * read/written, it is cached for later processing.
+ *
+ * @todo - Not quite done yet. It still needs to be integrated into the bio framework,
+ * and be managed through a bio of its own.
+ *
+ * @copyright 2024 Network RADIUS SAS (legal@networkradius.com)
+ */
+RCSIDH(lib_bio_packet_h, "$Id$")
+
+typedef void (*fr_bio_packet_callback_t)(fr_bio_t *bio, void *packet_ctx, const void *buffer, size_t size);
+typedef void (*fr_bio_packet_saved_t)(fr_bio_t *bio, void *packet_ctx, const void *buffer, size_t size, void *ctx);
+
+fr_bio_t *fr_bio_packet_alloc(TALLOC_CTX *ctx, size_t max_saved,
+ fr_bio_packet_saved_t saved,
+ fr_bio_packet_callback_t sent,
+ fr_bio_packet_callback_t cancel,
+ fr_bio_t *next) CC_HINT(nonnull(1,6));
+
+int fr_bio_packet_cancel(fr_bio_t *bio, void *ctx) CC_HINT(nonnull);
--- /dev/null
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or (at
+ * your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
+ */
+
+/**
+ * $Id$
+ * @file lib/bio/pipe.c
+ * @brief BIO abstractions for in-memory pipes
+ *
+ * @copyright 2024 Network RADIUS SAS (legal@networkradius.com)
+ */
+#include <freeradius-devel/bio/bio_priv.h>
+#include <freeradius-devel/bio/mem.h>
+
+#include <freeradius-devel/bio/pipe.h>
+
+#include <pthread.h>
+
+/** The pipe bio
+ *
+ */
+typedef struct {
+ FR_BIO_COMMON;
+
+ fr_bio_t *next;
+
+ bool eof; //!< are we at EOF?
+
+ fr_bio_pipe_cb_funcs_t signal; //!< inform us that the pipe is readable
+
+ pthread_mutex_t mutex;
+} fr_bio_pipe_t;
+
+
+static int fr_bio_pipe_destructor(fr_bio_pipe_t *my)
+{
+ pthread_mutex_destroy(&my->mutex);
+
+ return 0;
+}
+
+/** Read from the pipe.
+ *
+ * Once EOF is set, any pending data is read, and then EOF is returned.
+ */
+static ssize_t fr_bio_pipe_read(fr_bio_t *bio, void *packet_ctx, void *buffer, size_t size)
+{
+ ssize_t rcode;
+ fr_bio_pipe_t *my = talloc_get_type_abort(bio, fr_bio_pipe_t);
+
+ fr_assert(my->next != NULL);
+
+ pthread_mutex_lock(&my->mutex);
+ rcode = my->next->read(my->next, packet_ctx, buffer, size);
+ if ((rcode == 0) && my->eof) {
+ rcode = fr_bio_error(EOF);
+
+ } else if (rcode > 0) {
+ /*
+ * There is room to write more data.
+ *
+ * @todo - only signal when we transition from BLOCKED to unblocked.
+ */
+ my->signal.writeable(&my->bio);
+ }
+ pthread_mutex_unlock(&my->mutex);
+
+ return rcode;
+}
+
+
+/** Write to the pipe.
+ *
+ * Once EOF is set, no further writes are possible.
+ */
+static ssize_t fr_bio_pipe_write(fr_bio_t *bio, void *packet_ctx, void const *buffer, size_t size)
+{
+ ssize_t rcode;
+ fr_bio_pipe_t *my = talloc_get_type_abort(bio, fr_bio_pipe_t);
+
+ fr_assert(my->next != NULL);
+
+ pthread_mutex_lock(&my->mutex);
+ if (!my->eof) {
+ rcode = my->next->write(my->next, packet_ctx, buffer, size);
+
+ /*
+ * There is more data to read.
+ *
+ * @todo - only signal when we transition from no data to data.
+ */
+ if (rcode > 0) {
+ my->signal.readable(&my->bio);
+ }
+
+ } else {
+ rcode = fr_bio_error(EOF);
+ }
+ pthread_mutex_unlock(&my->mutex);
+
+ return rcode;
+}
+
+/** Shutdown callback.
+ *
+ */
+static int fr_bio_pipe_shutdown(fr_bio_t *bio)
+{
+ int rcode;
+ fr_bio_pipe_t *my = talloc_get_type_abort(bio, fr_bio_pipe_t);
+
+ fr_assert(my->next != NULL);
+
+ pthread_mutex_lock(&my->mutex);
+ rcode = fr_bio_shutdown(my->next);
+ pthread_mutex_unlock(&my->mutex);
+
+ return rcode;
+}
+
+/** Allocate a thread-safe pipe which can be used for both reads and writes.
+ *
+ * Due to talloc issues with multiple threads, if the caller wants a bi-directional pipe, this function will
+ * need to be called twice. That way a free in each context won't result in a race condition on two mutex
+ * locks.
+ *
+ * For now, it's too difficult to emulate the pipe[2] behavior, where two identical "connected" things are
+ * returned, and either can be used for reading or for writing.
+ *
+ * i.e. a pipe is really a mutex-protected memory buffer. One side should call write (and never read). The
+ * other side should call read (and never write).
+ *
+ * The pipe should be freed only after both ends have set EOF.
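+ *
+ * A minimal sketch of one-directional use, as an illustration only. The callbacks my_readable()
+ * and my_writeable() are hypothetical, the reader and writer are assumed to run in different
+ * threads, and fr_bio_read() is assumed to be the read counterpart of fr_bio_write().
+ *
+ * @code{.c}
+ * fr_bio_pipe_cb_funcs_t cb = { .readable = my_readable, .writeable = my_writeable };
+ * fr_bio_t *pipe = fr_bio_pipe_alloc(ctx, &cb, 8192);
+ * if (!pipe) return -1;
+ *
+ * rcode = fr_bio_write(pipe, NULL, data, data_len);	// writer thread only
+ * rcode = fr_bio_read(pipe, NULL, buffer, sizeof(buffer));	// reader thread only
+ *
+ * fr_bio_pipe_set_eof(pipe);	// writes now return EOF, reads drain pending data and then return EOF
+ * @endcode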
+ */
+fr_bio_t *fr_bio_pipe_alloc(TALLOC_CTX *ctx, fr_bio_pipe_cb_funcs_t *cb, size_t buffer_size)
+{
+ fr_bio_pipe_t *my;
+
+ if (!cb->readable || !cb->writeable) return NULL;
+
+ if (buffer_size < 1024) buffer_size = 1024;
+ if (buffer_size > (1 << 20)) buffer_size = (1 << 20);
+
+ my = talloc_zero(ctx, fr_bio_pipe_t);
+ if (!my) return NULL;
+
+ my->next = fr_bio_mem_sink_alloc(my, buffer_size);
+ if (!my->next) {
+ talloc_free(my);
+ return NULL;
+ }
+
+ my->signal = *cb;
+
+ pthread_mutex_init(&my->mutex, NULL);
+
+ my->bio.read = fr_bio_pipe_read;
+ my->bio.write = fr_bio_pipe_write;
+ my->cb.shutdown = fr_bio_pipe_shutdown;
+
+ talloc_set_destructor(my, fr_bio_pipe_destructor);
+ return (fr_bio_t *) my;
+}
+
+/** Set EOF.
+ *
+ * Either side can set EOF, in which case pending reads are still processed. Writes return EOF immediately.
+ * Readers return pending data, and then EOF.
+ */
+void fr_bio_pipe_set_eof(fr_bio_t *bio)
+{
+ fr_bio_pipe_t *my = talloc_get_type_abort(bio, fr_bio_pipe_t);
+
+ pthread_mutex_lock(&my->mutex);
+ my->eof = true;
+ pthread_mutex_unlock(&my->mutex);
+}
--- /dev/null
+#pragma once
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
+ */
+
+/**
+ * $Id$
+ * @file lib/bio/pipe.h
+ * @brief BIO pipe handlers.
+ *
+ * @copyright 2024 Network RADIUS SAS (legal@networkradius.com)
+ */
+RCSIDH(lib_bio_pipe_h, "$Id$")
+
+typedef struct {
+ fr_bio_callback_t readable;
+ fr_bio_callback_t writeable;
+} fr_bio_pipe_cb_funcs_t;
+
+fr_bio_t *fr_bio_pipe_alloc(TALLOC_CTX *ctx, fr_bio_pipe_cb_funcs_t *cb, size_t buffer_size) CC_HINT(nonnull);
+
+void fr_bio_pipe_set_eof(fr_bio_t *bio);