From 3686d215fe6da344e1190d7027f0c007662c7698 Mon Sep 17 00:00:00 2001
From: Hugo Landau
Date: Wed, 24 Apr 2024 13:38:27 +0100
Subject: [PATCH] QUIC FUTURE: Add concurrency architecture design document
MIME-Version: 1.0
Content-Type: text/plain; charset=utf8
Content-Transfer-Encoding: 8bit

Reviewed-by: Neil Horman
Reviewed-by: Saša Nedvědický
Reviewed-by: Tomas Mraz
(Merged from https://github.com/openssl/openssl/pull/26025)
---
 .../images/quic-concurrency-models.svg      |   1 +
 doc/designs/quic-design/quic-concurrency.md | 412 ++++++++++++++++++
 2 files changed, 413 insertions(+)
 create mode 100644 doc/designs/quic-design/images/quic-concurrency-models.svg
 create mode 100644 doc/designs/quic-design/quic-concurrency.md

diff --git a/doc/designs/quic-design/images/quic-concurrency-models.svg b/doc/designs/quic-design/images/quic-concurrency-models.svg
new file mode 100644
index 00000000000..7b5a623051d
--- /dev/null
+++ b/doc/designs/quic-design/images/quic-concurrency-models.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/doc/designs/quic-design/quic-concurrency.md b/doc/designs/quic-design/quic-concurrency.md
new file mode 100644
index 00000000000..55af2a94db9
--- /dev/null
+++ b/doc/designs/quic-design/quic-concurrency.md
@@ -0,0 +1,412 @@
+QUIC Concurrency Architecture
+=============================
+
+Introduction
+------------
+
+Most QUIC implementations in C are offered as a simple state machine without any
+included I/O solution. Applications must do significant integration work to
+provide the necessary infrastructure for a QUIC implementation to integrate
+with. Moreover, blocking I/O at an application level may not be supported.
+
+OpenSSL QUIC seeks to offer a QUIC solution which can serve multiple use cases:
+
+- Firstly, it seeks to offer the simple state machine model and a fully
+  customisable network path (via a BIO) for those who want it;
+
+- Secondly, it seeks to offer a turnkey solution with an in-the-box I/O
+  and polling solution which can support blocking API calls in a Berkeley
+  sockets-like way.
+
+These usage modes are somewhat diametrically opposed. One involves libssl
+consuming no resources but those it is given, with an application responsible
+for synchronisation and a potentially custom network I/O path. This usage model
+is not “smart”. Network traffic is connected to the state machine and state is
+input and output from the state machine as needed by an application on a purely
+non-blocking basis. Determining *when* to do anything is largely the
+application's responsibility.
+
+The other diametrically opposed usage mode involves libssl managing more things
+internally to provide an easier to use solution. For example, it may involve
+spinning up background threads to ensure connections are serviced regularly (as
+in our existing client-side thread assisted mode).
+
+In order to provide for these different use cases, the concept of concurrency
+models is introduced. A concurrency model defines how “cleverly” the QUIC engine
+will operate and how many background resources (e.g. threads, other OS
+resources) will be established to support operation.
+
+Concurrency Models
+------------------
+
+- **Unsynchronised Concurrency Model (UCM):** In the Unsynchronised Concurrency
+  Model, calls to SSL objects are not synchronised. There is no locking on any
+  APL call (the omission of which is purely an optimisation). The application is
+  either single-threaded or is otherwise responsible for doing synchronisation
+  itself.
+
+  Blocking API calls are not supported under this model. This model is intended
+  primarily for single-threaded use as a simple state machine by advanced
+  applications, and many such applications are likely to disable autoticking
+  (a sketch of this style of usage follows this list).
+
+- **Contentive Concurrency Model (CCM):** In the
+  Contentive Concurrency Model, calls to SSL objects are wrapped in locks and
+  multi-threaded usage of a QUIC connection (for example, parallel writes to
+  different QUIC stream SSL objects belonging to the same QUIC connection) is
+  synchronised by a mutex.
+
+  This is contentive in the sense that if a large number of threads are trying
+  to write to different streams on the same connection, a large amount of lock
+  contention will occur. As such, this concurrency model will not scale or
+  provide good performance, at least within the context of concurrent use
+  of a single connection.
+
+  Under this model, APL calls by the application result in lock-wrapped
+  mutations of QUIC core objects (`QUIC_CHANNEL`, `QUIC_STREAM`, etc.) on the
+  same thread.
+
+  This model may be used either in a variant which does not support blocking
+  (NB-CCM) or which does support blocking (B-CCM). The blocking variant must
+  spin up additional OS resources to correctly support blocking semantics.
+
+- **Thread Assisted Contentive Concurrency Model (TA-CCM):** This is currently
+  implemented by our thread assisted mode for client-side QUIC usage. It does
+  not realise the full state separation or performance of the Worker Concurrency
+  Model (WCM) below. Instead, it simply spawns a background thread which ensures
+  QUIC timer events are handled as needed. It makes use of the Contentive
+  Concurrency Model for performing that handling, in that it obtains a lock when
+  ticking a QUIC connection just as any call by an application would.
+
+  This mode is likely to be deprecated in favour of the full Worker Concurrency
+  Model (WCM), which it will naturally be subsumed by.
+
+- **Worker Concurrency Model (WCM):** In the Worker Concurrency Model,
+  a background worker thread is spawned to manage connection processing. All
+  interaction with an SSL object goes through this thread in some way.
+  Interactions with SSL objects are essentially translated into commands and
+  handled by the worker thread. To optimise performance and minimise lock
+  contention, there is an emphasis on message passing over locking.
+  Internal dataflow for application data can be managed in a zero-copy way to
+  minimise the costs of this message passing.
+
+  Under this model, QUIC core objects (`QUIC_CHANNEL`, `QUIC_STREAM`, etc.) will
+  live solely on the worker thread and access to these objects by an application
+  thread will be entirely forbidden.
+
+  Blocking API calls are supported under this model.
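+
+To make the contrast in responsibilities concrete, the following is a minimal,
+purely illustrative sketch of how an application might drive a single
+connection itself at the “application responsible” end of this spectrum
+(UCM/CCM), using the existing public non-blocking APIs (`SSL_handle_events`,
+`SSL_get_event_timeout`, `SSL_net_read_desired`, `SSL_net_write_desired`). The
+helper function and its error handling are assumptions for illustration only
+and are not part of the proposed design:
+
+```c
+#include <sys/select.h>
+#include <openssl/ssl.h>
+
+/* One iteration of a hypothetical application-managed event loop. */
+static int service_connection(SSL *conn, int fd)
+{
+    struct timeval tv;
+    int isinf = 0;
+    fd_set rfds, wfds;
+
+    /* Let libssl process any pending timer and network events. */
+    if (!SSL_handle_events(conn))
+        return 0;
+
+    /* Ask libssl when it next needs to be called again. */
+    if (!SSL_get_event_timeout(conn, &tv, &isinf))
+        return 0;
+
+    FD_ZERO(&rfds);
+    FD_ZERO(&wfds);
+    if (SSL_net_read_desired(conn))
+        FD_SET(fd, &rfds);
+    if (SSL_net_write_desired(conn))
+        FD_SET(fd, &wfds);
+
+    /* Under UCM/CCM, any blocking happens here in the application. */
+    return select(fd + 1, &rfds, &wfds, NULL, isinf ? NULL : &tv) >= 0;
+}
+```
+
+Under WCM, by contrast, this polling and timeout handling lives on the worker
+thread inside libssl rather than in application code.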
+
+These concurrency models are summarised as follows:
+
+| Model   | Sophistication | Concurrency           | Blocking Supported | OS Resources              | Timer Events    | RX Steering | Core State Affinity  |
+|---------|----------------|-----------------------|--------------------|---------------------------|-----------------|-------------|----------------------|
+| UCM     | Lowest         | ST only               | No                 | None                      | App Responsible | None        | App Thread           |
+| CCM     |                | MT (Contentive)       | Optional           | Mutex, (Notifier)         | App Responsible | TBD         | App Threads          |
+| TA-CCM† |                | MT (Contentive)       | Optional           | Mutex, Thread, (Notifier) | Managed         | TBD         | App & Assist Threads |
+| WCM     | Highest        | MT (High Performance) | Yes                | Mutex, Thread, Notifier   | Managed         | Futureproof | Worker Thread        |
+
+† To eventually be deprecated in favour of WCM.
+
+Legend:
+
+- **Blocking Supported:** Whether blocking calls to e.g. `SSL_read` can be
+  supported. If this is listed as “optional”, extra resources are required to
+  support this under the listed model and these resources could be omitted if an
+  application indicates it does not need this functionality at initialisation
+  time.
+
+- **OS Resources:** “Mutex” refers to mutex and condition variable resources.
+  “Notifier” refers to a kind of OS resource needed to allow one thread to wake
+  another thread which is currently blocking in an OS socket polling call such
+  as poll(2) (e.g. an eventfd or socketpair). Resources listed in parentheses in
+  the table above are required only if blocking support is desired.
+
+- **Timer Events:** Whether the application is responsible for ensuring QUIC
+  timeout events are handled in a timely manner.
+
+- **RX Steering:** The matter of RX steering will be discussed in detail in a
+  future document. Broadly speaking, RX steering concerns whether incoming
+  traffic for multiple different QUIC connections on the same local port (e.g.
+  for a server) can be vectored *by the OS* to different threads or whether the
+  demuxing of incoming traffic for different connections has to be done manually
+  on an in-process basis.
+
+  The WCM model most readily supports RX steering and is futureproof in this
+  regard. The feasibility of having the UCM and CCM models support RX steering
+  is left for future analysis.
+
+- **Core State Affinity:** Which threads are allowed to touch the QUIC core
+  objects (`QUIC_CHANNEL`, `QUIC_STREAM`, etc.).
+
+Architecture
+------------
+
+To recap, the API Personality Layer (APL) refers to the code in `quic_impl.c`
+which implements the libssl API personality (`SSL_write`, etc.). The APL is
+cleanly separated from the QUIC core implementation (`QUIC_CHANNEL`, etc.).
+
+Since UCM is basically a slight optimisation of CCM in which unnecessary locking
+is elided, discussion from here on will focus on CCM and WCM except where
+there are specific differences between CCM and UCM.
+
+Supporting both CCM and WCM creates significant architectural challenges. Under
+CCM, QUIC core objects have their state mutated under lock by arbitrary
+application threads and these mutations happen during APL calls. By contrast, a
+performant WCM architecture requires that APL calls be recorded and serviced in
+an asynchronous fashion involving message passing to a worker thread. This
+threatens to require highly divergent dispatch architectures for the two
+concurrency models.
+
+As such, the concept of a **Concurrency Management Layer (CML)** is introduced.
+The CML lives between the APL and the QUIC core code.
+It is responsible for dispatching in-thread mutations of QUIC core objects when
+operating under CCM, and for dispatching messages to a worker thread under WCM.
+
+![Concurrency Models Diagram](images/quic-concurrency-models.svg)
+
+There are two different CMLs:
+
+- **Direct CML (DCML)**, in which core objects are worked on in the same thread
+  which made an APL call, under lock;
+
+- **Worker CML (WCML)**, in which core objects are managed by a worker thread
+  with communication via message passing. This CML is split into a front end
+  (WCML-FE) and back end (WCML-BE).
+
+The legacy thread assisted mode uses a bespoke method which is similar to the
+approach used by the DCML.
+
+CML Design
+----------
+
+The CML is designed to have as small an API surface area as possible to enable
+unified handling of as many kinds of (APL) API operations as possible. The idea
+is that complex APL calls are translated into simple operations on the CML.
+
+At its core, the CML exposes some number of *pipes*. The number of pipes which
+can be accessed via the CML varies as connections and streams are created and
+destroyed. A pipe is a *unidirectional* transport for byte streams. Zero-copy
+optimisations are expected to be implemented in future but are deferred for now.
+
+The CML (`QUIC_CML`) allows the caller to refer to a pipe by providing an opaque
+pipe handle (`QUIC_CML_PIPE`). If the pipe is a sending pipe, the caller can use
+`ossl_cml_write` to try to add bytes to it. Conversely, if it is a receiving
+pipe, the caller can use `ossl_cml_read` to try to read bytes from it.
+
+The method `ossl_cml_block_until` allows the caller to block until at least one
+of the provided pipe handles is ready. Ready means that at least one byte can be
+written (for a sending pipe) or at least one byte can be read (for a receiving
+pipe).
+
+Note that there is only expected to be one `QUIC_CML` instance per QUIC event
+processing domain (i.e., per `QUIC_DOMAIN` / `QUIC_ENGINE` instance). The CML
+fully abstracts the QUIC core objects such as `QUIC_ENGINE` or `QUIC_CHANNEL` so
+that the APL never sees them.
+
+The caller retrieves a pipe handle using `ossl_cml_get_pipe`. This function
+retrieves a pipe based on two values:
+
+- a CML pipe class;
+- a CML *selector*.
+
+The CML selector is a tagged union structure which specifies what pipe is to be
+retrieved. Abstractly, examples of selectors include:
+
+```text
+    Domain      ()
+    Listener    (listener_id: uint)
+    Conn        (conn_id: uint)
+    Stream      (conn_id: uint, stream_id: u64)
+```
+
+In other words, the CML selector selects the “object” to retrieve a pipe from.
+
+The CML pipe class is one of the following values:
+
+- Request
+- Notification
+- App Send
+- App Recv
+
+The pipe classes available for a given selector vary. For example, the “App
+Send” and “App Recv” pipes only exist on a stream, so it is invalid to request
+such a pipe in conjunction with a different type of selector.
+
+The “Request” and “App Send” classes expose send-only pipes, and the
+“Notification” and “App Recv” classes expose receive-only pipes.
+
+For any given CML selector, the Request pipe is used to send serialised commands
+for asynchronous processing in relation to the entity selected by that selector.
+Conversely, the Notification pipe returns asynchronous notifications. These
+could be in relation to a previous command (e.g. indicating whether a command
+succeeded), or unprompted notifications about other events.
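+
+To make the selector and pipe class mechanism more concrete, the following is a
+hypothetical sketch of what the `QUIC_CML_SELECTOR` tagged union and pipe
+retrieval by the APL might look like. The structure layout, tag names and the
+`get_stream_pipes` helper are illustrative assumptions only; `ossl_cml_get_pipe`
+and the pipe class constants correspond to the API sketch later in this
+document:
+
+```c
+#include <stdint.h>
+
+/* Hypothetical layout of the tagged union selector described above. */
+typedef struct quic_cml_selector_st {
+    enum {
+        QUIC_CML_SELECT_DOMAIN,
+        QUIC_CML_SELECT_LISTENER,
+        QUIC_CML_SELECT_CONN,
+        QUIC_CML_SELECT_STREAM
+    } type;
+    union {
+        struct { uint64_t listener_id; } listener;
+        struct { uint64_t conn_id; } conn;
+        struct { uint64_t conn_id; uint64_t stream_id; } stream;
+    } u;
+} QUIC_CML_SELECTOR;
+
+/*
+ * Illustrative retrieval of the application data pipes for a stream. The APL
+ * stream object could cache the returned handles (see below).
+ */
+static int get_stream_pipes(QUIC_CML *cml, uint64_t conn_id, uint64_t stream_id,
+                            QUIC_CML_PIPE *app_send, QUIC_CML_PIPE *app_recv)
+{
+    QUIC_CML_SELECTOR sel;
+
+    sel.type               = QUIC_CML_SELECT_STREAM;
+    sel.u.stream.conn_id   = conn_id;
+    sel.u.stream.stream_id = stream_id;
+
+    /* "App Send" and "App Recv" pipe classes only exist for streams. */
+    if (!ossl_cml_get_pipe(cml, QUIC_CML_CLASS_APP_SEND, &sel, app_send)
+        || !ossl_cml_get_pipe(cml, QUIC_CML_CLASS_APP_RECV, &sel, app_recv))
+        return 0;
+
+    return 1;
+}
+```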
+
+The underlying pattern here is that there is a bidirectional channel for control
+messages, and a bidirectional channel for application data, both comprised of
+two unidirectional pipes in turn.
+
+Pipe handles are stable for as long as the pipe they reference exists, so an APL
+object can cache a pipe handle if desired.
+
+All CML methods are thread safe. The CML implementation handles any necessary
+locking (if any) internally.
+
+The `ossl_cml_write_available` and `ossl_cml_read_available` calls determine the
+number of bytes which can currently be written to a send-only pipe, or read from
+a receive-only pipe, respectively.
+
+**Race conditions.** Because these are separate calls to `ossl_cml_write` and
+`ossl_cml_read`, the values returned by these functions may become out of date
+before the caller has a chance to call `ossl_cml_write` or `ossl_cml_read`.
+However, such changes are guaranteed to be monotonically in favour of the
+caller; for example, the value returned by `ossl_cml_write_available` will only
+ever increase asynchronously (and only decrease as a result of an
+`ossl_cml_write` call). Conversely, the value returned by
+`ossl_cml_read_available` will only ever increase asynchronously (and only
+decrease as a result of an `ossl_cml_read` call). Assuming that only one thread
+makes calls to CML functions at a given time *for a given pipe*, this therefore
+poses no issue for callers.
+
+Concurrent use of `ossl_cml_write` or `ossl_cml_read` for a given pipe is not
+intended (and would not make sense in any case). The caller is responsible for
+synchronising such calls.
+
+**Examples of pipe usage.** The application data pipes are used to serialise the
+actual application data sent or received on a QUIC stream. The usage of the
+request/notification pipes is more varied and used for control activity. There
+is therefore a “control/data” separation here. The request and notification
+pipes transport tagged unions. Abstractly, commands and notifications might
+include:
+
+- Request: Reset Stream (error code: u64)
+- Notification: Connection Terminated by Peer
+
+**Example implementation of `SSL_write`.** An `SSL_write`-like API might be
+implemented in the APL like this:
+
+```c
+int do_write(QUIC_CML *cml,
+             QUIC_CML_PIPE notification_pipe,
+             QUIC_CML_PIPE app_send_pipe,
+             const void *buf, size_t buf_len)
+{
+    size_t bytes_written = 0;
+
+    for (;;) {
+        /* e.g. connection termination */
+        process_any_notifications(notification_pipe);
+
+        /* state checks, etc. */
+        if (...->conn_terminated)
+            return 0;
+
+        if (buf_len == 0)
+            return 1;
+
+        if (!ossl_cml_write(cml, app_send_pipe, buf, buf_len, &bytes_written))
+            return 0;
+
+        if (bytes_written == 0) {
+            QUIC_CML_PIPE pipes[] = { notification_pipe, app_send_pipe };
+
+            if (!should_block())
+                break;
+
+            ossl_cml_block_until(cml, pipes, OSSL_NELEM(pipes),
+                                 ossl_time_infinite());
+            continue; /* try again */
+        }
+
+        buf      = (const unsigned char *)buf + bytes_written;
+        buf_len -= bytes_written;
+    }
+
+    return 1;
+}
+```
+
+The proposed CML API is sketched below:
+
+```c
+/*
+ * Creates a new CML using the Direct CML (DCML) implementation. need_locking
+ * may be 0 to elide mutex usage if the application is guaranteed to synchronise
+ * access or is purely single-threaded.
+ */
+QUIC_CML *ossl_cml_new_direct(int need_locking);
+
+/* Creates a new CML using the Worker CML (WCML) implementation. */
+QUIC_CML *ossl_cml_new_worker(size_t num_worker_threads);
+
+/*
+ * Starts the CML operating. Idempotent after it returns successfully. For the
+ * WCML this might e.g. start background threads; for the DCML it is likely to
+ * be a no-op (but must still be called).
+ */
+int ossl_cml_start(QUIC_CML *cml);
+
+/*
+ * Begins the CML shutdown process. Returns 1 once shutdown is complete; may
+ * need to be called multiple times until shutdown is done.
+ */
+int ossl_cml_shutdown(QUIC_CML *cml);
+
+/*
+ * Immediate free of the CML. This is always safe but may cause handling
+ * of a connection to be aborted abruptly as it is an immediate teardown
+ * of all state.
+ */
+void ossl_cml_free(QUIC_CML *cml);
+
+/*
+ * Retrieves a pipe for a logical CML object described by selector. The pipe
+ * handle, which is stable over the life of the logical CML object, is written
+ * to *pipe_handle. class_ is a QUIC_CML_CLASS value.
+ */
+enum {
+    QUIC_CML_CLASS_REQUEST,      /* control; send */
+    QUIC_CML_CLASS_NOTIFICATION, /* control; recv */
+    QUIC_CML_CLASS_APP_SEND,     /* data; send */
+    QUIC_CML_CLASS_APP_RECV      /* data; recv */
+};
+
+int ossl_cml_get_pipe(QUIC_CML *cml,
+                      int class_,
+                      const QUIC_CML_SELECTOR *selector,
+                      QUIC_CML_PIPE *pipe_handle);
+
+/*
+ * Returns the number of bytes a sending pipe can currently accept. The returned
+ * value may increase over time asynchronously but will only decrease in
+ * response to an ossl_cml_write call.
+ */
+size_t ossl_cml_write_available(QUIC_CML *cml, QUIC_CML_PIPE pipe_handle);
+
+/*
+ * Appends bytes into a sending pipe by copying them. The buffer can be freed
+ * as soon as this call returns.
+ */
+int ossl_cml_write(QUIC_CML *cml, QUIC_CML_PIPE pipe_handle,
+                   const void *buf, size_t buf_len);
+
+/*
+ * Returns the number of bytes a receiving pipe currently has waiting to be
+ * read. The returned value may increase over time asynchronously but will only
+ * decrease in response to an ossl_cml_read call.
+ */
+size_t ossl_cml_read_available(QUIC_CML *cml, QUIC_CML_PIPE pipe_handle);
+
+/*
+ * Reads bytes from a receiving pipe by copying them.
+ */
+int ossl_cml_read(QUIC_CML *cml, QUIC_CML_PIPE pipe_handle,
+                  void *buf, size_t buf_len);
+
+/*
+ * Blocks until at least one of the pipes in the array specified by
+ * pipe_handles is ready, or until the deadline given is reached.
+ *
+ * A pipe is ready if:
+ *
+ *   - it is a sending pipe and one or more bytes can now be written;
+ *   - it is a receiving pipe and one or more bytes can now be read.
+ */
+int ossl_cml_block_until(QUIC_CML *cml,
+                         const QUIC_CML_PIPE *pipe_handles,
+                         size_t num_pipe_handles,
+                         OSSL_TIME deadline);
+```
-- 
2.47.2