From: Chris Mason
Date: Fri, 22 May 2026 13:39:06 +0000 (-0400)
Subject: sunrpc: pin svc_xprt across the asynchronous TLS handshake callback
X-Git-Url: http://git.ipfire.org/gitweb.cgi?a=commitdiff_plain;h=4f988f3a2808fb659f3880c282041ff067acad78;p=thirdparty%2Flinux.git

sunrpc: pin svc_xprt across the asynchronous TLS handshake callback

svc_tcp_handshake() stores the raw svc_xprt pointer in
tls_handshake_args.ta_data and submits the request through
tls_server_hello_x509(). The handshake core takes only
sock_hold(req->hr_sk); nothing references the embedding struct
svc_sock that svc_tcp_handshake_done() reaches via container_of().

Two close races leave the in-flight callback writing through a freed
svc_sock. svc_sock_free() calls tls_handshake_cancel() and discards
its return value: a false return means handshake_complete() has
already set HANDSHAKE_F_REQ_COMPLETED but hp_done() may not have
finished, yet svc_sock_free() proceeds to kfree(svsk).

The cancel-loser fall-through inside svc_tcp_handshake() itself
produces the same window: when
wait_for_completion_interruptible_timeout() returns <= 0 (timeout or
signal) and tls_handshake_cancel() returns false, the function does
not drain, returns, and svc_handle_xprt() calls svc_xprt_received(),
which clears XPT_BUSY and can drop the last reference. A concurrent
close then runs svc_sock_free() while svc_tcp_handshake_done() is
still updating xpt_flags and walking svsk->sk_handshake_done.

The corruption surfaces as set_bit/clear_bit RMW into the freed
xpt_flags slab slot and as complete_all() walking and writing the
freed wait_queue_head_t list embedded in sk_handshake_done -- a
slab-corruption primitive, not a benign read. The path is reachable
on any TLS-enabled NFS server whenever a connection close overlaps
the tlshd downcall delivery window; the interruptible wait means
signal delivery suffices, not just SVC_HANDSHAKE_TO expiry.

Take svc_xprt_get(xprt) immediately before tls_server_hello_x509() so
the in-flight callback owns its own reference. Release it on the two
edges where the callback is guaranteed not to fire -- submission
failure from tls_server_hello_x509() and a successful
tls_handshake_cancel() -- and at the tail of svc_tcp_handshake_done()
after complete_all().

Fixes: b3cbf98e2fdf ("SUNRPC: Support TLS handshake in the server-side TCP socket code")
Cc: stable@vger.kernel.org
Signed-off-by: Chris Mason
Assisted-by: kres (claude-opus-4-7)
[cel: rewrote commit message to describe the actual change]
Reviewed-by: Jeff Layton
Signed-off-by: Chuck Lever
---

diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index c434b6a6637d9..6a75ac4565db8 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -471,6 +471,7 @@ static void svc_tcp_handshake_done(void *data, int status, key_serial_t peerid)
 	}
 	clear_bit(XPT_HANDSHAKE, &xprt->xpt_flags);
 	complete_all(&svsk->sk_handshake_done);
+	svc_xprt_put(xprt);
 }
 
 /**
@@ -494,9 +495,13 @@ static void svc_tcp_handshake(struct svc_xprt *xprt)
 	clear_bit(XPT_TLS_SESSION, &xprt->xpt_flags);
 	init_completion(&svsk->sk_handshake_done);
 
+	/* Pin the transport across the asynchronous handshake callback. */
+	svc_xprt_get(xprt);
+
 	ret = tls_server_hello_x509(&args, GFP_KERNEL);
 	if (ret) {
 		trace_svc_tls_not_started(xprt);
+		svc_xprt_put(xprt);
 		goto out_failed;
 	}
 
@@ -505,6 +510,7 @@ static void svc_tcp_handshake(struct svc_xprt *xprt)
 	if (ret <= 0) {
 		if (tls_handshake_cancel(sk)) {
 			trace_svc_tls_timed_out(xprt);
+			svc_xprt_put(xprt);
 			goto out_close;
 		}
 	}
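
The reference-counting rule this patch applies (take a reference before handing a pointer to an asynchronous callback, and drop it on exactly one of three edges) can be sketched in userspace C. This is a single-threaded illustration under stated assumptions, not kernel code: struct xprt, xprt_get(), xprt_put(), start_handshake(), cancel_handshake(), handshake_done() and the demo functions are hypothetical stand-ins, and the race outcomes are passed in as booleans rather than produced by real concurrency.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct svc_xprt: a refcount and one flag. */
struct xprt {
	atomic_int refs;
	atomic_bool handshake_pending;
};

static int freed_count;	/* lets the demos observe when a free happens */

static struct xprt *xprt_new(void)
{
	struct xprt *x = malloc(sizeof(*x));

	if (!x)
		abort();
	atomic_init(&x->refs, 1);	/* the owner's reference */
	atomic_init(&x->handshake_pending, false);
	return x;
}

static void xprt_get(struct xprt *x)
{
	atomic_fetch_add(&x->refs, 1);
}

static void xprt_put(struct xprt *x)
{
	/* atomic_fetch_sub() returns the old value; 1 means last reference. */
	if (atomic_fetch_sub(&x->refs, 1) == 1) {
		freed_count++;
		free(x);
	}
}

/* Edge 3: the callback touches the object, then drops the pin last. */
static void handshake_done(void *data)
{
	struct xprt *x = data;

	atomic_store(&x->handshake_pending, false);
	xprt_put(x);
}

/* Pin before submitting. Edge 1: submission failed, callback never runs. */
static int start_handshake(struct xprt *x, bool submit_ok)
{
	atomic_store(&x->handshake_pending, true);
	xprt_get(x);
	if (!submit_ok) {
		xprt_put(x);
		return -1;
	}
	return 0;
}

/* Edge 2: cancel won, so the callback will not fire; drop the pin here. */
static void cancel_handshake(struct xprt *x, bool cancel_won)
{
	if (cancel_won)
		xprt_put(x);
	/* Otherwise the callback still owns the pin and will drop it. */
}

/* Cancel loses the race and the owner lets go before the callback runs. */
int demo_cancel_loses_race(void)
{
	struct xprt *x = xprt_new();
	int before = freed_count;

	if (start_handshake(x, true) != 0)
		return -1;
	cancel_handshake(x, false);
	xprt_put(x);			/* owner's reference is gone */
	if (freed_count != before)
		return -1;		/* the pin must keep x alive */
	handshake_done(x);		/* late callback drops the last ref */
	return freed_count - before;
}

/* Submission fails: the pin is released at once, the owner frees later. */
int demo_submit_fails(void)
{
	struct xprt *x = xprt_new();
	int before = freed_count;

	if (start_handshake(x, false) == 0)
		return -1;
	xprt_put(x);
	return freed_count - before;
}
```

In both demos the object is freed exactly once, and in the cancel-loser case only after the late callback has finished touching it, which is the property the added svc_xprt_get()/svc_xprt_put() pairs are meant to provide.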