]> git.ipfire.org Git - thirdparty/kernel/stable-queue.git/commitdiff
Fixes for 5.4
authorSasha Levin <sashal@kernel.org>
Thu, 14 Dec 2023 12:39:52 +0000 (07:39 -0500)
committerSasha Levin <sashal@kernel.org>
Thu, 14 Dec 2023 12:39:52 +0000 (07:39 -0500)
Signed-off-by: Sasha Levin <sashal@kernel.org>
queue-5.4/afs-fix-refcount-underflow-from-error-handling-race.patch [new file with mode: 0644]
queue-5.4/series [new file with mode: 0644]

diff --git a/queue-5.4/afs-fix-refcount-underflow-from-error-handling-race.patch b/queue-5.4/afs-fix-refcount-underflow-from-error-handling-race.patch
new file mode 100644 (file)
index 0000000..05aeeec
--- /dev/null
@@ -0,0 +1,146 @@
+From b095fefd2057d6010b375b9e0fed514312eabf9e Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Mon, 11 Dec 2023 21:43:52 +0000
+Subject: afs: Fix refcount underflow from error handling race
+
+From: David Howells <dhowells@redhat.com>
+
+[ Upstream commit 52bf9f6c09fca8c74388cd41cc24e5d1bff812a9 ]
+
+If an AFS cell that has an unreachable (eg. ENETUNREACH) server listed (VL
+server or fileserver), an asynchronous probe to one of its addresses may
+fail immediately because sendmsg() returns an error.  When this happens, a
+refcount underflow can happen if certain events hit a very small window.
+
+The way this occurs is:
+
+ (1) There are two levels of "call" object, the afs_call and the
+     rxrpc_call.  Each of them can be transitioned to a "completed" state
+     in the event of success or failure.
+
+ (2) Asynchronous afs_calls are self-referential whilst they are active to
+     prevent them from evaporating when they're not being processed.  This
+     reference is disposed of when the afs_call is completed.
+
+     Note that an afs_call may only be completed once; once completed
+     completing it again will do nothing.
+
+ (3) When a call transmission is made, the app-side rxrpc code queues a Tx
+     buffer for the rxrpc I/O thread to transmit.  The I/O thread invokes
+     sendmsg() to transmit it - and in the case of failure, it transitions
+     the rxrpc_call to the completed state.
+
+ (4) When an rxrpc_call is completed, the app layer is notified.  In this
+     case, the app is kafs and it schedules a work item to process events
+     pertaining to an afs_call.
+
+ (5) When the afs_call event processor is run, it goes down through the
+     RPC-specific handler to afs_extract_data() to retrieve data from rxrpc
+     - and, in this case, it picks up the error from the rxrpc_call and
+     returns it.
+
+     The error is then propagated to the afs_call and that is completed
+     too.  At this point the self-reference is released.
+
+ (6) If the rxrpc I/O thread manages to complete the rxrpc_call within the
+     window between rxrpc_send_data() queuing the request packet and
+     checking for call completion on the way out, then
+     rxrpc_kernel_send_data() will return the error from sendmsg() to the
+     app.
+
+ (7) Then afs_make_call() will see an error and will jump to the error
+     handling path which will attempt to clean up the afs_call.
+
+ (8) The problem comes when the error handling path in afs_make_call()
+     tries to unconditionally drop an async afs_call's self-reference.
+     This self-reference, however, may already have been dropped by
+     afs_extract_data() completing the afs_call
+
+ (9) The refcount underflows when we return to afs_do_probe_vlserver() and
+     that tries to drop its reference on the afs_call.
+
+Fix this by making afs_make_call() attempt to complete the afs_call rather
+than unconditionally putting it.  That way, if afs_extract_data() manages
+to complete the call first, afs_make_call() won't do anything.
+
+The bug can be forced by making do_udp_sendmsg() return -ENETUNREACH and
+sticking an msleep() in rxrpc_send_data() after the 'success:' label to
+widen the race window.
+
+The error message looks something like:
+
+    refcount_t: underflow; use-after-free.
+    WARNING: CPU: 3 PID: 720 at lib/refcount.c:28 refcount_warn_saturate+0xba/0x110
+    ...
+    RIP: 0010:refcount_warn_saturate+0xba/0x110
+    ...
+    afs_put_call+0x1dc/0x1f0 [kafs]
+    afs_fs_get_capabilities+0x8b/0xe0 [kafs]
+    afs_fs_probe_fileserver+0x188/0x1e0 [kafs]
+    afs_lookup_server+0x3bf/0x3f0 [kafs]
+    afs_alloc_server_list+0x130/0x2e0 [kafs]
+    afs_create_volume+0x162/0x400 [kafs]
+    afs_get_tree+0x266/0x410 [kafs]
+    vfs_get_tree+0x25/0xc0
+    fc_mount+0xe/0x40
+    afs_d_automount+0x1b3/0x390 [kafs]
+    __traverse_mounts+0x8f/0x210
+    step_into+0x340/0x760
+    path_openat+0x13a/0x1260
+    do_filp_open+0xaf/0x160
+    do_sys_openat2+0xaf/0x170
+
+or something like:
+
+    refcount_t: underflow; use-after-free.
+    ...
+    RIP: 0010:refcount_warn_saturate+0x99/0xda
+    ...
+    afs_put_call+0x4a/0x175
+    afs_send_vl_probes+0x108/0x172
+    afs_select_vlserver+0xd6/0x311
+    afs_do_cell_detect_alias+0x5e/0x1e9
+    afs_cell_detect_alias+0x44/0x92
+    afs_validate_fc+0x9d/0x134
+    afs_get_tree+0x20/0x2e6
+    vfs_get_tree+0x1d/0xc9
+    fc_mount+0xe/0x33
+    afs_d_automount+0x48/0x9d
+    __traverse_mounts+0xe0/0x166
+    step_into+0x140/0x274
+    open_last_lookups+0x1c1/0x1df
+    path_openat+0x138/0x1c3
+    do_filp_open+0x55/0xb4
+    do_sys_openat2+0x6c/0xb6
+
+Fixes: 34fa47612bfe ("afs: Fix race in async call refcounting")
+Reported-by: Bill MacAllister <bill@ca-zephyr.org>
+Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1052304
+Suggested-by: Jeffrey E Altman <jaltman@auristor.com>
+Signed-off-by: David Howells <dhowells@redhat.com>
+Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
+cc: Marc Dionne <marc.dionne@auristor.com>
+cc: linux-afs@lists.infradead.org
+Link: https://lore.kernel.org/r/2633992.1702073229@warthog.procyon.org.uk/ # v1
+Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ fs/afs/rxrpc.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c
+index 49fcce6529a60..30952f073fb41 100644
+--- a/fs/afs/rxrpc.c
++++ b/fs/afs/rxrpc.c
+@@ -490,7 +490,7 @@ void afs_make_call(struct afs_addr_cursor *ac, struct afs_call *call, gfp_t gfp)
+       if (call->async) {
+               if (cancel_work_sync(&call->async_work))
+                       afs_put_call(call);
+-              afs_put_call(call);
++              afs_set_call_complete(call, ret, 0);
+       }
+       ac->error = ret;
+-- 
+2.43.0
+
diff --git a/queue-5.4/series b/queue-5.4/series
new file mode 100644 (file)
index 0000000..77eedf2
--- /dev/null
@@ -0,0 +1 @@
+afs-fix-refcount-underflow-from-error-handling-race.patch