From d009f258d11c957df50c062fa3976d81e9f89682 Mon Sep 17 00:00:00 2001 From: Sasha Levin Date: Thu, 14 Dec 2023 07:39:52 -0500 Subject: [PATCH] Fixes for 5.4 Signed-off-by: Sasha Levin --- ...t-underflow-from-error-handling-race.patch | 146 ++++++++++++++++++ queue-5.4/series | 1 + 2 files changed, 147 insertions(+) create mode 100644 queue-5.4/afs-fix-refcount-underflow-from-error-handling-race.patch create mode 100644 queue-5.4/series diff --git a/queue-5.4/afs-fix-refcount-underflow-from-error-handling-race.patch b/queue-5.4/afs-fix-refcount-underflow-from-error-handling-race.patch new file mode 100644 index 00000000000..05aeeec40d8 --- /dev/null +++ b/queue-5.4/afs-fix-refcount-underflow-from-error-handling-race.patch @@ -0,0 +1,146 @@ +From b095fefd2057d6010b375b9e0fed514312eabf9e Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Mon, 11 Dec 2023 21:43:52 +0000 +Subject: afs: Fix refcount underflow from error handling race + +From: David Howells + +[ Upstream commit 52bf9f6c09fca8c74388cd41cc24e5d1bff812a9 ] + +If an AFS cell that has an unreachable (eg. ENETUNREACH) server listed (VL +server or fileserver), an asynchronous probe to one of its addresses may +fail immediately because sendmsg() returns an error. When this happens, a +refcount underflow can happen if certain events hit a very small window. + +The way this occurs is: + + (1) There are two levels of "call" object, the afs_call and the + rxrpc_call. Each of them can be transitioned to a "completed" state + in the event of success or failure. + + (2) Asynchronous afs_calls are self-referential whilst they are active to + prevent them from evaporating when they're not being processed. This + reference is disposed of when the afs_call is completed. + + Note that an afs_call may only be completed once; once completed + completing it again will do nothing. + + (3) When a call transmission is made, the app-side rxrpc code queues a Tx + buffer for the rxrpc I/O thread to transmit. The I/O thread invokes + sendmsg() to transmit it - and in the case of failure, it transitions + the rxrpc_call to the completed state. + + (4) When an rxrpc_call is completed, the app layer is notified. In this + case, the app is kafs and it schedules a work item to process events + pertaining to an afs_call. + + (5) When the afs_call event processor is run, it goes down through the + RPC-specific handler to afs_extract_data() to retrieve data from rxrpc + - and, in this case, it picks up the error from the rxrpc_call and + returns it. + + The error is then propagated to the afs_call and that is completed + too. At this point the self-reference is released. + + (6) If the rxrpc I/O thread manages to complete the rxrpc_call within the + window between rxrpc_send_data() queuing the request packet and + checking for call completion on the way out, then + rxrpc_kernel_send_data() will return the error from sendmsg() to the + app. + + (7) Then afs_make_call() will see an error and will jump to the error + handling path which will attempt to clean up the afs_call. + + (8) The problem comes when the error handling path in afs_make_call() + tries to unconditionally drop an async afs_call's self-reference. + This self-reference, however, may already have been dropped by + afs_extract_data() completing the afs_call + + (9) The refcount underflows when we return to afs_do_probe_vlserver() and + that tries to drop its reference on the afs_call. + +Fix this by making afs_make_call() attempt to complete the afs_call rather +than unconditionally putting it. That way, if afs_extract_data() manages +to complete the call first, afs_make_call() won't do anything. + +The bug can be forced by making do_udp_sendmsg() return -ENETUNREACH and +sticking an msleep() in rxrpc_send_data() after the 'success:' label to +widen the race window. + +The error message looks something like: + + refcount_t: underflow; use-after-free. + WARNING: CPU: 3 PID: 720 at lib/refcount.c:28 refcount_warn_saturate+0xba/0x110 + ... + RIP: 0010:refcount_warn_saturate+0xba/0x110 + ... + afs_put_call+0x1dc/0x1f0 [kafs] + afs_fs_get_capabilities+0x8b/0xe0 [kafs] + afs_fs_probe_fileserver+0x188/0x1e0 [kafs] + afs_lookup_server+0x3bf/0x3f0 [kafs] + afs_alloc_server_list+0x130/0x2e0 [kafs] + afs_create_volume+0x162/0x400 [kafs] + afs_get_tree+0x266/0x410 [kafs] + vfs_get_tree+0x25/0xc0 + fc_mount+0xe/0x40 + afs_d_automount+0x1b3/0x390 [kafs] + __traverse_mounts+0x8f/0x210 + step_into+0x340/0x760 + path_openat+0x13a/0x1260 + do_filp_open+0xaf/0x160 + do_sys_openat2+0xaf/0x170 + +or something like: + + refcount_t: underflow; use-after-free. + ... + RIP: 0010:refcount_warn_saturate+0x99/0xda + ... + afs_put_call+0x4a/0x175 + afs_send_vl_probes+0x108/0x172 + afs_select_vlserver+0xd6/0x311 + afs_do_cell_detect_alias+0x5e/0x1e9 + afs_cell_detect_alias+0x44/0x92 + afs_validate_fc+0x9d/0x134 + afs_get_tree+0x20/0x2e6 + vfs_get_tree+0x1d/0xc9 + fc_mount+0xe/0x33 + afs_d_automount+0x48/0x9d + __traverse_mounts+0xe0/0x166 + step_into+0x140/0x274 + open_last_lookups+0x1c1/0x1df + path_openat+0x138/0x1c3 + do_filp_open+0x55/0xb4 + do_sys_openat2+0x6c/0xb6 + +Fixes: 34fa47612bfe ("afs: Fix race in async call refcounting") +Reported-by: Bill MacAllister +Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1052304 +Suggested-by: Jeffrey E Altman +Signed-off-by: David Howells +Reviewed-by: Jeffrey Altman +cc: Marc Dionne +cc: linux-afs@lists.infradead.org +Link: https://lore.kernel.org/r/2633992.1702073229@warthog.procyon.org.uk/ # v1 +Signed-off-by: Linus Torvalds +Signed-off-by: Sasha Levin +--- + fs/afs/rxrpc.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c +index 49fcce6529a60..30952f073fb41 100644 +--- a/fs/afs/rxrpc.c ++++ b/fs/afs/rxrpc.c +@@ -490,7 +490,7 @@ void afs_make_call(struct afs_addr_cursor *ac, struct afs_call *call, gfp_t gfp) + if (call->async) { + if (cancel_work_sync(&call->async_work)) + afs_put_call(call); +- afs_put_call(call); ++ afs_set_call_complete(call, ret, 0); + } + + ac->error = ret; +-- +2.43.0 + diff --git a/queue-5.4/series b/queue-5.4/series new file mode 100644 index 00000000000..77eedf270e2 --- /dev/null +++ b/queue-5.4/series @@ -0,0 +1 @@ +afs-fix-refcount-underflow-from-error-handling-race.patch -- 2.47.3