From: Greg Kroah-Hartman
Date: Mon, 21 Aug 2023 18:57:57 +0000 (+0200)
Subject: 6.4-stable patches
X-Git-Tag: v6.4.12~13
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=c3d50f3fb06f23daaea1165efd1c6e7a8cc4f1a7;p=thirdparty%2Fkernel%2Fstable-queue.git

6.4-stable patches

added patches:
	af_unix-fix-null-ptr-deref-in-unix_stream_sendpage.patch
	net-fix-the-rto-timer-retransmitting-skb-every-1ms-if-linear-option-is-enabled.patch
---

diff --git a/queue-6.4/af_unix-fix-null-ptr-deref-in-unix_stream_sendpage.patch b/queue-6.4/af_unix-fix-null-ptr-deref-in-unix_stream_sendpage.patch
new file mode 100644
index 00000000000..03199a9542b
--- /dev/null
+++ b/queue-6.4/af_unix-fix-null-ptr-deref-in-unix_stream_sendpage.patch
@@ -0,0 +1,137 @@
+From kuniyu@amazon.com Mon Aug 21 20:35:34 2023
+From: Kuniyuki Iwashima
+Date: Mon, 21 Aug 2023 10:55:05 -0700
+Subject: af_unix: Fix null-ptr-deref in unix_stream_sendpage().
+To: Greg Kroah-Hartman
+Cc: Hannes Frederic Sowa , Kuniyuki Iwashima , Kuniyuki Iwashima , , , Bing-Jhong Billy Jheng , Linus Torvalds
+Message-ID: <20230821175505.23107-1-kuniyu@amazon.com>
+
+From: Kuniyuki Iwashima
+
+Bing-Jhong Billy Jheng reported null-ptr-deref in unix_stream_sendpage()
+with detailed analysis and a nice repro.
+
+unix_stream_sendpage() tries to add data to the last skb in the peer's
+recv queue without locking the queue.
+
+If the peer's FD is passed to another socket and the socket's FD is
+passed to the peer, there is a loop between them. If we close both
+sockets without receiving FD, the sockets will be cleaned up by garbage
+collection.
+
+The garbage collection iterates such sockets and unlinks skb with
+FD from the socket's receive queue under the queue's lock.
+
+So, there is a race where unix_stream_sendpage() could access an skb
+locklessly that is being released by garbage collection, resulting in
+use-after-free.
+
+To avoid the issue, unix_stream_sendpage() must lock the peer's recv
+queue.
+
+Note the issue does not exist in 6.5+ thanks to the recent sendpage()
+refactoring.
+
+This patch is originally written by Linus Torvalds.
+
+BUG: unable to handle page fault for address: ffff988004dd6870
+PF: supervisor read access in kernel mode
+PF: error_code(0x0000) - not-present page
+PGD 0 P4D 0
+PREEMPT SMP PTI
+CPU: 4 PID: 297 Comm: garbage_uaf Not tainted 6.1.46 #1
+Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
+RIP: 0010:kmem_cache_alloc_node+0xa2/0x1e0
+Code: c0 0f 84 32 01 00 00 41 83 fd ff 74 10 48 8b 00 48 c1 e8 3a 41 39 c5 0f 85 1c 01 00 00 41 8b 44 24 28 49 8b 3c 24 48 8d 4a 40 <49> 8b 1c 06 4c 89 f0 65 48 0f c7 0f 0f 94 c0 84 c0 74 a1 41 8b 44
+RSP: 0018:ffffc9000079fac0 EFLAGS: 00000246
+RAX: 0000000000000070 RBX: 0000000000000005 RCX: 000000000001a284
+RDX: 000000000001a244 RSI: 0000000000400cc0 RDI: 000000000002eee0
+RBP: 0000000000400cc0 R08: 0000000000400cc0 R09: 0000000000000003
+R10: 0000000000000001 R11: 0000000000000000 R12: ffff888003970f00
+R13: 00000000ffffffff R14: ffff988004dd6800 R15: 00000000000000e8
+FS: 00007f174d6f3600(0000) GS:ffff88807db00000(0000) knlGS:0000000000000000
+CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
+CR2: ffff988004dd6870 CR3: 00000000092be000 CR4: 00000000007506e0
+PKRU: 55555554
+Call Trace:
+
+ ? __die_body.cold+0x1a/0x1f
+ ? page_fault_oops+0xa9/0x1e0
+ ? fixup_exception+0x1d/0x310
+ ? exc_page_fault+0xa8/0x150
+ ? asm_exc_page_fault+0x22/0x30
+ ? kmem_cache_alloc_node+0xa2/0x1e0
+ ? __alloc_skb+0x16c/0x1e0
+ __alloc_skb+0x16c/0x1e0
+ alloc_skb_with_frags+0x48/0x1e0
+ sock_alloc_send_pskb+0x234/0x270
+ unix_stream_sendmsg+0x1f5/0x690
+ sock_sendmsg+0x5d/0x60
+ ____sys_sendmsg+0x210/0x260
+ ___sys_sendmsg+0x83/0xd0
+ ? kmem_cache_alloc+0xc6/0x1c0
+ ? avc_disable+0x20/0x20
+ ? percpu_counter_add_batch+0x53/0xc0
+ ? alloc_empty_file+0x5d/0xb0
+ ? alloc_file+0x91/0x170
+ ? alloc_file_pseudo+0x94/0x100
+ ? __fget_light+0x9f/0x120
+ __sys_sendmsg+0x54/0xa0
+ do_syscall_64+0x3b/0x90
+ entry_SYSCALL_64_after_hwframe+0x69/0xd3
+RIP: 0033:0x7f174d639a7d
+Code: 28 89 54 24 1c 48 89 74 24 10 89 7c 24 08 e8 8a c1 f4 ff 8b 54 24 1c 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 33 44 89 c7 48 89 44 24 08 e8 de c1 f4 ff 48
+RSP: 002b:00007ffcb563ea50 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
+RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f174d639a7d
+RDX: 0000000000000000 RSI: 00007ffcb563eab0 RDI: 0000000000000007
+RBP: 00007ffcb563eb10 R08: 0000000000000000 R09: 00000000ffffffff
+R10: 00000000004040a0 R11: 0000000000000293 R12: 00007ffcb563ec28
+R13: 0000000000401398 R14: 0000000000403e00 R15: 00007f174d72c000
+
+
+Fixes: 869e7c62486e ("net: af_unix: implement stream sendpage support")
+Reported-by: Bing-Jhong Billy Jheng
+Reviewed-by: Bing-Jhong Billy Jheng
+Co-developed-by: Linus Torvalds
+Signed-off-by: Linus Torvalds
+Signed-off-by: Kuniyuki Iwashima
+Signed-off-by: Greg Kroah-Hartman
+---
+ net/unix/af_unix.c | 9 ++++-----
+ 1 file changed, 4 insertions(+), 5 deletions(-)
+
+--- a/net/unix/af_unix.c
++++ b/net/unix/af_unix.c
+@@ -2291,6 +2291,7 @@ static ssize_t unix_stream_sendpage(stru
+ 
+ 	if (false) {
+ alloc_skb:
++		spin_unlock(&other->sk_receive_queue.lock);
+ 		unix_state_unlock(other);
+ 		mutex_unlock(&unix_sk(other)->iolock);
+ 		newskb = sock_alloc_send_pskb(sk, 0, 0, flags & MSG_DONTWAIT,
+@@ -2330,6 +2331,7 @@ alloc_skb:
+ 		init_scm = false;
+ 	}
+ 
++	spin_lock(&other->sk_receive_queue.lock);
+ 	skb = skb_peek_tail(&other->sk_receive_queue);
+ 	if (tail && tail == skb) {
+ 		skb = newskb;
+@@ -2360,14 +2362,11 @@ alloc_skb:
+ 	refcount_add(size, &sk->sk_wmem_alloc);
+ 
+ 	if (newskb) {
+-		err = unix_scm_to_skb(&scm, skb, false);
+-		if (err)
+-			goto err_state_unlock;
+-		spin_lock(&other->sk_receive_queue.lock);
++		unix_scm_to_skb(&scm, skb, false);
+ 		__skb_queue_tail(&other->sk_receive_queue, newskb);
+-		spin_unlock(&other->sk_receive_queue.lock);
+ 	}
+ 
++	spin_unlock(&other->sk_receive_queue.lock);
+ 	unix_state_unlock(other);
+ 	mutex_unlock(&unix_sk(other)->iolock);
+ 
diff --git a/queue-6.4/net-fix-the-rto-timer-retransmitting-skb-every-1ms-if-linear-option-is-enabled.patch b/queue-6.4/net-fix-the-rto-timer-retransmitting-skb-every-1ms-if-linear-option-is-enabled.patch
new file mode 100644
index 00000000000..0d8f67688d5
--- /dev/null
+++ b/queue-6.4/net-fix-the-rto-timer-retransmitting-skb-every-1ms-if-linear-option-is-enabled.patch
@@ -0,0 +1,51 @@
+From e4dd0d3a2f64b8bd8029ec70f52bdbebd0644408 Mon Sep 17 00:00:00 2001
+From: Jason Xing
+Date: Fri, 11 Aug 2023 10:37:47 +0800
+Subject: net: fix the RTO timer retransmitting skb every 1ms if linear option is enabled
+
+From: Jason Xing
+
+commit e4dd0d3a2f64b8bd8029ec70f52bdbebd0644408 upstream.
+
+In the real workload, I encountered an issue which could cause the RTO
+timer to retransmit the skb per 1ms with linear option enabled. The amount
+of lost-retransmitted skbs can go up to 1000+ instantly.
+
+The root cause is that if the icsk_rto happens to be zero in the 6th round
+(which is the TCP_THIN_LINEAR_RETRIES value), then it will always be zero
+due to the changed calculation method in tcp_retransmit_timer() as follows:
+
+icsk->icsk_rto = min(icsk->icsk_rto << 1, TCP_RTO_MAX);
+
+Above line could be converted to
+icsk->icsk_rto = min(0 << 1, TCP_RTO_MAX) = 0
+
+Therefore, the timer expires so quickly without any doubt.
+
+I read through the RFC 6298 and found that the RTO value can be rounded
+up to a certain value, in Linux, say TCP_RTO_MIN as default, which is
+regarded as the lower bound in this patch as suggested by Eric.
+
+Fixes: 36e31b0af587 ("net: TCP thin linear timeouts")
+Suggested-by: Eric Dumazet
+Signed-off-by: Jason Xing
+Reviewed-by: Eric Dumazet
+Signed-off-by: David S. Miller
+Signed-off-by: Greg Kroah-Hartman
+---
+ net/ipv4/tcp_timer.c | 4 +++-
+ 1 file changed, 3 insertions(+), 1 deletion(-)
+
+--- a/net/ipv4/tcp_timer.c
++++ b/net/ipv4/tcp_timer.c
+@@ -586,7 +586,9 @@ out_reset_timer:
+ 	    tcp_stream_is_thin(tp) &&
+ 	    icsk->icsk_retransmits <= TCP_THIN_LINEAR_RETRIES) {
+ 		icsk->icsk_backoff = 0;
+-		icsk->icsk_rto = min(__tcp_set_rto(tp), TCP_RTO_MAX);
++		icsk->icsk_rto = clamp(__tcp_set_rto(tp),
++				       tcp_rto_min(sk),
++				       TCP_RTO_MAX);
+ 	} else {
+ 		/* Use normal (exponential) backoff */
+ 		icsk->icsk_rto = min(icsk->icsk_rto << 1, TCP_RTO_MAX);
diff --git a/queue-6.4/series b/queue-6.4/series
index 0bcaedb2a40..fd23228d644 100644
--- a/queue-6.4/series
+++ b/queue-6.4/series
@@ -230,3 +230,5 @@ drm-amd-pm-skip-the-rlc-stop-when-s0i3-suspend-for-smu-v13.0.4-11.patch
 drm-amdgpu-keep-irq-count-in-amdgpu_irq_disable_all.patch
 revert-perf-report-append-inlines-to-non-dwarf-callchains.patch
 asoc-sof-intel-hda-clean-up-link-dma-for-ipc3-during-stop.patch
+af_unix-fix-null-ptr-deref-in-unix_stream_sendpage.patch
+net-fix-the-rto-timer-retransmitting-skb-every-1ms-if-linear-option-is-enabled.patch