git.ipfire.org Git - thirdparty/linux.git/commit
inet: Avoid ehash lookup race in inet_twsk_hashdance_schedule()
author Xuanqiang Luo <luoxuanqiang@kylinos.cn>
Wed, 15 Oct 2025 02:02:36 +0000 (10:02 +0800)
committer Jakub Kicinski <kuba@kernel.org>
Fri, 17 Oct 2025 23:08:43 +0000 (16:08 -0700)
commit b8ec80b130211e7bf076ef72365952979d5f7a72
tree 285157df36a8157122a4eab2ddba9a7a76a2b8c6
parent 1532ed0d0753c83e72595f785f82b48c28bbe5dc

Since ehash lookups are lockless, if another CPU is converting sk to tw
concurrently, a lookup can fetch the newly inserted tw while
tw->tw_refcnt == 0, which causes the lookup to fail.

The call trace map is drawn as follows:
   CPU 0                                CPU 1
   -----                                -----
                                        inet_twsk_hashdance_schedule()
                                        spin_lock()
                                        inet_twsk_add_node_rcu(tw, ...)
   __inet_lookup_established()
   (find tw, failure due to tw_refcnt = 0)
                                        __sk_nulls_del_node_init_rcu(sk)
                                        refcount_set(&tw->tw_refcnt, 3)
                                        spin_unlock()

By replacing sk with tw atomically via hlist_nulls_replace_init_rcu() after
setting tw_refcnt, we ensure that tw is either fully initialized or not
visible to other CPUs, eliminating the race.
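
The ordering argument above can be illustrated with a small userspace
sketch. This is not the kernel code: struct entry, struct slot, try_get(),
lookup() and replace_entry() are hypothetical stand-ins, C11 atomics take
the place of RCU and refcount_t, and a single pointer models the hash
chain. It only shows why a reader that refuses to take a reference on a
zero refcount misses an entry published too early, and finds it when the
refcount is set before the publishing store.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for sk / tw: only the refcount matters here. */
struct entry {
    atomic_int refcnt;              /* 0 means "cannot take a reference" */
};

/* Hypothetical one-element hash slot; the real chain is an hlist_nulls. */
struct slot {
    _Atomic(struct entry *) head;
};

/* Take a reference only if the count is non-zero (inc-not-zero pattern). */
static bool try_get(struct entry *e)
{
    int old = atomic_load_explicit(&e->refcnt, memory_order_relaxed);

    while (old != 0) {
        if (atomic_compare_exchange_weak_explicit(&e->refcnt, &old, old + 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed))
            return true;
        /* on failure, old was reloaded; retry unless it dropped to 0 */
    }
    return false;
}

/* Lockless reader: returns the entry with a reference held, or NULL. */
static struct entry *lookup(struct slot *s)
{
    struct entry *e = atomic_load_explicit(&s->head, memory_order_acquire);

    return (e && try_get(e)) ? e : NULL;
}

/* Fixed ordering: set the refcount first, then publish with one store. */
static void replace_entry(struct slot *s, struct entry *new_entry, int refs)
{
    atomic_store_explicit(&new_entry->refcnt, refs, memory_order_relaxed);
    atomic_store_explicit(&s->head, new_entry, memory_order_release);
}

/* Racy window: tw is already visible while its refcount is still 0. */
int demo_early_publish_lookup_misses(void)
{
    struct entry tw;
    struct slot s;

    atomic_init(&tw.refcnt, 0);
    atomic_init(&s.head, &tw);      /* published before refcount is set */
    return lookup(&s) == NULL;      /* reader sees tw but cannot use it */
}

/* Fixed sequence: a reader finds either sk or a fully initialized tw. */
int demo_replace_after_init_lookup_hits(void)
{
    struct entry sk, tw;
    struct slot s;

    atomic_init(&sk.refcnt, 1);
    atomic_init(&tw.refcnt, 0);
    atomic_init(&s.head, &sk);
    replace_entry(&s, &tw, 3);      /* refcount = 3, then swap sk -> tw */
    return lookup(&s) == &tw && atomic_load(&tw.refcnt) == 4;
}
```

Both demo functions return 1 when run single-threaded: the first models
the window in the call trace, the second models the replace-after-init
ordering. The sketch does not model the nulls markers or the RCU grace
period that the real helpers rely on.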

It's worth noting that we held lock_sock() before the replacement, so
there's no need to check if sk is hashed. Thanks to Kuniyuki Iwashima!

Fixes: 3ab5aee7fe84 ("net: Convert TCP & DCCP hash tables to use RCU / hlist_nulls")
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Signed-off-by: Xuanqiang Luo <luoxuanqiang@kylinos.cn>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20251015020236.431822-4-xuanqiang.luo@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
net/ipv4/inet_timewait_sock.c