From: Eric Dumazet Date: Fri, 24 Oct 2025 12:07:07 +0000 (+0000) Subject: tcp: remove one ktime_get() from recvmsg() fast path X-Git-Tag: v6.19-rc1~170^2~302 X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=0ae1ac7335ca2328c42fbbe2b71bc9c3fc8e65d9;p=thirdparty%2Flinux.git tcp: remove one ktime_get() from recvmsg() fast path Each time some payload is consumed by user space (recvmsg() and friends), TCP calls tcp_rcv_space_adjust() to run DRS algorithm to check if an increase of sk->sk_rcvbuf is needed. This function is based on time sampling, and currently calls tcp_mstamp_refresh(tp), which is a wrapper around ktime_get_ns(). ktime_get_ns() has a high cost on some platforms. 100+ cycles for rdtscp on AMD EPYC Turin for instance. We do not have to refresh tp->tcp_mpstamp, using the last cached value is enough. We only need to refresh it from __tcp_cleanup_rbuf() if an ACK must be sent (this is a rare event). Signed-off-by: Eric Dumazet Reviewed-by: Kuniyuki Iwashima Link: https://patch.msgid.link/20251024120707.3516550-1-edumazet@google.com Signed-off-by: Jakub Kicinski --- diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index b79da6d393927..a9345aa5a2e5f 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -1556,8 +1556,10 @@ void __tcp_cleanup_rbuf(struct sock *sk, int copied) time_to_ack = true; } } - if (time_to_ack) + if (time_to_ack) { + tcp_mstamp_refresh(tp); tcp_send_ack(sk); + } } void tcp_cleanup_rbuf(struct sock *sk, int copied) diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 8fc97f4d8a6b2..ff19f6e54d55c 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -928,9 +928,15 @@ void tcp_rcv_space_adjust(struct sock *sk) trace_tcp_rcv_space_adjust(sk); - tcp_mstamp_refresh(tp); + if (unlikely(!tp->rcv_rtt_est.rtt_us)) + return; + + /* We do not refresh tp->tcp_mstamp here. + * Some platforms have expensive ktime_get() implementations. + * Using the last cached value is enough for DRS. + */ time = tcp_stamp_us_delta(tp->tcp_mstamp, tp->rcvq_space.time); - if (time < (tp->rcv_rtt_est.rtt_us >> 3) || tp->rcv_rtt_est.rtt_us == 0) + if (time < (tp->rcv_rtt_est.rtt_us >> 3)) return; /* Number of bytes copied to user in last RTT */