]> git.ipfire.org Git - thirdparty/linux.git/log
thirdparty/linux.git
4 months agor8169: don't scan PHY addresses > 0
Heiner Kallweit [Tue, 4 Feb 2025 06:58:17 +0000 (07:58 +0100)] 
r8169: don't scan PHY addresses > 0

The PHY address is a dummy, because r8169 PHY access registers
don't support a PHY address. Therefore scan address 0 only.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/830637dd-4016-4a68-92b3-618fcac6589d@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agonet: flush_backlog() small changes
Eric Dumazet [Tue, 4 Feb 2025 14:48:25 +0000 (14:48 +0000)] 
net: flush_backlog() small changes

Add READ_ONCE() around reads of skb->dev->reg_state, because
this field can be changed from other threads/cpus.

Instead of calling dev_kfree_skb_irq() and kfree_skb()
while interrupts are masked and locks held,
use a temporary list and use __skb_queue_purge_reason()

Use SKB_DROP_REASON_DEV_READY drop reason to better
describe why these skbs are dropped.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Link: https://patch.msgid.link/20250204144825.316785-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agos390/net: Remove LCS driver
Aswin Karuvally [Tue, 4 Feb 2025 10:31:35 +0000 (11:31 +0100)] 
s390/net: Remove LCS driver

The original Open Systems Adapter (OSA) was introduced by IBM in the
mid-90s. These were then superseded by OSA-Express in 1999 which used
Queued Direct IO to greatly improve throughput. The newer cards
retained the older, slower non-QDIO (OSE) modes for compatibility with
older systems. In Linux, the lcs driver was responsible for cards
operating in the older OSE mode and the qeth driver was introduced to
allow the OSA-Express cards to operate in the newer QDIO (OSD) mode.

For an S390 machine from 1998 or later, there is no reason to use the
OSE mode and lcs driver as all OSA cards since 1999 provide the faster
OSD mode. As a result, it's been years since we have heard of a
customer configuration involving the lcs driver.

This patch removes the lcs driver. The technology it supports has been
obsolete for past 25+ years and is irrelevant for current use cases.

Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Aswin Karuvally <aswin@linux.ibm.com>
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250204103135.1619097-1-wintera@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agocxgb4: Avoid a -Wflex-array-member-not-at-end warning
Gustavo A. R. Silva [Tue, 4 Feb 2025 02:54:31 +0000 (13:24 +1030)] 
cxgb4: Avoid a -Wflex-array-member-not-at-end warning

-Wflex-array-member-not-at-end was introduced in GCC-14, and we are
getting ready to enable it, globally.

Move the conflicting declaration to the end of the structure. Notice
that `struct ethtool_dump` is a flexible structure --a structure that
contains a flexible-array member.

Fix the following warning:
./drivers/net/ethernet/chelsio/cxgb4/cxgb4.h:1215:29: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]

Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://patch.msgid.link/Z6GBZ4brXYffLkt_@kspp
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agobridge: mdb: Allow replace of a host-joined group
Petr Machata [Tue, 4 Feb 2025 17:37:15 +0000 (18:37 +0100)] 
bridge: mdb: Allow replace of a host-joined group

Attempts to replace an MDB group membership of the host itself are
currently bounced:

 # ip link add name br up type bridge vlan_filtering 1
 # bridge mdb replace dev br port br grp 239.0.0.1 vid 2
 # bridge mdb replace dev br port br grp 239.0.0.1 vid 2
 Error: bridge: Group is already joined by host.

A similar operation done on a member port would succeed. Ignore the check
for replacement of host group memberships as well.

The bit of code that this enables is br_multicast_host_join(), which, for
already-joined groups only refreshes the MC group expiration timer, which
is desirable; and a userspace notification, also desirable.

Change a selftest that exercises this code path from expecting a rejection
to expecting a pass. The rest of MDB selftests pass without modification.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/e5c5188b9787ae806609e7ca3aa2a0a501b9b5c4.1738685648.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agoselftests: net: suppress ReST file generation when building selftests
Jakub Kicinski [Mon, 3 Feb 2025 21:48:50 +0000 (13:48 -0800)] 
selftests: net: suppress ReST file generation when building selftests

Some selftests need libynl.a. When building it try to skip
generating the ReST documentation, libynl.a does not depend
on them.

Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250203214850.1282291-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agoMerge branch 'net-sysfs-remove-the-rtnl_trylock-restart_syscall-construction'
Jakub Kicinski [Thu, 6 Feb 2025 01:49:10 +0000 (17:49 -0800)] 
Merge branch 'net-sysfs-remove-the-rtnl_trylock-restart_syscall-construction'

Antoine Tenart says:

====================
net-sysfs: remove the rtnl_trylock/restart_syscall construction

The series initially aimed at improving spins (and thus delays) while
accessing net sysfs under rtnl lock contention[1]. The culprit was the
trylock/restart_syscall constructions. There wasn't much interest at the
time but it got traction recently for other reasons (lowering the rtnl
lock pressure).

Since v1[2]:

- Do not export rtnl_lock_interruptible [Stephen].
- Add netdev_warn_once messages in rx_queue_add_kobject [Jakub].

Since the RFC[1]:

- Limit the breaking of the sysfs protection to sysfs_rtnl_lock() only
  as this is not needed in the whole rtnl locking section thanks to the
  additional check on dev_isalive(). This simplifies error handling as
  well as the unlocking path.
- Used an interruptible version of rtnl_lock, as done by Jakub in
  his experiments.
- Removed a WARN_ONCE_ONCE [Greg].
- Removed explicit inline markers [Stephen].

Most of the reasoning is explained in comments added in patch 1. This
was tested by stress-testing net sysfs attributes (read/write ops) while
adding/removing queues and adding/removing veths, all in parallel. I
also used an OCP single node cluster, spawning lots of pods.

[1] https://lore.kernel.org/all/20231018154804.420823-1-atenart@kernel.org/T/
[2] https://lore.kernel.org/all/20250117102612.132644-1-atenart@kernel.org/T/
====================

Link: https://patch.msgid.link/20250204170314.146022-1-atenart@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agonet-sysfs: remove rtnl_trylock from queue attributes
Antoine Tenart [Tue, 4 Feb 2025 17:03:13 +0000 (18:03 +0100)] 
net-sysfs: remove rtnl_trylock from queue attributes

Similar to the commit removing remove rtnl_trylock from device
attributes we here apply the same technique to networking queues.

Signed-off-by: Antoine Tenart <atenart@kernel.org>
Link: https://patch.msgid.link/20250204170314.146022-5-atenart@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agonet-sysfs: prevent uncleared queues from being re-added
Antoine Tenart [Tue, 4 Feb 2025 17:03:12 +0000 (18:03 +0100)] 
net-sysfs: prevent uncleared queues from being re-added

With the (upcoming) removal of the rtnl_trylock/restart_syscall logic
and because of how Tx/Rx queues are implemented (and their
requirements), it might happen that a queue is re-added before having
the chance to be cleared. In such rare case, do not complete the queue
addition operation.

Signed-off-by: Antoine Tenart <atenart@kernel.org>
Link: https://patch.msgid.link/20250204170314.146022-4-atenart@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agonet-sysfs: move queue attribute groups outside the default groups
Antoine Tenart [Tue, 4 Feb 2025 17:03:11 +0000 (18:03 +0100)] 
net-sysfs: move queue attribute groups outside the default groups

Rx/tx queues embed their own kobject for registering their per-queue
sysfs files. The issue is they're using the kobject default groups for
this and entirely rely on the kobject refcounting for releasing their
sysfs paths.

In order to remove rtnl_trylock calls we need sysfs files not to rely on
their associated kobject refcounting for their release. Thus we here
move queues sysfs files from the kobject default groups to their own
groups which can be removed separately.

Signed-off-by: Antoine Tenart <atenart@kernel.org>
Link: https://patch.msgid.link/20250204170314.146022-3-atenart@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agonet-sysfs: remove rtnl_trylock from device attributes
Antoine Tenart [Tue, 4 Feb 2025 17:03:10 +0000 (18:03 +0100)] 
net-sysfs: remove rtnl_trylock from device attributes

There is an ABBA deadlock between net device unregistration and sysfs
files being accessed[1][2]. To prevent this from happening all paths
taking the rtnl lock after the sysfs one (actually kn->active refcount)
use rtnl_trylock and return early (using restart_syscall)[3], which can
make syscalls to spin for a long time when there is contention on the
rtnl lock[4].

There are not many possibilities to improve the above:
- Rework the entire net/ locking logic.
- Invert two locks in one of the paths — not possible.

But here it's actually possible to drop one of the locks safely: the
kernfs_node refcount. More details in the code itself, which comes with
lots of comments.

Note that we check the device is alive in the added sysfs_rtnl_lock
helper to disallow sysfs operations to run after device dismantle has
started. This also help keeping the same behavior as before. Because of
this calls to dev_isalive in sysfs ops were removed.

[1] https://lore.kernel.org/netdev/49A4D5D5.5090602@trash.net/
[2] https://lore.kernel.org/netdev/m14oyhis31.fsf@fess.ebiederm.org/
[3] https://lore.kernel.org/netdev/20090226084924.16cb3e08@nehalam/
[4] https://lore.kernel.org/all/20210928125500.167943-1-atenart@kernel.org/T/

Signed-off-by: Antoine Tenart <atenart@kernel.org>
Link: https://patch.msgid.link/20250204170314.146022-2-atenart@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agonet: phy: realtek: use string choices helpers
Heiner Kallweit [Mon, 3 Feb 2025 20:41:36 +0000 (21:41 +0100)] 
net: phy: realtek: use string choices helpers

Use string choices helpers to simplify the code.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202501190707.qQS8PGHW-lkp@intel.com/
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agor8169: make Kconfig option for LED support user-visible
Heiner Kallweit [Mon, 3 Feb 2025 20:35:24 +0000 (21:35 +0100)] 
r8169: make Kconfig option for LED support user-visible

Make config option R8169_LEDS user-visible, so that users can remove
support if not needed.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/d29f0cdb-32bf-435f-b59d-dc96bca1e3ab@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agonet: phy: realtek: make HWMON support a user-visible Kconfig symbol
Heiner Kallweit [Mon, 3 Feb 2025 20:33:39 +0000 (21:33 +0100)] 
net: phy: realtek: make HWMON support a user-visible Kconfig symbol

Make config symbol REALTEK_PHY_HWMON user-visible, so that users can
remove support if not needed.

Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/3466ee92-166a-4b0f-9ae7-42b9e046f333@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agonetconsole: selftest: Add test for fragmented messages
Breno Leitao [Mon, 3 Feb 2025 19:04:15 +0000 (11:04 -0800)] 
netconsole: selftest: Add test for fragmented messages

Add a new selftest to verify netconsole's handling of messages that
exceed the packet size limit and require fragmentation. The test sends
messages with varying sizes and userdata, validating that:

1. Large messages are correctly fragmented and reassembled
2. Userdata fields are properly preserved across fragments
3. Messages work correctly with and without kernel release version
   appending

The test creates a networking environment using netdevsim, sends
messages through /dev/kmsg, and verifies the received fragments maintain
message integrity.

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250203-netcons_frag_msgs-v1-1-5bc6bedf2ac0@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agonet: atlantic: Avoid -Wflex-array-member-not-at-end warnings
Gustavo A. R. Silva [Tue, 4 Feb 2025 02:10:49 +0000 (12:40 +1030)] 
net: atlantic: Avoid -Wflex-array-member-not-at-end warnings

-Wflex-array-member-not-at-end was introduced in GCC-14, and we are
getting ready to enable it, globally.

Remove unused flexible-array member `buf` and, with this, fix the following
warnings:
drivers/net/ethernet/aquantia/atlantic/aq_hw.h:197:36: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
drivers/net/ethernet/aquantia/atlantic/hw_atl/../aq_hw.h:197:36: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]

Suggested-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Igor Russkikh <irusskikh@marvell.com>
Link: https://patch.msgid.link/Z6F3KZVfnAZ2FoJm@kspp
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agonet: warn if NAPI instance wasn't shut down
Jakub Kicinski [Mon, 3 Feb 2025 21:58:16 +0000 (13:58 -0800)] 
net: warn if NAPI instance wasn't shut down

Drivers should always disable a NAPI instance before removing it.
If they don't the instance may be queued for polling.
Since commit 86e25f40aa1e ("net: napi: Add napi_config")
we also remove the NAPI from the busy polling hash table
in napi_disable(), so not disabling would leave a stale
entry there.

Use of busy polling is relatively uncommon so bugs may be lurking
in the drivers. Add an explicit warning.

Reviewed-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250203215816.1294081-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agocavium/liquidio: Remove unused lio_get_device_id
Dr. David Alan Gilbert [Mon, 3 Feb 2025 18:33:43 +0000 (18:33 +0000)] 
cavium/liquidio: Remove unused lio_get_device_id

lio_get_device_id() has been unused since 2018's
commit 64fecd3ec512 ("liquidio: remove obsolete functions and data
structures")

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250203183343.193691-1-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agomlxsw: spectrum_router: Remove unused functions
Dr. David Alan Gilbert [Mon, 3 Feb 2025 19:01:41 +0000 (19:01 +0000)] 
mlxsw: spectrum_router: Remove unused functions

mlxsw_sp_ipip_lb_ul_vr_id() has been unused since 2020's
commit acde33bf7319 ("mlxsw: spectrum_router: Reduce
mlxsw_sp_ipip_fib_entry_op_gre4()")

mlxsw_sp_rif_exists() has been unused since 2023's
commit 49c3a615d382 ("mlxsw: spectrum_router: Replay MACVLANs when RIF is
made")

mlxsw_sp_rif_vid() has been unused since 2023's
commit a5b52692e693 ("mlxsw: spectrum_switchdev: Manage RIFs on PVID
change")

Remove them.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20250203190141.204951-1-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agonet/mlx5: Remove unused mlx5dr_domain_sync
Dr. David Alan Gilbert [Mon, 3 Feb 2025 18:59:58 +0000 (18:59 +0000)] 
net/mlx5: Remove unused mlx5dr_domain_sync

mlx5dr_domain_sync() was added in 2019 by
commit 70605ea545e8 ("net/mlx5: DR, Expose APIs for direct rule managing")
but hasn't been used.

Remove it.

mlx5dr_domain_sync() was the only user of
mlx5dr_send_ring_force_drain().
Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Link: https://patch.msgid.link/20250203185958.204794-1-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agomlx4: Remove unused functions
Dr. David Alan Gilbert [Mon, 3 Feb 2025 18:52:29 +0000 (18:52 +0000)] 
mlx4: Remove unused functions

The last use of mlx4_find_cached_mac() was removed in 2014 by
commit 2f5bb473681b ("mlx4: Add ref counting to port MAC table for RoCE")

mlx4_zone_free_entries() was added in 2014 by
commit 7a89399ffad7 ("net/mlx4: Add mlx4_bitmap zone allocator")
but hasn't been used. (The _unique version is used)

Remove them.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Link: https://patch.msgid.link/20250203185229.204279-1-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agonet: qed: fix typos
Andrew Kreimer [Mon, 3 Feb 2025 17:53:24 +0000 (19:53 +0200)] 
net: qed: fix typos

There are some typos in comments/messages:
 - Valiate -> Validate
 - acceptible -> acceptable
 - acces -> access
 - relased -> released

Fix them via codespell.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250203175419.4146-1-algonell@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agodt-bindings: net: faraday,ftgmac100: Add phys mode
Ninad Palsule [Mon, 3 Feb 2025 15:12:55 +0000 (09:12 -0600)] 
dt-bindings: net: faraday,ftgmac100: Add phys mode

Aspeed device supports rgmii, rgmii-id, rgmii-rxid, rgmii-txid so
document them.

Acked-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Ninad Palsule <ninad@linux.ibm.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250203151306.276358-2-ninad@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agoneighbour: remove neigh_parms_destroy()
Eric Dumazet [Mon, 3 Feb 2025 15:11:52 +0000 (15:11 +0000)] 
neighbour: remove neigh_parms_destroy()

neigh_parms_destroy() is a simple kfree(), no need for
a forward declaration.

neigh_parms_put() can instead call kfree() directly.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250203151152.3163876-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agobonding: delete always true device check
Leon Romanovsky [Mon, 3 Feb 2025 12:59:23 +0000 (14:59 +0200)] 
bonding: delete always true device check

XFRM API makes sure that xs->xso.dev is valid in all XFRM offload
callbacks. There is no need to check it again.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/0b2f8f5f09701bb43bbd83b94bfe5cb506b57adc.1738587150.git.leon@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agoMerge tag 'net-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 30 Jan 2025 20:24:20 +0000 (12:24 -0800)] 
Merge tag 'net-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from IPSec, netfilter and Bluetooth.

  Nothing really stands out, but as usual there's a slight concentration
  of fixes for issues added in the last two weeks before the merge
  window, and driver bugs from 6.13 which tend to get discovered upon
  wider distribution.

  Current release - regressions:

   - net: revert RTNL changes in unregister_netdevice_many_notify()

   - Bluetooth: fix possible infinite recursion of btusb_reset

   - eth: adjust locking in some old drivers which protect their state
     with spinlocks to avoid sleeping in atomic; core protects netdev
     state with a mutex now

  Previous releases - regressions:

   - eth:
      - mlx5e: make sure we pass node ID, not CPU ID to kvzalloc_node()
      - bgmac: reduce max frame size to support just 1500 bytes; the
        jumbo frame support would previously cause OOB writes, but now
        fails outright

   - mptcp: blackhole only if 1st SYN retrans w/o MPC is accepted, avoid
     false detection of MPTCP blackholing

  Previous releases - always broken:

   - mptcp: handle fastopen disconnect correctly

   - xfrm:
      - make sure skb->sk is a full sock before accessing its fields
      - fix taking a lock with preempt disabled for RT kernels

   - usb: ipheth: improve safety of packet metadata parsing; prevent
     potential OOB accesses

   - eth: renesas: fix missing rtnl lock in suspend/resume path"

* tag 'net-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (88 commits)
  MAINTAINERS: add Neal to TCP maintainers
  net: revert RTNL changes in unregister_netdevice_many_notify()
  net: hsr: fix fill_frame_info() regression vs VLAN packets
  doc: mptcp: sysctl: blackhole_timeout is per-netns
  mptcp: blackhole only if 1st SYN retrans w/o MPC is accepted
  netfilter: nf_tables: reject mismatching sum of field_len with set key length
  net: sh_eth: Fix missing rtnl lock in suspend/resume path
  net: ravb: Fix missing rtnl lock in suspend/resume path
  selftests/net: Add test for loading devbound XDP program in generic mode
  net: xdp: Disallow attaching device-bound programs in generic mode
  tcp: correct handling of extreme memory squeeze
  bgmac: reduce max frame size to support just MTU 1500
  vsock/test: Add test for connect() retries
  vsock/test: Add test for UAF due to socket unbinding
  vsock/test: Introduce vsock_connect_fd()
  vsock/test: Introduce vsock_bind()
  vsock: Allow retrying on connect() failure
  vsock: Keep the binding until socket destruction
  Bluetooth: L2CAP: accept zero as a special value for MTU auto-selection
  Bluetooth: btnxpuart: Fix glitches seen in dual A2DP streaming
  ...

4 months agoMerge tag 'docs-6.14-2' of git://git.lwn.net/linux
Linus Torvalds [Thu, 30 Jan 2025 18:57:19 +0000 (10:57 -0800)] 
Merge tag 'docs-6.14-2' of git://git.lwn.net/linux

Pull documentation fixes from Jonathan Corbet:
 "Two fixes for footnote-related warnings that appeared with Sphinx 8.x.

  We want to encourage use of newer Sphinx - they fixed a performance
  problem and the docs build takes less than half the time it used to"

* tag 'docs-6.14-2' of git://git.lwn.net/linux:
  docs: power: Fix footnote reference for Toshiba Satellite P10-554
  Documentation: ublk: Drop Stefan Hajnoczi's message footnote

4 months agoMerge tag 's390-6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Linus Torvalds [Thu, 30 Jan 2025 18:53:49 +0000 (10:53 -0800)] 
Merge tag 's390-6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fixes from Alexander Gordeev:

 - Architecutre-specific ftrace recursion trylock tests were removed in
   favour of the generic function_graph_enter(), but s390 got missed.

   Remove this test for s390 as well.

 - Add ftrace_get_symaddr() for s390, which returns the symbol address
   from ftrace 'ip' parameter

* tag 's390-6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/tracing: Define ftrace_get_symaddr() for s390
  s390/fgraph: Fix to remove ftrace_test_recursion_trylock()

4 months agoMerge tag 's390-6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Linus Torvalds [Thu, 30 Jan 2025 18:48:17 +0000 (10:48 -0800)] 
Merge tag 's390-6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull more s390 updates from Alexander Gordeev:

 - The rework that uncoupled physical and virtual address spaces
   inadvertently prevented KASAN shadow mappings from using large pages.
   Restore large page mappings for KASAN shadows

 - Add decompressor routine physmem_alloc() that may fail, unlike
   physmem_alloc_or_die(). This allows callers to implement fallback
   paths

 - Allow falling back from large pages to smaller pages (1MB or 4KB) if
   the allocation of 2GB pages in the decompressor can not be fulfilled

 - Add to the decompressor boot print support of "%%" format string,
   width and padding hadnling, length modifiers and decimal conversion
   specifiers

 - Add to the decompressor message severity levels similar to kernel
   ones. Support command-line options that control console output
   verbosity

 - Replaces boot_printk() calls with appropriate loglevel- specific
   helpers such as boot_emerg(), boot_warn(), and boot_debug().

 - Collect all boot messages into a ring buffer independent of the
   current log level. This is particularly useful for early crash
   analysis

 - If 'earlyprintk' command line parameter is not specified, store
   decompressor boot messages in a ring buffer to be printed later by
   the kernel, once the console driver is registered

 - Add 'bootdebug' command line parameter to enable printing of
   decompressor debug messages when needed. That parameters allows
   message suppressing and filtering

 - Dump boot messages on a decompressor crash, but only if 'bootdebug'
   command line parameter is enabled

 - When CONFIG_PRINTK_TIME is enabled, add timestamps to boot messages
   in the same format as regular printk()

 - Dump physical memory tracking information on boot: online ranges,
   reserved areas and vmem allocations

 - Dump virtual memory layout and randomization details

 - Improve decompression error reporting and dump the message ring
   buffer in case the boot failed and system halted

 - Add an exception handler which handles exceptions when FPU control
   register is attempted to be set to an invalid value. Remove '.fixup'
   section as result of this change

 - Use 'A', 'O', and 'R' inline assembly format flags, which allows
   recent Clang compilers to generate better FPU code

 - Rework uaccess code so it reads better and generates more efficient
   code

 - Cleanup futex inline assembly code

 - Disable KMSAN instrumention for futex inline assemblies, which
   contain dereferenced user pointers. Otherwise, shadows for the user
   pointers would be accessed

 - PFs which are not initially configured but in standby create only a
   single-function PCI domain. If they are configured later on, sibling
   PFs and their child VFs will not be added to their PCI domain
   breaking SR-IOV expectations.

   Fix that by allowing initially configured but in standby PFs create
   multi-function PCI domains

 - Add '-std=gnu11' to decompressor and purgatory CFLAGS to avoid
   compile errors caused by kernel's own definitions of 'bool', 'false',
   and 'true' conflicting with the C23 reserved keywords

 - Fix sclp subsystem failure when a sclp console is not present

 - Fix misuse of non-NULL terminated strings in vmlogrdr driver

 - Various other small improvements, cleanups and fixes

* tag 's390-6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (53 commits)
  s390/vmlogrdr: Use array instead of string initializer
  s390/vmlogrdr: Use internal_name for error messages
  s390/sclp: Initialize sclp subsystem via arch_cpu_finalize_init()
  s390/tools: Use array instead of string initializer
  s390/vmem: Fix null-pointer-arithmetic warning in vmem_map_init()
  s390: Add '-std=gnu11' to decompressor and purgatory CFLAGS
  s390/bitops: Use correct constraint for arch_test_bit() inline assembly
  s390/pci: Fix SR-IOV for PFs initially in standby
  s390/futex: Avoid KMSAN instrumention for user pointers
  s390/uaccess: Rename get_put_user_noinstr_attributes to uaccess_kmsan_or_inline
  s390/futex: Cleanup futex_atomic_cmpxchg_inatomic()
  s390/futex: Generate futex atomic op functions
  s390/uaccess: Remove INLINE_COPY_FROM_USER and INLINE_COPY_TO_USER
  s390/uaccess: Use asm goto for put_user()/get_user()
  s390/uaccess: Remove usage of the oac specifier
  s390/uaccess: Replace EX_TABLE_UA_LOAD_MEM exception handling
  s390/uaccess: Cleanup noinstr __put_user()/__get_user() inline assembly constraints
  s390/uaccess: Remove __put_user_fn()/__get_user_fn() wrappers
  s390/uaccess: Move put_user() / __put_user() close to put_user() asm code
  s390/uaccess: Use asm goto for __mvc_kernel_nofault()
  ...

4 months agoMerge tag 'gpio-fixes-for-v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 30 Jan 2025 18:19:30 +0000 (10:19 -0800)] 
Merge tag 'gpio-fixes-for-v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

Pull gpio fixes from Bartosz Golaszewski:

 - update gpio-sim selftests to not fail now that we no longer allow
   rmdir() on configfs entries of active devices

 - remove leftover code from gpio-mxc

* tag 'gpio-fixes-for-v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  selftests: gpio: gpio-sim: Fix missing chip disablements
  gpio: mxc: remove dead code after switch to DT-only

4 months agoMerge tag 'pull-revalidate' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Thu, 30 Jan 2025 17:13:35 +0000 (09:13 -0800)] 
Merge tag 'pull-revalidate' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs d_revalidate updates from Al Viro:
 "Provide stable parent and name to ->d_revalidate() instances

  Most of the filesystem methods where we care about dentry name and
  parent have their stability guaranteed by the callers;
  ->d_revalidate() is the major exception.

  It's easy enough for callers to supply stable values for expected name
  and expected parent of the dentry being validated. That kills quite a
  bit of boilerplate in ->d_revalidate() instances, along with a bunch
  of races where they used to access ->d_name without sufficient
  precautions"

* tag 'pull-revalidate' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  9p: fix ->rename_sem exclusion
  orangefs_d_revalidate(): use stable parent inode and name passed by caller
  ocfs2_dentry_revalidate(): use stable parent inode and name passed by caller
  nfs: fix ->d_revalidate() UAF on ->d_name accesses
  nfs{,4}_lookup_validate(): use stable parent inode passed by caller
  gfs2_drevalidate(): use stable parent inode and name passed by caller
  fuse_dentry_revalidate(): use stable parent inode and name passed by caller
  vfat_revalidate{,_ci}(): use stable parent inode passed by caller
  exfat_d_revalidate(): use stable parent inode passed by caller
  fscrypt_d_revalidate(): use stable parent inode passed by caller
  ceph_d_revalidate(): propagate stable name down into request encoding
  ceph_d_revalidate(): use stable parent inode passed by caller
  afs_d_revalidate(): use stable name and parent inode passed by caller
  Pass parent directory inode and expected name to ->d_revalidate()
  generic_ci_d_compare(): use shortname_storage
  ext4 fast_commit: make use of name_snapshot primitives
  dissolve external_name.u into separate members
  make take_dentry_name_snapshot() lockless
  dcache: back inline names with a struct-wrapped array of unsigned long
  make sure that DNAME_INLINE_LEN is a multiple of word size

4 months agoMerge tag 'nf-25-01-30' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Jakub Kicinski [Thu, 30 Jan 2025 17:01:00 +0000 (09:01 -0800)] 
Merge tag 'nf-25-01-30' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following batch contains one Netfilter fix:

1) Reject mismatching sum of field_len with set key length which allows
   to create a set without inconsistent pipapo rule width and set key
   length.

* tag 'nf-25-01-30' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: nf_tables: reject mismatching sum of field_len with set key length
====================

Link: https://patch.msgid.link/20250130113307.2327470-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agoMAINTAINERS: add Neal to TCP maintainers
Jakub Kicinski [Wed, 29 Jan 2025 19:13:32 +0000 (11:13 -0800)] 
MAINTAINERS: add Neal to TCP maintainers

Neal Cardwell has been indispensable in TCP reviews
and investigations, especially protocol-related.
Neal is also the author of packetdrill.

Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250129191332.2526140-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agonet: revert RTNL changes in unregister_netdevice_many_notify()
Eric Dumazet [Wed, 29 Jan 2025 14:27:26 +0000 (14:27 +0000)] 
net: revert RTNL changes in unregister_netdevice_many_notify()

This patch reverts following changes:

83419b61d187 net: reduce RTNL hold duration in unregister_netdevice_many_notify() (part 2)
ae646f1a0bb9 net: reduce RTNL hold duration in unregister_netdevice_many_notify() (part 1)
cfa579f66656 net: no longer hold RTNL while calling flush_all_backlogs()

This caused issues in layers holding a private mutex:

cleanup_net()
  rtnl_lock();
mutex_lock(subsystem_mutex);

unregister_netdevice();

   rtnl_unlock(); // LOCKDEP violation
   rtnl_lock();

I will revisit this in next cycle, opt-in for the new behavior
from safe contexts only.

Fixes: cfa579f66656 ("net: no longer hold RTNL while calling flush_all_backlogs()")
Fixes: ae646f1a0bb9 ("net: reduce RTNL hold duration in unregister_netdevice_many_notify() (part 1)")
Fixes: 83419b61d187 ("net: reduce RTNL hold duration in unregister_netdevice_many_notify() (part 2)")
Reported-by: syzbot+5b9196ecf74447172a9a@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/6789d55f.050a0220.20d369.004e.GAE@google.com/
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250129142726.747726-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agonet: hsr: fix fill_frame_info() regression vs VLAN packets
Eric Dumazet [Wed, 29 Jan 2025 13:00:07 +0000 (13:00 +0000)] 
net: hsr: fix fill_frame_info() regression vs VLAN packets

Stephan Wurm reported that my recent patch broke VLAN support.

Apparently skb->mac_len is not correct for VLAN traffic as
shown by debug traces [1].

Use instead pskb_may_pull() to make sure the expected header
is present in skb->head.

Many thanks to Stephan for his help.

[1]
kernel: skb len=170 headroom=2 headlen=170 tailroom=20
        mac=(2,14) mac_len=14 net=(16,-1) trans=-1
        shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
        csum(0x0 start=0 offset=0 ip_summed=0 complete_sw=0 valid=0 level=0)
        hash(0x0 sw=0 l4=0) proto=0x0000 pkttype=0 iif=0
        priority=0x0 mark=0x0 alloc_cpu=0 vlan_all=0x0
        encapsulation=0 inner(proto=0x0000, mac=0, net=0, trans=0)
kernel: dev name=prp0 feat=0x0000000000007000
kernel: sk family=17 type=3 proto=0
kernel: skb headroom: 00000000: 74 00
kernel: skb linear:   00000000: 01 0c cd 01 00 01 00 d0 93 53 9c cb 81 00 80 00
kernel: skb linear:   00000010: 88 b8 00 01 00 98 00 00 00 00 61 81 8d 80 16 52
kernel: skb linear:   00000020: 45 47 44 4e 43 54 52 4c 2f 4c 4c 4e 30 24 47 4f
kernel: skb linear:   00000030: 24 47 6f 43 62 81 01 14 82 16 52 45 47 44 4e 43
kernel: skb linear:   00000040: 54 52 4c 2f 4c 4c 4e 30 24 44 73 47 6f 6f 73 65
kernel: skb linear:   00000050: 83 07 47 6f 49 64 65 6e 74 84 08 67 8d f5 93 7e
kernel: skb linear:   00000060: 76 c8 00 85 01 01 86 01 00 87 01 00 88 01 01 89
kernel: skb linear:   00000070: 01 00 8a 01 02 ab 33 a2 15 83 01 00 84 03 03 00
kernel: skb linear:   00000080: 00 91 08 67 8d f5 92 77 4b c6 1f 83 01 00 a2 1a
kernel: skb linear:   00000090: a2 06 85 01 00 83 01 00 84 03 03 00 00 91 08 67
kernel: skb linear:   000000a0: 8d f5 92 77 4b c6 1f 83 01 00
kernel: skb tailroom: 00000000: 80 18 02 00 fe 4e 00 00 01 01 08 0a 4f fd 5e d1
kernel: skb tailroom: 00000010: 4f fd 5e cd

Fixes: b9653d19e556 ("net: hsr: avoid potential out-of-bound access in fill_frame_info()")
Reported-by: Stephan Wurm <stephan.wurm@a-eberle.de>
Tested-by: Stephan Wurm <stephan.wurm@a-eberle.de>
Closes: https://lore.kernel.org/netdev/Z4o_UC0HweBHJ_cw@PC-LX-SteWu/
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250129130007.644084-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agoMerge tag 'ntfs3_for_6.14' of https://github.com/Paragon-Software-Group/linux-ntfs3
Linus Torvalds [Thu, 30 Jan 2025 16:47:17 +0000 (08:47 -0800)] 
Merge tag 'ntfs3_for_6.14' of https://github.com/Paragon-Software-Group/linux-ntfs3

Pull ntfs3 fixes from Konstantin Komarov:

 - unify inode corruption marking and mark them as bad immediately upon
   detection of an error in attribute enumeration

 - folio cleanup

* tag 'ntfs3_for_6.14' of https://github.com/Paragon-Software-Group/linux-ntfs3:
  fs/ntfs3: Unify inode corruption marking with _ntfs_bad_inode()
  fs/ntfs3: Mark inode as bad as soon as error detected in mi_enum_attr()
  ntfs3: Remove an access to page->index

4 months agoMerge tag 'bcachefs-2025-01-29' of git://evilpiepirate.org/bcachefs
Linus Torvalds [Thu, 30 Jan 2025 16:42:50 +0000 (08:42 -0800)] 
Merge tag 'bcachefs-2025-01-29' of git://evilpiepirate.org/bcachefs

Pull bcachefs fixes from Kent Overstreet:

 - second half of a fix for a bug that'd been causing oopses on
   filesystems using snapshots with memory pressure (key cache fills for
   snaphots btrees are tricky)

 - build fix for strange compiler configurations that double stack frame
   size

 - "journal stuck timeout" now takes into account device latency: this
   fixes some spurious warnings, and the main remaining source of SRCU
   lock hold time warnings (I'm no longer seeing this in my CI, so any
   users still seeing this should definitely ping me)

 - fix for slow/hanging unmounts (" Improve journal pin flushing")

 - some more tracepoint fixes/improvements, to chase down the "rebalance
   isn't making progress" issues

* tag 'bcachefs-2025-01-29' of git://evilpiepirate.org/bcachefs:
  bcachefs: Improve trace_move_extent_finish
  bcachefs: Fix trace_copygc
  bcachefs: Journal writes are now IOPRIO_CLASS_RT
  bcachefs: Improve journal pin flushing
  bcachefs: fix bch2_btree_node_flags
  bcachefs: rebalance, copygc enabled are runtime opts
  bcachefs: Improve decompression error messages
  bcachefs: bset_blacklisted_journal_seq is now AUTOFIX
  bcachefs: "Journal stuck" timeout now takes into account device latency
  bcachefs: Reduce stack frame size of __bch2_str_hash_check_key()
  bcachefs: Fix btree_trans_peek_key_cache()

4 months agoMerge branch 'mptcp-blackhole-only-if-1st-syn-retrans-w-o-mpc-is-accepted'
Paolo Abeni [Thu, 30 Jan 2025 13:02:21 +0000 (14:02 +0100)] 
Merge branch 'mptcp-blackhole-only-if-1st-syn-retrans-w-o-mpc-is-accepted'

Matthieu Baerts says:

====================
mptcp: blackhole only if 1st SYN retrans w/o MPC is accepted

Here are two small fixes for issues introduced in v6.12.

- Patch 1: reset the mpc_drop mark for other SYN retransmits, to only
  consider an MPTCP blackhole when the first SYN retransmitted without
  the MPTCP options is accepted, as initially intended.

- Patch 2: also mention in the doc that the blackhole_timeout sysctl
  knob is per-netns, like all the others.

Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
====================

Link: https://patch.msgid.link/20250129-net-mptcp-blackhole-fix-v1-0-afe88e5a6d2c@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
4 months agodoc: mptcp: sysctl: blackhole_timeout is per-netns
Matthieu Baerts (NGI0) [Wed, 29 Jan 2025 12:24:33 +0000 (13:24 +0100)] 
doc: mptcp: sysctl: blackhole_timeout is per-netns

All other sysctl entries mention it, and it is a per-namespace sysctl.

So mention it as well.

Fixes: 27069e7cb3d1 ("mptcp: disable active MPTCP in case of blackhole")
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
4 months agomptcp: blackhole only if 1st SYN retrans w/o MPC is accepted
Matthieu Baerts (NGI0) [Wed, 29 Jan 2025 12:24:32 +0000 (13:24 +0100)] 
mptcp: blackhole only if 1st SYN retrans w/o MPC is accepted

The Fixes commit mentioned this:

> An MPTCP firewall blackhole can be detected if the following SYN
> retransmission after a fallback to "plain" TCP is accepted.

But in fact, this blackhole was detected if any following SYN
retransmissions after a fallback to TCP was accepted.

That's because 'mptcp_subflow_early_fallback()' will set 'request_mptcp'
to 0, and 'mpc_drop' will never be reset to 0 after.

This is an issue, because some not so unusual situations might cause the
kernel to detect a false-positive blackhole, e.g. a client trying to
connect to a server while the network is not ready yet, causing a few
SYN retransmissions, before reaching the end server.

Fixes: 27069e7cb3d1 ("mptcp: disable active MPTCP in case of blackhole")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
4 months agonetfilter: nf_tables: reject mismatching sum of field_len with set key length
Pablo Neira Ayuso [Tue, 28 Jan 2025 11:26:33 +0000 (12:26 +0100)] 
netfilter: nf_tables: reject mismatching sum of field_len with set key length

The field length description provides the length of each separated key
field in the concatenation, each field gets rounded up to 32-bits to
calculate the pipapo rule width from pipapo_init(). The set key length
provides the total size of the key aligned to 32-bits.

Register-based arithmetics still allows for combining mismatching set
key length and field length description, eg. set key length 10 and field
description [ 5, 4 ] leading to pipapo width of 12.

Cc: stable@vger.kernel.org
Fixes: 3ce67e3793f4 ("netfilter: nf_tables: do not allow mismatch field size and set key length")
Reported-by: Noam Rathaus <noamr@ssd-disclosure.com>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
4 months agoMerge branch 'fix-missing-rtnl-lock-in-suspend-path'
Paolo Abeni [Thu, 30 Jan 2025 10:23:32 +0000 (11:23 +0100)] 
Merge branch 'fix-missing-rtnl-lock-in-suspend-path'

Kory Maincent says:

====================
Fix missing rtnl lock in suspend path

Fix the suspend path by ensuring the rtnl lock is held where required.
Calls to open, close and WOL operations must be performed under the
rtnl lock to prevent conflicts with ongoing ndo operations.

Discussion about this issue can be found here:
https://lore.kernel.org/netdev/20250120141926.1290763-1-kory.maincent@bootlin.com/

While working on the ravb fix, it was discovered that the sh_eth driver
has the same issue. This patch series addresses both drivers.

I do not have access to hardware for either of these MACs, so it would
be great if maintainers or others with the relevant boards could test
these fixes.

v2: https://lore.kernel.org/r/20250123-fix_missing_rtnl_lock_phy_disconnect-v2-0-e6206f5508ba@bootlin.com

v1: https://lore.kernel.org/r/20250122-fix_missing_rtnl_lock_phy_disconnect-v1-0-8cb9f6f88fd1@bootlin.com

Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
====================

Link: https://patch.msgid.link/20250129-fix_missing_rtnl_lock_phy_disconnect-v3-0-24c4ba185a92@bootlin.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
4 months agonet: sh_eth: Fix missing rtnl lock in suspend/resume path
Kory Maincent [Wed, 29 Jan 2025 09:50:47 +0000 (10:50 +0100)] 
net: sh_eth: Fix missing rtnl lock in suspend/resume path

Fix the suspend/resume path by ensuring the rtnl lock is held where
required. Calls to sh_eth_close, sh_eth_open and wol operations must be
performed under the rtnl lock to prevent conflicts with ongoing ndo
operations.

Fixes: b71af04676e9 ("sh_eth: add more PM methods")
Tested-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
4 months agonet: ravb: Fix missing rtnl lock in suspend/resume path
Kory Maincent [Wed, 29 Jan 2025 09:50:46 +0000 (10:50 +0100)] 
net: ravb: Fix missing rtnl lock in suspend/resume path

Fix the suspend/resume path by ensuring the rtnl lock is held where
required. Calls to ravb_open, ravb_close and wol operations must be
performed under the rtnl lock to prevent conflicts with ongoing ndo
operations.

Without this fix, the following warning is triggered:
[   39.032969] =============================
[   39.032983] WARNING: suspicious RCU usage
[   39.033019] -----------------------------
[   39.033033] drivers/net/phy/phy_device.c:2004 suspicious
rcu_dereference_protected() usage!
...
[   39.033597] stack backtrace:
[   39.033613] CPU: 0 UID: 0 PID: 174 Comm: python3 Not tainted
6.13.0-rc7-next-20250116-arm64-renesas-00002-g35245dfdc62c #7
[   39.033623] Hardware name: Renesas SMARC EVK version 2 based on
r9a08g045s33 (DT)
[   39.033628] Call trace:
[   39.033633]  show_stack+0x14/0x1c (C)
[   39.033652]  dump_stack_lvl+0xb4/0xc4
[   39.033664]  dump_stack+0x14/0x1c
[   39.033671]  lockdep_rcu_suspicious+0x16c/0x22c
[   39.033682]  phy_detach+0x160/0x190
[   39.033694]  phy_disconnect+0x40/0x54
[   39.033703]  ravb_close+0x6c/0x1cc
[   39.033714]  ravb_suspend+0x48/0x120
[   39.033721]  dpm_run_callback+0x4c/0x14c
[   39.033731]  device_suspend+0x11c/0x4dc
[   39.033740]  dpm_suspend+0xdc/0x214
[   39.033748]  dpm_suspend_start+0x48/0x60
[   39.033758]  suspend_devices_and_enter+0x124/0x574
[   39.033769]  pm_suspend+0x1ac/0x274
[   39.033778]  state_store+0x88/0x124
[   39.033788]  kobj_attr_store+0x14/0x24
[   39.033798]  sysfs_kf_write+0x48/0x6c
[   39.033808]  kernfs_fop_write_iter+0x118/0x1a8
[   39.033817]  vfs_write+0x27c/0x378
[   39.033825]  ksys_write+0x64/0xf4
[   39.033833]  __arm64_sys_write+0x18/0x20
[   39.033841]  invoke_syscall+0x44/0x104
[   39.033852]  el0_svc_common.constprop.0+0xb4/0xd4
[   39.033862]  do_el0_svc+0x18/0x20
[   39.033870]  el0_svc+0x3c/0xf0
[   39.033880]  el0t_64_sync_handler+0xc0/0xc4
[   39.033888]  el0t_64_sync+0x154/0x158
[   39.041274] ravb 11c30000.ethernet eth0: Link is Down

Reported-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Closes: https://lore.kernel.org/netdev/4c6419d8-c06b-495c-b987-d66c2e1ff848@tuxon.dev/
Fixes: 0184165b2f42 ("ravb: add sleep PM suspend/resume support")
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Tested-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
4 months agoMerge tag 'for-net-2025-01-29' of git://git.kernel.org/pub/scm/linux/kernel/git/bluet...
Paolo Abeni [Thu, 30 Jan 2025 10:00:31 +0000 (11:00 +0100)] 
Merge tag 'for-net-2025-01-29' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth

Luiz Augusto von Dentz says:

====================
bluetooth pull request for net:

 - btusb: mediatek: Add locks for usb_driver_claim_interface()
 - L2CAP: accept zero as a special value for MTU auto-selection
 - btusb: Fix possible infinite recursion of btusb_reset
 - Add ABI doc for sysfs reset
 - btnxpuart: Fix glitches seen in dual A2DP streaming

* tag 'for-net-2025-01-29' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
  Bluetooth: L2CAP: accept zero as a special value for MTU auto-selection
  Bluetooth: btnxpuart: Fix glitches seen in dual A2DP streaming
  Bluetooth: Add ABI doc for sysfs reset
  Bluetooth: Fix possible infinite recursion of btusb_reset
  Bluetooth: btusb: mediatek: Add locks for usb_driver_claim_interface()
====================

Link: https://patch.msgid.link/20250129210057.1318963-1-luiz.dentz@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
4 months agoselftests/net: Add test for loading devbound XDP program in generic mode
Toke Høiland-Jørgensen [Mon, 27 Jan 2025 13:13:43 +0000 (14:13 +0100)] 
selftests/net: Add test for loading devbound XDP program in generic mode

Add a test to bpf_offload.py for loading a devbound XDP program in
generic mode, checking that it fails correctly.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250127131344.238147-2-toke@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agonet: xdp: Disallow attaching device-bound programs in generic mode
Toke Høiland-Jørgensen [Mon, 27 Jan 2025 13:13:42 +0000 (14:13 +0100)] 
net: xdp: Disallow attaching device-bound programs in generic mode

Device-bound programs are used to support RX metadata kfuncs. These
kfuncs are driver-specific and rely on the driver context to read the
metadata. This means they can't work in generic XDP mode. However, there
is no check to disallow such programs from being attached in generic
mode, in which case the metadata kfuncs will be called in an invalid
context, leading to crashes.

Fix this by adding a check to disallow attaching device-bound programs
in generic mode.

Fixes: 2b3486bc2d23 ("bpf: Introduce device-bound XDP programs")
Reported-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de>
Closes: https://lore.kernel.org/r/dae862ec-43b5-41a0-8edf-46c59071cdda@hetzner-cloud.de
Tested-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20250127131344.238147-1-toke@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agotcp: correct handling of extreme memory squeeze
Jon Maloy [Mon, 27 Jan 2025 23:13:04 +0000 (18:13 -0500)] 
tcp: correct handling of extreme memory squeeze

Testing with iperf3 using the "pasta" protocol splicer has revealed
a problem in the way tcp handles window advertising in extreme memory
squeeze situations.

Under memory pressure, a socket endpoint may temporarily advertise
a zero-sized window, but this is not stored as part of the socket data.
The reasoning behind this is that it is considered a temporary setting
which shouldn't influence any further calculations.

However, if we happen to stall at an unfortunate value of the current
window size, the algorithm selecting a new value will consistently fail
to advertise a non-zero window once we have freed up enough memory.
This means that this side's notion of the current window size is
different from the one last advertised to the peer, causing the latter
to not send any data to resolve the sitution.

The problem occurs on the iperf3 server side, and the socket in question
is a completely regular socket with the default settings for the
fedora40 kernel. We do not use SO_PEEK or SO_RCVBUF on the socket.

The following excerpt of a logging session, with own comments added,
shows more in detail what is happening:

//              tcp_v4_rcv(->)
//                tcp_rcv_established(->)
[5201<->39222]:     ==== Activating log @ net/ipv4/tcp_input.c/tcp_data_queue()/5257 ====
[5201<->39222]:     tcp_data_queue(->)
[5201<->39222]:        DROPPING skb [265600160..265665640], reason: SKB_DROP_REASON_PROTO_MEM
                       [rcv_nxt 265600160, rcv_wnd 262144, snt_ack 265469200, win_now 131184]
                       [copied_seq 259909392->260034360 (124968), unread 5565800, qlen 85, ofoq 0]
                       [OFO queue: gap: 65480, len: 0]
[5201<->39222]:     tcp_data_queue(<-)
[5201<->39222]:     __tcp_transmit_skb(->)
                        [tp->rcv_wup: 265469200, tp->rcv_wnd: 262144, tp->rcv_nxt 265600160]
[5201<->39222]:       tcp_select_window(->)
[5201<->39222]:         (inet_csk(sk)->icsk_ack.pending & ICSK_ACK_NOMEM) ? --> TRUE
                        [tp->rcv_wup: 265469200, tp->rcv_wnd: 262144, tp->rcv_nxt 265600160]
                        returning 0
[5201<->39222]:       tcp_select_window(<-)
[5201<->39222]:       ADVERTISING WIN 0, ACK_SEQ: 265600160
[5201<->39222]:     [__tcp_transmit_skb(<-)
[5201<->39222]:   tcp_rcv_established(<-)
[5201<->39222]: tcp_v4_rcv(<-)

// Receive queue is at 85 buffers and we are out of memory.
// We drop the incoming buffer, although it is in sequence, and decide
// to send an advertisement with a window of zero.
// We don't update tp->rcv_wnd and tp->rcv_wup accordingly, which means
// we unconditionally shrink the window.

[5201<->39222]: tcp_recvmsg_locked(->)
[5201<->39222]:   __tcp_cleanup_rbuf(->) tp->rcv_wup: 265469200, tp->rcv_wnd: 262144, tp->rcv_nxt 265600160
[5201<->39222]:     [new_win = 0, win_now = 131184, 2 * win_now = 262368]
[5201<->39222]:     [new_win >= (2 * win_now) ? --> time_to_ack = 0]
[5201<->39222]:     NOT calling tcp_send_ack()
                    [tp->rcv_wup: 265469200, tp->rcv_wnd: 262144, tp->rcv_nxt 265600160]
[5201<->39222]:   __tcp_cleanup_rbuf(<-)
                  [rcv_nxt 265600160, rcv_wnd 262144, snt_ack 265469200, win_now 131184]
                  [copied_seq 260040464->260040464 (0), unread 5559696, qlen 85, ofoq 0]
                  returning 6104 bytes
[5201<->39222]: tcp_recvmsg_locked(<-)

// After each read, the algorithm for calculating the new receive
// window in __tcp_cleanup_rbuf() finds it is too small to advertise
// or to update tp->rcv_wnd.
// Meanwhile, the peer thinks the window is zero, and will not send
// any more data to trigger an update from the interrupt mode side.

[5201<->39222]: tcp_recvmsg_locked(->)
[5201<->39222]:   __tcp_cleanup_rbuf(->) tp->rcv_wup: 265469200, tp->rcv_wnd: 262144, tp->rcv_nxt 265600160
[5201<->39222]:     [new_win = 262144, win_now = 131184, 2 * win_now = 262368]
[5201<->39222]:     [new_win >= (2 * win_now) ? --> time_to_ack = 0]
[5201<->39222]:     NOT calling tcp_send_ack()
                    [tp->rcv_wup: 265469200, tp->rcv_wnd: 262144, tp->rcv_nxt 265600160]
[5201<->39222]:   __tcp_cleanup_rbuf(<-)
                  [rcv_nxt 265600160, rcv_wnd 262144, snt_ack 265469200, win_now 131184]
                  [copied_seq 260099840->260171536 (71696), unread 5428624, qlen 83, ofoq 0]
                  returning 131072 bytes
[5201<->39222]: tcp_recvmsg_locked(<-)

// The above pattern repeats again and again, since nothing changes
// between the reads.

[...]

[5201<->39222]: tcp_recvmsg_locked(->)
[5201<->39222]:   __tcp_cleanup_rbuf(->) tp->rcv_wup: 265469200, tp->rcv_wnd: 262144, tp->rcv_nxt 265600160
[5201<->39222]:     [new_win = 262144, win_now = 131184, 2 * win_now = 262368]
[5201<->39222]:     [new_win >= (2 * win_now) ? --> time_to_ack = 0]
[5201<->39222]:     NOT calling tcp_send_ack()
                    [tp->rcv_wup: 265469200, tp->rcv_wnd: 262144, tp->rcv_nxt 265600160]
[5201<->39222]:   __tcp_cleanup_rbuf(<-)
                  [rcv_nxt 265600160, rcv_wnd 262144, snt_ack 265469200, win_now 131184]
                  [copied_seq 265600160->265600160 (0), unread 0, qlen 0, ofoq 0]
                  returning 54672 bytes
[5201<->39222]: tcp_recvmsg_locked(<-)

// The receive queue is empty, but no new advertisement has been sent.
// The peer still thinks the receive window is zero, and sends nothing.
// We have ended up in a deadlock situation.

Note that well behaved endpoints will send win0 probes, so the problem
will not occur.

Furthermore, we have observed that in these situations this side may
send out an updated 'th->ack_seq´ which is not stored in tp->rcv_wup
as it should be. Backing ack_seq seems to be harmless, but is of
course still wrong from a protocol viewpoint.

We fix this by updating the socket state correctly when a packet has
been dropped because of memory exhaustion and we have to advertize
a zero window.

Further testing shows that the connection recovers neatly from the
squeeze situation, and traffic can continue indefinitely.

Fixes: e2142825c120 ("net: tcp: send zero-window ACK when no memory")
Cc: Menglong Dong <menglong8.dong@gmail.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Link: https://patch.msgid.link/20250127231304.1465565-1-jmaloy@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agobgmac: reduce max frame size to support just MTU 1500
Rafał Miłecki [Mon, 27 Jan 2025 17:51:59 +0000 (09:51 -0800)] 
bgmac: reduce max frame size to support just MTU 1500

bgmac allocates new replacement buffer before handling each received
frame. Allocating & DMA-preparing 9724 B each time consumes a lot of CPU
time. Ideally bgmac should just respect currently set MTU but it isn't
the case right now. For now just revert back to the old limited frame
size.

This change bumps NAT masquerade speed by ~95%.

Since commit 8218f62c9c9b ("mm: page_frag: use initial zero offset for
page_frag_alloc_align()"), the bgmac driver fails to open its network
interface successfully and runs out of memory in the following call
stack:

bgmac_open
  -> bgmac_dma_init
    -> bgmac_dma_rx_skb_for_slot
      -> netdev_alloc_frag

BGMAC_RX_ALLOC_SIZE = 10048 and PAGE_FRAG_CACHE_MAX_SIZE = 32768.

Eventually we land into __page_frag_alloc_align() with the following
parameters across multiple successive calls:

__page_frag_alloc_align: fragsz=10048, align_mask=-1, size=32768, offset=0
__page_frag_alloc_align: fragsz=10048, align_mask=-1, size=32768, offset=10048
__page_frag_alloc_align: fragsz=10048, align_mask=-1, size=32768, offset=20096
__page_frag_alloc_align: fragsz=10048, align_mask=-1, size=32768, offset=30144

So in that case we do indeed have offset + fragsz (40192) > size (32768)
and so we would eventually return NULL. Reverting to the older 1500
bytes MTU allows the network driver to be usable again.

Fixes: 8c7da63978f1 ("bgmac: configure MTU and add support for frames beyond 8192 byte size")
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
[florian: expand commit message about recent commits]
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250127175159.1788246-1-florian.fainelli@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agoMerge branch 'vsock-transport-reassignment-and-error-handling-issues'
Jakub Kicinski [Thu, 30 Jan 2025 02:50:40 +0000 (18:50 -0800)] 
Merge branch 'vsock-transport-reassignment-and-error-handling-issues'

Michal Luczaj says:

====================
vsock: Transport reassignment and error handling issues

Series deals with two issues:
- socket reference count imbalance due to an unforgiving transport release
  (triggered by transport reassignment);
- unintentional API feature, a failing connect() making the socket
  impossible to use for any subsequent connect() attempts.

v2: https://lore.kernel.org/20250121-vsock-transport-vs-autobind-v2-0-aad6069a4e8c@rbox.co
v1: https://lore.kernel.org/20250117-vsock-transport-vs-autobind-v1-0-c802c803762d@rbox.co
====================

Link: https://patch.msgid.link/20250128-vsock-transport-vs-autobind-v3-0-1cf57065b770@rbox.co
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agovsock/test: Add test for connect() retries
Michal Luczaj [Tue, 28 Jan 2025 13:15:32 +0000 (14:15 +0100)] 
vsock/test: Add test for connect() retries

Deliberately fail a connect() attempt; expect error. Then verify that
subsequent attempt (using the same socket) can still succeed, rather than
fail outright.

Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Luigi Leonardi <leonardi@redhat.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://patch.msgid.link/20250128-vsock-transport-vs-autobind-v3-6-1cf57065b770@rbox.co
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agovsock/test: Add test for UAF due to socket unbinding
Michal Luczaj [Tue, 28 Jan 2025 13:15:31 +0000 (14:15 +0100)] 
vsock/test: Add test for UAF due to socket unbinding

Fail the autobind, then trigger a transport reassign. Socket might get
unbound from unbound_sockets, which then leads to a reference count
underflow.

Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://patch.msgid.link/20250128-vsock-transport-vs-autobind-v3-5-1cf57065b770@rbox.co
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agovsock/test: Introduce vsock_connect_fd()
Michal Luczaj [Tue, 28 Jan 2025 13:15:30 +0000 (14:15 +0100)] 
vsock/test: Introduce vsock_connect_fd()

Distill timeout-guarded vsock_connect_fd(). Adapt callers.

Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://patch.msgid.link/20250128-vsock-transport-vs-autobind-v3-4-1cf57065b770@rbox.co
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agovsock/test: Introduce vsock_bind()
Michal Luczaj [Tue, 28 Jan 2025 13:15:29 +0000 (14:15 +0100)] 
vsock/test: Introduce vsock_bind()

Add a helper for socket()+bind(). Adapt callers.

Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Luigi Leonardi <leonardi@redhat.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://patch.msgid.link/20250128-vsock-transport-vs-autobind-v3-3-1cf57065b770@rbox.co
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agovsock: Allow retrying on connect() failure
Michal Luczaj [Tue, 28 Jan 2025 13:15:28 +0000 (14:15 +0100)] 
vsock: Allow retrying on connect() failure

sk_err is set when a (connectible) connect() fails. Effectively, this makes
an otherwise still healthy SS_UNCONNECTED socket impossible to use for any
subsequent connection attempts.

Clear sk_err upon trying to establish a connection.

Fixes: d021c344051a ("VSOCK: Introduce VM Sockets")
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Luigi Leonardi <leonardi@redhat.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://patch.msgid.link/20250128-vsock-transport-vs-autobind-v3-2-1cf57065b770@rbox.co
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agovsock: Keep the binding until socket destruction
Michal Luczaj [Tue, 28 Jan 2025 13:15:27 +0000 (14:15 +0100)] 
vsock: Keep the binding until socket destruction

Preserve sockets bindings; this includes both resulting from an explicit
bind() and those implicitly bound through autobind during connect().

Prevents socket unbinding during a transport reassignment, which fixes a
use-after-free:

    1. vsock_create() (refcnt=1) calls vsock_insert_unbound() (refcnt=2)
    2. transport->release() calls vsock_remove_bound() without checking if
       sk was bound and moved to bound list (refcnt=1)
    3. vsock_bind() assumes sk is in unbound list and before
       __vsock_insert_bound(vsock_bound_sockets()) calls
       __vsock_remove_bound() which does:
           list_del_init(&vsk->bound_table); // nop
           sock_put(&vsk->sk);               // refcnt=0

BUG: KASAN: slab-use-after-free in __vsock_bind+0x62e/0x730
Read of size 4 at addr ffff88816b46a74c by task a.out/2057
 dump_stack_lvl+0x68/0x90
 print_report+0x174/0x4f6
 kasan_report+0xb9/0x190
 __vsock_bind+0x62e/0x730
 vsock_bind+0x97/0xe0
 __sys_bind+0x154/0x1f0
 __x64_sys_bind+0x6e/0xb0
 do_syscall_64+0x93/0x1b0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Allocated by task 2057:
 kasan_save_stack+0x1e/0x40
 kasan_save_track+0x10/0x30
 __kasan_slab_alloc+0x85/0x90
 kmem_cache_alloc_noprof+0x131/0x450
 sk_prot_alloc+0x5b/0x220
 sk_alloc+0x2c/0x870
 __vsock_create.constprop.0+0x2e/0xb60
 vsock_create+0xe4/0x420
 __sock_create+0x241/0x650
 __sys_socket+0xf2/0x1a0
 __x64_sys_socket+0x6e/0xb0
 do_syscall_64+0x93/0x1b0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Freed by task 2057:
 kasan_save_stack+0x1e/0x40
 kasan_save_track+0x10/0x30
 kasan_save_free_info+0x37/0x60
 __kasan_slab_free+0x4b/0x70
 kmem_cache_free+0x1a1/0x590
 __sk_destruct+0x388/0x5a0
 __vsock_bind+0x5e1/0x730
 vsock_bind+0x97/0xe0
 __sys_bind+0x154/0x1f0
 __x64_sys_bind+0x6e/0xb0
 do_syscall_64+0x93/0x1b0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

refcount_t: addition on 0; use-after-free.
WARNING: CPU: 7 PID: 2057 at lib/refcount.c:25 refcount_warn_saturate+0xce/0x150
RIP: 0010:refcount_warn_saturate+0xce/0x150
 __vsock_bind+0x66d/0x730
 vsock_bind+0x97/0xe0
 __sys_bind+0x154/0x1f0
 __x64_sys_bind+0x6e/0xb0
 do_syscall_64+0x93/0x1b0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

refcount_t: underflow; use-after-free.
WARNING: CPU: 7 PID: 2057 at lib/refcount.c:28 refcount_warn_saturate+0xee/0x150
RIP: 0010:refcount_warn_saturate+0xee/0x150
 vsock_remove_bound+0x187/0x1e0
 __vsock_release+0x383/0x4a0
 vsock_release+0x90/0x120
 __sock_release+0xa3/0x250
 sock_close+0x14/0x20
 __fput+0x359/0xa80
 task_work_run+0x107/0x1d0
 do_exit+0x847/0x2560
 do_group_exit+0xb8/0x250
 __x64_sys_exit_group+0x3a/0x50
 x64_sys_call+0xfec/0x14f0
 do_syscall_64+0x93/0x1b0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Fixes: c0cfa2d8a788 ("vsock: add multi-transports support")
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://patch.msgid.link/20250128-vsock-transport-vs-autobind-v3-1-1cf57065b770@rbox.co
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agoMerge tag 'soundwire-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul...
Linus Torvalds [Wed, 29 Jan 2025 22:38:19 +0000 (14:38 -0800)] 
Merge tag 'soundwire-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire

Pull soundwire updates from Vinod Koul:

 - SoundWire multi lane support to use multiple lanes if supported

 - Stream handling of DEPREPARED state

 - AMD wake register programming for power off mode

* tag 'soundwire-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
  soundwire: amd: clear wake enable register for power off mode
  soundwire: generic_bandwidth_allocation: count the bandwidth of active streams only
  SoundWire: pass stream to compute_params()
  soundwire: generic_bandwidth_allocation: add lane in sdw_group_params
  soundwire: generic_bandwidth_allocation: select data lane
  soundwire: generic_bandwidth_allocation: check required freq accurately
  soundwire: generic_bandwidth_allocation: correct clk_freq check in sdw_select_row_col
  Soundwire: generic_bandwidth_allocation: set frame shape on fly
  Soundwire: stream: program BUSCLOCK_SCALE
  Soundwire: add sdw_slave_get_scale_index helper
  soundwire: generic_bandwidth_allocation: skip DEPREPARED streams
  soundwire: stream: set DEPREPARED state earlier
  soundwire: add lane_used_bandwidth in struct sdw_bus
  soundwire: mipi_disco: read lane mapping properties from ACPI
  soundwire: add lane field in sdw_port_runtime
  soundwire: bus: Move irq mapping cleanup into devres

4 months agoMerge tag 'phy-for-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy
Linus Torvalds [Wed, 29 Jan 2025 22:32:38 +0000 (14:32 -0800)] 
Merge tag 'phy-for-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy

Pull phy updates from Vinod Koul:
 "Lots of Qualcomm and Rockchip device support.

  New Support:
   - Qualcomm SAR2130P qmp usb, SAR2130P qmp pcie, QCS615 qusb2 and
     PCIe, IPQ5424 qmp pcie, IPQ5424 QUSB2 and USB3 PHY
   - Rockchip rk3576 combo phy support

  Updates:
   - Drop Shengyang for JH7110 maintainer
   - Freescale hdmi register calculation optimization
   - Rockchip pcie phy mutex and regmap updates"

* tag 'phy-for-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: (37 commits)
  dt-bindings: phy: qcom,qmp-pcie: document the SM8350 two lanes PCIe PHY
  phy: rockchip: phy-rockchip-typec: Fix Copyright description
  dt-bindings: phy: qcom,ipq8074-qmp-pcie: Document the IPQ5424 QMP PCIe PHYs
  phy: qcom-qusb2: Add support for QCS615
  dt-bindings: usb: qcom,dwc3: Add QCS615 to USB DWC3 bindings
  phy: core: Simplify API of_phy_simple_xlate() implementation
  phy: sun4i-usb: Remove unused of_gpio.h
  phy: HiSilicon: Don't use "proxy" headers
  phy: samsung-ufs: switch back to syscon_regmap_lookup_by_phandle()
  phy: qualcomm: qmp-pcie: add support for SAR2130P
  phy: qualcomm: qmp-pcie: define several new registers
  phy: qualcomm: qmp-pcie: split PCS_LANE1 region
  phy: qualcomm: qmp-combo: add support for SAR2130P
  dt-bindings: phy: qcom,sc8280xp-qmp-pcie-phy: Add SAR2130P compatible
  dt-bindings: phy: qcom,sc8280xp-qmp-usb43dp: Add SAR2130P compatible
  phy: freescale: fsl-samsung-hdmi: Clean up fld_tg_code calculation
  phy: freescale: fsl-samsung-hdmi: Stop searching when exact match is found
  phy: freescale: fsl-samsung-hdmi: Expand Integer divider range
  phy: rockchip-naneng-combo: add rk3576 support
  dt-bindings: phy: rockchip: add rk3576 compatible
  ...

4 months agoMerge tag 'dmaengine-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul...
Linus Torvalds [Wed, 29 Jan 2025 22:29:57 +0000 (14:29 -0800)] 
Merge tag 'dmaengine-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine

Pull dmaengine updates from Vinod Koul:
 "A bunch of new device support and updates to few drivers, biggest of
  them amd ones.

  New support:
   - TI J722S CSI BCDMA controller support
   - Intel idxd Panther Lake family platforms
   - Allwinner F1C100s suniv DMA
   - Qualcomm QCS615, QCS8300, SM8750, SA8775P GPI dma controller support
   - AMD ae4dma controller support and reorganisation of amd driver

  Updates:
   - Channel page support for Nvidia Tegra210 adma driver
   - Freescale support for S32G based platforms
   - Yamilfy atmel dma bindings"

* tag 'dmaengine-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (45 commits)
  dmaengine: idxd: Enable Function Level Reset (FLR) for halt
  dmaengine: idxd: Refactor halt handler
  dmaengine: idxd: Add idxd_device_config_save() and idxd_device_config_restore() helpers
  dmaengine: idxd: Binding and unbinding IDXD device and driver
  dmaengine: idxd: Add idxd_pci_probe_alloc() helper
  dt-bindings: dma: atmel: Convert to json schema
  dt-bindings: dma: st-stm32-dmamux: Add description for dma-cell values
  dmaengine: qcom: gpi: Add GPI immediate DMA support for SPI protocol
  dt-bindings: dma: adi,axi-dmac: deprecate adi,channels node
  dt-bindings: dma: adi,axi-dmac: convert to yaml schema
  dmaengine: mv_xor: switch to for_each_child_of_node_scoped()
  dmaengine: bcm2835-dma: Prevent suspend if DMA channel is busy
  dmaengine: tegra210-adma: Support channel page
  dt-bindings: dma: Support channel page to nvidia,tegra210-adma
  dmaengine: ti: k3-udma: Add support for J722S CSI BCDMA
  dt-bindings: dma: ti: k3-bcdma: Add J722S CSI BCDMA
  dmaengine: ti: edma: fix OF node reference leaks in edma_driver
  dmaengine: ti: edma: make the loop condition simpler in edma_probe()
  dmaengine: fsl-edma: read/write multiple registers in cyclic transactions
  dmaengine: fsl-edma: add support for S32G based platforms
  ...

4 months agoBluetooth: L2CAP: accept zero as a special value for MTU auto-selection
Fedor Pchelkin [Tue, 28 Jan 2025 21:08:14 +0000 (00:08 +0300)] 
Bluetooth: L2CAP: accept zero as a special value for MTU auto-selection

One of the possible ways to enable the input MTU auto-selection for L2CAP
connections is supposed to be through passing a special "0" value for it
as a socket option. Commit [1] added one of those into avdtp. However, it
simply wouldn't work because the kernel still treats the specified value
as invalid and denies the setting attempt. Recorded BlueZ logs include the
following:

  bluetoothd[496]: profiles/audio/avdtp.c:l2cap_connect() setsockopt(L2CAP_OPTIONS): Invalid argument (22)

[1]: https://github.com/bluez/bluez/commit/ae5be371a9f53fed33d2b34748a95a5498fd4b77

Found by Linux Verification Center (linuxtesting.org).

Fixes: 4b6e228e297b ("Bluetooth: Auto tune if input MTU is set to 0")
Cc: stable@vger.kernel.org
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 months agoBluetooth: btnxpuart: Fix glitches seen in dual A2DP streaming
Neeraj Sanjay Kale [Mon, 20 Jan 2025 14:19:46 +0000 (19:49 +0530)] 
Bluetooth: btnxpuart: Fix glitches seen in dual A2DP streaming

This fixes a regression caused by previous commit for fixing truncated
ACL data, which is causing some intermittent glitches when running two
A2DP streams.

serdev_device_write_buf() is the root cause of the glitch, which is
reverted, and the TX work will continue to write until the queue is empty.

This change fixes both issues. No A2DP streaming glitches or truncated
ACL data issue observed.

Fixes: 8023dd220425 ("Bluetooth: btnxpuart: Fix driver sending truncated data")
Fixes: 689ca16e5232 ("Bluetooth: NXP: Add protocol support for NXP Bluetooth chipsets")
Signed-off-by: Neeraj Sanjay Kale <neeraj.sanjaykale@nxp.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 months agoBluetooth: Add ABI doc for sysfs reset
Hsin-chen Chuang [Mon, 20 Jan 2025 08:47:36 +0000 (16:47 +0800)] 
Bluetooth: Add ABI doc for sysfs reset

The functionality was implemented in commit 0f8a00137411 ("Bluetooth:
Allow reset via sysfs")

Fixes: 0f8a00137411 ("Bluetooth: Allow reset via sysfs")
Signed-off-by: Hsin-chen Chuang <chharry@chromium.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 months agoBluetooth: Fix possible infinite recursion of btusb_reset
Hsin-chen Chuang [Mon, 20 Jan 2025 10:39:39 +0000 (18:39 +0800)] 
Bluetooth: Fix possible infinite recursion of btusb_reset

The function enters infinite recursion if the HCI device doesn't support
GPIO reset: btusb_reset -> hdev->reset -> vendor_reset -> btusb_reset...

btusb_reset shouldn't call hdev->reset after commit f07d478090b0
("Bluetooth: Get rid of cmd_timeout and use the reset callback")

Fixes: f07d478090b0 ("Bluetooth: Get rid of cmd_timeout and use the reset callback")
Signed-off-by: Hsin-chen Chuang <chharry@chromium.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 months agoBluetooth: btusb: mediatek: Add locks for usb_driver_claim_interface()
Douglas Anderson [Thu, 16 Jan 2025 03:36:36 +0000 (19:36 -0800)] 
Bluetooth: btusb: mediatek: Add locks for usb_driver_claim_interface()

The documentation for usb_driver_claim_interface() says that "the
device lock" is needed when the function is called from places other
than probe(). This appears to be the lock for the USB interface
device. The Mediatek btusb code gets called via this path:

  Workqueue: hci0 hci_power_on [bluetooth]
  Call trace:
   usb_driver_claim_interface
   btusb_mtk_claim_iso_intf
   btusb_mtk_setup
   hci_dev_open_sync
   hci_power_on
   process_scheduled_works
   worker_thread
   kthread

With the above call trace the device lock hasn't been claimed. Claim
it.

Without this fix, we'd sometimes see the error "Failed to claim iso
interface". Sometimes we'd even see worse errors, like a NULL pointer
dereference (where `intf->dev.driver` was NULL) with a trace like:

  Call trace:
   usb_suspend_both
   usb_runtime_suspend
   __rpm_callback
   rpm_suspend
   pm_runtime_work
   process_scheduled_works

Both errors appear to be fixed with the proper locking.

Fixes: ceac1cb0259d ("Bluetooth: btusb: mediatek: add ISO data transmission functions")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 months agoMerge tag 'regulator-fix-v6.14-merge-window' of git://git.kernel.org/pub/scm/linux...
Linus Torvalds [Wed, 29 Jan 2025 19:56:55 +0000 (11:56 -0800)] 
Merge tag 'regulator-fix-v6.14-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fixes from Mark Brown:
 "A couple of fixes that have come in during the merge window: one that
  operates the TPS6287x devices more within the design spec and can
  prevent current surges when changing voltages and another more trivial
  one for error message formatting"

* tag 'regulator-fix-v6.14-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: core: Add missing newline character
  regulator: TPS6287X: Use min/max uV to get VRANGE

4 months agoMerge tag 'for-linus-6.14-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Wed, 29 Jan 2025 19:39:20 +0000 (11:39 -0800)] 
Merge tag 'for-linus-6.14-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen updates from Juergen Gross:
 "Three minor fixes"

* tag 'for-linus-6.14-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen: update pvcalls_front_accept prototype
  Grab mm lock before grabbing pt lock
  xen: pcpu: remove unnecessary __ref annotation

4 months agoMerge tag 'cxl-for-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Linus Torvalds [Wed, 29 Jan 2025 19:23:22 +0000 (11:23 -0800)] 
Merge tag 'cxl-for-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull Compute Express Link (CXL) updates from Dave Jiang:
 "A tweak to the HMAT output that was acked by Rafael, a prep patch for
  CXL type2 devices support that's coming soon, refactoring of the CXL
  regblock enumeration code, and a series of patches to update the event
  records to CXL spec r3.1:

   - Move HMAT printouts to pr_debug()

   - Add CXL type2 support to cxl_dvsec_rr_decode() in preparation for
     type2 support

   - A series that updates CXL event records to spec r3.1 and related
     changes

   - Refactoring of cxl_find_regblock_instance() to count regblocks"

* tag 'cxl-for-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
  cxl/core/regs: Refactor out functions to count regblocks of given type
  cxl/test: Update test code for event records to CXL spec rev 3.1
  cxl/events: Update Memory Module Event Record to CXL spec rev 3.1
  cxl/events: Update DRAM Event Record to CXL spec rev 3.1
  cxl/events: Update General Media Event Record to CXL spec rev 3.1
  cxl/events: Add Component Identifier formatting for CXL spec rev 3.1
  cxl/events: Update Common Event Record to CXL spec rev 3.1
  cxl/pci: Add CXL Type 1/2 support to cxl_dvsec_rr_decode()
  ACPI/HMAT: Move HMAT messages to pr_debug()

4 months agoMerge tag 'powerpc-6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Wed, 29 Jan 2025 18:55:04 +0000 (10:55 -0800)] 
Merge tag 'powerpc-6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Madhavan Srinivasan:

 - Fix to handle PE state in pseries_eeh_get_state()

 - Handle unset of tce window if it was never set

Thanks to Narayana Murty N, Ritesh Harjani (IBM), Shivaprasad G Bhat,
and Vaishnavi Bhat.

* tag 'powerpc-6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/pseries/iommu: Don't unset window if it was never set
  powerpc/pseries/eeh: Fix get PE state translation

4 months agoMerge tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers...
Linus Torvalds [Wed, 29 Jan 2025 18:50:28 +0000 (10:50 -0800)] 
Merge tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux

Pull CRC cleanups from Eric Biggers:
 "Simplify the kconfig options for controlling which CRC implementations
  are built into the kernel, as was requested by Linus.

  This means making the option to disable the arch code visible only
  when CONFIG_EXPERT=y, and standardizing on a single generic
  implementation of CRC32"

* tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
  lib/crc32: remove other generic implementations
  lib/crc: simplify the kconfig options for CRC implementations

4 months agoMerge tag 'constfy-sysctl-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 29 Jan 2025 18:35:40 +0000 (10:35 -0800)] 
Merge tag 'constfy-sysctl-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl

Pull sysctl table constification from Joel Granados:
 "All ctl_table declared outside of functions and that remain unmodified
  after initialization are const qualified.

  This prevents unintended modifications to proc_handler function
  pointers by placing them in the .rodata section.

  This is a continuation of the tree-wide effort started a few releases
  ago with the constification of the ctl_table struct arguments in the
  sysctl API done in 78eb4ea25cd5 ("sysctl: treewide: constify the
  ctl_table argument of proc_handlers")"

* tag 'constfy-sysctl-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl:
  treewide: const qualify ctl_tables where applicable

4 months agoMerge tag 'fuse-update-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/mszered...
Linus Torvalds [Wed, 29 Jan 2025 17:40:23 +0000 (09:40 -0800)] 
Merge tag 'fuse-update-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse

Pull fuse updates from Miklos Szeredi:
 "Add support for io-uring communication between kernel and userspace
  using IORING_OP_URING_CMD (Bernd Schubert). Following features enable
  gains in performance compared to the regular interface:

   - Allow processing multiple requests with less syscall overhead

   - Combine commit of old and fetch of new fuse request

   - CPU/NUMA affinity of queues

  Patches were reviewed by several people, including Pavel Begunkov,
  io-uring co-maintainer"

* tag 'fuse-update-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: prevent disabling io-uring on active connections
  fuse: enable fuse-over-io-uring
  fuse: block request allocation until io-uring init is complete
  fuse: {io-uring} Prevent mount point hang on fuse-server termination
  fuse: Allow to queue bg requests through io-uring
  fuse: Allow to queue fg requests through io-uring
  fuse: {io-uring} Make fuse_dev_queue_{interrupt,forget} non-static
  fuse: {io-uring} Handle teardown of ring entries
  fuse: Add io-uring sqe commit and fetch support
  fuse: {io-uring} Make hash-list req unique finding functions non-static
  fuse: Add fuse-io-uring handling into fuse_copy
  fuse: Make fuse_copy non static
  fuse: {io-uring} Handle SQEs - register commands
  fuse: make args->in_args[0] to be always the header
  fuse: Add fuse-io-uring design documentation
  fuse: Move request bits
  fuse: Move fuse_get_dev to header file
  fuse: rename to fuse_dev_end_requests and make non-static

4 months agolib/crc32: remove other generic implementations
Eric Biggers [Thu, 23 Jan 2025 21:29:04 +0000 (13:29 -0800)] 
lib/crc32: remove other generic implementations

Now that we've standardized on the byte-by-byte implementation of CRC32
as the only generic implementation (see previous commit for the
rationale), remove the code for the other implementations.

Tested with crc_kunit.

Link: https://lore.kernel.org/r/20250123212904.118683-3-ebiggers@kernel.org
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
4 months agolib/crc: simplify the kconfig options for CRC implementations
Eric Biggers [Thu, 23 Jan 2025 21:29:03 +0000 (13:29 -0800)] 
lib/crc: simplify the kconfig options for CRC implementations

Make the following simplifications to the kconfig options for choosing
CRC implementations for CRC32 and CRC_T10DIF:

1. Make the option to disable the arch-optimized code be visible only
   when CONFIG_EXPERT=y.
2. Make a single option control the inclusion of the arch-optimized code
   for all enabled CRC variants.
3. Make CRC32_SARWATE (a.k.a. slice-by-1 or byte-by-byte) be the only
   generic CRC32 implementation.

The result is there is now just one option, CRC_OPTIMIZATIONS, which is
default y and can be disabled only when CONFIG_EXPERT=y.

Rationale:

1. Enabling the arch-optimized code is nearly always the right choice.
   However, people trying to build the tiniest kernel possible would
   find some use in disabling it.  Anything we add to CRC32 is de facto
   unconditional, given that CRC32 gets selected by something in nearly
   all kernels.  And unfortunately enabling the arch CRC code does not
   eliminate the need to build the generic CRC code into the kernel too,
   due to CPU feature dependencies.  The size of the arch CRC code will
   also increase slightly over time as more CRC variants get added and
   more implementations targeting different instruction set extensions
   get added.  Thus, it seems worthwhile to still provide an option to
   disable it, but it should be considered an expert-level tweak.

2. Considering the use case described in (1), there doesn't seem to be
   sufficient value in making the arch-optimized CRC code be
   independently configurable for different CRC variants.  Note also
   that multiple variants were already grouped together, e.g.
   CONFIG_CRC32 actually enables three different variants of CRC32.

3. The bit-by-bit implementation is uselessly slow, whereas slice-by-n
   for n=4 and n=8 use tables that are inconveniently large: 4096 bytes
   and 8192 bytes respectively, compared to 1024 bytes for n=1.  Higher
   n gives higher instruction-level parallelism, so higher n easily wins
   on traditional microbenchmarks on most CPUs.  However, the larger
   tables, which are accessed randomly, can be harmful in real-world
   situations where the dcache may be cold or useful data may need be
   evicted from the dcache.  Meanwhile, today most architectures have
   much faster CRC32 implementations using dedicated CRC32 instructions
   or carryless multiplication instructions anyway, which make the
   generic code obsolete in most cases especially on long messages.

   Another reason for going with n=1 is that this is already what is
   used by all the other CRC variants in the kernel.  CRC32 was unique
   in having support for larger tables.  But as per the above this can
   be considered an outdated optimization.

   The standardization on slice-by-1 a.k.a. CRC32_SARWATE makes much of
   the code in lib/crc32.c unused.  A later patch will clean that up.

Link: https://lore.kernel.org/r/20250123212904.118683-2-ebiggers@kernel.org
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
4 months agofs: pack struct kstat better
Christoph Hellwig [Wed, 29 Jan 2025 06:37:57 +0000 (07:37 +0100)] 
fs: pack struct kstat better

Move the change_cookie and subvol up to avoid two 4 byte holes.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 months agos390/tracing: Define ftrace_get_symaddr() for s390
Masami Hiramatsu (Google) [Tue, 28 Jan 2025 15:29:48 +0000 (00:29 +0900)] 
s390/tracing: Define ftrace_get_symaddr() for s390

Add ftrace_get_symaddr() for s390, which returns the symbol address
from ftrace's 'ip' parameter.

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Link: https://lore.kernel.org/r/173807818869.1854334.15474589105952793986.stgit@devnote2
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
4 months agos390/fgraph: Fix to remove ftrace_test_recursion_trylock()
Masami Hiramatsu (Google) [Tue, 28 Jan 2025 15:29:37 +0000 (00:29 +0900)] 
s390/fgraph: Fix to remove ftrace_test_recursion_trylock()

Fix to remove ftrace_test_recursion_trylock() from ftrace_graph_func()
because commit d576aec24df9 ("fgraph: Get ftrace recursion lock in
function_graph_enter") has been moved it to function_graph_enter_regs()
already.

Reported-by: Jiri Olsa <olsajiri@gmail.com>
Closes: https://lore.kernel.org/all/Z5O0shrdgeExZ2kF@krava/
Fixes: d576aec24df9 ("fgraph: Get ftrace recursion lock in function_graph_enter")
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Link: https://lore.kernel.org/r/173807817692.1854334.2985776940754607459.stgit@devnote2
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
4 months agos390/vmlogrdr: Use array instead of string initializer
Heiko Carstens [Fri, 24 Jan 2025 13:51:54 +0000 (14:51 +0100)] 
s390/vmlogrdr: Use array instead of string initializer

Compiling vmlogrdr with GCC 15 generates this warning:

  CC [M]  drivers/s390/char/vmlogrdr.o
drivers/s390/char/vmlogrdr.c:126:29: error: initializer-string for array
 of ‘char’ is too long [-Werror=unterminated-string-initialization]
  126 |         { .system_service = "*LOGREC ",

Given that the system_service array intentionally contains a non-null
terminated string use an array initializer, instead of string
initializer to get rid of this warning.

Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
4 months agos390/vmlogrdr: Use internal_name for error messages
Heiko Carstens [Fri, 24 Jan 2025 13:51:53 +0000 (14:51 +0100)] 
s390/vmlogrdr: Use internal_name for error messages

Use the internal_name member of vmlogrdr_priv_t to print error messages
instead of the system_service member. The system_service member is not a
string, but a non-null terminated eight byte character array, which
contains the ASCII representation of a z/VM system service.

Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
4 months agoMerge tag 'x86-urgent-2025-01-28' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Tue, 28 Jan 2025 22:32:03 +0000 (14:32 -0800)] 
Merge tag 'x86-urgent-2025-01-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fix from Ingo Molnar:
 "Fix a potential early boot crash in SEV-SNP guests, where certain
  config and build environment combinations can generate absolute
  references to symbols in the early boot code"

* tag 'x86-urgent-2025-01-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/sev: Disable jump tables in SEV startup code

4 months agoMerge tag 'nfs-for-6.14-1' of git://git.linux-nfs.org/projects/anna/linux-nfs
Linus Torvalds [Tue, 28 Jan 2025 22:23:46 +0000 (14:23 -0800)] 
Merge tag 'nfs-for-6.14-1' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client updates from Anna Schumaker:
 "New Features:
   - Enable using direct IO with localio
   - Added localio related tracepoints

  Bugfixes:
   - Sunrpc fixes for working with a very large cl_tasks list
   - Fix a possible buffer overflow in nfs_sysfs_link_rpc_client()
   - Fixes for handling reconnections with localio
   - Fix how the NFS_FSCACHE kconfig option interacts with NETFS_SUPPORT
   - Fix COPY_NOTIFY xdr_buf size calculations
   - pNFS/Flexfiles fix for retrying requesting a layout segment for
     reads
   - Sunrpc fix for retrying on EKEYEXPIRED error when the TGT is
     expired

  Cleanups:
   - Various other nfs & nfsd localio cleanups
   - Prepratory patches for async copy improvements that are under
     development
   - Make OFFLOAD_CANCEL, LAYOUTSTATS, and LAYOUTERR moveable to other
     xprts
   - Add netns inum and srcaddr to debugfs rpc_xprt info"

* tag 'nfs-for-6.14-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (28 commits)
  SUNRPC: do not retry on EKEYEXPIRED when user TGT ticket expired
  sunrpc: add netns inum and srcaddr to debugfs rpc_xprt info
  pnfs/flexfiles: retry getting layout segment for reads
  NFSv4.2: make LAYOUTSTATS and LAYOUTERROR MOVEABLE
  NFSv4.2: mark OFFLOAD_CANCEL MOVEABLE
  NFSv4.2: fix COPY_NOTIFY xdr buf size calculation
  NFS: Rename struct nfs4_offloadcancel_data
  NFS: Fix typo in OFFLOAD_CANCEL comment
  NFS: CB_OFFLOAD can return NFS4ERR_DELAY
  nfs: Make NFS_FSCACHE select NETFS_SUPPORT instead of depending on it
  nfs: fix incorrect error handling in LOCALIO
  nfs: probe for LOCALIO when v3 client reconnects to server
  nfs: probe for LOCALIO when v4 client reconnects to server
  nfs/localio: remove redundant code and simplify LOCALIO enablement
  nfs_common: add nfs_localio trace events
  nfs_common: track all open nfsd_files per LOCALIO nfs_client
  nfs_common: rename nfslocalio nfs_uuid_lock to nfs_uuids_lock
  nfsd: nfsd_file_acquire_local no longer returns GC'd nfsd_file
  nfsd: rename nfsd_serv_ prefixed methods and variables with nfsd_net_
  nfsd: update percpu_ref to manage references on nfsd_net
  ...

4 months agoMerge tag 'vfio-v6.14-rc1' of https://github.com/awilliam/linux-vfio
Linus Torvalds [Tue, 28 Jan 2025 22:16:46 +0000 (14:16 -0800)] 
Merge tag 'vfio-v6.14-rc1' of https://github.com/awilliam/linux-vfio

Pull vfio updates from Alex Williamson:

 - Extend vfio-pci 8-byte read/write support to include archs defining
   CONFIG_GENERIC_IOMAP, such as x86, and remove now extraneous #ifdefs
   around 64-bit accessors (Ramesh Thomas)

 - Update vfio-pci shadow ROM handling and allow cached ROM from setup
   data to be exposed as a functional ROM BAR region when available
   (Yunxiang Li)

 - Update nvgrace-gpu vfio-pci variant driver for new Grace Blackwell
   hardware, conditionalizing the uncached BAR workaround for previous
   generation hardware based on the presence of a flag in a new DVSEC
   capability, and include a delay during probe for link training to
   complete, a new requirement for GB devices (Ankit Agrawal)

* tag 'vfio-v6.14-rc1' of https://github.com/awilliam/linux-vfio:
  vfio/nvgrace-gpu: Add GB200 SKU to the devid table
  vfio/nvgrace-gpu: Check the HBM training and C2C link status
  vfio/nvgrace-gpu: Expose the blackwell device PF BAR1 to the VM
  vfio/nvgrace-gpu: Read dvsec register to determine need for uncached resmem
  vfio/platform: check the bounds of read/write syscalls
  vfio/pci: Expose setup ROM at ROM bar when needed
  vfio/pci: Remove shadow ROM specific code paths
  vfio/pci: Remove #ifdef iowrite64 and #ifdef ioread64
  vfio/pci: Enable iowrite64 and ioread64 for vfio pci

4 months agox86/sev: Disable jump tables in SEV startup code
Ard Biesheuvel [Mon, 27 Jan 2025 11:43:37 +0000 (12:43 +0100)] 
x86/sev: Disable jump tables in SEV startup code

When retpolines and IBT are both disabled, the compiler is free to use
jump tables to optimize switch instructions. However, these are emitted
by Clang as absolute references into .rodata:

        jmp    *-0x7dfffe90(,%r9,8)
                        R_X86_64_32S    .rodata+0x170

Given that this code will execute before that address in .rodata has even
been mapped, it is guaranteed to crash a SEV-SNP guest in a way that is
difficult to diagnose.

So disable jump tables when building this code. It would be better if we
could attach this annotation to the __head macro but this appears to be
impossible.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Tested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250127114334.1045857-6-ardb+git@google.com
4 months agodocs: power: Fix footnote reference for Toshiba Satellite P10-554
Bagas Sanjaya [Wed, 22 Jan 2025 14:34:56 +0000 (21:34 +0700)] 
docs: power: Fix footnote reference for Toshiba Satellite P10-554

Sphinx reports unreferenced footnote warning on "Video issues with S3
resume" doc:

Documentation/power/video.rst:213: WARNING: Footnote [#] is not referenced. [ref.footnote]

Fix the warning by separating footnote reference for Toshiba Satellite
P10-554 by a space.

Fixes: 151f4e2bdc7a ("docs: power: convert docs to ReST and rename to *.rst")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Closes: https://lore.kernel.org/linux-next/20250122170335.148a23b0@canb.auug.org.au/
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20250122143456.68867-4-bagasdotme@gmail.com
4 months agoDocumentation: ublk: Drop Stefan Hajnoczi's message footnote
Bagas Sanjaya [Wed, 22 Jan 2025 14:34:55 +0000 (21:34 +0700)] 
Documentation: ublk: Drop Stefan Hajnoczi's message footnote

Sphinx reports unreferenced footnote warning pointing to ubd-control
message by Stefan Hajnoczi:

Documentation/block/ublk.rst:336: WARNING: Footnote [#] is not referenced. [ref.footnote]

Drop the footnote to squash above warning.

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Fixes: 4093cb5a0634 ("ublk_drv: add mechanism for supporting unprivileged ublk device")
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20250122143456.68867-3-bagasdotme@gmail.com
4 months agoMerge tag 'driver-core-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Tue, 28 Jan 2025 20:25:12 +0000 (12:25 -0800)] 
Merge tag 'driver-core-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core and debugfs updates from Greg KH:
 "Here is the big set of driver core and debugfs updates for 6.14-rc1.

  Included in here is a bunch of driver core, PCI, OF, and platform rust
  bindings (all acked by the different subsystem maintainers), hence the
  merge conflict with the rust tree, and some driver core api updates to
  mark things as const, which will also require some fixups due to new
  stuff coming in through other trees in this merge window.

  There are also a bunch of debugfs updates from Al, and there is at
  least one user that does have a regression with these, but Al is
  working on tracking down the fix for it. In my use (and everyone
  else's linux-next use), it does not seem like a big issue at the
  moment.

  Here's a short list of the things in here:

   - driver core rust bindings for PCI, platform, OF, and some i/o
     functions.

     We are almost at the "write a real driver in rust" stage now,
     depending on what you want to do.

   - misc device rust bindings and a sample driver to show how to use
     them

   - debugfs cleanups in the fs as well as the users of the fs api for
     places where drivers got it wrong or were unnecessarily doing
     things in complex ways.

   - driver core const work, making more of the api take const * for
     different parameters to make the rust bindings easier overall.

   - other small fixes and updates

  All of these have been in linux-next with all of the aforementioned
  merge conflicts, and the one debugfs issue, which looks to be resolved
  "soon""

* tag 'driver-core-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (95 commits)
  rust: device: Use as_char_ptr() to avoid explicit cast
  rust: device: Replace CString with CStr in property_present()
  devcoredump: Constify 'struct bin_attribute'
  devcoredump: Define 'struct bin_attribute' through macro
  rust: device: Add property_present()
  saner replacement for debugfs_rename()
  orangefs-debugfs: don't mess with ->d_name
  octeontx2: don't mess with ->d_parent or ->d_parent->d_name
  arm_scmi: don't mess with ->d_parent->d_name
  slub: don't mess with ->d_name
  sof-client-ipc-flood-test: don't mess with ->d_name
  qat: don't mess with ->d_name
  xhci: don't mess with ->d_iname
  mtu3: don't mess wiht ->d_iname
  greybus/camera - stop messing with ->d_iname
  mediatek: stop messing with ->d_iname
  netdevsim: don't embed file_operations into your structs
  b43legacy: make use of debugfs_get_aux()
  b43: stop embedding struct file_operations into their objects
  carl9170: stop embedding file_operations into their objects
  ...

4 months agoMerge tag 'stop-machine.2025.01.28a' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 28 Jan 2025 19:35:58 +0000 (11:35 -0800)] 
Merge tag 'stop-machine.2025.01.28a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu

Pull stop_machine update from Paul McKenney:
 "Move a misplaced call to rcu_momentary_eqs() from multi_cpu_stop() to
  ensure that interrupts are disabled as required"

* tag 'stop-machine.2025.01.28a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
  stop_machine: Fix rcu_momentary_eqs() call in multi_cpu_stop()

4 months agoMerge tag 'csd-lock.2025.01.28a' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Tue, 28 Jan 2025 19:34:03 +0000 (11:34 -0800)] 
Merge tag 'csd-lock.2025.01.28a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu

Pull CSD-lock update from Paul McKenney:
 "Allow runtime modification of the csd_lock_timeout and
  panic_on_ipistall module parameters"

* tag 'csd-lock.2025.01.28a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
  locking/csd-lock: make CSD lock debug tunables writable in /sys

4 months agoMerge tag 'bootconfig-fixes-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Tue, 28 Jan 2025 18:11:33 +0000 (10:11 -0800)] 
Merge tag 'bootconfig-fixes-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull bootconfig fix from Masami Hiramatsu:

 - Fix wrong format specifier: use '%u' for unsigned int

* tag 'bootconfig-fixes-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tools/bootconfig: Fix the wrong format specifier

4 months agoMerge tag 'tty-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Linus Torvalds [Tue, 28 Jan 2025 17:55:04 +0000 (09:55 -0800)] 
Merge tag 'tty-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty / serial driver updates from Greg KH:
 "Here is the tty/serial driver set of changes for 6.14-rc1. Nothing
  major in here, it was delayed a bit due to a regression found in
  linux-next which has now been reverted and verified that it is fixed.

  Other than the reverts, highlights include:

   - 8250 work to get the nbcon mode working (partially reverted)

   - altera_jtaguart minor fixes

   - fsl_lpuart minor updates

   - sh-sci driver minor updatesa

   - other tiny driver updates and cleanups

  All of these have been in linux-next for a while, and now with no
  reports of problems (thanks to the reverts)"

* tag 'tty-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (44 commits)
  Revert "serial: 8250: Switch to nbcon console"
  Revert "serial: 8250: Revert "drop lockdep annotation from serial8250_clear_IER()""
  serial: sh-sci: Increment the runtime usage counter for the earlycon device
  serial: sh-sci: Clean sci_ports[0] after at earlycon exit
  serial: sh-sci: Do not probe the serial port if its slot in sci_ports[] is in use
  serial: sh-sci: Move runtime PM enable to sci_probe_single()
  serial: sh-sci: Drop __initdata macro for port_cfg
  serial: kgdb_nmi: Remove unused knock code
  tty: Permit some TIOCL_SETSEL modes without CAP_SYS_ADMIN
  tty: xilinx_uartps: split sysrq handling
  serial: 8250: Revert "drop lockdep annotation from serial8250_clear_IER()"
  serial: 8250: Switch to nbcon console
  serial: 8250: Provide flag for IER toggling for RS485
  serial: 8250: Use high-level writing function for FIFO
  serial: 8250: Use frame time to determine timeout
  serial: 8250: Adjust the timeout for FIFO mode
  tty: atmel_serial: Use of_property_present() for non-boolean properties
  serial: sc16is7xx: Add polling mode if no IRQ pin is available
  dt-bindings: serial: sc16is7xx: Add description for polling mode
  tty: serial: atmel: make it selectable for ARCH_LAN969X
  ...

4 months agoMerge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64...
Linus Torvalds [Tue, 28 Jan 2025 17:01:36 +0000 (09:01 -0800)] 
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull KVM/arm64 updates from Will Deacon:
 "New features:

   - Support for non-protected guest in protected mode, achieving near
     feature parity with the non-protected mode

   - Support for the EL2 timers as part of the ongoing NV support

   - Allow control of hardware tracing for nVHE/hVHE

  Improvements, fixes and cleanups:

   - Massive cleanup of the debug infrastructure, making it a bit less
     awkward and definitely easier to maintain. This should pave the way
     for further optimisations

   - Complete rewrite of pKVM's fixed-feature infrastructure, aligning
     it with the rest of KVM and making the code easier to follow

   - Large simplification of pKVM's memory protection infrastructure

   - Better handling of RES0/RES1 fields for memory-backed system
     registers

   - Add a workaround for Qualcomm's Snapdragon X CPUs, which suffer
     from a pretty nasty timer bug

   - Small collection of cleanups and low-impact fixes"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (87 commits)
  arm64/sysreg: Get rid of TRFCR_ELx SysregFields
  KVM: arm64: nv: Fix doc header layout for timers
  KVM: arm64: nv: Apply RESx settings to sysreg reset values
  KVM: arm64: nv: Always evaluate HCR_EL2 using sanitising accessors
  KVM: arm64: Fix selftests after sysreg field name update
  coresight: Pass guest TRFCR value to KVM
  KVM: arm64: Support trace filtering for guests
  KVM: arm64: coresight: Give TRBE enabled state to KVM
  coresight: trbe: Remove redundant disable call
  arm64/sysreg/tools: Move TRFCR definitions to sysreg
  tools: arm64: Update sysreg.h header files
  KVM: arm64: Drop pkvm_mem_transition for host/hyp donations
  KVM: arm64: Drop pkvm_mem_transition for host/hyp sharing
  KVM: arm64: Drop pkvm_mem_transition for FF-A
  KVM: arm64: Explicitly handle BRBE traps as UNDEFINED
  KVM: arm64: vgic: Use str_enabled_disabled() in vgic_v3_probe()
  arm64: kvm: Introduce nvhe stack size constants
  KVM: arm64: Fix nVHE stacktrace VA bits mask
  KVM: arm64: Fix FEAT_MTE in pKVM
  Documentation: Update the behaviour of "kvm-arm.mode"
  ...

4 months agoMerge tag 'loongarch-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuaca...
Linus Torvalds [Tue, 28 Jan 2025 16:52:01 +0000 (08:52 -0800)] 
Merge tag 'loongarch-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson

Pull LoongArch updates from Huacai Chen:

 - Migrate to the generic rule for built-in DTB

 - Disable FIX_EARLYCON_MEM when ARCH_IOREMAP is enabled

 - Derive timer max_delta from PRCFG1's timer_bits

 - Correct the cacheinfo sharing information

 - Add pgprot_nx() implementation

 - Add debugfs entries to switch SFB/TSO state

 - Change the maximum number of watchpoints

 - Some bug fixes and other small changes

* tag 'loongarch-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  LoongArch: Extend the maximum number of watchpoints
  LoongArch: Change 8 to 14 for LOONGARCH_MAX_{BRP,WRP}
  LoongArch: Add debugfs entries to switch SFB/TSO state
  LoongArch: Fix warnings during S3 suspend
  LoongArch: Adjust SETUP_SLEEP and SETUP_WAKEUP
  LoongArch: Refactor bug_handler() implementation
  LoongArch: Add pgprot_nx() implementation
  LoongArch: Correct the __switch_to() prototype in comments
  LoongArch: Correct the cacheinfo sharing information
  LoongArch: Derive timer max_delta from PRCFG1's timer_bits
  LoongArch: Disable FIX_EARLYCON_MEM when ARCH_IOREMAP is enabled
  LoongArch: Migrate to the generic rule for built-in DTB

4 months agos390/sclp: Initialize sclp subsystem via arch_cpu_finalize_init()
Heiko Carstens [Mon, 20 Jan 2025 10:53:42 +0000 (11:53 +0100)] 
s390/sclp: Initialize sclp subsystem via arch_cpu_finalize_init()

With the switch to GENERIC_CPU_DEVICES an early call to the sclp subsystem
was added to smp_prepare_cpus(). This will usually succeed since the sclp
subsystem is implicitly initialized early enough if an sclp based console
is present.

If no such console is present the initialization happens with an
arch_initcall(); in such cases calls to the sclp subsystem will fail.
For CPU detection this means that the fallback sigp loop will be used
permanently to detect CPUs instead of the preferred READ_CPU_INFO sclp
request.

Fix this by adding an explicit early sclp_init() call via
arch_cpu_finalize_init().

Reported-by: Sheshu Ramanandan <sheshu.ramanandan@ibm.com>
Fixes: 4a39f12e753d ("s390/smp: Switch to GENERIC_CPU_DEVICES")
Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
4 months agoMerge tag 'sparc-for-6.14-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Tue, 28 Jan 2025 16:38:30 +0000 (08:38 -0800)] 
Merge tag 'sparc-for-6.14-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/alarsson/linux-sparc

Pull sparc updates from Andreas Larsson:

 - Improve performance for reading /proc/interrupts

 - Simplify irq code for sun4v

 - Replace zero-length array with flexible array in struct for pci for
   sparc64

* tag 'sparc-for-6.14-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/alarsson/linux-sparc:
  sparc/irq: Remove unneeded if check in sun4v_cookie_only_virqs()
  sparc/irq: Use str_enabled_disabled() helper function
  sparc: replace zero-length array with flexible-array member
  sparc/irq: use seq_put_decimal_ull_width() for decimal values

4 months agotools/bootconfig: Fix the wrong format specifier
Luo Yifan [Tue, 28 Jan 2025 14:27:01 +0000 (23:27 +0900)] 
tools/bootconfig: Fix the wrong format specifier

Use '%u' instead of '%d' for unsigned int.

Link: https://lore.kernel.org/all/20241105011048.201629-1-luoyifan@cmss.chinamobile.com/
Fixes: 973780011106 ("tools/bootconfig: Suppress non-error messages")
Signed-off-by: Luo Yifan <luoyifan@cmss.chinamobile.com>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
4 months agos390/tools: Use array instead of string initializer
Heiko Carstens [Fri, 24 Jan 2025 13:51:52 +0000 (14:51 +0100)] 
s390/tools: Use array instead of string initializer

The in-kernel disassembler intentionally uses nun-null terminated
strings in order to keep the arrays which contain mnemonics as small
as possible. GCC 15 however warns about this:

./arch/s390/include/generated/asm/dis-defs.h:1662:71: error: initializer-string
 for array of ‘char’ is too long [-Werror=unterminated-string-initialization]
 1662 |         [1261] = { .opfrag = 0xea, .format = INSTR_SS_L0RDRD, .name = "unpka" }, \

Get rid of this warning by using array initializers.

Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
4 months agotreewide: const qualify ctl_tables where applicable
Joel Granados [Tue, 28 Jan 2025 12:48:37 +0000 (13:48 +0100)] 
treewide: const qualify ctl_tables where applicable

Add the const qualifier to all the ctl_tables in the tree except for
watchdog_hardlockup_sysctl, memory_allocation_profiling_sysctls,
loadpin_sysctl_table and the ones calling register_net_sysctl (./net,
drivers/inifiniband dirs). These are special cases as they use a
registration function with a non-const qualified ctl_table argument or
modify the arrays before passing them on to the registration function.

Constifying ctl_table structs will prevent the modification of
proc_handler function pointers as the arrays would reside in .rodata.
This is made possible after commit 78eb4ea25cd5 ("sysctl: treewide:
constify the ctl_table argument of proc_handlers") constified all the
proc_handlers.

Created this by running an spatch followed by a sed command:
Spatch:
    virtual patch

    @
    depends on !(file in "net")
    disable optional_qualifier
    @

    identifier table_name != {
      watchdog_hardlockup_sysctl,
      iwcm_ctl_table,
      ucma_ctl_table,
      memory_allocation_profiling_sysctls,
      loadpin_sysctl_table
    };
    @@

    + const
    struct ctl_table table_name [] = { ... };

sed:
    sed --in-place \
      -e "s/struct ctl_table .table = &uts_kern/const struct ctl_table *table = \&uts_kern/" \
      kernel/utsname_sysctl.c

Reviewed-by: Song Liu <song@kernel.org>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> # for kernel/trace/
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> # SCSI
Reviewed-by: Darrick J. Wong <djwong@kernel.org> # xfs
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Acked-by: Wei Liu <wei.liu@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Bill O'Donnell <bodonnel@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
Acked-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Acked-by: Anna Schumaker <anna.schumaker@oracle.com>
Signed-off-by: Joel Granados <joel.granados@kernel.org>
4 months agobonding: Correctly support GSO ESP offload
Cosmin Ratiu [Mon, 27 Jan 2025 10:41:47 +0000 (12:41 +0200)] 
bonding: Correctly support GSO ESP offload

The referenced fix is incomplete. It correctly computes
bond_dev->gso_partial_features across slaves, but unfortunately
netdev_fix_features discards gso_partial_features from the feature set
if NETIF_F_GSO_PARTIAL isn't set in bond_dev->features.

This is visible with ethtool -k bond0 | grep esp:
tx-esp-segmentation: off [requested on]
esp-hw-offload: on
esp-tx-csum-hw-offload: on

This patch reworks the bonding GSO offload support by:
- making aggregating gso_partial_features across slaves similar to the
  other feature sets (this part is a no-op).
- advertising the default partial gso features on empty bond devs, same
  as with other feature sets (also a no-op).
- adding NETIF_F_GSO_PARTIAL to hw_enc_features filtered across slaves.
- adding NETIF_F_GSO_PARTIAL to features in bond_setup()

With all of these, 'ethtool -k bond0 | grep esp' now reports:
tx-esp-segmentation: on
esp-hw-offload: on
esp-tx-csum-hw-offload: on

Fixes: 4861333b4217 ("bonding: add ESP offload features when slaves support")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Acked-by: Jay Vosburgh <jv@jvosburgh.net>
Link: https://patch.msgid.link/20250127104147.759658-1-cratiu@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
4 months agoMerge branch 'limit-devicetree-parameters-to-hardware-capability'
Paolo Abeni [Tue, 28 Jan 2025 11:44:47 +0000 (12:44 +0100)] 
Merge branch 'limit-devicetree-parameters-to-hardware-capability'

Kunihiko Hayashi says:

====================
Limit devicetree parameters to hardware capability

This series includes patches that checks the devicetree properties,
the number of MTL queues and FIFO size values, and if these specified
values exceed the value contained in hardware capabilities, limit to
the values from the capabilities. Do nothing if the capabilities don't
have any specified values.

And this sets hardware capability values if FIFO sizes are not specified
and removes redundant lines.
====================

Link: https://patch.msgid.link/20250127013820.2941044-1-hayashi.kunihiko@socionext.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
4 months agonet: stmmac: Specify hardware capability value when FIFO size isn't specified
Kunihiko Hayashi [Mon, 27 Jan 2025 01:38:20 +0000 (10:38 +0900)] 
net: stmmac: Specify hardware capability value when FIFO size isn't specified

When Tx/Rx FIFO size is not specified in advance, the driver checks if
the value is zero and sets the hardware capability value in functions
where that value is used.

Consolidate the check and settings into function stmmac_hw_init() and
remove redundant other statements.

If FIFO size is zero and the hardware capability also doesn't have upper
limit values, return with an error message.

Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Reviewed-by: Yanteng Si <si.yanteng@linux.dev>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
4 months agonet: stmmac: Limit FIFO size by hardware capability
Kunihiko Hayashi [Mon, 27 Jan 2025 01:38:19 +0000 (10:38 +0900)] 
net: stmmac: Limit FIFO size by hardware capability

Tx/Rx FIFO size is specified by the parameter "{tx,rx}-fifo-depth" from
stmmac_platform layer.

However, these values are constrained by upper limits determined by the
capabilities of each hardware feature. There is a risk that the upper
bits will be truncated due to the calculation, so it's appropriate to
limit them to the upper limit values and display a warning message.

This only works if the hardware capability has the upper limit values.

Fixes: e7877f52fd4a ("stmmac: Read tx-fifo-depth and rx-fifo-depth from the devicetree")
Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Reviewed-by: Yanteng Si <si.yanteng@linux.dev>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>