git.ipfire.org Git - thirdparty/kernel/linux.git/log

bonding: limit BOND_MODE_8023AD to Ethernet devices

BOND_MODE_8023AD makes sense for ARPHRD_ETHER only.

syzbot reported:

BUG: KASAN: global-out-of-bounds in __hw_addr_create net/core/dev_addr_lists.c:63 [inline]
BUG: KASAN: global-out-of-bounds in __hw_addr_add_ex+0x25d/0x760 net/core/dev_addr_lists.c:118
Read of size 16 at addr ffffffff8bf94040 by task syz.1.3580/19497

CPU: 1 UID: 0 PID: 19497 Comm: syz.1.3580 Tainted: G             L      syzkaller #0 PREEMPT(full)
Tainted: [L]=SOFTLOCKUP
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025
Call Trace:
<TASK>
  dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
  print_address_description mm/kasan/report.c:378 [inline]
  print_report+0xca/0x240 mm/kasan/report.c:482
  kasan_report+0x118/0x150 mm/kasan/report.c:595
  check_region_inline mm/kasan/generic.c:-1 [inline]
  kasan_check_range+0x2b0/0x2c0 mm/kasan/generic.c:200
  __asan_memcpy+0x29/0x70 mm/kasan/shadow.c:105
  __hw_addr_create net/core/dev_addr_lists.c:63 [inline]
  __hw_addr_add_ex+0x25d/0x760 net/core/dev_addr_lists.c:118
  __dev_mc_add net/core/dev_addr_lists.c:868 [inline]
  dev_mc_add+0xa1/0x120 net/core/dev_addr_lists.c:886
  bond_enslave+0x2b8b/0x3ac0 drivers/net/bonding/bond_main.c:2180
  do_set_master+0x533/0x6d0 net/core/rtnetlink.c:2963
  do_setlink+0xcf0/0x41c0 net/core/rtnetlink.c:3165
  rtnl_changelink net/core/rtnetlink.c:3776 [inline]
  __rtnl_newlink net/core/rtnetlink.c:3935 [inline]
  rtnl_newlink+0x161c/0x1c90 net/core/rtnetlink.c:4072
  rtnetlink_rcv_msg+0x7cf/0xb70 net/core/rtnetlink.c:6958
  netlink_rcv_skb+0x208/0x470 net/netlink/af_netlink.c:2550
  netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline]
  netlink_unicast+0x82f/0x9e0 net/netlink/af_netlink.c:1344
  netlink_sendmsg+0x805/0xb30 net/netlink/af_netlink.c:1894
  sock_sendmsg_nosec net/socket.c:727 [inline]
  __sock_sendmsg+0x21c/0x270 net/socket.c:742
  ____sys_sendmsg+0x505/0x820 net/socket.c:2592
  ___sys_sendmsg+0x21f/0x2a0 net/socket.c:2646
  __sys_sendmsg+0x164/0x220 net/socket.c:2678
  do_syscall_32_irqs_on arch/x86/entry/syscall_32.c:83 [inline]
  __do_fast_syscall_32+0x1dc/0x560 arch/x86/entry/syscall_32.c:307
  do_fast_syscall_32+0x34/0x80 arch/x86/entry/syscall_32.c:332
entry_SYSENTER_compat_after_hwframe+0x84/0x8e
</TASK>

The buggy address belongs to the variable:
lacpdu_mcast_addr+0x0/0x40

Fixes: 872254dd6b1f ("net/bonding: Enable bonding to enslave non ARPHRD_ETHER")
Reported-by: syzbot+9c081b17773615f24672@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/6966946b.a70a0220.245e30.0002.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Andrew Lunn <andrew+netdev@lunn.ch>
Acked-by: Jay Vosburgh <jv@jvosburgh.net>
Link: https://patch.msgid.link/20260113191201.3970737-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: add skb->data_len and (skb->end - skb->tail) to skb_dump()

While working on a syzbot report, I found that skb_dump()
is lacking two important parts:

- skb->data_len.

- (skb->end - skb->tail); the reported tailroom is zero if the skb is
  not linear.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260112172621.4188700-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: usb: dm9601: remove broken SR9700 support

The SR9700 chip sends more than one packet in a USB transaction,
like the DM962x chips can optionally do, but the dm9601 driver does not
support this mode, and the hardware does not have the DM962x
MODE_CTL register to disable it, so this driver drops packets on SR9700
devices. The sr9700 driver correctly handles receiving more than one
packet per transaction.

While the dm9601 driver could be improved to handle this, the easiest
way to fix this issue in the short term is to remove the SR9700 device
ID from the dm9601 driver so the sr9700 driver is always used. This
device ID should not have been in more than one driver to begin with.

The "Fixes" commit was chosen so that the patch is automatically
included in all kernels that have the sr9700 driver, even though the
issue affects dm9601.

Fixes: c9b37458e956 ("USB2NET : SR9700 : One chip USB 1.1 USB2NET SR9700Device Driver Support")
Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com>
Acked-by: Peter Korsgaard <peter@korsgaard.com>
Link: https://patch.msgid.link/20260113063924.74464-1-enelsonmoore@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'vsock-virtio-fix-data-loss-disclosure-due-to-joining-of-non-linear-skb'

Michal Luczaj says:

====================
vsock/virtio: Fix data loss/disclosure due to joining of non-linear skb

Loopback transport coalesces some skbs too eagerly. Handling a zerocopy
(non-linear) skb as a linear one leads to skb data loss and kernel memory
disclosure.

Plug the loss/leak by allowing only linear skb join. Provide a test.
====================

Link: https://patch.msgid.link/20260113-vsock-recv-coalescence-v2-0-552b17837cf4@rbox.co
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

vsock/test: Add test for a linear and non-linear skb getting coalesced

Loopback transport can mangle data in rx queue when a linear skb is
followed by a small MSG_ZEROCOPY packet.

To exercise the logic, send out two packets: a weirdly sized one (to ensure
some spare tail room in the skb) and a zerocopy one that's small enough to
fit in the spare room of its predecessor. Then, wait for both to land in
the rx queue, and check the data received. A faulty packet merge manifests
itself by corrupting the payload of the later packet.

Signed-off-by: Michal Luczaj <mhal@rbox.co>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20260113-vsock-recv-coalescence-v2-2-552b17837cf4@rbox.co
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

vsock/virtio: Coalesce only linear skb

vsock/virtio common tries to coalesce buffers in rx queue: if a linear skb
(with a spare tail room) is followed by a small skb (length limited by
GOOD_COPY_LEN = 128), an attempt is made to join them.

Since the introduction of MSG_ZEROCOPY support, the assumption that a small
skb will always be linear is incorrect. In the zerocopy case, data is lost
and uninitialized kernel memory is appended to the linear skb.

Of all 3 supported virtio-based transports, only loopback-transport is
affected. G2H virtio-transport rx queue operates on explicitly linear skbs;
see virtio_vsock_alloc_linear_skb() in virtio_vsock_rx_fill(). H2G
vhost-transport may allocate non-linear skbs, but only for sizes that are
not considered for coalescence; see PAGE_ALLOC_COSTLY_ORDER in
virtio_vsock_alloc_skb().

Ensure only linear skbs are coalesced. Note that skb_tailroom(last_skb) > 0
guarantees last_skb is linear.
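
The coalescing rule described above can be sketched in user space. This is only a model: struct toy_skb and the toy_* helpers are hypothetical stand-ins, not the kernel's sk_buff API, and the exact comparison used by the kernel is an assumption here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define GOOD_COPY_LEN 128

/* Hypothetical stand-in for the few sk_buff fields that matter here. */
struct toy_skb {
	size_t len;       /* total payload bytes */
	size_t data_len;  /* bytes held in page fragments; 0 means linear */
	size_t tailroom;  /* free bytes after the linear data */
};

static bool toy_is_nonlinear(const struct toy_skb *skb)
{
	return skb->data_len != 0;
}

/* Join only if the new skb is small, linear, and fits in the spare tail room. */
static bool toy_can_coalesce(const struct toy_skb *last,
			     const struct toy_skb *skb)
{
	return skb->len <= GOOD_COPY_LEN &&
	       !toy_is_nonlinear(skb) &&
	       skb->len < last->tailroom;
}
```

With a predecessor that has 64 bytes of tail room, a 32-byte linear skb passes the check, while a 32-byte zerocopy skb (data_len == 32) is rejected and stays a separate queue entry.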

Fixes: 581512a6dc93 ("vsock/virtio: MSG_ZEROCOPY flag support")
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20260113-vsock-recv-coalescence-v2-1-552b17837cf4@rbox.co
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

usbnet: fix crash due to missing BQL accounting after resume

In commit 7ff14c52049e ("usbnet: Add support for Byte Queue Limits
(BQL)"), it was missed that usbnet_resume() may enqueue SKBs using
__skb_queue_tail() without reporting them to BQL. As a result, the next
call to netdev_completed_queue() triggers a BUG_ON() in dql_completed(),
since the SKBs queued during resume were never accounted for.

This patch fixes the issue by adding a corresponding netdev_sent_queue()
call in usbnet_resume() when SKBs are queued after suspend. Because
dev->txq.lock is held at this point, no concurrent calls to
netdev_sent_queue() from usbnet_start_xmit() can occur.
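
The accounting mismatch can be illustrated with a toy model. struct toy_dql and the toy_* functions below are hypothetical simplifications written for this note, not the kernel's dynamic queue limits code; the failing condition is modelled after the BUG_ON() described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, much-simplified stand-in for the kernel's struct dql. */
struct toy_dql {
	unsigned int num_queued;     /* bytes reported as sent */
	unsigned int num_completed;  /* bytes reported as completed */
};

/* Models netdev_sent_queue(): account for bytes handed to the hardware. */
static void toy_sent(struct toy_dql *q, unsigned int bytes)
{
	q->num_queued += bytes;
}

/*
 * Models netdev_completed_queue(). Returns false in the situation where,
 * per the text above, dql_completed() would trigger its BUG_ON():
 * more bytes completed than were ever reported as queued.
 */
static bool toy_completed(struct toy_dql *q, unsigned int bytes)
{
	if (bytes > q->num_queued - q->num_completed)
		return false;
	q->num_completed += bytes;
	return true;
}
```

Completing 1500 bytes on a fresh toy_dql fails, which corresponds to the resume path queueing an SKB without reporting it; calling toy_sent() first makes the same completion succeed.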

The crash can be reproduced by generating network traffic
(e.g. iperf3 -c ... -t 0), suspending the system, and then waking it up
(e.g. rtcwake -m mem -s 5).

When testing USB2 Android tethering (cdc_ncm), the system crashed within
three suspend/resume cycles without this patch. With the patch applied,
no crashes were observed after 90 cycles. Testing with an AX88179 USB
Ethernet adapter also showed no crashes.

Fixes: 7ff14c52049e ("usbnet: Add support for Byte Queue Limits (BQL)")
Reported-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Tested-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Tested-by: Simon Schippers <simon.schippers@tu-dortmund.de>
Signed-off-by: Simon Schippers <simon.schippers@tu-dortmund.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260113075139.6735-1-simon.schippers@tu-dortmund.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'net-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
"Including fixes from bluetooth, can and IPsec.

  Current release - regressions:

   - net: add net.core.qdisc_max_burst

   - can: propagate CAN device capabilities via ml_priv

  Previous releases - regressions:

   - dst: fix races in rt6_uncached_list_del() and
     rt_del_uncached_list()

   - ipv6: fix use-after-free in inet6_addr_del().

   - xfrm: fix inner mode lookup in tunnel mode GSO segmentation

   - ip_tunnel: spread netdev_lockdep_set_classes()

   - ip6_tunnel: use skb_vlan_inet_prepare() in __ip6_tnl_rcv()

   - bluetooth: hci_sync: enable PA sync lost event

   - eth: virtio-net:
      - fix the deadlock when disabling rx NAPI
      - fix misalignment bug in struct virtnet_info

  Previous releases - always broken:

   - ipv4: ip_gre: make ipgre_header() robust

   - can: fix SSP_SRC in cases when bit-rate is higher than 1 MBit.

   - eth:
      - mlx5e: profile change fix
      - octeon_ep_vf: fix free_irq dev_id mismatch in IRQ rollback
      - macvlan: fix possible UAF in macvlan_forward_source()"

* tag 'net-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (37 commits)
  virtio_net: Fix misalignment bug in struct virtnet_info
  net: can: j1939: j1939_xtp_rx_rts_session_active(): deactivate session upon receiving the second rts
  can: raw: instantly reject disabled CAN frames
  can: propagate CAN device capabilities via ml_priv
  Revert "can: raw: instantly reject unsupported CAN frames"
  net/sched: sch_qfq: do not free existing class in qfq_change_class()
  selftests: drv-net: fix RPS mask handling for high CPU numbers
  selftests: drv-net: fix RPS mask handling in toeplitz test
  ipv6: Fix use-after-free in inet6_addr_del().
  dst: fix races in rt6_uncached_list_del() and rt_del_uncached_list()
  net: hv_netvsc: reject RSS hash key programming without RX indirection table
  tools: ynl: render event op docs correctly
  net: add net.core.qdisc_max_burst
  net: airoha: Fix typo in airoha_ppe_setup_tc_block_cb definition
  net: phy: motorcomm: fix duplex setting error for phy leds
  net: octeon_ep_vf: fix free_irq dev_id mismatch in IRQ rollback
  net/mlx5e: Restore destroying state bit after profile cleanup
  net/mlx5e: Pass netdev to mlx5e_destroy_netdev instead of priv
  net/mlx5e: Don't store mlx5e_priv in mlx5e_dev devlink priv
  net/mlx5e: Fix crash on profile change rollback failure
  ...

Merge tag 'linux-can-fixes-for-6.19-20260115' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2026-01-15

This is a pull request of 4 patches for net/main. It supersedes the
"can 2026-01-14" pull request. The dev refcount leak in patch #3 is
fixed.

The first 3 patches are by Oliver Hartkopp and revert the approach to
instantly reject unsupported CAN frames introduced in
net-next-for-v6.19 and replace it by placing the needed data into the
CAN specific ml_priv.

The last patch is by Tetsuo Handa and fixes a J1939 refcount leak for
j1939_session in session deactivation upon receiving the second RTS.

linux-can-fixes-for-6.19-20260115

* tag 'linux-can-fixes-for-6.19-20260115' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
  net: can: j1939: j1939_xtp_rx_rts_session_active(): deactivate session upon receiving the second rts
  can: raw: instantly reject disabled CAN frames
  can: propagate CAN device capabilities via ml_priv
  Revert "can: raw: instantly reject unsupported CAN frames"
====================

Link: https://patch.msgid.link/20260115090603.1124860-1-mkl@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge tag 'ipsec-2026-01-14' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec

Steffen Klassert says:

====================
pull request (net): ipsec 2026-01-14

1) Fix inner mode lookup in tunnel mode GSO segmentation.
   The protocol was taken from the wrong field.

2) Set ipv4 no_pmtu_disc flag only on output SAs. The
   insertion of input SAs can fail if no_pmtu_disc
   is set.

Please pull or let me know if there are problems.

ipsec-2026-01-14

* tag 'ipsec-2026-01-14' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
  xfrm: set ipv4 no_pmtu_disc flag only on output sa when direction is set
  xfrm: Fix inner mode lookup in tunnel mode GSO segmentation
====================

Link: https://patch.msgid.link/20260114121817.1106134-1-steffen.klassert@secunet.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

virtio_net: Fix misalignment bug in struct virtnet_info

Use the new TRAILING_OVERLAP() helper to fix a misalignment bug
along with the following warning:

drivers/net/virtio_net.c:429:46: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]

This helper creates a union between a flexible-array member (FAM)
and a set of members that would otherwise follow it (in this case
`u8 rss_hash_key_data[VIRTIO_NET_RSS_MAX_KEY_SIZE];`). This
overlays the trailing members (rss_hash_key_data) onto the FAM
(hash_key_data) while keeping the FAM and the start of MEMBERS aligned.
The static_assert() ensures this alignment remains.

Notice that due to tail padding in flexible `struct
virtio_net_rss_config_trailer`, `rss_trailer.hash_key_data`
(at offset 83 in struct virtnet_info) and `rss_hash_key_data` (at
offset 84 in struct virtnet_info) are misaligned by one byte. See
below:

struct virtio_net_rss_config_trailer {
        __le16                     max_tx_vq;            /*     0     2 */
        __u8                       hash_key_length;      /*     2     1 */
        __u8                       hash_key_data[];      /*     3     0 */

        /* size: 4, cachelines: 1, members: 3 */
        /* padding: 1 */
        /* last cacheline: 4 bytes */
};

struct virtnet_info {
...
        struct virtio_net_rss_config_trailer rss_trailer; /*    80     4 */

        /* XXX last struct has 1 byte of padding */

        u8                         rss_hash_key_data[40]; /*    84    40 */
...
        /* size: 832, cachelines: 13, members: 48 */
        /* sum members: 801, holes: 8, sum holes: 31 */
        /* paddings: 2, sum paddings: 5 */
};

After changes, those members are correctly aligned at offset 795:

struct virtnet_info {
...
        union {
                struct virtio_net_rss_config_trailer rss_trailer; /*   792     4 */
                struct {
                        unsigned char __offset_to_hash_key_data[3]; /*   792     3 */
                        u8         rss_hash_key_data[40]; /*   795    40 */
                };                                       /*   792    43 */
        };                                               /*   792    44 */
...
        /* size: 840, cachelines: 14, members: 47 */
        /* sum members: 801, holes: 8, sum holes: 35 */
        /* padding: 4 */
        /* paddings: 1, sum paddings: 4 */
        /* last cacheline: 8 bytes */
};

As a result, the RSS key passed to the device is shifted by 1
byte: the last byte is cut off, and instead a (possibly
uninitialized) byte is added at the beginning.

As a last note `struct virtio_net_rss_config_hdr *rss_hdr;` is also
moved to the end, since it seems those three members should stick
around together. :)
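
The one-byte shift can be reproduced outside the kernel. The sketch below uses a stand-in struct with fixed-width types (struct trailer and fam_tail_padding() are names made up for this note) and assumes a common ABI where uint16_t has 2-byte alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct virtio_net_rss_config_trailer. */
struct trailer {
	uint16_t max_tx_vq;
	uint8_t  hash_key_length;
	uint8_t  hash_key_data[];	/* flexible-array member (FAM) */
};

/*
 * Bytes between the start of the FAM and the end of the struct.
 * Anything placed right after the struct starts this many bytes
 * later than the FAM does.
 */
static size_t fam_tail_padding(void)
{
	return sizeof(struct trailer) - offsetof(struct trailer, hash_key_data);
}
```

Under the stated assumption the FAM starts at offset 3 while the struct is 4 bytes long, so a key buffer placed directly after the struct begins one byte past the FAM, matching the offsets 83 and 84 quoted above.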

Cc: stable@vger.kernel.org
Fixes: ed3100e90d0d ("virtio_net: Use new RSS config structs")
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://patch.msgid.link/aWIItWq5dV9XTTCJ@kspp
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: can: j1939: j1939_xtp_rx_rts_session_active(): deactivate session upon receiving the second rts

Since j1939_session_deactivate_activate_next() in j1939_tp_rxtimer() is
called only when the timer is enabled, we need to call
j1939_session_deactivate_activate_next() if we cancelled the timer.
Otherwise, the refcount for j1939_session leaks, which later shows up as
the following message:

| unregister_netdevice: waiting for vcan0 to become free. Usage count = 2.

Reported-by: syzbot <syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Tested-by: Oleksij Rempel <o.rempel@pengutronix.de>
Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol")
Link: https://patch.msgid.link/b1212653-8fa1-44e1-be9d-12f950fb3a07@I-love.SAKURA.ne.jp
Cc: stable@vger.kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

Merge patch series "can: raw: better approach to instantly reject unsupported CAN frames"

Oliver Hartkopp <socketcan@hartkopp.net> says:

This series reverts commit 1a620a723853 ("can: raw: instantly reject
unsupported CAN frames") and its follow-up fixes for the introduced
dependency issues:

commit 1a620a723853 ("can: raw: instantly reject unsupported CAN frames")
commit cb2dc6d2869a ("can: Kconfig: select CAN driver infrastructure by default")
commit 6abd4577bccc ("can: fix build dependency")
commit 5a5aff6338c0 ("can: fix build dependency")

The reverted patch was accessing CAN device internal data structures
from the network layer because it needs to know about the CAN protocol
capabilities of the CAN devices.

This data access caused build problems between the CAN network layer and
the CAN driver layer, which introduced unwanted Kconfig dependencies and
fixes.

Patches 2 and 3 implement a better approach which makes use of the
CAN specific ml_priv data, which is accessible from both sides.

With this change the CAN network layer can check the required features
and the decoupling of the driver layer and network layer is restored.

Link: https://patch.msgid.link/20260109144135.8495-1-socketcan@hartkopp.net
[mkl: give series a more descriptive name]
[mkl: properly format reverted patch commitish]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: raw: instantly reject disabled CAN frames

For real CAN interfaces the CAN_CTRLMODE_FD and CAN_CTRLMODE_XL control
modes indicate whether an interface can handle those CAN FD/XL frames.

If a CAN XL interface is configured in CANXL-only mode with error
signalling disabled, neither CAN CC nor CAN FD frames can be sent.

The checks are now performed on CAN_RAW sockets to give instant feedback
to the user when writing unsupported CAN frames to the interface or when
the CAN interface is in read-only mode.

Fixes: 1a620a723853 ("can: raw: instantly reject unsupported CAN frames")
Cc: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Vincent Mailhol <mailhol@kernel.org>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Link: https://patch.msgid.link/20260109144135.8495-4-socketcan@hartkopp.net
[mkl: fix dev reference leak]
Link: https://lore.kernel.org/all/0636c732-2e71-4633-8005-dfa85e1da445@hartkopp.net
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: propagate CAN device capabilities via ml_priv

Commit 1a620a723853 ("can: raw: instantly reject unsupported CAN frames")
caused a sequence of dependency and linker fixes.

Instead of accessing CAN device internal data structures, which caused the
dependency problems, this patch introduces capability information into the
CAN specific ml_priv data, which is accessible from both sides.

With this change the CAN network layer can check the required features and
the decoupling of the driver layer and network layer is restored.

Fixes: 1a620a723853 ("can: raw: instantly reject unsupported CAN frames")
Cc: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Vincent Mailhol <mailhol@kernel.org>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Link: https://patch.msgid.link/20260109144135.8495-3-socketcan@hartkopp.net
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

Revert "can: raw: instantly reject unsupported CAN frames"

This reverts commit 1a620a723853a0f49703c317d52dc6b9602cbaa8
and its follow-up fixes for the introduced dependency issues:

commit 1a620a723853 ("can: raw: instantly reject unsupported CAN frames")
commit cb2dc6d2869a ("can: Kconfig: select CAN driver infrastructure by default")
commit 6abd4577bccc ("can: fix build dependency")
commit 5a5aff6338c0 ("can: fix build dependency")

The entire problem was caused by the requirement that a new network layer
feature needed to know about the protocol capabilities of the CAN devices.
Instead of accessing CAN device internal data structures, which caused the
dependency problems, a better approach has been developed which makes use of
CAN specific ml_priv data, which is accessible from both sides.

Cc: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Vincent Mailhol <mailhol@kernel.org>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Link: https://patch.msgid.link/20260109144135.8495-2-socketcan@hartkopp.net
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
"Only one core change (and one in doc only) the rest are drivers.

  The one core fix is for some inline encrypting drives that can't
  handle encryption requests on non-data commands (like error handling
  ones); it saves the request level encryption parameters in the eh_save
  structure so they can be cleared for error handling and restored after
  it is completed"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: ufs: host: mediatek: Make read-only array scale_us static const
  scsi: bfa: Update outdated comment
  scsi: mpt3sas: Update maintainer list
  scsi: ufs: core: Configure MCQ after link startup
  scsi: core: Fix error handler encryption support
  scsi: core: Correct documentation for scsi_test_unit_ready()
  scsi: ufs: dt-bindings: Fix several grammar errors

Merge tag 'bitmap-for-6.19-rc5' of https://github.com/norov/linux

Pull bitmap fix from Yury Norov:
"Fix Rust build for architectures implementing their own find_bit() ops
(arm and m68k)"

* tag 'bitmap-for-6.19-rc5' of https://github.com/norov/linux:
  rust: bitops: fix missing _find_* functions on 32-bit ARM

Merge tag 'media/v6.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media fixes from Mauro Carvalho Chehab:

- ov02c10: some fixes related to preserving bayer pattern and
   horizontal control

- ipu-bridge: Add quirks for some Dell XPS laptops with inverted
   sensors

- mali-c55: Fix version identifier logic

- rzg2l-cru: csi-2: fix RZ/V2H input sizes on some variants

* tag 'media/v6.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  media: ov02c10: Remove unnecessary hflip and vflip pointers
  media: ipu-bridge: Add DMI quirk for Dell XPS laptops with upside down sensors
  media: ov02c10: Fix the horizontal flip control
  media: ov02c10: Adjust x-win/y-win when changing flipping to preserve bayer-pattern
  media: ov02c10: Fix bayer-pattern change after default vflip change
  media: rzg2l-cru: csi-2: Support RZ/V2H input sizes
  media: uapi: mali-c55-config: Remove version identifier
  media: mali-c55: Remove duplicated version check
  media: Documentation: mali-c55: Use v4l2-isp version identifier

Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Pull bpf fixes from Alexei Starovoitov:

- Fix incorrect usage of BPF_TRAMP_F_ORIG_STACK in riscv JIT (Menglong
   Dong)

- Fix reference count leak in bpf_prog_test_run_xdp() (Tetsuo Handa)

- Fix metadata size check in bpf_test_run() (Toke Høiland-Jørgensen)

- Check that BPF insn array is not allowed as a map for const strings
   (Deepanshu Kartikey)

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf: Fix reference count leak in bpf_prog_test_run_xdp()
  bpf: Reject BPF_MAP_TYPE_INSN_ARRAY in check_reg_const_str()
  selftests/bpf: Update xdp_context_test_run test to check maximum metadata size
  bpf, test_run: Subtract size of xdp_frame from allowed metadata size
  riscv, bpf: Fix incorrect usage of BPF_TRAMP_F_ORIG_STACK

net/sched: sch_qfq: do not free existing class in qfq_change_class()

Fix the qfq_change_class() error case.

cl->qdisc and cl should only be freed if a new class and qdisc
were allocated, or we risk various UAF.

Fixes: 462dbc9101ac ("pkt_sched: QFQ Plus: fair-queueing service at DRR cost")
Reported-by: syzbot+07f3f38f723c335f106d@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/6965351d.050a0220.eaf7.00c5.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20260112175656.17605-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

rust: bitops: fix missing _find_* functions on 32-bit ARM

On 32-bit ARM, you may encounter linker errors such as this one:

ld.lld: error: undefined symbol: _find_next_zero_bit
>>> referenced by rust_binder_main.43196037ba7bcee1-cgu.0
>>> drivers/android/binder/rust_binder_main.o:(<rust_binder_main::process::Process>::insert_or_update_handle) in archive vmlinux.a
>>> referenced by rust_binder_main.43196037ba7bcee1-cgu.0
>>> drivers/android/binder/rust_binder_main.o:(<rust_binder_main::process::Process>::insert_or_update_handle) in archive vmlinux.a

This error occurs because even though the functions are declared by
include/linux/find.h, the definition is #ifdef'd out on 32-bit ARM. This
is because arch/arm/include/asm/bitops.h contains:

#define find_first_zero_bit(p,sz) _find_first_zero_bit_le(p,sz)
#define find_next_zero_bit(p,sz,off) _find_next_zero_bit_le(p,sz,off)
#define find_first_bit(p,sz) _find_first_bit_le(p,sz)
#define find_next_bit(p,sz,off) _find_next_bit_le(p,sz,off)

And the underscore-prefixed function is conditional on #ifndef of the
non-underscore-prefixed name, but the declaration in find.h is *not*
conditional on that #ifndef.

To fix the linker error, we ensure that the symbols in question exist
when compiling Rust code. We do this by defining them in rust/helpers/
whenever the normal definition is #ifndef'd out.

Note that these helpers are somewhat unusual in that they do not have
the rust_helper_ prefix that most helpers have. Adding the rust_helper_
prefix does not compile, as 'bindings::_find_next_zero_bit()' will
result in a call to a symbol called _find_next_zero_bit as defined by
include/linux/find.h rather than a symbol with the rust_helper_ prefix.
This is because when a symbol is present in both include/ and
rust/helpers/, the one from include/ wins under the assumption that the
current configuration is one where that helper is unnecessary. This
heuristic fails for _find_next_zero_bit() because the header file always
declares it even if the symbol does not exist.

The functions still use the __rust_helper annotation. This lets the
wrapper function be inlined into Rust code even if full kernel LTO is
not used once the patch series for that feature lands.

Yury: arches are free to implement their own find_bit() functions. Most
rely on the generic implementation, but arm32 and m68k do not, so they
require custom handling. Alice confirmed it fixes the build for both.

Cc: stable@vger.kernel.org
Fixes: 6cf93a9ed39e ("rust: add bindings for bitops.h")
Reported-by: Andreas Hindborg <a.hindborg@kernel.org>
Closes: https://rust-for-linux.zulipchat.com/#narrow/channel/x/topic/x/near/561677301
Tested-by: Andreas Hindborg <a.hindborg@kernel.org>
Reviewed-by: Dirk Behme <dirk.behme@de.bosch.com>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>

Merge branch 'selftests-couple-of-fixes-in-toeplitz-rps-cases'

Gal Pressman says:

====================
selftests: Couple of fixes in Toeplitz RPS cases

Fix a couple of bugs in the RPS cases of the Toeplitz selftest.
====================

Link: https://patch.msgid.link/20260112173715.384843-1-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests: drv-net: fix RPS mask handling for high CPU numbers

The RPS bitmask bounds check uses ~(RPS_MAX_CPUS - 1) which equals ~15 =
0xfff0, only allowing CPUs 0-3.

Change the mask to ~((1UL << RPS_MAX_CPUS) - 1) = ~0xffff to allow CPUs
0-15.
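
The difference between the two masks can be checked in isolation. In this sketch RPS_MAX_CPUS is assumed to be 16 (consistent with the "~15" above), and the two helper names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define RPS_MAX_CPUS 16UL	/* assumed value, matching the "~15" above */

/* Old check: treats RPS_MAX_CPUS - 1 as if it were the allowed bit mask. */
static bool rps_mask_ok_old(unsigned long mask)
{
	return !(mask & ~(RPS_MAX_CPUS - 1));
}

/* New check: builds a mask with one bit per allowed CPU. */
static bool rps_mask_ok_new(unsigned long mask)
{
	return !(mask & ~((1UL << RPS_MAX_CPUS) - 1));
}
```

A mask of 0x300 (CPUs 8 and 9) is rejected by the old check and accepted by the new one, while 0x10000 (CPU 16) is still rejected.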

Fixes: 5ebfb4cc3048 ("selftests/net: toeplitz test")
Reviewed-by: Nimrod Oren <noren@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260112173715.384843-3-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests: drv-net: fix RPS mask handling in toeplitz test

The toeplitz.py test passed the hex mask without "0x" prefix (e.g.,
"300" for CPUs 8,9). The toeplitz.c strtoul() call wrongly parsed this
as decimal 300 (0x12c) instead of hex 0x300.

Pass the prefixed mask to toeplitz.c, and the unprefixed one to sysfs.
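
The parsing difference is easy to reproduce. The sketch assumes strtoul() is called with base 0 (automatic base detection); parse_mask() is a hypothetical wrapper, not a function from toeplitz.c.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * With base 0, strtoul() picks the base from the prefix:
 * "0x" selects hex, a leading "0" octal, anything else decimal.
 */
static unsigned long parse_mask(const char *arg)
{
	return strtoul(arg, NULL, 0);
}
```

Under that assumption "300" parses as decimal 300 (0x12c) and only "0x300" yields the intended mask 0x300.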

Fixes: 9cf9aa77a1f6 ("selftests: drv-net: hw: convert the Toeplitz test to Python")
Reviewed-by: Nimrod Oren <noren@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260112173715.384843-2-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ipv6: Fix use-after-free in inet6_addr_del().

syzbot reported use-after-free of inet6_ifaddr in
inet6_addr_del(). [0]

The cited commit accidentally moved ipv6_del_addr() for
mngtmpaddr before reading its ifp->flags for temporary
addresses in inet6_addr_del().

Let's move ipv6_del_addr() down to fix the UAF.

[0]:
BUG: KASAN: slab-use-after-free in inet6_addr_del.constprop.0+0x67a/0x6b0 net/ipv6/addrconf.c:3117
Read of size 4 at addr ffff88807b89c86c by task syz.3.1618/9593

CPU: 0 UID: 0 PID: 9593 Comm: syz.3.1618 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:378 [inline]
print_report+0xcd/0x630 mm/kasan/report.c:482
kasan_report+0xe0/0x110 mm/kasan/report.c:595
inet6_addr_del.constprop.0+0x67a/0x6b0 net/ipv6/addrconf.c:3117
addrconf_del_ifaddr+0x11e/0x190 net/ipv6/addrconf.c:3181
inet6_ioctl+0x1e5/0x2b0 net/ipv6/af_inet6.c:582
sock_do_ioctl+0x118/0x280 net/socket.c:1254
sock_ioctl+0x227/0x6b0 net/socket.c:1375
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:597 [inline]
__se_sys_ioctl fs/ioctl.c:583 [inline]
__x64_sys_ioctl+0x18e/0x210 fs/ioctl.c:583
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f164cf8f749
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f164de64038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f164d1e5fa0 RCX: 00007f164cf8f749
RDX: 0000200000000000 RSI: 0000000000008936 RDI: 0000000000000003
RBP: 00007f164d013f91 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f164d1e6038 R14: 00007f164d1e5fa0 R15: 00007ffde15c8288
</TASK>

Allocated by task 9593:
kasan_save_stack+0x33/0x60 mm/kasan/common.c:56
kasan_save_track+0x14/0x30 mm/kasan/common.c:77
poison_kmalloc_redzone mm/kasan/common.c:397 [inline]
__kasan_kmalloc+0xaa/0xb0 mm/kasan/common.c:414
kmalloc_noprof include/linux/slab.h:957 [inline]
kzalloc_noprof include/linux/slab.h:1094 [inline]
ipv6_add_addr+0x4e3/0x2010 net/ipv6/addrconf.c:1120
inet6_addr_add+0x256/0x9b0 net/ipv6/addrconf.c:3050
addrconf_add_ifaddr+0x1fc/0x450 net/ipv6/addrconf.c:3160
inet6_ioctl+0x103/0x2b0 net/ipv6/af_inet6.c:580
sock_do_ioctl+0x118/0x280 net/socket.c:1254
sock_ioctl+0x227/0x6b0 net/socket.c:1375
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:597 [inline]
__se_sys_ioctl fs/ioctl.c:583 [inline]
__x64_sys_ioctl+0x18e/0x210 fs/ioctl.c:583
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f

Freed by task 6099:
kasan_save_stack+0x33/0x60 mm/kasan/common.c:56
kasan_save_track+0x14/0x30 mm/kasan/common.c:77
kasan_save_free_info+0x3b/0x60 mm/kasan/generic.c:584
poison_slab_object mm/kasan/common.c:252 [inline]
__kasan_slab_free+0x5f/0x80 mm/kasan/common.c:284
kasan_slab_free include/linux/kasan.h:234 [inline]
slab_free_hook mm/slub.c:2540 [inline]
slab_free_freelist_hook mm/slub.c:2569 [inline]
slab_free_bulk mm/slub.c:6696 [inline]
kmem_cache_free_bulk mm/slub.c:7383 [inline]
kmem_cache_free_bulk+0x2bf/0x680 mm/slub.c:7362
kfree_bulk include/linux/slab.h:830 [inline]
kvfree_rcu_bulk+0x1b7/0x1e0 mm/slab_common.c:1523
kvfree_rcu_drain_ready mm/slab_common.c:1728 [inline]
kfree_rcu_monitor+0x1d0/0x2f0 mm/slab_common.c:1801
process_one_work+0x9ba/0x1b20 kernel/workqueue.c:3257
process_scheduled_works kernel/workqueue.c:3340 [inline]
worker_thread+0x6c8/0xf10 kernel/workqueue.c:3421
kthread+0x3c5/0x780 kernel/kthread.c:463
ret_from_fork+0x983/0xb10 arch/x86/kernel/process.c:158
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246

Fixes: 00b5b7aab9e42 ("net/ipv6: delete temporary address if mngtmpaddr is removed or unmanaged")
Reported-by: syzbot+72e610f4f1a930ca9d8a@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/696598e9.050a0220.3be5c5.0009.GAE@google.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260113010538.2019411-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

dst: fix races in rt6_uncached_list_del() and rt_del_uncached_list()

syzbot was able to crash the kernel in rt6_uncached_list_flush_dev()
in an interesting way [1]

Crash happens in list_del_init()/INIT_LIST_HEAD() while writing
list->prev, while the prior write on list->next went well.

static inline void INIT_LIST_HEAD(struct list_head *list)
{
	WRITE_ONCE(list->next, list); // This went well
	WRITE_ONCE(list->prev, list); // Crash, @list has been freed.
}

Issue here is that rt6_uncached_list_del() did not attempt to lock
ul->lock, as list_empty(&rt->dst.rt_uncached) returned
true because the WRITE_ONCE(list->next, list) happened on the other CPU.

We might use list_del_init_careful() and list_empty_careful(),
or make sure rt6_uncached_list_del() always grabs the spinlock
whenever rt->dst.rt_uncached_list has been set.

A similar fix is needed for IPv4.
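
The second option (always take the lock once the owner pointer has been set) can be sketched in userspace C. Everything here is an assumption made for illustration: the fake_* names, the C11 atomic_flag spinlock, and the lock_acquisitions counter are not from the patch. The point is only that the delete path keys on the owner pointer rather than on a lockless emptiness test.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct list_node {
	struct list_node *next, *prev;
};

static void list_init(struct list_node *n)
{
	n->next = n;
	n->prev = n;
}

static void list_add(struct list_node *n, struct list_node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

static void list_del_init(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

struct fake_uncached_list {
	atomic_flag lock;
	struct list_node head;
	unsigned long lock_acquisitions;	/* instrumentation for this sketch only */
};

struct fake_rt {
	struct list_node uncached;
	struct fake_uncached_list *uncached_list;	/* set when the entry is added */
};

static void fake_lock(struct fake_uncached_list *ul)
{
	while (atomic_flag_test_and_set_explicit(&ul->lock, memory_order_acquire))
		;	/* spin */
	ul->lock_acquisitions++;
}

static void fake_unlock(struct fake_uncached_list *ul)
{
	atomic_flag_clear_explicit(&ul->lock, memory_order_release);
}

static void fake_uncached_list_init(struct fake_uncached_list *ul)
{
	atomic_flag_clear(&ul->lock);
	list_init(&ul->head);
	ul->lock_acquisitions = 0;
}

static void fake_uncached_list_add(struct fake_rt *rt, struct fake_uncached_list *ul)
{
	rt->uncached_list = ul;
	fake_lock(ul);
	list_add(&rt->uncached, &ul->head);
	fake_unlock(ul);
}

/* Flush path: unlinks every entry under the lock, as device teardown would. */
static void fake_uncached_list_flush(struct fake_uncached_list *ul)
{
	fake_lock(ul);
	while (ul->head.next != &ul->head)
		list_del_init(ul->head.next);
	fake_unlock(ul);
}

/* Delete path: decided by the owner pointer, never by a lockless list_empty(). */
static void fake_uncached_list_del(struct fake_rt *rt)
{
	struct fake_uncached_list *ul = rt->uncached_list;

	if (ul) {
		fake_lock(ul);
		list_del_init(&rt->uncached);
		fake_unlock(ul);
	}
}
```

Even after a flush has left the entry looking empty, fake_uncached_list_del() still serializes on the lock, so it cannot return (and let the object be freed) while a flush is halfway through re-initializing the node.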

[1]

BUG: KASAN: slab-use-after-free in INIT_LIST_HEAD include/linux/list.h:46 [inline]
BUG: KASAN: slab-use-after-free in list_del_init include/linux/list.h:296 [inline]
BUG: KASAN: slab-use-after-free in rt6_uncached_list_flush_dev net/ipv6/route.c:191 [inline]
BUG: KASAN: slab-use-after-free in rt6_disable_ip+0x633/0x730 net/ipv6/route.c:5020
Write of size 8 at addr ffff8880294cfa78 by task kworker/u8:14/3450

CPU: 0 UID: 0 PID: 3450 Comm: kworker/u8:14 Tainted: G             L      syzkaller #0 PREEMPT_{RT,(full)}
Tainted: [L]=SOFTLOCKUP
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025
Workqueue: netns cleanup_net
Call Trace:
<TASK>
  dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
  print_address_description mm/kasan/report.c:378 [inline]
  print_report+0xca/0x240 mm/kasan/report.c:482
  kasan_report+0x118/0x150 mm/kasan/report.c:595
  INIT_LIST_HEAD include/linux/list.h:46 [inline]
  list_del_init include/linux/list.h:296 [inline]
  rt6_uncached_list_flush_dev net/ipv6/route.c:191 [inline]
  rt6_disable_ip+0x633/0x730 net/ipv6/route.c:5020
  addrconf_ifdown+0x143/0x18a0 net/ipv6/addrconf.c:3853
addrconf_notify+0x1bc/0x1050 net/ipv6/addrconf.c:-1
  notifier_call_chain+0x19d/0x3a0 kernel/notifier.c:85
  call_netdevice_notifiers_extack net/core/dev.c:2268 [inline]
  call_netdevice_notifiers net/core/dev.c:2282 [inline]
  netif_close_many+0x29c/0x410 net/core/dev.c:1785
  unregister_netdevice_many_notify+0xb50/0x2330 net/core/dev.c:12353
  ops_exit_rtnl_list net/core/net_namespace.c:187 [inline]
  ops_undo_list+0x3dc/0x990 net/core/net_namespace.c:248
  cleanup_net+0x4de/0x7b0 net/core/net_namespace.c:696
  process_one_work kernel/workqueue.c:3257 [inline]
  process_scheduled_works+0xad1/0x1770 kernel/workqueue.c:3340
  worker_thread+0x8a0/0xda0 kernel/workqueue.c:3421
  kthread+0x711/0x8a0 kernel/kthread.c:463
  ret_from_fork+0x510/0xa50 arch/x86/kernel/process.c:158
  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246
</TASK>

Allocated by task 803:
  kasan_save_stack mm/kasan/common.c:57 [inline]
  kasan_save_track+0x3e/0x80 mm/kasan/common.c:78
  unpoison_slab_object mm/kasan/common.c:340 [inline]
  __kasan_slab_alloc+0x6c/0x80 mm/kasan/common.c:366
  kasan_slab_alloc include/linux/kasan.h:253 [inline]
  slab_post_alloc_hook mm/slub.c:4953 [inline]
  slab_alloc_node mm/slub.c:5263 [inline]
  kmem_cache_alloc_noprof+0x18d/0x6c0 mm/slub.c:5270
  dst_alloc+0x105/0x170 net/core/dst.c:89
  ip6_dst_alloc net/ipv6/route.c:342 [inline]
  icmp6_dst_alloc+0x75/0x460 net/ipv6/route.c:3333
  mld_sendpack+0x683/0xe60 net/ipv6/mcast.c:1844
  mld_send_cr net/ipv6/mcast.c:2154 [inline]
  mld_ifc_work+0x83e/0xd60 net/ipv6/mcast.c:2693
  process_one_work kernel/workqueue.c:3257 [inline]
  process_scheduled_works+0xad1/0x1770 kernel/workqueue.c:3340
  worker_thread+0x8a0/0xda0 kernel/workqueue.c:3421
  kthread+0x711/0x8a0 kernel/kthread.c:463
  ret_from_fork+0x510/0xa50 arch/x86/kernel/process.c:158
  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246

Freed by task 20:
  kasan_save_stack mm/kasan/common.c:57 [inline]
  kasan_save_track+0x3e/0x80 mm/kasan/common.c:78
  kasan_save_free_info+0x46/0x50 mm/kasan/generic.c:584
  poison_slab_object mm/kasan/common.c:253 [inline]
  __kasan_slab_free+0x5c/0x80 mm/kasan/common.c:285
  kasan_slab_free include/linux/kasan.h:235 [inline]
  slab_free_hook mm/slub.c:2540 [inline]
  slab_free mm/slub.c:6670 [inline]
  kmem_cache_free+0x18f/0x8d0 mm/slub.c:6781
  dst_destroy+0x235/0x350 net/core/dst.c:121
  rcu_do_batch kernel/rcu/tree.c:2605 [inline]
  rcu_core kernel/rcu/tree.c:2857 [inline]
  rcu_cpu_kthread+0xba5/0x1af0 kernel/rcu/tree.c:2945
  smpboot_thread_fn+0x542/0xa60 kernel/smpboot.c:160
  kthread+0x711/0x8a0 kernel/kthread.c:463
  ret_from_fork+0x510/0xa50 arch/x86/kernel/process.c:158
  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246

Last potentially related work creation:
  kasan_save_stack+0x3e/0x60 mm/kasan/common.c:57
  kasan_record_aux_stack+0xbd/0xd0 mm/kasan/generic.c:556
  __call_rcu_common kernel/rcu/tree.c:3119 [inline]
  call_rcu+0xee/0x890 kernel/rcu/tree.c:3239
  refdst_drop include/net/dst.h:266 [inline]
  skb_dst_drop include/net/dst.h:278 [inline]
  skb_release_head_state+0x71/0x360 net/core/skbuff.c:1156
  skb_release_all net/core/skbuff.c:1180 [inline]
  __kfree_skb net/core/skbuff.c:1196 [inline]
  sk_skb_reason_drop+0xe9/0x170 net/core/skbuff.c:1234
  kfree_skb_reason include/linux/skbuff.h:1322 [inline]
  tcf_kfree_skb_list include/net/sch_generic.h:1127 [inline]
  __dev_xmit_skb net/core/dev.c:4260 [inline]
  __dev_queue_xmit+0x26aa/0x3210 net/core/dev.c:4785
  NF_HOOK_COND include/linux/netfilter.h:307 [inline]
  ip6_output+0x340/0x550 net/ipv6/ip6_output.c:247
  NF_HOOK+0x9e/0x380 include/linux/netfilter.h:318
  mld_sendpack+0x8d4/0xe60 net/ipv6/mcast.c:1855
  mld_send_cr net/ipv6/mcast.c:2154 [inline]
  mld_ifc_work+0x83e/0xd60 net/ipv6/mcast.c:2693
  process_one_work kernel/workqueue.c:3257 [inline]
  process_scheduled_works+0xad1/0x1770 kernel/workqueue.c:3340
  worker_thread+0x8a0/0xda0 kernel/workqueue.c:3421
  kthread+0x711/0x8a0 kernel/kthread.c:463
  ret_from_fork+0x510/0xa50 arch/x86/kernel/process.c:158
  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246

The buggy address belongs to the object at ffff8880294cfa00
which belongs to the cache ip6_dst_cache of size 232
The buggy address is located 120 bytes inside of
freed 232-byte region [ffff8880294cfa00, ffff8880294cfae8)

The buggy address belongs to the physical page:
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x294cf
memcg:ffff88803536b781
flags: 0x80000000000000(node=0|zone=1)
page_type: f5(slab)
raw: 0080000000000000 ffff88802ff1c8c0 ffffea0000bf2bc0 dead000000000006
raw: 0000000000000000 00000000800c000c 00000000f5000000 ffff88803536b781
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 0, migratetype Unmovable, gfp_mask 0x52820(GFP_ATOMIC|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP), pid 9, tgid 9 (kworker/0:0), ts 91119585830, free_ts 91088628818
  set_page_owner include/linux/page_owner.h:32 [inline]
  post_alloc_hook+0x234/0x290 mm/page_alloc.c:1857
  prep_new_page mm/page_alloc.c:1865 [inline]
  get_page_from_freelist+0x28c0/0x2960 mm/page_alloc.c:3915
  __alloc_frozen_pages_noprof+0x181/0x370 mm/page_alloc.c:5210
  alloc_pages_mpol+0xd1/0x380 mm/mempolicy.c:2486
  alloc_slab_page mm/slub.c:3075 [inline]
  allocate_slab+0x86/0x3b0 mm/slub.c:3248
  new_slab mm/slub.c:3302 [inline]
  ___slab_alloc+0xb10/0x13e0 mm/slub.c:4656
  __slab_alloc+0xc6/0x1f0 mm/slub.c:4779
  __slab_alloc_node mm/slub.c:4855 [inline]
  slab_alloc_node mm/slub.c:5251 [inline]
  kmem_cache_alloc_noprof+0x101/0x6c0 mm/slub.c:5270
  dst_alloc+0x105/0x170 net/core/dst.c:89
  ip6_dst_alloc net/ipv6/route.c:342 [inline]
  icmp6_dst_alloc+0x75/0x460 net/ipv6/route.c:3333
  mld_sendpack+0x683/0xe60 net/ipv6/mcast.c:1844
  mld_send_cr net/ipv6/mcast.c:2154 [inline]
  mld_ifc_work+0x83e/0xd60 net/ipv6/mcast.c:2693
  process_one_work kernel/workqueue.c:3257 [inline]
  process_scheduled_works+0xad1/0x1770 kernel/workqueue.c:3340
  worker_thread+0x8a0/0xda0 kernel/workqueue.c:3421
  kthread+0x711/0x8a0 kernel/kthread.c:463
  ret_from_fork+0x510/0xa50 arch/x86/kernel/process.c:158
page last free pid 5859 tgid 5859 stack trace:
  reset_page_owner include/linux/page_owner.h:25 [inline]
  free_pages_prepare mm/page_alloc.c:1406 [inline]
  __free_frozen_pages+0xfe1/0x1170 mm/page_alloc.c:2943
  discard_slab mm/slub.c:3346 [inline]
  __put_partials+0x149/0x170 mm/slub.c:3886
  __slab_free+0x2af/0x330 mm/slub.c:5952
  qlink_free mm/kasan/quarantine.c:163 [inline]
  qlist_free_all+0x97/0x100 mm/kasan/quarantine.c:179
  kasan_quarantine_reduce+0x148/0x160 mm/kasan/quarantine.c:286
  __kasan_slab_alloc+0x22/0x80 mm/kasan/common.c:350
  kasan_slab_alloc include/linux/kasan.h:253 [inline]
  slab_post_alloc_hook mm/slub.c:4953 [inline]
  slab_alloc_node mm/slub.c:5263 [inline]
  kmem_cache_alloc_noprof+0x18d/0x6c0 mm/slub.c:5270
  getname_flags+0xb8/0x540 fs/namei.c:146
  getname include/linux/fs.h:2498 [inline]
  do_sys_openat2+0xbc/0x200 fs/open.c:1426
  do_sys_open fs/open.c:1436 [inline]
  __do_sys_openat fs/open.c:1452 [inline]
  __se_sys_openat fs/open.c:1447 [inline]
  __x64_sys_openat+0x138/0x170 fs/open.c:1447
  do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
  do_syscall_64+0xec/0xf80 arch/x86/entry/syscall_64.c:94

Fixes: 8d0b94afdca8 ("ipv6: Keep track of DST_NOCACHE routes in case of iface down/unregister")
Fixes: 78df76a065ae ("ipv4: take rt_uncached_lock only if needed")
Reported-by: syzbot+179fc225724092b8b2b2@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/6964cdf2.050a0220.eaf7.009d.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20260112103825.3810713-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: hv_netvsc: reject RSS hash key programming without RX indirection table

RSS configuration requires a valid RX indirection table. When the device
reports a single receive queue, rndis_filter_device_add() does not
allocate an indirection table. Accepting RSS hash key updates in this
state leads to a hang.

Fix this by gating netvsc_set_rxfh() on ndc->rx_table_sz and return
-EOPNOTSUPP when the table is absent. This aligns set_rxfh with the device
capabilities and prevents incorrect behavior.
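
A minimal sketch of such a gate, in userspace C and under stated assumptions: fake_netvsc_ctx, fake_set_rxfh() and FAKE_HASH_KEYLEN are hypothetical names, and the real ethtool callback has a different signature. Only the "no table, no key programming" check mirrors the description above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FAKE_HASH_KEYLEN 40	/* hypothetical key length for this sketch */

struct fake_netvsc_ctx {
	uint32_t rx_table_sz;	/* 0 when no indirection table was allocated */
	uint8_t rss_key[FAKE_HASH_KEYLEN];
};

/* Refuse RSS key updates unless an RX indirection table exists. */
static int fake_set_rxfh(struct fake_netvsc_ctx *ndc, const uint8_t *key)
{
	assert(ndc != NULL);

	if (!ndc->rx_table_sz)
		return -EOPNOTSUPP;

	if (key)
		memcpy(ndc->rss_key, key, FAKE_HASH_KEYLEN);
	return 0;
}
```

Returning early keeps the stored key untouched on a single-queue device, so callers see a clean "not supported" error instead of a half-applied configuration.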

Fixes: 962f3fee83a4 ("netvsc: add ethtool ops to get/set RSS key")
Signed-off-by: Aditya Garg <gargaditya@linux.microsoft.com>
Reviewed-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://patch.msgid.link/1768212093-1594-1-git-send-email-gargaditya@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tools: ynl: render event op docs correctly

The docs for YNL event ops currently render raw python structs. For
example in:

https://docs.kernel.org/netlink/specs/ethtool.html#cable-test-ntf

event: {‘attributes’: [‘header’, ‘status’, ‘nest’], ‘__lineno__’: 2385}

Handle event ops correctly and render their op attributes:

event: attributes: [header, status]

Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20260112153436.75495-1-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'gfs2-for-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2

Pull gfs2 revert from Andreas Gruenbacher:
"Revert bad commit "gfs2: Fix use of bio_chain"

  I was originally assuming that there must be a bug in gfs2
  because gfs2 chains bios in the opposite direction of what
  bio_chain_and_submit() expects.

  It turns out that the bio chains are set up in "reverse direction"
  intentionally so that the first bio's bi_end_io callback is invoked
  rather than the last bio's callback.

  We want the first bio's callback invoked for the following reason: The
  initial bio starts page aligned and covers one or more pages. When it
  terminates at a non-page-aligned offset, subsequent bios are added to
  handle the remaining portion of the final page.

  Upon completion of the bio chain, all affected pages need to be
  marked as read, and only the first bio references all of these pages"

* tag 'gfs2-for-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
  Revert "gfs2: Fix use of bio_chain"

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull x86 kvm fixes from Paolo Bonzini:

- Avoid freeing stack-allocated node in kvm_async_pf_queue_task

- Clear XSTATE_BV[i] in guest XSAVE state whenever XFD[i]=1

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  selftests: kvm: Verify TILELOADD actually #NM faults when XFD[18]=1
  selftests: kvm: try getting XFD and XSAVE state out of sync
  selftests: kvm: replace numbered sync points with actions
  x86/fpu: Clear XSTATE_BV[i] in guest XSAVE state whenever XFD[i]=1
  x86/kvm: Avoid freeing stack-allocated node in kvm_async_pf_queue_task

Merge tag 'hyperv-fixes-signed-20260112' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux

Pull hyperv fixes from Wei Liu:

- Minor fixes and cleanups for the MSHV driver

* tag 'hyperv-fixes-signed-20260112' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  mshv: release mutex on region invalidation failure
  hyperv: Avoid -Wflex-array-member-not-at-end warning
  mshv: hide x86-specific functions on arm64
  mshv: Initialize local variables early upon region invalidation
  mshv: Use PMD_ORDER instead of HPAGE_PMD_ORDER when processing regions

net: add net.core.qdisc_max_burst

In blamed commit, I added a check against the temporary queue
built in __dev_xmit_skb(). Idea was to drop packets early,
before any spinlock was acquired.

	if (unlikely(defer_count > READ_ONCE(q->limit))) {
		kfree_skb_reason(skb, SKB_DROP_REASON_QDISC_DROP);
		return NET_XMIT_DROP;
	}

It turned out that HTB Qdisc has a zero q->limit.
HTB limits packets on a per-class basis.
Some of our tests became flaky.

Add a new sysctl, net.core.qdisc_max_burst, to control
how many packets can be stored in the temporary lockless queue.

Also add a new QDISC_BURST_DROP drop reason to better diagnose
future issues.
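
Why a zero q->limit made the earlier check misfire can be shown with a small userspace sketch. The names fake_qdisc, fake_drop_old() and fake_drop_new() are hypothetical, and the burst value is just a parameter here; the sketch makes no claim about the sysctl's default.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct fake_qdisc {
	unsigned int limit;	/* 0 for a qdisc that limits per class, as HTB does */
};

/* Check from the blamed commit: compares the deferred count with q->limit. */
static bool fake_drop_old(const struct fake_qdisc *q, unsigned int defer_count)
{
	assert(q != NULL);
	return defer_count > q->limit;
}

/* Reworked check: compares against a burst knob that is independent of the qdisc. */
static bool fake_drop_new(unsigned int defer_count, unsigned int max_burst)
{
	return defer_count > max_burst;
}
```

With limit == 0, fake_drop_old() reports a drop for any nonzero defer_count, while fake_drop_new() only does so once the count exceeds the configured burst.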

Thanks Neal !

Fixes: 100dfa74cad9 ("net: dev_queue_xmit() llist adoption")
Reported-and-bisected-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Link: https://patch.msgid.link/20260107104159.3669285-1-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: airoha: Fix typo in airoha_ppe_setup_tc_block_cb definition

Fix typo in the airoha_ppe_dev_setup_tc_block_cb routine definition when
CONFIG_NET_AIROHA is not enabled.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202601090517.Fj6v501r-lkp@intel.com/
Fixes: f45fc18b6de04 ("net: airoha: Add airoha_ppe_dev struct definition")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20260109-airoha_ppe_dev_setup_tc_block_cb-typo-v1-1-282e8834a9f9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: phy: motorcomm: fix duplex setting error for phy leds

Fix the duplex setting error for PHY LEDs.

Fixes: 355b82c54c12 ("net: phy: motorcomm: Add support for PHY LEDs on YT8521")
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20260108071409.2750607-1-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

bpf: Fix reference count leak in bpf_prog_test_run_xdp()

syzbot is reporting

unregister_netdevice: waiting for sit0 to become free. Usage count = 2

problem. A debug printk() patch found that a refcount is obtained at
xdp_convert_md_to_buff() from bpf_prog_test_run_xdp().

According to commit ec94670fcb3b ("bpf: Support specifying ingress via
xdp_md context in BPF_PROG_TEST_RUN"), the refcount obtained by
xdp_convert_md_to_buff() will be released by xdp_convert_buff_to_md().

Therefore, we can consider that the error handling path introduced by
commit 1c1949982524 ("bpf: introduce frags support to
bpf_prog_test_run_xdp()") forgot to call xdp_convert_buff_to_md().
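
The leak pattern described above, an error path that skips the matching release, can be sketched in userspace C. All names are hypothetical stand-ins (fake_netdev, fake_md_to_buff(), fake_buff_to_md(), fake_test_run()), and the real conversion helpers do more than adjust a counter.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct fake_netdev {
	int refcnt;
};

struct fake_xdp_ctx {
	struct fake_netdev *rx_dev;
};

/* Stand-in for xdp_convert_md_to_buff(): takes a device reference. */
static void fake_md_to_buff(struct fake_xdp_ctx *ctx, struct fake_netdev *dev)
{
	dev->refcnt++;
	ctx->rx_dev = dev;
}

/* Stand-in for xdp_convert_buff_to_md(): drops that reference again. */
static void fake_buff_to_md(struct fake_xdp_ctx *ctx)
{
	if (ctx->rx_dev) {
		ctx->rx_dev->refcnt--;
		ctx->rx_dev = NULL;
	}
}

static int fake_test_run(struct fake_netdev *dev, bool frag_setup_fails)
{
	struct fake_xdp_ctx ctx = { .rx_dev = NULL };
	int ret = 0;

	assert(dev != NULL);
	fake_md_to_buff(&ctx, dev);

	if (frag_setup_fails) {
		ret = -ENOMEM;
		goto out_put;	/* the error path must also reach the release */
	}

	/* ... the program would run here ... */

out_put:
	fake_buff_to_md(&ctx);
	return ret;
}
```

Routing both the success and the failure path through one label keeps the acquire/release pair balanced, which is what lets the device's usage count return to its baseline.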

Reported-by: syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84
Fixes: 1c1949982524 ("bpf: introduce frags support to bpf_prog_test_run_xdp()")
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/af090e53-9d9b-4412-8acb-957733b3975c@I-love.SAKURA.ne.jp
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Merge tag 'cgroup-for-6.19-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup fix from Tejun Heo:

- Fix -Wflex-array-member-not-at-end warnings in cgroup_root

* tag 'cgroup-for-6.19-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: Eliminate cgrp_ancestor_storage in cgroup_root

net: octeon_ep_vf: fix free_irq dev_id mismatch in IRQ rollback

octep_vf_request_irqs() requests MSI-X queue IRQs with dev_id set to
ioq_vector. If request_irq() fails part-way, the rollback loop calls
free_irq() with dev_id set to 'oct', which does not match the original
dev_id and may leave the irqaction registered.

This can keep IRQ handlers alive while ioq_vector is later freed during
unwind/teardown, leading to a use-after-free or crash when an interrupt
fires.

Fix the error path to free IRQs with the same ioq_vector dev_id used
during request_irq().
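
The dev_id matching rule can be illustrated with a toy IRQ table in userspace C. This is a sketch under loud assumptions: fake_request_irq(), fake_free_irq(), fake_request_irqs() and the one-cookie-per-line table are invented for illustration and do not model shared IRQs or the driver's real structures.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define FAKE_NR_IRQS 4u

/* One registered dev_id cookie per IRQ line; NULL means the line is free. */
static void *fake_irq_table[FAKE_NR_IRQS];

struct fake_ioq_vector {
	int id;
};

static int fake_request_irq(unsigned int irq, void *dev_id, bool inject_failure)
{
	assert(irq < FAKE_NR_IRQS && dev_id != NULL);
	if (inject_failure || fake_irq_table[irq])
		return -EBUSY;
	fake_irq_table[irq] = dev_id;
	return 0;
}

/* Like free_irq(), the handler is only removed when the dev_id cookie matches. */
static void fake_free_irq(unsigned int irq, void *dev_id)
{
	assert(irq < FAKE_NR_IRQS);
	if (fake_irq_table[irq] == dev_id)
		fake_irq_table[irq] = NULL;
}

static unsigned int fake_registered_count(void)
{
	unsigned int i, n = 0;

	for (i = 0; i < FAKE_NR_IRQS; i++)
		n += fake_irq_table[i] != NULL;
	return n;
}

/* Request one IRQ per vector; on failure, unwind with the same dev_id. */
static int fake_request_irqs(struct fake_ioq_vector *vecs, unsigned int n,
			     unsigned int fail_at)
{
	unsigned int i;
	int err = 0;

	assert(n <= FAKE_NR_IRQS);
	for (i = 0; i < n; i++) {
		err = fake_request_irq(i, &vecs[i], i == fail_at);
		if (err)
			goto rollback;
	}
	return 0;

rollback:
	while (i-- > 0)
		fake_free_irq(i, &vecs[i]);	/* same cookie as at request time */
	return err;
}
```

Calling fake_free_irq() with any other pointer leaves the entry registered, which is the toy equivalent of the stale irqaction described above.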

Fixes: 1cd3b407977c ("octeon_ep_vf: add Tx/Rx processing and interrupt support")
Signed-off-by: Kery Qi <qikeyu2017@gmail.com>
Link: https://patch.msgid.link/20260108164256.1749-2-qikeyu2017@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Revert "gfs2: Fix use of bio_chain"

This reverts commit 8a157e0a0aa5143b5d94201508c0ca1bb8cfb941.

That commit incorrectly assumed that the bio_chain() arguments were
swapped in gfs2. However, gfs2 intentionally constructs bio chains so
that the first bio's bi_end_io callback is invoked when all bios in the
chain have completed, unlike bio chains where the last bio's callback is
invoked.

Fixes: 8a157e0a0aa5 ("gfs2: Fix use of bio_chain")
Cc: stable@vger.kernel.org
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>

Linux 6.19-rc5

Merge tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux

Pull crypto library fixes from Eric Biggers:

- A couple more fixes for the lib/crypto KUnit tests

- Fix missing MMU protection for the AES S-box

* tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
  lib/crypto: aes: Fix missing MMU protection for AES S-box
  MAINTAINERS: add test vector generation scripts to "CRYPTO LIBRARY"
  lib/crypto: tests: Fix syntax error for old python versions
  lib/crypto: tests: polyval_kunit: Increase iterations for preparekey in IRQs

Merge tag 'char-misc-6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver fixes from Greg KH:
"Here are some small char/misc driver fixes for some reported issues.
  Included in here is:

   - much reported rust_binder fix

   - counter driver fixes

   - new device ids for the mei driver

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'char-misc-6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  rust_binder: remove spin_lock() in rust_shrink_free_page()
  mei: me: add nova lake point S DID
  counter: 104-quad-8: Fix incorrect return value in IRQ handler
  counter: interrupt-cnt: Drop IRQF_NO_THREAD flag

Merge tag 'x86-urgent-2026-01-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fix from Ingo Molnar:
"Disable GCOV instrumentation in the SEV noinstr.c collection of SEV
noinstr methods, to further robustify the code"

* tag 'x86-urgent-2026-01-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/sev: Disable GCOV on noinstr object

Merge tag 'sched-urgent-2026-01-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fix from Ingo Molnar:
"Fix a crash in sched_mm_cid_after_execve()"

* tag 'sched-urgent-2026-01-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/mm_cid: Prevent NULL mm dereference in sched_mm_cid_after_execve()

Merge tag 'perf-urgent-2026-01-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf event fix from Ingo Molnar:
"Fix perf swevent hrtimer deinit regression"

* tag 'perf-urgent-2026-01-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf: Ensure swevent hrtimer is properly destroyed

Merge tag 'irq-urgent-2026-01-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull misc irqchip fixes from Ingo Molnar:

- Fix an endianness bug in the gic-v5 irqchip driver

- Revert a broken commit from the riscv-imsic irqchip driver

* tag 'irq-urgent-2026-01-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Revert "irqchip/riscv-imsic: Embed the vector array in lpriv"
irqchip/gic-v5: Fix gicv5_its_map_event() ITTE read endianness

treewide: Update email address

In a vain attempt to consolidate the email zoo switch everything to the
kernel.org account.

Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'riscv-for-linus-6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Paul Walmsley:
"Notable changes include a fix to close one common microarchitectural
  attack vector for out-of-order cores. Another patch exposed an
  omission in my boot test coverage, which is currently missing
  relocatable kernels. Otherwise, the fixes seem to be settling down for
  us.

   - Fix CONFIG_RELOCATABLE=y boots by building Image files from
     vmlinux, rather than vmlinux.unstripped, now that the .modinfo
     section is included in vmlinux.unstripped

   - Prevent branch predictor poisoning microarchitectural attacks that
     use the syscall index as a vector by using array_index_nospec() to
     clamp the index after the bounds check (as x86 and ARM64 already
     do)

   - Fix a crash in test_kprobes when building with Clang

   - Fix a deadlock possible when tracing is enabled for SBI ecalls

   - Fix the definition of the Zk standard RISC-V ISA extension bundle,
     which was missing the Zknh extension

   - A few other miscellaneous non-functional cleanups, removing unused
     macros, fixing an out-of-date path in code comments, resolving a
     compile-time warning for a type mismatch in a pr_crit(), and
     removing an unnecessary header file inclusion"

* tag 'riscv-for-linus-6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: trace: fix snapshot deadlock with sbi ecall
  riscv: remove irqflags.h inclusion in asm/bitops.h
  riscv: cpu_ops_sbi: smp_processor_id() returns int, not unsigned int
  riscv: configs: Clean up references to non-existing configs
  riscv: kexec_image: Fix dead link to boot-image-header.rst
  riscv: pgtable: Cleanup useless VA_USER_XXX definitions
  riscv: cpufeature: Fix Zk bundled extension missing Zknh
  riscv: fix KUnit test_kprobes crash when building with Clang
  riscv: Sanitize syscall table indexing under speculation
  riscv: boot: Always make Image from vmlinux, not vmlinux.unstripped

Merge tag 'driver-core-6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core

Pull driver core fixes from Danilo Krummrich:

- Fix swapped example values for the `family` and `machine` attributes
   in the sysfs SoC bus ABI documentation

- Fix Rust build and intra-doc issues when optional subsystems
   (CONFIG_PCI, CONFIG_AUXILIARY_BUS, CONFIG_PRINTK) are disabled

- Fix typos and incorrect safety comments in Rust PCI, DMA, and
   device ID documentation

* tag 'driver-core-6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core:
  rust: device: Remove explicit import of CStrExt
  rust: pci: fix typos in Bar struct's comments
  rust: device: fix broken intra-doc links
  rust: dma: fix broken intra-doc links
  rust: driver: fix broken intra-doc links to example driver types
  rust: device_id: replace incorrect word in safety documentation
  rust: dma: remove incorrect safety documentation
  docs: ABI: sysfs-devices-soc: Fix swapped sample values

Merge tag 'linux_kselftest-fixes-6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest fix from Shuah Khan:
"Fix tracing test_multiple_writes stalls when buffer_size_kb is less
than 12KB"

* tag 'linux_kselftest-fixes-6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests/tracing: Fix test_multiple_writes stall

Merge branch 'mlx5e-profile-change-fix'

Saeed Mahameed says:

====================
mlx5e profile change fix

This series fixes a crash in mlx5e due to profile change error flow.
====================

Link: https://patch.msgid.link/20260108212657.25090-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5e: Restore destroying state bit after profile cleanup

Profile rollback can fail in mlx5e_netdev_change_profile(), leaving an
invalid mlx5e_priv that has been memset to 0. We must maintain the
'destroying' bit in order to shut down gracefully even if the
profile/priv are not valid.
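
Carrying one state bit across a wholesale memset can be sketched as follows. This is a userspace illustration with hypothetical names (fake_priv, FAKE_STATE_DESTROYING, FAKE_STATE_OPENED, fake_priv_cleanup()); the driver's real structure and bit helpers differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FAKE_STATE_DESTROYING (1UL << 0)	/* hypothetical bit positions */
#define FAKE_STATE_OPENED     (1UL << 1)

struct fake_priv {
	unsigned long state;
	const void *profile;
	int channels;
};

/* Wipe the whole structure, but carry the 'destroying' bit across the wipe. */
static void fake_priv_cleanup(struct fake_priv *priv)
{
	unsigned long destroying;

	assert(priv != NULL);
	destroying = priv->state & FAKE_STATE_DESTROYING;
	memset(priv, 0, sizeof(*priv));
	priv->state |= destroying;
}
```

After cleanup every other field reads as zero, yet a later remove path can still tell that teardown was already in progress.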

This patch preserves the previous value of the 'destroying' bit of
mlx5e_priv across priv cleanup, so that the remove flow can clean up
common resources from mlx5_core and avoid FW fatal errors as seen below:

$ devlink dev eswitch set pci/0000:00:03.0 mode switchdev
Error: mlx5_core: Failed setting eswitch to offloads.
dmesg: mlx5_core 0000:00:03.0 enp0s3np0: failed to rollback to orig profile, ...

$ devlink dev reload pci/0000:00:03.0

mlx5_core 0000:00:03.0: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0)
mlx5_core 0000:00:03.0: poll_health:803:(pid 519): Fatal error 3 detected
mlx5_core 0000:00:03.0: firmware version: 28.41.1000
mlx5_core 0000:00:03.0: 0.000 Gb/s available PCIe bandwidth (Unknown x255 link)
mlx5_core 0000:00:03.0: mlx5_function_enable:1200:(pid 519): enable hca failed
mlx5_core 0000:00:03.0: mlx5_function_enable:1200:(pid 519): enable hca failed
mlx5_core 0000:00:03.0: mlx5_health_try_recover:340:(pid 141): handling bad device here
mlx5_core 0000:00:03.0: mlx5_handle_bad_state:285:(pid 141): Expected to see disabled NIC but it is full driver
mlx5_core 0000:00:03.0: mlx5_error_sw_reset:236:(pid 141): start
mlx5_core 0000:00:03.0: NIC IFC still 0 after 4000ms.

Fixes: c4d7eb57687f ("net/mxl5e: Add change profile method")
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260108212657.25090-5-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5e: Pass netdev to mlx5e_destroy_netdev instead of priv

mlx5e_priv is an unstable structure that can be memset(0) if profile
attaching fails.

Pass netdev to mlx5e_destroy_netdev() to guarantee it will work on a
valid netdev.

In mlx5e_remove(): check the validity of priv->profile before attempting
to clean up any resources that might not be there.

This fixes a kernel oops in mlx5e_remove when switchdev mode fails due
to change profile failure.

$ devlink dev eswitch set pci/0000:00:03.0 mode switchdev
Error: mlx5_core: Failed setting eswitch to offloads.
dmesg:
workqueue: Failed to create a rescuer kthread for wq "mlx5e": -EINTR
mlx5_core 0012:03:00.1: mlx5e_netdev_init_profile:6214:(pid 37199): mlx5e_priv_init failed, err=-12
mlx5_core 0012:03:00.1 gpu3rdma1: mlx5e_netdev_change_profile: new profile init failed, -12
workqueue: Failed to create a rescuer kthread for wq "mlx5e": -EINTR
mlx5_core 0012:03:00.1: mlx5e_netdev_init_profile:6214:(pid 37199): mlx5e_priv_init failed, err=-12
mlx5_core 0012:03:00.1 gpu3rdma1: mlx5e_netdev_change_profile: failed to rollback to orig profile, -12

$ devlink dev reload pci/0000:00:03.0 ==> oops

BUG: kernel NULL pointer dereference, address: 0000000000000370
PGD 0 P4D 0
Oops: Oops: 0000 [#1] SMP NOPTI
CPU: 15 UID: 0 PID: 520 Comm: devlink Not tainted 6.18.0-rc5+ #115 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014
RIP: 0010:mlx5e_dcbnl_dscp_app+0x23/0x100
RSP: 0018:ffffc9000083f8b8 EFLAGS: 00010286
RAX: ffff8881126fc380 RBX: ffff8881015ac400 RCX: ffffffff826ffc45
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8881035109c0
RBP: ffff8881035109c0 R08: ffff888101e3e838 R09: ffff888100264e10
R10: ffffc9000083f898 R11: ffffc9000083f8a0 R12: ffff888101b921a0
R13: ffff888101b921a0 R14: ffff8881015ac9a0 R15: ffff8881015ac400
FS: 00007f789a3c8740(0000) GS:ffff88856aa59000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000370 CR3: 000000010b6c0001 CR4: 0000000000370ef0
Call Trace:
<TASK>
mlx5e_remove+0x57/0x110
device_release_driver_internal+0x19c/0x200
bus_remove_device+0xc6/0x130
device_del+0x160/0x3d0
? devl_param_driverinit_value_get+0x2d/0x90
mlx5_detach_device+0x89/0xe0
mlx5_unload_one_devl_locked+0x3a/0x70
mlx5_devlink_reload_down+0xc8/0x220
devlink_reload+0x7d/0x260
devlink_nl_reload_doit+0x45b/0x5a0
genl_family_rcv_msg_doit+0xe8/0x140

Fixes: c4d7eb57687f ("net/mxl5e: Add change profile method")
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Shay Drori <shayd@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260108212657.25090-4-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5e: Don't store mlx5e_priv in mlx5e_dev devlink priv

mlx5e_priv is an unstable structure that can be memset(0) if profile
attaching fails. The mlx5e_priv in the mlx5e_dev devlink private data is
used to reference the netdev and mdev associated with that struct.
Instead, store netdev directly in mlx5e_dev and get mdev from the
containing mlx5_adev aux device structure.

This fixes a kernel oops in mlx5e_remove when switchdev mode fails due
to change profile failure.

$ devlink dev eswitch set pci/0000:00:03.0 mode switchdev
Error: mlx5_core: Failed setting eswitch to offloads.
dmesg:
workqueue: Failed to create a rescuer kthread for wq "mlx5e": -EINTR
mlx5_core 0012:03:00.1: mlx5e_netdev_init_profile:6214:(pid 37199): mlx5e_priv_init failed, err=-12
mlx5_core 0012:03:00.1 gpu3rdma1: mlx5e_netdev_change_profile: new profile init failed, -12
workqueue: Failed to create a rescuer kthread for wq "mlx5e": -EINTR
mlx5_core 0012:03:00.1: mlx5e_netdev_init_profile:6214:(pid 37199): mlx5e_priv_init failed, err=-12
mlx5_core 0012:03:00.1 gpu3rdma1: mlx5e_netdev_change_profile: failed to rollback to orig profile, -12

$ devlink dev reload pci/0000:00:03.0 ==> oops

BUG: kernel NULL pointer dereference, address: 0000000000000520
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: Oops: 0000 [#1] SMP NOPTI
CPU: 3 UID: 0 PID: 521 Comm: devlink Not tainted 6.18.0-rc5+ #117 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014
RIP: 0010:mlx5e_remove+0x68/0x130
RSP: 0018:ffffc900034838f0 EFLAGS: 00010246
RAX: ffff88810283c380 RBX: ffff888101874400 RCX: ffffffff826ffc45
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
RBP: ffff888102d789c0 R08: ffff8881007137f0 R09: ffff888100264e10
R10: ffffc90003483898 R11: ffffc900034838a0 R12: ffff888100d261a0
R13: ffff888100d261a0 R14: ffff8881018749a0 R15: ffff888101874400
FS: 00007f8565fea740(0000) GS:ffff88856a759000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000520 CR3: 000000010b11a004 CR4: 0000000000370ef0
Call Trace:
<TASK>
device_release_driver_internal+0x19c/0x200
bus_remove_device+0xc6/0x130
device_del+0x160/0x3d0
? devl_param_driverinit_value_get+0x2d/0x90
mlx5_detach_device+0x89/0xe0
mlx5_unload_one_devl_locked+0x3a/0x70
mlx5_devlink_reload_down+0xc8/0x220
devlink_reload+0x7d/0x260
devlink_nl_reload_doit+0x45b/0x5a0
genl_family_rcv_msg_doit+0xe8/0x140

Fixes: ee75f1fc44dd ("net/mlx5e: Create separate devlink instance for ethernet auxiliary device")
Fixes: c4d7eb57687f ("net/mxl5e: Add change profile method")
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://patch.msgid.link/20260108212657.25090-3-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5e: Fix crash on profile change rollback failure

mlx5e_netdev_change_profile() can fail to attach a new profile and can
also fail to roll back to the old profile. In that case, we could end up
with a dangling netdev with a fully reset netdev_priv. A retry to change
the profile, e.g. another attempt to call mlx5e_netdev_change_profile()
via a switchdev mode change, will crash trying to access the now NULL
priv->mdev.

This fix allows mlx5e_netdev_change_profile() to handle previous
failures and an empty priv, by not assuming priv is valid.

Pass netdev and mdev to all flows requiring
mlx5e_netdev_change_profile() and avoid passing priv.
In mlx5e_netdev_change_profile() check if current priv is valid, and if
not, just attach the new profile without trying to access the old one.
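
The control flow described above can be sketched as a small userspace model. This is only an illustration under assumptions: every `toy_*` name is hypothetical, the structures are not the real mlx5e ones, and the only detail taken from the log above is the -12 error code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; these are not the real mlx5e structures. */
struct toy_profile { const char *name; bool init_fails; };
struct toy_priv { const struct toy_profile *profile; void *mdev; };
struct toy_netdev { struct toy_priv priv; };

/* Attach a profile; on failure the private area is left fully zeroed. */
static int toy_attach(struct toy_netdev *nd, void *mdev,
		      const struct toy_profile *p)
{
	if (p->init_fails) {
		memset(&nd->priv, 0, sizeof(nd->priv));
		return -12; /* same code as err=-12 in the log above */
	}
	nd->priv.profile = p;
	nd->priv.mdev = mdev;
	return 0;
}

/* Takes netdev and mdev from the caller and never trusts the old priv. */
int toy_change_profile(struct toy_netdev *nd, void *mdev,
		       const struct toy_profile *new_p)
{
	const struct toy_profile *orig;
	int err;

	assert(nd && mdev && new_p);
	orig = nd->priv.profile;

	/* An earlier failure left priv empty: just attach the new profile. */
	if (!orig)
		return toy_attach(nd, mdev, new_p);

	memset(&nd->priv, 0, sizeof(nd->priv)); /* detach the old profile */
	err = toy_attach(nd, mdev, new_p);
	if (err)
		toy_attach(nd, mdev, orig); /* rollback; may fail and leave priv empty */
	return err;
}
```

Calling `toy_change_profile()` on a netdev whose priv was zeroed by a failed rollback takes the early-return branch instead of dereferencing stale fields, which is the property the fix is after.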

This fixes the following oops, seen when enabling switchdev mode a 2nd
time after the first attempt failed:

## Enabling switchdev mode first time:

mlx5_core 0012:03:00.1: E-Switch: Supported tc chains and prios offload
workqueue: Failed to create a rescuer kthread for wq "mlx5e": -EINTR
mlx5_core 0012:03:00.1: mlx5e_netdev_init_profile:6214:(pid 37199): mlx5e_priv_init failed, err=-12
mlx5_core 0012:03:00.1 gpu3rdma1: mlx5e_netdev_change_profile: new profile init failed, -12
workqueue: Failed to create a rescuer kthread for wq "mlx5e": -EINTR
mlx5_core 0012:03:00.1: mlx5e_netdev_init_profile:6214:(pid 37199): mlx5e_priv_init failed, err=-12
mlx5_core 0012:03:00.1 gpu3rdma1: mlx5e_netdev_change_profile: failed to rollback to orig profile, -12
                                                                         ^^^^^^^^
mlx5_core 0000:00:03.0: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0)

## retry: Enabling switchdev mode 2nd time:

mlx5_core 0000:00:03.0: E-Switch: Supported tc chains and prios offload
BUG: kernel NULL pointer dereference, address: 0000000000000038
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: Oops: 0000 [#1] SMP NOPTI
CPU: 13 UID: 0 PID: 520 Comm: devlink Not tainted 6.18.0-rc4+ #91 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014
RIP: 0010:mlx5e_detach_netdev+0x3c/0x90
Code: 50 00 00 f0 80 4f 78 02 48 8b bf e8 07 00 00 48 85 ff 74 16 48 8b 73 78 48 d1 ee 83 e6 01 83 f6 01 40 0f b6 f6 e8 c4 42 00 00 <48> 8b 45 38 48 85 c0 74 08 48 89 df e8 cc 47 40 1e 48 8b bb f0 07
RSP: 0018:ffffc90000673890 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8881036a89c0 RCX: 0000000000000000
RDX: ffff888113f63800 RSI: ffffffff822fe720 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000002dcd R09: 0000000000000000
R10: ffffc900006738e8 R11: 00000000ffffffff R12: 0000000000000000
R13: 0000000000000000 R14: ffff8881036a89c0 R15: 0000000000000000
FS:  00007fdfb8384740(0000) GS:ffff88856a9d6000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000038 CR3: 0000000112ae0005 CR4: 0000000000370ef0
Call Trace:
<TASK>
mlx5e_netdev_change_profile+0x45/0xb0
mlx5e_vport_rep_load+0x27b/0x2d0
mlx5_esw_offloads_rep_load+0x72/0xf0
esw_offloads_enable+0x5d0/0x970
mlx5_eswitch_enable_locked+0x349/0x430
? is_mp_supported+0x57/0xb0
mlx5_devlink_eswitch_mode_set+0x26b/0x430
devlink_nl_eswitch_set_doit+0x6f/0xf0
genl_family_rcv_msg_doit+0xe8/0x140
genl_rcv_msg+0x18b/0x290
? __pfx_devlink_nl_pre_doit+0x10/0x10
? __pfx_devlink_nl_eswitch_set_doit+0x10/0x10
? __pfx_devlink_nl_post_doit+0x10/0x10
? __pfx_genl_rcv_msg+0x10/0x10
netlink_rcv_skb+0x52/0x100
genl_rcv+0x28/0x40
netlink_unicast+0x282/0x3e0
? __alloc_skb+0xd6/0x190
netlink_sendmsg+0x1f7/0x430
__sys_sendto+0x213/0x220
? __sys_recvmsg+0x6a/0xd0
__x64_sys_sendto+0x24/0x30
do_syscall_64+0x50/0x1f0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7fdfb8495047

Fixes: c4d7eb57687f ("net/mxl5e: Add change profile method")
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260108212657.25090-2-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'linux-can-fixes-for-6.19-20260109' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2026-01-09

The first patch is by Szymon Wilczek and fixes a potential memory leak
in the etas_es58x driver.

The 2nd patch is by me, targets the gs_usb driver and fixes a URB
memory leak.

Ondrej Ille's patch fixes the transceiver delay compensation in the
ctucanfd driver, which is needed for bit rates higher than 1 Mbit/s.

* tag 'linux-can-fixes-for-6.19-20260109' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
  can: ctucanfd: fix SSP_SRC in cases when bit-rate is higher than 1 MBit.
  can: gs_usb: gs_usb_receive_bulk_callback(): fix URB memory leak
  can: etas_es58x: allow partial RX URB allocation to succeed
====================

Link: https://patch.msgid.link/20260109135311.576033-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'for-net-2026-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth

Luiz Augusto von Dentz says:

====================
bluetooth pull request for net:

- hci_sync: enable PA Sync Lost event

* tag 'for-net-2026-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
Bluetooth: hci_sync: enable PA Sync Lost event
====================

Link: https://patch.msgid.link/20260109211949.236218-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

vsock/test: add a final full barrier after run all tests

If the last test fails, the other side still completes correctly,
which could lead to false positives.

Let's add a final barrier that ensures that the last test has finished
correctly on both sides, but also that the two sides agree on the
number of tests to be performed.

Fixes: 2f65b44e199c ("VSOCK: add full barrier between test cases")
Reviewed-by: Luigi Leonardi <leonardi@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20260108114419.52747-1-sgarzare@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ipv4: ip_gre: make ipgre_header() robust

Analogous to commit db5b4e39c4e6 ("ip6_gre: make ip6gre_header() robust")

Over the years, syzbot found many ways to crash the kernel
in ipgre_header() [1].

This involves the ability of the team or bonding drivers to dynamically
change their dev->needed_headroom and/or dev->hard_header_len.

In this particular crash mld_newpack() allocated an skb
with a too small reserve/headroom, and by the time mld_sendpack()
was called, syzbot managed to attach an ipgre device.
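
The failure mode is a header push into a buffer whose headroom is smaller than the header. A minimal userspace sketch of a checked push follows. The `toy_*` names are hypothetical and this is not the code of the patch, which may handle the short-headroom case differently (for example by growing the buffer rather than failing).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of an skb: head is the buffer start, data the payload start. */
struct toy_skb {
	unsigned char *head;
	unsigned char *data;
	size_t len;
};

static size_t toy_headroom(const struct toy_skb *skb)
{
	return (size_t)(skb->data - skb->head);
}

/*
 * Prepend hdr_len bytes only if they fit in the headroom.
 * Returns the new data pointer, or NULL without touching the buffer.
 */
unsigned char *toy_push_checked(struct toy_skb *skb, size_t hdr_len)
{
	assert(skb && skb->data >= skb->head);
	if (toy_headroom(skb) < hdr_len)
		return NULL; /* an unchecked push here is the skb_under_panic case */
	skb->data -= hdr_len;
	skb->len += hdr_len;
	return skb->data;
}
```

With 8 bytes of headroom, a 24-byte push is refused and the buffer is left unchanged, while an 8-byte push succeeds and moves `data` back to `head`.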

[1]
skbuff: skb_under_panic: text:ffffffff89ea3cb7 len:2030915468 put:2030915372 head:ffff888058b43000 data:ffff887fdfa6e194 tail:0x120 end:0x6c0 dev:team0
kernel BUG at net/core/skbuff.c:213 !
Oops: invalid opcode: 0000 [#1] SMP KASAN PTI
CPU: 1 UID: 0 PID: 1322 Comm: kworker/1:9 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025
Workqueue: mld mld_ifc_work
RIP: 0010:skb_panic+0x157/0x160 net/core/skbuff.c:213
Call Trace:
<TASK>
  skb_under_panic net/core/skbuff.c:223 [inline]
  skb_push+0xc3/0xe0 net/core/skbuff.c:2641
  ipgre_header+0x67/0x290 net/ipv4/ip_gre.c:897
  dev_hard_header include/linux/netdevice.h:3436 [inline]
  neigh_connected_output+0x286/0x460 net/core/neighbour.c:1618
  NF_HOOK_COND include/linux/netfilter.h:307 [inline]
  ip6_output+0x340/0x550 net/ipv6/ip6_output.c:247
  NF_HOOK+0x9e/0x380 include/linux/netfilter.h:318
  mld_sendpack+0x8d4/0xe60 net/ipv6/mcast.c:1855
  mld_send_cr net/ipv6/mcast.c:2154 [inline]
  mld_ifc_work+0x83e/0xd60 net/ipv6/mcast.c:2693
  process_one_work kernel/workqueue.c:3257 [inline]
  process_scheduled_works+0xad1/0x1770 kernel/workqueue.c:3340
  worker_thread+0x8a0/0xda0 kernel/workqueue.c:3421
  kthread+0x711/0x8a0 kernel/kthread.c:463
  ret_from_fork+0x510/0xa50 arch/x86/kernel/process.c:158
  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246

Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.")
Reported-by: syzbot+7c134e1c3aa3283790b9@syzkaller.appspotmail.com
Closes: https://www.spinics.net/lists/netdev/msg1147302.html
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260108190214.1667040-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'virtio-net-fix-the-deadlock-when-disabling-rx-napi'

Bui Quang Minh says:

====================
virtio-net: fix the deadlock when disabling rx NAPI

Calling napi_disable() on an already disabled napi can cause a
deadlock. In commit 4bc12818b363 ("virtio-net: disable delayed refill
when pausing rx"), to avoid the deadlock, when pausing the RX in
virtnet_rx_pause[_all](), we disable and cancel the delayed refill work.
However, in virtnet_rx_resume_all(), we enable the delayed refill
work too early, before enabling all the receive queue napis.

The deadlock can be reproduced by running
selftests/drivers/net/hw/xsk_reconfig.py with a multiqueue virtio-net
device and inserting a cond_resched() inside the for loop in
virtnet_rx_resume_all() to increase the success rate. Because the worker
processing the delayed refill work runs on the same CPU as
virtnet_rx_resume_all(), a reschedule is needed to cause the deadlock.
In a real scenario, contention on netdev_lock can cause the
reschedule.

Due to the complexity of the delayed refill worker, this series removes
it. When we fail to refill the receive buffer, we retry in the next
NAPI poll instead.

- Patch 1: stops scheduling the delayed refill worker and retries the
  refill in the next NAPI poll
- Patches 2, 3: remove and clean up the unused delayed refill worker code

For testing, I've run the following tests with no issues so far:
- selftests/drivers/net/hw/xsk_reconfig.py, which sets up XDP zerocopy
  without providing any descriptors to the fill ring. As a result,
  try_fill_recv will always fail.
- Send TCP packets from host to guest while the guest is nearly OOM and
  some try_fill_recv calls fail.

v2: https://lore.kernel.org/20260102152023.10773-1-minhquangbui99@gmail.com
v1: https://lore.kernel.org/20251223152533.24364-1-minhquangbui99@gmail.com

Link to the previous approach and discussion:
https://lore.kernel.org/20251212152741.11656-1-minhquangbui99@gmail.com
====================

Link: https://patch.msgid.link/20260106150438.7425-1-minhquangbui99@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

virtio-net: clean up __virtnet_rx_pause/resume

The delayed refill worker has been removed, which makes
virtnet_rx_pause/resume nearly identical to __virtnet_rx_pause/resume.
So remove __virtnet_rx_pause/resume and move the code to
virtnet_rx_pause/resume.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
Link: https://patch.msgid.link/20260106150438.7425-4-minhquangbui99@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

virtio-net: remove unused delayed refill worker

Since we switched to retrying the receive buffer refill in NAPI poll
instead of in a delayed worker, remove all the now unused delayed refill
worker code.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
Link: https://patch.msgid.link/20260106150438.7425-3-minhquangbui99@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

virtio-net: don't schedule delayed refill worker

When we fail to refill the receive buffers, we schedule a delayed worker
to retry later. However, this worker creates some concurrency issues.
For example, when the worker runs concurrently with virtnet_xdp_set,
both need to temporarily disable the queue's NAPI before enabling it again.
Without proper synchronization, a deadlock can happen when
napi_disable() is called on an already disabled NAPI. That
napi_disable() call will be stuck and so will the subsequent
napi_enable() call.

To simplify the logic and avoid further problems, we will instead retry
refilling in the next NAPI poll.
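
The idea of retrying inside the poll loop instead of in a separate worker can be sketched as a toy model. All `toy_*` names and the `mem_budget` parameter (how many buffers can be allocated right now) are assumptions made for illustration; this is not the virtio-net code, and it says nothing about how the real driver arranges to be polled again.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical receive queue: free_slots are empty ring entries. */
struct toy_rq {
	int free_slots;
	int avail_buffers;
	bool need_refill; /* set when the last refill attempt came up short */
};

/* Fill every free slot while allocations succeed; false means "ran out". */
static bool toy_try_fill(struct toy_rq *rq, int *mem_budget)
{
	while (rq->free_slots > 0) {
		if (*mem_budget == 0)
			return false;
		(*mem_budget)--;
		rq->free_slots--;
		rq->avail_buffers++;
	}
	return true;
}

/* One poll: consume up to budget buffers, then refill in the same context. */
int toy_napi_poll(struct toy_rq *rq, int budget, int *mem_budget)
{
	int done = 0;

	assert(rq && mem_budget && budget >= 0);
	while (done < budget && rq->avail_buffers > 0) {
		rq->avail_buffers--;
		rq->free_slots++;
		done++;
	}
	/* No worker is scheduled on failure; the next poll simply tries again. */
	rq->need_refill = !toy_try_fill(rq, mem_budget);
	return done;
}
```

Because the refill only ever runs from the poll function in this model, there is no second context that needs to disable and re-enable the same NAPI instance.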

Fixes: 4bc12818b363 ("virtio-net: disable delayed refill when pausing rx")
Reported-by: Paolo Abeni <pabeni@redhat.com>
Closes: https://lore.kernel.org/526b5396-459d-4d02-8635-a222d07b46d7@redhat.com
Cc: stable@vger.kernel.org
Suggested-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://patch.msgid.link/20260106150438.7425-2-minhquangbui99@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'iommu-fixes-v6.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux

Pull iommu fixes from Joerg Roedel:

- several Kconfig-related build fixes

- fix for when gcc 8.5 on PPC refuses to inline a function from a
   header file

* tag 'iommu-fixes-v6.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux:
  iommupt: Make pt_feature() always_inline
  iommufd/selftest: Prevent module/builtin conflicts in kconfig
  iommufd/selftest: Add missing kconfig for DMA_SHARED_BUFFER
  iommupt: Fix the kunit building

erofs: fix file-backed mounts no longer working on EROFS partitions

Sheng Yong reported [1] that Android APEX images didn't work with commit
072a7c7cdbea ("erofs: don't bother with s_stack_depth increasing for
now") because "EROFS-formatted APEX file images can be stored within an
EROFS-formatted Android system partition."

In response, I sent a quick fat-fingered [PATCH v3] to address the
report.  Unfortunately, the updated condition was incorrect:

         if (erofs_is_fileio_mode(sbi)) {
-            sb->s_stack_depth =
-                file_inode(sbi->dif0.file)->i_sb->s_stack_depth + 1;
-            if (sb->s_stack_depth > FILESYSTEM_MAX_STACK_DEPTH) {
-                erofs_err(sb, "maximum fs stacking depth exceeded");
+            inode = file_inode(sbi->dif0.file);
+            if ((inode->i_sb->s_op == &erofs_sops && !sb->s_bdev) ||
+                inode->i_sb->s_stack_depth) {

The condition `!sb->s_bdev` is always true for all file-backed EROFS
mounts, making the check effectively a no-op.

The real fix tested and confirmed by Sheng Yong [2] at that time was
[PATCH v3 RESEND], which correctly ensures the following EROFS^2 setup
works:
    EROFS (on a block device) + EROFS (file-backed mount)

But sadly I screwed it up again by upstreaming the outdated [PATCH v3].

This patch applies the same logic as the delta between the upstream
[PATCH v3] and the real fix [PATCH v3 RESEND].

Reported-by: Sheng Yong <shengyong1@xiaomi.com>
Closes: https://lore.kernel.org/r/3acec686-4020-4609-aee4-5dae7b9b0093@gmail.com [1]
Fixes: 072a7c7cdbea ("erofs: don't bother with s_stack_depth increasing for now")
Link: https://lore.kernel.org/r/243f57b8-246f-47e7-9fb1-27a771e8e9e8@gmail.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

iommupt: Make pt_feature() always_inline

gcc 8.5 on powerpc does not automatically inline these functions even
though they evaluate to constants in key cases. Since the constant
propagation is essential for some code elimination and build-time checks,
this causes a build failure:

ERROR: modpost: "__pt_no_sw_bit" [drivers/iommu/generic_pt/fmt/iommu_amdv1.ko] undefined!

Caused by this:

    if (pts_feature(&pts, PT_FEAT_DMA_INCOHERENT) &&
        !pt_test_sw_bit_acquire(&pts,
                                SW_BIT_CACHE_FLUSH_DONE))
            flush_writes_item(&pts);

Where pts_feature() evaluates to a constant false. Mark them as
__always_inline to force them to evaluate to a constant and trigger the
code elimination.
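
A userspace sketch of the pattern, assuming GCC or Clang, which both accept `__attribute__((always_inline))`. The `toy_*` names and the feature mask are hypothetical. Unlike the kernel case, `toy_flush_writes()` is defined here so the example links at any optimization level.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_FEAT_DMA_INCOHERENT (1u << 0)
#define TOY_SUPPORTED_FEATURES  0u /* this toy format supports no optional features */

/*
 * Forced inlining lets the compiler see the constant result at each call
 * site instead of emitting an out-of-line call whose result is opaque.
 */
static inline __attribute__((always_inline)) bool toy_feature(unsigned int feat)
{
	assert(feat != 0); /* an empty mask would be a caller bug */
	return (TOY_SUPPORTED_FEATURES & feat) != 0;
}

static int toy_flush_calls;

static void toy_flush_writes(void)
{
	toy_flush_calls++;
}

void toy_update_item(void)
{
	/* The condition is constant false once toy_feature() is inlined. */
	if (toy_feature(TOY_FEAT_DMA_INCOHERENT))
		toy_flush_writes();
}
```

In the kernel case the guarded call targets a symbol that is deliberately left undefined, so the build only links when the compiler actually removes the dead branch.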

Fixes: 7c5b184db714 ("genpt: Generic Page Table base API")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202512230720.9y9DtWIo-lkp@intel.com/
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

iommufd/selftest: Prevent module/builtin conflicts in kconfig

The selftest now depends on the AMDv1 page table; however, the selftest
kconfig itself is just a sub-option of the main IOMMUFD module kconfig.

This means it cannot be modular and so kconfig allowed a modular
IOMMU_PT_AMDV1 with a built in IOMMUFD. This causes link failures:

   ld: vmlinux.o: in function `mock_domain_alloc_pgtable.isra.0':
   selftest.c:(.text+0x12e8ad3): undefined reference to `pt_iommu_amdv1_init'
   ld: vmlinux.o: in function `BSWAP_SHUFB_CTL':
   sha1-avx2-asm.o:(.rodata+0xaa36a8): undefined reference to `pt_iommu_amdv1_read_and_clear_dirty'
   ld: sha1-avx2-asm.o:(.rodata+0xaa36f0): undefined reference to `pt_iommu_amdv1_map_pages'
   ld: sha1-avx2-asm.o:(.rodata+0xaa36f8): undefined reference to `pt_iommu_amdv1_unmap_pages'
   ld: sha1-avx2-asm.o:(.rodata+0xaa3720): undefined reference to `pt_iommu_amdv1_iova_to_phys'

Adjust the kconfig to disable IOMMUFD_TEST if IOMMU_PT_AMDV1 is incompatible.

Fixes: e93d5945ed5b ("iommufd: Change the selftest to use iommupt instead of xarray")
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202512210135.freQWpxa-lkp@intel.com/
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>

iommufd/selftest: Add missing kconfig for DMA_SHARED_BUFFER

The test doesn't build without it because dma-buf.h does not provide stub
functions if it is not enabled. Compilation can fail with:

ERROR:root:ld: vmlinux.o: in function `iommufd_test':
(.text+0x3b1cdd): undefined reference to `dma_buf_get'
ld: (.text+0x3b1d08): undefined reference to `dma_buf_put'
ld: (.text+0x3b2105): undefined reference to `dma_buf_export'
ld: (.text+0x3b211f): undefined reference to `dma_buf_fd'
ld: (.text+0x3b2e47): undefined reference to `dma_buf_move_notify'

Add the missing select.

Fixes: d2041f1f11dd ("iommufd/selftest: Add some tests for the dmabuf flow")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>

iommupt: Fix the kunit building

The kunit doesn't work since the below commit made GENERIC_PT
unselectable:

$ make ARCH=x86_64 O=build_kunit_x86_64 olddefconfig
ERROR:root:Not all Kconfig options selected in kunitconfig were in the generated .config.
This is probably due to unsatisfied dependencies.
Missing: CONFIG_DEBUG_GENERIC_PT=y, CONFIG_IOMMUFD_TEST=y,
CONFIG_IOMMU_PT_X86_64=y, CONFIG_GENERIC_PT=y, CONFIG_IOMMU_PT_AMDV1=y,
CONFIG_IOMMU_PT_VTDSS=y, CONFIG_IOMMU_PT=y, CONFIG_IOMMU_PT_KUNIT_TEST=y

Also remove the unneeded CONFIG_IOMMUFD_TEST reference as the iommupt kunit
doesn't interact with iommufd, and it doesn't currently build for the
kunit due to problems with DMA_SHARED_BUFFER either.

Fixes: 01569c216dde ("genpt: Make GENERIC_PT invisible")
Fixes: 1dd4187f53c3 ("iommupt: Add a kunit test for Generic Page Table")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>

selftests: kvm: Verify TILELOADD actually #NM faults when XFD[18]=1

Rework the AMX test's #NM handling to use kvm_asm_safe() to verify an #NM
actually occurs. As is, a completely missing #NM could go unnoticed.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

selftests: kvm: try getting XFD and XSAVE state out of sync

The host is allowed to set FPU state that includes a disabled
xstate component. Check that this does not cause bad effects.

Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

selftests: kvm: replace numbered sync points with actions

Rework the guest=>host syncs in the AMX test to use named actions instead
of arbitrary, incrementing numbers. The "stage" of the test has no real
meaning; what matters is what action the test wants the host to perform.
The incrementing numbers are somewhat helpful for triaging failures, but
fully debugging failures almost always requires a much deeper dive into
the test (and KVM).

Using named actions not only makes it easier to extend the test without
having to shift all sync point numbers, it makes the code easier to read.

[Commit message by Sean Christopherson]

Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

x86/fpu: Clear XSTATE_BV[i] in guest XSAVE state whenever XFD[i]=1

When loading guest XSAVE state via KVM_SET_XSAVE, and when updating XFD in
response to a guest WRMSR, clear XFD-disabled features in the saved (or to
be restored) XSTATE_BV to ensure KVM doesn't attempt to load state for
features that are disabled via the guest's XFD.  Because the kernel
executes XRSTOR with the guest's XFD, saving XSTATE_BV[i]=1 with XFD[i]=1
will cause XRSTOR to #NM and panic the kernel.
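
The masking itself is a single bit operation. A minimal sketch follows, with a hypothetical `toy_xsave_hdr` struct standing in for the XSAVE header. The actual patch applies this in the KVM and FPU paths named in this message, and this sketch does not reproduce that code.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 18 is the XFD bit exercised by the AMX selftest (XFD[18]). */
#define TOY_XFD_BIT_18 (UINT64_C(1) << 18)

/* Hypothetical stand-in for the header of a saved XSAVE image. */
struct toy_xsave_hdr {
	uint64_t xstate_bv;
};

/* Drop every component from XSTATE_BV whose XFD bit is set. */
void toy_clear_xfd_disabled(struct toy_xsave_hdr *hdr, uint64_t guest_xfd)
{
	assert(hdr);
	hdr->xstate_bv &= ~guest_xfd;
}
```

For example, an image with `xstate_bv = 0x7 | TOY_XFD_BIT_18` and `guest_xfd = TOY_XFD_BIT_18` ends up with `xstate_bv == 0x7`, so a later restore would not try to load the disabled component.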

E.g. if fpu_update_guest_xfd() sets XFD without clearing XSTATE_BV:

  ------------[ cut here ]------------
  WARNING: arch/x86/kernel/traps.c:1524 at exc_device_not_available+0x101/0x110, CPU#29: amx_test/848
  Modules linked in: kvm_intel kvm irqbypass
  CPU: 29 UID: 1000 PID: 848 Comm: amx_test Not tainted 6.19.0-rc2-ffa07f7fd437-x86_amx_nm_xfd_non_init-vm #171 NONE
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:exc_device_not_available+0x101/0x110
  Call Trace:
   <TASK>
   asm_exc_device_not_available+0x1a/0x20
  RIP: 0010:restore_fpregs_from_fpstate+0x36/0x90
   switch_fpu_return+0x4a/0xb0
   kvm_arch_vcpu_ioctl_run+0x1245/0x1e40 [kvm]
   kvm_vcpu_ioctl+0x2c3/0x8f0 [kvm]
   __x64_sys_ioctl+0x8f/0xd0
   do_syscall_64+0x62/0x940
   entry_SYSCALL_64_after_hwframe+0x4b/0x53
   </TASK>
  ---[ end trace 0000000000000000 ]---

This can happen if the guest executes WRMSR(MSR_IA32_XFD) to set XFD[18] = 1,
and a host IRQ triggers kernel_fpu_begin() prior to the vmexit handler's
call to fpu_update_guest_xfd().

and if userspace stuffs XSTATE_BV[i]=1 via KVM_SET_XSAVE:

  ------------[ cut here ]------------
  WARNING: arch/x86/kernel/traps.c:1524 at exc_device_not_available+0x101/0x110, CPU#14: amx_test/867
  Modules linked in: kvm_intel kvm irqbypass
  CPU: 14 UID: 1000 PID: 867 Comm: amx_test Not tainted 6.19.0-rc2-2dace9faccd6-x86_amx_nm_xfd_non_init-vm #168 NONE
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:exc_device_not_available+0x101/0x110
  Call Trace:
   <TASK>
   asm_exc_device_not_available+0x1a/0x20
  RIP: 0010:restore_fpregs_from_fpstate+0x36/0x90
   fpu_swap_kvm_fpstate+0x6b/0x120
   kvm_load_guest_fpu+0x30/0x80 [kvm]
   kvm_arch_vcpu_ioctl_run+0x85/0x1e40 [kvm]
   kvm_vcpu_ioctl+0x2c3/0x8f0 [kvm]
   __x64_sys_ioctl+0x8f/0xd0
   do_syscall_64+0x62/0x940
   entry_SYSCALL_64_after_hwframe+0x4b/0x53
   </TASK>
  ---[ end trace 0000000000000000 ]---

The new behavior is consistent with the AMX architecture.  Per Intel's SDM,
XSAVE saves XSTATE_BV as '0' for components that are disabled via XFD
(and non-compacted XSAVE saves the initial configuration of the state
component):

  If XSAVE, XSAVEC, XSAVEOPT, or XSAVES is saving the state component i,
  the instruction does not generate #NM when XCR0[i] = IA32_XFD[i] = 1;
  instead, it operates as if XINUSE[i] = 0 (and the state component was
  in its initial state): it saves bit i of XSTATE_BV field of the XSAVE
  header as 0; in addition, XSAVE saves the initial configuration of the
  state component (the other instructions do not save state component i).

Alternatively, KVM could always do XRSTOR with XFD=0, e.g. by using
a constant XFD based on the set of enabled features when XSAVEing for
a struct fpu_guest.  However, having XSTATE_BV[i]=1 for XFD-disabled
features can only happen in the above interrupt case, or in similar
scenarios involving preemption on preemptible kernels, because
fpu_swap_kvm_fpstate()'s call to save_fpregs_to_fpstate() saves the
outgoing FPU state with the current XFD; and that is (on all but the
first WRMSR to XFD) the guest XFD.

Therefore, XFD can only go out of sync with XSTATE_BV in the above
interrupt case, or in similar scenarios involving preemption on
preemptible kernels, and we can consider it (de facto) part of KVM
ABI that KVM_GET_XSAVE returns XSTATE_BV[i]=0 for XFD-disabled features.

Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 820a6ee944e7 ("kvm: x86: Add emulation for IA32_XFD", 2022-01-14)
Signed-off-by: Sean Christopherson <seanjc@google.com>
[Move clearing of XSTATE_BV from fpu_copy_uabi_to_guest_fpstate
to kvm_vcpu_ioctl_x86_set_xsave. - Paolo]
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Merge tag 'erofs-for-6.19-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs

Pull erofs fix from Gao Xiang:

- Don't increase s_stack_depth which caused regressions in some
   composefs mount setups (EROFS + ovl^2)

   Instead just allow one extra unaccounted fs stacking level for
   straightforward cases.

* tag 'erofs-for-6.19-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: don't bother with s_stack_depth increasing for now

erofs: don't bother with s_stack_depth increasing for now

Previously, commit d53cd891f0e4 ("erofs: limit the level of fs stacking
for file-backed mounts") bumped `s_stack_depth` by one to avoid kernel
stack overflow when stacking an unlimited number of EROFS on top of
each other.

That fix breaks composefs mounts, which need EROFS+ovl^2 sometimes
(and such setups have already been used in production for quite a long time).

One way to fix this regression is to bump FILESYSTEM_MAX_STACK_DEPTH
from 2 to 3, but proving that this is safe in general is a high bar.

After a long discussion on GitHub issues [1] about possible solutions,
one conclusion is that there is no need to support nesting file-backed
EROFS mounts on stacked filesystems, because there is always the option
to use loopback devices as a fallback.

As a quick fix for the composefs regression for this cycle, instead of
bumping `s_stack_depth` for file backed EROFS mounts, we disallow
nesting file-backed EROFS over EROFS and over filesystems with
`s_stack_depth` > 0.
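
The rule as worded in this paragraph can be written as a small predicate. This is a toy model with hypothetical `toy_*` types; it mirrors only the sentence above, not the exact condition in the EROFS mount code.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_fs_type { TOY_FS_EROFS, TOY_FS_OTHER };

/* Hypothetical view of the superblock that holds the backing file. */
struct toy_sb {
	enum toy_fs_type type;
	int s_stack_depth;
};

/* May a file-backed EROFS image be mounted from a file on this filesystem? */
bool toy_fileio_mount_allowed(const struct toy_sb *backing)
{
	assert(backing);
	if (backing->type == TOY_FS_EROFS)
		return false; /* no file-backed EROFS over EROFS */
	if (backing->s_stack_depth > 0)
		return false; /* no file-backed EROFS over stacked filesystems */
	return true;
}
```

A backing file on a plain, non-stacked filesystem passes both checks, while one on EROFS or on a filesystem with a non-zero `s_stack_depth` is rejected, and the new mount's own `s_stack_depth` is not bumped.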

This works for all known file-backed mount use cases (composefs,
containerd, and Android APEX for some Android vendors), and the fix is
self-contained.

Essentially, we are allowing one extra unaccounted fs stacking level of
EROFS below stacking filesystems, but EROFS can only be used in the read
path (i.e. overlayfs lower layers), which typically has much lower stack
usage than the write path.

We can consider increasing FILESYSTEM_MAX_STACK_DEPTH later, after more
stack usage analysis or using alternative approaches, such as splitting
the `s_stack_depth` limitation according to different combinations of
stacking.

Fixes: d53cd891f0e4 ("erofs: limit the level of fs stacking for file-backed mounts")
Reported-and-tested-by: Dusty Mabe <dusty@dustymabe.com>
Reported-by: Timothée Ravier <tim@siosm.fr>
Closes: https://github.com/coreos/fedora-coreos-tracker/issues/2087 [1]
Reported-by: "Alekséi Naidénov" <an@digitaltide.io>
Closes: https://lore.kernel.org/r/CAFHtUiYv4+=+JP_-JjARWjo6OwcvBj1wtYN=z0QXwCpec9sXtg@mail.gmail.com
Acked-by: Amir Goldstein <amir73il@gmail.com>
Acked-by: Alexander Larsson <alexl@redhat.com>
Reviewed-and-tested-by: Sheng Yong <shengyong1@xiaomi.com>
Reviewed-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

macvlan: fix possible UAF in macvlan_forward_source()

Add RCU protection on (struct macvlan_source_entry)->vlan.

Whenever macvlan_hash_del_source() is called, we must clear
entry->vlan pointer before RCU grace period starts.

This allows macvlan_forward_source() to skip over
entries queued for freeing.
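
The publish-NULL-then-skip pattern can be modelled in userspace with C11 atomics. This is only an analogy under assumptions: the `toy_*` names are hypothetical, the acquire/release operations stand in for the kernel's RCU pointer accessors, and the deferred free after a grace period is not modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct toy_vlan { int rx_packets; };

/* Hypothetical source entry whose back-pointer may be cleared concurrently. */
struct toy_source_entry {
	_Atomic(struct toy_vlan *) vlan;
};

/* Writer side: clear the pointer first; freeing would be deferred afterwards. */
void toy_hash_del_source(struct toy_source_entry *entry)
{
	assert(entry);
	atomic_store_explicit(&entry->vlan, NULL, memory_order_release);
}

/* Reader side: load the pointer once and skip entries queued for freeing. */
int toy_forward_source(struct toy_source_entry *entries, size_t n)
{
	int delivered = 0;

	for (size_t i = 0; i < n; i++) {
		struct toy_vlan *vlan =
			atomic_load_explicit(&entries[i].vlan, memory_order_acquire);

		if (!vlan)
			continue;
		vlan->rx_packets++;
		delivered++;
	}
	return delivered;
}
```

Loading the pointer once into a local and testing that local is the important part: the reader never dereferences a value it has not checked.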

Note that macvlan_dev are already RCU protected, as they
are embedded in a standard netdev (netdev_priv(ndev)).

Fixes: 79cf79abce71 ("macvlan: add source mode")
Reported-by: syzbot+7182fbe91e58602ec1fe@syzkaller.appspotmail.com
https://lore.kernel.org/netdev/695fb1e8.050a0220.1c677c.039f.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260108133651.1130486-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: update netdev_lock_{type,name}

Add missing entries in netdev_lock_type[] and netdev_lock_name[] :

CAN, MCTP, RAWIP, CAIF, IP6GRE, 6LOWPAN, NETLINK, VSOCKMON,
IEEE802154_MONITOR.

Also add a WARN_ONCE() in netdev_lock_pos() to help future bug hunting
next time a protocol is added without updating these arrays.
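
A toy version of a parallel-array lookup with a one-time warning on a miss is sketched below. The `toy_*` names and the table contents are made up for illustration, and `fprintf()` stands in for WARN_ONCE(). It is not the code in net/core/dev.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define TOY_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Illustrative tables; the last slot is the catch-all entry. */
static const unsigned short toy_lock_type[] = { 1, 280, 0xFFFF };
static const char *const toy_lock_name[] = { "toy-A", "toy-B", "toy-default" };

/* The two arrays must stay in step, which is the invariant being repaired. */
static_assert(TOY_ARRAY_SIZE(toy_lock_type) == TOY_ARRAY_SIZE(toy_lock_name),
	      "type and name tables differ in length");

static int toy_warn_count;

/* Return the index for dev_type, or the last index with a one-time warning. */
size_t toy_lock_pos(unsigned short dev_type)
{
	static int warned;

	for (size_t i = 0; i < TOY_ARRAY_SIZE(toy_lock_type); i++)
		if (toy_lock_type[i] == dev_type)
			return i;

	if (!warned) {
		warned = 1;
		toy_warn_count++;
		fprintf(stderr, "toy_lock_pos: unknown type %u\n",
			(unsigned int)dev_type);
	}
	return TOY_ARRAY_SIZE(toy_lock_type) - 1;
}
```

Unknown types still fall back to the last entry, so behavior is unchanged, but the first miss becomes visible instead of silently sharing the default lock class.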

Fixes: 1a33e10e4a95 ("net: partially revert dynamic lockdep key changes")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260108093244.830280-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ip6_tunnel: use skb_vlan_inet_prepare() in __ip6_tnl_rcv()

Blamed commit did not take care of VLAN encapsulations
as spotted by syzbot [1].

Use skb_vlan_inet_prepare() instead of pskb_inet_may_pull().

[1]
BUG: KMSAN: uninit-value in __INET_ECN_decapsulate include/net/inet_ecn.h:253 [inline]
BUG: KMSAN: uninit-value in INET_ECN_decapsulate include/net/inet_ecn.h:275 [inline]
BUG: KMSAN: uninit-value in IP6_ECN_decapsulate+0x7a8/0x1fa0 include/net/inet_ecn.h:321
  __INET_ECN_decapsulate include/net/inet_ecn.h:253 [inline]
  INET_ECN_decapsulate include/net/inet_ecn.h:275 [inline]
  IP6_ECN_decapsulate+0x7a8/0x1fa0 include/net/inet_ecn.h:321
  ip6ip6_dscp_ecn_decapsulate+0x16f/0x1b0 net/ipv6/ip6_tunnel.c:729
  __ip6_tnl_rcv+0xed9/0x1b50 net/ipv6/ip6_tunnel.c:860
  ip6_tnl_rcv+0xc3/0x100 net/ipv6/ip6_tunnel.c:903
  gre_rcv+0x1529/0x1b90 net/ipv6/ip6_gre.c:-1
  ip6_protocol_deliver_rcu+0x1c89/0x2c60 net/ipv6/ip6_input.c:438
  ip6_input_finish+0x1f4/0x4a0 net/ipv6/ip6_input.c:489
  NF_HOOK include/linux/netfilter.h:318 [inline]
  ip6_input+0x9c/0x330 net/ipv6/ip6_input.c:500
  ip6_mc_input+0x7ca/0xc10 net/ipv6/ip6_input.c:590
  dst_input include/net/dst.h:474 [inline]
  ip6_rcv_finish+0x958/0x990 net/ipv6/ip6_input.c:79
  NF_HOOK include/linux/netfilter.h:318 [inline]
  ipv6_rcv+0xf1/0x3c0 net/ipv6/ip6_input.c:311
  __netif_receive_skb_one_core net/core/dev.c:6139 [inline]
  __netif_receive_skb+0x1df/0xac0 net/core/dev.c:6252
  netif_receive_skb_internal net/core/dev.c:6338 [inline]
  netif_receive_skb+0x57/0x630 net/core/dev.c:6397
  tun_rx_batched+0x1df/0x980 drivers/net/tun.c:1485
  tun_get_user+0x5c0e/0x6c60 drivers/net/tun.c:1953
  tun_chr_write_iter+0x3e9/0x5c0 drivers/net/tun.c:1999
  new_sync_write fs/read_write.c:593 [inline]
  vfs_write+0xbe2/0x15d0 fs/read_write.c:686
  ksys_write fs/read_write.c:738 [inline]
  __do_sys_write fs/read_write.c:749 [inline]
  __se_sys_write fs/read_write.c:746 [inline]
  __x64_sys_write+0x1fb/0x4d0 fs/read_write.c:746
  x64_sys_call+0x30ab/0x3e70 arch/x86/include/generated/asm/syscalls_64.h:2
  do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
  do_syscall_64+0xd3/0xf80 arch/x86/entry/syscall_64.c:94
  entry_SYSCALL_64_after_hwframe+0x77/0x7f

Uninit was created at:
  slab_post_alloc_hook mm/slub.c:4960 [inline]
  slab_alloc_node mm/slub.c:5263 [inline]
  kmem_cache_alloc_node_noprof+0x9e7/0x17a0 mm/slub.c:5315
  kmalloc_reserve+0x13c/0x4b0 net/core/skbuff.c:586
  __alloc_skb+0x805/0x1040 net/core/skbuff.c:690
  alloc_skb include/linux/skbuff.h:1383 [inline]
  alloc_skb_with_frags+0xc5/0xa60 net/core/skbuff.c:6712
  sock_alloc_send_pskb+0xacc/0xc60 net/core/sock.c:2995
  tun_alloc_skb drivers/net/tun.c:1461 [inline]
  tun_get_user+0x1142/0x6c60 drivers/net/tun.c:1794
  tun_chr_write_iter+0x3e9/0x5c0 drivers/net/tun.c:1999
  new_sync_write fs/read_write.c:593 [inline]
  vfs_write+0xbe2/0x15d0 fs/read_write.c:686
  ksys_write fs/read_write.c:738 [inline]
  __do_sys_write fs/read_write.c:749 [inline]
  __se_sys_write fs/read_write.c:746 [inline]
  __x64_sys_write+0x1fb/0x4d0 fs/read_write.c:746
  x64_sys_call+0x30ab/0x3e70 arch/x86/include/generated/asm/syscalls_64.h:2
  do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
  do_syscall_64+0xd3/0xf80 arch/x86/entry/syscall_64.c:94
  entry_SYSCALL_64_after_hwframe+0x77/0x7f

CPU: 0 UID: 0 PID: 6465 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(none)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025

Fixes: 8d975c15c0cd ("ip6_tunnel: make sure to pull inner header in __ip6_tnl_rcv()")
Reported-by: syzbot+d4dda070f833dc5dc89a@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/695e88b2.050a0220.1c677c.036d.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260107163109.4188620-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'block-6.19-20260109' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull block fixes from Jens Axboe:

- Kill unlikely checks for blk-rq-qos. These checks are really
   all-or-nothing, either the branch is taken all the time, or it's not.
   Depending on the configuration, either one of those cases may be
   true. Just remove the annotation

- Fix for merging bios with different app tags set

- Fix for a recently introduced slowdown due to RCU synchronization

- Fix for a status change on loop while it's in use, and then a later
   fix for that fix

- Fix for the async partition scanning in ublk

* tag 'block-6.19-20260109' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  ublk: fix use-after-free in ublk_partition_scan_work
  blk-mq: avoid stall during boot due to synchronize_rcu_expedited
  loop: add missing bd_abort_claiming in loop_set_status
  block: don't merge bios with different app_tags
  blk-rq-qos: Remove unlikely() hints from QoS checks
  loop: don't change loop device under exclusive opener in loop_set_status

net: bridge: annotate data-races around fdb->{updated,used}

fdb->updated and fdb->used are read and written locklessly.

Add READ_ONCE()/WRITE_ONCE() annotations.
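
For readers unfamiliar with these annotations, here is a rough user-space
approximation. The DEMO_* macros and struct demo_fdb are assumptions made
for illustration; the kernel's own READ_ONCE()/WRITE_ONCE() definitions
are more involved, and __typeof__ is a GCC/Clang extension.

```c
/* Volatile accesses force exactly one load or store, with no tearing
 * or re-reading introduced by the compiler. */
#define DEMO_READ_ONCE(x)	(*(const volatile __typeof__(x) *)&(x))
#define DEMO_WRITE_ONCE(x, v)	(*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical stand-in for the timestamp fields of a bridge fdb entry. */
struct demo_fdb {
	unsigned long updated;
	unsigned long used;
};

/* Lockless writer: refresh the timestamp only when it actually changes. */
static void demo_fdb_touch(struct demo_fdb *fdb, unsigned long now)
{
	if (DEMO_READ_ONCE(fdb->updated) != now)
		DEMO_WRITE_ONCE(fdb->updated, now);
}

/* Lockless reader: take one snapshot of the field and compute the age. */
static unsigned long demo_fdb_age(struct demo_fdb *fdb, unsigned long now)
{
	return now - DEMO_READ_ONCE(fdb->updated);
}
```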

Fixes: 31cbc39b6344 ("net: bridge: add option to allow activity notifications for any fdb entries")
Reported-by: syzbot+bfab43087ad57222ce96@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/695e3d74.050a0220.1c677c.035f.GAE@google.com/
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260108093806.834459-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'io_uring-6.19-20260109' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull io_uring fixes from Jens Axboe:
"A single fix for a regression introduced in 6.15, where a failure to
  wake up idle io-wq workers at ring exit will wait for the timeout to
  expire.

  This isn't normally noticeable, as the exit is async.

  But if a parent task created a thread that sets up a ring and uses
  requests that cause io-wq threads to be created, and the parent task
  then waits for the thread to exit, then it can take 5 seconds for that
  pthread_join() to succeed as the child thread is waiting for its
  children to exit.

  On top of that, just a basic cleanup as well"

* tag 'io_uring-6.19-20260109' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  io_uring/io-wq: remove io_wq_for_each_worker() return value
  io_uring/io-wq: fix incorrect io_wq_for_each_worker() termination logic

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

- Do not return false if !preemptible() in current_in_efi(). EFI
   runtime services can now run with preemption enabled

- Fix uninitialised variable in the arm MPAM driver, reported by sparse

- Fix partial kasan_reset_tag() use in change_memory_common() when
   calculating page indices or comparing ranges

- Save/restore TCR2_EL1 during suspend/resume, otherwise the E0POE bit
   is lost

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: Fix cleared E0POE bit after cpu_suspend()/resume()
  arm64: mm: Fix incomplete tag reset in change_memory_common()
  arm_mpam: Stop using uninitialized variables in __ris_msmon_read()
  arm64/efi: Don't fail check current_in_efi() if preemptible

Merge tag 'soc-fixes-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull SoC fixes from Arnd Bergmann:
"The main code change is a revert of the Raspberry Pi RP1 overlay
  support that was decided to not be ready.

  The other fixes are all for devicetree sources:

   - ethernet configuration on ixp42x-actiontec-mi424wr is board
     revision specific

   - validation warning fixes for imx27/imx51/imx6, hikey960 and k3

   - Minor corrections across imx8 boards, addressing all types of
     issues with interrupts, dma, ethernet and clock settings, all simple
     one-line changes"

* tag 'soc-fixes-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (25 commits)
  arm64: dts: hisilicon: hikey960: Drop "snps,gctl-reset-quirk" and "snps,tx_de_emphasis*" properties
  Documentation/process: maintainer-soc: Mark 'make' as commands
  Documentation/process: maintainer-soc: Be more explicit about defconfig
  arm64: dts: mba8mx: Fix Ethernet PHY IRQ support
  arm64: dts: imx8qm-ss-dma: correct the dma channels of lpuart
  arm64: dts: imx8mp: Fix LAN8740Ai PHY reference clock on DH electronics i.MX8M Plus DHCOM
  arm64: dts: freescale: tx8p-ml81: fix eqos nvmem-cells
  arm64: dts: freescale: moduline-display: fix compatible
  dt-bindings: arm: fsl: moduline-display: fix compatible
  ARM: dts: imx6q-ba16: fix RTC interrupt level
  arm64: dts: freescale: imx95-toradex-smarc: fix SMARC_SDIO_WP label position
  arm64: dts: freescale: imx95-toradex-smarc: use edge trigger for ethphy1 interrupt
  arm64: dts: add off-on-delay-us for usdhc2 regulator
  arm64: dts: imx8qm-mek: correct the light sensor interrupt type to low level
  ARM: dts: nxp: imx: Fix mc13xxx LED node names
  arm64: dts: imx95: correct I3C2 pclk to IMX95_CLK_BUSWAKEUP
  MAINTAINERS: Fix a linusw mail address
  arm64: dts: broadcom: rp1: drop RP1 overlay
  arm64: dts: broadcom: bcm2712: fix RP1 endpoint PCI topology
  misc: rp1: drop overlay support
  ...

Merge tag 'ceph-for-6.19-rc5' of https://github.com/ceph/ceph-client

Pull ceph fixes from Ilya Dryomov:
"A bunch of libceph fixes split evenly between memory safety and
  implementation correctness issues (all marked for stable) and a change
  in maintainers for CephFS: Slava and Alex have formally taken over
  Xiubo's role"

* tag 'ceph-for-6.19-rc5' of https://github.com/ceph/ceph-client:
  libceph: make calc_target() set t->paused, not just clear it
  libceph: reset sparse-read state in osd_fault()
  libceph: return the handler error from mon_handle_auth_done()
  libceph: make free_choose_arg_map() resilient to partial allocation
  ceph: update co-maintainers list in MAINTAINERS
  libceph: replace overzealous BUG_ON in osdmap_apply_incremental()
  libceph: prevent potential out-of-bounds reads in handle_auth_done()

selftests/tracing: Fix test_multiple_writes stall

When /sys/kernel/tracing/buffer_size_kb is less than 12KB,
the test_multiple_writes test will stall and wait for more
input due to insufficient buffer space.

Check the current buffer_size_kb value before the test. If it is
less than 12KB, temporarily increase the buffer to 12KB, and
restore the original value after the tests are completed.

Link: https://lore.kernel.org/r/20260109033620.25727-1-fushuai.wang@linux.dev
Fixes: 37f46601383a ("selftests/tracing: Add basic test for trace_marker_raw file")
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Fushuai Wang <wangfushuai@baidu.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>

Bluetooth: hci_sync: enable PA Sync Lost event

Enable the PA Sync Lost event mask to ensure PA sync loss is properly
reported and handled.

Fixes: 485e0626e587 ("Bluetooth: hci_event: Fix not handling PA Sync Lost event")
Signed-off-by: Yang Li <yang.li@amlogic.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>

Merge tag 'for-6.19-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

- fix potential NULL pointer dereference when replaying tree log after
   an error

- release path before initializing extent tree to avoid potential
   deadlock when allocating new inode

- on filesystems with block size > page size
    - fix potential read out of bounds during encoded read of an inline
      extent
    - only enforce free space tree if v1 cache is required

- print correct tree id in error message

* tag 'for-6.19-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: show correct warning if can't read data reloc tree
  btrfs: fix NULL pointer dereference in do_abort_log_replay()
  btrfs: force free space tree for bs > ps cases
  btrfs: only enforce free space tree if v1 cache is required for bs < ps cases
  btrfs: release path before initializing extent tree in btrfs_read_locked_inode()
  btrfs: avoid access-beyond-folio for bs > ps encoded writes

Merge tag 'pci-v6.19-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull PCI fixes from Bjorn Helgaas:

- Remove ASPM L0s support for MSM8996 SoC since we now enable L0s when
   advertised, and it caused random hangs on this device (Manivannan
   Sadhasivam)

- Fix meson-pcie to report that the link is up while in ASPM L0s or L1,
   since those are active states from the software point of view, and
   treating the link as down caused config access failures (Bjorn
   Helgaas)

- Fix up sparc DTS BAR descriptions that are above 4GB but not marked
   as prefetchable, which caused resource assignment and driver probe
   failures after we converted from the SPARC pcibios_enable_device() to
   the generic version (Ilpo Järvinen)

* tag 'pci-v6.19-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  sparc/PCI: Correct 64-bit non-pref -> pref BAR resources
  PCI: meson: Report that link is up while in ASPM L0s and L1 states
  PCI: qcom: Remove ASPM L0s support for MSM8996 SoC

Merge tag 'acpi-6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI support fix from Rafael Wysocki:
"This fixes the ACPI/PCI legacy interrupts (INTx) parsing in the case
  when the ACPI Global System Interrupt (GSI) value is a 32-bit one with
  the MSB set.

  That was interpreted as a negative integer and caused
  acpi_pci_link_allocate_irq() to fail and acpi_irq_get_penalty() to
  trigger an out-of-bounds array dereference (Lorenzo Pieralisi)"
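
The signedness problem can be illustrated in isolation. The sketch below
is not the ACPI code: the demo_* helpers and the table size are
hypothetical, and converting an out-of-range value to int is
implementation-defined in C (it wraps to a negative number on common
ABIs).

```c
#include <stdint.h>

#define DEMO_TABLE_SIZE 256	/* hypothetical size of a per-IRQ table */

/* Buggy pattern: the 32-bit GSI is carried in a signed int, so a value
 * with the MSB set turns negative and slips under the upper bound. */
static int demo_in_table_signed(uint32_t gsi)
{
	int irq = (int)gsi;

	return irq < DEMO_TABLE_SIZE;
}

/* Fixed pattern: keep the value unsigned so the bounds check holds. */
static int demo_in_table_unsigned(uint32_t gsi)
{
	return gsi < (uint32_t)DEMO_TABLE_SIZE;
}
```

With a GSI such as 0x80000010, the signed check accepts the value and a
later array access would use a negative index, while the unsigned check
rejects it.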

* tag 'acpi-6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: PCI: IRQ: Fix INTx GSIs signedness

Merge tag 'pm-6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fix from Rafael Wysocki:
"This fixes a crash in the hibernation image saving code that can be
  triggered when the given compression algorithm is unavailable (Malaya
  Kumar Rout)"

* tag 'pm-6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: hibernate: Fix crash when freeing invalid crypto compressor

Merge tag 'gpio-fixes-for-v6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

Pull gpio fixes from Bartosz Golaszewski:
"There are several ordinary driver fixes and a fix to a race between
  the registration of two chips that causes a crash in GPIO core.

  The bulk of the changed lines however, concerns the management of
  shared GPIOs that landed in v6.19-rc1. Enabling it for ARCH_QCOM
  enabled it in defconfig which effectively enabled it for all arm64
  platforms and exposed the code to quite a lot of testing (which is
  good, right? :)).

  As a result, I received a number of bug reports, which I progressively
  fixed over the course of last weeks. This explains the number of lines
  higher than what I normally aim for at this stage.

   - balance superio enter/exit calls in error path in gpio-it87

   - fix a race where we try to take the SRCU read lock of the GPIO
     device before it's been initialized causing a NULL-pointer
     dereference

   - fix handling of short-pulse interrupts in gpio-pca953x

   - fix a reference leak in error path in gpio-mpsse

   - mark the GPIO controller as sleeping (it calls sleeping functions)
     in gpio-rockchip

   - fix several issues in management of shared GPIOs"

* tag 'gpio-fixes-for-v6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: shared: fix a false-positive sharing detection with reset-gpios
  gpiolib: fix lookup table matching
  gpio: shared: don't allocate the lookup table until we really need it
  gpio: shared: fix a race condition
  gpio: shared: assign the correct firmware node for reset-gpio use-case
  gpio: rockchip: mark the GPIO controller as sleeping
  gpio: mpsse: fix reference leak in gpio_mpsse_probe() error paths
  gpio: pca953x: handle short interrupt pulses on PCAL devices
  gpiolib: fix race condition for gdev->srcu
  gpio: shared: allow sharing a reset-gpios pin between reset-gpio and gpiolib
  gpio: shared: verify con_id when adding proxy lookup
  gpiolib: allow multiple lookup tables per consumer
  gpio: it87: balance superio enter/exit calls in error path

Merge tag 'drm-fixes-2026-01-09' of https://gitlab.freedesktop.org/drm/kernel

Pull drm fixes from Dave Airlie:
"I missed the drm-rust fixes tree for last week, so this catches up on
  that, along with amdgpu, and then some misc fixes across a few
  drivers. I hadn't got an xe pull by the time I sent this, I suspect
  one will arrive 10 mins after, but I don't think there is anything
  that can't wait for next week.

  Things seem to have picked up a little with people coming back from
  holidays,

  MAINTAINERS:
   - Fix Nova GPU driver git links
   - Fix typo in TYR driver entry preventing correct behavior of
     scripts/get_maintainer.pl
   - Exclude TYR driver from DRM MISC

  nova-core:
   - Correctly select RUST_FW_LOADER_ABSTRACTIONS to prevent build
     errors
   - Regenerate nova-core bindgen bindings with '--explicit-padding' to
     avoid uninitialized bytes
   - Fix length of received GSP messages, due to miscalculated message
     payload size
   - Regenerate bindings to derive MaybeZeroable
   - Use a bindings alias to derive the firmware version

  exynos:
   - hdmi: replace system_wq with system_percpu_wq

  pl111:
   - Fix error handling in probe

  mediatek/atomic/tidss:
   - Fix tidss in another way and revert reordering of pre-enable and
     post-disable operations, as it breaks other bridge drivers

  nouveau:
   - Fix regression from fwsec s/r fix

  pci/vga:
   - Fix multiple GPUs being reported as 'boot_display'

  fb-helper:
   - Fix vblank timeout during suspend/reset

  amdgpu:
   - Clang fixes
   - Navi1x PCIe DPM fixes
   - Ring reset fixes
   - ISP suspend fix
   - Analog DC fixes
   - VPE fixes
   - Mode1 reset fix

  radeon:
   - Variable sized array fix"

* tag 'drm-fixes-2026-01-09' of https://gitlab.freedesktop.org/drm/kernel: (32 commits)
  Reapply "Revert "drm/amd: Skip power ungate during suspend for VPE""
  drm/amd/display: Check NULL before calling dac_load_detection
  drm/amd/pm: Disable MMIO access during SMU Mode 1 reset
  drm/exynos: hdmi: replace use of system_wq with system_percpu_wq
  drm/fb-helper: Fix vblank timeout during suspend/reset
  PCI/VGA: Don't assume the only VGA device on a system is `boot_vga`
  drm/amdgpu: Fix query for VPE block_type and ip_count
  drm/amd/display: Add missing encoder setup to DACnEncoderControl
  drm/amd/display: Correct color depth for SelectCRTC_Source
  drm/amd/amdgpu: Fix SMU warning during isp suspend-resume
  drm/amdgpu: always backup and reemit fences
  drm/amdgpu: don't reemit ring contents more than once
  drm/amd/pm: force send pcie parmater on navi1x
  drm/amd/pm: fix wrong pcie parameter on navi1x
  drm/radeon: Remove __counted_by from ClockInfoArray.clockInfo[]
  drm/amd/display: Reduce number of arguments of dcn30's CalculateWatermarksAndDRAMSpeedChangeSupport()
  drm/amd/display: Reduce number of arguments of dcn30's CalculatePrefetchSchedule()
  drm/amd/display: Apply e4479aecf658 to dml
  nouveau: don't attempt fwsec on sb on newer platforms
  drm/tidss: Fix enable/disable order
  ...

Merge tag 'vfs-6.19-rc5.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs

Pull vfs fixes from Christian Brauner:

- Remove incorrect __user annotation from struct xattr_args::value

- Documentation fix: Add missing kernel-doc description for the @isnew
   parameter in ilookup5_nowait() to silence Sphinx warnings

- Documentation fix: Fix kernel-doc comment for __start_dirop() - the
   function name in the comment was wrong and the @state parameter was
   undocumented

- Replace dynamic folio_batch allocation with stack allocation in
   iomap_zero_range(). The dynamic allocation was problematic for
   ext4-on-iomap work (didn't handle allocation failure properly) and
   triggered lockdep complaints. Uses a flag instead to control batch
   usage

- Re-add #ifdef guards around PIDFD_GET_<ns-type>_NAMESPACE ioctls.
   When a namespace type is disabled, ns->ops is NULL, which causes
   crashes during inode eviction when closing the fd. The ifdefs were
   removed in a recent simplification but are still needed

- Fix a race where a folio could be unlocked before the trailing zeros
   (for EOF within the page) were written

- Split out a dedicated lease_dispose_list() helper since lease code
   paths always know they're disposing of leases. Removes unnecessary
   runtime flag checks and prepares for upcoming lease_manager
   enhancements

- Fix userland delegation requests succeeding despite conflicting
   opens. Previously, FL_LAYOUT and FL_DELEG leases bypassed conflict
   checks (a hack for nfsd). Adds new ->lm_open_conflict() lease_manager
   operation so userland delegations get proper conflict checking while
   nfsd can continue its own conflict handling

- Fix LOOKUP_CACHED path lookups incorrectly falling through to the
   slow path. After legitimize_links() calls were conditionally elided,
   the routine would always fail with LOOKUP_CACHED regardless of
   whether there were any links. Now the flag is checked at the two
   callsites before calling legitimize_links()

- Fix bug in media fd allocation in media_request_alloc()

- Fix mismatched API calls in ecryptfs_mknod(): was calling
   end_removing() instead of end_creating() after
   ecryptfs_start_creating_dentry()

- Fix dentry reference count leak in ecryptfs_mkdir(): a dget() of the
   lower parent dir was added but never dput()'d, causing BUG during
   lower filesystem unmount due to the still-in-use dentry

* tag 'vfs-6.19-rc5.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs:
  pidfs: protect PIDFD_GET_* ioctls() via ifdef
  ecryptfs: Release lower parent dentry after creating dir
  ecryptfs: Fix improper mknod pairing of start_creating()/end_removing()
  get rid of bogus __user in struct xattr_args::value
  VFS: fix __start_dirop() kernel-doc warnings
  fs: Describe @isnew parameter in ilookup5_nowait()
  fs: make sure to fail try_to_unlazy() and try_to_unlazy() for LOOKUP_CACHED
  netfs: Fix early read unlock of page with EOF in middle
  filelock: allow lease_managers to dictate what qualifies as a conflict
  filelock: add lease_dispose_list() helper
  iomap: replace folio_batch allocation with stack allocation
  media: mc: fix potential use-after-free in media_request_alloc()

Merge tag 'v6.19-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto fix from Herbert Xu:

- Fix duplicate restart messages in qat

* tag 'v6.19-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: qat - fix duplicate restarting msg during AER error

Revert "irqchip/riscv-imsic: Embed the vector array in lpriv"

__alloc_percpu() fails when the number of IDs is greater than 959
because the size parameter of __alloc_percpu() must be less than 32768 (aka
PCPU_MIN_UNIT_SIZE). This failure is observed with KVMTOOL when AIA is
trap-n-emulated by in-kernel KVM because in this case KVM guest has 2047
interrupt IDs.

To address this issue, don't embed the vector array in struct imsic_local_priv
until __alloc_percpu() supports a size parameter greater than 32768.

This reverts commit 79eaabc61dfb ("irqchip/riscv-imsic: Embed the vector
array in lpriv").

Signed-off-by: Anup Patel <anup.patel@oss.qualcomm.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20251223143544.1504217-1-anup.patel@oss.qualcomm.com

irqchip/gic-v5: Fix gicv5_its_map_event() ITTE read endianness

Kbuild bot (through sparse) reported that the ITTE read to carry out
a valid check in gicv5_its_map_event() lacks proper endianness handling.

Add the missing endianness conversion.
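
As a generic illustration of why the conversion matters, the sketch
below decodes a little-endian 64-bit table entry byte by byte before
testing a flag. The demo_* names and the position of the valid bit are
assumptions made for the example, not the GICv5 ITTE layout; the kernel
itself would use le64_to_cpu().

```c
#include <stdint.h>

#define DEMO_ITTE_VALID (UINT64_C(1) << 63)	/* hypothetical bit position */

/* Portable equivalent of a little-endian to CPU conversion: build the
 * value from the bytes, so the result is the same on any host. */
static uint64_t demo_le64_to_cpu(const uint8_t raw[8])
{
	uint64_t v = 0;

	for (int i = 7; i >= 0; i--)
		v = (v << 8) | raw[i];
	return v;
}

/* Test the flag only after converting the in-memory entry. */
static int demo_itte_is_valid(const uint8_t raw[8])
{
	return (demo_le64_to_cpu(raw) & DEMO_ITTE_VALID) != 0;
}
```

Reading the raw bytes as a native integer would give the same answer on
a little-endian host but test the wrong bit on a big-endian one.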

Fixes: 57d72196dfc8 ("irqchip/gic-v5: Add GICv5 ITS support")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://patch.msgid.link/20251222102250.435460-1-lpieralisi@kernel.org
Closes: https://lore.kernel.org/oe-kbuild-all/202512131849.30ZRTBeR-lkp@intel.com/

ublk: fix use-after-free in ublk_partition_scan_work

A race condition exists between the async partition scan work and device
teardown that can lead to a use-after-free of ub->ub_disk:

1. ublk_ctrl_start_dev() schedules partition_scan_work after add_disk()
2. ublk_stop_dev() calls ublk_stop_dev_unlocked() which does:
   - del_gendisk(ub->ub_disk)
   - ublk_detach_disk() sets ub->ub_disk = NULL
   - put_disk() which may free the disk
3. The worker ublk_partition_scan_work() then dereferences ub->ub_disk
   leading to UAF

Fix this by using ublk_get_disk()/ublk_put_disk() in the worker to hold
a reference to the disk during the partition scan. The spinlock in
ublk_get_disk() synchronizes with ublk_detach_disk() ensuring the worker
either gets a valid reference or sees NULL and exits early.

Also change flush_work() to cancel_work_sync() to avoid running the
partition scan work unnecessarily when the disk is already detached.

Fixes: 7fc4da6a304b ("ublk: scan partition in async way")
Reported-by: Ruikai Peng <ruikai@pwno.io>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

can: ctucanfd: fix SSP_SRC in cases when bit-rate is higher than 1 MBit.

The Secondary Sample Point Source field has been
set to an incorrect value by some mistake in the
past

  0b01 - SSP_SRC_NO_SSP - SSP is not used.

for data bitrates above 1 MBit/s. The correct/default
value already used for lower bitrates is

  0b00 - SSP_SRC_MEAS_N_OFFSET - SSP position = TRV_DELAY
         (Measured Transmitter delay) + SSP_OFFSET.

The related configuration register structure is described
in section 3.1.46 SSP_CFG of the CTU CAN FD
IP CORE Datasheet.

The analysis leading to the proper configuration
is described in section 2.8.3 Secondary sampling point
of the datasheet.

The change has been tested on AMD/Xilinx Zynq
with the following CTU CAN FD IP core versions:

- 2.6 aka master in the "integration with Zynq-7000 system" test
   6.12.43-rt12+ #1 SMP PREEMPT_RT kernel with CTU CAN FD git
   driver (change already included in the driver repo)
- older 2.5 snapshot with mainline kernels with this patch
   applied locally in the multiple CAN latency tester nightly runs
   6.18.0-rc4-rt3-dut #1 SMP PREEMPT_RT
   6.19.0-rc3-dut

The logs, the datasheet and sources are available at

https://canbus.pages.fel.cvut.cz/

Signed-off-by: Ondrej Ille <ondrej.ille@gmail.com>
Signed-off-by: Pavel Pisa <pisa@fel.cvut.cz>
Link: https://patch.msgid.link/20260105111620.16580-1-pisa@fel.cvut.cz
Fixes: 2dcb8e8782d8 ("can: ctucanfd: add support for CTU CAN FD open-source IP core - bus independent part.")
Cc: stable@vger.kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

sched/mm_cid: Prevent NULL mm dereference in sched_mm_cid_after_execve()

sched_mm_cid_after_execve() is called in bprm_execve()'s cleanup path even
when exec_binprm() fails. For the init task's first execve(), this causes a
problem:

  1. current->mm is NULL (kernel threads don't have an mm)
  2. sched_mm_cid_before_execve() exits early because mm is NULL
  3. exec_binprm() fails (e.g., ENOENT for missing script interpreter)
  4. sched_mm_cid_after_execve() is called with mm still NULL
  5. sched_mm_cid_fork() is called unconditionally, triggering WARN_ON

This is easily reproduced by booting with an init that is a shell script
(#!/bin/sh) where the interpreter doesn't exist in the initramfs.

Fix this by checking if t->mm is NULL before calling sched_mm_cid_fork(),
matching the behavior of sched_mm_cid_before_execve() which already
handles this case via sched_mm_cid_exit()'s early return.
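
The shape of the guard can be sketched as follows. The demo_* types and
functions are hypothetical simplifications for illustration and do not
reproduce the scheduler code.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for mm_struct/task_struct. */
struct demo_mm {
	int cid_users;
};

struct demo_task {
	struct demo_mm *mm;	/* NULL for kernel threads */
};

/* Stand-in for the fork-side setup, which requires a valid mm. */
static void demo_mm_cid_fork(struct demo_task *t)
{
	t->mm->cid_users++;
}

/* Runs on the cleanup path even when the exec failed, so a task that
 * never obtained an mm must be skipped. */
static void demo_mm_cid_after_execve(struct demo_task *t)
{
	if (t->mm)
		demo_mm_cid_fork(t);
}
```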

Fixes: b0c3d51b54f8 ("sched/mmcid: Provide precomputed maximal value")
Signed-off-by: Cong Wang <cwang@multikernel.io>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://patch.msgid.link/20251223215113.639686-1-xiyou.wangcong@gmail.com

arm64: Fix cleared E0POE bit after cpu_suspend()/resume()

TCR2_ELx.E0POE is set during smp_init().
However, this bit is not reprogrammed when the CPU enters suspension and
later resumes via cpu_resume(), as __cpu_setup() does not re-enable E0POE
and there is no save/restore logic for the TCR2_ELx system register.

As a result, the E0POE feature no longer works after cpu_resume().

To address this, save and restore TCR2_EL1 in the cpu_suspend()/cpu_resume()
path, rather than adding related logic to __cpu_setup(), taking into account
possible future extensions of the TCR2_ELx feature.

Fixes: bf83dae90fbc ("arm64: enable the Permission Overlay Extension for EL0")
Cc: <stable@vger.kernel.org> # 6.12.x
Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Kevin Brodsky <kevin.brodsky@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>