git.ipfire.org Git - thirdparty/kernel/linux.git/log

Merge tag 'for-linus-7.0-rc1a-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fix from Juergen Gross:
"A single patch fixing a boot regression when running as a Xen PV
guest. This issue was introduced in this merge window"

* tag 'for-linus-7.0-rc1a-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
x86/xen: Fix Xen PV guest boot

Merge tag 'hyperv-next-signed-20260218' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux

Pull Hyper-V updates from Wei Liu:

- Debugfs support for MSHV statistics (Nuno Das Neves)

- Support for the integrated scheduler (Stanislav Kinsburskii)

- Various fixes for MSHV memory management and hypervisor status
   handling (Stanislav Kinsburskii)

- Expose more capabilities and flags for MSHV partition management
   (Anatol Belski, Muminul Islam, Magnus Kulke)

- Miscellaneous fixes to improve code quality and stability (Carlos
   López, Ethan Nelson-Moore, Li RongQing, Michael Kelley, Mukesh
   Rathor, Purna Pavan Chandra Aekkaladevi, Stanislav Kinsburskii, Uros
   Bizjak)

- PREEMPT_RT fixes for vmbus interrupts (Jan Kiszka)

* tag 'hyperv-next-signed-20260218' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (34 commits)
  mshv: Handle insufficient root memory hypervisor statuses
  mshv: Handle insufficient contiguous memory hypervisor status
  mshv: Introduce hv_deposit_memory helper functions
  mshv: Introduce hv_result_needs_memory() helper function
  mshv: Add SMT_ENABLED_GUEST partition creation flag
  mshv: Add nested virtualization creation flag
  Drivers: hv: vmbus: Simplify allocation of vmbus_evt
  mshv: expose the scrub partition hypercall
  mshv: Add support for integrated scheduler
  mshv: Use try_cmpxchg() instead of cmpxchg()
  x86/hyperv: Fix error pointer dereference
  x86/hyperv: Reserve 3 interrupt vectors used exclusively by MSHV
  Drivers: hv: vmbus: Use kthread for vmbus interrupts on PREEMPT_RT
  x86/hyperv: Remove ASM_CALL_CONSTRAINT with VMMCALL insn
  x86/hyperv: Use savesegment() instead of inline asm() to save segment registers
  mshv: fix SRCU protection in irqfd resampler ack handler
  mshv: make field names descriptive in a header struct
  x86/hyperv: Update comment in hyperv_cleanup()
  mshv: clear eventfd counter on irqfd shutdown
  x86/hyperv: Use memremap()/memunmap() instead of ioremap_cache()/iounmap()
  ...

Merge tag 'net-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
"Including fixes from Netfilter.

  Current release - new code bugs:

   - net: fix backlog_unlock_irq_restore() vs CONFIG_PREEMPT_RT

   - eth: mlx5e: XSK, Fix unintended ICOSQ change

   - phy_port: correctly recompute the port's linkmodes

   - vsock: prevent child netns mode switch from local to global

   - couple of kconfig fixes for new symbols

  Previous releases - regressions:

   - nfc: nci: fix false-positive parameter validation for packet data

   - net: do not delay zero-copy skbs in skb_attempt_defer_free()

  Previous releases - always broken:

   - mctp: ensure our nlmsg responses to user space are zero-initialised

   - ipv6: ioam: fix heap buffer overflow in __ioam6_fill_trace_data()

   - fixes for ICMP rate limiting

  Misc:

   - intel: fix PCI device ID conflict between i40e and ipw2200"

* tag 'net-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (85 commits)
  net: nfc: nci: Fix parameter validation for packet data
  net/mlx5e: Use unsigned for mlx5e_get_max_num_channels
  net/mlx5e: Fix deadlocks between devlink and netdev instance locks
  net/mlx5e: MACsec, add ASO poll loop in macsec_aso_set_arm_event
  net/mlx5: Fix misidentification of write combining CQE during poll loop
  net/mlx5e: Fix misidentification of ASO CQE during poll loop
  net/mlx5: Fix multiport device check over light SFs
  bonding: alb: fix UAF in rlb_arp_recv during bond up/down
  bnge: fix reserving resources from FW
  eth: fbnic: Advertise supported XDP features.
  rds: tcp: fix uninit-value in __inet_bind
  net/rds: Fix NULL pointer dereference in rds_tcp_accept_one
  octeontx2-af: Fix default entries mcam entry action
  net/mlx5e: XSK, Fix unintended ICOSQ change
  ipv6: icmp: icmpv6_xrlim_allow() optimization if net.ipv6.icmp.ratelimit is zero
  ipv4: icmp: icmpv4_xrlim_allow() optimization if net.ipv4.icmp_ratelimit is zero
  ipv6: icmp: remove obsolete code in icmpv6_xrlim_allow()
  inet: move icmp_global_{credit,stamp} to a separate cache line
  icmp: prevent possible overflow in icmp_global_allow()
  selftests/net: packetdrill: add ipv4-mapped-ipv6 tests
  ...

Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Pull bpf fixes from Alexei Starovoitov:

- Fix invalid write loop logic in libbpf's bpf_linker__add_buf() (Amery
   Hung)

- Fix a potential use-after-free of BTF object (Anton Protopopov)

- Add feature detection to libbpf and avoid moving arena global
   variables on older kernels (Emil Tsalapatis)

- Remove extern declaration of bpf_stream_vprintk() from libbpf headers
   (Ihor Solodrai)

- Fix truncated netlink dumps in bpftool (Jakub Kicinski)

- Fix map_kptr grace period wait in bpf selftests (Kumar Kartikeya
   Dwivedi)

- Remove hexdump dependency while building bpf selftests (Matthieu
   Baerts)

- Complete fsession support in BPF trampolines on riscv (Menglong Dong)

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  selftests/bpf: Remove hexdump dependency
  libbpf: Remove extern declaration of bpf_stream_vprintk()
  selftests/bpf: Use vmlinux.h in test_xdp_meta
  bpftool: Fix truncated netlink dumps
  libbpf: Delay feature gate check until object prepare time
  libbpf: Do not use PROG_TYPE_TRACEPOINT program for feature gating
  bpf: Add a map/btf from a fd array more consistently
  selftests/bpf: Fix map_kptr grace period wait
  selftests/bpf: enable fsession_test on riscv64
  selftests/bpf: Adjust selftest due to function rename
  bpf, riscv: add fsession support for trampolines
  bpf: Fix a potential use-after-free of BTF object
  bpf, riscv: introduce emit_store_stack_imm64() for trampoline
  libbpf: Fix invalid write loop logic in bpf_linker__add_buf()
  libbpf: Add gating for arena globals relocation feature

net: nfc: nci: Fix parameter validation for packet data

Since commit 9c328f54741b ("net: nfc: nci: Add parameter validation for
packet data") communication with nci nfc chips is not working any more.

The mentioned commit tries to fix access of uninitialized data, but
failed to understand that in some cases the data packet is of variable
length and can therefore not be compared to the maximum packet length
given by the sizeof(struct).

Fixes: 9c328f54741b ("net: nfc: nci: Add parameter validation for packet data")
Cc: stable@vger.kernel.org
Signed-off-by: Michael Thalmeier <michael.thalmeier@hale.at>
Reported-by: syzbot+740e04c2a93467a0f8c8@syzkaller.appspotmail.com
Link: https://patch.msgid.link/20260218083000.301354-1-michael.thalmeier@hale.at
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'mlx5-misc-fixes-2026-02-18'

Tariq Toukan says:

====================
mlx5 misc fixes 2026-02-18

This patchset provides misc bug fixes from the team to the mlx5
core and Eth drivers.
====================

Link: https://patch.msgid.link/20260218072904.1764634-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5e: Use unsigned for mlx5e_get_max_num_channels

The max number of channels is always an unsigned int, use the correct
type to fix compilation errors done with strict type checking, e.g.:

error: call to ‘__compiletime_assert_1110’ declared with attribute
error: min(mlx5e_get_devlink_param_num_doorbells(mdev),
mlx5e_get_max_num_channels(mdev)) signedness error

Fixes: 74a8dadac17e ("net/mlx5e: Preparations for supporting larger number of channels")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Jacob Keller <Jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260218072904.1764634-7-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5e: Fix deadlocks between devlink and netdev instance locks

In the mentioned "Fixes" commit, various work tasks triggering devlink
health reporter recovery were switched to use netdev_trylock to protect
against concurrent tear down of the channels being recovered. But this
had the side effect of introducing potential deadlocks because of
incorrect lock ordering.

The correct lock order is described by the init flow:
probe_one -> mlx5_init_one (acquires devlink lock)
-> mlx5_init_one_devl_locked -> mlx5_register_device
-> mlx5_rescan_drivers_locked -...-> mlx5e_probe -> _mlx5e_probe
-> register_netdev (acquires rtnl lock)
-> register_netdevice (acquires netdev lock)
=> devlink lock -> rtnl lock -> netdev lock.

But in the current recovery flow, the order is wrong:
mlx5e_tx_err_cqe_work (acquires netdev lock)
-> mlx5e_reporter_tx_err_cqe -> mlx5e_health_report
-> devlink_health_report (acquires devlink lock => boom!)
-> devlink_health_reporter_recover
-> mlx5e_tx_reporter_recover -> mlx5e_tx_reporter_recover_from_ctx
-> mlx5e_tx_reporter_err_cqe_recover

The same pattern exists in:
mlx5e_reporter_rx_timeout
mlx5e_reporter_tx_ptpsq_unhealthy
mlx5e_reporter_tx_timeout

Fix these by moving the netdev_trylock calls from the work handlers
lower in the call stack, in the respective recovery functions, where
they are actually necessary.

Fixes: 8f7b00307bf1 ("net/mlx5e: Convert mlx5 netdevs to instance locking")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Jacob Keller <Jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260218072904.1764634-6-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5e: MACsec, add ASO poll loop in macsec_aso_set_arm_event

The macsec_aso_set_arm_event function calls mlx5_aso_poll_cq once
without a retry loop. If the CQE is not immediately available after
posting the WQE, the function fails unnecessarily.

Use read_poll_timeout() to poll 3-10 usecs for CQE, consistent with
other ASO polling code paths in the driver.

Fixes: 739cfa34518e ("net/mlx5: Make ASO poll CQ usable in atomic context")
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Jacob Keller <Jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260218072904.1764634-5-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5: Fix misidentification of write combining CQE during poll loop

The write combining completion poll loop uses usleep_range() which can
sleep much longer than requested due to scheduler latency. Under load,
we witnessed a 20ms+ delay until the process was rescheduled, causing
the jiffies based timeout to expire while the thread is sleeping.

The original do-while loop structure (poll, sleep, check timeout) would
exit without a final poll when waking after timeout, missing a CQE that
arrived during sleep.

Instead of the open-coded while loop, use the kernel's poll_timeout_us()
which always performs an additional check after the sleep expiration,
and is less error-prone.

Note: poll_timeout_us() doesn't accept a sleep range, by passing 10
sleep_us the sleep range effectively changes from 2-10 to 3-10 usecs.

Fixes: d98995b4bf98 ("net/mlx5: Reimplement write combining test")
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Jacob Keller <Jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260218072904.1764634-4-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5e: Fix misidentification of ASO CQE during poll loop

The ASO completion poll loop uses usleep_range() which can sleep much
longer than requested due to scheduler latency. Under load, we witnessed
a 20ms+ delay until the process was rescheduled, causing the jiffies
based timeout to expire while the thread is sleeping.

The original do-while loop structure (poll, sleep, check timeout) would
exit without a final poll when waking after timeout, missing a CQE that
arrived during sleep.

Instead of the open-coded while loop, use the kernel's
read_poll_timeout() which always performs an additional check after the
sleep expiration, and is less error-prone.

Note: read_poll_timeout() doesn't accept a sleep range, by passing 10
sleep_us the sleep range effectively changes from 2-10 to 3-10 usecs.

Fixes: 739cfa34518e ("net/mlx5: Make ASO poll CQ usable in atomic context")
Fixes: 7e3fce82d945 ("net/mlx5e: Overcome slow response for first macsec ASO WQE")
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Jacob Keller <Jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260218072904.1764634-3-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5: Fix multiport device check over light SFs

Driver is using num_vhca_ports capability to distinguish between
multiport master device and multiport slave device. num_vhca_ports is a
capability the driver sets according to the MAX num_vhca_ports
capability reported by FW. On the other hand, light SFs doesn't set the
above capbility.

This leads to wrong results whenever light SFs is checking whether he is
a multiport master or slave.

Therefore, use the MAX capability to distinguish between master and
slave devices.

Fixes: e71383fb9cd1 ("net/mlx5: Light probe local SFs")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Jacob Keller <Jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260218072904.1764634-2-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

bonding: alb: fix UAF in rlb_arp_recv during bond up/down

The ALB RX path may access rx_hashtbl concurrently with bond
teardown. During rapid bond up/down cycles, rlb_deinitialize()
frees rx_hashtbl while RX handlers are still running, leading
to a null pointer dereference detected by KASAN.

However, the root cause is that rlb_arp_recv() can still be accessed
after setting recv_probe to NULL, which is actually a use-after-free
(UAF) issue. That is the reason for using the referenced commit in the
Fixes tag.

[  214.174138] Oops: general protection fault, probably for non-canonical address 0xdffffc000000001d: 0000 [#1] SMP KASAN PTI
[  214.186478] KASAN: null-ptr-deref in range [0x00000000000000e8-0x00000000000000ef]
[  214.194933] CPU: 30 UID: 0 PID: 2375 Comm: ping Kdump: loaded Not tainted 6.19.0-rc8+ #2 PREEMPT(voluntary)
[  214.205907] Hardware name: Dell Inc. PowerEdge R730/0WCJNT, BIOS 2.14.0 01/14/2022
[  214.214357] RIP: 0010:rlb_arp_recv+0x505/0xab0 [bonding]
[  214.220320] Code: 0f 85 2b 05 00 00 48 b8 00 00 00 00 00 fc ff df 40 0f b6 ed 48 c1 e5 06 49 03 ad 78 01 00 00 48 8d 7d 28 48 89 fa 48 c1 ea 03 <0f> b6
04 02 84 c0 74 06 0f 8e 12 05 00 00 80 7d 28 00 0f 84 8c 00
[  214.241280] RSP: 0018:ffffc900073d8870 EFLAGS: 00010206
[  214.247116] RAX: dffffc0000000000 RBX: ffff888168556822 RCX: ffff88816855681e
[  214.255082] RDX: 000000000000001d RSI: dffffc0000000000 RDI: 00000000000000e8
[  214.263048] RBP: 00000000000000c0 R08: 0000000000000002 R09: ffffed11192021c8
[  214.271013] R10: ffff8888c9010e43 R11: 0000000000000001 R12: 1ffff92000e7b119
[  214.278978] R13: ffff8888c9010e00 R14: ffff888168556822 R15: ffff888168556810
[  214.286943] FS:  00007f85d2d9cb80(0000) GS:ffff88886ccb3000(0000) knlGS:0000000000000000
[  214.295966] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  214.302380] CR2: 00007f0d047b5e34 CR3: 00000008a1c2e002 CR4: 00000000001726f0
[  214.310347] Call Trace:
[  214.313070]  <IRQ>
[  214.315318]  ? __pfx_rlb_arp_recv+0x10/0x10 [bonding]
[  214.320975]  bond_handle_frame+0x166/0xb60 [bonding]
[  214.326537]  ? __pfx_bond_handle_frame+0x10/0x10 [bonding]
[  214.332680]  __netif_receive_skb_core.constprop.0+0x576/0x2710
[  214.339199]  ? __pfx_arp_process+0x10/0x10
[  214.343775]  ? sched_balance_find_src_group+0x98/0x630
[  214.349513]  ? __pfx___netif_receive_skb_core.constprop.0+0x10/0x10
[  214.356513]  ? arp_rcv+0x307/0x690
[  214.360311]  ? __pfx_arp_rcv+0x10/0x10
[  214.364499]  ? __lock_acquire+0x58c/0xbd0
[  214.368975]  __netif_receive_skb_one_core+0xae/0x1b0
[  214.374518]  ? __pfx___netif_receive_skb_one_core+0x10/0x10
[  214.380743]  ? lock_acquire+0x10b/0x140
[  214.385026]  process_backlog+0x3f1/0x13a0
[  214.389502]  ? process_backlog+0x3aa/0x13a0
[  214.394174]  __napi_poll.constprop.0+0x9f/0x370
[  214.399233]  net_rx_action+0x8c1/0xe60
[  214.403423]  ? __pfx_net_rx_action+0x10/0x10
[  214.408193]  ? lock_acquire.part.0+0xbd/0x260
[  214.413058]  ? sched_clock_cpu+0x6c/0x540
[  214.417540]  ? mark_held_locks+0x40/0x70
[  214.421920]  handle_softirqs+0x1fd/0x860
[  214.426302]  ? __pfx_handle_softirqs+0x10/0x10
[  214.431264]  ? __neigh_event_send+0x2d6/0xf50
[  214.436131]  do_softirq+0xb1/0xf0
[  214.439830]  </IRQ>

The issue is reproducible by repeatedly running
ip link set bond0 up/down while receiving ARP messages, where
rlb_arp_recv() can race with rlb_deinitialize() and dereference
a freed rx_hashtbl entry.

Fix this by setting recv_probe to NULL and then calling
synchronize_net() to wait for any concurrent RX processing to finish.
This ensures that no RX handler can access rx_hashtbl after it is freed
in bond_alb_deinitialize().

Reported-by: Liang Li <liali@redhat.com>
Fixes: 3aba891dde38 ("bonding: move processing of recv handlers into handle_frame()")
Reviewed-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Acked-by: Jay Vosburgh <jv@jvosburgh.net>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/20260218060919.101574-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

bnge: fix reserving resources from FW

HWRM_FUNC_CFG is used to reserve resources, whereas HWRM_FUNC_QCFG is
intended for querying resource information from the firmware.
Since __bnge_hwrm_reserve_pf_rings() reserves resources for a specific
PF, the command type should be HWRM_FUNC_CFG.

Fixes: 627c67f038d2 ("bng_en: Add resource management support")
Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com>
Reviewed-by: Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260218052755.4097468-1-vikas.gupta@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

eth: fbnic: Advertise supported XDP features.

Drivers are supposed to advertise the XDP features they support. This was
missed while adding XDP support.

Before:
$ ynl --family netdev --dump dev-get
...
{'ifindex': 3,
  'xdp-features': set(),
  'xdp-rx-metadata-features': set(),
  'xsk-features': set()},
...

After:
$ ynl --family netdev --dump dev-get
...
{'ifindex': 3,
  'xdp-features': {'basic', 'rx-sg'},
  'xdp-rx-metadata-features': set(),
  'xsk-features': set()},
...

Fixes: 168deb7b31b2 ("eth: fbnic: Add support for XDP_TX action")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Dimitri Daskalakis <dimitri.daskalakis1@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260218030620.3329608-1-dimitri.daskalakis1@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

rds: tcp: fix uninit-value in __inet_bind

KMSAN reported an uninit-value access in __inet_bind() when binding
an RDS TCP socket.

The uninitialized memory originates from rds_tcp_conn_alloc(),
which uses kmem_cache_alloc() to allocate the rds_tcp_connection structure.

Specifically, the field 't_client_port_group' is incremented in
rds_tcp_conn_path_connect() without being initialized first:

if (++tc->t_client_port_group >= port_groups)

Since kmem_cache_alloc() does not zero the memory, this field contains
garbage, leading to the KMSAN report.

Fix this by using kmem_cache_zalloc() to ensure the structure is
zero-initialized upon allocation.

Reported-by: syzbot+aae646f09192f72a68dc@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=aae646f09192f72a68dc
Tested-by: syzbot+aae646f09192f72a68dc@syzkaller.appspotmail.com
Fixes: a20a6992558f ("net/rds: Encode cp_index in TCP source port")
Signed-off-by: Tabrez Ahmed <tabreztalks@gmail.com>
Reviewed-by: Charalampos Mitrodimas <charmitro@posteo.net>
Reviewed-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/20260217135350.33641-1-tabreztalks@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net/rds: Fix NULL pointer dereference in rds_tcp_accept_one

Save a local pointer to new_sock->sk and hold a reference before
installing callbacks in rds_tcp_accept_one. After
rds_tcp_set_callbacks() or rds_tcp_reset_callbacks(), tc->t_sock is
set to new_sock which may race with the shutdown path. A concurrent
rds_tcp_conn_path_shutdown() may call sock_release(), which sets
new_sock->sk = NULL and may eventually free sk when the refcount
reaches zero.

Subsequent accesses to new_sock->sk->sk_state would dereference NULL,
causing the crash. The fix saves a local sk pointer before callbacks
are installed so that sk_state can be accessed safely even after
new_sock->sk is nulled, and uses sock_hold()/sock_put() to ensure
sk itself remains valid for the duration.

Fixes: 826c1004d4ae ("net/rds: rds_tcp_conn_path_shutdown must not discard messages")
Reported-by: syzbot+96046021045ffe6d7709@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=96046021045ffe6d7709
Signed-off-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/20260216222643.2391390-1-achender@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

mshv: Handle insufficient root memory hypervisor statuses

When creating guest partition objects, the hypervisor may fail to
allocate root partition pages and return an insufficient memory status.
In this case, deposit memory using the root partition ID instead.

Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Reviewed-by: Anirudh Rayabharam (Microsoft) <anirudh@anirudhrb.com>
Reviewed-by: Mukesh R <mrathor@linux.microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>

mshv: Handle insufficient contiguous memory hypervisor status

The HV_STATUS_INSUFFICIENT_CONTIGUOUS_MEMORY status indicates that the
hypervisor lacks sufficient contiguous memory for its internal allocations.

When this status is encountered, allocate and deposit
HV_MAX_CONTIGUOUS_ALLOCATION_PAGES contiguous pages to the hypervisor.
HV_MAX_CONTIGUOUS_ALLOCATION_PAGES is defined in the hypervisor headers, a
deposit of this size will always satisfy the hypervisor's requirements.

Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Reviewed-by: Anirudh Rayabharam (Microsoft) <anirudh@anirudhrb.com>
Reviewed-by: Mukesh R <mrathor@linux.microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>

mshv: Introduce hv_deposit_memory helper functions

Introduce hv_deposit_memory_node() and hv_deposit_memory() helper
functions to handle memory deposit with proper error handling.

The new hv_deposit_memory_node() function takes the hypervisor status
as a parameter and validates it before depositing pages. It checks for
HV_STATUS_INSUFFICIENT_MEMORY specifically and returns an error for
unexpected status codes.

This is a precursor patch to new out-of-memory error codes support.
No functional changes intended.

Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Reviewed-by: Anirudh Rayabharam (Microsoft) <anirudh@anirudhrb.com>
Reviewed-by: Mukesh R <mrathor@linux.microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>

mshv: Introduce hv_result_needs_memory() helper function

Replace direct comparisons of hv_result(status) against
HV_STATUS_INSUFFICIENT_MEMORY with a new hv_result_needs_memory() helper
function.

This improves code readability and provides a consistent and extendable
interface for checking out-of-memory conditions in hypercall results.

No functional changes intended.

Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Reviewed-by: Anirudh Rayabharam (Microsoft) <anirudh@anirudhrb.com>
Reviewed-by: Mukesh R <mrathor@linux.microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>

Merge tag 'mm-nonmm-stable-2026-02-18-19-56' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull more non-MM updates from Andrew Morton:

- "two fixes in kho_populate()" fixes a couple of not-major issues in
   the kexec handover code (Ran Xiaokai)

- misc singletons

* tag 'mm-nonmm-stable-2026-02-18-19-56' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  lib/group_cpus: handle const qualifier from clusters allocation type
  kho: remove unnecessary WARN_ON(err) in kho_populate()
  kho: fix missing early_memunmap() call in kho_populate()
  scripts/gdb: implement x86_page_ops in mm.py
  objpool: fix the overestimation of object pooling metadata size
  selftests/memfd: use IPC semaphore instead of SIGSTOP/SIGCONT
  delayacct: fix build regression on accounting tool

Merge tag 'mm-stable-2026-02-18-19-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull more MM  updates from Andrew Morton:

- "mm/vmscan: fix demotion targets checks in reclaim/demotion" fixes a
   couple of issues in the demotion code - pages were failed demotion
   and were finding themselves demoted into disallowed nodes (Bing Jiao)

- "Remove XA_ZERO from error recovery of dup_mmap()" fixes a rare
   mapledtree race and performs a number of cleanups (Liam Howlett)

- "mm: add bitmap VMA flag helpers and convert all mmap_prepare to use
   them" implements a lot of cleanups following on from the conversion
   of the VMA flags into a bitmap (Lorenzo Stoakes)

- "support batch checking of references and unmapping for large folios"
   implements batching to greatly improve the performance of reclaiming
   clean file-backed large folios (Baolin Wang)

- "selftests/mm: add memory failure selftests" does as claimed (Miaohe
   Lin)

* tag 'mm-stable-2026-02-18-19-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (36 commits)
  mm/page_alloc: clear page->private in free_pages_prepare()
  selftests/mm: add memory failure dirty pagecache test
  selftests/mm: add memory failure clean pagecache test
  selftests/mm: add memory failure anonymous page test
  mm: rmap: support batched unmapping for file large folios
  arm64: mm: implement the architecture-specific clear_flush_young_ptes()
  arm64: mm: support batch clearing of the young flag for large folios
  arm64: mm: factor out the address and ptep alignment into a new helper
  mm: rmap: support batched checks of the references for large folios
  tools/testing/vma: add VMA userland tests for VMA flag functions
  tools/testing/vma: separate out vma_internal.h into logical headers
  tools/testing/vma: separate VMA userland tests into separate files
  mm: make vm_area_desc utilise vma_flags_t only
  mm: update all remaining mmap_prepare users to use vma_flags_t
  mm: update shmem_[kernel]_file_*() functions to use vma_flags_t
  mm: update secretmem to use VMA flags on mmap_prepare
  mm: update hugetlbfs to use VMA flags on mmap_prepare
  mm: add basic VMA flag operation helper functions
  tools: bitmap: add missing bitmap_[subset(), andnot()]
  mm: add mk_vma_flags() bitmap flag macro helper
  ...

octeontx2-af: Fix default entries mcam entry action

As per design, AF should update the default MCAM action only when
mcam_index is -1. A bug in the previous patch caused default entries
to be changed even when the request was not for them.

Fixes: 570ba37898ec ("octeontx2-af: Update RSS algorithm index")
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260216090338.1318976-1-hkelam@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'nf-26-02-17' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Florian Westphal says:

====================
netfilter: updates for net

The following patchset contains Netfilter fixes for *net*:

1) Add missing __rcu annotations to NAT helper hook pointers in Amanda,
   FTP, IRC, SNMP and TFTP helpers.  From Sun Jian.

2-4):
- Add global spinlock to serialize nft_counter fetch+reset operations.
- Use atomic64_xchg() for nft_quota reset instead of read+subtract pattern.
   Note AI review detects a race in this change but it isn't new. The
   'racing' bit only exists to prevent constant stream of 'quota expired'
   notifications.
- Revert commit_mutex usage in nf_tables reset path, it caused
   circular lock dependency.  All from Brian Witte.

5) Fix uninitialized l3num value in nf_conntrack_h323 helper.

6) Fix musl libc compatibility in netfilter_bridge.h UAPI header. This
   change isn't nice (UAPI headers should not include libc headers), but
   as-is musl builds may fail due to redefinition of struct ethhdr.

7) Fix protocol checksum validation in IPVS for IPv6 with extension headers,
   from Julian Anastasov.

8) Fix device reference leak in IPVS when netdev goes down. Also from
   Julian.

9) Remove WARN_ON_ONCE when accessing forward path array, this can
   trigger with sufficiently long forward paths.  From Pablo Neira Ayuso.

10) Fix use-after-free in nf_tables_addchain() error path, from Inseo An.

* tag 'nf-26-02-17' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: nf_tables: fix use-after-free in nf_tables_addchain()
  net: remove WARN_ON_ONCE when accessing forward path array
  ipvs: do not keep dest_dst if dev is going down
  ipvs: skip ipv6 extension headers for csum checks
  include: uapi: netfilter_bridge.h: Cover for musl libc
  netfilter: nf_conntrack_h323: don't pass uninitialised l3num value
  netfilter: nf_tables: revert commit_mutex usage in reset path
  netfilter: nft_quota: use atomic64_xchg for reset
  netfilter: nft_counter: serialize reset with spinlock
  netfilter: annotate NAT helper hook pointers with __rcu
====================

Link: https://patch.msgid.link/20260217163233.31455-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5e: XSK, Fix unintended ICOSQ change

XSK wakeup must use the async ICOSQ (with proper locking), as it is not
guaranteed to run on the same CPU as the channel.

The commit that converted the NAPI trigger path to use the sync ICOSQ
incorrectly applied the same change to XSK, causing XSK wakeups to use
the sync ICOSQ as well. Revert XSK flows to use the async ICOSQ.

XDP program attach/detach triggers channel reopen, while XSK pool
enable/disable can happen on-the-fly via NDOs without reopening
channels. As a result, xsk_pool state cannot be reliably used at
mlx5e_open_channel() time to decide whether an async ICOSQ is needed.

Update the async_icosq_needed logic to depend on the presence of an XDP
program rather than the xsk_pool, ensuring the async ICOSQ is available
when XSK wakeups are enabled.

This fixes multiple issues:

1. Illegal synchronize_rcu() in an RCU read- side critical section via
mlx5e_xsk_wakeup() -> mlx5e_trigger_napi_icosq() ->
synchronize_net(). The stack holds RCU read-lock in xsk_poll().

2. Hitting a NULL pointer dereference in mlx5e_xsk_wakeup():

[] BUG: kernel NULL pointer dereference, address: 0000000000000240
[] #PF: supervisor read access in kernel mode
[] #PF: error_code(0x0000) - not-present page
[] PGD 0 P4D 0
[] Oops: Oops: 0000 [#1] SMP
[] CPU: 0 UID: 0 PID: 2255 Comm: qemu-system-x86 Not tainted 6.19.0-rc5+ #229 PREEMPT(none)
[] Hardware name: [...]
[] RIP: 0010:mlx5e_xsk_wakeup+0x53/0x90 [mlx5_core]

Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Closes: https://lore.kernel.org/all/20260123223916.361295-1-daniel@iogearbox.net/
Fixes: 56aca3e0f730 ("net/mlx5e: Use regular ICOSQ for triggering NAPI")
Tested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Acked-by: Alice Mikityanska <alice.kernel@fastmail.im>
Link: https://patch.msgid.link/20260217074525.1761454-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'icmp-better-deal-with-ddos'

Eric Dumazet says:

====================
icmp: better deal with DDOS

When dealing with death of big UDP servers, admins might want to
increase net.ipv4.icmp_msgs_per_sec and net.ipv4.icmp_msgs_burst
to big values (2,000,000 or more).

They also might need to tune the per-host ratelimit to 1ms or 0ms
in favor of the global rate limit.

This series fixes bugs showing up in all these needs.
====================

Link: https://patch.msgid.link/20260216142832.3834174-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ipv6: icmp: icmpv6_xrlim_allow() optimization if net.ipv6.icmp.ratelimit is zero

If net.ipv6.icmp.ratelimit is zero we do not have to call
inet_getpeer_v6() and inet_peer_xrlim_allow().

Both can be very expensive under DDOS.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260216142832.3834174-6-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ipv4: icmp: icmpv4_xrlim_allow() optimization if net.ipv4.icmp_ratelimit is zero

If net.ipv4.icmp_ratelimit is zero, we do not have to call
inet_getpeer_v4() and inet_peer_xrlim_allow().

Both can be very expensive under DDOS.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260216142832.3834174-5-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ipv6: icmp: remove obsolete code in icmpv6_xrlim_allow()

Following part was needed before the blamed commit, because
inet_getpeer_v6() second argument was the prefix.

/* Give more bandwidth to wider prefixes. */
if (rt->rt6i_dst.plen < 128)
tmo >>= ((128 - rt->rt6i_dst.plen)>>5);

Now inet_getpeer_v6() retrieves hosts, we need to remove
@tmo adjustement or wider prefixes likes /24 allow 8x
more ICMP to be sent for a given ratelimit.

As we had this issue for a while, this patch changes net.ipv6.icmp.ratelimit
default value from 1000ms to 100ms to avoid potential regressions.

Also add a READ_ONCE() when reading net->ipv6.sysctl.icmpv6_time.

Fixes: fd0273d7939f ("ipv6: Remove external dependency on rt6i_dst and rt6i_src")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20260216142832.3834174-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

inet: move icmp_global_{credit,stamp} to a separate cache line

icmp_global_credit was meant to be changed ~1000 times per second,
but if an admin sets net.ipv4.icmp_msgs_per_sec to a very high value,
icmp_global_credit changes can inflict false sharing to surrounding
fields that are read mostly.

Move icmp_global_credit and icmp_global_stamp to a separate
cacheline aligned group.

Fixes: b056b4cd9178 ("icmp: move icmp_global.credit and icmp_global.stamp to per netns storage")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260216142832.3834174-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

icmp: prevent possible overflow in icmp_global_allow()

Following expression can overflow
if sysctl_icmp_msgs_per_sec is big enough.

sysctl_icmp_msgs_per_sec * delta / HZ;

Fixes: 4cdf507d5452 ("icmp: add a global rate limitation")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260216142832.3834174-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests/net: packetdrill: add ipv4-mapped-ipv6 tests

Add ipv4-mapped-ipv6 case to ksft_runner.sh before
an upcoming TCP fix in this area.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260217142924.1853498-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mshv: Add SMT_ENABLED_GUEST partition creation flag

Add support for HV_PARTITION_CREATION_FLAG_SMT_ENABLED_GUEST
to allow userspace VMMs to enable SMT for guest partitions.

Expose this via new MSHV_PT_BIT_SMT_ENABLED_GUEST flag in the UAPI.

Without this flag, the hypervisor schedules guest VPs incorrectly,
causing SMT unusable.

Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>

mshv: Add nested virtualization creation flag

Introduce HV_PARTITION_CREATION_FLAG_NESTED_VIRTUALIZATION_CAPABLE to
indicate support for nested virtualization during partition creation.

This enables clearer configuration and capability checks for nested
virtualization scenarios.

Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>

Drivers: hv: vmbus: Simplify allocation of vmbus_evt

The per-cpu variable vmbus_evt is currently dynamically allocated. It's
only 8 bytes, so just allocate it statically to simplify and save a few
lines of code.

Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Reviewed-by: Long Li <longli@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>

mshv: expose the scrub partition hypercall

This hypercall needs to be exposed for VMMs to soft-reboot guests. It
will reset APIC and synthetic interrupt controller state, among others.

Signed-off-by: Magnus Kulke <magnuskulke@linux.microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>

mshv: Add support for integrated scheduler

Query the hypervisor for integrated scheduler support and use it if
configured.

Microsoft Hypervisor originally provided two schedulers: root and core. The
root scheduler allows the root partition to schedule guest vCPUs across
physical cores, supporting both time slicing and CPU affinity (e.g., via
cgroups). In contrast, the core scheduler delegates vCPU-to-physical-core
scheduling entirely to the hypervisor.

Direct virtualization introduces a new privileged guest partition type - L1
Virtual Host (L1VH) — which can create child partitions from its own
resources. These child partitions are effectively siblings, scheduled by
the hypervisor's core scheduler. This prevents the L1VH parent from setting
affinity or time slicing for its own processes or guest VPs. While cgroups,
CFS, and cpuset controllers can still be used, their effectiveness is
unpredictable, as the core scheduler swaps vCPUs according to its own logic
(typically round-robin across all allocated physical CPUs). As a result,
the system may appear to "steal" time from the L1VH and its children.

To address this, Microsoft Hypervisor introduces the integrated scheduler.
This allows an L1VH partition to schedule its own vCPUs and those of its
guests across its "physical" cores, effectively emulating root scheduler
behavior within the L1VH, while retaining core scheduler behavior for the
rest of the system.

The integrated scheduler is controlled by the root partition and gated by
the vmm_enable_integrated_scheduler capability bit. If set, the hypervisor
supports the integrated scheduler. The L1VH partition must then check if it
is enabled by querying the corresponding extended partition property. If
this property is true, the L1VH partition must use the root scheduler
logic; otherwise, it must use the core scheduler. This requirement makes
reading VMM capabilities in L1VH partition a requirement too.

Signed-off-by: Andreea Pintilie <anpintil@microsoft.com>
Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>

mshv: Use try_cmpxchg() instead of cmpxchg()

Use !try_cmpxchg() instead of cmpxchg (*ptr, old, new) != old.
x86 CMPXCHG instruction returns success in ZF flag, so this
change saves a compare after CMPXCHG.

The generated assembly code improves from e.g.:

     415: 48 8b 44 24 30        mov    0x30(%rsp),%rax
     41a: 48 8b 54 24 38        mov    0x38(%rsp),%rdx
     41f: f0 49 0f b1 91 a8 02 lock cmpxchg %rdx,0x2a8(%r9)
     426: 00 00
     428: 48 3b 44 24 30        cmp    0x30(%rsp),%rax
     42d: 0f 84 09 ff ff ff     je     33c <...>

to:

     415: 48 8b 44 24 30        mov    0x30(%rsp),%rax
     41a: 48 8b 54 24 38        mov    0x38(%rsp),%rdx
     41f: f0 49 0f b1 91 a8 02 lock cmpxchg %rdx,0x2a8(%r9)
     426: 00 00
     428: 0f 84 0e ff ff ff     je     33c <...>

No functional change intended.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Long Li <longli@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>

x86/hyperv: Fix error pointer dereference

The function idle_thread_get() can return an error pointer and is not
checked for it. Add check for error pointer.

Detected by Smatch:
arch/x86/hyperv/hv_vtl.c:126 hv_vtl_bringup_vcpu() error:
'idle' dereferencing possible ERR_PTR()

Fixes: 2b4b90e053a29 ("x86/hyperv: Use per cpu initial stack for vtl context")
Signed-off-by: Ethan Tidmore <ethantidmore06@gmail.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>

x86/hyperv: Reserve 3 interrupt vectors used exclusively by MSHV

MSVC compiler, used to compile the Microsoft Hypervisor, currently
has an assert intrinsic that uses interrupt vector 0x29 to create an
exception. This will cause hypervisor to then crash and collect core. As
such, if this interrupt number is assigned to a device by Linux and the
device generates it, hypervisor will crash. There are two other such
vectors hard coded in the hypervisor, 0x2C and 0x2D for debug purposes.

Fortunately, the three vectors are part of the kernel driver space and
that makes it feasible to reserve them early so they are not assigned
later.

Signed-off-by: Mukesh Rathor <mrathor@linux.microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>

selftests/bpf: Remove hexdump dependency

The verification signature header generation requires converting a
binary certificate to a C array. Previously this only worked with xxd,
and a switch to hexdump has been done in commit b640d556a2b3
("selftests/bpf: Remove xxd util dependency").

hexdump is a more common utility program, yet it might not be installed
by default. When it is not installed, BPF selftests build without
errors, but tests_progs is unusable: it exits with the 255 code and
without any error messages. When manually reproducing the issue, it is
not too hard to find out that the generated verification_cert.h file is
incorrect, but that's time consuming. When digging the BPF selftests
build logs, this line can be seen amongst thousands others, but ignored:

/bin/sh: 2: hexdump: not found

Here, od is used instead of hexdump. od is coming from the coreutils
package, and this new od command produces the same output when using od
from GNU coreutils, uutils, and even busybox. This is more portable, and
it produces a similar results to what was done before with hexdump:
there is an extra comma at the end instead of trailing whitespaces,
but the C code is not impacted.

Fixes: b640d556a2b3 ("selftests/bpf: Remove xxd util dependency")
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/r/20260218-bpf-sft-hexdump-od-v2-1-2f9b3ee5ab86@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Merge tag 'kbuild-fixes-7.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux

Pull Kbuild fixes from Nathan Chancellor:

- Ensure tools/objtool is cleaned by 'make clean' and 'make mrproper'

- Fix test program for CONFIG_CC_CAN_LINK to avoid a warning, which is
   made fatal by -Werror

- Drop explicit LZMA parallel compression in scripts/make_fit.py

- Several fixes for commit 62089b804895 ("kbuild: rpm-pkg: Generate
   debuginfo package manually")

* tag 'kbuild-fixes-7.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux:
  kbuild: rpm-pkg: Disable automatic requires for manual debuginfo package
  kbuild: rpm-pkg: Fix manual debuginfo generation when using .src.rpm
  kernel: rpm-pkg: Restore find-debuginfo.sh approach to -debuginfo package
  kbuild: rpm-pkg: Restrict manual debug package creation
  scripts/make_fit.py: Drop explicit LZMA parallel compression
  kbuild: Fix CC_CAN_LINK detection
  kbuild: Add objtool to top-level clean target

Merge branch 'libbpf-remove-extern-declaration-of-bpf_stream_vprintk'

Ihor Solodrai says:

====================
libbpf: Remove extern declaration of bpf_stream_vprintk()

The first patch adjusts a selftest that has been using
bpf_stream_printk() macro. The second patch removes the declaration.
====================

Link: https://patch.msgid.link/20260218215651.2057673-1-ihor.solodrai@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

libbpf: Remove extern declaration of bpf_stream_vprintk()

An issue was reported that building BPF program which includes both
vmlinux.h and bpf_helpers.h from libbpf fails due to conflicting
declarations of bpf_stream_vprintk().

Remove the extern declaration from bpf_helpers.h to address this.

In order to use bpf_stream_printk() macro, BPF programs are expected
to either include vmlinux.h of the kernel they are targeting, or add
their own extern declaration.

Reported-by: Luca Boccassi <luca.boccassi@gmail.com>
Closes: https://github.com/libbpf/libbpf/issues/947
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Link: https://lore.kernel.org/r/20260218215651.2057673-3-ihor.solodrai@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

selftests/bpf: Use vmlinux.h in test_xdp_meta

- Replace linux/* includes with vmlinux.h
- Include errno.h
- Include bpf_tracing_net.h for TC_ACT_* and ETH_*
- Use BPF_STDERR instead of BPF_STREAM_STDERR

Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Link: https://lore.kernel.org/r/20260218215651.2057673-2-ihor.solodrai@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Merge tag 'thermal-7.0-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull thermal control fix from Rafael Wysocki:
"This fixes a sysfs group leak on DLVR registration failure in the
Intel int340x thermal driver (Kaushlendra Kumar)"

* tag 'thermal-7.0-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal: int340x: Fix sysfs group leak on DLVR registration failure

Merge tag 'acpi-7.0-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull more ACPI support updates from Rafael J. Wysocki:
"These are mostly fixes and cleanups on top of the ACPI support updates
  merged recently, including two new quirks, an ACPI CPPC library fix,
  and fixes and cleanups of a few core ACPI device drivers:

   - Add an unused power resource handling quirk for THUNDEROBOT ZERO
     (Zhai Can)

   - Fix remaining for_each_possible_cpu() in the ACPI CPPC library to
     use online CPUs (Sean V Kelley)

   - Drop redundant checks from the ACPI notify handler and the driver
     remove callback in the ACPI battery driver (Rafael Wysocki)

   - Move the creation of the wakeup source during the ACPI button
     driver probe to an earlier point to avoid missing a wakeup event
     due to a race and clean up system wakeup handling and remove
     callback in that driver (Rafael Wysocki)

   - Drop unnecessary driver_data pointer clearing from the ACPI EC and
     SMBUS HC drivers and make the ACPI backlight (video) driver clear
     the device's driver_data pointer on remove (Rafael Wysocki)

   - Force enabling of PWM2 on the Yogabook YB1-X90 tablets (Yauhen
     Kharuzhy)"

* tag 'acpi-7.0-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: PM: Add unused power resource quirk for THUNDEROBOT ZERO
  ACPI: driver: Drop driver_data pointer clearing from two drivers
  ACPI: video: Clear driver_data pointer on remove
  ACPI: button: Tweak acpi_button_remove()
  ACPI: button: Tweak system wakeup handling
  ACPI: battery: Drop redundant checks from acpi_battery_remove()
  ACPI: CPPC: Fix remaining for_each_possible_cpu() to use online CPUs
  ACPI: x86: Force enabling of PWM2 on the Yogabook YB1-X90
  ACPI: button: Call device_init_wakeup() earlier during probe
  ACPI: battery: Drop redundant check from acpi_battery_notify()

Merge tag 'pm-7.0-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull more power management updates from Rafael Wysocki:
"These are mostly fixes on top of the power management updates merged
  recently in cpuidle governors, in the Intel RAPL power capping driver
  and in the wake IRQ management code:

   - Fix the handling of package-scope MSRs in the intel_rapl power
     capping driver when called from the PMU subsystem and make it add
     all package CPUs to the PMU cpumask to allow tools to read RAPL
     events from any CPU in the package (Kuppuswamy Satharayananyan)

   - Rework the invalid version check in the intel_rapl_tpmi power
     capping driver to account for the fact that on partitioned systems,
     multiple TPMI instances may exist per package, but RAPL registers
     are only valid on one instance (Kuppuswamy Satharayananyan)

   - Describe the new intel_idle.table command line option in the
     admin-guide intel_idle documentation (Artem Bityutskiy)

   - Fix a crash in the ladder cpuidle governor on systems with only one
     (polling) idle state available by making the cpuidle core bypass
     the governor in those cases and adjust the other existing governors
     to that change (Aboorva Devarajan, Christian Loehle)

   - Update kerneldoc comments for wake IRQ management functions that
     have not been matching the code (Wang Jiayue)"

* tag 'pm-7.0-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpuidle: menu: Remove single state handling
  cpuidle: teo: Remove single state handling
  cpuidle: haltpoll: Remove single state handling
  cpuidle: Skip governor when only one idle state is available
  powercap: intel_rapl_tpmi: Remove FW_BUG from invalid version check
  PM: sleep: wakeirq: Update outdated documentation comments
  Documentation: PM: Document intel_idle.table command line option
  powercap: intel_rapl: Expose all package CPUs in PMU cpumask
  powercap: intel_rapl: Remove incorrect CPU check in PMU context

Merge branches 'acpi-battery', 'acpi-button' and 'acpi-driver'

Merge additional updates of multiple core ACPI device drivers (battery,
button, video, EC, SMBUS HC) for 7.0-rc1:

- Drop redundant checks from the ACPI notify handler and the driver
   remove callback in the ACPI battery driver (Rafael Wysocki)

- Move the creation of the wakeup source during the ACPI button driver
   probe to an earlier point to avoid missing a wakeup event due to a
   race and clean up system wakeup handling and remove callback in that
   driver (Rafael Wysocki)

- Drop unnecessary driver_data pointer clearing from the ACPI EC and
   SMBUS HC drivers and make the ACPI backlight (video) driver clear the
   device's driver_data pointer on remove (Rafael Wysocki)

* acpi-battery:
  ACPI: battery: Drop redundant checks from acpi_battery_remove()
  ACPI: battery: Drop redundant check from acpi_battery_notify()

* acpi-button:
  ACPI: button: Tweak acpi_button_remove()
  ACPI: button: Tweak system wakeup handling
  ACPI: button: Call device_init_wakeup() earlier during probe

* acpi-driver:
  ACPI: driver: Drop driver_data pointer clearing from two drivers
  ACPI: video: Clear driver_data pointer on remove

Merge branches 'acpi-pm' and 'acpi-cppc'

Merge an ACPI power management update and an ACPI CPPC library update
for 7.0-rc1:

- Add an unused power resource handling quirk for THUNDEROBOT ZERO (Zhai
   Can)

- Fix remaining for_each_possible_cpu() in the ACPI CPPC library to use
   online CPUs (Sean V Kelley)

* acpi-pm:
  ACPI: PM: Add unused power resource quirk for THUNDEROBOT ZERO

* acpi-cppc:
  ACPI: CPPC: Fix remaining for_each_possible_cpu() to use online CPUs

Merge branches 'pm-powercap' and 'pm-cpuidle'

Merge additional power capping and cpuidle updates for 7.0-rc1:

- Fix the handling of package-scope MSRs in the intel_rapl power
   capping driver when called from the PMU subsystem and make it add all
   package CPUs to the PMU cpumask to allow tools to read RAPL events
   from any CPU in the package (Kuppuswamy Sathyanarayanan)

- Rework the invalid version check in the intel_rapl_tpmi power capping
   driver to account for the fact that on partitioned systems, multiple
   TPMI instances may exist per package, but RAPL registers are only
   valid on one instance (Kuppuswamy Satharayananyan)

- Describe the new intel_idle.table command line option in the
   admin-guide intel_idle documentation (Artem Bityutskiy)

- Fix a crash in the ladder cpuidle governor on systems with only one
   (polling) idle state available by making the cpuidle core bypass the
   governor in those cases and adjust the other existing governors to
   that change (Aboorva Devarajan, Christian Loehle)

* pm-powercap:
  powercap: intel_rapl_tpmi: Remove FW_BUG from invalid version check
  powercap: intel_rapl: Expose all package CPUs in PMU cpumask
  powercap: intel_rapl: Remove incorrect CPU check in PMU context

* pm-cpuidle:
  cpuidle: menu: Remove single state handling
  cpuidle: teo: Remove single state handling
  cpuidle: haltpoll: Remove single state handling
  cpuidle: Skip governor when only one idle state is available
  Documentation: PM: Document intel_idle.table command line option

Merge tag 'sysctl-7.00-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl

Pull sysctl updates from Joel Granados:

- Remove macros from proc handler converters

   Replace the proc converter macros with "regular" functions. Though it
   is more verbose than the macro version, it helps when debugging and
   better aligns with coding-style.rst.

- General cleanup

   Remove superfluous ctl_table forward declarations. Const qualify the
   memory_allocation_profiling_sysctl and loadpin_sysctl_table arrays.
   Add missing kernel doc to proc_dointvec_conv.

- Testing

   This series was run through sysctl selftests/kunit test suite in
   x86_64. And went into linux-next after rc4, giving it a good 3 weeks
   of testing

* tag 'sysctl-7.00-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl:
  sysctl: replace SYSCTL_INT_CONV_CUSTOM macro with functions
  sysctl: Replace unidirectional INT converter macros with functions
  sysctl: Add kernel doc to proc_douintvec_conv
  sysctl: Replace UINT converter macros with functions
  sysctl: Add CONFIG_PROC_SYSCTL guards for converter macros
  sysctl: clarify proc_douintvec_minmax doc
  sysctl: Return -ENOSYS from proc_douintvec_conv when CONFIG_PROC_SYSCTL=n
  sysctl: Remove unused ctl_table forward declarations
  loadpin: Implement custom proc_handler for enforce
  alloc_tag: move memory_allocation_profiling_sysctls into .rodata
  sysctl: Add missing kernel-doc for proc_dointvec_conv

Merge tag 'turbostat-2026.02.14-AMD-RAPL-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux

Pull turbostat fix from Len Brown:
"Fix a recent AMD regression due to errant code cleanup"

* tag 'turbostat-2026.02.14-AMD-RAPL-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
tools/power turbostat: Fix AMD RAPL regression

x86/xen: Fix Xen PV guest boot

A recent patch moving the call of sparse_init() to common mm code
broke booting as a Xen PV guest.

Reason is that the Xen PV specific boot code relied on struct page area
being accessible rather early, but this changed by the move of the call
of sparse_init().

Fortunately the fix is rather easy: there is a static branch available
indicating whether struct page contents are usable by Xen. This static
branch just needs to be tested in some places for avoiding the access
of struct page.

Fixes: 4267739cabb8 ("arch, mm: consolidate initialization of SPARSE memory model")
Signed-off-by: Juergen Gross <jgross@suse.com>
Message-ID: <20260214135035.119357-1-jgross@suse.com>

Drivers: hv: vmbus: Use kthread for vmbus interrupts on PREEMPT_RT

Resolves the following lockdep report when booting PREEMPT_RT on Hyper-V
with related guest support enabled:

[    1.127941] hv_vmbus: registering driver hyperv_drm

[    1.132518] =============================
[    1.132519] [ BUG: Invalid wait context ]
[    1.132521] 6.19.0-rc8+ #9 Not tainted
[    1.132524] -----------------------------
[    1.132525] swapper/0/0 is trying to lock:
[    1.132526] ffff8b9381bb3c90 (&channel->sched_lock){....}-{3:3}, at: vmbus_chan_sched+0xc4/0x2b0
[    1.132543] other info that might help us debug this:
[    1.132544] context-{2:2}
[    1.132545] 1 lock held by swapper/0/0:
[    1.132547]  #0: ffffffffa010c4c0 (rcu_read_lock){....}-{1:3}, at: vmbus_chan_sched+0x31/0x2b0
[    1.132557] stack backtrace:
[    1.132560] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.19.0-rc8+ #9 PREEMPT_{RT,(lazy)}
[    1.132565] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 09/25/2025
[    1.132567] Call Trace:
[    1.132570]  <IRQ>
[    1.132573]  dump_stack_lvl+0x6e/0xa0
[    1.132581]  __lock_acquire+0xee0/0x21b0
[    1.132592]  lock_acquire+0xd5/0x2d0
[    1.132598]  ? vmbus_chan_sched+0xc4/0x2b0
[    1.132606]  ? lock_acquire+0xd5/0x2d0
[    1.132613]  ? vmbus_chan_sched+0x31/0x2b0
[    1.132619]  rt_spin_lock+0x3f/0x1f0
[    1.132623]  ? vmbus_chan_sched+0xc4/0x2b0
[    1.132629]  ? vmbus_chan_sched+0x31/0x2b0
[    1.132634]  vmbus_chan_sched+0xc4/0x2b0
[    1.132641]  vmbus_isr+0x2c/0x150
[    1.132648]  __sysvec_hyperv_callback+0x5f/0xa0
[    1.132654]  sysvec_hyperv_callback+0x88/0xb0
[    1.132658]  </IRQ>
[    1.132659]  <TASK>
[    1.132660]  asm_sysvec_hyperv_callback+0x1a/0x20

As code paths that handle vmbus IRQs use sleepy locks under PREEMPT_RT,
the vmbus_isr execution needs to be moved into thread context. Open-
coding this allows to skip the IPI that irq_work would additionally
bring and which we do not need, being an IRQ, never an NMI.

This affects both x86 and arm64, therefore hook into the common driver
logic.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Florian Bezdeka <florian.bezdeka@siemens.com>
Tested-by: Florian Bezdeka <florian.bezdeka@siemens.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>

x86/hyperv: Remove ASM_CALL_CONSTRAINT with VMMCALL insn

Unlike CALL instruction, VMMCALL does not push to the stack, so it's
OK to allow the compiler to insert it before the frame pointer gets
set up by the containing function. ASM_CALL_CONSTRAINT is for CALLs
that must be inserted after the frame pointer is set up, so it is
over-constraining here and can be removed.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>

x86/hyperv: Use savesegment() instead of inline asm() to save segment registers

Use standard savesegment() utility macro to save segment registers.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Acked-by: Wei Liu <wei.liu@kernel.org>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>

eth: fbnic: Add validation for MTU changes

Increasing the MTU beyond the HDS threshold causes the hardware to
fragment packets across multiple buffers. If a single-buffer XDP program
is attached, the driver will drop all multi-frag frames. While we can't
prevent a remote sender from sending non-TCP packets larger than the MTU,
this will prevent users from inadvertently breaking new TCP streams.

Traditionally, drivers supported XDP with MTU less than 4Kb
(packet per page). Fbnic currently prevents attaching XDP when MTU is too high.
But it does not prevent increasing MTU after XDP is attached.

Fixes: 1b0a3950dbd4 ("eth: fbnic: Add XDP pass, drop, abort support")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Dimitri Daskalakis <dimitri.daskalakis1@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

tools/power turbostat: Fix AMD RAPL regression

turbostat.c:8688: rapl_perf_init: Assertion `next_domain < num_domains' failed.

Two recent cleanup patches that were not supposed to change anything
broke the core_id code needed for AMD RAPL initialization:

commit 070e92361eec ("tools/power turbostat: Enhance HT enumeration")
commit ddf60e38ca04 ("tools/power turbostat: Simplify global core_id calculation")

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>

net/sched: act_skbedit: fix divide-by-zero in tcf_skbedit_hash()

Commit 38a6f0865796 ("net: sched: support hash selecting tx queue")
added SKBEDIT_F_TXQ_SKBHASH support. The inclusive range size is
computed as:

mapping_mod = queue_mapping_max - queue_mapping + 1;

The range size can be 65536 when the requested range covers all possible
u16 queue IDs (e.g. queue_mapping=0 and queue_mapping_max=U16_MAX).
That value cannot be represented in a u16 and previously wrapped to 0,
so tcf_skbedit_hash() could trigger a divide-by-zero:

queue_mapping += skb_get_hash(skb) % params->mapping_mod;

Compute mapping_mod in a wider type and reject ranges larger than U16_MAX
to prevent params->mapping_mod from becoming 0 and avoid the crash.

Fixes: 38a6f0865796 ("net: sched: support hash selecting tx queue")
Cc: stable@vger.kernel.org # 6.12+
Signed-off-by: Ruitong Liu <cnitlrt@gmail.com>
Link: https://patch.msgid.link/20260213175948.1505257-1-cnitlrt@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

vsock: document namespace mode sysctls

Add documentation for the vsock per-namespace sysctls (`ns_mode` and
`child_ns_mode`) to Documentation/admin-guide/sysctl/net.rst.
These sysctls were introduced by commit eafb64f40ca4 ("vsock: add
netns to vsock core").

Document the two namespace modes (`global` and `local`), the
inheritance behavior of `child_ns_mode`, and the restriction preventing
local namespaces from setting `child_ns_mode` to `global`.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://patch.msgid.link/20260216163147.236844-1-sgarzare@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: ethernet: ec_bhf: Fix dma_free_coherent() dma handle

dma_free_coherent() in error path takes priv->rx_buf.alloc_len as
the dma handle. This would lead to improper unmapping of the buffer.

Change the dma handle to priv->rx_buf.alloc_phys.

Fixes: 6af55ff52b02 ("Driver for Beckhoff CX5020 EtherCAT master module.")
Cc: <stable@vger.kernel.org>
Signed-off-by: Thomas Fourier <fourier.thomas@gmail.com>
Link: https://patch.msgid.link/20260213164340.77272-2-fourier.thomas@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

macvlan: observe an RCU grace period in macvlan_common_newlink() error path

valis reported that a race condition still happens after my prior patch.

macvlan_common_newlink() might have made @dev visible before
detecting an error, and its caller will directly call free_netdev(dev).

We must respect an RCU period, either in macvlan or the core networking
stack.

After adding a temporary mdelay(1000) in macvlan_forward_source_one()
to open the race window, valis repro was:

ip link add p1 type veth peer p2
ip link set address 00:00:00:00:00:20 dev p1
ip link set up dev p1
ip link set up dev p2
ip link add mv0 link p2 type macvlan mode source

(ip link add invalid% link p2 type macvlan mode source macaddr add
00:00:00:00:00:20 &) ; sleep 0.5 ; ping -c1 -I p1 1.2.3.4
PING 1.2.3.4 (1.2.3.4): 56 data bytes
RTNETLINK answers: Invalid argument

BUG: KASAN: slab-use-after-free in macvlan_forward_source
(drivers/net/macvlan.c:408 drivers/net/macvlan.c:444)
Read of size 8 at addr ffff888016bb89c0 by task e/175

CPU: 1 UID: 1000 PID: 175 Comm: e Not tainted 6.19.0-rc8+ #33 NONE
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
Call Trace:
<IRQ>
dump_stack_lvl (lib/dump_stack.c:123)
print_report (mm/kasan/report.c:379 mm/kasan/report.c:482)
? macvlan_forward_source (drivers/net/macvlan.c:408 drivers/net/macvlan.c:444)
kasan_report (mm/kasan/report.c:597)
? macvlan_forward_source (drivers/net/macvlan.c:408 drivers/net/macvlan.c:444)
macvlan_forward_source (drivers/net/macvlan.c:408 drivers/net/macvlan.c:444)
? tasklet_init (kernel/softirq.c:983)
macvlan_handle_frame (drivers/net/macvlan.c:501)

Allocated by task 169:
kasan_save_stack (mm/kasan/common.c:58)
kasan_save_track (./arch/x86/include/asm/current.h:25
mm/kasan/common.c:70 mm/kasan/common.c:79)
__kasan_kmalloc (mm/kasan/common.c:419)
__kvmalloc_node_noprof (./include/linux/kasan.h:263 mm/slub.c:5657
mm/slub.c:7140)
alloc_netdev_mqs (net/core/dev.c:12012)
rtnl_create_link (net/core/rtnetlink.c:3648)
rtnl_newlink (net/core/rtnetlink.c:3830 net/core/rtnetlink.c:3957
net/core/rtnetlink.c:4072)
rtnetlink_rcv_msg (net/core/rtnetlink.c:6958)
netlink_rcv_skb (net/netlink/af_netlink.c:2550)
netlink_unicast (net/netlink/af_netlink.c:1319 net/netlink/af_netlink.c:1344)
netlink_sendmsg (net/netlink/af_netlink.c:1894)
__sys_sendto (net/socket.c:727 net/socket.c:742 net/socket.c:2206)
__x64_sys_sendto (net/socket.c:2209)
do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:131)

Freed by task 169:
kasan_save_stack (mm/kasan/common.c:58)
kasan_save_track (./arch/x86/include/asm/current.h:25
mm/kasan/common.c:70 mm/kasan/common.c:79)
kasan_save_free_info (mm/kasan/generic.c:587)
__kasan_slab_free (mm/kasan/common.c:287)
kfree (mm/slub.c:6674 mm/slub.c:6882)
rtnl_newlink (net/core/rtnetlink.c:3845 net/core/rtnetlink.c:3957
net/core/rtnetlink.c:4072)
rtnetlink_rcv_msg (net/core/rtnetlink.c:6958)
netlink_rcv_skb (net/netlink/af_netlink.c:2550)
netlink_unicast (net/netlink/af_netlink.c:1319 net/netlink/af_netlink.c:1344)
netlink_sendmsg (net/netlink/af_netlink.c:1894)
__sys_sendto (net/socket.c:727 net/socket.c:742 net/socket.c:2206)
__x64_sys_sendto (net/socket.c:2209)
do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:131)

Fixes: f8db6475a836 ("macvlan: fix error recovery in macvlan_common_newlink()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: valis <sec@valis.email>
Link: https://patch.msgid.link/20260213142557.3059043-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests: tc_actions: don't dump 2MB of \0 to stdout

Since we started running selftests in NIPA we have been seeing
tc_actions.sh generate a soft lockup warning on ~20% of the runs.
On the pre-netdev foundation setup it was actually a missed irq
splat from the console. Now it's either that or a lockup.

I initially suspected a socket locking issue since the test
is exercising local loopback with act_mirred.
After hours of staring at this I noticed in strace that ncat
when -o $file is specified _both_ saves the output to the file
and still prints it to stdout. Because the file being sent
is constructed with:

dd conv=sparse status=none if=/dev/zero bs=1M count=2 of=$mirred
^^^^^^^^^

the data printed is all \0. Most terminals don't display nul
characters (and neither does vng output capture save them).
But QEMU's serial console still has to poke them thru which
is very slow and causes the lockup (if the file is >600kB).

Replace the '-o $file' with '> $file'. This speeds the test up
from 2m20s to 18s on debug kernels, and prevents the warnings.

Fixes: ca22da2fbd69 ("act_mirred: use the backlog for nested calls to mirred ingress")
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260214035159.2119699-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ipv6: addrconf: reduce default temp_valid_lft to 2 days

This is a recommendation from RFC 8981 and it was intended to be changed
by commit 969c54646af0 ("ipv6: Implement draft-ietf-6man-rfc4941bis")
but it only changed the sysctl documentation.

Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260214172543.5783-1-fmancera@suse.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ping: annotate data-races in ping_lookup()

isk->inet_num, isk->inet_rcv_saddr and sk->sk_bound_dev_if
are read locklessly in ping_lookup().

Add READ_ONCE()/WRITE_ONCE() annotations.

The race on isk->inet_rcv_saddr is probably coming from IPv6 support,
but does not deserve a specific backport.

Fixes: dbca1596bbb0 ("ping: convert to RCU lookups, get rid of rwlock")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260216100149.3319315-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: dsa: MxL862xx: don't force-enable MAXLINEAR_GPHY

The newly added dsa driver attempts to enable the corresponding PHY driver,
but that one has additional dependencies that may not be available:

WARNING: unmet direct dependencies detected for MAXLINEAR_GPHY
  Depends on [m]: NETDEVICES [=y] && PHYLIB [=y] && (HWMON [=m] || HWMON [=m]=n [=n])
  Selected by [y]:
  - NET_DSA_MXL862 [=y] && NETDEVICES [=y] && NET_DSA [=y]
aarch64-linux-ld: drivers/net/phy/mxl-gpy.o: in function `gpy_probe':
mxl-gpy.c:(.text.gpy_probe+0x13c): undefined reference to `devm_hwmon_device_register_with_info'
aarch64-linux-ld: drivers/net/phy/mxl-gpy.o: in function `gpy_hwmon_read':
mxl-gpy.c:(.text.gpy_hwmon_read+0x48): undefined reference to `polynomial_calc'

There is actually no compile-time dependency, as DSA correctly uses the
PHY abstractions. Remove the 'select' statement to reduce the complexity.

Fixes: 23794bec1cb6 ("net: dsa: add basic initial driver for MxL862xx switches")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Daniel Golle <daniel@makrotopia.org>
Link: https://patch.msgid.link/20260216105522.2382373-1-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

dpll: zl3073x: Fix ref frequency setting

The frequency for an input reference is computed as:

frequency = freq_base * freq_mult * freq_ratio_m / freq_ratio_n

Before commit 5bc02b190a3fb ("dpll: zl3073x: Cache all reference
properties in zl3073x_ref"), zl3073x_dpll_input_pin_frequency_set()
explicitly wrote 1 to both the REF_RATIO_M and REF_RATIO_N hardware
registers whenever a new frequency was set. This ensured the FEC ratio
was always reset to 1:1 alongside the new base/multiplier values.

The refactoring in that commit introduced zl3073x_ref_freq_set() to
update the cached ref state, but this helper only sets freq_base and
freq_mult without resetting freq_ratio_m and freq_ratio_n to 1. Because
zl3073x_ref_state_set() uses a compare-and-write strategy, unchanged
ratio fields are never written to the hardware. If the device previously
had non-unity FEC ratio values, they remain in effect after a frequency
change, resulting in an incorrect computed frequency.

Explicitly set freq_ratio_m and freq_ratio_n to 1 in zl3073x_ref_freq_set()
to restore the original behavior.

Fixes: 5bc02b190a3fb ("dpll: zl3073x: Cache all reference properties in zl3073x_ref")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260216194007.680416-1-ivecera@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: do not delay zero-copy skbs in skb_attempt_defer_free()

After the blamed commit, TCP tx zero copy notifications could be
arbitrarily delayed and cause regressions in applications waiting
for them.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: e20dfbad8aab ("net: fix napi_consume_skb() with alien skbs")
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260216193653.627617-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: psp: select CONFIG_SKB_EXTENSIONS

psp now uses skb extensions, failing to build when that is disabled:

In file included from include/net/psp.h:7,
                 from net/psp/psp_sock.c:9:
include/net/psp/functions.h: In function '__psp_skb_coalesce_diff':
include/net/psp/functions.h:60:13: error: implicit declaration of function 'skb_ext_find'; did you mean 'skb_ext_copy'? [-Wimplicit-function-declaration]
   60 |         a = skb_ext_find(one, SKB_EXT_PSP);
      |             ^~~~~~~~~~~~
      |             skb_ext_copy
include/net/psp/functions.h:60:31: error: 'SKB_EXT_PSP' undeclared (first use in this function)
   60 |         a = skb_ext_find(one, SKB_EXT_PSP);
      |                               ^~~~~~~~~~~
include/net/psp/functions.h:60:31: note: each undeclared identifier is reported only once for each function it appears in
include/net/psp/functions.h: In function '__psp_sk_rx_policy_check':
include/net/psp/functions.h:94:53: error: 'SKB_EXT_PSP' undeclared (first use in this function)
   94 |         struct psp_skb_ext *pse = skb_ext_find(skb, SKB_EXT_PSP);
      |                                                     ^~~~~~~~~~~
net/psp/psp_sock.c: In function 'psp_sock_recv_queue_check':
net/psp/psp_sock.c:164:41: error: 'SKB_EXT_PSP' undeclared (first use in this function)
  164 |                 pse = skb_ext_find(skb, SKB_EXT_PSP);
      |                                         ^~~~~~~~~~~

Select the Kconfig symbol as we do from its other users.

Fixes: 6b46ca260e22 ("net: psp: add socket security association code")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Daniel Zahka <daniel.zahka@gmail.com>
Link: https://patch.msgid.link/20260216105500.2382181-1-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

bpftool: Fix truncated netlink dumps

Netlink requires that the recv buffer used during dumps is at least
min(PAGE_SIZE, 8k) (see the man page). Otherwise the messages will
get truncated. Make sure bpftool follows this requirement, avoid
missing information on systems with large pages.

Acked-by: Quentin Monnet <qmo@kernel.org>
Fixes: 7084566a236f ("tools/bpftool: Remove libbpf_internal.h usage in bpftool")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20260217194150.734701-1-kuba@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

ipv6: fix a race in ip6_sock_set_v6only()

It is unlikely that this function will be ever called
with isk->inet_num being not zero.

Perform the check on isk->inet_num inside the locked section
for complete safety.

Fixes: 9b115749acb24 ("ipv6: add ip6_sock_set_v6only")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de>
Link: https://patch.msgid.link/20260216102202.3343588-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'turbostat-2026.02.14' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux

Pull turbostat updates from Len Brown:

- Add L2 statistics columns for recent Intel processors:
L2MRPS = L2 Cache M-References Per Second
L2%hit = L2 Cache Hit %

- Sort work and output by cpu# rather than core#

- Minor features and fixes

* tag 'turbostat-2026.02.14' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (23 commits)
  tools/power turbostat: version 2026.02.14
  tools/power turbostat: Fix and document --header_iterations
  tools/power turbostat: Use strtoul() for iteration parsing
  tools/power turbostat: Favor cpu# over core#
  tools/power turbostat: Expunge logical_cpu_id
  tools/power turbostat: Enhance HT enumeration
  tools/power turbostat: Simplify global core_id calculation
  tools/power turbostat: Unify even/odd/average counter referencing
  tools/power turbostat: Allocate average counters dynamically
  tools/power turbostat: Delete core_data.core_id
  tools/power turbostat: Rename physical_core_id to core_id
  tools/power turbostat: Cleanup package_id
  tools/power turbostat: Cleanup internal use of "base_cpu"
  tools/power turbostat: Add L2 cache statistics
  tools/power turbostat: Remove redundant newlines from err(3) strings
  tools/power turbostat: Allow more use of is_hybrid flag
  tools/power turbostat: Rename "LLCkRPS" column to "LLCMRPS"
  tools/power turbostat.8: Document the "--force" option
  tools/power turbostat: Harden against unexpected values
  tools/power turbostat: Dump hypervisor name
  ...

Merge tag 'ntfs3_for_7.0' of https://github.com/Paragon-Software-Group/linux-ntfs3

Pull ntfs3 updates from Konstantin Komarov:
"New code:
   - improve readahead for bitmap initialization and large directory scans
   - fsync files by syncing parent inodes
   - drop of preallocated clusters for sparse and compressed files
   - zero-fill folios beyond i_valid in ntfs_read_folio()
   - implement llseek SEEK_DATA/SEEK_HOLE by scanning data runs
   - implement iomap-based file operations
   - allow explicit boolean acl/prealloc mount options
   - fall-through between switch labels
   - delayed-allocation (delalloc) support

  Fixes:
   - check return value of indx_find to avoid infinite loop
   - initialize new folios before use
   - infinite loop in attr_load_runs_range on inconsistent metadata
   - infinite loop triggered by zero-sized ATTR_LIST
   - ntfs_mount_options leak in ntfs_fill_super()
   - deadlock in ni_read_folio_cmpr
   - circular locking dependency in run_unpack_ex
   - prevent infinite loops caused by the next valid being the same
   - restore NULL folio initialization in ntfs_writepages()
   - slab-out-of-bounds read in DeleteIndexEntryRoot

  Updates:
   - allow readdir() to finish after directory mutations without rewinddir()
   - handle attr_set_size() errors when truncating files
   - make ntfs_writeback_ops static
   - refactor duplicate kmemdup pattern in do_action()
   - avoid calling run_get_entry() when run == NULL in ntfs_read_run_nb_ra()

  Replaced:
   - use wait_on_buffer() directly
   - rename ni_readpage_cmpr into ni_read_folio_cmpr"

* tag 'ntfs3_for_7.0' of https://github.com/Paragon-Software-Group/linux-ntfs3: (26 commits)
  fs/ntfs3: add delayed-allocation (delalloc) support
  fs/ntfs3: avoid calling run_get_entry() when run == NULL in ntfs_read_run_nb_ra()
  fs/ntfs3: add fall-through between switch labels
  fs/ntfs3: allow explicit boolean acl/prealloc mount options
  fs/ntfs3: Fix slab-out-of-bounds read in DeleteIndexEntryRoot
  ntfs3: Restore NULL folio initialization in ntfs_writepages()
  ntfs3: Refactor duplicate kmemdup pattern in do_action()
  fs/ntfs3: prevent infinite loops caused by the next valid being the same
  fs/ntfs3: make ntfs_writeback_ops static
  ntfs3: fix circular locking dependency in run_unpack_ex
  fs/ntfs3: implement iomap-based file operations
  fs/ntfs3: fix deadlock in ni_read_folio_cmpr
  fs/ntfs3: implement llseek SEEK_DATA/SEEK_HOLE by scanning data runs
  fs/ntfs3: zero-fill folios beyond i_valid in ntfs_read_folio()
  fs/ntfs3: handle attr_set_size() errors when truncating files
  fs/ntfs3: drop preallocated clusters for sparse and compressed files
  fs/ntfs3: fsync files by syncing parent inodes
  fs/ntfs3: fix ntfs_mount_options leak in ntfs_fill_super()
  fs/ntfs3: allow readdir() to finish after directory mutations without rewinddir()
  fs/ntfs3: improve readahead for bitmap initialization and large directory scans
  ...

Merge tag 'ceph-for-7.0-rc1' of https://github.com/ceph/ceph-client

Pull ceph updates from Ilya Dryomov:
"This adds support for the upcoming aes256k key type in CephX that is
  based on Kerberos 5 and brings a bunch of assorted CephFS fixes from
  Ethan and Sam. One of Sam's patches in particular undoes a change in
  the fscrypt area that had an inadvertent side effect of making CephFS
  behave as if mounted with wsize=4096 and leading to the corresponding
  degradation in performance, especially for sequential writes"

* tag 'ceph-for-7.0-rc1' of https://github.com/ceph/ceph-client:
  ceph: assert loop invariants in ceph_writepages_start()
  ceph: remove error return from ceph_process_folio_batch()
  ceph: fix write storm on fscrypted files
  ceph: do not propagate page array emplacement errors as batch errors
  ceph: supply snapshot context in ceph_uninline_data()
  ceph: supply snapshot context in ceph_zero_partial_object()
  libceph: adapt ceph_x_challenge_blob hashing and msgr1 message signing
  libceph: add support for CEPH_CRYPTO_AES256KRB5
  libceph: introduce ceph_crypto_key_prepare()
  libceph: generalize ceph_x_encrypt_offset() and ceph_x_encrypt_buflen()
  libceph: define and enforce CEPH_MAX_KEY_LEN

Merge tag 'ovl-update-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs

Pull overlayfs update from Amir Goldstein:
"Relax the semantics of uuid=off to cater to a use case of overlayfs
  lower layers on btrfs clones, whose UUID are ephemeral and an upper
  layer on a different filesystem"

* tag 'ovl-update-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs:
  ovl: relax requirement for uuid=off,index=on

Merge tag 'v7.0-rc-part2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fixes from Steve French:

- Fix three potential double free vulnerabilities

- Fix data corruption due to racy lease checks

- Enforce SMB1 signing verification checks

- Fix invalid mount option parsing

- Remove unneeded tracepoint

- Various minor error code corrections

- Minor cleanup

* tag 'v7.0-rc-part2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  smb: client: terminate session upon failed client required signing
  cifs: some missing initializations on replay
  cifs: remove unnecessary tracing after put tcon
  cifs: update internal module version number
  smb: client: fix data corruption due to racy lease checks
  smb/client: move NT_STATUS_MORE_ENTRIES
  smb/client: rename to NT_ERROR_INVALID_DATATYPE
  smb/client: rename to NT_STATUS_SOME_NOT_MAPPED
  smb/client: map NT_STATUS_PRIVILEGE_NOT_HELD
  smb/client: map NT_STATUS_MORE_PROCESSING_REQUIRED
  smb/client: map NT_STATUS_BUFFER_OVERFLOW
  smb/client: map NT_STATUS_NOTIFY_ENUM_DIR
  cifs: SMB1 split: Remove duplicate include of cifs_debug.h
  smb: client: fix regression with mount options parsing

Merge branch 'libbpf-fix-perm-errors-for-ldimm_64_full_range_off'

Emil Tsalapatis says:

====================
libbpf: Fix perm errors for LDIMM_64_FULL_RANGE_OFF

Commit 728ff167910e ("libbpf: Add gating for arena globals relocation feature")
adds a feature flag for testing whether the running kernel supports LDIMM64
instructions with large direct offsets. Fix two edge cases that can
cause unexpected -EPERM errors in two ways:

1) The probe program used for the feature has type TRACEPOINT, but it's
possible the caller does not have permission to load it, even if it is
able to do so for generic BPF programs. Use the SOCKET_FILTER type
instead that requires fewer permissions. This does not affect the check
itself, which will always fail verification anyway.

2) The probe is triggered during bpf_object__collect_relos(), itself
called in bpf_object_open(), to compute the arena relocation offsets of
arena variables. However, the caller may not have permissions to load
BPF programs. This is the case in some systems with the bpftool calls made
by the BPF selftests during compilation, e.g., for skeleton generation.
Move all uses of the feature check to bpf_object_prepare() time instead.

Fixes: 728ff167910e ("libbpf: Add gating for arena globals relocation feature")
Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com>
v2 -> v3: https://lore.kernel.org/bpf/20260214021014.15670-1-emil@etsalapatis.com/

- Only zero out the first byte of the log buffer (Andrii)
- Minimize invocations of the feature gate (Andrii)

v1 -> v2: https://lore.kernel.org/bpf/20260213181752.505318-1-emil@etsalapatis.com/

- Adjust the hash of the original commit post-tree rebase
- Ensure close() is not called on invalid prog_fd in feature probe (Coverity)
====================

Link: https://patch.msgid.link/20260217204345.548648-1-emil@etsalapatis.com
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

libbpf: Delay feature gate check until object prepare time

Commit 728ff167910e ("libbpf: Add gating for arena globals relocation feature")
adds a feature gate check that loads a map and BPF program to
test the running kernel supports large direct offsets for LDIMM64
instructions. This check is currently used to calculate arena symbol
offsets during bpf_object__collect_relos, itself called by
bpf_object_open.

However, the program calling bpf_object_open may not have the permissions to
load maps and programs. This is the case with the BPF selftests, where
bpftool is invoked at compilation time during skeleton generation. This
causes errors as the feature gate unexpectedly fails with -EPERM.

Avoid this by moving all the use of the FEAT_LDIMM64_FULL_RANGE_OFF feature gate
to BPF object preparation time instead.

Fixes: 728ff167910e ("libbpf: Add gating for arena globals relocation feature")
Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20260217204345.548648-3-emil@etsalapatis.com

libbpf: Do not use PROG_TYPE_TRACEPOINT program for feature gating

Commit 728ff167910e uses a PROG_TYPE_TRACEPOINT BPF test program to
check whether the running kernel supports large LDIMM64 offsets. The
feature gate incorrectly assumes that the program will fail at
verification time with one of two messages, depending on whether the
feature is supported by the running kernel. However,
PROG_TYPE_TRACEPOINT programs may fail to load before verification even
starts, e.g., if the shell does not have the appropriate capabilities.
Use a BPF_PROG_TYPE_SOCKET_FILTER program for the feature gate instead.

Also fix two minor issues. First, ensure the log buffer for the test is
initialized: Failing program load before verification led to libbpf dumping
uninitialized data to stdout. Also, ensure that close() is only called
for program_fd in the probe if the program load actually succeeded. The
call was currently failing silently with -EBADF most of the time.

Fixes: 728ff167910e ("libbpf: Add gating for arena globals relocation feature")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20260217204345.548648-2-emil@etsalapatis.com

Merge tag 'dmaengine-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine

Pull dmaengine updates from Vinod Koul:
"Core:
   - Add Frank Li as susbstem reviewer to help with reviews

  New Support:
   - Mediatek support for Dimensity 6300 and 9200 controller
   - Qualcomm Kaanapali and Glymur GPI DMA engine
   - Synopsis DW AXI Agilex5
   - Renesas RZ/V2N SoC
   - Atmel microchip lan9691-dma
   - Tegra ADMA tegra264

  Updates:
   - sg_nents_for_dma() helper use in subsystem
   - pm_runtime_mark_last_busy() redundant call update for subsystem
   - Residue support for xilinx AXIDMA driver
   - Intel Max SGL Size Support and capabilities for DSA3.0
   - AXI dma larger than 32bits address support"

* tag 'dmaengine-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (64 commits)
  dmaengine: add Frank Li as reviewer
  dt-bindings: dma: qcom,gpi: Update max interrupts lines to 16
  dmaengine: fsl-edma: don't explicitly disable clocks in .remove()
  dmaengine: xilinx: xdma: use sg_nents_for_dma() helper
  dmaengine: sh: use sg_nents_for_dma() helper
  dmaengine: sa11x0: use sg_nents_for_dma() helper
  dmaengine: qcom: bam_dma: use sg_nents_for_dma() helper
  dmaengine: qcom: adm: use sg_nents_for_dma() helper
  dmaengine: pxa-dma: use sg_nents_for_dma() helper
  dmaengine: lgm: use sg_nents_for_dma() helper
  dmaengine: k3dma: use sg_nents_for_dma() helper
  dmaengine: dw-axi-dmac: use sg_nents_for_dma() helper
  dmaengine: bcm2835-dma: use sg_nents_for_dma() helper
  dmaengine: axi-dmac: use sg_nents_for_dma() helper
  dmaengine: altera-msgdma: use sg_nents_for_dma() helper
  scatterlist: introduce sg_nents_for_dma() helper
  dmaengine: idxd: Add Max SGL Size Support for DSA3.0
  dmaengine: idxd: Expose DSA3.0 capabilities through sysfs
  dmaengine: sh: rz-dmac: Make channel irq local
  dmaengine: pl08x: Fix comment stating the difference between PL080 and PL081
  ...

Merge tag 'phy-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy

Pull phy updates from Vinod Koul:
"Core:

   - Add suuport for "rx-polarity" and "tx-polarity" device tree
     properties and phy common properties to manage this

  New Support:

   - Qualcomm Glymur PCIe Gen4 2-lanes PCIe phy, DP and edp phy, USB UNI
     PHY and SMB2370 eUSB2 repeater. SC8280xp QMP UFS PHY, Kaanapali
     PCIe phy and QMP PHY, QCS615 QMP USB3+DP PHY and driver support for
     that.

   - SpacemiT PCIe/combo PHY and K1 USB2 PHY driver.

   - HDMI 2.1 FRL configuration support and driver enabling for rockchip
     samsung-hdptx driver

   - TI TCAN1046 phy

   - Renesas RZ/V2H(P) and RZ/V2N usb3

   - Mediatek MT8188 hdmi-phy

   - Google Tensor SoC USB PHY driver

   - Apple Type-C PHY

  Updates:

   - Subsystem conversion for clock round_rate() to determine_rate()

   - TI USB3 DT schema conversion

   - Samsung ExynosAutov920 usb3, combo hsphy and ssphy support"

* tag 'phy-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: (143 commits)
  phy: ti: phy-j721e-wiz: convert from divider_round_rate() to divider_determine_rate()
  dt-bindings: phy: ti,control-phy-otghs: convert to DT schema
  dt-bindings: phy: ti,phy-usb3: convert to DT schema
  phy: tegra: xusb: Remove unused powered_on variable
  phy: renesas: rcar-gen3-usb2: add regulator dependency
  phy: GOOGLE_USB: add TYPEC dependency
  phy: enter drivers/phy/Makefile even without CONFIG_GENERIC_PHY
  phy: renesas: rcar-gen3-usb2: Use mux-state for phyrst management
  phy: renesas: rcar-gen3-usb2: Add regulator for OTG VBUS control
  phy: renesas: rcar-gen3-usb2: Use devm_pm_runtime_enable()
  phy: renesas: rcar-gen3-usb2: Factor out VBUS control logic
  dt-bindings: phy: renesas,usb2-phy: Document RZ/G3E SoC
  dt-bindings: phy: renesas,usb2-phy: Document mux-states property
  dt-bindings: phy: renesas,usb2-phy: Document USB VBUS regulator
  phy: rockchip: samsung-hdptx: Add HDMI 2.1 FRL support
  phy: rockchip: samsung-hdptx: Extend rk_hdptx_phy_verify_hdmi_config() helper
  phy: rockchip: samsung-hdptx: Switch to driver specific HDMI config
  phy: rockchip: samsung-hdptx: Drop hw_rate driver data
  phy: rockchip: samsung-hdptx: Compute clk rate from PLL config
  phy: rockchip: samsung-hdptx: Cleanup *_cmn_init_seq lists
  ...

Merge tag 'soundwire-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire

Pull soundwire updates from Vinod Koul:

- support for Qualcomm v2.2.0 controllers

- bus method updates for .probe(), .remove() and .shutdown()
   and remove function return value updates

- Avell B.ON dmi-quirks mapping

- mark cs42l45 codec as wake capable

* tag 'soundwire-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
  soundwire: intel_ace2x: add SND_HDA_CORE dependency
  dt-bindings: soundwire: qcom: Add SoundWire v2.2.0 compatible
  soundwire: Use bus methods for .probe(), .remove() and .shutdown()
  soundwire: Make remove function return no value
  soundwire: dmi-quirks: add mapping for Avell B.ON (OEM rebranded of NUC15)
  soundwire: qcom: Use guard to avoid mixing cleanup and goto
  soundwire: intel_auxdevice: add cs42l45 codec to wake_capable_list

Merge tag 'spdx-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx

Pull SPDX updates from Greg KH:
"Here are two small changes that add some missing SPDX license lines to
  some core kernel files. These are:

   - adding SPDX license lines to kdb files

   - adding SPDX license lines to the remaining kernel/ files

  Both of these have been in linux-next for a while with no reported
  issues"

* tag 'spdx-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx:
  kernel: debug: Add SPDX license ids to kdb files
  kernel: add SPDX-License-Identifier lines

Merge tag 'usb-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB / Thunderbolt updates from Greg KH:
"Here is the "big" set of USB and Thunderbolt driver updates for
  7.0-rc1. Overall more lines were removed than added, thanks to
  dropping the obsolete isp1362 USB host controller driver, always a
  nice change.

  Other than that, nothing major happening here, highlights are:

   - lots of dwc3 driver updates and new hardware support added

   - usb gadget function driver updates

   - usb phy driver updates

   - typec driver updates and additions

   - USB rust binding updates for syntax and formatting changes

   - more usb serial device ids added

   - other smaller USB core and driver updates and additions

  All of these have been in linux-next for a long time, with no reported
  problems"

* tag 'usb-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (77 commits)
  usb: typec: ucsi: Add Thunderbolt alternate mode support
  usb: typec: hd3ss3220: Check if regulator needs to be switched
  usb: phy: tegra: parametrize PORTSC1 register offset
  usb: phy: tegra: parametrize HSIC PTS value
  usb: phy: tegra: return error value from utmi_wait_register
  usb: phy: tegra: cosmetic fixes
  dt-bindings: usb: renesas,usbhs: Add RZ/G3E SoC support
  usb: dwc2: fix resume failure if dr_mode is host
  usb: cdns3: fix role switching during resume
  usb: dwc3: gadget: Move vbus draw to workqueue context
  USB: serial: option: add Telit FN920C04 RNDIS compositions
  usb: dwc3: Log dwc3 address in traces
  usb: gadget: tegra-xudc: Add handling for BLCG_COREPLL_PWRDN
  usb: phy: tegra: add HSIC support
  usb: phy: tegra: use phy type directly
  usb: typec: ucsi: Enforce mode selection for cros_ec_ucsi
  usb: typec: ucsi: Support mode selection to activate altmodes
  usb: typec: Introduce mode_selection bit
  usb: typec: Implement mode selection
  usb: typec: Expose alternate mode priority via sysfs
  ...

Merge tag 'tty-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty / serial driver updates from Greg KH:
"Here is the small amount of tty and serial driver updates for 7.0-rc1.
  Nothing major in here at all, just some driver updates and minor
  tweaks and cleanups including:

   - sh-sci serial driver updates

   - 8250 driver updates

   - attempt to make the tty ports have their own workqueue, but was
     reverted after testing found it to have problems on some platforms.

     This will probably come back for 7.1 after it has been reworked and
     resubmitted

   - other tiny tty driver changes

  All of these have been in linux-next for a while with no reported
  problems"

* tag 'tty-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (49 commits)
  Revert "tty: tty_port: add workqueue to flip TTY buffer"
  tty: tty_port: add workqueue to flip TTY buffer
  serial: 8250_pci: Remove custom deprecated baud setting routine
  serial: 8250_omap: Remove custom deprecated baud setting routine
  dt-bindings: serial: renesas,scif: Document RZ/G3L SoC
  serial: 8250: omap: set out-of-band wakeup if wakeup pinctrl exists
  tty: hvc-iucv: Remove KMSG_COMPONENT macro
  dt-bindings: serial: google,goldfish-tty: Convert to DT schema
  dt-bindings: serial: sh-sci: Fold single-entry compatibles into enum
  serial: 8250: 8250_omap.c: Clear DMA RX running status only after DMA termination is done
  serial: 8250: 8250_omap.c: Add support for handling UART error conditions
  serial: SH_SCI: improve "DMA support" prompt
  serial: Kconfig: fix ordering of entries for menu display
  serial: 8250: fix ordering of entries for menu display
  serial: imx: change SERIAL_IMX_CONSOLE to bool
  8250_men_mcb: drop unneeded MODULE_ALIAS
  serial: men_z135_uart: drop unneeded MODULE_ALIAS
  dt-bindings: serial: renesas,rsci: Document RZ/V2H(P) and RZ/V2N SoCs
  serial: rsci: Convert to FIELD_MODIFY()
  dt-bindings: serial: 8250: add SpacemiT K3 UART compatible
  ...

Merge tag 'staging-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging driver updates from Greg KH:
"Here is the big set of staging driver updates for 7.0-rc1. Well, not
  that big, just lots of tiny coding style cleanups primarily in one
  driver as everyone seems to have glomed onto it for some reason that
  escapes me (is there a tutorial out there somewhere pointing people at
  this?)

  Not much overall, the changes can be summarized as:

   - cleanups for the rtl8723bs driver, so many cleanups...

   - vme_user driver cleanups

   - sm750fb driver cleanups

   - tiny greybus driver cleanups

   - other really small staging driver cleanups

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'staging-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (119 commits)
  staging: rtl8723bs: refactor ODM_SetIQCbyRFpath to reduce duplication
  staging: rtl8723bs: rename CamelCase function Set_MSR to set_msr
  staging: rtl8723bs: remove unnecessary blank lines in rtw_io.c
  staging: rtl8723bs: remove stale TODO item regarding %pM
  staging: rtl8723bs: remove unused allocation wrapper functions
  staging: rtl8723bs: use standard skb allocation APIs
  staging: rtl8723bs: replace rtw_zmalloc() with kzalloc()
  staging: rtl8723bs: replace rtw_malloc() with kmalloc()
  staging: rtl8723bs: introduce kmemdup() where applicable
  staging: sm750fb: Clean up variable names
  staging: rtl8723bs: fix null dereference in find_network
  staging: rtl8723bs: use unaligned access macros in rtw_security.c
  staging: rtl8723bs: fix potential race in expire_timeout_chk
  staging: rtl8723bs: remove dead debugging code in rtw_mlme_ext.c
  staging: rtl8723bs: modernize hex output in rtw_report_sec_ie
  staging: rtl8723bs: fix spacing around operators
  staging: rtl8723bs: rename u1bTmp to val
  staging: rtl8723bs: remove unused private debug counters
  staging: rtl8723bs: remove thread wraper functions and add IS_ERR() check
  staging: rtl8723bs: fix firmware memory leak on error
  ...

Merge tag 'char-misc-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc/IIO driver updates from Greg KH:
"Here is the big set of char/misc/iio and other smaller driver
  subsystem changes for 7.0-rc1. Lots of little things in here,
  including:

   - Loads of iio driver changes and updates and additions

   - gpib driver updates

   - interconnect driver updates

   - i3c driver updates

   - hwtracing (coresight and intel) driver updates

   - deletion of the obsolete mwave driver

   - binder driver updates (rust and c versions)

   - mhi driver updates (causing a merge conflict, see below)

   - mei driver updates

   - fsi driver updates

   - eeprom driver updates

   - lots of other small char and misc driver updates and cleanups

  All of these have been in linux-next for a while, with no reported
  issues"

* tag 'char-misc-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (297 commits)
  mux: mmio: fix regmap leak on probe failure
  rust_binder: return p from rust_binder_transaction_target_node()
  drivers: android: binder: Update ARef imports from sync::aref
  rust_binder: fix needless borrow in context.rs
  iio: magn: mmc5633: Fix Kconfig for combination of I3C as module and driver builtin
  iio: sca3000: Fix a resource leak in sca3000_probe()
  iio: proximity: rfd77402: Add interrupt handling support
  iio: proximity: rfd77402: Document device private data structure
  iio: proximity: rfd77402: Use devm-managed mutex initialization
  iio: proximity: rfd77402: Use kernel helper for result polling
  iio: proximity: rfd77402: Align polling timeout with datasheet
  iio: cros_ec: Allow enabling/disabling calibration mode
  iio: frequency: ad9523: correct kernel-doc bad line warning
  iio: buffer: buffer_impl.h: fix kernel-doc warnings
  iio: gyro: itg3200: Fix unchecked return value in read_raw
  MAINTAINERS: add entry for ADE9000 driver
  iio: accel: sca3000: remove unused last_timestamp field
  iio: accel: adxl372: remove unused int2_bitmask field
  iio: adc: ad7766: Use iio_trigger_generic_data_rdy_poll()
  iio: magnetometer: Remove IRQF_ONESHOT
  ...

Merge tag 'block-7.0-20260216' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull more block updates from Jens Axboe:

- Fix partial IOVA mapping cleanup in error handling

- Minor prep series ignoring discard return value, as
   the inline value is always known

- Ensure BLK_FEAT_STABLE_WRITES is set for drbd

- Fix leak of folio in bio_iov_iter_bounce_read()

- Allow IOC_PR_READ_* for read-only open

- Another debugfs deadlock fix

- A few doc updates

* tag 'block-7.0-20260216' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  blk-mq: use NOIO context to prevent deadlock during debugfs creation
  blk-stat: convert struct blk_stat_callback to kernel-doc
  block: fix enum descriptions kernel-doc
  block: update docs for bio and bvec_iter
  block: change return type to void
  nvmet: ignore discard return value
  md: ignore discard return value
  block: fix partial IOVA mapping cleanup in blk_rq_dma_map_iova
  block: fix folio leak in bio_iov_iter_bounce_read()
  block: allow IOC_PR_READ_* ioctls with BLK_OPEN_READ
  drbd: always set BLK_FEAT_STABLE_WRITES

Merge tag 'io_uring-7.0-20260216' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull more io_uring updates from Jens Axboe:
"This is a mix of cleanups and fixes. No major fixes in here, just a
  bunch of little fixes. Some of them marked for stable as it fixes
  behavioral issues

   - Fix an issue with SOCKET_URING_OP_SETSOCKOPT for netlink sockets,
     due to a too restrictive check on it having an ioctl handler

   - Remove a redundant SQPOLL check in ring creation

   - Kill dead accounting for zero-copy send, which doesn't use ->buf
     or ->len post the initial setup

   - Fix missing clamp of the allocation hint, which could cause
     allocations to fall outside of the range the application asked
     for. Still within the allowed limits.

   - Fix for IORING_OP_PIPE's handling of direct descriptors

   - Tweak to the API for the newly added BPF filters, making them
     more future proof in terms of how applications deal with them

   - A few fixes for zcrx, fixing a few error handling conditions

   - Fix for zcrx request flag checking

   - Add support for querying the zcrx page size

   - Improve the NO_SQARRAY static branch inc/dec, avoiding busy
     conditions causing too much traffic

   - Various little cleanups"

* tag 'io_uring-7.0-20260216' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  io_uring/bpf_filter: pass in expected filter payload size
  io_uring/bpf_filter: move filter size and populate helper into struct
  io_uring/cancel: de-unionize file and user_data in struct io_cancel_data
  io_uring/rsrc: improve regbuf iov validation
  io_uring: remove unneeded io_send_zc accounting
  io_uring/cmd_net: fix too strict requirement on ioctl
  io_uring: delay sqarray static branch disablement
  io_uring/query: add query.h copyright notice
  io_uring/query: return support for custom rx page size
  io_uring/zcrx: check unsupported flags on import
  io_uring/zcrx: fix post open error handling
  io_uring/zcrx: fix sgtable leak on mapping failures
  io_uring: use the right type for creds iteration
  io_uring/openclose: fix io_pipe_fixed() slot tracking for specific slots
  io_uring/filetable: clamp alloc_hint to the configured alloc range
  io_uring/rsrc: replace reg buffer bit field with flags
  io_uring/zcrx: improve types for size calculation
  io_uring/tctx: avoid modifying loop variable in io_ring_add_registered_file
  io_uring: simplify IORING_SETUP_DEFER_TASKRUN && !SQPOLL check

cpuidle: menu: Remove single state handling

cpuidle systems where the governor has no choice because there's only
a single idle state are now handled by cpuidle core and bypass the
governor, so remove the related handling.

Signed-off-by: Christian Loehle <christian.loehle@arm.com>
[ rjw: Rebase on top of the cpuidle changes merged recently ]
Link: https://patch.msgid.link/20260216185005.1131593-5-aboorvad@linux.ibm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

cpuidle: teo: Remove single state handling

cpuidle systems where the governor has no choice because there's only
a single idle state are now handled by cpuidle core and bypass the
governor, so remove the related handling.

Signed-off-by: Christian Loehle <christian.loehle@arm.com>
Link: https://patch.msgid.link/20260216185005.1131593-4-aboorvad@linux.ibm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

cpuidle: haltpoll: Remove single state handling

cpuidle systems where the governor has no choice because there's only
a single idle state are now handled by cpuidle core and bypass the
governor, so remove the related handling.

Signed-off-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
[ rjw: Extended the change to drop a redundant local variable ]
Link: https://patch.msgid.link/20260216185005.1131593-3-aboorvad@linux.ibm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

cpuidle: Skip governor when only one idle state is available

On certain platforms (PowerNV systems without a power-mgt DT node),
cpuidle may register only a single idle state. In cases where that
single state is a polling state (state 0), the ladder governor may
incorrectly treat state 1 as the first usable state and pass an
out-of-bounds index. This can lead to a NULL enter callback being
invoked, ultimately resulting in a system crash.

[   13.342636] cpuidle-powernv : Only Snooze is available
[   13.351854] Faulting instruction address: 0x00000000
[   13.376489] NIP [0000000000000000] 0x0
[   13.378351] LR  [c000000001e01974] cpuidle_enter_state+0x2c4/0x668

Fix this by adding a bail-out in cpuidle_select() that returns state 0
directly when state_count <= 1, bypassing the governor and keeping the
tick running.

Fixes: dc2251bf98c6 ("cpuidle: Eliminate the CPUIDLE_DRIVER_STATE_START symbol")
Signed-off-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
Link: https://patch.msgid.link/20260216185005.1131593-2-aboorvad@linux.ibm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

netfilter: nf_tables: fix use-after-free in nf_tables_addchain()

nf_tables_addchain() publishes the chain to table->chains via
list_add_tail_rcu() (in nft_chain_add()) before registering hooks.
If nf_tables_register_hook() then fails, the error path calls
nft_chain_del() (list_del_rcu()) followed by nf_tables_chain_destroy()
with no RCU grace period in between.

This creates two use-after-free conditions:

1) Control-plane: nf_tables_dump_chains() traverses table->chains
    under rcu_read_lock(). A concurrent dump can still be walking
    the chain when the error path frees it.

2) Packet path: for NFPROTO_INET, nf_register_net_hook() briefly
    installs the IPv4 hook before IPv6 registration fails.  Packets
    entering nft_do_chain() via the transient IPv4 hook can still be
    dereferencing chain->blob_gen_X when the error path frees the
    chain.

Add synchronize_rcu() between nft_chain_del() and the chain destroy
so that all RCU readers -- both dump threads and in-flight packet
evaluation -- have finished before the chain is freed.

Fixes: 91c7b38dc9f0 ("netfilter: nf_tables: use new transaction infrastructure to handle chain")
Signed-off-by: Inseo An <y0un9sa@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>

net: remove WARN_ON_ONCE when accessing forward path array

Although unlikely, recent support for IPIP tunnels increases chances of
reaching this WARN_ON_ONCE if userspace manages to build a sufficiently
long forward path.

Remove it.

Fixes: ddb94eafab8b ("net: resolve forwarding path from virtual netdevice and HW destination address")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>

ipvs: do not keep dest_dst if dev is going down

There is race between the netdev notifier ip_vs_dst_event()
and the code that caches dst with dev that is going down.
As the FIB can be notified for the closed device after our
handler finishes, it is possible valid route to be returned
and cached resuling in a leaked dev reference until the dest
is not removed.

To prevent new dest_dst to be attached to dest just after the
handler dropped the old one, add a netif_running() check
to make sure the notifier handler is not currently running
for device that is closing.

Fixes: 7a4f0761fce3 ("IPVS: init and cleanup restructuring")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Florian Westphal <fw@strlen.de>

ipvs: skip ipv6 extension headers for csum checks

Protocol checksum validation fails for IPv6 if there are extension
headers before the protocol header. iph->len already contains its
offset, so use it to fix the problem.

Fixes: 2906f66a5682 ("ipvs: SCTP Trasport Loadbalancing Support")
Fixes: 0bbdd42b7efa ("IPVS: Extend protocol DNAT/SNAT and state handlers")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Florian Westphal <fw@strlen.de>

include: uapi: netfilter_bridge.h: Cover for musl libc

Musl defines its own struct ethhdr and thus defines __UAPI_DEF_ETHHDR to
zero. To avoid struct redefinition errors, user space is therefore
supposed to include netinet/if_ether.h before (or instead of)
linux/if_ether.h. To relieve them from this burden, include the libc
header here if not building for kernel space.

Reported-by: Alyssa Ross <hi@alyssa.is>
Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Florian Westphal <fw@strlen.de>