Jakub Kicinski [Mon, 8 Jun 2026 22:33:34 +0000 (15:33 -0700)]
Merge tag 'nf-next-26-06-07' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next
Pablo Neira Ayuso says:
====================
Netfilter/IPVS updates for net-next
The following patchset contains Netfilter/IPVS updates for net-next,
this contains updates to address sashiko reports in IPVS and Netfilter
on possible pre-existing issues. This also includes a series to add
refcount for ct helper and timeout to deal with a corner case scenario
with unconfirmed conntracks flying to nfqueue.
1) Add a conn_max sysctl to IPVS to limit the maximum number of
connections, from Julian Anastasov.
2) Use get_unaligned_be16() to access TCP MSS in nfnetlink_osf,
from Fernando Fernandez Mancera.
3) Use {READ,WRITE}_ONCE to access helper flags from nfnetlink_helper.
Several patches for the synproxy infrastructure, from Fernando
Fernandez Mancera:
4) Drop packet if TCP timestamp adjustment fails.
5) Continue parsing of TCP timestamp to deal with possible duplicates.
6) Use {get,put}_unaligned_be32() to access the TCP timestamp.
7) Hold ct->lock to initialize nf_ct_seqadj_init().
Updates for the ct timeout infrastructure, to deal with a corner case
for unconfirmed conntracks flying to nfqueue:
8) Add a refcount to track ct timeout policy use by ct extension, and
do not release the timeout until the last ct extension drops the
refcnt on it.
Similar update for the ct helper infrastructure:
9) Dynamic allocation of ct helpers, as a preparation for adding
refcount to track ct extension use.
10) Move destroy_sibling_or_exp() to nf_conntrack_proto_gre, so
pptp conntrack helper module removal does not make this code
unreachable via the helper->destroy callback. This is another
dependency for the new refcount coming in this series.
11) Add a refcount to track use of the helper from the ct extension, so
that the ct helper and timeout stay reachable for the connection until
it goes away.
12) Remove the genid infrastructure in ct extensions. The primary
goal was to detect that a ct extension such as ct timeout and
ct helper went stale for unconfirmed conntrack, either because
object or module was removed. This deactivates all ct extensions
though for this unconfirmed conntrack.
13) Call nf_ct_gre_keymap_destroy() if this is a master conntrack
with a pptp helper only.
sashiko.dev reports one more relevant issue when unsetting the helper
via ctnetlink that I will address in a follow up patch.
Then, two more assorted updates:
14) Avoid an unlikely underflow in bridge VLAN untag, only possible
if bridge VLAN filtering is buggy, remove WARN_ON_ONCE
while at it. From David Carlier.
15) Use get_unaligned_be32() in nf_conntrack_tcp to access sack
extension, from Rosen Penev.
* tag 'nf-next-26-06-07' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
netfilter: nf_conntrack: use get_unaligned_be32() in tcp_sack()
netfilter: flowtable: avoid num_encaps underflow on bridge VLAN untag
netfilter: conntrack: call nf_ct_gre_keymap_destroy() if master helper is pptp
netfilter: conntrack: revert ct extension genid infrastructure
netfilter: nf_conntrack_helper: add refcounting from datapath
netfilter: nf_conntrack_pptp: move GRE specific cleanup to GRE tracker
netfilter: nf_conntrack_helper: dynamically allocate struct nf_conntrack_helper
netfilter: cttimeout: detach dataplane timeout policy and repurpose refcount
netfilter: synproxy: protect nf_ct_seqadj_init() with conntrack lock
netfilter: synproxy: fix unaligned memory access in timestamp adjustment
netfilter: synproxy: adjust duplicate timestamp options
netfilter: synproxy: drop packets if timestamp adjustment fails
netfilter: nfnetlink_cthelper: use {READ,WRITE}_ONCE for accessing helper flags
netfilter: nfnetlink_osf: fix mss parsing on big-endian architectures
ipvs: add conn_max sysctl to limit connections
====================
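Items 2, 6 and 15 above switch to the get_unaligned_be16() and get_unaligned_be32() helpers. A minimal userspace sketch of the idea follows. The names read_be16(), read_be32() and parse_mss() are hypothetical stand-ins for illustration, not the kernel implementations. The value is assembled byte by byte, so the result depends neither on pointer alignment nor on host endianness.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for get_unaligned_be16(). */
static uint16_t read_be16(const uint8_t *p)
{
	assert(p);
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* Hypothetical userspace stand-in for get_unaligned_be32(). */
static uint32_t read_be32(const uint8_t *p)
{
	assert(p);
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* opt points at a TCP MSS option: kind (2), length (4), 16-bit value. */
static uint16_t parse_mss(const uint8_t *opt)
{
	return read_be16(opt + 2);
}
```

Casting the option pointer to a 16-bit or 32-bit type instead would assume alignment that TCP options do not guarantee, since a preceding NOP can place the value at an odd offset.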
Shashank Balaji [Mon, 18 May 2026 10:20:00 +0000 (19:20 +0900)]
driver core: platform: set mod_name in driver registration
Pass KBUILD_MODNAME through the driver registration macro so that the
driver core can create the module symlink in sysfs for built-in drivers,
and fixup all callers.
The Rust platform adapter is updated to pass the module name through to
the new parameter.
Tested on qemu with:
- x86 defconfig + CONFIG_RUST
- arm64 defconfig + CONFIG_RUST + CONFIG_CORESIGHT stuff
Shashank Balaji [Mon, 18 May 2026 10:19:59 +0000 (19:19 +0900)]
coresight: pass THIS_MODULE implicitly through a macro
Rename coresight_init_driver() to coresight_init_driver_with_owner() and
replace it with a macro wrapper that passes THIS_MODULE implicitly. This
is in line with what other buses do.
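The rename-plus-macro pattern described above can be sketched in userspace as follows. All demo_* names and the THIS_MODULE stand-in are assumptions made for this illustration; the real coresight signatures take more arguments.

```c
#include <assert.h>
#include <stddef.h>

struct module { const char *name; };

/* Stand-in for the kernel's THIS_MODULE, hypothetical for this sketch. */
static struct module this_module = { "demo" };
#define THIS_MODULE (&this_module)

struct demo_driver { struct module *owner; };

/* The renamed function takes the owner explicitly. */
static int demo_init_driver_with_owner(struct demo_driver *drv,
				       struct module *owner)
{
	assert(drv);
	drv->owner = owner;
	return 0;
}

/* The macro keeps the old name and supplies the caller's module. */
#define demo_init_driver(drv) \
	demo_init_driver_with_owner((drv), THIS_MODULE)
```

Because the macro expands in the caller's translation unit, THIS_MODULE resolves to the calling module rather than to the module that defines the helper.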
Shashank Balaji [Mon, 1 Jun 2026 10:19:41 +0000 (19:19 +0900)]
kernel: param: initialize module_kset in a pure_initcall
Commit "driver core: platform: set mod_name in driver registration" will
set struct device_driver's mod_name member for platform driver
registration. For a driver to be registered with its mod_name set,
module_kset needs to be initialized, which currently happens in a
subsys_initcall in param_sysfs_init(). The tegra cbb drivers register
themselves before module_kset init, in a core_initcall. This works
currently because lookup_or_create_module_kobject(), which dereferences
module_kset via kset_find_obj(), is not called if mod_name is not set,
which is the case now.
So in preparation for the commit "driver core: platform: set mod_name in
driver registration", move module_kset init to pure_initcall level,
ensuring it happens before tegra cbb driver registration.
Shashank Balaji [Mon, 18 May 2026 10:19:57 +0000 (19:19 +0900)]
soc/tegra: cbb: Move driver registration from pure_initcall to core_initcall
Commit "driver core: platform: set mod_name in driver registration" will
set struct device_driver's mod_name member for platform driver
registration. For a driver to be registered with its mod_name set,
module_kset needs to be initialized, which currently happens in a
subsys_initcall in param_sysfs_init(). The tegra cbb drivers register
themselves before module_kset init, in a pure_initcall. This works
currently because lookup_or_create_module_kobject(), which dereferences
module_kset via kset_find_obj(), is not called if mod_name is not set,
which is the case now.
So in preparation for the commit "driver core: platform: set mod_name in
driver registration", move tegra cbb driver registration to
core_initcall level, and commit "kernel: param: initialize module_kset
in a pure_initcall" will move module_kset init to pure_initcall level,
ensuring module_kset init happens before tegra cbb driver registration.
Sneh Mankad [Fri, 29 May 2026 12:55:45 +0000 (18:25 +0530)]
pinctrl: qcom: Fix resolving register base address from device node
Commit 56ffb63749f4 ("pinctrl: qcom: add multi TLMM region option parameter")
added reg-names property based register reading. However, multiple platforms
do not use reg-names because they have only a single TLMM register region.
The commit tried to handle this with the default_region module parameter,
but that check is unreachable: the error return precedes it, testing only
whether the reg-names property exists. This makes it impossible to use
tlmm-test on SoCs (x1e80100) that don't have a reg-names property in the
TLMM device node.
Fix this by moving the default_region check to the start of
tlmm_reg_base().
Sneh Mankad [Fri, 29 May 2026 12:55:44 +0000 (18:25 +0530)]
pinctrl: qcom: Modify MSM_PULL_MASK to accurately represent PULL bits
MSM_PULL_MASK currently spans bits [2:0], but the GPIO_PULL field in the
GPIO_CFG register only occupies bits [1:0]. Bit 2 belongs to
FUNC_SEL.
MSM_PULL_MASK is used to isolate the GPIO_PULL bits before writing the
pull configuration (PULL_DOWN: 0x1, PULL_UP: 0x3) to the GPIO_CFG
register. Narrow it to bits [1:0] to prevent unintended modification of
the FUNC_SEL field.
This causes no functional change since the driver currently does not
modify the FUNC_SEL bit, but align the mask with hardware configuration
nonetheless.
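A small sketch of why the mask width matters for a read-modify-write. The macro and function names are hypothetical, and the starting register value with bit 2 set is an assumed example, not a value taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

#define PULL_MASK_OLD 0x7u /* bits [2:0], overlaps FUNC_SEL bit 2 */
#define PULL_MASK_NEW 0x3u /* bits [1:0], GPIO_PULL only */

/* Clear the masked field, then write the requested pull value. */
static uint32_t set_pull(uint32_t cfg, uint32_t mask, uint32_t pull)
{
	assert((pull & ~mask) == 0);
	cfg &= ~mask;
	cfg |= pull;
	return cfg;
}
```

With an assumed cfg of 0x4 (bit 2 set) and PULL_UP (0x3), the old mask yields 0x3 and loses bit 2, while the narrowed mask yields 0x7 and preserves it.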
Akhil R [Mon, 18 May 2026 11:40:11 +0000 (17:10 +0530)]
i2c: tegra: Disable fair arbitration for non-MCTP buses
Recent Tegra I2C controllers have a fairness arbitration register, which
allows configuring the fair idle time required to support MCTP protocol
over I2C. It is enabled by default, adding a per-transfer latency overhead
that impacts non-MCTP I2C buses.
Disable the fairness arbitration register during controller init for buses
that are not MCTP controllers.
Akhil R [Mon, 18 May 2026 11:40:10 +0000 (17:10 +0530)]
i2c: tegra: use dmaengine_get_dma_device() for DMA buffer allocation
Use dmaengine_get_dma_device() to obtain the correct struct device
pointer for dma_alloc_coherent() instead of directly dereferencing
chan->device->dev.
The dmaengine_get_dma_device() helper checks whether the DMA channel
has a per-channel DMA device (chan->dev->chan_dma_dev) and returns it
when available, falling back to the controller device otherwise. On
platforms where the DMA controller sits behind an IOMMU with
per-channel IOVA spaces (e.g. Tegra264 GPC DMA), the per-channel
device carries the correct DMA mapping context. Using the controller
device directly would allocate DMA buffers against the wrong IOMMU
domain, leading to SMMU faults at runtime.
On platforms without per-channel DMA devices the helper returns the
same pointer as before, so there is no change in behavior for existing
hardware.
Akhil R [Mon, 18 May 2026 11:40:13 +0000 (17:10 +0530)]
i2c: tegra: Fix NOIRQ suspend/resume
The Tegra I2C driver relies on runtime PM to wake up the controller before
each transfer. However, runtime PM is disabled between the system suspend
and NOIRQ suspend. If an I2C device initiates a transfer during this
window, the I2C controller fails to wake up and the transfer fails. To
handle this, the controller must be kept available for this period to
allow transfers.
Rework the I2C controller's system PM callbacks such that the controller
is resumed from runtime suspend during system suspend and it stays
RPM_ACTIVE throughout the suspend-resume cycle until it is runtime
suspended back in the system resume. The clocks are disabled in NOIRQ
suspend and enabled back in NOIRQ resume by calling the controller's
runtime PM functions directly.
Fixes: 8ebf15e9c869 ("i2c: tegra: Move suspend handling to NOIRQ phase")
Assisted-by: Cursor:claude-4.6-opus
Signed-off-by: Akhil R <akhilrajeev@nvidia.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20260518114013.62065-5-akhilrajeev@nvidia.com
Akhil R [Mon, 18 May 2026 11:40:12 +0000 (17:10 +0530)]
i2c: tegra: Update Tegra410 I2C timing parameters
Update Tegra410 I2C timing parameters based on hardware characterization
results. This adjusts the fast mode and HS mode settings to be compliant
with the I2C specification.
Yuho Choi [Mon, 1 Jun 2026 19:20:05 +0000 (15:20 -0400)]
watchdog: unregister PM notifier on watchdog unregister
watchdog_register_device() registers wdd->pm_nb when
WDOG_NO_PING_ON_SUSPEND is set, but watchdog_unregister_device() does not
remove it. This leaves an embedded notifier block on the PM notifier chain
after the watchdog device has been unregistered.
A later suspend/resume notification can then call watchdog_pm_notifier()
with a stale watchdog_device pointer, or at minimum after wdd->wd_data has
been cleared by watchdog_dev_unregister().
Unregister the PM notifier before tearing down the watchdog device.
Al Viro [Tue, 12 May 2026 16:18:21 +0000 (12:18 -0400)]
configfs: mark pinned dentries persistent
on the removal side we can (finally) get rid of __simple_unlink()
and __simple_rmdir() kludges now that dentries in question are
properly marked persistent - simple_unlink() and simple_rmdir()
will do the right thing for those.
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 12 May 2026 07:19:41 +0000 (03:19 -0400)]
configfs: dentry refcount needs to be pinned only once
currently we have a weird situation where
* symlinks and roots of subtrees created by mkdir are pinned once
* subdirectories of subtrees created by mkdir are pinned twice
* roots of subtrees created by register_{group,subsystem} are pinned
twice.
It makes things harder to follow for no good reason. The goal is to
encapsulate the unbalanced dget/dput into d_{make,discard}_persistent()
and, preferably, allow a use of simple_recursive_removal() or analogue
thereof. So let's regularize that and pin things only once.
create_default_group() and configfs_register_subsystem() don't need to
keep their reference around on success - configfs_create_dir() has pinned
the sucker already. So we can drop the reference passed to
configfs_create_dir() (via configfs_attach_group(), etc.) both on success
and on failure. On the removal side we no longer have the double references,
so we need an explicit dget() to compensate.
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 12 May 2026 07:13:37 +0000 (03:13 -0400)]
switch configfs_detach_{group,item}() to passing dentry
... and there's no need to grab/drop it, or check for NULL - none
of the callers would even get there with NULL dentry and all of
them have the sucker pinned
Note that if sd is a directory configfs_dirent, we have sd->s_element
pointing to config_item with item->ci_dentry equal to sd->s_dentry.
Which is the only reason why detach_groups() gets away with using
the latter for locking the inode and the former for removal.
Aren't redundant data structures wonderful, for obfuscation if nothing
else?
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 12 May 2026 06:26:41 +0000 (02:26 -0400)]
configfs_remove_dir(), detach_attrs(): switch to passing dentry
... and deal with grabbing/dropping it in the sole caller.
After that configfs_remove_dir() becomes an unconditional call of remove_dir(),
so we can fold them together.
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 12 May 2026 06:18:38 +0000 (02:18 -0400)]
populate_attrs(): move cleanup to the sole caller
... where it folds with configfs_remove_dir() into a call of
configfs_detach_item(). Note that at the early failure exit
(before we'd added any children) we were not calling detach_attrs()
only because there it would've been a no-op - nothing added,
nothing there to be removed.
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 12 May 2026 05:25:48 +0000 (01:25 -0400)]
configfs_do_depend_item(): pass configfs_dirent instead of dentry
Again, the only thing it uses the argument for is its ->d_fsdata
and callers already have that - as a matter of fact, they are
passing ->s_dentry of that configfs_dirent, so that the function
could get it back as ->d_fsdata of that. With nothing else in
dentry even looked at...
configfs_dirent in question is a directory one - in this case those
are subdirectories of root (aka roots of "subsystem" trees).
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 12 May 2026 05:23:29 +0000 (01:23 -0400)]
configfs_depend_prep(): pass configfs_dirent instead of dentry
Again, the only thing it uses dentry for is dentry->d_fsdata; for the
recursive call the situation is the same as with configfs_detach_prep()
and the same observation about ->s_dentry->d_fsdata applies.
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 12 May 2026 05:17:13 +0000 (01:17 -0400)]
configfs_detach_prep(): pass configfs_dirent instead of dentry
The only thing it uses the argument for is its ->d_fsdata and
all callers have that already available.
Note that in the recursive call we are dealing with a (sub)directory
configfs_dirent, and for those ->s_dentry->d_fsdata points back
to configfs_dirent itself.
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sat, 30 May 2026 07:48:34 +0000 (03:48 -0400)]
configfs: fix lockless traversals of ->s_children
Having the parent directory locked protects entries from removal
by another thread, but it does *not* protect cursors from being
moved around by lseek() - or freed, for that matter.
Fixes: 6f6107640625 ("configfs: Introduce configfs_dirent_lock")
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Wentao Liang [Sun, 7 Jun 2026 09:03:03 +0000 (09:03 +0000)]
drm/virtio: fix dma_fence refcount leak on error in virtio_gpu_dma_fence_wait()
dma_fence_unwrap_for_each() internally calls dma_fence_unwrap_first()
which does cursor->chain = dma_fence_get(head), taking an extra
reference. On normal loop completion, dma_fence_unwrap_next()
releases this via dma_fence_chain_walk() -> dma_fence_put().
When virtio_gpu_do_fence_wait() fails and the function returns early
from inside the loop, the cursor->chain reference is never released.
This is the only caller in the entire kernel that does an early return
inside dma_fence_unwrap_for_each.
Add dma_fence_put(itr.chain) before the early return.
Cc: stable@vger.kernel.org
Fixes: eba57fb5498f ("drm/virtio: Wait for each dma-fence of in-fence array individually")
Signed-off-by: Wentao Liang <vulab@iscas.ac.cn>
Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Link: https://patch.msgid.link/20260607090303.92423-1-vulab@iscas.ac.cn
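The leak pattern in the drm/virtio fix can be sketched with a toy reference count. Everything here (demo_obj, demo_get(), demo_put(), demo_wait_all()) is hypothetical and only models an iterator that holds a reference for the duration of the walk; it is not the dma_fence API.

```c
#include <assert.h>
#include <stddef.h>

struct demo_obj { int refs; };

static struct demo_obj *demo_get(struct demo_obj *o)
{
	o->refs++;
	return o;
}

static void demo_put(struct demo_obj *o)
{
	o->refs--;
}

/* Walks n items; the cursor holds a reference on head until the walk ends. */
static int demo_wait_all(struct demo_obj *head, const int *status, size_t n)
{
	struct demo_obj *cursor;
	size_t i;

	assert(head);
	cursor = demo_get(head);
	for (i = 0; i < n; i++) {
		if (status[i] < 0) {
			/* Early exit must drop the cursor reference too. */
			demo_put(cursor);
			return status[i];
		}
	}
	demo_put(cursor); /* normal completion releases it */
	return 0;
}
```

Dropping the demo_put() in the error branch would leave refs one higher after every failed wait, which is the shape of the leak described above.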
i2c: qcom-cci: Fix NULL pointer dereference in cci_remove()
On all modern platforms the Qualcomm CCI controller provides two I2C masters,
but on particular boards only one I2C master may be initialized. In
such cases device unbinding or driver removal causes a NULL pointer
dereference, because cci_halt() is called for both I2C masters, but
a completion is initialized only for the single enabled master:
RDMA/rtrs-srv: Fix integer underflow in process_read and process_write
usr_len is read from a network-supplied message field (le16_to_cpu)
and used to compute data_len = off - usr_len without validating that
usr_len <= off. A malicious RDMA client can send usr_len > off causing
an integer underflow, resulting in data_len wrapping to a huge size_t
value which is then passed to the rdma_ev callback as a memory length,
leading to out-of-bounds memory access.
Fix by reading and validating usr_len <= off before rtrs_srv_get_ops_ids()
in both process_read() and process_write(), ensuring the early return
path acquires no reference and has no resource leak.
Link: https://patch.msgid.link/r/20260608134802.5019-1-aurelien@hackers.camp
Reported-by: Aurelien DESBRIERES <aurelien@hackers.camp>
Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Aurelien DESBRIERES <aurelien@hackers.camp>
Assisted-by: Claude <claude-sonnet-4-6>
Acked-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
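The validation described in the rtrs-srv fix can be sketched as below. compute_data_len() is a hypothetical helper for illustration; the real code does this inline in process_read() and process_write().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 0 and sets *data_len on success, -1 if usr_len exceeds off. */
static int compute_data_len(size_t off, uint16_t usr_len, size_t *data_len)
{
	assert(data_len);
	if (usr_len > off)
		return -1; /* reject before any subtraction happens */
	*data_len = off - usr_len;
	return 0;
}
```

Without the check, off = 10 and usr_len = 11 would make the unsigned subtraction wrap around to SIZE_MAX instead of producing a negative number.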
Dmitry Vyukov [Fri, 29 May 2026 15:09:06 +0000 (15:09 +0000)]
firmware_loader: Fix recursive lock in device_cache_fw_images()
A recursive locking deadlock can occur in the firmware loader's power
management notification handler.
During system suspend or hibernation preparation, fw_pm_notify() calls
device_cache_fw_images(). This function acquires fw_lock to set the
firmware cache state to FW_LOADER_START_CACHE and then iterates over all
devices using dpm_for_each_dev() while still holding the lock.
For each device, dev_cache_fw_image() schedules asynchronous work to cache
the firmware. If memory allocation for the async work entry fails (e.g., in
out-of-memory conditions), async_schedule_node_domain() falls back to
executing the work function synchronously in the current thread.
The synchronous execution path (__async_dev_cache_fw_image() ->
cache_firmware() -> request_firmware() -> assign_fw()) attempts to acquire
fw_lock again. Since the current thread already holds fw_lock, this results
in a recursive locking deadlock.
Fix this by releasing fw_lock immediately after updating the cache state
and before calling dpm_for_each_dev(). The lock is only needed to protect
the state update. Concurrent firmware requests will correctly see the
FW_LOADER_START_CACHE state and use the piggyback mechanism, which is
independently protected by its own fwc->name_lock.
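A toy model of the locking change, assuming a non-recursive lock represented by a flag. The demo_* names and the drop_lock_first switch are hypothetical; a real mutex would deadlock where this sketch returns an error.

```c
#include <assert.h>
#include <stdbool.h>

static bool lock_held;
static int cached;

/* Non-recursive lock model: a second acquisition fails. */
static int demo_lock(void)
{
	if (lock_held)
		return -1;
	lock_held = true;
	return 0;
}

static void demo_unlock(void)
{
	lock_held = false;
}

/* Work item that may end up running synchronously in the caller. */
static int cache_one(void)
{
	if (demo_lock())
		return -1; /* a real mutex would deadlock here */
	cached++;
	demo_unlock();
	return 0;
}

/* Returns the number of devices whose work item could not take the lock. */
static int cache_all(int ndev, bool drop_lock_first)
{
	int i, failed = 0;

	assert(ndev >= 0);
	demo_lock();
	/* The state update protected by the lock would go here. */
	if (drop_lock_first)
		demo_unlock();
	for (i = 0; i < ndev; i++)
		if (cache_one())
			failed++;
	if (!drop_lock_first)
		demo_unlock();
	return failed;
}
```

Calling cache_all(3, false) models the old ordering, where every synchronous work item hits the held lock; cache_all(3, true) models the fix.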
Aaron Ma [Thu, 28 May 2026 08:21:10 +0000 (16:21 +0800)]
ASoC: amd: acp-sdw-sof: Bound DAI link iteration
create_sdw_dailinks() walks sof_dais until it finds an entry with
initialised cleared, but sof_dais is allocated with exactly num_ends
entries. If all entries are initialised, the loop reads past the end of
the array.
Pass the allocated entry count to create_sdw_dailinks() and stop before
reading past the array.
Aaron Ma [Thu, 28 May 2026 08:21:09 +0000 (16:21 +0800)]
ASoC: amd: acp-sdw-legacy: Bound DAI link iteration
create_sdw_dailinks() walks soc_dais until it finds an entry with
initialised cleared, but soc_dais is allocated with exactly num_ends
entries. If all entries are initialised, the loop reads past the end of
the array.
This was reported by KASAN:
BUG: KASAN: slab-out-of-bounds in mc_probe+0x26b3/0x2774 [snd_acp_sdw_legacy_mach]
Read of size 1
Pass the allocated entry count to create_sdw_dailinks() and stop before
reading past the array.
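The bounded walk used in both ASoC fixes above can be sketched like this. struct demo_dai and count_dais() are hypothetical names; only the loop condition is the point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_dai { bool initialised; };

/* Count leading initialised entries without reading past num entries. */
static size_t count_dais(const struct demo_dai *dais, size_t num)
{
	size_t i;

	assert(dais || num == 0);
	for (i = 0; i < num && dais[i].initialised; i++)
		;
	return i;
}
```

The index check comes first in the condition, so dais[num] is never read even when every entry is initialised.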
The driver uses pcim_enable_device(), so IRQ vectors are automatically
freed by devres on driver detach. The explicit pci_free_irq_vectors()
calls in the probe error path and remove function are redundant.
Felix Gu [Fri, 29 May 2026 15:31:06 +0000 (23:31 +0800)]
spi: ep93xx: fix double-free of zeropage on DMA setup failure
If DMA setup fails after allocating the zeropage, the error path frees
the page but leaves espi->zeropage dangling. A subsequent call to
ep93xx_spi_release_dma() sees the non-NULL pointer and frees the page
again.
Clear the pointer after freeing in the error path of
ep93xx_spi_setup_dma().
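A userspace sketch of the double-free fix, using malloc()/free() in place of page allocation. struct demo_spi, demo_setup_dma() and demo_release_dma() are hypothetical, and the fail parameter simply simulates a DMA setup error.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_spi { void *zeropage; };

/* Release path: frees whatever the pointer still refers to. */
static void demo_release_dma(struct demo_spi *s)
{
	assert(s);
	free(s->zeropage); /* free(NULL) is a no-op */
	s->zeropage = NULL;
}

static int demo_setup_dma(struct demo_spi *s, int fail)
{
	assert(s);
	s->zeropage = malloc(64);
	if (!s->zeropage)
		return -1;
	if (fail) {
		free(s->zeropage);
		/* Without this, the release path would free it again. */
		s->zeropage = NULL;
		return -1;
	}
	return 0;
}
```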
Tuo Li [Thu, 28 May 2026 06:41:06 +0000 (14:41 +0800)]
ASoC: mediatek: mt8365-afe-pcm: fix possible NULL-pointer dereferences in mt8365_afe_suspend()
mt8365_afe_suspend() allocates the register backup buffer with
devm_kcalloc(), but does not check for allocation failure before using the
returned pointer. This may lead to a NULL pointer dereference when
accessing afe->reg_back_up[i].
Add the missing NULL check and return -ENOMEM on allocation failure after
disabling the main clock.
Also propagate the return value of mt8365_afe_suspend() in
mt8365_afe_dev_runtime_suspend() so that the suspended state is not updated
when suspend fails.
stm32f7_i2c_compute_timing() uses i2c_dev->analog_filter to pick
the analog filter delay, but i2c_dev->analog_filter is parsed from
the "i2c-analog-filter" DT property only after the compute_timing
loop in stm32f7_i2c_setup_timing(), so in practice the timing
calculations always ignore the analog filter. On an STM32MP1 board
with clock-frequency = <400000> and i2c-analog-filter set, measured
SCL frequency was ~382 kHz.
This also affects (widens) the computed SDADEL range. At high bus
clock speeds, this can select an SDADEL value that violates tVD;DAT
(data valid time).
Fix by parsing "i2c-analog-filter" before the compute_timing loop.
SuperH ecovec24/7724se are the last users of Simple Audio Card in the
"platform data" style. These days it mainly supports the "DT style".
The Simple Audio Card "platform data" style has not worked correctly
for almost 10 years, but no one has reported it.
Let's remove sound support from SuperH ecovec24/7724se, and remove the
Simple Audio Card platform data style.
Simple-Card was first created for the "platform data" style and later
expanded to the "DT style". The current Simple-Card "platform data" style
has presumably not worked for almost 10 years, but no one has reported it.
No one is using the "platform data" style. Let's remove its support.
sh: 7724se: remove FSI/AK4642/Simple-Audio-Card support
7724se uses Simple-Audio-Card in the "platform data" style
(Simple-Audio-Card mainly supports the "DT style" today), but the
"platform data" style has not worked correctly for almost 10 years.
7724se sound does not work these days, and there has been no
report about it. Let's remove sound support.
sh: ecovec24: remove FSI/DA7210/Simple-Audio-Card support
Ecovec24 uses Simple-Audio-Card in the "platform data" style
(Simple-Audio-Card mainly supports the "DT style" today), but the
"platform data" style has not worked correctly for almost 10 years.
In addition, the DA7210 used in Ecovec24 was a prototype version that
differs from the production version, and the driver does not account for it.
Ecovec24 sound does not work these days, and there has been no
report about it. Let's remove sound support.
Mark Brown [Mon, 8 Jun 2026 17:53:24 +0000 (18:53 +0100)]
ASoC: imx-rpmsg: Add headphone jack detection and driver_name support
Chancel Liu <chancel.liu@nxp.com> says:
This series adds two features to the i.MX RPMSG ASoC card:
1. Headphone jack detection via GPIO: Introduce the "hp-det-gpios"
device tree property and use simple_util_init_jack() to
register a headphone jack with GPIO-based insertion detection.
2. driver_name assignment: Set driver_name on the snd_soc_card to
"imx-audio-rpmsg", enabling userspace tools such as UCM to reliably
identify the card by driver name regardless of the board-specific
card name.
Chancel Liu [Thu, 28 May 2026 02:07:25 +0000 (11:07 +0900)]
ASoC: imx-rpmsg: Set driver_name for snd_soc_card
Set driver_name to "imx-audio-rpmsg" for the i.MX RPMSG sound card.
This allows userspace audio configuration tools (e.g., UCM) to match
the card by driver name independently of the card name, which may vary
across board configurations.
Chancel Liu [Thu, 28 May 2026 02:07:24 +0000 (11:07 +0900)]
ASoC: imx-rpmsg: Support headphone jack detection
Add headphone jack detection support for i.MX RPMSG audio cards.
When the "hp-det-gpios" property is present in the device tree node,
use simple_util_init_jack() from the ASoC simple card utilities to
register a headphone jack with GPIO-based insertion detection.
Sound cards using the i.MX RPMSG audio interface may connect a
headphone jack with GPIO-based insertion detection. Add the
"hp-det-gpios" property to the fsl,rpmsg binding to support this
configuration.
ASoC: wm_adsp: Fix NULL dereference when removing firmware controls
In wm_adsp_control_remove() check that the priv pointer is not NULL
before attempting to cleanup what it points to.
When cs_dsp creates a control it calls wm_adsp_control_add_cb() so that
wm_adsp can create its own private control data. There are two cases
where private data is not created:
1. The control is a SYSTEM control, so an ALSA control is not created.
2. The codec driver has registered a control_add() callback that
hides the control, so wm_adsp_control_add() is not called.
When cs_dsp_remove destroys its control list it calls
wm_adsp_control_remove() for each control. But wm_adsp_control_remove()
was attempting to cleanup the private data pointed to by cs_ctl->priv
without checking the pointer for NULL.
Carlos Song [Thu, 21 May 2026 06:50:38 +0000 (14:50 +0800)]
i2c: imx: fix clock and pinctrl state inconsistency in runtime PM
In i2c_imx_runtime_suspend(), the clock is disabled before switching
the pinctrl state to sleep. If pinctrl_pm_select_sleep_state() fails,
the runtime suspend is aborted but the clock remains disabled, causing
a system crash when the hardware is subsequently accessed.
Fix this by switching the pinctrl state before disabling the clock so
that a pinctrl failure leaves the clock enabled and the hardware
accessible.
In i2c_imx_runtime_resume(), restore the pinctrl state back to sleep
if clk_enable() fails, to keep the state consistent.
Fixes: 576eba03c994 ("i2c: imx: switch different pinctrl state in different system power status")
Signed-off-by: Carlos Song <carlos.song@nxp.com>
Cc: <stable@vger.kernel.org> # v6.14+
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20260521065038.2954998-1-carlos.song@oss.nxp.com
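The reordering in the i2c-imx runtime suspend path can be sketched with a toy state struct. struct demo_i2c and the demo_* functions are hypothetical, and the pinctrl_fails flag simulates pinctrl_pm_select_sleep_state() returning an error.

```c
#include <assert.h>
#include <stdbool.h>

struct demo_i2c {
	bool clk_on;
	bool pins_sleep;
};

/* Models the pinctrl switch, which may fail. */
static int demo_select_sleep(struct demo_i2c *d, bool fail)
{
	if (fail)
		return -1;
	d->pins_sleep = true;
	return 0;
}

/* Fixed ordering: switch pins first, disable the clock only on success. */
static int demo_runtime_suspend(struct demo_i2c *d, bool pinctrl_fails)
{
	int ret;

	assert(d);
	ret = demo_select_sleep(d, pinctrl_fails);
	if (ret)
		return ret; /* clock untouched, hardware still accessible */
	d->clk_on = false;
	return 0;
}
```

When the suspend is aborted, the clock is still on, so a later register access finds the hardware in the state the runtime PM core believes it to be in.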
Jason Gunthorpe [Thu, 4 Jun 2026 01:27:48 +0000 (22:27 -0300)]
IB/mlx5: Push pdn above pagefault_real_mr()
Remove the mlx5_mr_pdn() in pagefault_real_mr() by pushing the pdn up, all
the callers use 0 since they don't pass MLX5_PF_FLAGS_ENABLE except the
ioctl reg_mr path which can use the ioctl pd.
Jason Gunthorpe [Thu, 4 Jun 2026 01:27:47 +0000 (22:27 -0300)]
IB/mlx5: Push pdn above mlx5r_umr_update_xlt()
Keep pushing the pdn higher to remove more places touching mr->pd:
- XLT combinations that don't use PDN can just pass 0
- Use local pd values instead of mr->pd
- Implicit MR does not have inplace rereg, so the mr->pd is safe
Jason Gunthorpe [Thu, 4 Jun 2026 01:27:45 +0000 (22:27 -0300)]
IB/mlx5: Pull the pdn out of the depths of the umr machinery
Instead of getting the pdn deep inside the umr code, pass it in from the
top. to_mpd(mr->ibmr.pd)->pdn is not safe due to the rereg races, so all
the call sites need some revision to obtain the pdn in a safe way.
Mark them with mlx5_mr_pdn(); following patches will go through and remove
these.
Cases where the XLT flags are known and do not require the PDN can pass 0,
such as for mlx5_ib_dmabuf_invalidate_cb().
Also extract the DMABUF data_direct special case from inside the UMR code
and into the only place that needs it, pagefault_dmabuf_mr(). The actual
mr was created directly without using the UMR flow. Ultimately this will
be moved into mlx5_ib_init_dmabuf_mr().
Jason Gunthorpe [Thu, 4 Jun 2026 01:27:43 +0000 (22:27 -0300)]
RDMA/nldev: Fix locking when accessing mr->pd
Sashiko points out that, due to rereg_mr, the PD is actually variable and
all the touches in nldev are racy.
Use mr->device instead of mr->pd->device.
Getting the PD restrack ID is more tricky. To avoid disturbing all the
happy paths, add an rdma_restrack_sync() operation which is sort of like
flush_workqueue() or synchronize_irq(): after it returns, all the old
nldev touches to the mr are gone and everything sees the new PD. This
makes it safe to reach into the PD pointer.
Jason Gunthorpe [Thu, 4 Jun 2026 01:27:42 +0000 (22:27 -0300)]
IB/mlx5: Properly support implicit ODP rereg_mr
Due to all the child mkeys in the implicit ODP configuration we cannot
change anything in place for the parent mkey. Instead the whole thing
needs to be rebuilt if any change is requested. If the user does not
specify a translation then force the implicit values which will then fall
through the logic into mlx5_ib_reg_user_mr() to allocate a completely new
MR.
Since implicit children were also touching the mr->pd, this removes
another case where the access was racy.
Jason Gunthorpe [Thu, 4 Jun 2026 01:27:41 +0000 (22:27 -0300)]
RDMA/mlx5: Create ODP EQ for non-pinned dmabuf MRs
DMABUF generally relies on the ODP EQ mechanism to safely implement the
move semantics. ODP requires a device-global one time startup of the ODP
machinery when the first MR is created, and this was missed on the DMABUF
path.
Call mlx5r_odp_create_eq() when creating an ODP'able DMABUF.
The core code prevents using IB_ACCESS_ON_DEMAND unless the driver
advertises IB_ODP_SUPPORT, so until now, mlx5r_odp_create_eq() cannot be
called unless the device has ODP support.
However, DMABUF has no such protection and a second bug was allowing
DMABUFs to be created on non-ODP capable HW. Add a guard at the start of
mlx5r_odp_create_eq(). This is necessary here anyhow as the
dev->odp_eq_mutex is not initialized without IB_ODP_SUPPORT.
Jason Gunthorpe [Thu, 4 Jun 2026 01:27:40 +0000 (22:27 -0300)]
IB/mlx5: Don't take the rereg_mr fallback without a new translation
Jumping to mlx5_ib_reg_user_mr() without IB_MR_REREG_TRANS set will use
garbage values for start, length, and iova. Recovering the original mr
parameters for ODP and DMABUF to properly recreate it is too hard in this
flow, so just fail it.
io_uring/kbuf: validate ring provided buffer addresses with access_ok()
Commit:
809b997a5ce9 ("x86-64/arm64/powerpc: clean up and rename __copy_from_user_flushcache")
sanitized that any provided copy helper should separately validate
destination and source addresses, but we should also ensure that
anything that is retrieved from a buffer is validated upfront. For ring
provided buffers, always include an access_ok() when grabbing a new
buffer.
Fixes: c7fb19428d67 ("io_uring: add support for ring mapped supplied buffers")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
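The upfront check described above can be modelled in user space as a range test that rejects wraparound. This is only a sketch: MODEL_TASK_SIZE_MAX, struct model_ring_buf, both function names, and the -EFAULT return value are assumptions made for illustration and are not taken from the io_uring code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user address-space limit, used by this model only. */
#define MODEL_TASK_SIZE_MAX 0x00007ffffffff000ULL

/* Hypothetical stand-in for one ring-provided buffer entry. */
struct model_ring_buf {
	uint64_t addr;
	uint32_t len;
};

/* True if [addr, addr + len) lies below the limit without wrapping. */
static bool model_access_ok(uint64_t addr, uint64_t len)
{
	return len <= MODEL_TASK_SIZE_MAX && addr <= MODEL_TASK_SIZE_MAX - len;
}

/* Validate the buffer before handing its address to any later user. */
static int model_ring_buf_get(const struct model_ring_buf *buf,
			      uint64_t *out_addr)
{
	if (!model_access_ok(buf->addr, buf->len))
		return -EFAULT;
	*out_addr = buf->addr;
	return 0;
}
```

Writing the comparison as addr <= limit - len, rather than addr + len <= limit, keeps the test correct even when addr + len would overflow 64 bits.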
Heiko Carstens [Fri, 5 Jun 2026 15:32:06 +0000 (17:32 +0200)]
s390: Remove GENERIC_LOCKBREAK Kconfig option
s390 selects GENERIC_LOCKBREAK if PREEMPT is enabled. The reason is an
18-year-old commit [1] which fixed a compile error for PREEMPT-enabled
kernels. Back then only PREEMPT_NONE and PREEMPT_VOLUNTARY kernels were
considered to be important for s390. PREEMPT should "just work".
However, since recently PREEMPT is always enabled [2], which also causes
GENERIC_LOCKBREAK to be always enabled. For some workloads this leads to
massive performance degradation; e.g. a simple kernel compile on machines
with many CPUs may take up to four times longer.
To fix this just remove the GENERIC_LOCKBREAK from s390's Kconfig, since
the compile error from 18 years ago does not exist anymore.
[1] commit b6b40c532a36 ("[S390] Define GENERIC_LOCKBREAK.")
[2] commit 7dadeaa6e851 ("sched: Further restrict the preemption modes")
Cc: stable@vger.kernel.org
Reported-by: Massimiliano Pellizzer <massimiliano.pellizzer@canonical.com>
Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
RDMA/srp: bound SRP_RSP sense copy by the received length
srp_process_rsp() copies sense data from rsp->data + resp_data_len,
where resp_data_len is the full 32-bit value supplied by the SRP target
and is never checked against the number of bytes actually received
(wc->byte_len). The copy length is bounded to SCSI_SENSE_BUFFERSIZE, so
at most 96 bytes are copied, but the source offset is not bounded.
A malicious or compromised SRP target on the InfiniBand/RoCE fabric that
the initiator has logged into can return an SRP_RSP with
SRP_RSP_FLAG_SNSVALID set and a large resp_data_len. The receive buffer
is allocated at the target-chosen max_ti_iu_len, so the source of the
sense copy lands past the bytes actually received; with resp_data_len
near 0xFFFFFFFF it is gigabytes past the buffer and the read faults.
Copy the sense data only if it has not been truncated, that is, only if
the response header, the response data, and the sense region fit within
the bytes actually received; otherwise drop the sense and log. The
in-tree iSER and NVMe-RDMA receive paths already bound their parse by
wc->byte_len; this brings ib_srp into line with them.
Fixes: aef9ec39c47f ("IB: Add SCSI RDMA Protocol (SRP) initiator")
Link: https://patch.msgid.link/r/20260602220457.2542840-1-michael.bommarito@gmail.com
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-8
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
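A user-space sketch of the bound described above: the sense bytes are copied only when the header, the response data, and the sense region all fit inside the bytes actually received. The names model_sense_fits() and model_copy_sense() and the MODEL_* constants are hypothetical. Whether the real fix compares the full sense length or the clamped one is not stated in the message, so this sketch assumes the full length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_RSP_HDR_LEN   36u /* assumed fixed header size for this model */
#define MODEL_SENSE_BUFSIZE 96u /* the 96-byte cap mentioned in the message */

/*
 * True only if header + response data + sense data lie within byte_len.
 * The sum is done in 64 bits so a huge resp_data_len cannot wrap around.
 */
static bool model_sense_fits(uint32_t byte_len, uint32_t resp_data_len,
			     uint32_t sense_data_len)
{
	uint64_t end = (uint64_t)MODEL_RSP_HDR_LEN + resp_data_len +
		       sense_data_len;

	return end <= byte_len;
}

/* Returns the number of sense bytes copied, or 0 if the sense was dropped. */
static size_t model_copy_sense(uint8_t *dst, const uint8_t *iu,
			       uint32_t byte_len, uint32_t resp_data_len,
			       uint32_t sense_data_len)
{
	size_t n;

	if (!model_sense_fits(byte_len, resp_data_len, sense_data_len))
		return 0;

	n = sense_data_len < MODEL_SENSE_BUFSIZE ? sense_data_len
						 : MODEL_SENSE_BUFSIZE;
	memcpy(dst, iu + MODEL_RSP_HDR_LEN + resp_data_len, n);
	return n;
}
```

With resp_data_len near 0xFFFFFFFF the 64-bit sum is far larger than any received length, so the copy is skipped instead of reading past the buffer.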
IB/isert: Reject login PDUs shorter than ISER_HEADERS_LEN
In drivers/infiniband/ulp/isert/ib_isert.c, isert_login_recv_done()
computes the login request payload length as wc->byte_len minus
ISER_HEADERS_LEN with no lower bound, and login_req_len is a signed int.
A remote iSER initiator can post a login Send work request carrying
fewer than ISER_HEADERS_LEN (76) bytes, so the subtraction underflows
and login_req_len becomes negative.
isert_rx_login_req() then reads that negative length back into a signed
int, takes size = min(rx_buflen, MAX_KEY_VALUE_PAIRS), and because the
min() is signed it keeps the negative value; the value is then passed as
the memcpy() length and sign-extended to a multi-gigabyte size_t. The
copy into the 8192-byte login->req_buf runs far out of bounds and
faults, crashing the target node. The login phase precedes iSCSI
authentication, so no credentials are required to reach this path.
Reject any login PDU shorter than ISER_HEADERS_LEN before the
subtraction, mirroring the existing early return on a failed work
completion, so login_req_len can never go negative. The upper bound was
already safe: a posted login buffer cannot deliver more than
ISER_RX_PAYLOAD_SIZE, so the difference stays at or below
MAX_KEY_VALUE_PAIRS and the existing min() clamps it; only the missing
lower bound needs to be added.
Fixes: b8d26b3be8b3 ("iser-target: Add iSCSI Extensions for RDMA (iSER) target driver")
Link: https://patch.msgid.link/r/20260602194642.2273217-1-michael.bommarito@gmail.com
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-8
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
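The missing lower bound can be illustrated with a small user-space model. The value 76 is the one quoted in the message; model_login_req_len() and the -EINVAL return value are assumptions for this sketch, since the message only says the short PDU is rejected before the subtraction.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MODEL_ISER_HEADERS_LEN 76u /* value quoted in the commit message */

/*
 * Compute the login payload length from the received byte count.
 * A PDU shorter than the headers is rejected first, so the
 * subtraction below can never produce a negative length.
 */
static int model_login_req_len(uint32_t byte_len, int *login_req_len)
{
	if (byte_len < MODEL_ISER_HEADERS_LEN)
		return -EINVAL;

	*login_req_len = (int)(byte_len - MODEL_ISER_HEADERS_LEN);
	return 0;
}
```

On rejection the output length is left untouched, so a caller cannot accidentally use a stale or negative value as a copy size.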
Pierre Gondois [Thu, 28 May 2026 09:09:06 +0000 (11:09 +0200)]
cpufreq: Use policy->min/max init as QoS request
Modify cpufreq_policy_init_qos() introduced previously to use
policy->min/max set in the driver .init() callback as the initial
values for the policy min/max frequency QoS requests, respectively,
so long as they are different from 0 (which means that they have
been updated by the driver). Update the documentation in accordance
with that code change.
Prior to commit 521223d8b3ec ("cpufreq: Fix initialization of min and
max frequency QoS requests"), drivers were setting policy->min/max and
these values were used as initial policy QoS constraints.
After the above commit, these values are only used temporarily, as
cpufreq_set_policy() ultimately overrides them through:
cpufreq_policy_online()
\-cpufreq_init_policy()
  \-cpufreq_set_policy()
    \-/* Set policy->min/max */
A subsequent change will restore the previous behavior allowing
drivers to request special min/max QoS frequencies instead of
FREQ_QOS_MIN_DEFAULT_VALUE and FREQ_QOS_MAX_DEFAULT_VALUE, respectively,
if desired. For instance, the CPPC driver wants to advertise the lowest
non-linear frequency that should be used as the initial minimum
frequency QoS request.
However, for this purpose, all drivers setting policy->min/max to
policy->cpuinfo.min/max_freq, respectively, need to be updated so
their initial policy->min/max settings don't limit the frequency
scaling unnecessarily going forward (which would defeat the purpose
of commit 521223d8b3ec), so do that.
This does not actually alter the observed behavior of all of
the drivers in question because setting policy->min/max to
policy->cpuinfo.min/max_freq, respectively, is not necessary or
even useful any more after a previous change ("cpufreq: Set default
policy->min/max values for all drivers").
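The rule described in this message, where a nonzero policy->min/max set by the driver becomes the initial QoS request and zero falls back to the default, might be sketched as follows. The struct layouts, model_policy_init_qos(), and the MODEL_QOS_* values are hypothetical stand-ins for this illustration and are not the kernel definitions.

```c
#include <assert.h>

/* Stand-ins for FREQ_QOS_MIN_DEFAULT_VALUE / FREQ_QOS_MAX_DEFAULT_VALUE. */
#define MODEL_QOS_MIN_DEFAULT 0u
#define MODEL_QOS_MAX_DEFAULT 0x7fffffffu

struct model_policy {
	unsigned int min; /* 0 means "not set by the driver .init()" */
	unsigned int max;
};

struct model_qos {
	unsigned int min_req;
	unsigned int max_req;
};

/* Use driver-provided limits when present, defaults otherwise. */
static void model_policy_init_qos(const struct model_policy *p,
				  struct model_qos *q)
{
	q->min_req = p->min ? p->min : MODEL_QOS_MIN_DEFAULT;
	q->max_req = p->max ? p->max : MODEL_QOS_MAX_DEFAULT;
}
```

A driver that sets only policy->min, for example to a lowest non-linear frequency, would get that value as its minimum request while the maximum request stays at the default.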
Pierre Gondois [Thu, 28 May 2026 09:09:04 +0000 (11:09 +0200)]
cpufreq: Set default policy->min/max values for all drivers
Some drivers set policy->min/max in their .init() callback, but
cpufreq_set_policy() will ultimately override them through:
cpufreq_policy_online()
\-cpufreq_init_policy()
  \-cpufreq_set_policy()
    \-/* Set policy->min/max */
Thus the policy min/max values set by the drivers are only temporary.
There is an exception if CPUFREQ_NEED_INITIAL_FREQ_CHECK is set and
cpufreq_policy_online() calls __cpufreq_driver_target() which invokes
cpufreq_driver->target().
To prepare for a subsequent change that will remove all initialization
of policy->min/max in driver .init() callbacks if the min/max value is
equal to the corresponding cpuinfo.min/max_freq, set default
policy->min/max values in the core for all drivers.
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Jie Zhan <zhanjie9@hisilicon.com>
[ rjw: Edits of the new comment and changelog ]
Link: https://patch.msgid.link/20260528090913.2759118-3-pierre.gondois@arm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This facilitates subsequent changes that will, in
cpufreq_policy_init_qos():
- Set a default policy->min/max value for all policies.
- Use the policy->min/max values set by drivers as initial request
values for policy frequency QoS requests.
No functional change.
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Zhongqiu Han <zhongqiu.han@oss.qualcomm.com>
Reviewed-by: Jie Zhan <zhanjie9@hisilicon.com>
[ rjw: Changelog edits ]
Link: https://patch.msgid.link/20260528090913.2759118-2-pierre.gondois@arm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Jason Gunthorpe [Thu, 4 Jun 2026 18:03:13 +0000 (15:03 -0300)]
RDMA: During rereg_mr ensure that REREG_ACCESS is compatible
If IB_MR_REREG_ACCESS changes from RO to RW then the umem has to be
re-evaluated to ensure it is properly pinned as RW. Since the umem is
hidden inside each driver's mr struct, add an ib_umem_check_rereg() function
that each driver has to call before processing IB_MR_REREG_ACCESS.
mlx4 has to retain its duplicate ib_access_writable check because it
implements IB_MR_REREG_ACCESS | IB_MR_REREG_TRANS by changing both items
in place sequentially while the MR is live, so it will continue to not
support this combination.
Wentao Liang [Mon, 8 Jun 2026 07:11:23 +0000 (07:11 +0000)]
i2c: riic: fix refcount leak in riic_i2c_resume_noirq()
When riic_i2c_resume_noirq() is called, it deasserts the reset
using reset_control_deassert(), which for shared resets increments
a reference count. If pm_runtime_force_resume() then fails, the
function returns without calling reset_control_assert() to
decrement the count. This leaves the reset deasserted and the
reference count unbalanced, which can prevent other users of the
shared reset from properly asserting it later.
Fix the leak by calling reset_control_assert() on the error
handling path for a failed pm_runtime_force_resume().
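The imbalance and its fix can be modelled with a plain counter. This is a sketch only: struct model_reset and the model_* helpers are hypothetical, and resume_ret stands in for the return value of pm_runtime_force_resume().

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model of a shared reset line with a deassert count. */
struct model_reset {
	int deassert_count;
};

static void model_reset_deassert(struct model_reset *r)
{
	r->deassert_count++;
}

static void model_reset_assert(struct model_reset *r)
{
	r->deassert_count--;
}

/* resume_ret models the result of the forced runtime resume step. */
static int model_resume_noirq(struct model_reset *r, int resume_ret)
{
	model_reset_deassert(r);

	if (resume_ret) {
		/* Error path: undo the deassert so the count stays balanced. */
		model_reset_assert(r);
		return resume_ret;
	}

	return 0;
}
```

Without the assert on the error path, each failed resume would leave the count one higher, which is the leak the message describes.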
Wei Deng [Mon, 8 Jun 2026 09:17:01 +0000 (14:47 +0530)]
power: sequencing: pcie-m2: Add PCI ID 0x1103 for WCN6855 Bluetooth
WCN6855 is a Qualcomm Wi-Fi/BT combo chip that uses PCI device ID
0x1103. Add it to pwrseq_m2_pci_ids[] alongside the existing 0x1107
(WCN7850) entry, so that the pwrseq-pcie-m2 driver creates a Bluetooth
serdev device for WCN6855 cards inserted into PCIe M.2 Key E connectors.
Yun Zhou [Mon, 8 Jun 2026 08:43:34 +0000 (16:43 +0800)]
gpio: mvebu: fix NULL pointer dereference in suspend/resume
mvebu_pwm_suspend() and mvebu_pwm_resume() are called for all GPIO
banks during suspend/resume, but not all banks have PWM functionality.
GPIO banks without PWM have mvchip->mvpwm set to NULL.
Calling mvebu_pwm_suspend() with mvpwm == NULL causes a NULL pointer
dereference when it tries to access mvpwm->blink_select.
Unable to handle kernel NULL pointer dereference at virtual address 00000020 when write
[00000020] *pgd=00000000
Internal error: Oops: 815 [#1] PREEMPT ARM
Modules linked in:
CPU: 0 UID: 0 PID: 406 Comm: sh Not tainted 6.12.74-rt12-yocto-standard-g4e96f98fb7db-dirty #353
Hardware name: Marvell Armada 370/XP (Device Tree)
PC is at regmap_mmio_read+0x38/0x54
LR is at regmap_mmio_read+0x38/0x54
pc : [<c05fd2ac>] lr : [<c05fd2ac>] psr: 200f0013
sp : f0c11d10 ip : 00000000 fp : c100d2f0
r10: c14fb854 r9 : 00000000 r8 : 00000000
r7 : c1799c00 r6 : 00000020 r5 : 00000020 r4 : c179c7c0
r3 : f0a231a0 r2 : 00000020 r1 : 00000020 r0 : 00000000
Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
Control: 10c5387d Table: 135ec059 DAC: 00000051
Call trace:
regmap_mmio_read from _regmap_bus_reg_read+0x78/0xac
_regmap_bus_reg_read from _regmap_read+0x60/0x154
_regmap_read from regmap_read+0x3c/0x60
regmap_read from mvebu_gpio_suspend+0xa4/0x14c
mvebu_gpio_suspend from dpm_run_callback+0x54/0x180
dpm_run_callback from device_suspend+0x124/0x630
device_suspend from dpm_suspend+0x124/0x270
dpm_suspend from dpm_suspend_start+0x64/0x6c
dpm_suspend_start from suspend_devices_and_enter+0x140/0x8e8
suspend_devices_and_enter from pm_suspend+0x2fc/0x308
pm_suspend from state_store+0x6c/0xc8
state_store from kernfs_fop_write_iter+0x10c/0x1f8
kernfs_fop_write_iter from vfs_write+0x270/0x468
vfs_write from ksys_write+0x70/0xf0
ksys_write from ret_fast_syscall+0x0/0x54
Add a NULL check for mvchip->mvpwm before calling the PWM
suspend/resume functions.
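A minimal model of the guard, assuming a bank without PWM simply carries a NULL mvpwm pointer as the message states. The struct layouts and model_* function names are hypothetical and much simpler than the driver's.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified PWM state for one GPIO bank. */
struct model_pwm {
	unsigned int blink_select;
	unsigned int saved_blink_select;
};

struct model_chip {
	struct model_pwm *mvpwm; /* NULL when the bank has no PWM */
};

static void model_pwm_suspend(struct model_pwm *pwm)
{
	pwm->saved_blink_select = pwm->blink_select;
}

/* Only banks that actually have PWM state run the PWM suspend step. */
static void model_gpio_suspend(struct model_chip *chip)
{
	if (chip->mvpwm)
		model_pwm_suspend(chip->mvpwm);
}
```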
Ming Lei [Mon, 8 Jun 2026 14:25:10 +0000 (09:25 -0500)]
io_uring/net: support registered buffer for plain send and recv
So far IORING_RECVSEND_FIXED_BUF is only honoured on the SEND_ZC path,
even though the import wiring is already present for plain send and
completely absent for recv. Targets such as ublk's NBD backend want to
push/pull I/O data directly to/from an io_uring registered buffer over a
plain send/recv on a TCP socket.
Wire IORING_RECVSEND_FIXED_BUF into the plain IORING_OP_SEND and
IORING_OP_RECV paths:
- Accept the flag in SENDMSG_FLAGS / RECVMSG_FLAGS and, at prep time,
restrict it to the non-vectorized IORING_OP_SEND / IORING_OP_RECV
opcodes. The flag is mutually exclusive with buffer select, bundles and
(for recv) multishot. Prep also records sqe->buf_index.
- For recv, set REQ_F_IMPORT_BUFFER in setup so the registered buffer
is imported lazily at issue time, mirroring the send path.
- In io_send()/io_recv(), import the registered buffer via
io_import_reg_buf() (ITER_SOURCE for send, ITER_DEST for recv) and
clear REQ_F_IMPORT_BUFFER. The resulting bvec iter persists in
async_data, so MSG_WAITALL partial send/recv retries resume at the
right offset.
Linus Torvalds [Mon, 8 Jun 2026 14:31:41 +0000 (07:31 -0700)]
Merge tag 'hyperv-fixes-signed-20260607' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyperv fixes from Wei Liu:
- MSHV driver fixes from various people (Anirudh Rayabharam, Can Peng,
Dexuan Cui, Michael Kelley, Jork Loeser, Wei Liu)
- Hyper-V user space tools fixes (Thorsten Blum)
- Allow VMBus to be unloaded after frame buffer is flushed (Michael
Kelley)
* tag 'hyperv-fixes-signed-20260607' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
mshv: support 1G hugepages by passing them as 2M-aligned chunks
Drivers: hv: vmbus: Improve the logic of reserving fb_mmio on Gen2 VMs
mshv: use kmalloc_array in mshv_root_scheduler_init
mshv: Add conditional VMBus dependency
hyperv: Clean up and fix the guest ID comment in hvgdk.h
drm/hyperv: During panic do VMBus unload after frame buffer is flushed
Drivers: hv: vmbus: Provide option to skip VMBus unload on panic
mshv: unmap debugfs stats pages on kexec
mshv: clean up SynIC state on kexec for L1VH
mshv: limit SynIC management to MSHV-owned resources
hv: utils: replace deprecated strcpy with strscpy in kvp_register
hv: utils: handle and propagate errors in kvp_register
mshv: add a missing padding field
Filipe Manana [Tue, 5 May 2026 14:59:37 +0000 (15:59 +0100)]
btrfs: tracepoints: add trace event for log_new_dir_dentries()
log_new_dir_dentries() is an important step called during a fsync, as
well as during rename and link operations on inodes that were previously
logged. Add trace events for when entering and exiting that function.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Filipe Manana [Tue, 5 May 2026 14:31:16 +0000 (15:31 +0100)]
btrfs: tracepoints: add trace event for log_all_new_ancestors()
log_all_new_ancestors() is an important step called during a fsync, as
well as during rename and link operations on inodes that were previously
logged. Add trace events for when entering and exiting that function.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Filipe Manana [Thu, 30 Apr 2026 16:10:01 +0000 (17:10 +0100)]
btrfs: tracepoints: add trace event for btrfs_log_all_parents()
btrfs_log_all_parents() is an important step called during a fsync, as
well as during rename and link operations on inodes that were previously
logged. Add trace events for when entering and exiting that function.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Filipe Manana [Thu, 30 Apr 2026 15:05:56 +0000 (16:05 +0100)]
btrfs: tracepoints: add trace event for btrfs_log_inode()
btrfs_log_inode() is one of the most important steps called during a fsync,
as well as during rename and link operations on inodes that were previously
logged. Add trace events for when entering and exiting that function.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Filipe Manana [Wed, 29 Apr 2026 18:43:10 +0000 (19:43 +0100)]
btrfs: use a named enum for the log mode in inode log functions
We use this unnamed enum for the log mode and then pass it around log
functions as an int type with the odd name "inode_only" which suggests a
boolean. So add a name to the enum and change the type everywhere to that
enum and rename the parameters to something more clear - "log_mode".
Also move the enum into tree-log.h - it will be used later by new trace
events.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Filipe Manana [Tue, 28 Apr 2026 15:52:09 +0000 (16:52 +0100)]
btrfs: tracepoints: add trace event for btrfs_log_inode_parent()
btrfs_log_inode_parent() is one of the most important steps called during
a fsync operation as well as during rename and link operations on inodes
that were previously logged. Add trace events for when entering and
exiting that function.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Filipe Manana [Tue, 28 Apr 2026 14:32:15 +0000 (15:32 +0100)]
btrfs: tracepoints: add trace event for when fsync finishes
Currently we only have a trace event for when a fsync operation starts,
but this alone is not very helpful. Add a trace event for when fsync
finishes, which reports its return value, so that using tracing we can
see which other trace events happened in between (several will be added
soon for inode logging steps) and even measure execution time.
So rename the existing trace event btrfs_sync_file to
btrfs_sync_file_enter and add the trace event btrfs_sync_file_exit.
The naming is similar to what ext4 does (ext4_sync_file_enter and
ext4_sync_file_exit) and with similar information reported.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Filipe Manana [Mon, 27 Apr 2026 14:37:41 +0000 (15:37 +0100)]
btrfs: remove redundant writeback error check during fsync
If we can skip logging the inode during fsync, we check for writeback
errors in the inode's mapping by calling filemap_check_wb_err() and then
jump to the 'out_release_extents' label, which in turn jumps to the 'out'
label under which we check again for a writeback error by calling
file_check_and_advance_wb_err(). So the filemap_check_wb_err() ends up
being redundant. This happens since commit 333427a505be ("btrfs: minimal
conversion to errseq_t writeback error reporting on fsync").
Remove the filemap_check_wb_err() call.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>