The blsp_dma controller is shared between the different subsystems,
which is why it is already initialized by the firmware. We should not
reinitialize it from Linux to avoid potential other users of the DMA
engine to misbehave.
In mainline this can be described using the "qcom,controlled-remotely"
property. In the downstream/vendor kernel from Qualcomm there is an
opposite "qcom,managed-locally" property. This property is *not* set
for the qcom,sps-dma@7884000 and qcom,sps-dma@7ac4000 [1] so adding
"qcom,controlled-remotely" upstream matches the behavior of the
downstream/vendor kernel.
In preparation for switching to the architected timer as the primary
clockevents device, mark the cpuidle nodes with the 'local-timer-stop'
property to indicate that an alternative clockevents device must be
used for waking up from the "c2" idle state.
Signed-off-by: Will Deacon <willdeacon@google.com>
[Original commit from https://android.googlesource.com/kernel/gs/+/a896fd98638047989513d05556faebd28a62b27c] Signed-off-by: Will McVicker <willmcvicker@google.com> Reviewed-by: Youngmin Nam <youngmin.nam@samsung.com> Tested-by: Youngmin Nam <youngmin.nam@samsung.com> Fixes: ea89fdf24fd9 ("arm64: dts: exynos: google: Add initial Google gs101 SoC support") Signed-off-by: Peter Griffin <peter.griffin@linaro.org> Reviewed-by: Peter Griffin <peter.griffin@linaro.org> Tested-by: Peter Griffin <peter.griffin@linaro.org> Link: https://lore.kernel.org/r/20250611-gs101-cpuidle-v2-1-4fa811ec404d@linaro.org Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The QMI_DATA_LEN type may have different sizes. Taking the element's
address of that type and interpret it as a smaller sized ones works fine
for little endian platforms but not for big endian ones. Instead use
temporary variables of smaller sized types and cast them correctly to
support big endian platforms.
only the first syscall may fail and set errno, but the second may succeed
and keep errno intact, and the check will falsely pass.
Or if errno happened to be EINVAL before, even the first check may falsely
pass.
Also use EXPECT/ASSERT consistently. Currently there is an inconsistent mix
without obvious reasons for usage of one or another.
In commit 32c9c06adb5b ("ASoC: mediatek: disable buffer pre-allocation")
buffer pre-allocation was disabled to accommodate newer platforms that
have a limited reserved memory region for the audio frontend.
Turns out disabling pre-allocation across the board impacts platforms
that don't have this reserved memory region. Buffer allocation failures
have been observed on MT8173 and MT8183 based Chromebooks under low
memory conditions, which results in no audio playback for the user.
Since some MediaTek platforms already have dedicated reserved memory
pools for the audio frontend, the plan is to enable this for all of
them. This requires device tree changes. As a fallback, reinstate the
original policy of pre-allocating audio buffers at probe time of the
reserved memory pool cannot be found or used.
This patch covers the MT8173, MT8183, MT8186 and MT8192 platforms for
now, the reason being that existing MediaTek platform drivers that
supported reserved memory were all platforms that mainly supported
ChromeOS, and is also the set of devices that I can verify.
Fixes: 32c9c06adb5b ("ASoC: mediatek: disable buffer pre-allocation") Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> Link: https://patch.msgid.link/20250612074901.4023253-7-wenst@chromium.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Change the function to dynamically allocate it instead.
There is probably a better way to do it since only two integer fields
inside of that structure are actually used, but this is the simplest
rework for the moment.
Fixes: 783db6851c18 ("ASoC: ops: Enforce platform maximum on initial value") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://patch.msgid.link/20250610093057.2643233-1-arnd@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 7f1186a8d738661 ("ASoC: soc-dai: check return value at
snd_soc_dai_set_tdm_slot()") checks return value of
xlate_tdm_slot_mask() (A1)(A2).
/*
* ...
(Y) * TDM mode can be disabled by passing 0 for @slots. In this case @tx_mask,
* @rx_mask and @slot_width will be ignored.
* ...
*/
int snd_soc_dai_set_tdm_slot(...)
{
...
if (...)
(A1) ret = dai->driver->ops->xlate_tdm_slot_mask(...);
else
(A2) ret = snd_soc_xlate_tdm_slot_mask(...);
if (ret)
goto err;
...
}
snd_soc_xlate_tdm_slot_mask() (A2) will return -EINVAL if slots was 0 (X),
but snd_soc_dai_set_tdm_slot() allow to use it (Y).
(A) static int snd_soc_xlate_tdm_slot_mask(...)
{
...
if (!slots)
(X) return -EINVAL;
...
}
Call xlate_tdm_slot_mask() only if slots was non zero.
Reported-by: Giedrius Trainavičius <giedrius@blokas.io> Closes: https://lore.kernel.org/r/CAMONXLtSL7iKyvH6w=CzPTxQdBECf++hn8RKL6Y4=M_ou2YHow@mail.gmail.com Fixes: 7f1186a8d738661 ("ASoC: soc-dai: check return value at snd_soc_dai_set_tdm_slot()") Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Link: https://patch.msgid.link/8734cdfx59.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Add a dependency for IO_URING for the GCOV_PROFILE_URING symbol.
Without this patch the EXPERT config menu ends with
"Enable IO uring support" and the menu prompts for
GCOV_PROFILE_URING and IO_URING_MOCK_FILE are not subordinate to it.
This causes all of the EXPERT Kconfig options that follow
GCOV_PROFILE_URING to be display in the "upper" menu (General setup),
just following the EXPERT menu.
When a node withdraws and it turns out that it is the only node that has
the filesystem mounted, gfs2 currently tries to replay the local journal
to bring the filesystem back into a consistent state. Not only is that
a very bad idea, it has also never worked because gfs2_recover_func()
will refuse to do anything during a withdraw.
However, before even getting to this point, gfs2_recover_func()
dereferences sdp->sd_jdesc->jd_inode. This was a use-after-free before
commit 04133b607a78 ("gfs2: Prevent double iput for journal on error")
and is a NULL pointer dereference since then.
Simply get rid of self recovery to fix that.
Fixes: 601ef0d52e96 ("gfs2: Force withdraw to replay journals and wait for it to finish") Reported-by: Chunjie Zhu <chunjie.zhu@cloud.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Initially, conditional lock acquisition was removed to fix an xfstest bug
that was observed during internal testing. The deadlock reported by syzbot
is resolved by reintroducing conditional acquisition. The xfstest bug no
longer occurs on kernel version 6.16-rc1 during internal testing. I
assume that changes in other modules may have contributed to this.
Fixes: 69505fe98f19 ("fs/ntfs3: Replace inode_trylock with inode_lock") Reported-by: syzbot+a91fcdbd2698f99db8f4@syzkaller.appspotmail.com Suggested-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
To avoid deadlock, Commit 31651c607151 ("hfsplus: avoid deadlock
on file truncation") unlock extree before hfsplus_free_extents(),
and add check wheather extree is locked in hfsplus_free_extents().
However, when operations such as hfsplus_file_release,
hfsplus_setattr, hfsplus_unlink, and hfsplus_get_block are executed
concurrently in different files, it is very likely to trigger the
WARN_ON, which will lead syzbot and xfstest to consider it as an
abnormality.
The comment above this warning also describes one of the easy
triggering situations, which can easily trigger and cause
xfstest&syzbot to report errors.
Several threads could try to lock the shared extents tree.
And warning can be triggered in one thread when another thread
has locked the tree. This is the wrong behavior of the code and
we need to remove the warning.
struct ublk_device's __queues points to an allocation with up to
UBLK_MAX_NR_QUEUES (4096) queues, each of which have:
- struct ublk_queue (48 bytes)
- Tail array of up to UBLK_MAX_QUEUE_DEPTH (4096) struct ublk_io's,
32 bytes each
This means the full allocation can exceed 512 MB, which may well be
impossible to service with contiguous physical pages. Switch to
kvcalloc() and kvfree(), since there is no need for physically
contiguous memory.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Fixes: 71f28f3136af ("ublk_drv: add io_uring based userspace block driver") Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250620151008.3976463-2-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
The reproducer uses a file0 on a ntfs3 file system with a corrupted i_link.
When renaming, the file0's inode is marked as a bad inode because the file
name cannot be deleted.
The underlying bug is that make_bad_inode() is called on a live inode.
In some cases it's "icache lookup finds a normal inode, d_splice_alias()
is called to attach it to dentry, while another thread decides to call
make_bad_inode() on it - that would evict it from icache, but we'd already
found it there earlier".
In some it's outright "we have an inode attached to dentry - that's how we
got it in the first place; let's call make_bad_inode() on it just for shits
and giggles".
Fixes: 78ab59fee07f ("fs/ntfs3: Rework file operations") Reported-by: syzbot+1aa90f0eb1fc3e77d969@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=1aa90f0eb1fc3e77d969 Signed-off-by: Edward Adam Davis <eadavis@qq.com> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The macro takes a parameter called "p" but references "fc" internally.
This happens to compile as long as callers pass a variable named fc,
but breaks otherwise. Rename the first parameter to “fc” to match the
usage and to be consistent with warnfc() / errorfc().
Fixes: a3ff937b33d9 ("prefix-handling analogues of errorf() and friends") Signed-off-by: RubenKelevra <rubenkelevra@gmail.com> Link: https://lore.kernel.org/20250617230927.1790401-1-rubenkelevra@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
... and parse_longname() is not guaranteed that. That's the reason
why it uses kmemdup_nul() to build the argument for kstrtou64();
the problem is, kstrtou64() is not the only thing that need it.
Just get a NUL-terminated copy of the entire thing and be done
with that...
Fixes: dd66df0053ef "ceph: add support for encrypted snapshot names" Tested-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>
The move of the module sanity check to earlier skipped the audit logging
call in the case of failure and to a place where the previously used
context is unavailable.
Add an audit logging call for the module loading failure case and get
the module name when possible.
Link: https://issues.redhat.com/browse/RHEL-52839 Fixes: 02da2cbab452 ("module: move check_modinfo() early to early_mod_check()") Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
It is currently possible to configure a kernel with all Intel SoC
configs as loadable modules, but the board config as built-in. This
causes a link failure in the reference to the snd_soc_sof.ko module:
x86_64-linux-ld: sound/soc/intel/boards/sof_rt5682.o: in function `sof_rt5682_hw_params':
sof_rt5682.c:(.text+0x1f9): undefined reference to `sof_dai_get_mclk'
x86_64-linux-ld: sof_rt5682.c:(.text+0x234): undefined reference to `sof_dai_get_bclk'
x86_64-linux-ld: sound/soc/intel/boards/sof_rt5682.o: in function `sof_rt5682_codec_init':
sof_rt5682.c:(.text+0x3e0): undefined reference to `sof_dai_get_mclk'
x86_64-linux-ld: sound/soc/intel/boards/sof_cs42l42.o: in function `sof_cs42l42_hw_params':
sof_cs42l42.c:(.text+0x2a): undefined reference to `sof_dai_get_bclk'
x86_64-linux-ld: sound/soc/intel/boards/sof_nau8825.o: in function `sof_nau8825_hw_params':
sof_nau8825.c:(.text+0x7f): undefined reference to `sof_dai_get_bclk'
x86_64-linux-ld: sound/soc/intel/boards/sof_da7219.o: in function `da7219_codec_init':
sof_da7219.c:(.text+0xbf): undefined reference to `sof_dai_get_mclk'
x86_64-linux-ld: sound/soc/intel/boards/sof_maxim_common.o: in function `max_98373_hw_params':
sof_maxim_common.c:(.text+0x6f9): undefined reference to `sof_dai_get_tdm_slots'
x86_64-linux-ld: sound/soc/intel/boards/sof_realtek_common.o: in function `rt1015_hw_params':
sof_realtek_common.c:(.text+0x54c): undefined reference to `sof_dai_get_bclk'
x86_64-linux-ld: sound/soc/intel/boards/sof_realtek_common.o: in function `rt1308_hw_params':
sof_realtek_common.c:(.text+0x702): undefined reference to `sof_dai_get_mclk'
x86_64-linux-ld: sound/soc/intel/boards/sof_cirrus_common.o: in function `cs35l41_hw_params':
sof_cirrus_common.c:(.text+0x2f): undefined reference to `sof_dai_get_bclk'
Add an optional dependency on SND_SOC_SOF_INTEL_COMMON, to ensure that whenever
the SOF support is in a loadable module, none of the board code can be built-in.
This may be be a little heavy-handed, but I also don't see a reason why one would
want the boards to be built-in but not the SoC, so it shouldn't actually cause
any usability problems.
The Lenovo Yoga Book 9i GenX has the wrong values in the cirrus,dev-index
_DSD property. Add a fixup for this model to ignore the property and
hardcode the index from the I2C bus address.
The error in the cirrus,dev-index property would prevent the second amp
instance from probing. The component binding would never see all the
required instances and so there would not be a binding between
patch_realtek.c and the cs35l56 driver.
Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Reported-by: Brian Howard <blhoward2@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220228 Link: https://patch.msgid.link/20250714110154.204740-1-rf@opensource.cirrus.com Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
With large values of CONFIG_NR_CPUS, three Intel ethernet drivers fail to
compile like:
In function ‘i40e_free_q_vector’,
inlined from ‘i40e_vsi_alloc_q_vectors’ at drivers/net/ethernet/intel/i40e/i40e_main.c:12112:3:
571 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
include/linux/rcupdate.h:1084:17: note: in expansion of macro ‘BUILD_BUG_ON’
1084 | BUILD_BUG_ON(offsetof(typeof(*(ptr)), rhf) >= 4096); \
drivers/net/ethernet/intel/i40e/i40e_main.c:5113:9: note: in expansion of macro ‘kfree_rcu’
5113 | kfree_rcu(q_vector, rcu);
| ^~~~~~~~~
The problem is that the 'rcu' member in 'q_vector' is too far from the start
of the structure. Move this member before the CPU mask instead, in all three
drivers.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: David S. Miller <davem@davemloft.net> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
This indicates that the pgoff is unaligned. After analysis, I confirm the
vma is mapped to /dev/zero. Such a vma certainly has vm_file, but it is
set to anonymous by mmap_zero(). So even if it's mmapped by 2m-unaligned,
it can pass the check in thp_vma_allowable_order() as it is an
anonymous-mmap, but then be collapsed as a file-mmap.
It seems the problem has existed for a long time, but actually, since we
have khugepaged_max_ptes_none check before, we will skip collapse it as it
is /dev/zero and so has no present page. But commit d8ea7cc8547c limit
the check for only khugepaged, so the BUG_ON() can be triggered by
madvise_collapse().
Add vma_is_anonymous() check to make such vma be processed by
hpage_collapse_scan_pmd().
Link: https://lkml.kernel.org/r/20250111034511.2223353-1-liushixin2@huawei.com Fixes: d8ea7cc8547c ("mm/khugepaged: add flag to predicate khugepaged-only behavior") Signed-off-by: Liu Shixin <liushixin2@huawei.com> Reviewed-by: Yang Shi <yang@os.amperecomputing.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Mattew Wilcox <willy@infradead.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nanyong Sun <sunnanyong@huawei.com> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[acsjakub: backport, clean apply] Signed-off-by: Jakub Acs <acsjakub@amazon.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Free vCPUs before freeing any VM state, as both SVM and VMX may access
VM state when "freeing" a vCPU that is currently "in" L2, i.e. that needs
to be kicked out of nested guest mode.
Commit 6fcee03df6a1 ("KVM: x86: avoid loading a vCPU after .vm_destroy was
called") partially fixed the issue, but for unknown reasons only moved the
MMU unloading before VM destruction. Complete the change, and free all
vCPU state prior to destroying VM state, as nVMX accesses even more state
than nSVM.
In addition to the AVIC, KVM can hit a use-after-free on MSR filters:
Inarguably, both nSVM and nVMX need to be fixed, but punt on those
cleanups for the moment. Conceptually, vCPUs should be freed before VM
state. Assets like the I/O APIC and PIC _must_ be allocated before vCPUs
are created, so it stands to reason that they must be freed _after_ vCPUs
are destroyed.
Reported-by: Aaron Lewis <aaronlewis@google.com> Closes: https://lore.kernel.org/all/20240703175618.2304869-2-aaronlewis@google.com Cc: Jim Mattson <jmattson@google.com> Cc: Yan Zhao <yan.y.zhao@intel.com> Cc: Rick P Edgecombe <rick.p.edgecombe@intel.com> Cc: Kai Huang <kai.huang@intel.com> Cc: Isaku Yamahata <isaku.yamahata@intel.com> Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20250224235542.2562848-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Cheng <chengkev@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The helper function introduced in the reverted commit is for handling
the "refcounted domain mask" introduced in commit a7ddcea1f5ac
("drm/xe: Error handling in xe_force_wake_get()"). Since that API change
only exists in 6.13 and later, this helper is unnecessary in 6.12 stable
kernel.
The reverted commit updated the handling of xe_force_wake_get to match
the new "return refcounted domain mask" semantics introduced in commit a7ddcea1f5ac ("drm/xe: Error handling in xe_force_wake_get()"). However,
that API change only exists in 6.13 and later.
In 6.12 stable kernel, xe_force_wake_get still returns a status code.
The update incorrectly treats the return value as a mask, causing the
return value of 0 to be misinterpreted as an error
The reverted commit updated the handling of xe_force_wake_get to match
the new "return refcounted domain mask" semantics introduced in commit a7ddcea1f5ac ("drm/xe: Error handling in xe_force_wake_get()"). However,
that API change only exists in 6.13 and later.
In 6.12 stable kernel, xe_force_wake_get still returns a status code.
The update incorrectly treats the return value as a mask, causing the
return value of 0 to be misinterpreted as an error.
The reverted commit updated the handling of xe_force_wake_get to match
the new "return refcounted domain mask" semantics introduced in commit a7ddcea1f5ac ("drm/xe: Error handling in xe_force_wake_get()"). However,
that API change only exists in 6.13 and later.
In 6.12 stable kernel, xe_force_wake_get still returns a status code.
The update incorrectly treats the return value as a mask, causing the
return value of 0 to be misinterpreted as an error. As a result, the
driver probe fails with -ETIMEDOUT in xe_pci_probe -> xe_device_probe
-> xe_gt_init_hwconfig -> xe_force_wake_get.
[ 1254.323172] xe 0000:00:02.0: [drm] Found ALDERLAKE_P (device ID 46a6) display version 13.00 stepping D0
[ 1254.323175] xe 0000:00:02.0: [drm:xe_pci_probe [xe]] ALDERLAKE_P 46a6:000c dgfx:0 gfx:Xe_LP (12.00) media:Xe_M (12.00) display:yes dma_m_s:39 tc:1 gscfi:0 cscfi:0
[ 1254.323275] xe 0000:00:02.0: [drm:xe_pci_probe [xe]] Stepping = (G:C0, M:C0, B:**)
[ 1254.323328] xe 0000:00:02.0: [drm:xe_pci_probe [xe]] SR-IOV support: no (mode: none)
[ 1254.323379] xe 0000:00:02.0: [drm:intel_pch_type [xe]] Found Alder Lake PCH
[ 1254.323475] xe 0000:00:02.0: probe with driver xe failed with error -110
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5373 Cc: Badal Nilawar <badal.nilawar@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On g4x we currently use the 96MHz non-SSC refclk, which can't actually
generate an exact 2.7 Gbps link rate. In practice we end up with 2.688
Gbps which seems to be close enough to actually work, but link training
is currently failing due to miscalculating the DP_LINK_BW value (we
calcualte it directly from port_clock which reflects the actual PLL
outpout frequency).
Ideas how to fix this:
- nudge port_clock back up to 270000 during PLL computation/readout
- track port_clock and the nominal link rate separately so they might
differ a bit
- switch to the 100MHz refclk, but that one should be SSC so perhaps
not something we want
While we ponder about a better solution apply some band aid to the
immediate issue of miscalculated DP_LINK_BW value. With this
I can again use 2.7 Gbps link rate on g4x.
Cc: stable@vger.kernel.org Fixes: 665a7b04092c ("drm/i915: Feed the DPLL output freq back into crtc_state") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250710201718.25310-2-ville.syrjala@linux.intel.com Reviewed-by: Imre Deak <imre.deak@intel.com>
(cherry picked from commit a8b874694db5cae7baaf522756f87acd956e6e66) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[ changed display->platform.g4x to IS_G4X(i915) ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Update HDA driver to support Tegra264 differences from legacy HDA,
which includes: clocks/resets, always power on, and hardware-managed
FPCI/IPFS initialization. The driver retrieves this chip-specific
information from soc_data.
The ring buffer size varies across VMBus channels. The size of sysfs
node for the ring buffer is currently hardcoded to 4 MB. Userspace
clients either use fstat() or hardcode this size for doing mmap().
To address this, make the sysfs node size dynamic to reflect the
actual ring buffer size for each channel. This will ensure that
fstat() on ring sysfs node always returns the correct size of
ring buffer.
Reviewed-by: Michael Kelley <mhklinux@outlook.com> Tested-by: Michael Kelley <mhklinux@outlook.com> Reviewed-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Naman Jain <namjain@linux.microsoft.com> Link: https://lore.kernel.org/r/20250502074811.2022-3-namjain@linux.microsoft.com
[ The structure "struct attribute_group" does not have bin_size field in
v6.12.x kernel so the logic of configuring size of sysfs node for ring buffer
has been moved to vmbus_chan_bin_attr_is_visible().
Original change was not a fix, but it needs to be backported to fix size
related discrepancy caused by the commit mentioned in Fixes tag. ] Signed-off-by: Naman Jain <namjain@linux.microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
After commit d5c8d6e0fa61 ("kbuild: Update assembler calls to use proper
flags and language target"), which updated as-instr to use the
'assembler-with-cpp' language option, the Kbuild version of as-instr
always fails internally for arch/arm with
<command-line>: fatal error: asm/unified.h: No such file or directory
compilation terminated.
because '-include' flags are now taken into account by the compiler
driver and as-instr does not have '$(LINUXINCLUDE)', so unified.h is not
found.
This went unnoticed at the time of the Kbuild change because the last
use of as-instr in Kbuild that arch/arm could reach was removed in 5.7
by commit 541ad0150ca4 ("arm: Remove 32bit KVM host support") but a
stable backport of the Kbuild change to before that point exposed this
potential issue if one were to be reintroduced.
Follow the general pattern of '-include' paths throughout the tree and
make unified.h absolute using '$(srctree)' to ensure KBUILD_AFLAGS can
be used independently.
Closes: https://lore.kernel.org/CACo-S-1qbCX4WAVFA63dWfHtrRHZBTyyr2js8Lx=Az03XHTTHg@mail.gmail.com/ Cc: stable@vger.kernel.org Fixes: d5c8d6e0fa61 ("kbuild: Update assembler calls to use proper flags and language target") Reported-by: KernelCI bot <bot@kernelci.org> Reviewed-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
[ No KBUILD_RUSTFLAGS in 6.12 ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The variables `scale_pre_decml`, `scale_post_decml`, and `scale_precision`
were assigned in commit d68c592e02f6 ("iio: hid-sensor-prox: Fix scale not
correct issue"), but due to a merge conflict in
commit 9c15db92a8e5 ("Merge tag 'iio-for-5.13a' of
https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-next"),
these assignments were lost.
Add back lost assignments and replace `st->prox_attr` with
`st->prox_attr[0]` because commit 596ef5cf654b ("iio: hid-sensor-prox: Add
support for more channels") changed `prox_attr` to an array.
Cc: stable@vger.kernel.org # 5.13+ Fixes: 9c15db92a8e5 ("Merge tag 'iio-for-5.13a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-next") Signed-off-by: Zhang Lixu <lixu.zhang@intel.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: https://patch.msgid.link/20250331055022.1149736-2-lixu.zhang@intel.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
[ changed st->prox_attr[0] array access to st->prox_attr single struct member ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the ACPI DSDT table, PPP_RESOURCE_ID_LDO2_J is configured with 1256000
uV instead of the 1200000 uV we have currently in the device tree. Use the
same for consistency and correctness.
Cc: stable@vger.kernel.org Fixes: bd50b1f5b6f3 ("arm64: dts: qcom: x1e80100: Add Compute Reference Device") Signed-off-by: Stephan Gerhold <stephan.gerhold@linaro.org> Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Abel Vesa <abel.vesa@linaro.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250423-x1e-vreg-l2j-voltage-v1-1-24b6a2043025@linaro.org Signed-off-by: Bjorn Andersson <andersson@kernel.org>
[ Change x1e80100-crd.dts instead ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To start an application processor in SNP-isolated guest, a hypercall
is used that takes a virtual processor index. The hv_snp_boot_ap()
function uses that START_VP hypercall but passes as VP index to it
what it receives as a wakeup_secondary_cpu_64 callback: the APIC ID.
As those two aren't generally interchangeable, that may lead to hung
APs if the VP index and the APIC ID don't match up.
Update the parameter names to avoid confusion as to what the parameter
is. Use the APIC ID to the VP index conversion to provide the correct
input to the hypercall.
Cc: stable@vger.kernel.org Fixes: 44676bb9d566 ("x86/hyperv: Add smp support for SEV-SNP guest") Signed-off-by: Roman Kisel <romank@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/20250507182227.7421-2-romank@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <20250507182227.7421-2-romank@linux.microsoft.com>
[ changed HVCALL_GET_VP_INDEX_FROM_APIC_ID to HVCALL_GET_VP_ID_FROM_APIC_ID ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In KVM guests with Hyper-V hypercalls enabled, the hypercalls
HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST and HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX
allow a guest to request invalidation of portions of a virtual TLB.
For this, the hypercall parameter includes a list of GVAs that are supposed
to be invalidated.
However, when non-canonical GVAs are passed, there is currently no
filtering in place and they are eventually passed to checked invocations of
INVVPID on Intel / INVLPGA on AMD. While AMD's INVLPGA silently ignores
non-canonical addresses (effectively a no-op), Intel's INVVPID explicitly
signals VM-Fail and ultimately triggers the WARN_ONCE in invvpid_error():
Hyper-V documents that invalid GVAs (those that are beyond a partition's
GVA space) are to be ignored. While not completely clear whether this
ruling also applies to non-canonical GVAs, it is likely fine to make that
assumption, and manual testing on Azure confirms "real" Hyper-V interprets
the specification in the same way.
Skip non-canonical GVAs when processing the list of address to avoid
tripping the INVVPID failure. Alternatively, KVM could filter out "bad"
GVAs before inserting into the FIFO, but practically speaking the only
downside of pushing validation to the final processing is that doing so
is suboptimal for the guest, and no well-behaved guest will request TLB
flushes for non-canonical addresses.
As a result of a recent investigation, it was determined that x86 CPUs
which support 5-level paging, don't always respect CR4.LA57 when doing
canonical checks.
In particular:
1. MSRs which contain a linear address, allow full 57-bitcanonical address
regardless of CR4.LA57 state. For example: MSR_KERNEL_GS_BASE.
2. All hidden segment bases and GDT/IDT bases also behave like MSRs.
This means that full 57-bit canonical address can be loaded to them
regardless of CR4.LA57, both using MSRS (e.g GS_BASE) and instructions
(e.g LGDT).
3. TLB invalidation instructions also allow the user to use full 57-bit
address regardless of the CR4.LA57.
Finally, it must be noted that the CPU doesn't prevent the user from
disabling 5-level paging, even when the full 57-bit canonical address is
present in one of the registers mentioned above (e.g GDT base).
In fact, this can happen without any userspace help, when the CPU enters
SMM mode - some MSRs, for example MSR_KERNEL_GS_BASE are left to contain
a non-canonical address in regard to the new mode.
Since most of the affected MSRs and all segment bases can be read and
written freely by the guest without any KVM intervention, this patch makes
the emulator closely follow hardware behavior, which means that the
emulator doesn't take in the account the guest CPUID support for 5-level
paging, and only takes in the account the host CPU support.
Add emulation flags for MSR accesses and Descriptor Tables loads, and pass
the new flags as appropriate to emul_is_noncanonical_address(). The flags
will be used to perform the correct canonical check, as the type of access
affects whether or not CR4.LA57 is consulted when determining the canonical
bit.
No functional change is intended.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Link: https://lore.kernel.org/r/20240906221824.491834-3-mlevitsk@redhat.com
[sean: split to separate patch, massage changelog] Signed-off-by: Sean Christopherson <seanjc@google.com>
Stable-dep-of: fa787ac07b3c ("KVM: x86/hyper-v: Skip non-canonical addresses during PV TLB flush") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add emulate_ops.is_canonical_addr() to perform (non-)canonical checks in
the emulator, which will allow extending is_noncanonical_address() to
support different flavors of canonical checks, e.g. for descriptor table
bases vs. MSRs, without needing duplicate logic in the emulator.
No functional change is intended.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Link: https://lore.kernel.org/r/20240906221824.491834-3-mlevitsk@redhat.com
[sean: separate from additional of flags, massage changelog] Signed-off-by: Sean Christopherson <seanjc@google.com>
Stable-dep-of: fa787ac07b3c ("KVM: x86/hyper-v: Skip non-canonical addresses during PV TLB flush") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Punching a hole with a start offset that exceeds max_end is not
permitted and will result in a negative length in the
truncate_inode_partial_folio() function while truncating the page cache,
potentially leading to undesirable consequences.
The error out label of file_modified() should be out_inode_lock in
ext4_fallocate().
Fixes: 2890e5e0f49e ("ext4: move out common parts into ext4_fallocate()") Reported-by: Baokun Li <libaokun1@huawei.com> Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Baokun Li <libaokun1@huawei.com> Link: https://patch.msgid.link/20250319023557.2785018-1-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For the extents based inodes, the maxbytes should be sb->s_maxbytes
instead of sbi->s_bitmap_maxbytes. Additionally, for the calculation of
max_end, the -sb->s_blocksize operation is necessary only for
indirect-block based inodes. Correct the maxbytes and max_end value to
correct the behavior of punch hole.
Fixes: 2da376228a24 ("ext4: limit length to bitmap_maxbytes - blocksize in punch_hole") Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Baokun Li <libaokun1@huawei.com> Link: https://patch.msgid.link/20250506012009.3896990-2-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, all zeroing ranges, punch holes, collapse ranges, and insert
ranges first wait for all existing direct I/O workers to complete, and
then they acquire the mapping's invalidate lock before performing the
actual work. These common components are nearly identical, so we can
simplify the code by factoring them out into the ext4_fallocate().
Currently, all five sub-functions of ext4_fallocate() acquire the
inode's i_rwsem at the beginning and release it before exiting. This
process can be simplified by factoring out the management of i_rwsem
into the ext4_fallocate() function.
Now the real job of normal fallocate are open coded in ext4_fallocate(),
factor out a new helper ext4_do_fallocate() to do the real job, like
others functions (e.g. ext4_zero_range()) in ext4_fallocate() do, this
can make the code more clear, no functional changes.
Simplify ext4_insert_range() and align its code style with that of
ext4_collapse_range(). Refactor it by: a) renaming variables, b)
removing redundant input parameter checks and moving the remaining
checks under i_rwsem in preparation for future refactoring, and c)
renaming the three stale error tags.
Simplify ext4_collapse_range() and align its code style with that of
ext4_zero_range() and ext4_punch_hole(). Refactor it by: a) renaming
variables, b) removing redundant input parameter checks and moving
the remaining checks under i_rwsem in preparation for future
refactoring, and c) renaming the three stale error tags.
The current implementation of ext4_zero_range() contains complex
position calculations and stale error tags. To improve the code's
clarity and maintainability, it is essential to clean up the code and
improve its readability, this can be achieved by: a) simplifying and
renaming variables, making the style the same as ext4_punch_hole(); b)
eliminating unnecessary position calculations, writing back all data in
data=journal mode, and drop page cache from the original offset to the
end, rather than using aligned blocks; c) renaming the stale out_mutex
tags.
The current implementation of ext4_punch_hole() contains complex
position calculations and stale error tags. To improve the code's
clarity and maintainability, it is essential to clean up the code and
improve its readability, this can be achieved by: a) simplifying and
renaming variables; b) eliminating unnecessary position calculations;
c) writing back all data in data=journal mode, and drop page cache from
the original offset to the end, rather than using aligned blocks,
d) renaming the stale error tags.
After commit 'ad5cd4f4ee4d ("ext4: fix fallocate to use file_modified to
update permissions consistently"), we can update mtime and ctime
appropriately through file_modified() when doing zero range, collapse
rage, insert range and punch hole, hence there is no need to explicit
update times in those paths, just drop them.
Fragments aren't limited by Z_EROFS_PCLUSTER_MAX_DSIZE. However, if
a fragment's logical length is larger than Z_EROFS_PCLUSTER_MAX_DSIZE
but the fragment is not the whole inode, it currently returns
-EOPNOTSUPP because m_flags has the wrong EROFS_MAP_ENCODED flag set.
It is not intended by design but should be rare, as it can only be
reproduced by mkfs with `-Eall-fragments` in a specific case.
Let's normalize fragment m_flags using the new EROFS_MAP_FRAGMENT.
Use `z_idata_size != 0` to indicate that ztailpacking is enabled.
`Z_EROFS_ADVISE_INLINE_PCLUSTER` cannot be ignored, as `h_idata_size`
could be non-zero prior to erofs-utils 1.6 [1].
Additionally, merge `z_idataoff` and `z_fragmentoff` since these two
features are mutually exclusive for a given inode.
- Set `compressedblks = 1` directly for non-bigpcluster cases. This
simplifies the logic a bit since lcluster sizes larger than one block
are unsupported and the details remain unclear.
- For Z_EROFS_LCLUSTER_TYPE_PLAIN pclusters, avoid assuming
`compressedblks = 1` by default. Instead, check if
Z_EROFS_ADVISE_BIG_PCLUSTER_2 is set.
It basically has no impact to existing valid images, but it's useful to
find the gap to prepare for large PLAIN pclusters.
For QPIC V2 onwards there is a separate register to read
last code word "QPIC_NAND_READ_LOCATION_LAST_CW_n".
qcom_param_page_type_exec() is used to read only one code word
If it configures the number of code words to 1 in QPIC_NAND_DEV0_CFG0
register then QPIC controller thinks its reading the last code word,
since we are having separate register to read the last code word,
we have to configure "QPIC_NAND_READ_LOCATION_LAST_CW_n" register
to fetch data from QPIC buffer to system memory.
Without this change page read was failing with timeout error
/ # hexdump -C /dev/mtd1
[ 129.206113] qcom-nandc 1cc8000.nand-controller: failure to read page/oob
hexdump: /dev/mtd1: Connection timed out
This issue only seen on SDX targets since SDX target used QPICv2. But
same working on IPQ targets since IPQ used QPICv1.
As discussed in the thread containing
https://lore.kernel.org/linux-crypto/20250510053308.GB505731@sol/, the
Power10-optimized Poly1305 code is currently not safe to call in softirq
context. Disable it for now. It can be re-enabled once it is fixed.
Fixes: ba8f8624fde2 ("crypto: poly1305-p10 - Glue code for optmized Poly1305 implementation for ppc64le") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
[ applied to arch/powerpc/crypto/Kconfig ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In `waveform_common_attach()`, the two timers `&devpriv->ai_timer` and
`&devpriv->ao_timer` are initialized after the allocation of the device
private data by `comedi_alloc_devpriv()` and the subdevices by
`comedi_alloc_subdevices()`. The function may return with an error
between those function calls. In that case, `waveform_detach()` will be
called by the Comedi core to clean up. The check that
`waveform_detach()` uses to decide whether to delete the timers is
incorrect. It only checks that the device private data was allocated,
but that does not guarantee that the timers were initialized. It also
needs to check that the subdevices were allocated. Fix it.
This happens when 'clear_inode()' makes an attempt to finalize an underlying
JFS inode of unknown type. According to JFS layout description from
https://jfs.sourceforge.net/project/pub/jfslayout.pdf, inode types from 5 to
15 are reserved for future extensions and should not be encountered on a valid
filesystem. So add an extra check for valid inode type in 'copy_from_dinode()'.
Reported-by: syzbot+ac2116e48989e84a2893@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=ac2116e48989e84a2893 Fixes: 79ac5a46c5c1 ("jfs_lookup(): don't bother with . or ..") Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com> Signed-off-by: Aditya Dutt <duttaditya18@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Michael Zhivich [Wed, 23 Jul 2025 13:40:19 +0000 (09:40 -0400)]
x86/bugs: Fix use of possibly uninit value in amd_check_tsa_microcode()
For kernels compiled with CONFIG_INIT_STACK_NONE=y, the value of __reserved
field in zen_patch_rev union on the stack may be garbage. If so, it will
prevent correct microcode check when consulting p.ucode_rev, resulting in
incorrect mitigation selection.
This is a stable-only fix.
Cc: <stable@vger.kernel.org> Signed-off-by: Michael Zhivich <mzhivich@akamai.com> Fixes: 7a0395f6607a5 ("x86/bugs: Add a Transient Scheduler Attacks mitigation") Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Remove incorrect checks on cqspi->rx_chan that cause driver breakage
during failure cleanup. Ensure proper resource freeing on the success
path when operating in cqspi->use_direct_mode, preventing leaks and
improving stability.
This patch fixes Type-C compliance test TD 4.7.6 - Try.SNK DRP Connect
SNKAS.
tVbusON has a limit of 275ms when entering SRC_ATTACHED. Compliance
testers can interpret the TryWait.Src to Attached.Src transition after
Try.Snk as being in Attached.Src the entire time, so ~170ms is lost
to the debounce timer.
Setting the data role can be a costly operation in host mode, and when
completed after 100ms can cause Type-C compliance test check TD 4.7.5.V.4
to fail.
Turn VBUS on before tcpm_set_roles to meet timing requirement.
The funciton tcpm_acc_attach is not setting the proper state when
calling tcpm_set_role. The function tcpm_set_role is currently only
handling TYPEC_STATE_USB. For the tcpm_acc_attach to switch into other
modal states tcpm_set_role needs to be extended by an extra state
parameter. This patch is handling the proper state change when calling
tcpm_acc_attach.
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20250404-ml-topic-tcpm-v1-3-b99f44badce8@pengutronix.de
Stable-dep-of: bec15191d523 ("usb: typec: tcpm: apply vbus before data bringup in tcpm_src_attach") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add two tests:
- one test has 'rX <op> r10' where rX is not r10, and
- another test has 'rX <op> rY' where rX and rY are not r10
but there is an early insn 'rX = r10'.
Without previous verifier change, both tests will fail.
Clippy's lints may avoid emitting a suggestion to use a language or
library feature that is not supported by the minimum supported version,
if given by the `msrv` field in the configuration file.
For instance, Clippy should not suggest using `let ... else` in a lint
if the MSRV did not implement that syntax.
If the MSRV is not provided, Clippy will assume all features are available.
Thus enable it with the minimum Rust version the kernel supports.
Note that there is currently a small disadvantage in doing so: since
we still use unstable features that nevertheless work in the range
of versions we support (e.g. `#[expect(...)]`), we lose suggestions
for those. However, over time we will stop using unstable features
(especially language and library ones) as it is our goal, thus, in the
end, we will want to have the `msrv` set.
Rust is also considering adding a similar feature in `rustc` too, which
we should probably enable if it becomes available [2].
Commit 48b4800a1c6a ("zsmalloc: page migration support") added support for
migrating zsmalloc pages using the movable_operations migration framework.
However, the commit did not take into account that zsmalloc supports
migration only when CONFIG_COMPACTION is enabled. Tracing shows that
zsmalloc was still passing the __GFP_MOVABLE flag even when compaction is
not supported.
This can result in unmovable pages being allocated from movable page
blocks (even without stealing page blocks), ZONE_MOVABLE and CMA area.
Possible user visible effects:
- Some ZONE_MOVABLE memory can be not actually movable
- CMA allocation can fail because of this
- Increased memory fragmentation due to ignoring the page mobility
grouping feature
I'm not really sure who uses kernels without compaction support, though :(
To fix this, clear the __GFP_MOVABLE flag when
!IS_ENABLED(CONFIG_COMPACTION).
Link: https://lkml.kernel.org/r/20250704103053.6913-1-harry.yoo@oracle.com Fixes: 48b4800a1c6a ("zsmalloc: page migration support") Signed-off-by: Harry Yoo <harry.yoo@oracle.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Minchan Kim <minchan@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In shrink_folio_list(), the hwpoisoned folio may be large folio, which
can't be handled by unmap_poisoned_folio(). For THP, try_to_unmap_one()
must be passed with TTU_SPLIT_HUGE_PMD to split huge PMD first and then
retry. Without TTU_SPLIT_HUGE_PMD, we will trigger null-ptr deref of
pvmw.pte. Even we passed TTU_SPLIT_HUGE_PMD, we will trigger a
WARN_ON_ONCE due to the page isn't in swapcache.
Since UCE is rare in real world, and race with reclaimation is more rare,
just skipping the hwpoisoned large folio is enough. memory_failure() will
handle it if the UCE is triggered again.
This happens when memory reclaim for large folio races with
memory_failure(), and will lead to kernel panic. The race is as
follows:
cpu0 cpu1
shrink_folio_list memory_failure
TestSetPageHWPoison
unmap_poisoned_folio
--> trigger BUG_ON due to
unmap_poisoned_folio couldn't
handle large folio
[tujinjiang@huawei.com: add comment to unmap_poisoned_folio()] Link: https://lkml.kernel.org/r/69fd4e00-1b13-d5f7-1c82-705c7d977ea4@huawei.com Link: https://lkml.kernel.org/r/20250627125747.3094074-2-tujinjiang@huawei.com Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com> Fixes: 1b0449544c64 ("mm/vmscan: don't try to reclaim hwpoison folio") Reported-by: syzbot+3b220254df55d8ca8a61@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/68412d57.050a0220.2461cf.000e.GAE@google.com/ Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Acked-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The checksum mode has been added a while ago, but it is only validated
when manually launching mptcp_connect.sh with "-C".
The different CIs were then not validating these MPTCP Connect tests
with checksum enabled. To make sure they do, add a new test program
executing mptcp_connect.sh with the checksum mode.
The "mmap" and "sendfile" alternate modes for mptcp_connect.sh/.c are
available from the beginning, but only tested when mptcp_connect.sh is
manually launched with "-m mmap" or "-m sendfile", not via the
kselftests helpers.
The MPTCP CI was manually running "mptcp_connect.sh -m mmap", but not
"-m sendfile". Plus other CIs, especially the ones validating the stable
releases, were not validating these alternate modes.
To make sure these modes are validated by these CIs, add two new test
programs executing mptcp_connect.sh with the alternate modes.
A warning is raised when __request_region() detects a conflict with a
resource whose resource.desc is IORES_DESC_DEVICE_PRIVATE_MEMORY.
But this warning is only valid for iomem_resources.
The hmem device resource uses resource.desc as the numa node id, which can
cause spurious warnings.
This warning appeared on a machine with multiple cxl memory expanders.
One of the NUMA node id is 6, which is the same as the value of
IORES_DESC_DEVICE_PRIVATE_MEMORY.
In this environment it was just a spurious warning, but when I saw the
warning I suspected a real problem so it's better to fix it.
This change fixes this by restricting the warning to only iomem_resource.
This also adds a missing new line to the warning message.
Link: https://lkml.kernel.org/r/20250719112604.25500-1-akinobu.mita@gmail.com Fixes: 7dab174e2e27 ("dax/hmem: Move hmem device registration to dax_hmem.ko") Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To prevent inodes with invalid file types from tripping through the vfs
and causing malfunctions or assertion failures, add a missing sanity check
when reading an inode from a block device. If the file type is not valid,
treat it as a filesystem error.
Since 6ee9b3d84775 ("kasan: remove kasan_find_vm_area() to prevent
possible deadlock"), more detailed info about the vmalloc mapping and the
origin was dropped due to potential deadlocks.
While fixing the deadlock is necessary, that patch was too quick in
killing an otherwise useful feature, and did no due-diligence in
understanding if an alternative option is available.
Restore printing more helpful vmalloc allocation info in KASAN reports
with the help of vmalloc_dump_obj(). Example report:
| BUG: KASAN: vmalloc-out-of-bounds in vmalloc_oob+0x4c9/0x610
| Read of size 1 at addr ffffc900002fd7f3 by task kunit_try_catch/493
|
| CPU: [...]
| Call Trace:
| <TASK>
| dump_stack_lvl+0xa8/0xf0
| print_report+0x17e/0x810
| kasan_report+0x155/0x190
| vmalloc_oob+0x4c9/0x610
| [...]
|
| The buggy address belongs to a 1-page vmalloc region starting at 0xffffc900002fd000 allocated at vmalloc_oob+0x36/0x610
| The buggy address belongs to the physical page:
| page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x126364
| flags: 0x200000000000000(node=0|zone=2)
| raw: 02000000000000000000000000000000dead0000000001220000000000000000
| raw: 0000000000000000000000000000000000000001ffffffff0000000000000000
| page dumped because: kasan: bad access detected
|
| [..]
Link: https://lkml.kernel.org/r/20250716152448.3877201-1-elver@google.com Fixes: 6ee9b3d84775 ("kasan: remove kasan_find_vm_area() to prevent possible deadlock") Signed-off-by: Marco Elver <elver@google.com> Suggested-by: Uladzislau Rezki <urezki@gmail.com> Acked-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Yeoreum Yun <yeoreum.yun@arm.com> Cc: Yunseong Kim <ysk@kzalloc.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
gve_tx_timeout was calculating missed completions in a way that is only
relevant in the GQ queue format. Additionally, it was attempting to
disable device interrupts, which is not needed in either GQ or DQ queue
formats.
As a result, TX timeouts with the DQ queue format likely would have
triggered early resets without kicking the queue at all.
This patch drops the check for pending work altogether and always kicks
the queue after validating the queue has not seen a TX timeout too
recently.
Cc: stable@vger.kernel.org Fixes: 87a7f321bb6a ("gve: Recover from queue stall due to missed IRQ") Co-developed-by: Tim Hostetler <thostet@google.com> Signed-off-by: Tim Hostetler <thostet@google.com> Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20250717192024.1820931-1-hramamurthy@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Starting from Tiger Lake, LAN NVM is locked for writes by SW, so the
> driver cannot perform checksum validation and correction. This means
> that all NVM images must leave the factory with correct checksum and
> checksum valid bit set.
Unfortunately some systems have left the factory with an uninitialized
value of 0xFFFF at register address 0x3F (checksum word location).
So on Tiger Lake platform we ignore the computed checksum when such
condition is encountered.
Signed-off-by: Jacek Kowalski <jacek@jacekk.info> Tested-by: Vlad URSU <vlad@ursu.me> Fixes: 4051f68318ca9 ("e1000e: Do not take care about recovery NVM checksum") Cc: stable@vger.kernel.org Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Vitaly Lifshits <vitaly.lifshits@intel.com> Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Starting from Tiger Lake, LAN NVM is locked for writes by SW, so the
> driver cannot perform checksum validation and correction. This means
> that all NVM images must leave the factory with correct checksum and
> checksum valid bit set. Since Tiger Lake devices were the first to have
> this lock, some systems in the field did not meet this requirement.
> Therefore, for these transitional devices we skip checksum update and
> verification, if the valid bit is not set.
Signed-off-by: Jacek Kowalski <jacek@jacekk.info> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Vitaly Lifshits <vitaly.lifshits@intel.com> Fixes: 4051f68318ca9 ("e1000e: Do not take care about recovery NVM checksum") Cc: stable@vger.kernel.org Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The fsl_mc_get_endpoint() function uses device_find_child() for
localization, which implicitly calls get_device() to increment the
device's reference count before returning the pointer. However, the
caller dpaa2_switch_port_connect_mac() fails to properly release this
reference in multiple scenarios. We should call put_device() to
decrement reference count properly.
As comment of device_find_child() says, 'NOTE: you will need to drop
the reference with put_device() after use'.
Found by code review.
Cc: stable@vger.kernel.org Fixes: 84cba72956fd ("dpaa2-switch: integrate the MAC endpoint support") Signed-off-by: Ma Ke <make24@iscas.ac.cn> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250717022309.3339976-3-make24@iscas.ac.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The fsl_mc_get_endpoint() function uses device_find_child() for
localization, which implicitly calls get_device() to increment the
device's reference count before returning the pointer. However, the
caller dpaa2_eth_connect_mac() fails to properly release this
reference in multiple scenarios. We should call put_device() to
decrement reference count properly.
As comment of device_find_child() says, 'NOTE: you will need to drop
the reference with put_device() after use'.
Found by code review.
Cc: stable@vger.kernel.org Fixes: 719479230893 ("dpaa2-eth: add MAC/PHY support through phylink") Signed-off-by: Ma Ke <make24@iscas.ac.cn> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250717022309.3339976-2-make24@iscas.ac.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
`cpu_switch_to()` and `call_on_irq_stack()` manipulate SP to change
to different stacks along with the Shadow Call Stack if it is enabled.
Those two stack changes cannot be done atomically and both functions
can be interrupted by SErrors or Debug Exceptions which, though unlikely,
is very much broken : if interrupted, we can end up with mismatched stacks
and Shadow Call Stack leading to clobbered stacks.
In `cpu_switch_to()`, it can happen when SP_EL0 points to the new task,
but x18 stills points to the old task's SCS. When the interrupt handler
tries to save the task's SCS pointer, it will save the old task
SCS pointer (x18) into the new task struct (pointed to by SP_EL0),
clobbering it.
In `call_on_irq_stack()`, it can happen when switching from the task stack
to the IRQ stack and when switching back. In both cases, we can be
interrupted when the SCS pointer points to the IRQ SCS, but SP points to
the task stack. The nested interrupt handler pushes its return addresses
on the IRQ SCS. It then detects that SP points to the task stack,
calls `call_on_irq_stack()` and clobbers the task SCS pointer with
the IRQ SCS pointer, which it will also use !
This leads to tasks returning to addresses on the wrong SCS,
or even on the IRQ SCS, triggering kernel panics via CONFIG_VMAP_STACK
or FPAC if enabled.
This is possible on a default config, but unlikely.
However, when enabling CONFIG_ARM64_PSEUDO_NMI, DAIF is unmasked and
instead the GIC is responsible for filtering what interrupts the CPU
should receive based on priority.
Given the goal of emulating NMIs, pseudo-NMIs can be received by the CPU
even in `cpu_switch_to()` and `call_on_irq_stack()`, possibly *very*
frequently depending on the system configuration and workload, leading
to unpredictable kernel panics.
Completely mask DAIF in `cpu_switch_to()` and restore it when returning.
Do the same in `call_on_irq_stack()`, but restore and mask around
the branch.
Mask DAIF even if CONFIG_SHADOW_CALL_STACK is not enabled for consistency
of behaviour between all configurations.
Introduce and use an assembly macro for saving and masking DAIF,
as the existing one saves but only masks IF.
Cc: <stable@vger.kernel.org> Signed-off-by: Ada Couprie Diaz <ada.coupriediaz@arm.com> Reported-by: Cristian Prundeanu <cpru@amazon.com> Fixes: 59b37fe52f49 ("arm64: Stash shadow stack pointer in the task struct on interrupt") Tested-by: Cristian Prundeanu <cpru@amazon.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250718142814.133329-1-ada.coupriediaz@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The mute LED on the HP Pavilion Laptop 15-eg0xxx,
which uses the ALC287 codec, didn't work.
This patch fixes the issue by enabling the ALC287_FIXUP_HP_GPIO_LED quirk.
Tested on a physical device, the LED now works as intended.
In file included from drivers/crypto/intel/qat/qat_common/adf_pm_dbgfs_utils.c:4:
include/linux/sprintf.h:11:54: error: unknown type name 'va_list'
11 | __printf(2, 0) int vsprintf(char *buf, const char *, va_list);
| ^~~~~~~
include/linux/sprintf.h:1:1: note: 'va_list' is defined in header '<stdarg.h>'; this is probably fixable by adding '#include <stdarg.h>'
Link: https://lkml.kernel.org/r/20250721173754.42865913@canb.auug.org.au Fixes: 39ced19b9e60 ("lib/vsprintf: split out sprintf() and friends") Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Andriy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Petr Mladek <pmladek@suse.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The fsl_mc_get_endpoint() function may call fsl_mc_device_lookup()
twice, which would increment the device's reference count twice if
both lookups find a device. This could lead to a reference count leak.
Found by code review.
Cc: stable@vger.kernel.org Fixes: 1ac210d128ef ("bus: fsl-mc: add the fsl_mc_get_endpoint function") Signed-off-by: Ma Ke <make24@iscas.ac.cn> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: Simon Horman <horms@kernel.org> Fixes: 8567494cebe5 ("bus: fsl-mc: rescan devices if endpoint not found") Link: https://patch.msgid.link/20250717022309.3339976-1-make24@iscas.ac.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The acpi_evaluate_object() returns an ACPI error code and not
Linux one. For the some platforms the err will have positive code
which may be interpreted incorrectly. Use device_reset() for
reset control which handles it correctly.
Fixes: bd2fdedbf2ba ("i2c: tegra: Add the ACPI support") Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Akhil R <akhilrajeev@nvidia.com> Cc: <stable@vger.kernel.org> # v5.17+ Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org> Link: https://lore.kernel.org/r/20250710131206.2316-2-akhilrajeev@nvidia.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>