Linus Torvalds [Sat, 8 Mar 2025 17:21:41 +0000 (07:21 -1000)]
Merge tag 'loongarch-fixes-6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen:
"Fix bugs in kernel build, hibernation, memory management and KVM"
* tag 'loongarch-fixes-6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
LoongArch: KVM: Fix GPA size issue about VM
LoongArch: KVM: Reload guest CSR registers after sleep
LoongArch: KVM: Add interrupt checking for AVEC
LoongArch: Set hugetlb mmap base address aligned with pmd size
LoongArch: Set max_pfn with the PFN of the last page
LoongArch: Use polling play_dead() when resuming from hibernation
LoongArch: Eliminate superfluous get_numa_distances_cnt()
LoongArch: Convert unreachable() to BUG()
Bibo Mao [Sat, 8 Mar 2025 05:52:04 +0000 (13:52 +0800)]
LoongArch: KVM: Fix GPA size issue about VM
Physical address space is 48 bit on Loongson-3A5000 physical machine,
however it is 47 bit for VM on Loongson-3A5000 system. Size of physical
address space of VM is the same with the size of virtual user space (a
half) of physical machine.
Variable cpu_vabits represents user address space, kernel address space
is not included (user space and kernel space are both a half of total).
Here cpu_vabits, rather than cpu_vabits - 1, is to represent the size of
guest physical address space.
Also there is strict checking about page fault GPA address, inject error
if it is larger than maximum GPA address of VM.
Cc: stable@vger.kernel.org Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Bibo Mao [Sat, 8 Mar 2025 05:52:01 +0000 (13:52 +0800)]
LoongArch: KVM: Reload guest CSR registers after sleep
On host, the HW guest CSR registers are lost after suspend and resume
operation. Since last_vcpu of boot CPU still records latest vCPU pointer
so that the guest CSR register skips to reload when boot CPU resumes and
vCPU is scheduled.
Here last_vcpu is cleared so that guest CSR registers will reload from
scheduled vCPU context after suspend and resume.
Cc: stable@vger.kernel.org Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Bibo Mao [Sat, 8 Mar 2025 05:51:59 +0000 (13:51 +0800)]
LoongArch: KVM: Add interrupt checking for AVEC
There is a newly added macro INT_AVEC with CSR ESTAT register, which is
bit 14 used for LoongArch AVEC support. AVEC interrupt status bit 14 is
supported with macro CSR_ESTAT_IS, so here replace the hard-coded value
0x1fff with macro CSR_ESTAT_IS so that the AVEC interrupt status is also
supported by KVM.
Cc: stable@vger.kernel.org Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
The problem is that base address allocated from hugetlbfs is not aligned
with pmd size. Here add a checking for hugetlbfs and align base address
with pmd size. After this patch the test case "testcases/bin/hugefork02"
passes to run.
This is similar to the commit 7f24cbc9c4d42db8a3c8484d1 ("mm/mmap: teach
generic_get_unmapped_area{_topdown} to handle hugetlb mappings").
Bibo Mao [Sat, 8 Mar 2025 05:51:32 +0000 (13:51 +0800)]
LoongArch: Set max_pfn with the PFN of the last page
The current max_pfn equals to zero. In this case, it causes user cannot
get some page information through /proc filesystem such as kpagecount.
The following message is displayed by stress-ng test suite with command
"stress-ng --verbose --physpage 1 -t 1".
# stress-ng --verbose --physpage 1 -t 1
stress-ng: error: [1691] physpage: cannot read page count for address 0x134ac000 in /proc/kpagecount, errno=22 (Invalid argument)
stress-ng: error: [1691] physpage: cannot read page count for address 0x7ffff207c3a8 in /proc/kpagecount, errno=22 (Invalid argument)
stress-ng: error: [1691] physpage: cannot read page count for address 0x134b0000 in /proc/kpagecount, errno=22 (Invalid argument)
...
After applying this patch, the kernel can pass the test.
# stress-ng --verbose --physpage 1 -t 1
stress-ng: debug: [1701] physpage: [1701] started (instance 0 on CPU 3)
stress-ng: debug: [1701] physpage: [1701] exited (instance 0 on CPU 3)
stress-ng: debug: [1700] physpage: [1701] terminated (success)
Cc: stable@vger.kernel.org # 6.8+ Fixes: ff6c3d81f2e8 ("NUMA: optimize detection of memory with no node id assigned by firmware") Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Huacai Chen [Sat, 8 Mar 2025 05:51:32 +0000 (13:51 +0800)]
LoongArch: Use polling play_dead() when resuming from hibernation
When CONFIG_RANDOM_KMALLOC_CACHES or other randomization infrastructrue
enabled, the idle_task's stack may different between the booting kernel
and target kernel. So when resuming from hibernation, an ACTION_BOOT_CPU
IPI wakeup the idle instruction in arch_cpu_idle_dead() and jump to the
interrupt handler. But since the stack pointer is changed, the interrupt
handler cannot restore correct context.
So rename the current arch_cpu_idle_dead() to idle_play_dead(), make it
as the default version of play_dead(), and the new arch_cpu_idle_dead()
call play_dead() directly. For hibernation, implement an arch-specific
hibernate_resume_nonboot_cpu_disable() to use the polling version (idle
instruction is replace by nop, and irq is disabled) of play_dead(), i.e.
poll_play_dead(), to avoid IPI handler corrupting the idle_task's stack
when resuming from hibernation.
This solution is a little similar to commit 406f992e4a372dafbe3c ("x86 /
hibernate: Use hlt_play_dead() when resuming from hibernation").
Tiezhu Yang [Sat, 8 Mar 2025 05:50:45 +0000 (13:50 +0800)]
LoongArch: Convert unreachable() to BUG()
When compiling on LoongArch, there exists the following objtool warning
in arch/loongarch/kernel/machine_kexec.o:
kexec_reboot() falls through to next function crash_shutdown_secondary()
Avoid using unreachable() as it can (and will in the absence of UBSAN)
generate fall-through code. Use BUG() so we get a "break BRK_BUG" trap
(with unreachable annotation).
Linus Torvalds [Sat, 8 Mar 2025 02:21:02 +0000 (16:21 -1000)]
Merge tag 's390-6.14-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Vasily Gorbik:
- Fix return address recovery of traced function in ftrace to ensure
reliable stack unwinding
- Fix compiler warnings and runtime crashes of vDSO selftests on s390
by introducing a dedicated GNU hash bucket pointer with correct
32-bit entry size
- Fix test_monitor_call() inline asm, which misses CC clobber, by
switching to an instruction that doesn't modify CC
* tag 's390-6.14-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/ftrace: Fix return address recovery of traced function
selftests/vDSO: Fix GNU hash table entry size for s390x
s390/traps: Fix test_monitor_call() inline assembly
Linus Torvalds [Fri, 7 Mar 2025 22:17:42 +0000 (12:17 -1000)]
Merge tag 'acpi-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
"Restore the previous behavior of the ACPI platform_profile sysfs
interface that has been changed recently in a way incompatible with
the existing user space (Mario Limonciello)"
* tag 'acpi-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
platform/x86/amd: pmf: Add balanced-performance to hidden choices
platform/x86/amd: pmf: Add 'quiet' to hidden choices
ACPI: platform_profile: Add support for hidden choices
Linus Torvalds [Fri, 7 Mar 2025 21:17:30 +0000 (11:17 -1000)]
Merge tag 'for-6.14-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- fix leaked extent map after error when reading chunks
- replace use of deprecated strncpy
- in zoned mode, fixed range when ulocking extent range, causing a hang
* tag 'for-6.14-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix a leaked chunk map issue in read_one_chunk()
btrfs: replace deprecated strncpy() with strscpy()
btrfs: zoned: fix extent range end unlock in cow_file_range()
Linus Torvalds [Fri, 7 Mar 2025 21:12:33 +0000 (11:12 -1000)]
Merge tag 'block-6.14-20250306' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:
- NVMe pull request via Keith:
- TCP use after free fix on polling (Sagi)
- Controller memory buffer cleanup fixes (Icenowy)
- Free leaking requests on bad user passthrough commands (Keith)
- TCP error message fix (Maurizio)
- TCP corruption fix on partial PDU (Maurizio)
- TCP memory ordering fix for weakly ordered archs (Meir)
- Type coercion fix on message error for TCP (Dan)
- Name the RQF flags enum, fixing issues with anon enums and BPF import
of it
- ublk parameter setting fix
- GPT partition 7-bit conversion fix
* tag 'block-6.14-20250306' of git://git.kernel.dk/linux:
block: Name the RQF flags enum
nvme-tcp: fix signedness bug in nvme_tcp_init_connection()
block: fix conversion of GPT partition name to 7-bit
ublk: set_params: properly check if parameters can be applied
nvmet-tcp: Fix a possible sporadic response drops in weakly ordered arch
nvme-tcp: fix potential memory corruption in nvme_tcp_recv_pdu()
nvme-tcp: Fix a C2HTermReq error message
nvmet: remove old function prototype
nvme-ioctl: fix leaked requests on mapping error
nvme-pci: skip CMB blocks incompatible with PCI P2P DMA
nvme-pci: clean up CMBMSC when registering CMB fails
nvme-tcp: fix possible UAF in nvme_tcp_poll
Linus Torvalds [Fri, 7 Mar 2025 21:09:33 +0000 (11:09 -1000)]
Merge tag 'io_uring-6.14-20250306' of git://git.kernel.dk/linux
Pull io_uring fix from Jens Axboe:
"A single fix for a regression introduced in the 6.14 merge window,
causing stalls/hangs with IOPOLL reads or writes"
* tag 'io_uring-6.14-20250306' of git://git.kernel.dk/linux:
io_uring/rw: ensure reissue path is correctly handled for IOPOLL
- Fix possible memory corruption in child_cfs_rq_on_list()
* tag 'sched-urgent-2025-03-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/rt: Update limit of sched_rt sysctl in documentation
sched/deadline: Use online cpus for validating runtime
sched/fair: Fix potential memory corruption in child_cfs_rq_on_list
Linus Torvalds [Fri, 7 Mar 2025 20:38:33 +0000 (10:38 -1000)]
Merge tag 'perf-urgent-2025-03-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf event fixes from Ingo Molnar:
"Fix a race between PMU registration and event creation, and fix
pmus_lock vs. pmus_srcu lock ordering"
* tag 'perf-urgent-2025-03-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Fix perf_pmu_register() vs. perf_init_event()
perf/core: Fix pmus_lock vs. pmus_srcu ordering
Linus Torvalds [Fri, 7 Mar 2025 17:51:27 +0000 (07:51 -1000)]
Merge tag 'hwmon-for-v6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
- xgene-hwmon: Fix a NULL vs IS_ERR_OR_NULL() check
- ad7314: Return error if leading zero bits are non-zero
- ntc_thermistor: Update/fix the ncpXXxh103 sensor table
- pmbus: Initialise page count in pmbus_identify()
- peci/dimmtemp: Do not provide fake threshold data
* tag 'hwmon-for-v6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: fix a NULL vs IS_ERR_OR_NULL() check in xgene_hwmon_probe()
hwmon: (ad7314) Validate leading zero bits and return error
hwmon: (ntc_thermistor) Fix the ncpXXxh103 sensor table
hwmon: (pmbus) Initialise page count in pmbus_identify()
hwmon: (peci/dimmtemp) Do not provide fake thresholds data
Linus Torvalds [Fri, 7 Mar 2025 17:29:13 +0000 (07:29 -1000)]
Merge tag 'platform-drivers-x86-v6.14-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Ilpo Järvinen:
- amd/pmf:
- Initialize 'cb_mutex'
- Support for new version of PMF-TA
- intel-hid: Fix volume buttons on Microsoft Surface Go 4 tablet
- intel/vsec: Add Diamond Rapids support
- thinkpad_acpi: Add battery quirk for ThinkPad X131e
* tag 'platform-drivers-x86-v6.14-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86/amd/pmf: Update PMF Driver for Compatibility with new PMF-TA
platform/x86/amd/pmf: Propagate PMF-TA return codes
platform/x86/intel/vsec: Add Diamond Rapids support
platform/x86: thinkpad_acpi: Add battery quirk for ThinkPad X131e
platform/x86: intel-hid: fix volume buttons on Microsoft Surface Go 4 tablet
platform/x86/amd/pmf: Initialize and clean up `cb_mutex`
Linus Torvalds [Fri, 7 Mar 2025 17:24:41 +0000 (07:24 -1000)]
Merge tag 'sound-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"There is a single change in ALSA core (for sequencer code for the
module auto-loading in a wrong timing) while the all rest are various
HD- and USB-audio fixes.
Many of them are boring device-specific quirks, and should be safe to
take"
* tag 'sound-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/realtek: Add support for ASUS Zenbook UM3406KA Laptops using CS35L41 HDA
ALSA: hda/realtek: Add support for ASUS B5405 and B5605 Laptops using CS35L41 HDA
ALSA: hda/realtek: Add support for ASUS B3405 and B3605 Laptops using CS35L41 HDA
ALSA: hda/realtek: Add support for various ASUS Laptops using CS35L41 HDA
ALSA: hda/realtek: Add support for ASUS ROG Strix G614 Laptops using CS35L41 HDA
ALSA: hda/realtek: Add support for ASUS ROG Strix GA603 Laptops using CS35L41 HDA
ALSA: hda/realtek: Add support for ASUS ROG Strix G814 Laptop using CS35L41 HDA
ALSA: hda: intel: Add Dell ALC3271 to power_save denylist
ALSA: hda/realtek: update ALC222 depop optimize
ALSA: hda: realtek: fix incorrect IS_REACHABLE() usage
ALSA: usx2y: validate nrpacks module parameter on probe
ALSA: hda/realtek - add supported Mic Mute LED for Lenovo platform
ALSA: seq: Avoid module auto-load handling at event delivery
ALSA: hda: Fix speakers on ASUS EXPERTBOOK P5405CSA 1.0
ALSA: hda/realtek: Fix Asus Z13 2025 audio
ALSA: hda/realtek: Remove (revert) duplicate Ally X config
Linus Torvalds [Fri, 7 Mar 2025 04:25:35 +0000 (18:25 -1000)]
fs/pipe: add simpler helpers for common cases
The fix to atomically read the pipe head and tail state when not holding
the pipe mutex has caused a number of headaches due to the size change
of the involved types.
It turns out that we don't have _that_ many places that access these
fields directly and were affected, but we have more than we strictly
should have, because our low-level helper functions have been designed
to have intimate knowledge of how the pipes work.
And as a result, that random noise of direct 'pipe->head' and
'pipe->tail' accesses makes it harder to pinpoint any actual potential
problem spots remaining.
For example, we didn't have a "is the pipe full" helper function, but
instead had a "given these pipe buffer indexes and this pipe size, is
the pipe full". That's because some low-level pipe code does actually
want that much more complicated interface.
But most other places literally just want a "is the pipe full" helper,
and not having it meant that those places ended up being unnecessarily
much too aware of this all.
It would have been much better if only the very core pipe code that
cared had been the one aware of this all.
So let's fix it - better late than never. This just introduces the
trivial wrappers for "is this pipe full or empty" and to get how many
pipe buffers are used, so that instead of writing
if (pipe_full(pipe->head, pipe->tail, pipe->max_usage))
the places that literally just want to know if a pipe is full can just
say
if (pipe_is_full(pipe))
instead. The existing trivial cases were converted with a 'sed' script.
This cuts down on the places that access pipe->head and pipe->tail
directly outside of the pipe code (and core splice code) quite a lot.
The splice code in particular still revels in doing the direct low-level
accesses, and the fuse fuse_dev_splice_write() code also seems a bit
unnecessarily eager to go very low-level, but it's at least a bit better
than it used to be.
xe:
- Remove double page flip on initial plane
- Properly setup userptr pfn_flags_mask
- Fix GT "for each engine" workarounds
- Fix userptr races and missed validations
- Userptr invalid page access fixes
- Cleanup some style nits
amdgpu:
- Fix NULL check in DC code
- SMU 14 fix
amdkfd:
- Fix NULL check in queue validation
radeon:
- RS400 HyperZ fix"
* tag 'drm-fixes-2025-03-07' of https://gitlab.freedesktop.org/drm/kernel: (22 commits)
drm/bochs: Fix DPMS regression
drm/xe/userptr: Unmap userptrs in the mmu notifier
drm/xe/hmm: Don't dereference struct page pointers without notifier lock
drm/xe/hmm: Style- and include fixes
drm/xe: Add staging tree for VM binds
drm/xe: Fix fault mode invalidation with unbind
drm/xe/vm: Fix a misplaced #endif
drm/xe/vm: Validate userptr during gpu vma prefetching
drm/amd/pm: always allow ih interrupt from fw
drm/radeon: Fix rs400_gpu_init for ATI mobility radeon Xpress 200M
drm/amdkfd: Fix NULL Pointer Dereference in KFD queue
drm/amd/display: Fix null check for pipe_ctx->plane_state in resource_build_scaling_params
drm/xe: Fix GT "for each engine" workarounds
drm/xe/userptr: properly setup pfn_flags_mask
drm/i915/mst: update max stream count to match number of pipes
drm/xe: Remove double pageflip
drm/sched: Fix preprocessor guard
drm/imagination: Fix timestamps in firmware traces
drm/imagination: only init job done fences once
drm/imagination: Hold drm_gem_gpuva lock for unmap
...
Breno Leitao [Thu, 6 Mar 2025 16:27:51 +0000 (08:27 -0800)]
block: Name the RQF flags enum
Commit 5f89154e8e9e3445f9b59 ("block: Use enum to define RQF_x bit
indexes") converted the RQF flags to an anonymous enum, which was
a beneficial change. This patch goes one step further by naming the enum
as "rqf_flags".
This naming enables exporting these flags to BPF clients, eliminating
the need to duplicate these flags in BPF code. Instead, BPF clients can
now access the same kernel-side values through CO:RE (Compile Once, Run
Everywhere), as shown in this example:
Linus Torvalds [Thu, 6 Mar 2025 23:52:15 +0000 (13:52 -1000)]
Merge tag 'bcachefs-2025-03-06' of git://evilpiepirate.org/bcachefs
Pull bcachefs fixes from Kent Overstreet:
- Fix a compatibility issue: we shouldn't be setting incompat feature
bits unless explicitly requested
- Fix another bug where the journal alloc/resize path could spuriously
fail with -BCH_ERR_open_buckets_empty
- Copygc shouldn't run on read-only devices: fragmentation isn't an
issue if we're not currently writing to a given device, and it may
not have anywhere to move the data to
* tag 'bcachefs-2025-03-06' of git://evilpiepirate.org/bcachefs:
bcachefs: copygc now skips non-rw devices
bcachefs: Fix bch2_dev_journal_alloc() spuriously failing
bcachefs: Don't set BCH_FEATURE_incompat_version_field unless requested
Kent Overstreet [Fri, 28 Feb 2025 16:34:41 +0000 (11:34 -0500)]
bcachefs: copygc now skips non-rw devices
There's no point in doing copygc on non-rw devices: the fragmentation
doesn't matter if we're not writing to them, and we may not have
anywhere to put the data on our other devices.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Previously, we fixed journal resize spuriousl failing with
-BCH_ERR_open_buckets_empty, but initial journal allocation was missed
because it didn't invoke the "block on allocator" loop at all.
Factor out the "loop on allocator" code to fix that.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Ard Biesheuvel [Thu, 6 Mar 2025 15:59:16 +0000 (16:59 +0100)]
x86/boot: Sanitize boot params before parsing command line
The 5-level paging code parses the command line to look for the 'no5lvl'
string, and does so very early, before sanitize_boot_params() has been
called and has been given the opportunity to wipe bogus data from the
fields in boot_params that are not covered by struct setup_header, and
are therefore supposed to be initialized to zero by the bootloader.
This triggers an early boot crash when using syslinux-efi to boot a
recent kernel built with CONFIG_X86_5LEVEL=y and CONFIG_EFI_STUB=n, as
the 0xff padding that now fills the unused PE/COFF header is copied into
boot_params by the bootloader, and interpreted as the top half of the
command line pointer.
Fix this by sanitizing the boot_params before use. Note that there is no
harm in calling this more than once; subsequent invocations are able to
spot that the boot_params have already been cleaned up.
- wifi: iwlwifi:
- fix A-MSDU TSO preparation
- free pages allocated when failing to build A-MSDU
- ipv6: fix dst ref loop in ila lwtunnel
- mptcp: fix 'scheduling while atomic' in
mptcp_pm_nl_append_new_local_addr
- bluetooth: add check for mgmt_alloc_skb() in
mgmt_device_connected()
- ethtool: allow NULL nlattrs when getting a phy_device
- eth: be2net: fix sleeping while atomic bugs in
be_ndo_bridge_getlink
Previous releases - always broken:
- core: support TCP GSO case for a few missing flags
- wifi: mac80211:
- fix vendor-specific inheritance
- cleanup sta TXQs on flush
- llc: do not use skb_get() before dev_queue_xmit()
- eth: ipa: nable checksum for IPA_ENDPOINT_AP_MODEM_{RX,TX}
for v4.7"
* tag 'net-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (41 commits)
net: ipv6: fix missing dst ref drop in ila lwtunnel
net: ipv6: fix dst ref loop in ila lwtunnel
mctp i3c: handle NULL header address
net: dsa: mt7530: Fix traffic flooding for MMIO devices
net-timestamp: support TCP GSO case for a few missing flags
vlan: enforce underlying device type
mptcp: fix 'scheduling while atomic' in mptcp_pm_nl_append_new_local_addr
net: ethtool: netlink: Allow NULL nlattrs when getting a phy_device
ppp: Fix KMSAN uninit-value warning with bpf
net: ipa: Enable checksum for IPA_ENDPOINT_AP_MODEM_{RX,TX} for v4.7
net: ipa: Fix QSB data for v4.7
net: ipa: Fix v4.7 resource group names
net: hns3: make sure ptp clock is unregister and freed if hclge_ptp_get_cycle returns an error
wifi: nl80211: disable multi-link reconfiguration
net: dsa: rtl8366rb: don't prompt users for LED control
be2net: fix sleeping while atomic bugs in be_ndo_bridge_getlink
llc: do not use skb_get() before dev_queue_xmit()
wifi: cfg80211: regulatory: improve invalid hints checking
caif_virtio: fix wrong pointer check in cfv_probe()
net: gso: fix ownership in __udp_gso_segment
...
Linus Torvalds [Thu, 6 Mar 2025 19:19:15 +0000 (09:19 -1000)]
Merge tag 'v6.14-rc5-smb3-fixes' of git://git.samba.org/ksmbd
Pull smb fixes from Steve French:
"Five SMB server fixes, two related client fixes, and minor MAINTAINERS
update:
- Two SMB3 lock fixes fixes (including use after free and bug on fix)
- Fix to race condition that can happen in processing IPC responses
- Four ACL related fixes: one related to endianness of num_aces, and
two related fixes to the checks for num_aces (for both client and
server), and one fixing missing check for num_subauths which can
cause memory corruption
- And minor update to email addresses in MAINTAINERS file"
* tag 'v6.14-rc5-smb3-fixes' of git://git.samba.org/ksmbd:
cifs: fix incorrect validation for num_aces field of smb_acl
ksmbd: fix incorrect validation for num_aces field of smb_acl
smb: common: change the data type of num_aces to le16
ksmbd: fix bug on trap in smb2_lock
ksmbd: fix use-after-free in smb2_lock
ksmbd: fix type confusion via race condition when using ipc_msg_send_request
ksmbd: fix out-of-bounds in parse_sec_desc()
MAINTAINERS: update email address in cifs and ksmbd entry
Linus Torvalds [Thu, 6 Mar 2025 18:18:48 +0000 (08:18 -1000)]
Merge tag 'exfat-for-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat
Pull exfat fixes from Namjae Jeon:
- Optimize new cluster allocation by correctly find empty entry slot
- Add a check to prevent excessive bitmap clearing due to invalid
data size of file/dir entry
- Fix incorrect error return for zero-byte writes
* tag 'exfat-for-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
exfat: add a check for invalid data size
exfat: short-circuit zero-byte writes in exfat_file_write_iter
exfat: fix soft lockup in exfat_clear_bitmap
exfat: fix just enough dentries but allocate a new cluster to dir
Linus Torvalds [Thu, 6 Mar 2025 18:04:49 +0000 (08:04 -1000)]
Merge tag 'vfs-6.14-rc6.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
- Fix spelling mistakes in idmappings.rst
- Fix RCU warnings in override_creds()/revert_creds()
- Create new pid namespaces with default limit now that pid_max is
namespaced
* tag 'vfs-6.14-rc6.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs:
pid: Do not set pid_max in new pid namespaces
doc: correcting two prefix errors in idmappings.rst
cred: Fix RCU warnings in override/revert_creds
Linus Torvalds [Thu, 6 Mar 2025 17:53:25 +0000 (07:53 -1000)]
fs/pipe: fix pipe buffer index use in FUSE
This was another case that Rasmus pointed out where the direct access to
the pipe head and tail pointers broke on 32-bit configurations due to
the type changes.
As with the pipe FIONREAD case, fix it by using the appropriate helper
functions that deal with the right pipe index sizing.
Linus Torvalds [Thu, 6 Mar 2025 17:33:58 +0000 (07:33 -1000)]
fs/pipe: do not open-code pipe head/tail logic in FIONREAD
Rasmus points out that we do indeed have other cases of breakage from
the type changes that were introduced on 32-bit targets in order to read
the pipe head and tail values atomically (commit 3d252160b818: "fs/pipe:
Read pipe->{head,tail} atomically outside pipe->mutex").
Fix it up by using the proper helper functions that now deal with the
pipe buffer index types properly. This makes the code simpler and more
obvious.
The compiler does the CSE and loop hoisting of the pipe ring size
masking that we used to do manually, so open-coding this was never a
good idea.
Linus Torvalds [Thu, 6 Mar 2025 17:30:42 +0000 (07:30 -1000)]
fs/pipe: express 'pipe_empty()' in terms of 'pipe_occupancy()'
That's what 'pipe_full()' does, so it's more consistent. But more
importantly it gets the type limits right when the pipe head and tail
are no longer necessarily 'unsigned int'.
Fabrizio Castro [Wed, 5 Mar 2025 16:37:50 +0000 (16:37 +0000)]
gpio: rcar: Fix missing of_node_put() call
of_parse_phandle_with_fixed_args() requires its caller to
call into of_node_put() on the node pointer from the output
structure, but such a call is currently missing.
Haoxiang Li [Mon, 3 Mar 2025 02:42:33 +0000 (10:42 +0800)]
btrfs: fix a leaked chunk map issue in read_one_chunk()
Add btrfs_free_chunk_map() to free the memory allocated
by btrfs_alloc_chunk_map() if btrfs_add_chunk_map() fails.
Fixes: 7dc66abb5a47 ("btrfs: use a dedicated data structure for chunk maps") CC: stable@vger.kernel.org Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Haoxiang Li <haoxiang_li2024@163.com> Signed-off-by: David Sterba <dsterba@suse.com>
Jens Axboe [Thu, 6 Mar 2025 11:32:46 +0000 (04:32 -0700)]
Merge tag 'nvme-6.14-2025-03-05' of git://git.infradead.org/nvme into block-6.14
Pull NVMe fixe from Keith:
"nvme fixes for Linux 6.14
- TCP use after free fix on polling (Sagi)
- Controller memory buffer cleanup fixes (Icenowy)
- Free leaking requests on bad user passthrough commands (Keith)
- TCP error message fix (Maurizio)
- TCP corruption fix on partial PDU (Maurizio)
- TCP memory ordering fix for weakly ordered archs (Meir)
- Type coercion fix on message error for TCP (Dan)"
* tag 'nvme-6.14-2025-03-05' of git://git.infradead.org/nvme:
nvme-tcp: fix signedness bug in nvme_tcp_init_connection()
nvmet-tcp: Fix a possible sporadic response drops in weakly ordered arch
nvme-tcp: fix potential memory corruption in nvme_tcp_recv_pdu()
nvme-tcp: Fix a C2HTermReq error message
nvmet: remove old function prototype
nvme-ioctl: fix leaked requests on mapping error
nvme-pci: skip CMB blocks incompatible with PCI P2P DMA
nvme-pci: clean up CMBMSC when registering CMB fails
nvme-tcp: fix possible UAF in nvme_tcp_poll
Justin Iurman [Tue, 4 Mar 2025 18:10:39 +0000 (19:10 +0100)]
net: ipv6: fix dst ref loop in ila lwtunnel
This patch follows commit 92191dd10730 ("net: ipv6: fix dst ref loops in
rpl, seg6 and ioam6 lwtunnels") and, on a second thought, the same patch
is also needed for ila (even though the config that triggered the issue
was pathological, but still, we don't want that to happen).
Shrikanth Hegde [Thu, 6 Mar 2025 05:29:53 +0000 (10:59 +0530)]
sched/deadline: Use online cpus for validating runtime
The ftrace selftest reported a failure because writing -1 to
sched_rt_runtime_us returns -EBUSY. This happens when the possible
CPUs are different from active CPUs.
Active CPUs are part of one root domain, while remaining CPUs are part
of def_root_domain. Since active cpumask is being used, this results in
cpus=0 when a non active CPUs is used in the loop.
Fix it by looping over the online CPUs instead for validating the
bandwidth calculations.
Michal Koutný [Wed, 5 Mar 2025 14:58:49 +0000 (15:58 +0100)]
pid: Do not set pid_max in new pid namespaces
It is already difficult for users to troubleshoot which of multiple pid
limits restricts their workload. The per-(hierarchical-)NS pid_max would
contribute to the confusion.
Also, the implementation copies the limit upon creation from
parent, this pattern showed cumbersome with some attributes in legacy
cgroup controllers -- it's subject to race condition between parent's
limit modification and children creation and once copied it must be
changed in the descendant.
Let's do what other places do (ucounts or cgroup limits) -- create new
pid namespaces without any limit at all. The global limit (actually any
ancestor's limit) is still effectively in place, we avoid the
set/unshare race and bumps of global (ancestral) limit have the desired
effect on pid namespace that do not care.
Takashi Iwai [Tue, 4 Mar 2025 13:41:57 +0000 (14:41 +0100)]
drm/bochs: Fix DPMS regression
The recent rewrite with the use of regular atomic helpers broke the
DPMS unblanking on X11. Fix it by moving the call of
bochs_hw_blank(false) from CRTC mode_set_nofb() to atomic_enable().
net: dsa: mt7530: Fix traffic flooding for MMIO devices
On MMIO devices (e.g. MT7988 or EN7581) unicast traffic received on lanX
port is flooded on all other user ports if the DSA switch is configured
without VLAN support since PORT_MATRIX in PCR regs contains all user
ports. Similar to MDIO devices (e.g. MT7530 and MT7531) fix the issue
defining default VLAN-ID 0 for MT7530 MMIO devices.
Fixes: 110c18bfed414 ("net: dsa: mt7530: introduce driver for MT7988 built-in switch") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Chester A. Unal <chester.a.unal@arinc9.com> Link: https://patch.msgid.link/20250304-mt7988-flooding-fix-v1-1-905523ae83e9@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jens Axboe [Wed, 5 Mar 2025 21:03:34 +0000 (14:03 -0700)]
io_uring/rw: ensure reissue path is correctly handled for IOPOLL
The IOPOLL path posts CQEs when the io_kiocb is marked as completed,
so it cannot rely on the usual retry that non-IOPOLL requests do for
read/write requests.
If -EAGAIN is received and the request should be retried, go through
the normal completion path and let the normal flush logic catch it and
reissue it, like what is done for !IOPOLL reads or writes.
drm/xe/userptr: Unmap userptrs in the mmu notifier
If userptr pages are freed after a call to the xe mmu notifier,
the device will not be blocked out from theoretically accessing
these pages unless they are also unmapped from the iommu, and
this violates some aspects of the iommu-imposed security.
Ensure that userptrs are unmapped in the mmu notifier to
mitigate this. A naive attempt would try to free the sg table, but
the sg table itself may be accessed by a concurrent bind
operation, so settle for only unmapping.
drm/xe/hmm: Don't dereference struct page pointers without notifier lock
The pnfs that we obtain from hmm_range_fault() point to pages that
we don't have a reference on, and the guarantee that they are still
in the cpu page-tables is that the notifier lock must be held and the
notifier seqno is still valid.
So while building the sg table and marking the pages accesses / dirty
we need to hold this lock with a validated seqno.
However, the lock is reclaim tainted which makes
sg_alloc_table_from_pages_segment() unusable, since it internally
allocates memory.
Instead build the sg-table manually. For the non-iommu case
this might lead to fewer coalesces, but if that's a problem it can
be fixed up later in the resource cursor code. For the iommu case,
the whole sg-table may still be coalesced to a single contigous
device va region.
This avoids marking pages that we don't own dirty and accessed, and
it also avoid dereferencing struct pages that we don't own.
v2:
- Use assert to check whether hmm pfns are valid (Matthew Auld)
- Take into account that large pages may cross range boundaries
(Matthew Auld)
v3:
- Don't unnecessarily check for a non-freed sg-table. (Matthew Auld)
- Add a missing up_read() in an error path. (Matthew Auld)
Add proper #ifndef around the xe_hmm.h header, proper spacing
and since the documentation mostly follows kerneldoc format,
make it kerneldoc. Also prepare for upcoming -stable fixes.
Matthew Brost [Fri, 28 Feb 2025 07:30:58 +0000 (08:30 +0100)]
drm/xe: Add staging tree for VM binds
Concurrent VM bind staging and zapping of PTEs from a userptr notifier
do not work because the view of PTEs is not stable. VM binds cannot
acquire the notifier lock during staging, as memory allocations are
required. To resolve this race condition, use a staging tree for VM
binds that is committed only under the userptr notifier lock during the
final step of the bind. This ensures a consistent view of the PTEs in
the userptr notifier.
A follow up may only use staging for VM in fault mode as this is the
only mode in which the above race exists.
v3:
- Drop zap PTE change (Thomas)
- s/xe_pt_entry/xe_pt_entry_staging (Thomas)
Suggested-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: <stable@vger.kernel.org> Fixes: e8babb280b5e ("drm/xe: Convert multiple bind ops into single job") Fixes: a708f6501c69 ("drm/xe: Update PT layer with better error handling") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250228073058.59510-5-thomas.hellstrom@linux.intel.com Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
(cherry picked from commit 6f39b0c5ef0385eae586760d10b9767168037aa5) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Thomas Hellström [Fri, 28 Feb 2025 07:30:57 +0000 (08:30 +0100)]
drm/xe: Fix fault mode invalidation with unbind
Fix fault mode invalidation racing with unbind leading to the
PTE zapping potentially traversing an invalid page-table tree.
Do this by holding the notifier lock across PTE zapping. This
might transfer any contention waiting on the notifier seqlock
read side to the notifier lock read side, but that shouldn't be
a major problem.
At the same time get rid of the open-coded invalidation in the bind
code by relying on the notifier even when the vma bind is not
yet committed.
Finally let userptr invalidation call a dedicated xe_vm function
performing a full invalidation.
Fixes: e8babb280b5e ("drm/xe: Convert multiple bind ops into single job") Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: <stable@vger.kernel.org> # v6.12+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250228073058.59510-4-thomas.hellstrom@linux.intel.com
(cherry picked from commit 100a5b8dadfca50d91d9a4c9fc01431b42a25cab) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Thomas Hellström [Fri, 28 Feb 2025 07:30:56 +0000 (08:30 +0100)]
drm/xe/vm: Fix a misplaced #endif
Fix a (harmless) misplaced #endif leading to declarations
appearing multiple times.
Fixes: 0eb2a18a8fad ("drm/xe: Implement VM snapshot support for BO's and userptr") Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Cc: <stable@vger.kernel.org> # v6.12+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250228073058.59510-3-thomas.hellstrom@linux.intel.com
(cherry picked from commit fcc20a4c752214b3e25632021c57d7d1d71ee1dd) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Thomas Hellström [Fri, 28 Feb 2025 07:30:55 +0000 (08:30 +0100)]
drm/xe/vm: Validate userptr during gpu vma prefetching
If a userptr vma subject to prefetching was already invalidated
or invalidated during the prefetch operation, the operation would
repeatedly return -EAGAIN which would typically cause an infinite
loop.
Validate the userptr to ensure this doesn't happen.
v2:
- Don't fallthrough from UNMAP to PREFETCH (Matthew Brost)
Fixes: 5bd24e78829a ("drm/xe/vm: Subclass userptr vmas") Fixes: 617eebb9c480 ("drm/xe: Fix array of binds") Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.9+ Suggested-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250228073058.59510-2-thomas.hellstrom@linux.intel.com
(cherry picked from commit 03c346d4d0d85d210d549d43c8cfb3dfb7f20e0a) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Dan Carpenter [Wed, 5 Mar 2025 15:52:59 +0000 (18:52 +0300)]
nvme-tcp: fix signedness bug in nvme_tcp_init_connection()
The kernel_recvmsg() function returns an int which could be either
negative error codes or the number of bytes received. The problem is
that the condition:
if (ret < sizeof(*icresp)) {
is type promoted to type unsigned long and negative values are treated
as high positive values which is success, when they should be treated as
failure. Handle invalid positive returns separately from negative
error codes to avoid this problem.
Linus Torvalds [Wed, 5 Mar 2025 17:35:40 +0000 (07:35 -1000)]
fs/pipe: remove buggy and unused 'helper' function
While looking for incorrect users of the pipe head/tail fields (see
commit c27c66afc449: "fs/pipe: Fix pipe_occupancy() with 16-bit
indexes"), I found a bug in pipe_discard_from() that looked entirely
broken.
However, the fix is trivial: this buggy function isn't actually called
by anything, so let's just remove it ASAP.
Linus Torvalds [Wed, 5 Mar 2025 17:08:09 +0000 (07:08 -1000)]
fs/pipe: Fix pipe_occupancy() with 16-bit indexes
The pipe_occupancy() logic implicitly relied on the natural unsigned
modulo arithmetic in C, but that doesn't work for the new 'pipe_index_t'
case, since any arithmetic will be done in 'int' (and here we had also
made it 'unsigned int' due to the function call boundary).
So make the modulo arithmetic explicit by casting the result to the
proper type.
Andrew Martin [Fri, 28 Feb 2025 16:26:48 +0000 (11:26 -0500)]
drm/amdkfd: Fix NULL Pointer Dereference in KFD queue
Through KFD IOCTL Fuzzing we encountered a NULL pointer derefrence
when calling kfd_queue_acquire_buffers.
Fixes: 629568d25fea ("drm/amdkfd: Validate queue cwsr area and eop buffer size") Signed-off-by: Andrew Martin <Andrew.Martin@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Andrew Martin <Andrew.Martin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 049e5bf3c8406f87c3d8e1958e0a16804fa1d530) Cc: stable@vger.kernel.org
Ma Ke [Wed, 26 Feb 2025 08:37:31 +0000 (16:37 +0800)]
drm/amd/display: Fix null check for pipe_ctx->plane_state in resource_build_scaling_params
Null pointer dereference issue could occur when pipe_ctx->plane_state
is null. The fix adds a check to ensure 'pipe_ctx->plane_state' is not
null before accessing. This prevents a null pointer dereference.
Found by code review.
Fixes: 3be5262e353b ("drm/amd/display: Rename more dc_surface stuff to plane_state") Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Ma Ke <make24@iscas.ac.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 63e6a77ccf239337baa9b1e7787cde9fa0462092) Cc: stable@vger.kernel.org
Zecheng Li [Tue, 4 Mar 2025 21:40:31 +0000 (21:40 +0000)]
sched/fair: Fix potential memory corruption in child_cfs_rq_on_list
child_cfs_rq_on_list attempts to convert a 'prev' pointer to a cfs_rq.
This 'prev' pointer can originate from struct rq's leaf_cfs_rq_list,
making the conversion invalid and potentially leading to memory
corruption. Depending on the relative positions of leaf_cfs_rq_list and
the task group (tg) pointer within the struct, this can cause a memory
fault or access garbage data.
The issue arises in list_add_leaf_cfs_rq, where both
cfs_rq->leaf_cfs_rq_list and rq->leaf_cfs_rq_list are added to the same
leaf list. Also, rq->tmp_alone_branch can be set to rq->leaf_cfs_rq_list.
This adds a check `if (prev == &rq->leaf_cfs_rq_list)` after the main
conditional in child_cfs_rq_on_list. This ensures that the container_of
operation will convert a correct cfs_rq struct.
This check is sufficient because only cfs_rqs on the same CPU are added
to the list, so verifying the 'prev' pointer against the current rq's list
head is enough.
Fixes a potential memory corruption issue that due to current struct
layout might not be manifesting as a crash but could lead to unpredictable
behavior when the layout changes.
Fixes: fdaba61ef8a2 ("sched/fair: Ensure that the CFS parent is added after unthrottling") Signed-off-by: Zecheng Li <zecheng@google.com> Reviewed-and-tested-by: K Prateek Nayak <kprateek.nayak@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Link: https://lore.kernel.org/r/20250304214031.2882646-1-zecheng@google.com
Olivier Gayot [Wed, 5 Mar 2025 02:21:54 +0000 (10:21 +0800)]
block: fix conversion of GPT partition name to 7-bit
The utf16_le_to_7bit function claims to, naively, convert a UTF-16
string to a 7-bit ASCII string. By naively, we mean that it:
* drops the first byte of every character in the original UTF-16 string
* checks if all characters are printable, and otherwise replaces them
by exclamation mark "!".
This means that theoretically, all characters outside the 7-bit ASCII
range should be replaced by another character. Examples:
* lower-case alpha (ɒ) 0x0252 becomes 0x52 (R)
* ligature OE (œ) 0x0153 becomes 0x53 (S)
* hangul letter pieup (ㅂ) 0x3142 becomes 0x42 (B)
* upper-case gamma (Ɣ) 0x0194 becomes 0x94 (not printable) so gets
replaced by "!"
The result of this conversion for the GPT partition name is passed to
user-space as PARTNAME via udev, which is confusing and feels questionable.
However, there is a flaw in the conversion function itself. By dropping
one byte of each character and using isprint() to check if the remaining
byte corresponds to a printable character, we do not actually guarantee
that the resulting character is 7-bit ASCII.
This happens because we pass 8-bit characters to isprint(), which
in the kernel returns 1 for many values > 0x7f - as defined in ctype.c.
This results in many values which should be replaced by "!" to be kept
as-is, despite not being valid 7-bit ASCII. Examples:
* e with acute accent (é) 0x00E9 becomes 0xE9 - kept as-is because
isprint(0xE9) returns 1.
* euro sign (€) 0x20AC becomes 0xAC - kept as-is because isprint(0xAC)
returns 1.
This way has broken pyudev utility[1], fixes it by using a mask of 7 bits
instead of 8 bits before calling isprint.
Uday Shankar [Tue, 4 Mar 2025 21:34:26 +0000 (14:34 -0700)]
ublk: set_params: properly check if parameters can be applied
The parameters set by the set_params call are only applied to the block
device in the start_dev call. So if a device has already been started, a
subsequently issued set_params on that device will not have the desired
effect, and should return an error. There is an existing check for this
- set_params fails on devices in the LIVE state. But this check is not
sufficient to cover the recovery case. In this case, the device will be
in the QUIESCED or FAIL_IO states, so set_params will succeed. But this
success is misleading, because the parameters will not be applied, since
the device has already been started (by a previous ublk server). The bit
UB_STATE_USED is set on completion of the start_dev; use it to detect
and fail set_params commands which arrive too late to be applied (after
start_dev).
Jason Xing [Tue, 4 Mar 2025 00:44:29 +0000 (08:44 +0800)]
net-timestamp: support TCP GSO case for a few missing flags
When I read through the TSO codes, I found out that we probably
miss initializing the tx_flags of last seg when TSO is turned
off, which means at the following points no more timestamp
(for this last one) will be generated. There are three flags
to be handled in this patch:
1. SKBTX_HW_TSTAMP
2. SKBTX_BPF
3. SKBTX_SCHED_TSTAMP
Note that SKBTX_BPF[1] was added in 6.14.0-rc2 by commit 6b98ec7e882af ("bpf: Add BPF_SOCK_OPS_TSTAMP_SCHED_CB callback")
and only belongs to net-next branch material for now. The common
issue of the above three flags can be fixed by this single patch.
This patch initializes the tx_flags to SKBTX_ANY_TSTAMP like what
the UDP GSO does to make the newly segmented last skb inherit the
tx_flags so that requested timestamp will be generated in each
certain layer, or else that last one has zero value of tx_flags
which leads to no timestamp at all.
Fixes: 4ed2d765dfacc ("net-timestamp: TCP timestamping") Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Sandeen [Tue, 11 Feb 2025 20:14:21 +0000 (14:14 -0600)]
exfat: short-circuit zero-byte writes in exfat_file_write_iter
When generic_write_checks() returns zero, it means that
iov_iter_count() is zero, and there is no work to do.
Simply return success like all other filesystems do, rather than
proceeding down the write path, which today yields an -EFAULT in
generic_perform_write() via the
(fault_in_iov_iter_readable(i, bytes) == bytes) check when bytes
== 0.
Fixes: 11a347fb6cef ("exfat: change to get file size from DataLength") Reported-by: Noah <kernel-org-10@maxgrass.eu> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Namjae Jeon [Fri, 31 Jan 2025 03:55:55 +0000 (12:55 +0900)]
exfat: fix soft lockup in exfat_clear_bitmap
bitmap clear loop will take long time in __exfat_free_cluster()
if data size of file/dir enty is invalid.
If cluster bit in bitmap is already clear, stop clearing bitmap go to
out of loop.
Fixes: 31023864e67a ("exfat: add fat entry operations") Reported-by: Kun Hu <huk23@m.fudan.edu.cn>, Jiaji Qin <jjtan24@m.fudan.edu.cn> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Yuezhang Mo [Fri, 22 Nov 2024 02:50:55 +0000 (10:50 +0800)]
exfat: fix just enough dentries but allocate a new cluster to dir
This commit fixes the condition for allocating cluster to parent
directory to avoid allocating new cluster to parent directory when
there are just enough empty directory entries at the end of the
parent directory.
Fixes: af02c72d0b62 ("exfat: convert exfat_find_empty_entry() to use dentry cache") Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Niklas Söderlund [Tue, 21 Jan 2025 13:58:33 +0000 (14:58 +0100)]
gpio: rcar: Use raw_spinlock to protect register access
Use raw_spinlock in order to fix spurious messages about invalid context
when spinlock debugging is enabled. The lock is only used to serialize
register access.
Koichiro Den [Mon, 24 Feb 2025 14:31:26 +0000 (23:31 +0900)]
gpio: aggregator: protect driver attr handlers against module unload
Both new_device_store and delete_device_store touch module global
resources (e.g. gpio_aggregator_lock). To prevent race conditions with
module unload, a reference needs to be held.
Add try_module_get() in these handlers.
For new_device_store, this eliminates what appears to be the most dangerous
scenario: if an id is allocated from gpio_aggregator_idr but
platform_device_register has not yet been called or completed, a concurrent
module unload could fail to unregister/delete the device, leaving behind a
dangling platform device/GPIO forwarder. This can result in various issues.
The following simple reproducer demonstrates these problems:
#!/bin/bash
while :; do
# note: whether 'gpiochip0 0' exists or not does not matter.
echo 'gpiochip0 0' > /sys/bus/platform/drivers/gpio-aggregator/new_device
done &
while :; do
modprobe gpio-aggregator
modprobe -r gpio-aggregator
done &
wait
Starting with the following warning, several kinds of warnings will appear
and the system may become unstable:
platform/x86/amd/pmf: Update PMF Driver for Compatibility with new PMF-TA
The PMF driver allocates a shared memory buffer using
tee_shm_alloc_kernel_buf() for communication with the PMF-TA.
The latest PMF-TA version introduces new structures with OEM debug
information and additional policy input conditions for evaluating the
policy binary. Consequently, the shared memory size must be increased to
ensure compatibility between the PMF driver and the updated PMF-TA.
To do so, introduce the new PMF-TA UUID and update the PMF shared memory
configuration to ensure compatibility with the latest PMF-TA version.
Additionally, export the TA UUID.
These updates will result in modifications to the prototypes of
amd_pmf_tee_init() and amd_pmf_ta_open_session().
In the amd_pmf_invoke_cmd_init() function within the PMF driver ensure
that the actual result from the PMF-TA is returned rather than a generic
EIO. This change allows for proper handling of errors originating from the
PMF-TA.
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Co-developed-by: Patil Rajesh Reddy <Patil.Reddy@amd.com> Signed-off-by: Patil Rajesh Reddy <Patil.Reddy@amd.com> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Link: https://lore.kernel.org/r/20250305045842.4117767-1-Shyam-sundar.S-k@amd.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Hoku Ishibe [Mon, 24 Feb 2025 02:05:17 +0000 (21:05 -0500)]
ALSA: hda: intel: Add Dell ALC3271 to power_save denylist
Dell XPS 13 7390 with the Realtek ALC3271 codec experiences
persistent humming noise when the power_save mode is enabled.
This issue occurs when the codec enters power saving mode,
leading to unwanted noise from the speakers.
This patch adds the affected model (PCI ID 0x1028:0x0962) to the
power_save denylist to ensure power_save is disabled by default,
preventing power-off related noise issues.
Steps to Reproduce
1. Boot the system with `snd_hda_intel` loaded.
2. Verify that `power_save` mode is enabled:
```sh
cat /sys/module/snd_hda_intel/parameters/power_save
````
output: 10 (default power save timeout)
3. Wait for the power save timeout
4. Observe a persistent humming noise from the speakers
5. Disable `power_save` manually:
```sh
echo 0 | sudo tee /sys/module/snd_hda_intel/parameters/power_save
````
6. Confirm that the noise disappears immediately.
This issue has been observed on my system, and this patch
successfully eliminates the unwanted noise. If other users
experience similar issues, additional reports would be helpful.
Jarkko Sakkinen [Wed, 5 Mar 2025 05:00:05 +0000 (07:00 +0200)]
x86/sgx: Fix size overflows in sgx_encl_create()
The total size calculated for EPC can overflow u64 given the added up page
for SECS. Further, the total size calculated for shmem can overflow even
when the EPC size stays within limits of u64, given that it adds the extra
space for 128 byte PCMD structures (one for each page).
Address this by pre-evaluating the micro-architectural requirement of
SGX: the address space size must be power of two. This is eventually
checked up by ECREATE but the pre-check has the additional benefit of
making sure that there is some space for additional data.
Fixes: 888d24911787 ("x86/sgx: Add SGX_IOC_ENCLAVE_CREATE") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Dave Hansen <dave.hansen@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lore.kernel.org/r/20250305050006.43896-1-jarkko@kernel.org Closes: https://lore.kernel.org/linux-sgx/c87e01a0-e7dd-4749-a348-0980d3444f04@stanley.mountain/
Linus Torvalds [Wed, 5 Mar 2025 05:05:53 +0000 (19:05 -1000)]
Merge tag 'x86_microcode_for_v6.14_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull AMD microcode loading fixes from Borislav Petkov:
- Load only sha256-signed microcode patch blobs
- Other good cleanups
* tag 'x86_microcode_for_v6.14_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/microcode/AMD: Load only SHA256-checksummed patches
x86/microcode/AMD: Add get_patch_level()
x86/microcode/AMD: Get rid of the _load_microcode_amd() forward declaration
x86/microcode/AMD: Merge early_apply_microcode() into its single callsite
x86/microcode/AMD: Remove unused save_microcode_in_initrd_amd() declarations
x86/microcode/AMD: Remove ugly linebreak in __verify_patch_section() signature
mptcp: fix 'scheduling while atomic' in mptcp_pm_nl_append_new_local_addr
If multiple connection requests attempt to create an implicit mptcp
endpoint in parallel, more than one caller may end up in
mptcp_pm_nl_append_new_local_addr because none found the address in
local_addr_list during their call to mptcp_pm_nl_get_local_id. In this
case, the concurrent new_local_addr calls may delete the address entry
created by the previous caller. These deletes use synchronize_rcu, but
this is not permitted in some of the contexts where this function may be
called. During packet recv, the caller may be in a rcu read critical
section and have preemption disabled.
An example stack:
BUG: scheduling while atomic: swapper/2/0/0x00000302
This problem seems particularly prevalent if the user advertises an
endpoint that has a different external vs internal address. In the case
where the external address is advertised and multiple connections
already exist, multiple subflow SYNs arrive in parallel which tends to
trigger the race during creation of the first local_addr_list entries
which have the internal address instead.
Fix by skipping the replacement of an existing implicit local address if
called via mptcp_pm_nl_get_local_id.
So, when tb is NULL (such as in the ethnl notify path), we have a nice
crash.
It turns out that there's only the PLCA command that is in that case, as
the other phydev-specific commands don't have a notification.
This commit fixes the crash by passing the cmd index and the nlattr
array separately, allowing NULL-checking it directly inside the helper.
Fixes: c15e065b46dc ("net: ethtool: Allow passing a phy index for some commands") Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Kory Maincent <kory.maincent@bootlin.com> Reported-by: Parthiban Veerasooran <parthiban.veerasooran@microchip.com> Link: https://patch.msgid.link/20250301141114.97204-1-maxime.chevallier@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiayuan Chen [Fri, 28 Feb 2025 14:14:08 +0000 (22:14 +0800)]
ppp: Fix KMSAN uninit-value warning with bpf
Syzbot caught an "KMSAN: uninit-value" warning [1], which is caused by the
ppp driver not initializing a 2-byte header when using socket filter.
The following code can generate a PPP filter BPF program:
'''
struct bpf_program fp;
pcap_t *handle;
handle = pcap_open_dead(DLT_PPP_PPPD, 65535);
pcap_compile(handle, &fp, "ip and outbound", 0, 0);
bpf_dump(&fp, 1);
'''
Its output is:
'''
(000) ldh [2]
(001) jeq #0x21 jt 2 jf 5
(002) ldb [0]
(003) jeq #0x1 jt 4 jf 5
(004) ret #65535
(005) ret #0
'''
Wen can find similar code at the following link:
https://github.com/ppp-project/ppp/blob/master/pppd/options.c#L1680
The maintainer of this code repository is also the original maintainer
of the ppp driver.
As you can see the BPF program skips 2 bytes of data and then reads the
'Protocol' field to determine if it's an IP packet. Then it read the first
byte of the first 2 bytes to determine the direction.
The issue is that only the first byte indicating direction is initialized
in current ppp driver code while the second byte is not initialized.
For normal BPF programs generated by libpcap, uninitialized data won't be
used, so it's not a problem. However, for carefully crafted BPF programs,
such as those generated by syzkaller [2], which start reading from offset
0, the uninitialized data will be used and caught by KMSAN.
Jakub Kicinski [Wed, 5 Mar 2025 00:19:24 +0000 (16:19 -0800)]
Merge branch 'fixes-for-ipa-v4-7'
Luca Weiss says:
====================
Fixes for IPA v4.7
During bringup of IPA v4.7 unfortunately some bits were missed, and it
couldn't be tested much back then due to missing features in tqftpserv
which caused the modem to not enable correctly.
Especially the last commit is important since it makes mobile data
actually functional on SoCs with IPA v4.7 like SM6350 - used on the
Fairphone 4. Before that, you'd get an IP address on the interface but
then e.g. ping never got any response back.
====================
Luca Weiss [Thu, 27 Feb 2025 10:33:42 +0000 (11:33 +0100)]
net: ipa: Enable checksum for IPA_ENDPOINT_AP_MODEM_{RX,TX} for v4.7
Enable the checksum option for these two endpoints in order to allow
mobile data to actually work. Without this, no packets seem to make it
through the IPA.