PCI xHC host should be stopped and xhci driver memory freed before putting
host to PCI D3 state during PCI remove callback.
Hosts with XHCI_SPURIOUS_WAKEUP quirk did this the wrong way around
and set the host to D3 before calling usb_hcd_pci_remove(dev), which will
access the host to stop it, and then free xhci.
Fixes: f1f6d9a8b540 ("xhci: don't dereference a xhci member after removing xhci") Cc: stable@vger.kernel.org Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20240905143300.1959279-12-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
There is a general misunderstanding amongst engineers that {v}snprintf()
returns the length of the data *actually* encoded into the destination
array. However, as per the C99 standard {v}snprintf() really returns
the length of the data that *would have been* written if there were
enough space for it. This misunderstanding has led to buffer-overruns
in the past. It's generally considered safer to use the {v}scnprintf()
variants in their place (or even sprintf() in simple cases). So let's
do that.
Whilst we're at it, let's define some magic numbers to increase
readability and ease of maintenance.
MSGF_LEG_MASK is laid out with INTA in bit 0, INTB in bit 1, INTC in bit 2,
and INTD in bit 3. Hardware IRQ numbers start at 0, and we register
PCI_NUM_INTX IRQs. So to enable INTA (aka hwirq 0) we should set bit 0.
Remove the subtraction of one.
This bug would cause INTx interrupts not to be delivered, as enabling INTB
would actually enable INTA, and enabling INTA wouldn't enable anything at
all. It is likely that this got overlooked for so long since most PCIe
hardware uses MSIs. This fixes the following UBSAN error:
UBSAN: shift-out-of-bounds in ../drivers/pci/controller/pcie-xilinx-nwl.c:389:11
shift exponent 18446744073709551615 is too large for 32-bit type 'int'
CPU: 1 PID: 61 Comm: kworker/u10:1 Not tainted 6.6.20+ #268
Hardware name: xlnx,zynqmp (DT)
Workqueue: events_unbound deferred_probe_work_func
Call trace:
dump_backtrace (arch/arm64/kernel/stacktrace.c:235)
show_stack (arch/arm64/kernel/stacktrace.c:242)
dump_stack_lvl (lib/dump_stack.c:107)
dump_stack (lib/dump_stack.c:114)
__ubsan_handle_shift_out_of_bounds (lib/ubsan.c:218 lib/ubsan.c:387)
nwl_unmask_leg_irq (drivers/pci/controller/pcie-xilinx-nwl.c:389 (discriminator 1))
irq_enable (kernel/irq/internals.h:234 kernel/irq/chip.c:170 kernel/irq/chip.c:439 kernel/irq/chip.c:432 kernel/irq/chip.c:345)
__irq_startup (kernel/irq/internals.h:239 kernel/irq/chip.c:180 kernel/irq/chip.c:250)
irq_startup (kernel/irq/chip.c:270)
__setup_irq (kernel/irq/manage.c:1800)
request_threaded_irq (kernel/irq/manage.c:2206)
pcie_pme_probe (include/linux/interrupt.h:168 drivers/pci/pcie/pme.c:348)
Fixes: 9a181e1093af ("PCI: xilinx-nwl: Modify IRQ chip for legacy interrupts") Link: https://lore.kernel.org/r/20240531161337.864994-3-sean.anderson@linux.dev Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
Going through a full irq descriptor lookup instead of just using the proper
helper function which provides direct access is suboptimal.
In fact it _is_ wrong because the chip callback needs to get the chip data
which is relevant for the chip while using the irq descriptor variant
returns the irq chip data of the top level chip of a hierarchy. It does not
matter in this case because the chip is the top level chip, but that
doesn't make it more correct.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Rob Herring <robh@kernel.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20201210194044.364211860@linutronix.de
Stable-dep-of: 0199d2f2bd8c ("PCI: xilinx-nwl: Fix off-by-one in INTx IRQ handler") Signed-off-by: Sasha Levin <sashal@kernel.org>
When submitting more than 2^32 padata objects to padata_do_serial, the
current sorting implementation incorrectly sorts padata objects with
overflowed seq_nr, causing them to be placed before existing objects in
the reorder list. This leads to a deadlock in the serialization process
as padata_find_next cannot match padata->seq_nr and pd->processed
because the padata instance with overflowed seq_nr will be selected
next.
To fix this, we use an unsigned integer wrap around to correctly sort
padata objects in scenarios with integer overflow.
Fixes: bfde23ce200e ("padata: unbind parallel jobs from specific CPUs") Cc: <stable@vger.kernel.org> Co-developed-by: Christian Gafert <christian.gafert@rohde-schwarz.com> Signed-off-by: Christian Gafert <christian.gafert@rohde-schwarz.com> Co-developed-by: Max Ferger <max.ferger@rohde-schwarz.com> Signed-off-by: Max Ferger <max.ferger@rohde-schwarz.com> Signed-off-by: Van Giang Nguyen <vangiang.nguyen@rohde-schwarz.com> Acked-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit c77e22834ae9 ("NFSv4: Fix a potential sleep while atomic in
nfs4_do_reclaim()") separate out the freeing of the state owners from
nfs4_purge_state_owners() and finish it outside the rcu lock.
However, the error path is omitted. As a result, the state owners in
"freeme" will not be released.
Fix it by adding freeing in the error path.
Fixes: c77e22834ae9 ("NFSv4: Fix a potential sleep while atomic in nfs4_do_reclaim()") Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> Cc: stable@vger.kernel.org # v5.3+ Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The fcntl's F_SETOWN command sets the process that handle SIGIO/SIGURG
for the related file descriptor. Before this change, the
file_set_fowner LSM hook was always called, ignoring the VFS logic which
may not actually change the process that handles SIGIO (e.g. TUN, TTY,
dnotify), nor update the related UID/EUID.
Moreover, because security_file_set_fowner() was called without lock
(e.g. f_owner.lock), concurrent F_SETOWN commands could result to a race
condition and inconsistent LSM states (e.g. SELinux's fown_sid) compared
to struct fown_struct's UID/EUID.
This change makes sure the LSM states are always in sync with the VFS
state by moving the security_file_set_fowner() call close to the
UID/EUID updates and using the same f_owner.lock .
Rename f_modown() to __f_setown() to simplify code.
Cc: stable@vger.kernel.org Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Casey Schaufler <casey@schaufler-ca.com> Cc: Christian Brauner <brauner@kernel.org> Cc: James Morris <jmorris@namei.org> Cc: Jann Horn <jannh@google.com> Cc: Ondrej Mosnacek <omosnace@redhat.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Serge E. Hallyn <serge@hallyn.com> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Mickaël Salaün <mic@digikod.net> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Recently I noticed a bug[1] in btrfs, after digged it into
and I believe it'a race in vfs.
Let's assume there's a inode (ie ino 261) with i_count 1 is
called by iput(), and there's a concurrent thread calling
generic_shutdown_super().
cpu0: cpu1:
iput() // i_count is 1
->spin_lock(inode)
->dec i_count to 0
->iput_final() generic_shutdown_super()
->__inode_add_lru() ->evict_inodes()
// cause some reason[2] ->if (atomic_read(inode->i_count)) continue;
// return before // inode 261 passed the above check
// list_lru_add_obj() // and then schedule out
->spin_unlock()
// note here: the inode 261
// was still at sb list and hash list,
// and I_FREEING|I_WILL_FREE was not been set
btrfs_iget()
// after some function calls
->find_inode()
// found the above inode 261
->spin_lock(inode)
// check I_FREEING|I_WILL_FREE
// and passed
->__iget()
->spin_unlock(inode) // schedule back
->spin_lock(inode)
// check (I_NEW|I_FREEING|I_WILL_FREE) flags,
// passed and set I_FREEING
iput() ->spin_unlock(inode)
->spin_lock(inode) ->evict()
// dec i_count to 0
->iput_final()
->spin_unlock()
->evict()
Now, we have two threads simultaneously evicting
the same inode, which may trigger the BUG(inode->i_state & I_CLEAR)
statement both within clear_inode() and iput().
To fix the bug, recheck the inode->i_count after holding i_lock.
Because in the most scenarios, the first check is valid, and
the overhead of spin_lock() can be reduced.
If there is any misunderstanding, please let me know, thanks.
[1]: https://lore.kernel.org/linux-btrfs/000000000000eabe1d0619c48986@google.com/
[2]: The reason might be 1. SB_ACTIVE was removed or 2. mapping_shrinkable()
return false when I reproduced the bug.
Reported-by: syzbot+67ba3c42bcbb4665d3ad@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=67ba3c42bcbb4665d3ad CC: stable@vger.kernel.org Fixes: 63997e98a3be ("split invalidate_inodes()") Signed-off-by: Julian Sun <sunjunchao2870@gmail.com> Link: https://lore.kernel.org/r/20240823130730.658881-1-sunjunchao2870@gmail.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
All batches of the Pine64 Pinebook Pro, except the latest batch (as of 2024)
whose hardware design was revised due to the component shortage, use a 1S
lithium battery whose nominal/design capacity is 10,000 mAh, according to the
battery datasheet. [1][2] Let's correct the design full-charge value in the
Pinebook Pro board dts, to improve the accuracy of the hardware description,
and to hopefully improve the accuracy of the fuel gauge a bit on all units
that don't belong to the latest batch.
The above-mentioned latest batch uses a different 1S lithium battery with
a slightly lower capacity, more precisely 9,600 mAh. To make the fuel gauge
work reliably on the latest batch, a sample battery would need to be sent to
CellWise, to obtain its proprietary battery profile, whose data goes into
"cellwise,battery-profile" in the Pinebook Pro board dts. Without that data,
the fuel gauge reportedly works unreliably, so changing the design capacity
won't have any negative effects on the already unreliable operation of the
fuel gauge in the Pinebook Pros that belong to the latest batch.
According to the battery datasheet, its voltage can go as low as 2.75 V while
discharging, but it's better to leave the current 3.0 V value in the dts file,
because of the associated Pinebook Pro's voltage regulation issues.
Increase the frequency of the PWM signal that drives the LED backlight of
the Pinebook Pro's panel, from about 1.35 KHz (which equals to the PWM
period of 740,740 ns), to exactly 8 kHz (which equals to the PWM period of
125,000 ns). Using a higher PWM frequency for the panel backlight, which
reduces the flicker, can only be beneficial to the end users' eyes.
On top of that, increasing the backlight PWM signal frequency reportedly
eliminates the buzzing emitted from the Pinebook Pro's built-in speakers
when certain backlight levels are set, which cause some weird interference
with some of the components of the Pinebook Pro's audio chain.
The old value for the backlight PWM period, i.e. 740,740 ns, is pretty much
an arbitrary value that was selected during the very early bring-up of the
Pinebook Pro, only because that value seemed to minimize horizontal line
distortion on the display, which resulted from the old X.org drivers causing
screen tearing when dragging windows around. That's no longer an issue, so
there are no reasons to stick with the old PWM period value.
The lower and the upper backlight PWM frequency limits for the Pinebook Pro's
panel, according to its datasheet, are 200 Hz and 10 kHz, respectively. [1]
These changes still leave some headroom, which may have some positive effects
on the lifetime expectancy of the panel's backlight LEDs.
The result of multiplication between values derived from functions
dir_buckets() and bucket_blocks() *could* technically reach
2^30 * 2^2 = 2^32.
While unlikely to happen, it is prudent to ensure that it will not
lead to integer overflow. Thus, use mul_u32_u32() as it's more
appropriate to mitigate the issue.
Found by Linux Verification Center (linuxtesting.org) with static
analysis tool SVACE.
fill_pool() uses 'obj_pool_min_free' to decide whether objects should be
handed back to the kmem cache. But 'obj_pool_min_free' records the lowest
historical value of the number of objects in the object pool and not the
minimum number of objects which should be kept in the pool.
Use 'debug_objects_pool_min_level' instead, which holds the minimum number
which was scaled to the number of CPUs at boot time.
[ tglx: Massage change log ]
Fixes: d26bf5056fc0 ("debugobjects: Reduce number of pool_lock acquisitions in fill_pool()") Fixes: 36c4ead6f6df ("debugobjects: Add global free list and the counter") Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20240904133944.2124-3-thunder.leizhen@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
"iw dev wlp2s0 station dump" shows incorrect rx bitrate:
tx bitrate: 866.7 MBit/s VHT-MCS 9 80MHz short GI VHT-NSS 2
rx bitrate: 86.7 MBit/s VHT-MCS 9 VHT-NSS 1
This is because the RX band width is calculated incorrectly. Fix the
calculation according to the phydm_rxsc_2_bw() function from the
official drivers.
After:
tx bitrate: 866.7 MBit/s VHT-MCS 9 80MHz short GI VHT-NSS 2
rx bitrate: 390.0 MBit/s VHT-MCS 9 80MHz VHT-NSS 1
It also works correctly with the AP configured for 20 MHz and 40 MHz.
pt_event_snapshot_aux() uses pt->handle_nmi to determine if tracing
needs to be stopped, however tracing can still be going because
pt->handle_nmi is set to zero before tracing is stopped in pt_event_stop,
whereas pt_event_snapshot_aux() requires that tracing must be stopped in
order to copy a sample of trace from the buffer.
Instead call pt_config_stop() always, which anyway checks config for
RTIT_CTL_TRACEEN and does nothing if it is already clear.
Note pt_event_snapshot_aux() can continue to use pt->handle_nmi to
determine if the trace needs to be restarted afterwards.
The TPM event log table is a Linux specific construct, where the data
produced by the GetEventLog() boot service is cached in memory, and
passed on to the OS using an EFI configuration table.
The use of EFI_LOADER_DATA here results in the region being left
unreserved in the E820 memory map constructed by the EFI stub, and this
is the memory description that is passed on to the incoming kernel by
kexec, which is therefore unaware that the region should be reserved.
Even though the utility of the TPM2 event log after a kexec is
questionable, any corruption might send the parsing code off into the
weeds and crash the kernel. So let's use EFI_ACPI_RECLAIM_MEMORY
instead, which is always treated as reserved by the E820 conversion
logic.
Internal documentation suggest that the TUXEDO Polaris 15 Gen5 AMD might
have GMxXGxX as the board name instead of GMxXGxx.
Adding both to be on the safe side.
Signed-off-by: Werner Sembach <wse@tuxedocomputers.com> Cc: All applicable <stable@vger.kernel.org> Link: https://patch.msgid.link/20240910094008.1601230-1-wse@tuxedocomputers.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Only buffer objects are valid return values of _STR.
If something else is returned description_show() will access invalid
memory.
Fixes: d1efe3c324ea ("ACPI: Add new sysfs interface to export device description") Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://patch.msgid.link/20240709-acpi-sysfs-groups-v2-1-058ab0667fa8@weissschuh.net Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If the net_conf pointer is NULL and the code attempts to access its
fields without a check, it will lead to a null pointer dereference.
Add a NULL check before dereferencing the pointer.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
The violation of atomicity occurs when the drbd_uuid_set_bm function is
executed simultaneously with modifying the value of
device->ldev->md.uuid[UI_BITMAP]. Consider a scenario where, while
device->ldev->md.uuid[UI_BITMAP] passes the validity check when its
value is not zero, the value of device->ldev->md.uuid[UI_BITMAP] is
written to zero. In this case, the check in drbd_uuid_set_bm might refer
to the old value of device->ldev->md.uuid[UI_BITMAP] (before locking),
which allows an invalid value to pass the validity check, resulting in
inconsistency.
To address this issue, it is recommended to include the data validity
check within the locked section of the function. This modification
ensures that the value of device->ldev->md.uuid[UI_BITMAP] does not
change during the validation process, thereby maintaining its integrity.
This possible bug is found by an experimental static analysis tool
developed by our team. This tool analyzes the locking APIs to extract
function pairs that can be concurrently executed, and then analyzes the
instructions in the paired functions to identify possible concurrency
bugs including data races and atomicity violations.
Fixes: 9f2247bb9b75 ("drbd: Protect accesses to the uuid set with a spinlock") Cc: stable@vger.kernel.org Signed-off-by: Qiu-ji Chen <chenqiuji666@gmail.com> Reviewed-by: Philipp Reisner <philipp.reisner@linbit.com> Link: https://lore.kernel.org/r/20240913083504.10549-1-chenqiuji666@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In case of sev PLATFORM_STATUS failure, sev_get_api_version() fails
resulting in sev_data field of psp_master nulled out. This later becomes
a problem when unloading the ccp module because the device has not been
unregistered (via misc_deregister()) before clearing the sev_data field
of psp_master. As a result, on reloading the ccp module, a duplicate
device issue is encountered as can be seen from the dmesg log below.
on reloading ccp module via modprobe ccp
Call Trace:
<TASK>
dump_stack_lvl+0xd7/0xf0
dump_stack+0x10/0x20
sysfs_warn_dup+0x5c/0x70
sysfs_create_dir_ns+0xbc/0xd
kobject_add_internal+0xb1/0x2f0
kobject_add+0x7a/0xe0
? srso_alias_return_thunk+0x5/0xfbef5
? get_device_parent+0xd4/0x1e0
? __pfx_klist_children_get+0x10/0x10
device_add+0x121/0x870
? srso_alias_return_thunk+0x5/0xfbef5
device_create_groups_vargs+0xdc/0x100
device_create_with_groups+0x3f/0x60
misc_register+0x13b/0x1c0
sev_dev_init+0x1d4/0x290 [ccp]
psp_dev_init+0x136/0x300 [ccp]
sp_init+0x6f/0x80 [ccp]
sp_pci_probe+0x2a6/0x310 [ccp]
? srso_alias_return_thunk+0x5/0xfbef5
local_pci_probe+0x4b/0xb0
work_for_cpu_fn+0x1a/0x30
process_one_work+0x203/0x600
worker_thread+0x19e/0x350
? __pfx_worker_thread+0x10/0x10
kthread+0xeb/0x120
? __pfx_kthread+0x10/0x10
ret_from_fork+0x3c/0x60
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1a/0x30
</TASK>
kobject: kobject_add_internal failed for sev with -EEXIST, don't try to register things with the same name in the same directory.
ccp 0000:22:00.1: sev initialization failed
ccp 0000:22:00.1: psp initialization failed
ccp 0000:a2:00.1: no command queues available
ccp 0000:a2:00.1: psp enabled
Address this issue by unregistering the /dev/sev before clearing out
sev_data in case of PLATFORM_STATUS failure.
The write to RP2_GLOBAL_CMD followed by an immediate read of
RP2_GLOBAL_CMD in rp2_reset_asic() is intented to flush out the write,
however by then the device is already in reset and cannot respond to a
memory cycle access.
On platforms such as the Raspberry Pi 4 and others using the
pcie-brcmstb.c driver, any memory access to a device that cannot respond
is met with a fatal system error, rather than being substituted with all
1s as is usually the case on PC platforms.
Swapping the delay and the read ensures that the device has finished
resetting before we attempt to read from it.
Fixes: 7d9f49afa451 ("serial: rp2: New driver for Comtrol RocketPort 2 cards") Cc: stable <stable@kernel.org> Suggested-by: Jim Quinlan <james.quinlan@broadcom.com> Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://lore.kernel.org/r/20240906225435.707837-1-florian.fainelli@broadcom.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Most firmware names are hardcoded strings, or are constructed from fairly
constrained format strings where the dynamic parts are just some hex
numbers or such.
However, there are a couple codepaths in the kernel where firmware file
names contain string components that are passed through from a device or
semi-privileged userspace; the ones I could find (not counting interfaces
that require root privileges) are:
- lpfc_sli4_request_firmware_update() seems to construct the firmware
filename from "ModelName", a string that was previously parsed out of
some descriptor ("Vital Product Data") in lpfc_fill_vpd()
- nfp_net_fw_find() seems to construct a firmware filename from a model
name coming from nfp_hwinfo_lookup(pf->hwinfo, "nffw.partno"), which I
think parses some descriptor that was read from the device.
(But this case likely isn't exploitable because the format string looks
like "netronome/nic_%s", and there shouldn't be any *folders* starting
with "netronome/nic_". The previous case was different because there,
the "%s" is *at the start* of the format string.)
- module_flash_fw_schedule() is reachable from the
ETHTOOL_MSG_MODULE_FW_FLASH_ACT netlink command, which is marked as
GENL_UNS_ADMIN_PERM (meaning CAP_NET_ADMIN inside a user namespace is
enough to pass the privilege check), and takes a userspace-provided
firmware name.
(But I think to reach this case, you need to have CAP_NET_ADMIN over a
network namespace that a special kind of ethernet device is mapped into,
so I think this is not a viable attack path in practice.)
Fix it by rejecting any firmware names containing ".." path components.
For what it's worth, I went looking and haven't found any USB device
drivers that use the firmware loader dangerously.
Cc: stable@vger.kernel.org Reviewed-by: Danilo Krummrich <dakr@kernel.org> Fixes: abb139e75c2c ("firmware: teach the kernel to load firmware files directly from the filesystem") Signed-off-by: Jann Horn <jannh@google.com> Acked-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/20240828-firmware-traversal-v3-1-c76529c63b5f@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
TIOCGSERIAL is an ioctl. Thus it must be atomic. It returns
two values. Racing with set_serial it can return an inconsistent
result. The mutex must be taken.
In terms of logic the bug is as old as the driver. In terms of
code it goes back to the conversion to the get_serial and
set_serial methods.
Signed-off-by: Oliver Neukum <oneukum@suse.com> Cc: stable <stable@kernel.org> Fixes: 99f75a1fcd865 ("cdc-acm: switch to ->[sg]et_serial()") Link: https://lore.kernel.org/r/20240912141916.1044393-1-oneukum@suse.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is a small window during probing when IO is running
but the backlight is not registered. Processing events
during that time will crash. The completion handler
needs to check for a backlight before scheduling work.
[WHY]
The calculated vtotal may has 1 line deviation. To get precisely
vtotal number, round the vtotal result.
Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Anthony Koo <anthony.koo@amd.com> Signed-off-by: Robin Chen <robin.chen@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The Gen6 devices have the same problem and the same Solution as the Gen5
ones.
Some TongFang barebones have touchpad and/or keyboard issues after
suspend, fixable with nomux + reset + noloop + nopnp. Luckily, none of
them have an external PS/2 port so this can safely be set for all of
them.
I'm not entirely sure if every device listed really needs all four quirks,
but after testing and production use, no negative effects could be
observed when setting all four.
Some TongFang barebones have touchpad and/or keyboard issues after
suspend, fixable with nomux + reset + noloop + nopnp. Luckily, none of
them have an external PS/2 port so this can safely be set for all of
them.
I'm not entirely sure if every device listed really needs all four quirks,
but after testing and production use, no negative effects could be
observed when setting all four.
Return devm_of_clk_add_hw_provider() in order to transfer the error, if it
fails due to resource allocation failure or device tree clock provider
registration failure.
Cc: stable@vger.kernel.org Fixes: ebbfabc16d23 ("ASoC: rt5682: Add CCF usage for providing I2S clks") Signed-off-by: Ma Ke <make24@iscas.ac.cn> Link: https://patch.msgid.link/20240830143154.3448004-1-make24@iscas.ac.cn Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 3f1b0e1f2875 (".gitignore update") added *.orig and *.rej
patterns to .gitignore in v2.6.23. The commit message didn't give a
rationale. Later on, commit 1f5d3a6b6532 ("Remove *.rej pattern from
.gitignore") removed the *.rej pattern in v2.6.26, on the rationale that
*.rej files indicated something went really wrong and should not be
ignored.
The *.rej files are now shown by `git status`, which helps located
conflicts when applying patches and lowers the probability that they
will go unnoticed. It is however still easy to overlook the *.orig files
which slowly polute the source tree. That's not as big of a deal as not
noticing a conflict, but it's still not nice.
Drop the *.orig pattern from .gitignore to avoid this and help keep the
source tree clean.
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
[masahiroy@kernel.org:
I do not have a strong opinion about this. Perhaps some people may have
a different opinion.
If you are someone who wants to ignore *.orig, it is likely you would
want to do so across all projects. Then, $XDG_CONFIG_HOME/git/ignore
would be more suitable for your needs. gitignore(5) suggests, "Patterns
which a user wants Git to ignore in all situations generally go into a
file specified by core.excludesFile in the user's ~/.gitconfig".
Please note that you cannot do the opposite; if *.orig is ignored by
the project's .gitignore, you cannot override the decision because
$XDG_CONFIG_HOME/git/ignore has a lower priority.
If *.orig is sitting on the fence, I'd leave it to the users. ] Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
$ make -skj"$(nproc)" ARCH=x86_64 LLVM=1 LLVM_IAS=1 mrproper allmodconfig net/mptcp/subflow.o
net/mptcp/subflow.c:877:6: warning: variable 'incr' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
877 | if (WARN_ON_ONCE(offset > skb->len))
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/asm-generic/bug.h:101:33: note: expanded from macro 'WARN_ON_ONCE'
101 | #define WARN_ON_ONCE(condition) ({ \
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
102 | int __ret_warn_on = !!(condition); \
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
103 | if (unlikely(__ret_warn_on)) \
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
104 | __WARN_FLAGS(BUGFLAG_ONCE | \
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
105 | BUGFLAG_TAINT(TAINT_WARN)); \
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
106 | unlikely(__ret_warn_on); \
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
107 | })
| ~~
net/mptcp/subflow.c:893:6: note: uninitialized use occurs here
893 | if (incr)
| ^~~~
net/mptcp/subflow.c:877:2: note: remove the 'if' if its condition is always false
877 | if (WARN_ON_ONCE(offset > skb->len))
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
878 | goto out;
| ~~~~~~~~
net/mptcp/subflow.c:874:18: note: initialize the variable 'incr' to silence this warning
874 | u32 offset, incr, avail_len;
| ^
| = 0
1 warning generated.
As mentioned by Nathan, this issue is present because 5.10 does not
include commit ea4ca586b16f ("mptcp: refine MPTCP-level ack scheduling"),
which removed the use of 'incr' in the error path added by this change.
This other commit does not really look suitable for stable, hence this
dedicated patch for 5.10.
Marek Gresko reports that the root user on an NFS client is able to
change the security labels on files on an NFS filesystem that is
exported with root squashing enabled.
The end of the kerneldoc comment for __vfs_setxattr_noperm() states:
* This function requires the caller to lock the inode's i_mutex before it
* is executed. It also assumes that the caller will make the appropriate
* permission checks.
nfsd_setattr() does do permissions checking via fh_verify() and
nfsd_permission(), but those don't do all the same permissions checks
that are done by security_inode_setxattr() and its related LSM hooks do.
Since nfsd_setattr() is the only consumer of security_inode_setsecctx(),
simplest solution appears to be to replace the call to
__vfs_setxattr_noperm() with a call to __vfs_setxattr_locked(). This
fixes the above issue and has the added benefit of causing nfsd to
recall conflicting delegations on a file when a client tries to change
its security label.
Cc: stable@kernel.org Reported-by: Marek Gresko <marek.gresko@protonmail.com> Link: https://bugzilla.kernel.org/show_bug.cgi?id=218809 Signed-off-by: Scott Mayhew <smayhew@redhat.com> Tested-by: Stephen Smalley <stephen.smalley.work@gmail.com> Reviewed-by: Stephen Smalley <stephen.smalley.work@gmail.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
[Shivani: Modified to apply on v5.10.y] Signed-off-by: Shivani Agarwal <shivani.agarwal@broadcom.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The devmap code allocates a number hash buckets equal to the next power
of two of the max_entries value provided when creating the map. When
rounding up to the next power of two, the 32-bit variable storing the
number of buckets can overflow, and the code checks for overflow by
checking if the truncated 32-bit value is equal to 0. However, on 32-bit
arches the rounding up itself can overflow mid-way through, because it
ends up doing a left-shift of 32 bits on an unsigned long value. If the
size of an unsigned long is four bytes, this is undefined behaviour, so
there is no guarantee that we'll end up with a nice and tidy 0-value at
the end.
Syzbot managed to turn this into a crash on arm32 by creating a
DEVMAP_HASH with max_entries > 0x80000000 and then trying to update it.
Fix this by moving the overflow check to before the rounding up
operation.
Fixes: 6f9d451ab1a3 ("xdp: Add devmap_hash map type for looking up devices by hashed index") Link: https://lore.kernel.org/r/000000000000ed666a0611af6818@google.com Reported-and-tested-by: syzbot+8cd36f6b65f3cafd400a@syzkaller.appspotmail.com Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Message-ID: <20240307120340.99577-2-toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Pu Lehui <pulehui@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 70294d8bc31f ("bpf: Eliminate rlimit-based memory accounting for
devmap maps") is part of the v5.11+ base mechanism of memcg-based memory
accounting[0]. The commit cannot be independently backported to the 5.10
stable branch, otherwise the related memory when creating devmap will be
unrestricted. Let's roll back to rlimit-based memory accounting mode for
devmap.
Commit 225da02acdc9 ("bpf: fix DEVMAP_HASH overflow check on 32-bit
architectures") relies on the v5.11+ base mechanism of memcg-based
memory accounting[0], which is not yet supported on the 5.10 stable
branch, so let's revert this commit in preparation for re-adapting it.
New CPU #defines encode vendor and family as well as model.
[ dhansen: vertically align 0's in invlpg_miss_ids[] ]
Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/all/20240424181518.41946-1-tony.luck%40intel.com
[ Ricardo: I used the old match macro X86_MATCH_INTEL_FAM6_MODEL()
instead of X86_MATCH_VFM() as in the upstream commit.
I also kept the ALDERLAKE_N name instead of ATOM_GRACEMONT. Both refer
to the same CPU model. ] Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The current initialization of the struct x86_cpu_id via
pl4_support_ids[] is partial and wrong. It is initializing
"stepping" field with "X86_FEATURE_ANY" instead of "feature" field.
Use X86_MATCH_INTEL_FAM6_MODEL macro instead of initializing
each field of the struct x86_cpu_id for pl4_supported list of CPUs.
This X86_MATCH_INTEL_FAM6_MODEL macro internally uses another macro
X86_MATCH_VENDOR_FAM_MODEL_FEATURE for X86 based CPU matching with
appropriate initialized values.
Reported-by: Dave Hansen <dave.hansen@intel.com> Link: https://lore.kernel.org/lkml/28ead36b-2d9e-1a36-6f4e-04684e420260@intel.com Fixes: eb52bc2ae5b8 ("powercap: RAPL: Add Power Limit4 support for Meteor Lake SoC") Fixes: b08b95cf30f5 ("powercap: RAPL: Add Power Limit4 support for Alder Lake-N and Raptor Lake-P") Fixes: 515755906921 ("powercap: RAPL: Add Power Limit4 support for RaptorLake") Fixes: 1cc5b9a411e4 ("powercap: Add Power Limit4 support for Alder Lake SoC") Fixes: 8365a898fe53 ("powercap: Add Power Limit4 support") Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[ Ricardo: I only kept TIGERLAKE in pl4_support_ids as only this model is
enumerated before this changeset. ] Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Use the new soc_intel_is_byt() helper from linux/platform_data/x86/soc.h.
Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20220131143539.109142-5-hdegoede@redhat.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
[Ricardo: Resolved minor cherry-pick conflict. The file linux/regulator/
consumer.h is not #included in the upstream version but it is in
v5.10.y. ] Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Only provide ctnetlink_label_size when it is used,
which is when CONFIG_NF_CONNTRACK_EVENTS is configured.
Flagged by clang-18 W=1 builds as:
.../nf_conntrack_netlink.c:385:19: warning: unused function 'ctnetlink_label_size' [-Wunused-function]
385 | static inline int ctnetlink_label_size(const struct nf_conn *ct)
| ^~~~~~~~~~~~~~~~~~~~
The condition on CONFIG_NF_CONNTRACK_LABELS being removed by
this patch guards compilation of non-trivial implementations
of ctnetlink_dump_labels() and ctnetlink_label_size().
However, this is not necessary as each of these functions
will always return 0 if CONFIG_NF_CONNTRACK_LABELS is not defined
as each function starts with the equivalent of:
And nf_ct_labels_find always returns NULL if CONFIG_NF_CONNTRACK_LABELS
is not enabled. So I believe that the compiler optimises the code away
in such cases anyway.
Found by inspection.
Compile tested only.
Originally splitted in two patches, Pablo Neira Ayuso collapsed them and
added Fixes: tag.
Fixes: 0ceabd83875b ("netfilter: ctnetlink: deliver labels to userspace") Link: https://lore.kernel.org/netfilter-devel/20240909151712.GZ2097826@kernel.org/ Signed-off-by: Simon Horman <horms@kernel.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Documentation of list_del_rcu() warns callers to not immediately free
the deleted list item. While it seems not necessary to use the
RCU-variant of list_del() here in the first place, doing so seems to
require calling kfree_rcu() on the deleted item as well.
Fixes: 3f0465a9ef02 ("netfilter: nf_tables: dynamically allocate hooks per net_device in flowtables") Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
When broadcasting data to multiple nodes via MHI, using skb_clone()
causes all nodes to receive the same header data. This can result in
packets being discarded by endpoints, leading to lost data.
This issue occurs when a socket is closed, and a QRTR_TYPE_DEL_CLIENT
packet is broadcasted. All nodes receive the same destination node ID,
causing the node connected to the client to discard the packet and
remain unaware of the client's deletion.
Replace skb_clone() with pskb_copy(), to create a separate copy of
the header for each sk_buff.
Fixes: bdabad3e363d ("net: Add Qualcomm IPC router") Signed-off-by: Youssef Samir <quic_yabdulra@quicinc.com> Reviewed-by: Jeffery Hugo <quic_jhugo@quicinc.com> Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com> Reviewed-by: Chris Lew <quic_clew@quicinc.com> Link: https://patch.msgid.link/20240916170858.2382247-1-quic_yabdulra@quicinc.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
We have some machines running stock Ubuntu 20.04.6 which is their 5.4.0-174-generic
kernel that are running ceph and recently hit a null ptr dereference in
tcp_rearm_rto(). Initially hitting it from the TLP path, but then later we also
saw it getting hit from the RACK case as well. Here are examples of the oops
messages we saw in each of those cases:
After we hit this we disabled TLP by setting tcp_early_retrans to 0 and then hit the crash in the RACK case:
Aug 7 07:26:16 rx [1006006.265582] BUG: kernel NULL pointer dereference, address: 0000000000000020
Aug 7 07:26:16 rx [1006006.272719] #PF: supervisor read access in kernel mode
Aug 7 07:26:16 rx [1006006.278030] #PF: error_code(0x0000) - not-present page
Aug 7 07:26:16 rx [1006006.283343] PGD 0 P4D 0
Aug 7 07:26:16 rx [1006006.286057] Oops: 0000 [#1] SMP NOPTI
Aug 7 07:26:16 rx [1006006.289896] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G W 5.4.0-174-generic #193-Ubuntu
Aug 7 07:26:16 rx [1006006.299107] Hardware name: Supermicro SMC 2x26 os-gen8 64C NVME-Y 256G/H12SSW-NTR, BIOS 2.5.V1.2U.NVMe.UEFI 05/09/2023
Aug 7 07:26:16 rx [1006006.309970] RIP: 0010:tcp_rearm_rto+0xe4/0x160
Aug 7 07:26:16 rx [1006006.314584] Code: 87 ca 04 00 00 00 5b 41 5c 41 5d 5d c3 c3 49 8b bc 24 40 06 00 00 eb 8d 48 bb cf f7 53 e3 a5 9b c4 20 4c 89 ef e8 0c fe 0e 00 <48> 8b 78 20 48 c1 ef 03 48 89 f8 41 8b bc 24 80 04 00 00 48 f7 e3
Aug 7 07:26:16 rx [1006006.333499] RSP: 0018:ffffb42600a50960 EFLAGS: 00010246
Aug 7 07:26:16 rx [1006006.338895] RAX: 0000000000000000 RBX: 20c49ba5e353f7cf RCX: 0000000000000000
Aug 7 07:26:16 rx [1006006.346193] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff92d687ed8160
Aug 7 07:26:16 rx [1006006.353489] RBP: ffffb42600a50978 R08: 0000000000000000 R09: 00000000cd896dcc
Aug 7 07:26:16 rx [1006006.360786] R10: ffff92dc3404f400 R11: 0000000000000001 R12: ffff92d687ed8000
Aug 7 07:26:16 rx [1006006.368084] R13: ffff92d687ed8160 R14: 00000000cd896dcc R15: 00000000cd8fca81
Aug 7 07:26:16 rx [1006006.375381] FS: 0000000000000000(0000) GS:ffff93158ad40000(0000) knlGS:0000000000000000
Aug 7 07:26:16 rx [1006006.383632] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 7 07:26:16 rx [1006006.389544] CR2: 0000000000000020 CR3: 0000003e775ce006 CR4: 0000000000760ee0
Aug 7 07:26:16 rx [1006006.396839] PKRU: 55555554
Aug 7 07:26:16 rx [1006006.399717] Call Trace:
Aug 7 07:26:16 rx [1006006.402335]
Aug 7 07:26:16 rx [1006006.404525] ? show_regs.cold+0x1a/0x1f
Aug 7 07:26:16 rx [1006006.408532] ? __die+0x90/0xd9
Aug 7 07:26:16 rx [1006006.411760] ? no_context+0x196/0x380
Aug 7 07:26:16 rx [1006006.415599] ? __bad_area_nosemaphore+0x50/0x1a0
Aug 7 07:26:16 rx [1006006.420392] ? _raw_spin_lock+0x1e/0x30
Aug 7 07:26:16 rx [1006006.424401] ? bad_area_nosemaphore+0x16/0x20
Aug 7 07:26:16 rx [1006006.428927] ? do_user_addr_fault+0x267/0x450
Aug 7 07:26:16 rx [1006006.433450] ? __do_page_fault+0x58/0x90
Aug 7 07:26:16 rx [1006006.437542] ? do_page_fault+0x2c/0xe0
Aug 7 07:26:16 rx [1006006.441470] ? page_fault+0x34/0x40
Aug 7 07:26:16 rx [1006006.445134] ? tcp_rearm_rto+0xe4/0x160
Aug 7 07:26:16 rx [1006006.449145] tcp_ack+0xa32/0xb30
Aug 7 07:26:16 rx [1006006.452542] tcp_rcv_established+0x13c/0x670
Aug 7 07:26:16 rx [1006006.456981] ? sk_filter_trim_cap+0x48/0x220
Aug 7 07:26:16 rx [1006006.461419] tcp_v6_do_rcv+0xdb/0x450
Aug 7 07:26:16 rx [1006006.465257] tcp_v6_rcv+0xc2b/0xd10
Aug 7 07:26:16 rx [1006006.468918] ip6_protocol_deliver_rcu+0xd3/0x4e0
Aug 7 07:26:16 rx [1006006.473706] ip6_input_finish+0x15/0x20
Aug 7 07:26:16 rx [1006006.477710] ip6_input+0xa2/0xb0
Aug 7 07:26:16 rx [1006006.481109] ? ip6_protocol_deliver_rcu+0x4e0/0x4e0
Aug 7 07:26:16 rx [1006006.486151] ip6_sublist_rcv_finish+0x3d/0x50
Aug 7 07:26:16 rx [1006006.490679] ip6_sublist_rcv+0x1aa/0x250
Aug 7 07:26:16 rx [1006006.494779] ? ip6_rcv_finish_core.isra.0+0xa0/0xa0
Aug 7 07:26:16 rx [1006006.499828] ipv6_list_rcv+0x112/0x140
Aug 7 07:26:16 rx [1006006.503748] __netif_receive_skb_list_core+0x1a4/0x250
Aug 7 07:26:16 rx [1006006.509057] netif_receive_skb_list_internal+0x1a1/0x2b0
Aug 7 07:26:16 rx [1006006.514538] gro_normal_list.part.0+0x1e/0x40
Aug 7 07:26:16 rx [1006006.519068] napi_complete_done+0x91/0x130
Aug 7 07:26:16 rx [1006006.523352] mlx5e_napi_poll+0x18e/0x610 [mlx5_core]
Aug 7 07:26:16 rx [1006006.528481] net_rx_action+0x142/0x390
Aug 7 07:26:16 rx [1006006.532398] __do_softirq+0xd1/0x2c1
Aug 7 07:26:16 rx [1006006.536142] irq_exit+0xae/0xb0
Aug 7 07:26:16 rx [1006006.539452] do_IRQ+0x5a/0xf0
Aug 7 07:26:16 rx [1006006.542590] common_interrupt+0xf/0xf
Aug 7 07:26:16 rx [1006006.546421]
Aug 7 07:26:16 rx [1006006.548695] RIP: 0010:native_safe_halt+0xe/0x10
Aug 7 07:26:16 rx [1006006.553399] Code: 7b ff ff ff eb bd 90 90 90 90 90 90 e9 07 00 00 00 0f 00 2d 36 2c 50 00 f4 c3 66 90 e9 07 00 00 00 0f 00 2d 26 2c 50 00 fb f4 90 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 53 e8 dd 5e 61 ff 65
Aug 7 07:26:16 rx [1006006.572309] RSP: 0018:ffffb42600177e70 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffc2
Aug 7 07:26:16 rx [1006006.580040] RAX: ffffffff8ed08b20 RBX: 0000000000000005 RCX: 0000000000000001
Aug 7 07:26:16 rx [1006006.587337] RDX: 00000000f48eeca2 RSI: 0000000000000082 RDI: 0000000000000082
Aug 7 07:26:16 rx [1006006.594635] RBP: ffffb42600177e90 R08: 0000000000000000 R09: 000000000000020f
Aug 7 07:26:16 rx [1006006.601931] R10: 0000000000100000 R11: 0000000000000000 R12: 0000000000000005
Aug 7 07:26:16 rx [1006006.609229] R13: ffff93157deb5f00 R14: 0000000000000000 R15: 0000000000000000
Aug 7 07:26:16 rx [1006006.616530] ? __cpuidle_text_start+0x8/0x8
Aug 7 07:26:16 rx [1006006.620886] ? default_idle+0x20/0x140
Aug 7 07:26:16 rx [1006006.624804] arch_cpu_idle+0x15/0x20
Aug 7 07:26:16 rx [1006006.628545] default_idle_call+0x23/0x30
Aug 7 07:26:16 rx [1006006.632640] do_idle+0x1fb/0x270
Aug 7 07:26:16 rx [1006006.636035] cpu_startup_entry+0x20/0x30
Aug 7 07:26:16 rx [1006006.640126] start_secondary+0x178/0x1d0
Aug 7 07:26:16 rx [1006006.644218] secondary_startup_64+0xa4/0xb0
Aug 7 07:26:17 rx [1006006.648568] Modules linked in: vrf bridge stp llc vxlan ip6_udp_tunnel udp_tunnel nls_iso8859_1 nft_ct amd64_edac_mod edac_mce_amd kvm_amd kvm crct10dif_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper wmi_bmof ipmi_ssif input_leds joydev rndis_host cdc_ether usbnet ast mii drm_vram_helper ttm drm_kms_helper i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt ccp mac_hid ipmi_si ipmi_devintf ipmi_msghandler sch_fq_codel nf_tables_set nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink ramoops reed_solomon efi_pstore drm ip_tables x_tables autofs4 raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear mlx5_ib ib_uverbs ib_core raid1 hid_generic mlx5_core pci_hyperv_intf crc32_pclmul usbhid ahci tls mlxfw bnxt_en hid libahci nvme i2c_piix4 nvme_core wmi [last unloaded: cpuid]
Aug 7 07:26:17 rx [1006006.726180] CR2: 0000000000000020
Aug 7 07:26:17 rx [1006006.729718] ---[ end trace e0e2e37e4e612984 ]---
Prior to seeing the first crash and on other machines we also see the warning in
tcp_send_loss_probe() where packets_out is non-zero, but both transmit and retrans
queues are empty so we know the box is seeing some accounting issue in this area:
The NULL ptr deref is coming from tcp_rto_delta_us() attempting to pull an skb
off the head of the retransmit queue and then dereferencing that skb to get the
skb_mstamp_ns value via tcp_skb_timestamp_us(skb).
The crash is the same one that was reported a # of years ago here:
https://lore.kernel.org/netdev/86c0f836-9a7c-438b-d81a-839be45f1f58@gmail.com/T/#t
and the kernel we're running has the fix which was added to resolve this issue.
Unfortunately we've been unsuccessful so far in reproducing this problem in the
lab and do not have the luxury of pushing out a new kernel to try and test if
newer kernels resolve this issue at the moment. I realize this is a report
against both an Ubuntu kernel and also an older 5.4 kernel. I have reported this
issue to Ubuntu here: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2077657
however I feel like since this issue has possibly cropped up again it makes
sense to build in some protection in this path (even on the latest kernel
versions) since the code in question just blindly assumes there's a valid skb
without testing if it's NULL b/f it looks at the timestamp.
Given we have seen crashes in this path before and now this case it seems like
we should protect ourselves for when packets_out accounting is incorrect.
While we should fix that root cause we should also just make sure the skb
is not NULL before dereferencing it. Also add a warn once here to capture
some information if/when the problem case is hit again.
Fixes: e1a10ef7fa87 ("tcp: introduce tcp_rto_delta_us() helper for xmit timer fix") Signed-off-by: Josh Hunt <johunt@akamai.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The rpl sr tunnel code contains calls to dst_cache_*() which are
only present when the dst cache is built.
Select DST_CACHE to build the dst cache, similar to other kconfig
options in the same file.
Compiling the rpl sr tunnel without DST_CACHE will lead to linker
errors.
Fixes: a7a29f9c361f ("net: ipv6: add rpl sr tunnel") Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Simon Horman <horms@kernel.org> # build-tested Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
In the ether3_probe function, a timer is initialized with a callback
function ether3_ledoff, bound to &prev(dev)->timer. Once the timer is
started, there is a risk of a race condition if the module or device
is removed, triggering the ether3_remove function to perform cleanup.
The sequence of operations that may lead to a UAF bug is as follows:
CPU0 CPU1
| ether3_ledoff
ether3_remove |
free_netdev(dev); |
put_devic |
kfree(dev); |
| ether3_outw(priv(dev)->regs.config2 |= CFG2_CTRLO, REG_CONFIG2);
| // use dev
Fix it by ensuring that the timer is canceled before proceeding with
the cleanup in ether3_remove.
Fixes: 6fd9c53f7186 ("net: seeq: Convert timers to use timer_setup()") Signed-off-by: Kaixin Wang <kxwang23@m.fudan.edu.cn> Link: https://patch.msgid.link/20240915144045.451-1-kxwang23@m.fudan.edu.cn Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The following sequence of commands causes a livelock - there will be
workqueue process looping and consuming 100% CPU:
dmsetup create --notable test
truncate -s 1MiB testdata
losetup /dev/loop0 testdata
dmsetup load test --table '0 2048 linear /dev/loop0 0'
dd if=/dev/zero of=/dev/dm-0 bs=16k count=1 conv=fdatasync
The livelock is caused by the commit fa247089de99. The commit claims that
it fixes a race condition, however, it is unknown what the actual race
condition is and what program is involved in the race condition.
When the inactive table is loaded, the nodes /dev/dm-0 and
/sys/block/dm-0 are created. /dev/dm-0 has zero size at this point. When
the device is suspended and resumed, the nodes /dev/mapper/test and
/dev/disk/* are created.
If some program opens a block device before it is created by dmsetup or
lvm, the program is buggy, so dm could just report an error as it used to
do before.
We used to call irq_bypass_unregister_producer() in
vhost_vdpa_setup_vq_irq() which is problematic as we don't know if the
token pointer is still valid or not.
Actually, we use the eventfd_ctx as the token so the life cycle of the
token should be bound to the VHOST_SET_VRING_CALL instead of
vhost_vdpa_setup_vq_irq() which could be called by set_status().
Fixing this by setting up irq bypass producer's token when handling
VHOST_SET_VRING_CALL and un-registering the producer before calling
vhost_vring_ioctl() to prevent a possible use after free as eventfd
could have been released in vhost_vring_ioctl(). And such registering
and unregistering will only be done if DRIVER_OK is set.
Reported-by: Dragos Tatulea <dtatulea@nvidia.com> Tested-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Fixes: 2cf1ba9a4d15 ("vhost_vdpa: implement IRQ offloading in vhost_vdpa") Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20240816031900.18013-1-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
This due to the fact that we do not free the "sg_table" itself while
freeing up the SG table and data pages. Fix this by freeing the sg_table
in tmc_free_sg_table().
Fixes: 99443ea19e8b ("coresight: Add generic TMC sg table framework") Cc: Mike Leach <mike.leach@linaro.org> Cc: James Clark <james.clark@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20240702132846.1677261-1-suzuki.poulose@arm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
The binding's documentation specifies that "As the line is active low, it
should be marked GPIO_ACTIVE_LOW". However, in the driver, it was handled
the opposite way. This commit sets the driver's behaviour in sync with the
documentation
Fixes: 722407a4e8c0 ("staging:iio:ad7606: Use GPIO descriptor API") Signed-off-by: Guillaume Stols <gstols@baylibre.com> Reviewed-by: Nuno Sa <nuno.sa@analog.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
gpiod_set_array_value was misused here: the implementation relied on the
assumption that an unsigned long was required for each gpio, while the
function expects a bit array stored in "as much unsigned long as needed
for storing one bit per GPIO", i.e it is using a bit field.
This leaded to incorrect parameter passed to gpiod_set_array_value, that
would set 1 value instead of 3.
It also prevents to select the software mode correctly for the AD7606B.
Fixes: d2a415c86c6b ("iio: adc: ad7606: Add support for AD7606B ADC") Fixes: 41f71e5e7daf ("staging: iio: adc: ad7606: Use find_closest() macro") Signed-off-by: Guillaume Stols <gstols@baylibre.com> Reviewed-by: Nuno Sa <nuno.sa@analog.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
It's important to undo pm_runtime_use_autosuspend() with
pm_runtime_dont_use_autosuspend() at driver exit time unless driver
initially enabled pm_runtime with devm_pm_runtime_enable()
(which handles it for you).
Hence, call pm_runtime_dont_use_autosuspend() at driver exit time
to fix it.
Fixes: 944c01a889d9 ("spi: lpspi: enable runtime pm for lpspi") Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Link: https://patch.msgid.link/20240906021251.610462-1-ruanjinjie@huawei.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The requested DMA channels are never released. Do this in .remove as well
as in .probe. spi_register_controller() can return -EPROBE_DEFER if
cs-gpios are not probed yet.
Do not print error messages with error code -517. Silences the following
errors upon on imx8qm:
fsl_lpspi 5a000000.spi: spi_register_controller error: -517
fsl_lpspi 5a010000.spi: spi_register_controller error: -517
fsl_lpspi 5a020000.spi: spi_register_controller error: -517
We may trigger high frequent checkpoint for below case:
1. mkdir /mnt/dir1; set dir1 encrypted
2. touch /mnt/file1; fsync /mnt/file1
3. mkdir /mnt/dir2; set dir2 encrypted
4. touch /mnt/file2; fsync /mnt/file2
...
Although, newly created dir and file are not related, due to
commit bbf156f7afa7 ("f2fs: fix lost xattrs of directories"), we will
trigger checkpoint whenever fsync() comes after a new encrypted dir
created.
In order to avoid such performance regression issue, let's record an
entry including directory's ino in global cache whenever we update
directory's xattr data, and then triggerring checkpoint() only if
xattr metadata of target file's parent was updated.
This patch updates to cover below no encryption case as well:
1) parent is checkpointed
2) set_xattr(dir) w/ new xnid
3) create(file)
4) fsync(file)
Fixes: bbf156f7afa7 ("f2fs: fix lost xattrs of directories") Reported-by: wangzijie <wangzijie1@honor.com> Reported-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Tested-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Reported-by: Yunlei He <heyunlei@hihonor.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
generic/728 - output mismatch (see /media/fstests/results//generic/728.out.bad)
--- tests/generic/728.out 2023-07-19 07:10:48.362711407 +0000
+++ /media/fstests/results//generic/728.out.bad 2023-07-19 08:39:57.000000000 +0000
QA output created by 728
+Expected ctime to change after setxattr.
+Expected ctime to change after removexattr.
Silence is golden
...
(Run 'diff -u /media/fstests/tests/generic/728.out /media/fstests/results//generic/728.out.bad' to see the entire diff)
generic/729 1s
It needs to update i_ctime after {set,remove}xattr, fix it.
Previously, in f2fs_setattr(), we don't update S_ISUID|S_ISGID|S_ISVTX
bits with S_IRWXUGO bits and acl entries atomically, so in error path,
chmod() may partially success, this patch enhances to make chmod() flow
being atomical.
When we have a corrupted main.sqlite in /var/lib/nfs/nfsdcld/, it may
result in namelen being 0, which will cause memdup_user() to return
ZERO_SIZE_PTR.
When we access the name.data that has been assigned the value of
ZERO_SIZE_PTR in nfs4_client_to_reclaim(), null pointer dereference is
triggered.
If not enough buffer space available, but idmap_lookup has triggered
lookup_fn which calls cache_get and returns successfully. Then we
missed to call cache_put here which pairs with cache_get.
Fixes: ddd1ea563672 ("nfsd4: use xdr_reserve_space in attribute encoding") Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev> Reviwed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The lookup_atid() function can return NULL if the ATID is
invalid or does not exist in the identifier table, which
could lead to dereferencing a null pointer without a
check in the `act_establish()` and `act_open_rpl()` functions.
Add a NULL check to prevent null pointer dereferencing.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
When allocating MTT hem, for each hop level of each hem that is being
allocated, the driver iterates the hem list to find out whether the
bt page has been allocated in this hop level. If not, allocate a new
one and splice it to the list. The time complexity is O(n^2) in worst
cases.
Currently the allocation for-loop uses 'unit' as the step size. This
actually has taken into account the reuse of last-hop-level MTT bt
pages by multiple buffer pages. Thus pages of last hop level will
never have been allocated, so there is no need to iterate the hem list
in last hop level.
Removing this unnecessary iteration can reduce the time complexity to
O(n).
Fixes: 38389eaa4db1 ("RDMA/hns: Add mtr support for mixed multihop addressing") Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://patch.msgid.link/20240906093444.3571619-9-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Split the hem_list_alloc_root_bt() into serval small functions to make the
code flow more clear.
Link: https://lore.kernel.org/r/1621589395-2435-3-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Stable-dep-of: d586628b169d ("RDMA/hns: Fix the overflow risk of hem_list_calc_ba_range()") Signed-off-by: Sasha Levin <sashal@kernel.org>
Add the mapped page count checking flow to avoid invalid page size when
creating MTR.
Fixes: 38389eaa4db1 ("RDMA/hns: Add mtr support for mixed multihop addressing") Link: https://lore.kernel.org/r/1612517974-31867-4-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Stable-dep-of: d586628b169d ("RDMA/hns: Fix the overflow risk of hem_list_calc_ba_range()") Signed-off-by: Sasha Levin <sashal@kernel.org>
Parts of the suspend and resume chain is left unprotected if we disable
the WDT here.
>From experiments we can see that the SCU disables and re-enables the WDT
when we enter and leave suspend to ram. By not touching the WDT here we
are protected by the WDT all the way to the SCU.
Convert platform_get_resource(), devm_ioremap_resource() to a single
call to devm_platform_get_and_ioremap_resource(), as this is exactly
what this function does.
If we wait_for_construction and find that the file is no longer hashed,
and we're going to retry the open, the old nfsd_file reference is
currently leaked. Put the reference before retrying.
Fixes: c6593366c0bf ("nfsd: don't kill nfsd_files because of lease break error") Signed-off-by: Jeff Layton <jlayton@kernel.org> Tested-by: Youzhong Yang <youzhong@gmail.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
This fix leaking the of_node references in of_dra7_atl_clk_probe().
The docs for of_parse_phandle_with_args() say that the caller must call
of_node_put() on the returned node. This adds the missing of_node_put()
to fix the leak.
In the commit aee2424246f9 ("RDMA/iwcm: Fix a use-after-free related to
destroying CM IDs"), the function flush_workqueue is invoked to flush the
work queue iwcm_wq.
But at that time, the work queue iwcm_wq was created via the function
alloc_ordered_workqueue without the flag WQ_MEM_RECLAIM.
Because the current process is trying to flush the whole iwcm_wq, if
iwcm_wq doesn't have the flag WQ_MEM_RECLAIM, verify that the current
process is not reclaiming memory or running on a workqueue which doesn't
have the flag WQ_MEM_RECLAIM as that can break forward-progress guarantee
leading to a deadlock.
Ensure index in rtl2830_pid_filter does not exceed 31 to prevent
out-of-bounds access.
dev->filters is a 32-bit value, so set_bit and clear_bit functions should
only operate on indices from 0 to 31. If index is 32, it will attempt to
access a non-existent 33rd bit, leading to out-of-bounds access.
Change the boundary check from index > 32 to index >= 32 to resolve this
issue.
Fixes: df70ddad81b4 ("[media] rtl2830: implement PID filter") Signed-off-by: Junlin Li <make24@iscas.ac.cn> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Sasha Levin <sashal@kernel.org>
Ensure index in rtl2832_pid_filter does not exceed 31 to prevent
out-of-bounds access.
dev->filters is a 32-bit value, so set_bit and clear_bit functions should
only operate on indices from 0 to 31. If index is 32, it will attempt to
access a non-existent 33rd bit, leading to out-of-bounds access.
Change the boundary check from index > 32 to index >= 32 to resolve this
issue.
Signed-off-by: Junlin Li <make24@iscas.ac.cn> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Fixes: 4b01e01a81b6 ("[media] rtl2832: implement PID filter")
[hverkuil: added fixes tag, rtl2830_pid_filter -> rtl2832_pid_filter in logmsg] Signed-off-by: Sasha Levin <sashal@kernel.org>
Similar to DCLK_LCDC on RK3328, the DCLK_VOP on RK3228 is typically
parented by the hdmiphy clk and it is expected that the DCLK_VOP and
hdmiphy clk rate are kept in sync.
Use CLK_SET_RATE_PARENT and CLK_SET_RATE_NO_REPARENT flags, same as used
on RK3328, to make full use of all possible supported display modes.
Fixes: 0a9d4ac08ebc ("clk: rockchip: set the clock ids for RK3228 VOP") Fixes: 307a2e9ac524 ("clk: rockchip: add clock controller for rk3228") Signed-off-by: Jonas Karlman <jonas@kwiboo.se> Link: https://lore.kernel.org/r/20240615170417.3134517-3-jonas@kwiboo.se Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
On the i.MX8M*, the TF-A exposes a SiP (Silicon Provider) service
for DDR frequency scaling. The imx8m-ddrc-devfreq driver calls the
SiP and then does clk_set_parent on the DDR muxes to synchronize
the clock tree.
since commit 936c383673b9 ("clk: imx: fix composite peripheral flags"),
these TF-A managed muxes have SET_PARENT_GATE set, which results
in imx8m-ddrc-devfreq's clk_set_parent after SiP failing with -EBUSY:
commit 926bf91248dd
("clk: imx8m: fix clock tree update of TF-A managed clocks") adds this
method and enables 8mm, 8mn and 8mq. i.MX8MP also needs it.
This is safe to do, because updating the Linux clock tree to reflect
reality will always be glitch-free.
Another reason to this patch is that powersave image BT music
requires dram to be 400MTS, so clk_set_parent(dram_alt_src,
sys1_pll_800m); is required. Without this patch, it will not succeed.
Fixes: 936c383673b9 ("clk: imx: fix composite peripheral flags") Signed-off-by: Zhipeng Wang <zhipeng.wang_1@nxp.com> Reviewed-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Signed-off-by: Peng Fan <peng.fan@nxp.com> Reviewed-by: Abel Vesa <abel.vesa@linaro.org> Link: https://lore.kernel.org/r/20240607133347.3291040-7-peng.fan@oss.nxp.com Signed-off-by: Abel Vesa <abel.vesa@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Switch strtoul to strtoull as an unsigned long in 32-bit build isn't
64-bits.
Fixes: c284d669a20d408b ("perf tools: Move parse_nsec_time to time-utils.c") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: David Ahern <dsa@cumulusnetworks.com> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Junhao He <hejunhao3@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yang Jihong <yangjihong@bytedance.com> Link: https://lore.kernel.org/r/20240831070415.506194-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
If sched_in event for current task is not recorded, sched_in timestamp
will be set to end_time of time window interest, causing an error in
timestamp show. In this case, we choose to ignore this event.
Test scenario:
perf[1229608] does not record the first sched_in event, run time and sch delay are both 0
# perf sched timehist
Samples of sched_switch event do not have callchains.
time cpu task name wait time sch delay run time
[tid/pid] (msec) (msec) (msec)
--------------- ------ ------------------------------ --------- --------- --------- 2090450.763231 [0000] perf[1229608] 0.000 0.000 0.000 2090450.763235 [0000] migration/0[15] 0.000 0.001 0.003 2090450.763263 [0001] perf[1229608] 0.000 0.000 0.000 2090450.763268 [0001] migration/1[21] 0.000 0.001 0.004 2090450.763302 [0002] perf[1229608] 0.000 0.000 0.000 2090450.763309 [0002] migration/2[27] 0.000 0.001 0.007 2090450.763338 [0003] perf[1229608] 0.000 0.000 0.000 2090450.763343 [0003] migration/3[33] 0.000 0.001 0.004
Before:
arbitrarily specify a time window of interest, timestamp will be set to an incorrect value
# perf sched timehist --time 100,200
Samples of sched_switch event do not have callchains.
time cpu task name wait time sch delay run time
[tid/pid] (msec) (msec) (msec)
--------------- ------ ------------------------------ --------- --------- ---------
200.000000 [0000] perf[1229608] 0.000 0.000 0.000
200.000000 [0001] perf[1229608] 0.000 0.000 0.000
200.000000 [0002] perf[1229608] 0.000 0.000 0.000
200.000000 [0003] perf[1229608] 0.000 0.000 0.000
200.000000 [0004] perf[1229608] 0.000 0.000 0.000
200.000000 [0005] perf[1229608] 0.000 0.000 0.000
200.000000 [0006] perf[1229608] 0.000 0.000 0.000
200.000000 [0007] perf[1229608] 0.000 0.000 0.000
After:
# perf sched timehist --time 100,200
Samples of sched_switch event do not have callchains.
time cpu task name wait time sch delay run time
[tid/pid] (msec) (msec) (msec)
--------------- ------ ------------------------------ --------- --------- ---------
Fixes: 853b74071110bed3 ("perf sched timehist: Add option to specify time window of interest") Signed-off-by: Yang Jihong <yangjihong@bytedance.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsa@cumulusnetworks.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240819024720.2405244-1-yangjihong@bytedance.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>