This is odd to have a of_node_put() just after a for_each_child_of_node()
or a for_each_endpoint_of_node() loop. It should already be called
during the last iteration.
The value of V4L2_CID_VBLANK control is initialized to default vblank
value of 640x480 when driver probe. When OV5640 work at DVP mode, the
control value won't update and lead to sensor can't output data if the
resolution remain the same as last time since incorrect total vertical
size. So update it when there is a new value applied.
In case of encoded input VP9 data width that is not multiple of macroblock
size, which is 16 (e.g. 1080x1920 frames, where 1080 is multiple of 8), the
width is padded to be a multiple of macroblock size (for 1080x1920 frames,
that is 1088x1920).
The hantro_postproc_g2_enable() checks whether the encoded data width is
equal to decoded frame width, and if not, enables down-scale mode. For a
frame where input is 1080x1920 and output is 1088x1920, this is incorrect
as no down-scale happens, the frame is only padded. Enabling the down-scale
mode in this case results in corrupted frames.
Fix this by adjusting the check to test whether encoded data width is
greater than decoded frame width, and only in that case enable the
down-scale mode.
To generate input test data to trigger this bug, use e.g.:
$ gst-launch-1.0 videotestsrc ! video/x-raw,width=272,height=256,format=I420 ! \
vp9enc ! matroskamux ! filesink location=/tmp/test.vp9
To trigger the bug upon decoding (note that the NV12 must be forced, as
that assures the output data would pass the G2 postproc):
$ gst-launch-1.0 filesrc location=/tmp/test.vp9 ! matroskademux ! vp9parse ! \
v4l2slvp9dec ! video/x-raw,format=NV12 ! videoconvert ! fbdevsink
Fixes: 79c987de8b35 ("media: hantro: Use post processor scaling capacities") Signed-off-by: Marek Vasut <marex@denx.de> Reviewed-by: Benjamin Gaignard <benjamin.gaignard@collabora.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Sasha Levin <sashal@kernel.org>
The i.MX8MM/N/P does not define the .reset op since reset of the VPU is
done by genpd. Check whether the .reset op is defined before calling it
to avoid NULL pointer dereference.
Note that the Fixes tag is set to the commit which removed the reset op
from i.MX8M Hantro G2 implementation, this is because before this commit
all the implementations did define the .reset op.
Fixes: 6971efb70ac3 ("media: hantro: Allow i.MX8MQ G1 and G2 to run independently") Signed-off-by: Marek Vasut <marex@denx.de> Reviewed-by: Chen-Yu Tsai <wenst@chromium.org> Tested-by: Chen-Yu Tsai <wenst@chromium.org> Reviewed-by: Adam Ford <aford173@gmail.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Sasha Levin <sashal@kernel.org>
The last buffer from before the change must be marked,
with the V4L2_BUF_FLAG_LAST flag,
similarly to the Drain sequence above.
Meanwhile if V4L2_DEC_CMD_STOP is sent before
the source change triggered,
we need to restore the is_draing flag after
the draining in dynamic resolution change.
Fixes: b4e1fb8643da ("media: imx-jpeg: Support dynamic resolution change") Signed-off-by: Ming Qian <ming.qian@nxp.com> Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Sasha Levin <sashal@kernel.org>
Afer commit 1fa5ae857bb1 ("driver core: get rid of struct device's
bus_id string array"), the name of device is allocated dynamically.
Therefore, it needs to be freed, which is done by the driver core for
us once all references to the device are gone. Therefore, move the
dev_set_name() call immediately before the call device_register(), which
either succeeds (then the freeing will be done upon subsequent remvoal),
or puts the reference in the error call. Also, it is not unusual that the
return value of dev_set_name is not checked.
Fixes: 1fa5ae857bb1 ("driver core: get rid of struct device's bus_id string array") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
[linux@dominikbrodowski.net: simplification, commit message modified] Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
As the comment of device_register() says, it should use put_device()
to give up the reference in the error path. Then, insofar resources
will be freed in pcmcia_release_dev(), the error path is no longer
needed. In particular, this means that the (previously missing) dropping
of the reference to &p_dev->function_config->ref is now handled by
pcmcia_release_dev().
If device_register() returns error in pccardd(), it leads two issues:
1. The socket_released has never been completed, it will block
pcmcia_unregister_socket(), because of waiting for completion
of socket_released.
2. The device name allocated by dev_set_name() is leaked.
Fix this two issues by calling put_device() when device_register() fails.
socket_released can be completed in pcmcia_release_socket(), the name can
be freed in kobject_cleanup().
Dan reports that cxl_decoder_commit() potentially leaks a hold of
cxl_dpa_rwsem. The potential error case is a "should not" happen
scenario, turn it into a "can not" happen scenario by adding the error
check to cxl_port_setup_targets() where other setting validation occurs.
Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: http://lore.kernel.org/r/63295673-5d63-4919-b851-3b06d48734c0@moroto.mountain Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Fixes: 176baefb2eb5 ("cxl/hdm: Commit decoder state to hardware") Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 5d2ffbe4b81a ("cxl/port: Store the downstream port's Component Register mappings in struct cxl_dport")
...moved the dport component registers from a raw component_reg_phys
passed in at dport instantiation time to a 'struct cxl_register_map'
populated with both the component register data *and* the "host" device
for mapping operations.
While typical CXL switch dports are mapped by their associated 'struct
cxl_port', an RCH host bridge dport registered by cxl_acpi needs to wait
until the cxl_mem driver makes the attachment to map the registers. This
is because there are no intervening 'struct cxl_port' instances between
the root cxl_port and the endpoint port in an RCH topology.
For now just mark the host as NULL in the RCH dport case until code that
needs to map the dport registers arrives.
This patch is not flagged for -stable since nothing in the current
driver uses the dport->comp_map.
Now, I am slightly uneasy that cxl_setup_comp_regs() sets map->host to a
wrong value and then cxl_dport_setup_regs() fixes it up, but the
alternatives I came up with are more messy. For example, adding an
@logdev to 'struct cxl_register_map' that the dev_printk()s can fall
back to when @host is NULL. I settled on "post-fixup+comment" since it
is only RCH dports that have this special case where register probing is
split between a host-bridge RCRB lookup and when cxl_mem_probe() does
the association of the cxl_memdev and endpoint port.
[moved rename of @comp_map to @reg_map into next patch]
Fixes: 5d2ffbe4b81a ("cxl/port: Store the downstream port's Component Register mappings in struct cxl_dport") Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20231018171713.1883517-4-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The primary role of @dev is to host the mappings for devm operations.
@dev is too ambiguous as a name. I.e. when does @dev refer to the
'struct device *' instance that the registers belong, and when does
@dev refer to the 'struct device *' instance hosting the mapping for
devm operations?
Clarify the role of @dev in cxl_register_map by renaming it to @host.
Also, rename local variables to 'host' where map->host is used.
Signed-off-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20231018171713.1883517-3-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Stable-dep-of: 33d9c987bf8f ("cxl/port: Fix @host confusion in cxl_dport_setup_regs()") Signed-off-by: Sasha Levin <sashal@kernel.org>
Fix a missed "goto out" to unlock on error to cleanup this splat:
WARNING: lock held when returning to user space!
6.6.0-rc3-lizhijian+ #213 Not tainted
------------------------------------------------
cxl/673 is leaving the kernel with locks still held!
1 lock held by cxl/673:
#0: ffffffffa013b9d0 (cxl_region_rwsem){++++}-{3:3}, at: commit_store+0x7d/0x3e0 [cxl_core]
In terms of user visible impact of this bug for backports:
cxl_region_invalidate_memregion() on x86 invokes wbinvd which is a
problematic instruction for virtualized environments. So, on virtualized
x86, cxl_region_invalidate_memregion() returns an error. This failure
case got missed because CXL memory-expander device passthrough is not a
production use case, and emulation of CXL devices is typically limited
to kernel development builds with CONFIG_CXL_REGION_INVALIDATION_TEST=y,
that makes cxl_region_invalidate_memregion() succeed.
In other words, the expected exposure of this bug is limited to CXL
subsystem development environments using QEMU that neglected
CONFIG_CXL_REGION_INVALIDATION_TEST=y.
Fixes: d1257d098a5a ("cxl/region: Move cache invalidation before region teardown, and before setup") Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20231025085450.2514906-1-lizhijian@fujitsu.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
For auto-discovered regions the driver must assign each target to
a valid position in the region interleave set based on the decoder
topology.
The current implementation fails to parse valid decode topologies,
as it does not consider the child offset into a parent port. The sort
put all targets of one port ahead of another port when an interleave
was expected, causing the region assembly to fail.
Replace the existing relative sort with cxl_calc_interleave_pos() that
finds the exact position in a region interleave for an endpoint based
on a walk up the ancestral tree from endpoint to root decoder.
cxl_calc_interleave_pos() was introduced in a prior patch, so the work
here is to use it in cxl_region_sort_targets().
Remove the obsoleted helper functions from the prior sort.
Testing passes on pre-production hardware with BIOS defined regions
that natively trigger this autodiscovery path of the region driver.
Testing passes a CXL unit test using the dev_dbg() calculation test
(see cxl_region_attach()) across an expanded set of region configs:
1, 1, 1+1, 1+1+1, 2, 2+2, 2+2+2, 2+2+2+2, 4, 4+4, where each number
represents the count of endpoints per host bridge.
Fixes: a32320b71f08 ("cxl/region: Add region autodiscovery") Reported-by: Dmytro Adamenko <dmytro.adamenko@intel.com> Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jim Harris <jim.harris@samsung.com> Link: https://lore.kernel.org/r/3946cc55ddc19678733eddc9de2c317749f43f3b.1698263080.git.alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Introduce a calculation to find a target's position in a region
interleave. Perform a self-test of the calculation on user-defined
regions.
The region driver uses the kernel sort() function to put region
targets in relative order. Positions are assigned based on each
target's index in that sorted list. That relative sort doesn't
consider the offset of a port into its parent port which causes
some auto-discovered regions to fail creation. In one failure case,
a 2 + 2 config (2 host bridges each with 2 endpoints), the sort
puts all the targets of one port ahead of another port when they
were expected to be interleaved.
In preparation for repairing the autodiscovery region assembly,
introduce a new method for discovering a target position in the
region interleave.
cxl_calc_interleave_pos() adds a method to find the target position by
ascending from an endpoint to a root decoder. The calculation starts
with the endpoint's local position and position in the parent port. It
traverses towards the root decoder and examines both position and ways
in order to allow the position to be refined all the way to the root
decoder.
This calculation: position = position * parent_ways + parent_pos;
applied iteratively yields the correct position.
Include a self-test that exercises this new position calculation against
every successfully configured user-defined region.
match_decoder_by_range() and decoder_match_range() both determine
if an HPA range matches a decoder. The first does it for root
decoders and the second one operates on switch decoders.
Tidy these up with clear naming and make the switch helper more
like the root decoder helper in style and functionality. Make it
take the actual range, rather than an endpoint decoder from which
it extracts the range. Require an exact match on switch decoders,
because unlike a root decoder that maps an entire region, Linux
only supports 1:1 mapping of switch to endpoint decoders. Note that
root-decoders are a super-set of switch-decoders and the range they
cover is a super-set of a region, hence the use of range_contains() for
that case.
Aside from aesthetics and maintainability, this is in preparation
for reuse.
Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Jim Harris <jim.harris@samsung.com> Link: https://lore.kernel.org/r/011b1f498e1758bb8df17c5951be00bd8d489e3b.1698263080.git.alison.schofield@intel.com
[djbw: fixup root decoder vs switch decoder range checks] Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Stable-dep-of: 0cf36a85c140 ("cxl/region: Use cxl_calc_interleave_pos() for auto-discovery") Signed-off-by: Sasha Levin <sashal@kernel.org>
The current implementation passes PIN_IO_INTA_OUT (2) as a mask and
PIN_IO_INTAPM (GENMASK(1, 0)) as a value.
Swap the variables to assign mask and value the right way.
This error was first introduced with the alarm support. For better or
worse it worked as expected because 0x02 was applied as a mask to 0x03,
resulting 0x02 anyway. This will of course not work for any other value.
When wakeup-source is set in the devicetree, set up the device for
using the output as interrupt instead of clock. This is similar to
how other RTC devices handle this.
This allows the clock chip to turn on the board when wired to do
so in hardware.
CONFIG_DEBUG_SG highlights that get_{report,ext_report,derived_key)()}
are passing stack buffers as the @req_buf argument to
handle_guest_request(), generating a Call Trace of the following form:
WARNING: CPU: 0 PID: 1175 at include/linux/scatterlist.h:187 enc_dec_message+0x518/0x5b0 [sev_guest]
[..]
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/26/2023
RIP: 0010:enc_dec_message+0x518/0x5b0 [sev_guest]
Call Trace:
<TASK>
[..]
handle_guest_request+0x135/0x520 [sev_guest]
get_ext_report+0x1ec/0x3e0 [sev_guest]
snp_guest_ioctl+0x157/0x200 [sev_guest]
Note that the above Call Trace was with the DEBUG_SG BUG_ON()s converted
to WARN_ON()s.
This is benign as long as there are no hardware crypto accelerators
loaded for the aead cipher, and no subsequent dma_map_sg() is performed
on the scatterlist. However, sev-guest can not assume the presence of
an aead accelerator nor can it assume that CONFIG_DEBUG_SG is disabled.
Resolve this bug by allocating virt_addr_valid() memory, similar to the
other buffers am @snp_dev instance carries, to marshal requests from
user buffers to kernel buffers.
Reported-by: Peter Gonda <pgonda@google.com> Closes: http://lore.kernel.org/r/CAMkAt6r2VPPMZ__SQfJse8qWsUyYW3AgYbOUVM0S_Vtk=KvkxQ@mail.gmail.com Fixes: fce96cf04430 ("virt: Add SEV-SNP guest driver") Cc: Borislav Petkov <bp@alien8.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Dionna Glaze <dionnaglaze@google.com> Cc: Jeremi Piotrowski <jpiotrowski@linux.microsoft.com> Tested-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
This crash is due to the clearing out the cxl_memdev's driver context
(@cxlds) before the subsystem is done with it. This is ultimately due to
the region(s), that this memdev is a member, being torn down and expecting
to be able to de-reference @cxlds, like here:
static int cxl_region_decode_reset(struct cxl_region *cxlr, int count)
...
if (cxlds->rcd)
goto endpoint_reset;
...
Fix it by keeping the driver context valid until memdev-device
unregistration, and subsequently the entire stack of related
dependencies, unwinds.
Fixes: 9cc238c7a526 ("cxl/pci: Introduce cdevm_file_operations") Reported-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The sanitize operation is destructive and the expectation is that the
device is unmapped while in progress. The current implementation does a
lockless check for decoders being active, but then does nothing to
prevent decoders from racing to be committed. Introduce state tracking
to resolve this race.
This incidentally cleans up unpriveleged userspace from triggering mmio
read cycles by spinning on reading the 'security/state' attribute. Which
at a minimum is a waste since the kernel state machine can cache the
completion result.
Lastly cxl_mem_sanitize() was mistakenly marked EXPORT_SYMBOL() in the
original implementation, but an export was never required.
Fixes: 0c36b6ad436a ("cxl/mbox: Add sanitization handling machinery") Cc: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Fix a race condition between the mailbox-background command interrupt
firing and the security-state sysfs attribute being removed.
The race is difficult to see due to the awkward placement of the
sanitize-notifier setup code and the multiple places the teardown calls
are made, cxl_memdev_security_init() and cxl_memdev_security_shutdown().
Unify setup in one place, cxl_sanitize_setup_notifier(). Arrange for
the paired cxl_sanitize_teardown_notifier() to safely quiet the notifier
and let the cxl_memdev + irq be unregistered later in the flow.
Note: The special wrinkle of the sanitize notifier is that it interacts
with interrupts, which are enabled early in the flow, and it interacts
with memdev sysfs which is not initialized until late in the flow. Hence
why this setup routine takes an @cxlmd argument, and not just @mds.
This fix is also needed as a preparation fix for a memdev unregistration
crash.
Reported-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com> Closes: http://lore.kernel.org/r/20230929100316.00004546@Huawei.com Cc: Dave Jiang <dave.jiang@intel.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Fixes: 0c36b6ad436a ("cxl/mbox: Add sanitization handling machinery") Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
It is all too easy to get confused about @dev usage in the CXL driver
stack. Before adding a new cxl_pci_probe() setup operation that has a
devm lifetime dependent on @cxlds->dev binding, but also references
@cxlmd->dev, and prints messages, rework the devm_cxl_add_memdev() and
cxl_memdev_setup_fw_upload() function signatures to make this
distinction explicit. I.e. pass in the devm context as an @host argument
rather than infer it from other objects.
This is in preparation for adding a devm_cxl_sanitize_setup_notifier().
Note the whitespace fixup near the change of the devm_cxl_add_memdev()
signature. That uncaught typo originated in the patch that added
cxl_memdev_security_init().
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Stable-dep-of: 5f2da1971446 ("cxl/pci: Fix sanitize notifier setup") Signed-off-by: Sasha Levin <sashal@kernel.org>
If dev_err_probe() is to be used it should at least be used consistently
within the same function. It is also worth questioning whether
every potential -ENOMEM needs an explicit error message.
Remove the cxl_setup_fw_upload() error prints for what are rare /
hardware-independent failures.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Stable-dep-of: 5f2da1971446 ("cxl/pci: Fix sanitize notifier setup") Signed-off-by: Sasha Levin <sashal@kernel.org>
In preparation for fixing the init/teardown of the 'sanitize' workqueue
and sysfs notification mechanism, arrange for cxl_mbox_sanitize_work()
to be the single location where the sysfs attribute is notified. With
that change there is no distinction between polled mode and interrupt
mode. All the interrupt does is accelerate the polling interval.
The change to check for "mds->security.sanitize_node" under the lock is
there to ensure that the interrupt, the work routine and the
setup/teardown code can all have a consistent view of the registered
notifier and the workqueue state. I.e. the expectation is that the
interrupt is live past the point that the sanitize sysfs attribute is
published, and it may race teardown, so it must be consulted under a
lock. Given that new locking requirement, cxl_pci_mbox_irq() is moved
from hard to thread irq context.
Lastly, some opportunistic replacements of
"queue_delayed_work(system_wq, ...)", which is just open coded
schedule_delayed_work(), are included.
Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Stable-dep-of: 5f2da1971446 ("cxl/pci: Fix sanitize notifier setup") Signed-off-by: Sasha Levin <sashal@kernel.org>
Given that any particular put_device() could be the final put of the
device, the fact that there are usages of cxlds->dev after
put_device(cxlds->dev) is a red flag. Drop the reference counting since
the device is pinned by being registered and will not be unregistered
without triggering the driver + workqueue to shutdown.
Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Stable-dep-of: 5f2da1971446 ("cxl/pci: Fix sanitize notifier setup") Signed-off-by: Sasha Levin <sashal@kernel.org>
Some devices (e.g. BCM72112) use an alarm_irq interrupt that is
connected to a level interrupt controller rather than an edge
interrupt controller. In this case, the interrupt cannot be left
enabled by the irq handler while preserving the hardware wake-up
signal on wake capable devices or an interrupt storm will occur.
The alarm_expired flag is introduced to allow the disabling of
the interrupt when an alarm expires and to support balancing the
calls to disable_irq() and enable_irq() in accordance with the
existing design.
Variable found is not being initialized, in the case where the desired
mount is not found the variable contains garbage. Fix this by initializing
it to zero.
When p9pdu_readf() is called with "s?d" attribute, it allocates a pointer
that will store a string. But when p9pdu_readf() fails while handling "d"
then this pointer will not be freed in p9_check_errors().
Fixes: 51a87c552dfd ("9p: rework client code to use new protocol support functions") Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com> Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Message-ID: <20231027030302.11927-1-hbh25y@gmail.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Broadwell-de has a consumer core and server uncore. The uncore_arb PMU
isn't present and the broadwellx style cbox PMU should be used
instead. Fix the tma_info_system_dram_bw_use metric to use the server
metric rather than client.
The associated converter script fix is in:
https://github.com/intel/perfmon/pull/111
Fixes: 7d124303d620 ("perf vendor events intel: Update broadwell variant events/metrics") Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Link: https://lore.kernel.org/r/20230926031034.1201145-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
If this driver enables the xHC clocks while resuming from sleep, it calls
clk_prepare_enable() without checking for errors and blithely goes on to
read/write the xHC's registers -- which, with the xHC not being clocked,
at least on ARM32 usually causes an imprecise external abort exceptions
which cause kernel oops. Currently, the chips for which the driver does
the clock dance on suspend/resume seem to be the Broadcom STB SoCs, based
on ARM32 CPUs, as it seems...
Found by Linux Verification Center (linuxtesting.org) with the Svace static
analysis tool.
The AMD USB host controller (1022:43f7) isn't going into PCI D3 by default
without anything connected. This is because the policy that was introduced
by commit a611bf473d1f ("xhci-pci: Set runtime PM as default policy on all
xHC 1.2 or later devices") only covered 1.2 or later.
The 1.1 specification also has the same requirement as the 1.2
specification for D3 support. So expand the runtime PM as default policy
to all AMD 1.1 devices as well.
Fixes: a611bf473d1f ("xhci-pci: Set runtime PM as default policy on all xHC 1.2 or later devices") Link: https://composter.com.ua/documents/xHCI_Specification_for_USB.pdf Co-developed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20231019102924.2797346-15-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The CPI_STALL_RATIO metric group can be used to present the high
level CPI stall breakdown metrics in powerpc, which will show:
- DISPATCH_STALL_CPI ( Dispatch stall cycles per insn )
- ISSUE_STALL_CPI ( Issue stall cycles per insn )
- EXECUTION_STALL_CPI ( Execution stall cycles per insn )
- COMPLETION_STALL_CPI ( Completion stall cycles per insn )
Commit cf26e043c2a9 ("perf vendor events power10: Add JSON
metric events to present CPI stall cycles in powerpc)" which added
the CPI_STALL_RATIO metric group, also modified
the PMC value used in PM_RUN_INST_CMPL event from PMC4 to PMC5,
to avoid multiplexing of events.
But that got revert in recent changes. Fix this issue by changing
back the PMC value used in PM_RUN_INST_CMPL to PMC5.
Result with the fix:
./perf stat --metric-no-group -M CPI_STALL_RATIO <workload>
The macro __SPIN_LOCK_INITIALIZER() is implementation specific. Users
that desire to initialize a spinlock in a struct must use
__SPIN_LOCK_UNLOCKED().
Use __SPIN_LOCK_UNLOCKED() for the spinlock_t in imc_global_refc.
Fixes: 76d588dddc459 ("powerpc/imc-pmu: Fix use of mutex in IRQs disabled section") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230309134831.Nz12nqsU@linutronix.de Signed-off-by: Sasha Levin <sashal@kernel.org>
The VAS open window call prints error message and returns -EBUSY
after the migration suspend event initiated and until the resume
event completed on the destination system. It can cause the log
buffer filled with these error messages if the user space issues
continuous open window calls. Similar case even for DLPAR CPU
remove event when no credits are available until the credits are
freed or with the other DLPAR CPU add event.
So changes in the patch to use pr_err_ratelimited() instead of
pr_err() to display open window failure and not-available credits
error messages.
Use pr_fmt() and make the corresponding changes to have the
consistencein prefix all pr_*() messages (vas-api.c).
Fixes: 37e6764895ef ("powerpc/pseries/vas: Add VAS migration handler") Signed-off-by: Haren Myneni <haren@linux.ibm.com>
[mpe: Use "vas-api" as the prefix to match the file name.] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20231019215033.1335251-1-haren@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Perf test case 111 Check open filename arg using perf trace + vfs_getname
fails on s390. This is caused by a failing function
bpf_probe_read() in file util/bpf_skel/augmented_raw_syscalls.bpf.c.
The root cause is the lookup by address. Function bpf_probe_read()
is used. This function works only for architectures
with ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE.
On s390 is not possible to determine from the address to which
address space the address belongs to (user or kernel space).
Replace bpf_probe_read() by bpf_probe_read_kernel()
and bpf_probe_read_str() by bpf_probe_read_user_str() to
explicity specify the address space the address refers to.
A thread started via eg. user_mode_thread() runs in the kernel to begin
with and then may later return to userspace. While it's running in the
kernel it has a pt_regs at the base of its kernel stack, but that
pt_regs is all zeroes.
If the thread oopses in that state, it leads to an ugly stack trace with
a big block of zero GPRs, as reported by Joel:
The all-zero pt_regs looks ugly and conveys no useful information, other
than its presence. So detect that case and just show the presence of the
frame by printing the interrupt marker, eg:
To avoid ever suppressing a valid pt_regs make sure the pt_regs has a
zero MSR and TRAP value, and is located at the very base of the stack.
Fixes: 6895dfc04741 ("powerpc: copy_thread fill in interrupt frame marker and back chain") Reported-by: Joel Stanley <joel@jms.id.au> Reported-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230824064210.907266-1-mpe@ellerman.id.au Signed-off-by: Sasha Levin <sashal@kernel.org>
Sparse reports a size mismatch in the endian swap. The Opal
implementation[1] passes the value as a __be64, and the receiving
variable out_qsize is a u64, so the use of be32_to_cpu() appears to be
an error.
The recent change made it possible to generate vmlinux.h from BTF and
to ignore the file. But we also have a minimal vmlinux.h that will be
used by default. It should not be ignored by GIT.
Fixes: b7a2d774c9c5 ("perf build: Add ability to build with a generated vmlinux.h") Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Ian Rogers <irogers@google.com> Closes: https://lore.kernel.org/oe-kbuild-all/202310110451.rvdUZJEY-lkp@intel.com/ Cc: oe-kbuild-all@lists.linux.dev Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
When MODULE_DEVICE_TABLE(ishtp, ) is built on a host with a different
endianness from the target architecture, it results in an incorrect
MODULE_ALIAS().
For example, see a case where drivers/platform/x86/intel/ishtp_eclite.c
is built as a module for x86.
If you build it on a little-endian host, you will get the correct
MODULE_ALIAS:
When MODULE_DEVICE_TABLE(tee, ) is built on a host with a different
endianness from the target architecture, it results in an incorrect
MODULE_ALIAS().
For example, see a case where drivers/char/hw_random/optee-rng.c
is built as a module for ARM little-endian.
If you build it on a little-endian host, you will get the correct
MODULE_ALIAS:
The same problem also occurs when you enable CONFIG_CPU_BIG_ENDIAN,
and build it on a little-endian host.
This issue has been unnoticed because the ARM kernel is configured for
little-endian by default, and most likely built on a little-endian host
(cross-build on x86 or native-build on ARM).
The uuid field must not be reversed because uuid_t is an array of __u8.
Fixes: 0fc1db9d1059 ("tee: add bus driver framework for TEE based devices") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Sumit Garg <sumit.garg@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
On a state toggle from config off to config on and on the
state toggle from checkstop to not checkstop the queue's
internal states was set but the state machine was not
nudged. This did not care as on the first enqueue of a
request the state machine kick ran.
However, within an Secure Execution guest a queue is
only chosen by the scheduler when it has been bound.
But to bind a queue, it needs to run through the initial
states (reset, enable interrupts, ...). So this is like
a chicken-and-egg problem and the result was in fact
that a queue was unusable after a config off/on toggle.
With some slight rework of the handling of these states
now the new function _ap_queue_init_state() is called
which is the core of the ap_queue_init_state() function
but without locking handling. This has the benefit that
it can be called on all the places where a (re-)init
of the AP queue's state machine is needed.
Fixes: 2d72eaf036d2 ("s390/ap: implement SE AP bind, unbind and associate") Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The PMU name could be NULL in the case of the fake_pmu. Initialize the
name for the fake_pmu to "fake" so that all other logic can assume it
is initialized. Add a const to the type of name so that a literal can
be used to avoid additional initialization code. Propagate the cost
through related routines and remove now unnecessary "(char *)"
casts. Doing this located a bug in builtin-list for the pmu_glob that
was missing a strdup.
Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20230825024002.801955-3-irogers@google.com Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Wei Li <liwei391@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Ming Wang <wangming01@loongson.cn> Cc: John Garry <john.g.garry@oracle.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-kernel@vger.kernel.org Cc: linux-perf-users@vger.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Stable-dep-of: 85f73c377b2a ("perf mem-events: Avoid uninitialized read") Signed-off-by: Sasha Levin <sashal@kernel.org>
Raw events can be strings like 'r0xead' but the 0x is optional so they
can also be 'read'. On IcelakeX uncore_imc_free_running has an event
called 'read' which may be programmed as:
```
$ perf stat -e 'uncore_imc_free_running/event=read/' -a sleep 1
```
However, the PE_RAW type isn't allowed on the right of a term, even
though in this case we just want to interpret it as a string. This
leads to the following error on IcelakeX:
```
$ perf stat -e 'uncore_imc_free_running/event=read/' -a sleep 1
event syntax error: '..nning/event=read/'
\___ parser error
Run 'perf list' for a list of valid events
Usage: perf stat [<options>] [<command>]
-e, --event <event> event selector. use 'perf list' to list available events
```
Fix this by allowing raw types on the right of terms and treat them as
strings, just as is already done for PE_LEGACY_CACHE. Make this
consistent by just entirely removing name_or_legacy and always using
name_or_raw that covers all three cases.
Fixes: 6fd1e5191591 ("perf parse-events: Support PMUs for legacy cache events") Signed-off-by: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@arm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Link: https://lore.kernel.org/r/20230928004431.1926969-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
By default perf will fail the build if the development files for
libtraceevent are not available.
To build perf without libtraceevent support, disabling several features
such as 'perf trace', one needs to add NO_LIBTRACEVENT=1 to the make
command line.
Add the missing comments about that to the tools/perf/Makefile.perf
file, just like all the other such command line toggles.
Fixes: 378ef0f5d9d7f465 ("perf build: Use libtraceevent from the system") Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/ZR6+MhXtLnv6ow6E@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Add missing clk_disable_unprepare() and clk_bulk_disable_unprepare()
in the error path in qnoc_probe(). And when qcom_icc_qos_set() fails,
it needs remove nodes and disable clks.
If pxad_alloc_desc() fails on the first dma_pool_alloc() call, then
sw_desc->nb_desc is zero.
In such a case pxad_free_desc() is called and it will BUG_ON().
Remove this erroneous BUG_ON().
It is also useless, because if "sw_desc->nb_desc == 0", then, on the first
iteration of the for loop, i is -1 and the loop will not be executed.
(both i and sw_desc->nb_desc are 'int')
If a hub is disconnected that has device(s) that's attached to the usbip layer
the disconnect function might fail because it tries to release the port
on an already disconnected hub.
Fixes: 6080cd0e9239 ("staging: usbip: claim ports used by shared devices") Signed-off-by: Jonas Blixt <jonas.blixt@actia.se> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/20230615092810.1215490-1-jonas.blixt@actia.se Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The iio_generic_buffer can return garbage values when the total size of
scan data is not a multiple of the largest element in the scan. This can be
demonstrated by reading a scan, consisting, for example of one 4-byte and
one 2-byte element, where the 4-byte element is first in the buffer.
The IIO generic buffer code does not take into account the last two
padding bytes that are needed to ensure that the 4-byte data for next
scan is correctly aligned.
Add the padding bytes required to align the next sample with the scan size.
It is not allowed to call kfree_skb() from hardware interrupt
context or with hardware interrupts being disabled.
So replace kfree_skb() with dev_kfree_skb_irq() under
spin_lock_irqsave(). Compile tested only.
The perf test named "kernel lock contention analysis test"
fails in powerpc system with below error:
[command]# ./perf test 81 -vv
81: kernel lock contention analysis test :
--- start ---
test child forked, pid 2140
Testing perf lock record and perf lock contention
Testing perf lock contention --use-bpf
[Skip] No BPF support
Testing perf lock record and perf lock contention at the same time
Testing perf lock contention --threads
Testing perf lock contention --lock-addr
Testing perf lock contention --type-filter (w/ spinlock)
Testing perf lock contention --lock-filter (w/ tasklist_lock)
Testing perf lock contention --callstack-filter (w/ unix_stream)
[Fail] Recorded result should have a lock from unix_stream:
test child finished with -1
---- end ----
kernel lock contention analysis test: FAILED!
The test is failing because we get an address entry with 0 in
perf lock samples for powerpc, and code for lock contention
option "--callstack-filter" will not check further entries after
address 0.
Below are some of the samples from test generated perf.data file, which
have 0 address in the 2nd entry of callstack:
--------
sched-messaging 3409 [001] 7152.904029: lock:contention_begin: 0xc00000c80904ef00 (flags=SPIN) c0000000001e926c __traceiter_contention_begin+0x6c ([kernel.kallsyms])
0 [unknown] ([unknown]) c000000000f8a178 native_queued_spin_lock_slowpath+0x1f8 ([kernel.kallsyms]) c000000000f89f44 _raw_spin_lock_irqsave+0x84 ([kernel.kallsyms]) c0000000001d9fd0 prepare_to_wait+0x50 ([kernel.kallsyms]) c000000000c80f50 sock_alloc_send_pskb+0x1b0 ([kernel.kallsyms]) c000000000e82298 unix_stream_sendmsg+0x2b8 ([kernel.kallsyms]) c000000000c78980 sock_sendmsg+0x80 ([kernel.kallsyms])
Based on commit 20002ded4d93 ("perf_counter: powerpc: Add callchain support"),
incase of powerpc, the callchain saved by kernel always includes first
three entries as the NIP (next instruction pointer), LR (link register), and
the contents of LR save area in the second stack frame. In certain scenarios
its possible to have invalid kernel instruction addresses in either of LR or the
second stack frame's LR. In that case, kernel will store the address as zer0.
Hence, its possible to have 2nd or 3rd callstack entry as 0.
As per the current code in match_callstack_filter function, we skip the callstack
check incase we get 0 address. And hence the test case is failing in powerpc.
Fix this issue by updating the check in match_callstack_filter function,
to not skip callstack check if the 2nd or 3rd entry have 0 address
for powerpc.
Result in powerpc after patch changes:
[command]# ./perf test 81 -vv
81: kernel lock contention analysis test :
--- start ---
test child forked, pid 4570
Testing perf lock record and perf lock contention
Testing perf lock contention --use-bpf
[Skip] No BPF support
Testing perf lock record and perf lock contention at the same time
Testing perf lock contention --threads
Testing perf lock contention --lock-addr
Testing perf lock contention --type-filter (w/ spinlock)
Testing perf lock contention --lock-filter (w/ tasklist_lock)
[Skip] Could not find 'tasklist_lock'
Testing perf lock contention --callstack-filter (w/ unix_stream)
Testing perf lock contention --callstack-filter with task aggregation
Testing perf lock contention CSV output
[Skip] No BPF support
test child finished with 0
---- end ----
kernel lock contention analysis test: Ok
Fixes: ebab291641be ("perf lock contention: Support filters for different aggregation") Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com> Tested-by: Disha Goel <disgoel@linux.ibm.com> Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Cc: maddy@linux.ibm.com Cc: atrajeev@linux.vnet.ibm.com Link: https://lore.kernel.org/r/20231003092113.252380-1-kjain@linux.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Zero is not a valid IRQ for in-kernel code and the irq_of_parse_and_map()
function returns zero on error. So this check for valid IRQs should only
accept values > 0.
Fixes: 2b6b3b742019 ("ARM/dmaengine: edma: Merge the two drivers under drivers/dma/") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Acked-by: Peter Ujfalusi <peter.ujfalusi@gmail.com> Link: https://lore.kernel.org/r/f15cb6a7-8449-4f79-98b6-34072f04edbc@moroto.mountain Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The USB host on Tegra3 works with 32-bit alignment. Previous code tried
to align the buffer, but it did align the wrapper struct instead, so
the buffer was at a constant offset of 8 bytes (two pointers) from
expected alignment. Since kmalloc() guarantees at least 8-byte
alignment already, the alignment-extending is removed.
Tegra USB controllers seem to issue DMA in full 32-bit words only and thus
may overwrite unevenly-sized buffers. One such occurrence is detected by
SLUB when receiving a reply to a 1-byte buffer (below). Fix this by
allocating a bounce buffer also for buffers with sizes not a multiple of 4.
=============================================================================
BUG kmalloc-64 (Tainted: G B ): kmalloc Redzone overwritten
-----------------------------------------------------------------------------
Redzone 8555ccc0: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
Redzone 8555ccd0: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
Redzone 8555cce0: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
Redzone 8555ccf0: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
Object 8555cd00: 01 00 00 00 cc cc cc cc cc cc cc cc cc cc cc cc ................
Object 8555cd10: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
Object 8555cd20: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
Object 8555cd30: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
Redzone 8555cd40: cc cc cc cc ....
Padding 8555cd74: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZ
CPU: 3 PID: 41 Comm: kworker/3:1 Tainted: G B 6.6.0-rc1mq-00118-g59786f827ea1 #1115
Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
Workqueue: usb_hub_wq hub_event
[<8010ca28>] (unwind_backtrace) from [<801090a5>] (show_stack+0x11/0x14)
[<801090a5>] (show_stack) from [<805da2fb>] (dump_stack_lvl+0x4d/0x7c)
[<805da2fb>] (dump_stack_lvl) from [<8026464f>] (check_bytes_and_report+0xb3/0xe4)
[<8026464f>] (check_bytes_and_report) from [<802648e1>] (check_object+0x261/0x290)
[<802648e1>] (check_object) from [<802671b1>] (free_to_partial_list+0x105/0x3f8)
[<802671b1>] (free_to_partial_list) from [<80268613>] (__kmem_cache_free+0x103/0x128)
[<80268613>] (__kmem_cache_free) from [<80425a67>] (usb_get_status+0x73/0xac)
[<80425a67>] (usb_get_status) from [<80421b31>] (hub_probe+0x5e9/0xcec)
[<80421b31>] (hub_probe) from [<80428bbb>] (usb_probe_interface+0xbf/0x21c)
[<80428bbb>] (usb_probe_interface) from [<803ee13d>] (really_probe+0xa5/0x2c4)
[<803ee13d>] (really_probe) from [<803ee3d1>] (__driver_probe_device+0x75/0x174)
[<803ee3d1>] (__driver_probe_device) from [<803ee501>] (driver_probe_device+0x31/0x94)
usb 1-1: device descriptor read/8, error -71
In _dwc2_hcd_urb_enqueue(), "urb->hcpriv = NULL" is executed without
holding the lock "hsotg->lock". In _dwc2_hcd_urb_dequeue():
spin_lock_irqsave(&hsotg->lock, flags);
...
if (!urb->hcpriv) {
dev_dbg(hsotg->dev, "## urb->hcpriv is NULL ##\n");
goto out;
}
rc = dwc2_hcd_urb_dequeue(hsotg, urb->hcpriv); // Use urb->hcpriv
...
out:
spin_unlock_irqrestore(&hsotg->lock, flags);
When _dwc2_hcd_urb_enqueue() and _dwc2_hcd_urb_dequeue() are
concurrently executed, the NULL check of "urb->hcpriv" can be executed
before "urb->hcpriv = NULL". After urb->hcpriv is NULL, it can be used
in the function call to dwc2_hcd_urb_dequeue(), which can cause a NULL
pointer dereference.
This possible bug is found by an experimental static analysis tool
developed by myself. This tool analyzes the locking APIs to extract
function pairs that can be concurrently executed, and then analyzes the
instructions in the paired functions to identify possible concurrency
bugs including data races and atomicity violations. The above possible
bug is reported, when my tool analyzes the source code of Linux 6.5.
To fix this possible bug, "urb->hcpriv = NULL" should be executed with
holding the lock "hsotg->lock". After using this patch, my tool never
reports the possible bug, with the kernelconfiguration allyesconfig for
x86_64. Because I have no associated hardware, I cannot test the patch
in runtime testing, and just verify it according to the code logic.
idxd sub-drivers belong to bus dsa_bus_type. Thus, dsa_bus_type must be
registered in dsa bus init before idxd drivers can be registered.
But the order is wrong when both idxd and idxd_bus are builtin drivers.
In this case, idxd driver is compiled and linked before idxd_bus driver.
Since the initcall order is determined by the link order, idxd sub-drivers
are registered in idxd initcall before dsa_bus_type is registered
in idxd_bus initcall. idxd initcall fails:
[ 21.562803] calling idxd_init_module+0x0/0x110 @ 1
[ 21.570761] Driver 'idxd' was unable to register with bus_type 'dsa' because the bus was not initialized.
[ 21.586475] initcall idxd_init_module+0x0/0x110 returned -22 after 15717 usecs
[ 21.597178] calling dsa_bus_init+0x0/0x20 @ 1
To fix the issue, compile and link idxd_bus driver before idxd driver
to ensure the right registration order.
Fixes: d9e5481fca74 ("dmaengine: dsa: move dsa_bus_type out of idxd driver to standalone") Reported-by: Michael Prinke <michael.prinke@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Lijun Pan <lijun.pan@intel.com> Tested-by: Lijun Pan <lijun.pan@intel.com> Link: https://lore.kernel.org/r/20230924162232.1409454-1-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The BTF func proto for a tracepoint has one more argument than the
actual tracepoint function since it has a context argument at the
begining. So it should compare to 5 when the tracepoint has 4
arguments.
Also, recent change in the perf tool would use a hand-written minimal
vmlinux.h to generate BTF in the skeleton. So it won't have the info
of the tracepoint. Anyway it should use the kernel's vmlinux BTF to
check the type in the kernel.
Fixes: b36888f71c85 ("perf record: Handle argument change in sched_switch") Reviewed-by: Ian Rogers <irogers@google.com> Acked-by: Song Liu <song@kernel.org> Cc: Hao Luo <haoluo@google.com> CC: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20230922234444.3115821-1-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
We usually do reverse order of enable() for disable(). Currently, the
ordering of irq_chip_disable_parent() is not correct in
rzg2l_gpio_irq_disable(). Fix the incorrect order.
Fuzzing found that an invalid tracepoint name would create a memory
leak with an address sanitizer build:
```
$ perf stat -e '*:o/' true
event syntax error: '*:o/'
\___ parser error
Run 'perf list' for a list of valid events
Usage: perf stat [<options>] [<command>]
-e, --event <event> event selector. use 'perf list' to list available events
Direct leak of 4 byte(s) in 2 object(s) allocated from:
#0 0x7f38ac07077b in __interceptor_strdup ../../../../src/libsanitizer/asan/asan_interceptors.cpp:439
#1 0x55f2f41be73b in str util/parse-events.l:49
#2 0x55f2f41d08e8 in parse_events_lex util/parse-events.l:338
#3 0x55f2f41dc3b1 in parse_events_parse util/parse-events-bison.c:1464
#4 0x55f2f410b8b3 in parse_events__scanner util/parse-events.c:1822
#5 0x55f2f410d1b9 in __parse_events util/parse-events.c:2094
#6 0x55f2f410e57f in parse_events_option util/parse-events.c:2279
#7 0x55f2f4427b56 in get_value tools/lib/subcmd/parse-options.c:251
#8 0x55f2f4428d98 in parse_short_opt tools/lib/subcmd/parse-options.c:351
#9 0x55f2f4429d80 in parse_options_step tools/lib/subcmd/parse-options.c:539
#10 0x55f2f442acb9 in parse_options_subcommand tools/lib/subcmd/parse-options.c:654
#11 0x55f2f3ec99fc in cmd_stat tools/perf/builtin-stat.c:2501
#12 0x55f2f4093289 in run_builtin tools/perf/perf.c:322
#13 0x55f2f40937f5 in handle_internal_command tools/perf/perf.c:375
#14 0x55f2f4093bbd in run_argv tools/perf/perf.c:419
#15 0x55f2f409412b in main tools/perf/perf.c:535
SUMMARY: AddressSanitizer: 4 byte(s) leaked in 2 allocation(s).
```
Fix by adding the missing destructor.
Fixes: 865582c3f48e ("perf tools: Adds the tracepoint name parsing support") Signed-off-by: Ian Rogers <irogers@google.com> Cc: He Kuang <hekuang@huawei.com> Link: https://lore.kernel.org/r/20230914164028.363220-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
This reverts commit e571e029bdbf ("perf tools: Enable indices setting
syntax for BPF map").
The reverted commit added a notion of arrays that could be set as
event terms for BPF events. The parsing hasn't worked over multiple
Linux releases. Given the broken nature of the parsing it appears the
code isn't in use, nor could I find a way for it to be used to add a
test.
The original commit contains a test in the commit message,
however, running it yields:
```
$ perf record -e './test_bpf_map_3.c/map:channel.value[0,1,2,3...5]=101/' usleep 2
event syntax error: '..pf_map_3.c/map:channel.value[0,1,2,3...5]=101/'
\___ parser error
Run 'perf list' for a list of valid events
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-e, --event <event> event selector. use 'perf list' to list available events
```
Given the code can't be used this commit reverts and removes it.
Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rob Herring <robh@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Cc: Wang ShaoBo <bobo.shaobowang@huawei.com> Cc: YueHaibing <yuehaibing@huawei.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20230728001212.457900-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Stable-dep-of: ede72dca45b1 ("perf parse-events: Fix tracepoint name memory leak") Signed-off-by: Sasha Levin <sashal@kernel.org>
There is a pid leakage:
------------------------------
unreferenced object 0xffff88810c181940 (size 224):
comm "sshd", pid 8191, jiffies 4294946950 (age 524.570s)
hex dump (first 32 bytes):
01 00 00 00 00 00 00 00 00 00 00 00 ad 4e ad de .............N..
ff ff ff ff 6b 6b 6b 6b ff ff ff ff ff ff ff ff ....kkkk........
backtrace:
[<ffffffff814774e6>] kmem_cache_alloc+0x5c6/0x9b0
[<ffffffff81177342>] alloc_pid+0x72/0x570
[<ffffffff81140ac4>] copy_process+0x1374/0x2470
[<ffffffff81141d77>] kernel_clone+0xb7/0x900
[<ffffffff81142645>] __se_sys_clone+0x85/0xb0
[<ffffffff8114269b>] __x64_sys_clone+0x2b/0x30
[<ffffffff83965a72>] do_syscall_64+0x32/0x80
[<ffffffff83a00085>] entry_SYSCALL_64_after_hwframe+0x61/0xc6
It turns out that there is a race condition between disassociate_ctty() and
tty_signal_session_leader(), which caused this leakage.
The pid memleak is triggered by the following race:
task[sshd] task[bash]
----------------------- -----------------------
disassociate_ctty();
spin_lock_irq(¤t->sighand->siglock);
put_pid(current->signal->tty_old_pgrp);
current->signal->tty_old_pgrp = NULL;
tty = tty_kref_get(current->signal->tty);
spin_unlock_irq(¤t->sighand->siglock);
tty_vhangup();
tty_lock(tty);
...
tty_signal_session_leader();
spin_lock_irq(&p->sighand->siglock);
...
if (tty->ctrl.pgrp) //tty->ctrl.pgrp is not NULL
p->signal->tty_old_pgrp = get_pid(tty->ctrl.pgrp); //An extra get
spin_unlock_irq(&p->sighand->siglock);
...
tty_unlock(tty);
if (tty) {
tty_lock(tty);
...
put_pid(tty->ctrl.pgrp);
tty->ctrl.pgrp = NULL; //It's too late
...
tty_unlock(tty);
}
The issue is believed to be introduced by commit c8bcd9c5be24 ("tty:
Fix ->session locking") who moves the unlock of siglock in
disassociate_ctty() above "if (tty)", making a small window allowing
tty_signal_session_leader() to kick in. It can be easily reproduced by
adding a delay before "if (tty)" and at the entrance of
tty_signal_session_leader().
To fix this issue, we move "put_pid(current->signal->tty_old_pgrp)" after
"tty->ctrl.pgrp = NULL".
In f2fs_put_super(), it tries to do sanity check on dirty and IO
reference count of f2fs, once there is any reference count leak,
it will trigger panic.
The root case is, during f2fs_put_super(), if there is any IO error
in f2fs_wait_on_all_pages(), we missed to truncate meta_inode's page
cache later, result in panic, fix this case.
Fixes: 20872584b8c0 ("f2fs: fix to drop all dirty meta/node pages during umount()") Reported-by: syzbot+ebd7072191e2eddd7d6e@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-f2fs-devel/000000000000a14f020604a62a98@google.com Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
With below script, redundant compress extension will be parsed and added
by parse_options(), because parse_options() doesn't check whether the
extension is existed or not, fix it.
1. mount -t f2fs -o compress_extension=so /dev/vdb /mnt/f2fs
2. mount -t f2fs -o remount,compress_extension=so /mnt/f2fs
3. mount|grep f2fs
/dev/vdb on /mnt/f2fs type f2fs (...,compress_extension=so,compress_extension=so,...)
Fixes: 4c8ff7095bef ("f2fs: support data compression") Fixes: 151b1982be5d ("f2fs: compress: add nocompress extensions support") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
In f2fs_read_multi_pages(), once f2fs_decompress_cluster() was called if
we hit cached page in compress_inode's cache, dic may be released, it needs
break the loop rather than continuing it, in order to avoid accessing
invalid dic pointer.
INFO: task sync:4788 blocked for more than 120 seconds.
Not tainted 6.5.0-rc1+ #322
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:sync state:D stack:0 pid:4788 ppid:509 flags:0x00000002
Call Trace:
<TASK>
__schedule+0x335/0xf80
schedule+0x6f/0xf0
wb_wait_for_completion+0x5e/0x90
sync_inodes_sb+0xd8/0x2a0
sync_inodes_one_sb+0x1d/0x30
iterate_supers+0x99/0xf0
ksys_sync+0x46/0xb0
__do_sys_sync+0x12/0x20
do_syscall_64+0x3f/0x90
entry_SYSCALL_64_after_hwframe+0x6e/0xd8
The reason is f2fs_all_cluster_page_ready() assumes that pages array should
cover at least one cluster, otherwise, it will always return false, result
in deadloop.
By default, pages array size is 16, and it can cover the case cluster_size
is equal or less than 16, for the case cluster_size is larger than 16, let's
allocate memory of pages array dynamically.
Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
1. Atoms are managed in page mode and should be released using atom_free()
instead of free().
2. When the event does not match, the atom needs to free.
Fixes: f98919ec4fccdacf ("perf kwork: Implement 'report' subcommand") Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/r/20230812084917.169338-2-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The devm_clk_get_enabled() helper:
- calls devm_clk_get()
- calls clk_prepare_enable() and registers what is needed in order to
call clk_disable_unprepare() when needed, as a managed resource.
Also replace devm_regulator_get() and regulator_enable() with
devm_regulator_get_enable() helper and remove regulator_disable().
Replace iio_device_register() with devm_iio_device_register() and remove
iio_device_unregister().
And st->reg is not used anymore, so remove it.
As Jonathan pointed out, couple of things that are wrong:
1) The device is powered down 'before' we unregister it with the
subsystem and as such userspace interfaces are still exposed which
probably won't do the right thing if the chip is powered down.
2) This isn't done in the error paths in probe.
To solve this problem, register a new callback adf4350_power_down()
with devm_add_action_or_reset(), to enable software power down in both
error and device detach path. So the remove function can be removed.
Remove spi_set_drvdata() from the probe function, since spi_get_drvdata()
is not used anymore.
Fixes: e31166f0fd48 ("iio: frequency: New driver for Analog Devices ADF4350/ADF4351 Wideband Synthesizers") Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Link: https://lore.kernel.org/r/20230828062717.2310219-1-ruanjinjie@huawei.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Generating metrics llc_code_read_mpi_demand_plus_prefetch,
llc_data_read_mpi_demand_plus_prefetch,
llc_miss_local_memory_bandwidth_read,
llc_miss_local_memory_bandwidth_write,
nllc_miss_remote_memory_bandwidth_read, memory_bandwidth_read,
memory_bandwidth_write, uncore_frequency, upi_data_transmit_bw,
C2_Pkg_Residency, C3_Core_Residency, C3_Pkg_Residency,
C6_Core_Residency, C6_Pkg_Residency, C7_Core_Residency,
C7_Pkg_Residency, UNCORE_FREQ and tma_info_system_socket_clks would
trigger an address sanitizer heap-buffer-overflows on a SkylakeX.
```
==2567752==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x5020003ed098 at pc 0x5621a816654e bp 0x7fffb55d4da0 sp 0x7fffb55d4d98
READ of size 4 at 0x5020003eee78 thread T0
#0 0x558265d6654d in aggr_cpu_id__is_empty tools/perf/util/cpumap.c:694:12
#1 0x558265c914da in perf_stat__get_aggr tools/perf/builtin-stat.c:1490:6
#2 0x558265c914da in perf_stat__get_global_cached tools/perf/builtin-stat.c:1530:9
#3 0x558265e53290 in should_skip_zero_counter tools/perf/util/stat-display.c:947:31
#4 0x558265e53290 in print_counter_aggrdata tools/perf/util/stat-display.c:985:18
#5 0x558265e51931 in print_counter tools/perf/util/stat-display.c:1110:3
#6 0x558265e51931 in evlist__print_counters tools/perf/util/stat-display.c:1571:5
#7 0x558265c8ec87 in print_counters tools/perf/builtin-stat.c:981:2
#8 0x558265c8cc71 in cmd_stat tools/perf/builtin-stat.c:2837:3
#9 0x558265bb9bd4 in run_builtin tools/perf/perf.c:323:11
#10 0x558265bb98eb in handle_internal_command tools/perf/perf.c:377:8
#11 0x558265bb9389 in run_argv tools/perf/perf.c:421:2
#12 0x558265bb9389 in main tools/perf/perf.c:537:3
```
The issue was the use of testing a cpumap with NULL rather than using
empty, as a map containing the dummy value isn't NULL and the -1
results in an empty aggr map being allocated which legitimately
overflows when any member is accessed.
Fixes: 8a96f454f5668572 ("perf stat: Avoid SEGV if core.cpus isn't set") Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230906003912.3317462-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
profile->disconnected was storing an invalid reference to the
disconnected path. Fix it by duplicating the string using
aa_unpack_strdup and freeing accordingly.
Fixes: 72c8a768641d ("apparmor: allow profiles to provide info to disconnected paths") Signed-off-by: Georgia Garcia <georgia.garcia@canonical.com> Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Sasha Levin <sashal@kernel.org>