Big-endian GENERIC_CPU supports 970, but builds with -mcpu=power5.
POWER5 is ISA v2.02 whereas 970 is v2.01 plus Altivec. 2.02 added
the popcntb instruction which a compiler might use.
As reported[1] by Arnd, the arch-specific fadvise64_64 and fallocate
compatibility handlers assume parameters are passed with 32-bit
big-endian ABI. This affects the assignment of odd-even parameter pairs
to the high or low words of a 64-bit syscall parameter.
Fix fadvise64_64 fallocate compat handlers to correctly swap upper/lower
32 bits conditioned on endianness.
A future patch will replace the arch-specific compat fallocate with an
asm-generic implementation. This patch is intended for ease of
back-port.
Partition partition@20000 contains generic kernel image and it does not
have to be used only for rescue purposes. Partition partition@1c0000
contains bootable rescue system and partition partition@340000 contains
factory image/data for restoring to NAND. So change partition labels to
better fit their purpose by removing possible misleading substring "rootfs"
from these labels.
Fixes: 54c15ec3b738 ("powerpc: dts: Add DTS file for CZ.NIC Turris 1.x routers") Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220830225500.8856-1-pali@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently, we are using CPU_PM_CPU_IDLE_ENTER_PARAM() for all SBI HSM
suspend types so retentive suspend types are also treated non-retentive
and kernel will do redundant additional work for these states.
The BIT[31] of SBI HSM suspend types allows us to differentiate between
retentive and non-retentive suspend types so we should use this BIT
to call appropriate CPU_PM_CPU_IDLE_ENTER_xyz() macro.
Fixes: 6abf32f1d9c5 ("cpuidle: Add RISC-V SBI CPU idle driver") Signed-off-by: Anup Patel <apatel@ventanamicro.com> Link: https://lore.kernel.org/r/20220718084553.2056169-1-apatel@ventanamicro.com/ Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
In fsl_setup_msi_irqs(), use of_node_put() to drop the reference
returned by of_parse_phandle().
Fixes: 895d603f945ba ("powerpc/fsl_msi: add support for the fsl, msi property in PCI nodes") Co-authored-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: Liang He <windhl@126.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220704145233.278539-1-windhl@126.com Signed-off-by: Sasha Levin <sashal@kernel.org>
When building with a recent version of clang, there are a couple of
errors around the call to module_init():
arch/powerpc/math-emu/math_efp.c:927:1: error: type specifier missing, defaults to 'int'; ISO C99 and later do not support implicit int [-Wimplicit-int]
module_init(spe_mathemu_init);
^
int
arch/powerpc/math-emu/math_efp.c:927:13: error: a parameter list without types is only allowed in a function definition
module_init(spe_mathemu_init);
^
2 errors generated.
module_init() is a macro, which is not getting expanded because module.h
is not included in this file. Add the include so that the macro can
expand properly, clearing up the build failure.
commit db7cfc380900 ("ipc: Free mq_sysctls if ipc namespace creation
failed")
Here's a similar memory leak to the one fixed by the patch above.
retire_mq_sysctls need to be called when init_mqueue_fs fails after
setup_mq_sysctls.
Fixes: dc55e35f9e81 ("ipc: Store mqueue sysctls in the ipc namespace") Signed-off-by: Hangyu Hua <hbh25y@gmail.com> Link: https://lkml.kernel.org/r/20220715062301.19311-1-hbh25y@gmail.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The mailbox offset is not only used for receiving messages, but it is
also used by messages sent to the system controller by Linux that have a
payload, such as the "digital signature service". It is also overloaded
by certain other services (reprogramming of the FPGA fabric, see Link:)
to have a meaning other than the offset the system controller should
read from.
When the driver was written, no such services of the latter type were
in use & those of the former used an offset of zero so this has gone
un-noticed.
The "data" region of the PolarFire SoC's system controller mailbox is
not one continuous register space - the system controller's QSPI sits
between the control and data registers. Split the "data" reg into two
parts: "data" & "control". Optionally get the "data" register address
from the 3rd reg property in the devicetree & fall back to using the
old base + MAILBOX_REG_OFFSET that the current code uses.
pm_runtime_get_sync() will increment pm usage counter.
Forgetting to putting operation will result in reference leak.
Add missing pm_runtime_put_sync in some error paths.
Fixes: 9ac33b0ce81f ("CLK: TI: Driver for DRA7 ATL (Audio Tracking Logic)") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Link: https://lore.kernel.org/r/20220602030838.52057-1-linmq006@gmail.com Reviewed-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
In ti_find_clock_provider(), of_find_node_by_name() will call
of_node_put() for the 'from' argument, possibly putting the node one too
many times. Let's maintain the of_node_get() from the previous search
and only put when we're exiting the function early. This should avoid a
misbalanced reference count on the node.
Fixes: 51f661ef9a10 ("clk: ti: Add ti_find_clock_provider() to use clock-output-names") Signed-off-by: Liang He <windhl@126.com> Link: https://lore.kernel.org/r/20220915031121.4003589-1-windhl@126.com
[sboyd@kernel.org: Rewrite commit text, maintain reference instead of
get again] Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The enet_qos_root_clk takes sim_enet_root_clk as parent. When
registering enet_qos_root_clk, it will be put into clk orphan list,
because sim_enet_root_clk is not ready.
When sim_enet_root_clk is ready, clk_core_reparent_orphans_nolock will
set enet_qos_root_clk parent to sim_enet_root_clk.
Because CLK_OPS_PARENT_ENABLE is set, sim_enet_root_clk will be
enabled and disabled during the enet_qos_root_clk reparent phase.
All the above are correct. But with M7 booted early and using
enet, M7 enet feature will be broken, because clk driver probe phase
disable the needed clks, in case M7 firmware not configure
sim_enet_root_clk.
And tune the order would also save cpu cycles.
Reviewed-by: Ye Li <ye.li@nxp.com> Signed-off-by: Peng Fan <peng.fan@nxp.com> Reviewed-by: Abel Vesa <abel.vesa@linaro.org> Signed-off-by: Abel Vesa <abel.vesa@linaro.org> Link: https://lore.kernel.org/r/20220815013428.476015-1-peng.fan@oss.nxp.com
Stable-dep-of: 855ae87a2073 ("clk: imx: scu: fix memleak on platform_device_add() fails") Signed-off-by: Sasha Levin <sashal@kernel.org>
The return value of bcm2835_clock_rate_from_divisor is always unsigned
and also all caller expect this. So fix the declaration accordingly.
Fixes: 41691b8862e2 ("clk: bcm2835: Add support for programming the audio domain clocks") Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com> Link: https://lore.kernel.org/r/20220904141037.38816-1-stefan.wahren@i2se.com Reviewed-by: Ivan T. Ivanov <iivanov@suse.de> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
When testing for a series affecting the VEC, it was discovered that
turning off and on the VEC clock is crashing the system.
It turns out that, when disabling the VEC clock, it's the only child of
the PLLC-per clock which will also get disabled. The source of the crash
is PLLC-per being disabled.
It's likely that some other device might not take a clock reference that
it actually needs, but it's unclear which at this point. Let's make
PLLC-per critical so that we don't have that crash.
It turns out the internal SATA reference clock signal will stay
unavailable for the SATA interface consumer until the buffer on it's way
is ungated. So aside with having the actual clock divider enabled we need
to ungate a buffer placed on the signal way to the SATA controller (most
likely some rudiment from the initial SoC release). Seeing the switch flag
is placed in the same register as the SATA-ref clock divider at a
non-standard ffset, let's implement it as a separate clock controller with
the set-rate propagation to the parental clock divider wrapper. As such
we'll be able to disable/enable and still change the original clock source
rate.
Baikal-T1 CCU reference manual says that both xGMAC reference and xGMAC
PTP clocks are generated by two different wrappers with the same constant
divider thus each producing a 156.25 MHz signal. But for some reason both
of these clock sources are gated by a single switch-flag in the CCU
registers space - CCU_SYS_XGMAC_BASE.BIT(0). In order to make the clocks
handled independently we need to define a shared parental gate so the base
clock signal would be switched off only if both of the child-clocks are
disabled.
Note the ID is intentionally set to -2 since we are going to add a one
more internal clock identifier in the next commit.
Most likely due to copy-paste mistake the divider has been set to 10 while
according to the SoC reference manual it's supposed to be 8 thus having
PTP clock frequency of 156.25 MHz.
We have discovered random glitches during the system boot up procedure.
The problem investigation led us to the weird outcomes: when none of the
Renesas 5P49V6901 ports are explicitly enabled by the kernel driver, the
glitches disappeared. It was a mystery since the SoC external clock
domains were fed with different 5P49V6901 outputs. The driver code didn't
seem like bogus either. We almost despaired to find out a root cause when
the solution has been found for a more modern revision of the chip. It
turned out the 5P49V6901 clock generator stopped its output for a short
period of time during the VC5_OUT_DIV_CONTROL register writing. The same
problem was found for the 5P49V6965 revision of the chip and was
successfully fixed in commit fc336ae622df ("clk: vc5: fix output disabling
when enabling a FOD") by enabling the "bypass_sync" flag hidden inside
"Unused Factory Reserved Register". Even though the 5P49V6901 registers
description and programming guide doesn't provide any intel regarding that
flag, setting it up anyway in the officially unused register completely
eliminated the denoted glitches. Thus let's activate the functionality
submitted in commit fc336ae622df ("clk: vc5: fix output disabling when
enabling a FOD") for the Renesas 5P49V6901 chip too in order to remove the
ports implicit inter-dependency.
Fixes: dbf6b16f5683 ("clk: vc5: Add support for IDT VersaClock 5P49V6901") Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Reviewed-by: Luca Ceresoli <luca@lucaceresoli.net> Link: https://lore.kernel.org/r/20220929225402.9696-2-Sergey.Semin@baikalelectronics.ru Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Correct the way that duplicate PPID mappings are handled for PMIC
arbiter v5. The final APID mapped to a given PPID should be the
one which has write owner = APPS EE, if it exists, or if not
that, then the first APID mapped to the PPID, if it exists.
When the dr_mode is "host", after the host enter runtime suspend,
the mtu3 can't do it, because the mtu3's device wakeup function is
not enabled, instead it's enabled in gadget init function, to fix
the issue, init wakeup early in mtu3's probe()
Fixes: 6b587394c65c ("usb: mtu3: support suspend/resume for dual-role mode") Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reported-by: Tianping Fang <tianping.fang@mediatek.com> Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com> Link: https://lore.kernel.org/r/20220929064459.32522-1-chunfeng.yun@mediatek.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Based on num_hid_devices, each sensor device registers to HID. If
"no sensors" then amd_sfh work initialization and scheduling
doesn’t make sense and return ENODEV to stop driver probe.
Hence add a check for num_hid_devices to handle special
case in the situation of "no sensors" for SFH1.1.
User reports observing timer event report channel halted but no error
observed in CHANERR register. The driver finished self-test and released
channel resources. Debug shows that __cleanup() can call
mod_timer() after the timer has been deleted and thus resurrect the
timer. While harmless, it causes suprious error message to be emitted.
Use mod_timer_pending() call to prevent deleted timer from being
resurrected.
We can't call these off the kiocb completion as that might be off
soft/hard irq context. Defer the calls to when we process the
task_work for this request. That avoids valid complaints like:
During the previous |struct clk| to |struct clk_hw| clk provider API
migration in commit 6f691a586296 ("clk: mediatek: Switch to clk_hw
provider APIs"), a few clk_unregister_*() calls were missed.
Migrate the remaining ones to the |struct clk_hw| provider API, i.e.
change clk_unregister_*() to clk_hw_unregister_*().
Fixes: 6f691a586296 ("clk: mediatek: Switch to clk_hw provider APIs") Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20220926102523.2367530-3-wenst@chromium.org Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
When the cleanup paths for the various clk register APIs in the MediaTek
clk library were added, the one in the dividers type used the wrong type
of unregister function. This would result in incorrect dereferencing of
the clk pointer and freeing of invalid pointers.
Fix this by switching to the correct type of clk unregistration call.
Fixes: 3c3ba2ab0226 ("clk: mediatek: mtk: Implement error handling in register APIs") Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20220926102523.2367530-2-wenst@chromium.org Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The MFG_BG3D is a gate to enable/disable clock output to the GPU,
but the actual output is decided by multiple muxes; in particular:
mfg_ck_fast_ref muxes between "slow" (top_mfg_core_tmp) and
"fast" (MFGPLL) clock, while top_mfg_core_tmp muxes between the
26MHz clock and various system PLLs.
The clock gate comes after all the muxes, so its parent is
mfg_ck_fast_reg, not top_mfg_core_tmp.
Reparent MFG_BG3D to the latter to match the hardware and add the
CLK_SET_RATE_PARENT flag to it: this way we ensure propagating
rate changes that are requested on MFG_BG3D along its entire clock
tree.
In da9062_i2c_probe() regmap_clear_bits() tries to access CONFIG_J
register. As CONFIG_J is not present in da9061_aa_writeable_ranges[] probe
of da9061 fails:
da9062 2-0058: Entering I2C mode!
da9062 2-0058: Failed to set Two-Wire Bus Mode.
da9062: probe of 2-0058 failed with error -5
Add CONFIG_J register to da9061_aa_writeable_ranges[].
Fixes: 5c6f0f456351 ("mfd: da9062: Support SMBus and I2C mode") Signed-off-by: Jens Hillenstedt <jens.hillenstedt@ise.de> Reviewed-by: Adam Ward <DLG-Adam.Ward.opensource@dm.renesas.com> Signed-off-by: Lee Jones <lee@kernel.org> Link: https://lore.kernel.org/r/20220915092004.168744-1-jens.hillenstedt@ise.de Signed-off-by: Sasha Levin <sashal@kernel.org>
The mx25_tsadc_remove() function assumes all non-zero returns are success
but the platform_get_irq() function returns negative on error and
positive non-zero values on success. It never returns zero, but if it
did then treat that as a success.
Fixes: 18f773937968 ("mfd: fsl-imx25: Clean up irq settings during removal") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Martin Kaiser <martin@kaiser.cx> Signed-off-by: Lee Jones <lee@kernel.org> Link: https://lore.kernel.org/r/YvTfkbVQWYKMKS/t@kili Signed-off-by: Sasha Levin <sashal@kernel.org>
In lp8788_irq_init(), if an error occurs after a successful
irq_domain_add_linear() call, it must be undone by a corresponding
irq_domain_remove() call.
irq_domain_remove() should also be called in lp8788_irq_exit() for the same
reason.
If devm_of_platform_populate() fails, some resources need to be
released.
Introduce a mx25_tsadc_unset_irq() function that undoes
mx25_tsadc_setup_irq() and call it both from the new error handling path
of the probe and in the remove function.
The commit in Fixes: has added a pwm_add_table() call in the probe() and
a pwm_remove_table() call in the remove(), but forget to update the error
handling path of the probe.
Add the missing pwm_remove_table() call.
Fixes: a3aa9a93df9f ("mfd: intel_soc_pmic_core: ADD PWM lookup table for CRC PMIC based PWM") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Lee Jones <lee@kernel.org> Link: https://lore.kernel.org/r/20220801114211.36267-1-andriy.shevchenko@linux.intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
If allocation fails, the ida_simple_get() will return error number.
So master->idx could be error number and be used in dev_set_name().
Therefore, it should be better to check it and return error if fails,
like the ida_simple_get() in __fsi_get_new_minor().
Fixes: 09aecfab93b8 ("drivers/fsi: Add fsi master definition") Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Reviewed-by: Eddie James <eajames@linux.ibm.com> Link: https://lore.kernel.org/r/20220111073411.614138-1-jiasheng@iscas.ac.cn Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently in resize_finish() in rxe_queue.c there is a loop which copies
the entries in the original queue into a newly allocated queue. The
termination logic for this loop is incorrect. The call to
queue_next_index() updates cons but has no effect on whether the queue is
empty. So if the queue starts out empty nothing is copied but if it is not
then the loop will run forever. This patch changes the loop to compare the
value of cons to the original producer index.
Fixes: ae6e843fe08d0 ("RDMA/rxe: Add memory barriers to kernel queues") Link: https://lore.kernel.org/r/20220825221446.6512-1-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Move setting of pd in mr objects ahead of any possible errors so that it
will always be set in rxe_mr_cleanup() to avoid seg faults when
rxe_put(mr_pd(mr)) is called.
Fixes: cf40367961d8 ("RDMA/rxe: Move mr cleanup code to rxe_mr_cleanup()") Link: https://lore.kernel.org/r/20220805183153.32007-2-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently blktests nvme/002 trips up debugobjects if CONFIG_NVME_AUTH is
enabled, but authentication is not on a queue. This is because
nvmet_auth_sq_free cancels sq->auth_expired_work unconditionaly, while
auth_expired_work is only ever initialized if authentication is enabled
for a given controller.
Fix this by calling most of what is nvmet_init_auth unconditionally
when initializing the SQ, and just do the setting of the result
field in the connect command handler.
While fixing up the driver I noticed that my IPQ8074 board was hanging
after CPUFreq switched the frequency during boot, WDT would eventually
reset it.
So mark apcs_alias0_core_clk as critical since its the clock feeding the
CPU cluster and must never be disabled.
Fix a NULL pointer crash that occurs when we are freeing the socket at the
same time we access it via sysfs.
The problem is that:
1. iscsi_sw_tcp_conn_get_param() and iscsi_sw_tcp_host_get_param() take
the frwd_lock and do sock_hold() then drop the frwd_lock. sock_hold()
does a get on the "struct sock".
2. iscsi_sw_tcp_release_conn() does sockfd_put() which does the last put
on the "struct socket" and that does __sock_release() which sets the
sock->ops to NULL.
3. iscsi_sw_tcp_conn_get_param() and iscsi_sw_tcp_host_get_param() then
call kernel_getpeername() which accesses the NULL sock->ops.
Above we do a get on the "struct sock", but we needed a get on the "struct
socket". Originally, we just held the frwd_lock the entire time but in
commit bcf3a2953d36 ("scsi: iscsi: iscsi_tcp: Avoid holding spinlock while
calling getpeername()") we switched to refcount based because the network
layer changed and started taking a mutex in that path, so we could no
longer hold the frwd_lock.
Instead of trying to maintain multiple refcounts, this just has us use a
mutex for accessing the socket in the interface code paths.
Link: https://lore.kernel.org/r/20220907221700.10302-1-michael.christie@oracle.com Fixes: bcf3a2953d36 ("scsi: iscsi: iscsi_tcp: Avoid holding spinlock while calling getpeername()") Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The issue is that the per-device running_req read in
pm8001_dev_gone_notify() never goes to zero and we never make progress.
This is caused by missing accounting for running_req for when an internal
abort command completes.
In commit 2cbbf489778e ("scsi: pm8001: Use libsas internal abort support")
we started to send internal abort commands as a proper sas_task. In this
when we deliver a sas_task to HW the per-device running_req is incremented
in pm8001_queue_command(). However it is never decremented for internal
abort commnds, so decrement in pm8001_mpi_task_abort_resp().
Link: https://lore.kernel.org/r/1663854664-76165-1-git-send-email-john.garry@huawei.com Fixes: 2cbbf489778e ("scsi: pm8001: Use libsas internal abort support") Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
When executing SMP task failed, the smp_execute_task_sg() calls del_timer()
to delete "slow_task->timer". However, if the timer handler
sas_task_internal_timedout() is running, the del_timer() in
smp_execute_task_sg() will not stop it and a UAF will happen. The process
is shown below:
Fix by calling del_timer_sync() in smp_execute_task_sg(), which makes sure
the timer handler have finished before the "task->slow_task" is
deallocated.
Link: https://lore.kernel.org/r/20220920144213.10536-1-duoming@zju.edu.cn Fixes: 2908d778ab3e ("[SCSI] aic94xx: new driver") Reviewed-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Since commit edc6afc54968 ("tty: switch to ktermios and new framework")
termios speed is no longer stored only in c_cflag member but also in new
additional c_ispeed and c_ospeed members. If BOTHER flag is set in c_cflag
then termios speed is stored only in these new members.
Since commit 027b57170bf8 ("serial: core: Fix initializing and restoring
termios speed") termios speed is available also in struct console.
So properly restore also c_ispeed and c_ospeed members after suspend to fix
restoring termios speed which is not represented by Bnnn constant.
Fixes: 4516d50aabed ("serial: 8250: Use canary to restart console after suspend") Signed-off-by: Pali Rohár <pali@kernel.org> Link: https://lore.kernel.org/r/20220924104324.4035-1-pali@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently the gsmi driver registers a panic notifier as well as
reboot and die notifiers. The callbacks registered are called in
atomic and very limited context - for instance, panic disables
preemption and local IRQs, also all secondary CPUs (not executing
the panic path) are shutdown.
With that said, taking a spinlock in this scenario is a dangerous
invitation for lockup scenarios. So, fix that by checking if the
spinlock is free to acquire in the panic notifier callback - if not,
bail-out and avoid a potential hang.
Fixes: 74c5b31c6618 ("driver: Google EFI SMI") Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: David Gow <davidgow@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Julius Werner <jwerner@chromium.org> Cc: Petr Mladek <pmladek@suse.com> Reviewed-by: Evan Green <evgreen@chromium.org> Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Link: https://lore.kernel.org/r/20220909200755.189679-1-gpiccoli@igalia.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
No error handling is performed when platform_device_add()
return fails. Refer to the error handling of driver_set_override(),
add error handling for platform_device_add().
In some initialization functions of this driver, memory is allocated with
'i' acting as an index variable and increasing from 0. The commit in
"Fixes" introduces some clean-up codes in case of allocation failure,
which free memory in reverse order with 'i' decreasing to 0. However,
there are some problems:
- The case i=0 is left out. Thus memory is leaked.
- In case memory allocation fails right from the start, the memory
freeing loops will start with i=-1 and invalid memory locations will
be accessed.
One of these loops has been fixed in commit c8ff91535880 ("staging:
vt6655: fix potential memory leak"). Fix the remaining erroneous loops.
drivers/phy/qualcomm/phy-qcom-usb-hsic.c:82 qcom_usb_hsic_phy_power_on()
warn: 'uphy->cal_clk' from clk_prepare_enable() not released on lines:
58.
drivers/phy/qualcomm/phy-qcom-usb-hsic.c:82 qcom_usb_hsic_phy_power_on()
warn: 'uphy->cal_sleep_clk' from clk_prepare_enable() not released on
lines: 58.
drivers/phy/qualcomm/phy-qcom-usb-hsic.c:82 qcom_usb_hsic_phy_power_on()
warn: 'uphy->phy_clk' from clk_prepare_enable() not released on lines:
58.
Fix this by calling proper clk_disable_unprepare calls.
Fixes: 0b56e9a7e835 ("phy: Group vendor specific phy drivers") Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20220914051334.69282-1-dzm91@hust.edu.cn Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
lpuart_dma_shutdown tears down lpuart dma, but lpuart_flush_buffer can
still occur which in turn tries to access dma apis if lpuart_dma_tx_use
flag is true. At this point since dma is torn down, these dma apis can
abort. Set lpuart_dma_tx_use and the corresponding rx flag
lpuart_dma_rx_use to false in lpuart_dma_shutdown so that dmas are not
accessed after they are relinquished.
Invoking TIOCVHANGUP on 8250_mid port on Ice Lake-D and then reopening
the port triggers these faults during serial8250_do_startup():
DMAR: DRHD: handling fault status reg 3
DMAR: [DMA Write NO_PASID] Request device [00:1a.0] fault addr 0x0 [fault reason 0x05] PTE Write access is not set
If the IRQ hasn't been set up yet, the UART will have zeroes in its MSI
address/data registers. Disabling the IRQ at the interrupt controller
won't stop the UART from performing a DMA write to the address programmed
in its MSI address register (zero) when it wants to signal an interrupt.
The UARTs (in Ice Lake-D) implement PCI 2.1 style MSI without masking
capability, so there is no way to mask the interrupt at the source PCI
function level, except disabling the MSI capability entirely, but that
would cause it to fall back to INTx# assertion, and the PCI specification
prohibits disabling the MSI capability as a way to mask a function's
interrupt service request.
The MSI address register is zeroed by the hangup as the irq is freed.
The interrupt is signalled during serial8250_do_startup() performing a
THRE test that temporarily toggles THRI in IER. The THRE test currently
occurs before UART's irq (and MSI address) is properly set up.
Refactor serial8250_do_startup() such that irq is set up before the
THRE test. The current irq setup code is intermixed with the timer
setup code. As THRE test must be performed prior to the timer setup,
extract it into own function and call it only after the THRE test.
The ->setup_timer() needs to be part of the struct uart_8250_ops in
order to not create circular dependency between 8250 and 8250_base
modules.
so there is some additional clean up required on these error paths.
Fixes: 6f0764b5adea ("usb: dwc3: add a power supply for current control") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/YyxFYFnP53j9sCg+@kili Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
In anx7411_typec_switch_probe(), we should call of_get_child_by_name()
instead of of_find_node_by_name() as of_find_xxx API will decrease the
refcount of the 'from' argument.
When opts->pnp_string is changed with configfs, new memory is allocated for
the string. It does not, however, update dev->pnp_string, even though the
memory is freed. When rquesting the string, the host then gets old or
corrupted data rather than the new string. The ieee 1284 id string should
be allowed to change while the device is connected.
The bug was introduced in commit fdc01cc286be ("usb: gadget: printer:
Remove pnp_string static buffer"), which changed opts->pnp_string from a
char[] to a char*.
This patch changes dev->pnp_string from a char* to a char** pointing to
opts->pnp_string.
commit 8b328f8002bc ("xhci: re-initialize the HC during resume if HCE was
set") introduced a new warning message when the host controller error
was set and re-initializing.
This is expected behavior on some designs which already set
`xhci->broken_suspend` so the new warning is alarming to some users.
Modify the code to only show the warning if this was a surprising behavior
to the XHCI driver.
Set 'iova' and 'length' on ib_mr in ib_uverbs and ib_core layers to let all
drivers have the members filled. Also, this commit removes redundancy in
the respective drivers.
Previously, commit 04c0a5fcfcf65 ("IB/uverbs: Set IOVA on IB MR in uverbs
layer") changed to set 'iova', but seems to have missed 'length' and the
ib_core layer at that time.
The responder should always use WC's SLID as the dlid, to follow the
IB SPEC section "13.5.4.2 COMMON RESPONSE ACTIONS":
A responder always takes the following actions in constructing a
response packet:
- The SLID of the received packet is used as the DLID in the response
packet.
A regression is seen where mddev devices stay permanently after they
are stopped due to an elevated reference count.
This was tracked down to an extra mddev_get() in md_seq_start().
It only happened rarely because most of the time the md_seq_start()
is called with a zero offset. The path with an extra mddev_get() only
happens when it starts with a non-zero offset.
The commit noted below changed an mddev_get() to check its success
but inadvertently left the original call in. Remove the extra call.
Fixes: 12a6caf27324 ("md: only delete entries from all_mddevs when the disk is freed") Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Guoqing Jiang <Guoqing.jiang@linux.dev> Signed-off-by: Song Liu <song@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The double free is caused by an unnecessary bio_put() in the
if(is_badblock(...)) error path in raid5_read_one_chunk().
The error path was moved ahead of bio_alloc_clone() in c82aa1b76787c
("md/raid5: move checking badblock before clone bio in
raid5_read_one_chunk"). The previous code checked and freed align_bio
which required a bio_put. After the move that is no longer needed as
raid_bio is returned to the control of the common io path which
performs its own endio resulting in a double free on bad device blocks.
Fixes: c82aa1b76787c ("md/raid5: move checking badblock before clone bio in raid5_read_one_chunk") Signed-off-by: David Sloan <david.sloan@eideticom.com> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Guoqing Jiang <Guoqing.jiang@linux.dev> Signed-off-by: Song Liu <song@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
When doing degrade/recover tests using the journal a kernel BUG
is hit at drivers/md/raid5.c:4381 in handle_parity_checks5():
BUG_ON(!test_bit(R5_UPTODATE, &dev->flags));
This was found to occur because handle_stripe_fill() was skipped
for stripes in the journal due to a condition in that function.
Thus blocks were not fetched and R5_UPTODATE was not set when
the code reached handle_parity_checks5().
To fix this, don't skip handle_stripe_fill() unless the stripe is
for read.
Fixes: 07e83364845e ("md/r5cache: shift complex rmw from read path to write path") Link: https://lore.kernel.org/linux-raid/e05c4239-41a9-d2f7-3cfa-4aa9d2cea8c1@deltatee.com/ Suggested-by: Song Liu <song@kernel.org> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Song Liu <song@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Current code produces a warning as shown below when total characters
in the constituent block device names plus the slashes exceeds 200.
snprintf() returns the number of characters generated from the given
input, which could cause the expression “200 – len” to wrap around
to a large positive number. Fix this by using scnprintf() instead,
which returns the actual number of characters written into the buffer.
If we have doubly sized SQEs, then we need to shift the sq index by 1
to account for using two entries for a single request. The CQE dumping
gets this right, but the SQE one does not.
Improve the SQE dumping in general, the information dumped is pretty
sparse and doesn't even cover the whole basic part of the SQE. Include
information on the extended part of the SQE, if doubly sized SQEs are
in use. A typical dump now looks like the following:
Guard wakeups that the user can trigger, and that may end up triggering a
call back into eventfd_signal. This is in addition to the current approach
that only guards in eventfd_signal.
Rename in_eventfd_signal -> in_eventfd at the same time to reflect this.
Without this there would be a deadlock in the following code using libaio:
int main()
{
struct io_context *ctx = NULL;
struct iocb iocb;
struct iocb *iocbs[] = { &iocb };
int evfd;
uint64_t val = 1;
The documentation of the blk_eh_timer_return enumeration values does not
reflect correctly how e.g. the SCSI core uses these values. Fix the
documentation.
Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Damien Le Moal <damien.lemoal@wdc.com> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Fixes: 88b0cfad2888 ("block: document the blk_eh_timer_return values") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Link: https://lore.kernel.org/r/20220920200626.3422296-1-bvanassche@acm.org Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
The meson_nfc_ecc_correct() function accidentally does a right shift
instead of a left shift so it only works for BIT(0). Also use
BIT_ULL() because "correct_bitmap" is a u64 and we want to avoid
shift wrapping bugs.
Fixes: 8fae856c5350 ("mtd: rawnand: meson: add support for Amlogic NAND flash controller") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Liang Yang <liang.yang@amlogic.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/YuI2zF1hP65+LE7r@kili Signed-off-by: Sasha Levin <sashal@kernel.org>
ACS-5 section
7.13.6.36 Word 78: Serial ATA features supported
states that:
If word 76 is not 0000h or FFFFh, word 78 reports the features supported
by the device. If this word is not supported, the word shall be cleared
to zero.
(This text also exists in really old ACS standards, e.g. ACS-3.)
The problem with ata_id_has_dipm() is that the while it performs a
check against 0 and 0xffff, it performs the check against
ATA_ID_FEATURE_SUPP (word 78), the same word where the feature bit
is stored.
Fix this by performing the check against ATA_ID_SATA_CAPABILITY
(word 76), like required by the spec. The feature bit check itself
is of course still performed against ATA_ID_FEATURE_SUPP (word 78).
Additionally, move the macro to the other ATA_ID_FEATURE_SUPP macros
(which already have this check), thus making it more likely that the
next ATA_ID_FEATURE_SUPP macro that is added will include this check.
Fixes: ca77329fb713 ("[libata] Link power management infrastructure") Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
ACS-5 section
7.13.6.36 Word 78: Serial ATA features supported
states that:
If word 76 is not 0000h or FFFFh, word 78 reports the features supported
by the device. If this word is not supported, the word shall be cleared
to zero.
(This text also exists in really old ACS standards, e.g. ACS-3.)
Additionally, move the macro to the other ATA_ID_FEATURE_SUPP macros
(which already have this check), thus making it more likely that the
next ATA_ID_FEATURE_SUPP macro that is added will include this check.
Fixes: 5b01e4b9efa0 ("libata: Implement NCQ autosense") Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
ACS-5 section
7.13.6.36 Word 78: Serial ATA features supported
states that:
If word 76 is not 0000h or FFFFh, word 78 reports the features supported
by the device. If this word is not supported, the word shall be cleared
to zero.
(This text also exists in really old ACS standards, e.g. ACS-3.)
Additionally, move the macro to the other ATA_ID_FEATURE_SUPP macros
(which already have this check), thus making it more likely that the
next ATA_ID_FEATURE_SUPP macro that is added will include this check.
Fixes: 65fe1f0f66a5 ("ahci: implement aggressive SATA device sleep support") Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
ACS-5 section
7.13.6.41 Words 85..87, 120: Commands and feature sets supported or enabled
states that:
If bit 15 of word 86 is set to one, bit 14 of word 119 is set to one,
and bit 15 of word 119 is cleared to zero, then word 119 is valid.
If bit 15 of word 86 is set to one, bit 14 of word 120 is set to one,
and bit 15 of word 120 is cleared to zero, then word 120 is valid.
(This text also exists in really old ACS standards, e.g. ACS-3.)
Currently, ata_id_sense_reporting_enabled() and
ata_id_has_sense_reporting() both check bit 15 of word 86,
but neither of them check that bit 14 of word 119 is set to one,
or that bit 15 of word 119 is cleared to zero.
Additionally, make ata_id_sense_reporting_enabled() return false
if !ata_id_has_sense_reporting(), similar to how e.g.
ata_id_flush_ext_enabled() returns false if !ata_id_has_flush_ext().
Fixes: e87fd28cf9a2 ("libata: Implement support for sense data reporting") Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Delay QP destroy completion until all siw references to QP are
dropped. The calling RDMA core will free QP structure after
successful return from siw_qp_destroy() call, so siw must not
hold any remaining reference to the QP upon return.
A use-after-free was encountered in xfstest generic/460, while
testing NFSoRDMA. Here, after a TCP connection drop by peer,
the triggered siw_cm_work_handler got delayed until after
QP destroy call, referencing a QP which has already freed.
Fixes: 303ae1cdfdf7 ("rdma/siw: application interface") Reported-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com> Link: https://lore.kernel.org/r/20220920082503.224189-1-bmt@zurich.ibm.com Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
For header and trailer/padding processing, siw did not consume new
skb data until minimum amount present to fill current header or trailer
structure, including potential payload padding. Not consuming any
data during upcall may cause a receive stall, since tcp_read_sock()
is not upcalling again if no new data arrive.
A NFSoRDMA client got stuck at RDMA Write reception of unaligned
payload, if the current skb did contain only the expected 3 padding
bytes, but not the 4 bytes CRC trailer. Expecting 4 more bytes already
arrived in another skb, and not consuming those 3 bytes in the current
upcall left the Write incomplete, waiting for the CRC forever.
Fixes: 8b6a361b8c48 ("rdma/siw: receive path") Reported-by: Olga Kornievskaia <kolga@netapp.com> Tested-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com> Link: https://lore.kernel.org/r/20220920081202.223629-1-bmt@zurich.ibm.com Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently ib_copy_from_udata and ib_copy_to_udata could underfill
the request and response buffer if the user-space passes an undersized
value for udata->inlen or udata->outlen respectively [1]
This could lead to undesirable behavior.
Zero initing the buffer only goes as far as preventing using the buffer
uninitialized.
Validate udata->inlen and udata->outlen passed from user-space to ensure
they are at least the required minimum size.
A number of asynchronous event (AE) ids were not aligned to the
correct flush_code and event_type. Fix these up so that the
correct IBV error and event codes are returned to application.
Also, add handling for new AE ids like IRDMA_AE_INVALID_REQUEST to
return the correct WC error code.
Commit f6424c22aa36 ("mtd: rawnand: fsl_elbc: Make SW ECC work") added
support for specifying ECC mode via DTS and skipping autodetection.
But it broke explicit specification of HW ECC mode in DTS as correct
settings for HW ECC mode are applied only when NONE mode or nothing was
specified in DTS file.
Also it started aliasing NONE mode to be same as when ECC mode was not
specified and disallowed usage of ON_DIE mode.
Fix all these issues. Use autodetection of ECC mode only in case when mode
was really not specified in DTS file by checking that ecc value is invalid.
Set HW ECC settings either when HW ECC was specified in DTS or it was
autodetected. And do not fail when ON_DIE mode is set.
The "intel,nand-controller" compatible string is not part of the
dt-bindings. Remove it from the driver as it's not supposed to be used
without any documentation for it.
As the of_get_parent() will increase the refcount of the node->parent
and the reference will be discarded, so we should hold the reference
with which we can decrease the refcount when done.
Fixes: 8eff8b4e22d9 ("phy: amlogic: phy-meson-axg-mipi-pcie-analog: add support for MIPI DSI analog") Signed-off-by: Liang He <windhl@126.com> Link: https://lore.kernel.org/r/20220915093506.4009456-1-windhl@126.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The struct_size() macro protects against integer overflows but adding
"+ rsc->config_len" introduces the risk of integer overflows again.
Use size_add() to be safe.
Fixes: c87846571587 ("remoteproc: use struct_size() helper") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Mukesh Ojha <quic_mojha@quicinc.com> Link: https://lore.kernel.org/r/YyMyoPoGOJUcEpZT@kili Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
10. lpfc_init.c:13573 lpfc_sli4_hba_unset()
warn: variable dereferenced before check 'phba->pport' (see
line 13546)
11. lpfc_auth.c:1923 lpfc_auth_handle_dhchap_reply()
error: double free of 'hash_value'
Fixes:
1. Initialize vlan_id to LPFC_FCOE_NULL_VID.
2. Initialize vlan_id to LPFC_FCOE_NULL_VID.
3. prg->dist is a 2 bit field. Its value can only be between 0-3.
Remove redundent check 'if (prg->dist < 4)'.
4. Fix inconsistent indenting. Moved logic into helper function
lpfc_fill_vpd().
5. Define 'eqhdl->irq' as int value as pci_irq_vector() returns int.
Also, check for return value of pci_irq_vector() and log message in
case of failure.
6. Initialize 'ext_cnt' to 0.
7. Initialize 'ext_size' to 0.
8. Use alloc_percpu_gfp() with GFP_ATOMIC flag.
9. 'rc' was not updated when dma_pool_create() fails. Update 'rc =
-ENOMEM' when dma_pool_create() fails before calling goto statement.
10. Add check for 'phba->pport' in lpfc_cpuhp_remove().
11. Initialize 'hash_value' to NULL, same like 'aug_chal' variable.
Link: https://lore.kernel.org/r/20220911221505.117655-13-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
This exported fn is unused, and will not be needed. Lets dump it.
The export was added to let drm control pr_debugs, as part of using
them to avoid drm_debug_enabled overheads. But its better to just
implement the drm.debug bitmap interface, then its available for
everyone.
Fixes: a2d375eda771 ("dyndbg: refine export, rename to dynamic_debug_exec_queries()") Fixes: 4c0d77828d4f ("dyndbg: export ddebug_exec_queries") Acked-by: Jason Baron <jbaron@akamai.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Link: https://lore.kernel.org/r/20220904214134.408619-10-jim.cromie@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
dyndbg's control-parser: ddebug_parse_query(), requires that search
terms: module, func, file, lineno, are used only once in a query; a
thing cannot be named both foo and bar.
The cited commit added an overriding module modname, taken from the
module loader, which is authoritative. So it set query.module 1st,
which disallowed its use in the query-string.
But now, its useful to allow a module-load to enable classes across a
whole (or part of) a subsystem at once.
For CONFIG_DYNAMIC_DEBUG=N, the ddebug_dyndbg_module_param_cb()
stub-fn is too permissive:
bash-5.1# modprobe drm JUNKdyndbg
bash-5.1# modprobe drm dyndbgJUNK
[ 42.933220] dyndbg param is supported only in CONFIG_DYNAMIC_DEBUG builds
[ 42.937484] ACPI: bus type drm_connector registered
This caused no ill effects, because unknown parameters are either
ignored by default with an "unknown parameter" warning, or ignored
because dyndbg allows its no-effect use on non-dyndbg builds.
But since the code has an explicit feedback message, it should be
issued accurately. Fix with strcmp for exact param-name match.
Fixes: b48420c1d301 dynamic_debug: make dynamic-debug work for module initialization Reported-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Acked-by: Jason Baron <jbaron@akamai.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Link: https://lore.kernel.org/r/20220904214134.408619-3-jim.cromie@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
In https://lore.kernel.org/lkml/20211209150910.GA23668@axis.com/
Vincent's patch commented on, and worked around, a bug toggling
static_branch's, when a 2nd PRINTK-ish flag was added. The bug
results in a premature static_branch_disable when the 1st of 2 flags
was disabled.
The cited commit computed newflags, but then in the JUMP_LABEL block,
failed to use that result, instead using just one of the terms in it.
Using newflags instead made the code work properly.
This is Vincents test-case, reduced. It needs the 2nd flag to
demonstrate the bug, but it's explanatory here.
This from static analysis. The vla_item() takes a size and adds it to
the total. It has a built in integer overflow check so if it encounters
an integer overflow anywhere then it records the total as SIZE_MAX.
However there is an issue here because the "lang_count*(needed_count+1)"
multiplication can overflow. Technically the "lang_count + 1" addition
could overflow too, but that would be detected and is harmless. Fix
both using the new size_add() and size_mul() functions.
If an IIO driver uses callbacks from another IIO driver and calls
iio_channel_start_all_cb() from one of its buffer setup ops, then
lockdep complains due to the lock nesting, as in the below example with
lmp91000.
Since the locks are being taken on different IIO devices, there is no
actual deadlock. Fix the warning by telling lockdep to use a different
class for each iio_device.
============================================
WARNING: possible recursive locking detected
--------------------------------------------
python3/23 is trying to acquire lock:
(&indio_dev->mlock){+.+.}-{3:3}, at: iio_update_buffers
but task is already holding lock:
(&indio_dev->mlock){+.+.}-{3:3}, at: enable_store
other info that might help us debug this:
Possible unsafe locking scenario:
When we get a DMA channel and try to use it in multiple threads it
will cause oops and hanging the system.
% echo 100 > /sys/module/dmatest/parameters/threads_per_chan
% echo 100 > /sys/module/dmatest/parameters/iterations
% echo 1 > /sys/module/dmatest/parameters/run
[383493.327077] Unable to handle kernel paging request at virtual
address dead000000000108
[383493.335103] Mem abort info:
[383493.335103] ESR = 0x96000044
[383493.335105] EC = 0x25: DABT (current EL), IL = 32 bits
[383493.335107] SET = 0, FnV = 0
[383493.335108] EA = 0, S1PTW = 0
[383493.335109] FSC = 0x04: level 0 translation fault
[383493.335110] Data abort info:
[383493.335111] ISV = 0, ISS = 0x00000044
[383493.364739] CM = 0, WnR = 1
[383493.367793] [dead000000000108] address between user and kernel
address ranges
[383493.375021] Internal error: Oops: 96000044 [#1] PREEMPT SMP
[383493.437574] CPU: 63 PID: 27895 Comm: dma0chan0-copy2 Kdump:
loaded Tainted: GO 5.17.0-rc4+ #2
[383493.457851] pstate: 204000c9 (nzCv daIF +PAN -UAO -TCO -DIT
-SSBS BTYPE=--)
[383493.465331] pc : vchan_tx_submit+0x64/0xa0
[383493.469957] lr : vchan_tx_submit+0x34/0xa0
This occurs because the transmission timed out, and that's due
to data race. Each thread rewrite channels's descriptor as soon as
device_issue_pending is called. It leads to the situation that
the driver thinks that it uses the right descriptor in interrupt
handler while channels's descriptor has been changed by other
thread. The descriptor which in fact reported interrupt will not
be handled any more, as well as its tx->callback.
That's why timeout reports.
With current fixes channels' descriptor changes it's value only
when it has been used. A new descriptor is acquired from
vc->desc_issued queue that is already filled with descriptors
that are ready to be sent. Threads have no direct access to DMA
channel descriptor. In case of channel's descriptor is busy, try
to submit to HW again when a descriptor is completed. In this case,
vc->desc_issued may be empty when hisi_dma_start_transfer is called,
so delete error reporting on this. Now it is just possible to queue
a descriptor for further processing.
Fixes: e9f08b65250d ("dmaengine: hisilicon: Add Kunpeng DMA engine support") Signed-off-by: Jie Hai <haijie1@huawei.com> Acked-by: Zhou Wang <wangzhou1@hisilicon.com> Link: https://lore.kernel.org/r/20220830062251.52993-4-haijie1@huawei.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>