When connecting a controller with a zero kato value using the following
command line
nvme connect -t tcp -n NQN -a ADDR -s PORT --keep-alive-tmo=0
the warning below can be reproduced:
WARNING: CPU: 1 PID: 241 at kernel/workqueue.c:1627 __queue_delayed_work+0x6d/0x90
with trace:
mod_delayed_work_on+0x59/0x90
nvmet_update_cc+0xee/0x100 [nvmet]
nvmet_execute_prop_set+0x72/0x80 [nvmet]
nvmet_tcp_try_recv_pdu+0x2f7/0x770 [nvmet_tcp]
nvmet_tcp_io_work+0x63f/0xb2d [nvmet_tcp]
...
This is caused by queuing up an uninitialized work. Althrough the
keep-alive timer is disabled during allocating the controller (fixed in 0d3b6a8d213a), ka_work still has a chance to run (called by
nvmet_start_ctrl).
Fixes: 0d3b6a8d213a ("nvmet: Disable keep-alive timer when kato is cleared to 0h") Signed-off-by: zhenwei pi <pizhenwei@bytedance.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
When an UE or memory error exception is encountered the MCE handler
tries to find the pfn using addr_to_pfn() which takes effective
address as an argument, later pfn is used to poison the page where
memory error occurred, recent rework in this area made addr_to_pfn
to run in real mode, which can be fatal as it may try to access
memory outside RMO region.
Have two helper functions to separate things to be done in real mode
and virtual mode without changing any functionality. This also fixes
the following error as the use of addr_to_pfn is now moved to virtual
mode.
Without this change following kernel crash is seen on hitting UE.
Every dump reported by OPAL is exported to userspace through a sysfs
interface and notified using kobject_uevent(). The userspace daemon
(opal_errd) then reads the dump and acknowledges that the dump is
saved safely to disk. Once acknowledged the kernel removes the
respective sysfs file entry causing respective resources to be
released including kobject.
However it's possible the userspace daemon may already be scanning
dump entries when a new sysfs dump entry is created by the kernel.
User daemon may read this new entry and ack it even before kernel can
notify userspace about it through kobject_uevent() call. If that
happens then we have a potential race between
dump_ack_store->kobject_put() and kobject_uevent which can lead to
use-after-free of a kernfs object resulting in a kernel crash.
This patch fixes this race by protecting the sysfs file
creation/notification by holding a reference count on kobject until we
safely send kobject_uevent().
The function create_dump_obj() returns the dump object which if used
by caller function will end up in use-after-free problem again.
However, the return value of create_dump_obj() function isn't being
used today and there is no need as well. Hence change it to return
void to make this fix complete.
There is an off-by-one array check that can lead to a out-of-bounds
write to devices->info[i]. Fix this by checking by using >= rather
than > for the size check. Also replace hard-coded array size limit
with ARRAY_SIZE on the array.
Addresses-Coverity: ("Out-of-bounds write") Fixes: cd9e9808d18f ("lightnvm: Support for Open-Channel SSDs") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
However, the driver from the 3.10 vendor kernel does not use the
following four interrupt lines:
- INT_MALI_PP3
- INT_MALI_PP3_MMU
- INT_MALI_PP7
- INT_MALI_PP7_MMU
Drop the "pp3" and "ppmmu3" interrupt lines. This is also important
because there is no matching entry in interrupt-names for it (meaning
the "pp2" interrupt is actually assigned to the "pp3" interrupt line).
Fixes: 7d3f6b536e72c9 ("ARM: dts: meson8: add the Mali-450 MP6 GPU") Reported-by: Thomas Graichen <thomas.graichen@gmail.com> Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Kevin Hilman <khilman@baylibre.com> Tested-by: thomas graichen <thomas.graichen@gmail.com> Reviewed-by: Neil Armstrong <narmstrong@baylibre.com> Link: https://lore.kernel.org/r/20200815181957.408649-1-martin.blumenstingl@googlemail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
DT binding permits only one compatible string which was decribed in past by
commit 63cab195bf49 ("i2c: removed work arounds in i2c driver for Zynq
Ultrascale+ MPSoC").
The commit aea37006e183 ("dt-bindings: i2c: cadence: Migrate i2c-cadence
documentation to YAML") has converted binding to yaml and the following
issues is reported:
...: i2c@ff030000: compatible: Additional items are not allowed
('cdns,i2c-r1p10' was unexpected)
From schema:
.../Documentation/devicetree/bindings/i2c/cdns,i2c-r1p10.yaml fds
...: i2c@ff030000: compatible: ['cdns,i2c-r1p14', 'cdns,i2c-r1p10'] is too
long
The commit c415f9e8304a ("ARM64: zynqmp: Fix i2c node's compatible string")
has added the second compatible string but without removing origin one.
The patch is only keeping one compatible string "cdns,i2c-r1p14".
No need to clear event again since event always clear before wait.
This fix depend on patch:
"soc: mediatek: cmdq: add clear option in cmdq_pkt_wfe api"
Fixes: 2f965be7f9008 ("drm/mediatek: apply CMDQ control flow") Signed-off-by: Dennis YC Hsieh <dennis-yc.hsieh@mediatek.com> Reviewed-by: Bibby Hsieh <bibby.hsieh@mediatek.com> Acked-by: Chun-Kuang Hu <chunkuang.hu@kernel.org> Link: https://lore.kernel.org/r/1594136714-11650-10-git-send-email-dennis-yc.hsieh@mediatek.com Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
As per the iWave RZ/G1M schematic, the signal LVDS_PPEN controls the
supply voltage for the touch panel, LVDS receiver and RGB LCD panel. Add
a regulator for these device nodes and remove the powerdown-gpios
property from the lvds-receiver node as it results in a touch controller
driver probe failure.
On the production revision of the SoM, 587-200, the PHY reset GPIO and
touchscreen IRQs are swapped to prevent collision between EXTi IRQs,
reflect that in DT.
Fixes: 34e0c7847dcf ("ARM: dts: stm32: Add DH Electronics DHCOM STM32MP1 SoM and PDK2 board") Signed-off-by: Marek Vasut <marex@denx.de> Cc: Alexandre Torgue <alexandre.torgue@st.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Patrice Chotard <patrice.chotard@st.com> Cc: Patrick Delaunay <patrick.delaunay@st.com> Cc: linux-stm32@st-md-mailman.stormreply.com
To: linux-arm-kernel@lists.infradead.org Signed-off-by: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The KSZ9031 PHY skew timings for rxc/txc, originally set to achieve
the desired phase shift between clock- and data-signal, now trigger a
kernel warning when used in rgmii-id mode:
*-skew-ps values should be used only with phy-mode = "rgmii"
This is because commit bcf3440c6dd7 ("net: phy: micrel: add phy-mode
support for the KSZ9031 PHY") now configures own timings when
phy-mode = "rgmii-id". Device trees wanting to set their own delays
should use phy-mode "rgmii" instead as the warning prescribes.
The "standard" timings now used with "rgmii-id" work fine on this
board, so drop the explicit timings in the device tree and thereby
silence the warning.
The AV96 uses sdmmc2_d47_pins_c and sdmmc2_d47_sleep_pins_c, which
differ from sdmmc2_d47_pins_b and sdmmc2_d47_sleep_pins_b in one
pin, SDMMC2_D5, which is PA15 in the former and PA9 in the later.
The PA15 is correct on AV96, so fix this. This error is likely a
result of rebasing across the stm32mp1 DT pinctrl rework.
Fixes: 611325f68102 ("ARM: dts: stm32: Add eMMC attached to SDMMC2 on AV96") Signed-off-by: Marek Vasut <marex@denx.de> Cc: Alexandre Torgue <alexandre.torgue@st.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Patrice Chotard <patrice.chotard@st.com> Cc: Patrick Delaunay <patrick.delaunay@st.com> Cc: linux-stm32@st-md-mailman.stormreply.com
To: linux-arm-kernel@lists.infradead.org Signed-off-by: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
If cpu_cluster_pm_enter() fails, we need to set MPU power domain back
to enabled to prevent the next WFI from potentially triggering an
undesired MPU power domain state change.
We already do this for omap_enter_idle_smp() but are missing it for
omap_enter_idle_coupled().
Fixes: 55be2f50336f ("ARM: OMAP2+: Handle errors for cpu_pm") Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
On error the function was meant to return -ERRNO. This also fixes
compile warning:
drivers/soc/fsl/qbman/bman.c:640:6: warning: variable 'err' set but not used [-Wunused-but-set-variable]
Fixes: 0505d00c8dba ("soc/fsl/qbman: Cleanup buffer pools if BMan was initialized prior to bootup") Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Li Yang <leoyang.li@nxp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
After commit 7cdf8446ed1d ("arm64: dts: actions: Add pinctrl node for
Actions Semi S700") following error has been observed while booting
Linux on Cubieboard7-lite(based on S700 SoC).
[ 0.257415] pinctrl-s700 e01b0000.pinctrl: can't request region for
resource [mem 0xe01b0000-0xe01b0fff]
[ 0.266902] pinctrl-s700: probe of e01b0000.pinctrl failed with error -16
This is due to the fact that memory range for "sps" power domain controller
clashes with pinctrl.
One way to fix it, is to limit pinctrl address range which is safe
to do as current pinctrl driver uses address range only up to 0x100.
This commit limits the pinctrl address range to 0x100 so that it doesn't
conflict with sps range.
When adding allwinner,sun8i-a33-crypto, I forgot to add that it needs reset.
Furthermore, there are no need to use items to list only one compatible
in compatible list.
Fixes: f81547ba7a98 ("dt-bindings: crypto: add new compatible for A33 SS") Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://lore.kernel.org/r/20200907175437.4464-1-clabbe.montjoie@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
The mdss node sets #interrupt-cells = <1>, so its interrupts
should be referenced using a single cell (in this case: only the
interrupt number).
However, right now the mdp/dsi node both have two interrupt cells
set, e.g. interrupts = <4 0>. The 0 is probably meant to say
IRQ_TYPE_NONE (= 0), but with #interrupt-cells = <1> this is
actually interpreted as a second interrupt line.
Remove the IRQ flags from both interrupts to fix this.
Tha parent node of "wcd_codec" specifies #address-cells = <1>
and #size-cells = <0>, which means that each resource should be
described by one cell for the address and size omitted.
However, wcd_codec currently lists 0x200 as second cell (probably
the size of the resource). When parsing this would be treated like
another memory resource - which is entirely wrong.
To quote the device tree specification [1]:
"If the parent node specifies a value of 0 for #size-cells,
the length field in the value of reg shall be omitted."
[1]: https://www.devicetree.org/specifications/
Fixes: 5582fcb3829f ("arm64: dts: apq8016-sbc: add analog audio support with multicodec") Signed-off-by: Stephan Gerhold <stephan@gerhold.net> Link: https://lore.kernel.org/r/20200915071221.72895-4-stephan@gerhold.net Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit fe2aff0c574d2 ("arm64: dts: qcom: msm8916: remove unit name for thermal trip points")
removed the unit names for most of the thermal trip points defined
in msm8916.dtsi, but missed to update the one for cpu0_1-thermal.
So why wasn't this spotted by "make dtbs_check"? Apparently, the name
of the thermal zone is already invalid: thermal-zones.yaml specifies
a regex of ^[a-zA-Z][a-zA-Z0-9\\-]{1,12}-thermal$, so it is not allowed
to contain underscores. Therefore the thermal zone was never verified
using the DTB schema.
After replacing the underscore in the thermal zone name, the warning
shows up:
apq8016-sbc.dt.yaml: thermal-zones: cpu0-1-thermal:trips: 'trip-point@0'
does not match any of the regexes: '^[a-zA-Z][a-zA-Z0-9\\-_]{0,63}$', 'pinctrl-[0-9]+'
Fix up the thermal zone names and remove the unit name for the trip point.
Cc: Amit Kucheria <amit.kucheria@linaro.org> Fixes: fe2aff0c574d2 ("arm64: dts: qcom: msm8916: remove unit name for thermal trip points") Signed-off-by: Stephan Gerhold <stephan@gerhold.net> Link: https://lore.kernel.org/r/20200915071221.72895-3-stephan@gerhold.net Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The array type of get_domain_list_resp is incorrectly marked as NO_ARRAY.
Due to which the following error was observed when using pdr helpers with
the downstream proprietary pd-mapper. Fix this up by marking it as
VAR_LEN_ARRAY instead.
Err logs:
qmi_decode_struct_elem: Fault in decoding: dl(2), db(27), tl(160), i(1), el(1)
failed to decode incoming message
PDR: tms/servreg get domain list txn wait failed: -14
PDR: service lookup for tms/servreg failed: -14
The number of interrupt cells for the mdss interrupt controller is 1,
meaning there should only be one cell for the interrupt number, not two
where the second cell is the irq flags. Drop the second cell to match
the binding.
The i.MX General Power Controller v2 device node was missing interrupts
property necessary to route its interrupt to GIC. This also fixes the
dbts_check warnings like:
arch/arm64/boot/dts/freescale/imx8mq-evk.dt.yaml: gpc@303a0000:
{'compatible': ... '$nodename': ['gpc@303a0000']} is not valid under any of the given schemas
arch/arm64/boot/dts/freescale/imx8mq-evk.dt.yaml: gpc@303a0000: 'interrupts' is a required property
Fixes: fdbcc04da246 ("arm64: dts: imx8mq: add GPC power domains") Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
scmi_mailbox is obtained from cinfo->transport_info and the first
call to mailbox_chan_free frees the channel and sets cinfo->transport_info
to NULL. Care is taken to check for non NULL smbox->chan but smbox can
itself be NULL. Fix it by checking for it without which, kernel crashes
with below NULL pointer dereference and eventually kernel panic.
Unable to handle kernel NULL pointer dereference at
virtual address 0000000000000038
Modules linked in: scmi_module(-)
Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno
Development Platform, BIOS EDK II Sep 2 2020
pstate: 80000005 (Nzcv daif -PAN -UAO BTYPE=--)
pc : mailbox_chan_free+0x2c/0x70 [scmi_module]
lr : idr_for_each+0x6c/0xf8
Call trace:
mailbox_chan_free+0x2c/0x70 [scmi_module]
idr_for_each+0x6c/0xf8
scmi_remove+0xa8/0xf0 [scmi_module]
platform_drv_remove+0x34/0x58
device_release_driver_internal+0x118/0x1f0
driver_detach+0x58/0xe8
bus_remove_driver+0x64/0xe0
driver_unregister+0x38/0x68
platform_driver_unregister+0x1c/0x28
scmi_driver_exit+0x38/0x44 [scmi_module]
---[ end trace 17bde19f50436de9 ]---
Kernel panic - not syncing: Fatal exception
SMP: stopping secondary CPUs
Kernel Offset: 0x1d0000 from 0xffff800010000000
PHYS_OFFSET: 0x80000000
CPU features: 0x0240022,25806004
Memory Limit: none
---[ end Kernel panic - not syncing: Fatal exception ]---
There is one LLCC logical bank(LLCC0) on SC7180 SoC and the
size of the LLCC0 base is 0x50000(320KB) not 2MB, so correct
the size and fix copy paste mistake carried over from SDM845.
Reviewed-by: Douglas Anderson <dianders@chromium.org> Fixes: 7cee5c742899 ("arm64: dts: qcom: sc7180: Fix node order") Fixes: c831fa299996 ("arm64: dts: qcom: sc7180: Add Last level cache controller node") Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> Link: https://lore.kernel.org/r/20200818145514.16262-1-saiprakash.ranjan@codeaurora.org Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
H5's Mali GPU PMU is not present or working corretly although
H5 datasheet record its interrupt vector.
Adding this module will miss lead lima driver try to shutdown
it and get waiting timeout. This problem is not exposed before
lima runtime PM support is added.
DCDC1 regulator powers many different subsystems. While some of them can
work at 3.0 V, some of them can not. For example, VCC-HDMI can only work
between 3.24 V and 3.36 V. According to OS images provided by the board
manufacturer this regulator should be set to 3.3 V.
Set DCDC1 and DCDC1SW to 3.3 V in order to fix this.
Fixes: da7ac948fa93 ("ARM: dts: sun8i: Add board dts file for Banana Pi M2 Ultra") Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://lore.kernel.org/r/20200824193649.978197-1-jernej.skrabec@siol.net Signed-off-by: Sasha Levin <sashal@kernel.org>
Similar to 7980d2eabde8 ("ipvs: clear skb->tstamp in forwarding path").
fq qdisc requires tstamp to be cleared in forwarding path.
Fixes: 8203e2d844d3 ("net: clear skb->tstamp in forwarding paths") Fixes: fb420d5d91c1 ("tcp/fq: move back to CLOCK_MONOTONIC") Fixes: 80b14dee2bea ("net: Add a new socket option for a future transmit time.") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Since commit bbc4d71d63549bc ("net: phy: realtek: fix rtl8211e rx/tx
delay config"), the Realtek PHY driver will override any TX/RX delay
set by hardware straps if the phy-mode device property does not match.
This is causing problems on SynQuacer based platforms (the only SoC
that incorporates the netsec hardware), since many were built with
this Realtek PHY, and shipped with firmware that defines the phy-mode
as 'rgmii', even though the PHY is configured for TX and RX delay using
pull-ups.
From the driver's perspective, we should not make any assumptions in
the general case that the PHY hardware does not require any initial
configuration. However, the situation is slightly different for ACPI
boot, since it implies rich firmware with AML abstractions to handle
hardware details that are not exposed to the OS. So in the ACPI case,
it is reasonable to assume that the PHY comes up in the right mode,
regardless of whether the mode is set by straps, by boot time firmware
or by AML executed by the ACPI interpreter.
So let's ignore the 'phy-mode' device property when probing the netsec
driver in ACPI mode, and hardcode the mode to PHY_INTERFACE_MODE_NA,
which should work with any PHY provided that it is configured by the
time the driver attaches to it. While at it, document that omitting
the mode is permitted for DT probing as well, by setting the phy-mode
DT property to the empty string.
Fixes an error causing small packets to get dropped. skb_ensure_writable
expects the second parameter to be a length in the ethernet payload.=20
If we want to write the ethernet header (src, dst), we should pass 0.
Otherwise, packets with small payloads (< ETH_ALEN) will get dropped.
If the first packet conntrack sees after a re-register is an outgoing
keepalive packet with no data (SEG.SEQ = SND.NXT-1), td_end is set to
SND.NXT-1.
When the peer correctly acknowledges SND.NXT, tcp_in_window fails
check III (Upper bound for valid (s)ack: sack <= receiver.td_end) and
returns false, which cascades into nf_conntrack_in setting
skb->_nfct = 0 and in later conntrack iptables rules not matching.
In cases where iptables are dropping packets that do not match
conntrack rules this can result in idle tcp connections to time out.
v2: adjust td_end when getting the reply rather than when sending out
the keepalive packet.
Fixes: f94e63801ab2 ("netfilter: conntrack: reset tcp maxwin on re-register") Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
On arm64, the global variable memstart_addr represents the physical
address of PAGE_OFFSET, and so physical to virtual translations or
vice versa used to come down to simple additions or subtractions
involving the values of PAGE_OFFSET and memstart_addr.
When support for 52-bit virtual addressing was introduced, we had to
deal with PAGE_OFFSET potentially being outside of the region that
can be covered by the virtual range (as the 52-bit VA capable build
needs to be able to run on systems that are only 48-bit VA capable),
and for this reason, another translation was introduced, and recorded
in the global variable physvirt_offset.
However, if we go back to the original definition of memstart_addr,
i.e., the physical address of PAGE_OFFSET, it turns out that there is
no need for two separate translations: instead, we can simply subtract
the size of the unaddressable VA space from memstart_addr to make the
available physical memory appear in the 48-bit addressable VA region.
This simplifies things, but also fixes a bug on KASLR builds, which
may update memstart_addr later on in arm64_memblock_init(), but fails
to update vmemmap and physvirt_offset accordingly.
Per Intel's SDM, RDPID takes a #UD if it is unsupported, which is more or
less what KVM is emulating when MSR_TSC_AUX is not available. In fact,
there are no scenarios in which RDPID is supposed to #GP.
Fixes: fb6d4d340e ("KVM: x86: emulate RDPID") Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Message-Id: <1598581422-76264-1-git-send-email-robert.hu@linux.intel.com> Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
By default, the lightbar commands are set to the biggest lightbar command
and response. That length is greater than 128 bytes and may not work on
all machines. But all EC are probed for lightbar by sending a get version
request. Set that request size precisely.
When the passed token is longer than 4032 bytes, the remaining part
of the token must be copied from the rqstp->rq_arg.pages. But the
copy must make sure it happens in a consecutive way.
With the existing code, the first memcpy copies 'length' bytes from
argv->iobase, but since the header is in front, this never fills the
whole first page of in_token->pages.
The mecpy in the loop copies the following bytes, but starts writing at
the next page of in_token->pages. This leaves the last bytes of page 0
unwritten.
Symptoms were that users with many groups were not able to access NFS
exports, when using Active Directory as the KDC.
Signed-off-by: Martijn de Gouw <martijn.de.gouw@prodrive-technologies.com> Fixes: 5866efa8cbfb "SUNRPC: Fix svcauth_gss_proxy_init()" Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
pfn is not added to pfn_list when vfio_add_to_pfn_list fails.
vfio_unpin_page_external will exit directly without calling
vfio_iova_put_vfio_pfn. This will lead to a memory leak.
Fixes: a54eb55045ae ("vfio iommu type1: Add support for mediated devices") Signed-off-by: Xiaoyang Xu <xuxiaoyang2@huawei.com>
[aw: simplified logic, add Fixes] Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The eventfd context is used as our irqbypass token, therefore if an
eventfd is re-used, our token is the same. The irqbypass code will
return an -EBUSY in this case, but we'll still attempt to unregister
the producer, where if that duplicate token still exists, results in
removing the wrong object. Clear the token of failed producers so
that they harmlessly fall out when unregistered.
If userspace asked fsmap to try to count the number of entries, we cannot
return more than UINT_MAX entries because fmh_entries is u32.
Therefore, stop counting if we hit this limit or else we will waste time
to return truncated results.
Fixes: 0c9ec4beecac ("ext4: support GETFSMAP ioctls") Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Link: https://lore.kernel.org/r/20201001222148.GA49520@magnolia Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
inline_data is mutually exclusive to DAX so enabling both of them triggers
the following issue:
------------------------------------------
# mkfs.ext4 -F -O inline_data /dev/pmem1
...
# mount /dev/pmem1 /mnt
# echo 'test' >/mnt/file
# lsattr -l /mnt/file
/mnt/file Inline_Data
# xfs_io -c "chattr +x" /mnt/file
# xfs_io -c "lsattr -v" /mnt/file
[dax] /mnt/file
# umount /mnt
# mount /dev/pmem1 /mnt
# cat /mnt/file
cat: /mnt/file: Numerical result out of range
------------------------------------------
Fixes: b383a73f2b83 ("fs/ext4: Introduce DAX inode flag") Signed-off-by: Xiao Yang <yangx.jy@cn.fujitsu.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/20200828084330.15776-1-yangx.jy@cn.fujitsu.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
ext4_mb_discard_group_preallocations() can be releasing group lock with
preallocations accumulated on its local list. Thus although
discard_pa_seq was incremented and concurrent allocating processes will
be retrying allocations, it can happen that premature ENOSPC error is
returned because blocks used for preallocations are not available for
reuse yet. Make sure we always free locally accumulated preallocations
before releasing group lock.
Fixes: 07b5b8e1ac40 ("ext4: mballoc: introduce pcpu seqcnt for freeing PA to improve ENOSPC handling") Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20200924150959.4335-1-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
As we test disk offline/online with running fsstress, we find fsstress
process is keeping running state.
kworker/u32:3-262 [004] ...1 140.787471: ext4_mb_discard_preallocations: dev 8,32 needed 114
....
kworker/u32:3-262 [004] ...1 140.787471: ext4_mb_discard_preallocations: dev 8,32 needed 114
As we see seq_retry is sum of discard_pa_seq every cpu, if
ext4_mb_discard_group_preallocations return zero discard_pa_seq in this
cpu maybe increase one, so condition "seq_retry != *seq" have always
been met.
Ritesh Harjani suggest to in ext4_mb_discard_group_preallocations function we
only increase discard_pa_seq when there is some PA to free.
Fixes: 07b5b8e1ac40 ("ext4: mballoc: introduce pcpu seqcnt for freeing PA to improve ENOSPC handling") Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com> Link: https://lore.kernel.org/r/20200916113859.1556397-3-yebin10@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
Since commit 269a535ca931 ("modpost: generate vmlinux.symvers and
reuse it for the second modpost"), with CONFIG_MODULES disabled,
"make deb-pkg" (or "make bindeb-pkg") fails with:
find: ‘Module.symvers’: No such file or directory
If CONFIG_MODULES is disabled, it doesn't really make sense to build
the linux-headers package.
Fixes: 269a535ca931 ("modpost: generate vmlinux.symvers and reuse it for the second modpost") Reported-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
In the implementation of bcm2835_register_pll(), the allocated pll is
leaked if devm_clk_hw_register() fails to register hw. Release pll if
devm_clk_hw_register() fails.
Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com> Link: https://lore.kernel.org/r/20200809231202.15811-1-navid.emamdoost@gmail.com Fixes: 41691b8862e2 ("clk: bcm2835: Add support for programming the audio domain clocks") Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
SAMA5D2 datasheet specifies on chapter 33.22.8 (PMC Clock Generator
Main Oscillator Register) that writing any value other than
0x37 on KEY field aborts the write operation. Use the key when
selecting main clock parent.
Corentin hit the following workqueue warning when running with
CRYPTO_MANAGER_EXTRA_TESTS:
WARNING: CPU: 2 PID: 147 at kernel/workqueue.c:1473 __queue_work+0x3b8/0x3d0
Modules linked in: ghash_generic
CPU: 2 PID: 147 Comm: modprobe Not tainted 5.6.0-rc1-next-20200214-00068-g166c9264f0b1-dirty #545
Hardware name: Pine H64 model A (DT)
pc : __queue_work+0x3b8/0x3d0
Call trace:
__queue_work+0x3b8/0x3d0
queue_work_on+0x6c/0x90
do_init_module+0x188/0x1f0
load_module+0x1d00/0x22b0
I wasn't able to reproduce on x86 or rpi 3b+.
This is
WARN_ON(!list_empty(&work->entry))
from __queue_work(), and it happens because the init_free_wq work item
isn't initialized in time for a crypto test that requests the gcm
module. Some crypto tests were recently moved earlier in boot as
explained in commit c4741b230597 ("crypto: run initcalls for generic
implementations earlier"), which went into mainline less than two weeks
before the Fixes commit.
Avoid the warning by statically initializing init_free_wq and the
corresponding llist.
Link: https://lore.kernel.org/lkml/20200217204803.GA13479@Red/ Fixes: 1a7b7d922081 ("modules: Use vmalloc special flag") Reported-by: Corentin Labbe <clabbe.montjoie@gmail.com> Tested-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Tested-on: sun50i-h6-pine-h64
Tested-on: imx8mn-ddr4-evk
Tested-on: sun50i-a64-bananapi-m64 Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Jessica Yu <jeyu@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
We can get down to this return value from ERR_CAST() without
initializing hw. Set it to -ENOMEM so that we always return something
sane.
Fixes the following smatch warning:
drivers/clk/rockchip/clk-half-divider.c:228 rockchip_clk_register_halfdiv() error: uninitialized symbol 'hw'.
drivers/clk/rockchip/clk-half-divider.c:228 rockchip_clk_register_halfdiv() warn: passing zero to 'ERR_CAST'
Cc: Elaine Zhang <zhangqing@rock-chips.com> Cc: Heiko Stuebner <heiko@sntech.de> Fixes: 956060a52795 ("clk: rockchip: add support for half divider") Reviewed-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
pci_restore_msi_state() directly writes the MSI/MSI-X related registers
via MMIO. On a physical machine, this works perfectly; for a Linux VM
running on a hypervisor, which typically enables IOMMU interrupt remapping,
the hypervisor usually should trap and emulate the MMIO accesses in order
to re-create the necessary interrupt remapping table entries in the IOMMU,
otherwise the interrupts can not work in the VM after hibernation.
Hyper-V is different from other hypervisors in that it does not trap and
emulate the MMIO accesses, and instead it uses a para-virtualized method,
which requires the VM to call hv_compose_msi_msg() to notify the hypervisor
of the info that would be passed to the hypervisor in the case of the
trap-and-emulate method. This is not an issue to a lot of PCI device
drivers, which destroy and re-create the interrupts across hibernation, so
hv_compose_msi_msg() is called automatically. However, some PCI device
drivers (e.g. the in-tree GPU driver nouveau and the out-of-tree Nvidia
proprietary GPU driver) do not destroy and re-create MSI/MSI-X interrupts
across hibernation, so hv_pci_resume() has to call hv_compose_msi_msg(),
otherwise the PCI device drivers can no longer receive interrupts after
the VM resumes from hibernation.
Hyper-V is also different in that chip->irq_unmask() may fail in a
Linux VM running on Hyper-V (on a physical machine, chip->irq_unmask()
can not fail because unmasking an MSI/MSI-X register just means an MMIO
write): during hibernation, when a CPU is offlined, the kernel tries
to move the interrupt to the remaining CPUs that haven't been offlined
yet. In this case, hv_irq_unmask() -> hv_do_hypercall() always fails
because the vmbus channel has been closed: here the early "return" in
hv_irq_unmask() means the pci_msi_unmask_irq() is not called, i.e. the
desc->masked remains "true", so later after hibernation, the MSI interrupt
always remains masked, which is incorrect. Refer to cpu_disable_common()
-> fixup_irqs() -> irq_migrate_all_off_this_cpu() -> migrate_one_irq():
Fix the issue by calling pci_msi_unmask_irq() unconditionally in
hv_irq_unmask(). Also suppress the error message for hibernation because
the hypercall failure during hibernation does not matter (at this time
all the devices have been frozen). Note: the correct affinity info is
still updated into the irqdata data structure in migrate_one_irq() ->
irq_do_set_affinity() -> hv_set_affinity(), so later when the VM
resumes, hv_pci_restore_msi_state() is able to correctly restore
the interrupt with the correct affinity.
Currently when pointer scp is null a dev_err is being called that
references the pointer which is the very thing we are trying to
avoid doing. Remove the extraneous error message to avoid this
issue.
Addresses-Coverity: ("Dereference after null check") Fixes: 63c13d61eafe ("remoteproc/mediatek: add SCP support for mt8183") Signed-off-by: Colin Ian King <colin.king@canonical.com> Link: https://lore.kernel.org/r/20200918152428.27258-1-colin.king@canonical.com Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
dev_get_drvdata() is called in img_pwm_runtime_resume() before the
driver data is set.
When pm_runtime_enabled() returns false in img_pwm_probe() it calls
img_pwm_runtime_resume() which results in a null pointer access.
This patch fixes the problem by setting the driver data earlier in the
img_pwm_probe() function.
This crash was seen when booting the Imagination Technologies Creator
Ci40 (Marduk) with kernel 5.4 in OpenWrt.
Following commit cfc4c189bc70 ("pwm: Read initial hardware state at
request time") the Rockchip PWM driver can no longer assume a device's
pwm_state structure has been populated after a call to pwmchip_add().
Consequently, the test in rockchip_pwm_probe() intended to prevent the
driver from stopping PWM devices already enabled by the bootloader no
longer functions reliably and this can lead to the kernel hanging
during startup, particularly on devices like the Pinebook Pro that use
a PWM-controlled backlight for their display.
Avoid this by querying the device directly at probe time to determine
whether or not it is enabled.
Fixes: cfc4c189bc70 ("pwm: Read initial hardware state at request time") Signed-off-by: Simon South <simon@simonsouth.net> Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The DT clock probe loop incorrectly terminates after processing "clocks"
only, fix this by re-starting the loop when all entries for current
DT property have been parsed.
Fixes: 8e48b33f9def ("clk: keystone: sci-clk: probe clocks from DT instead of firmware") Reported-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Signed-off-by: Tero Kristo <t-kristo@ti.com> Link: https://lore.kernel.org/r/20200907085740.1083-2-t-kristo@ti.com Acked-by: Santosh Shilimkar <ssantosh@kernel.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
While it is true that devices with is_virtfn=1 will have a Memory Space
Enable bit that is hard-wired to 0, this is not the only case where we
see this behavior -- For example some bare-metal hypervisors lack
Memory Space Enable bit emulation for devices not setting is_virtfn
(s390). Fix this by instead checking for the newly-added
no_command_memory bit which directly denotes the need for
PCI_COMMAND_MEMORY emulation in vfio.
Fixes: abafbc551fdd ("vfio-pci: Invalidate mmaps and block MMIO access on disabled memory") Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
For s390 we can have VFs that are passed-through without the associated
PF. Firmware provides an emulation layer to allow these devices to
operate independently, but is missing emulation of the Memory Space
Enable bit. For these as well as linked VFs, set no_command_memory
which specifies these devices do not implement PCI_COMMAND_MEMORY.
Fixes: abafbc551fdd ("vfio-pci: Invalidate mmaps and block MMIO access on disabled memory") Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Page pinning is used both to translate and pin device mappings for DMA
purpose, as well as to indicate to the IOMMU backend to limit the dirty
page scope to those pages that have been pinned, in the case of an IOMMU
backed device.
To support this, the vfio_pin_pages() interface limits itself to only
singleton groups such that the IOMMU backend can consider dirty page
scope only at the group level. Implement the same requirement for the
vfio_group_pin_pages() interface.
Fixes: 95fc87b44104 ("vfio: Selective dirty page tracking if IOMMU backed device pins pages") Signed-off-by: Yan Zhao <yan.y.zhao@intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The core interrupt code expects the irq_set_affinity call to update the
effective affinity for the interrupt. This was not being done, so update
iproc_msi_irq_set_affinity() to do so.
Link: https://lore.kernel.org/r/20200803035241.7737-1-mark.tomlinson@alliedtelesis.co.nz Fixes: 3bc2b2348835 ("PCI: iproc: Add iProc PCIe MSI support") Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Ray Jui <ray.jui@broadcom.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Include linux/gpio/consumer.h instead of linux/gpio.h, as is said in the
latter file.
This was reported by kernel test bot when compiling for s390.
drivers/pci/controller/pci-aardvark.c:350:2: error: implicit declaration of function 'gpiod_set_value_cansleep' [-Werror,-Wimplicit-function-declaration]
drivers/pci/controller/pci-aardvark.c:1074:21: error: implicit declaration of function 'devm_gpiod_get_from_of_node' [-Werror,-Wimplicit-function-declaration]
drivers/pci/controller/pci-aardvark.c:1076:14: error: use of undeclared identifier 'GPIOD_OUT_LOW'
On Amlogic Meson G12b platform, similar to fclk_div3, the fclk_div2
seems to be necessary for the system to operate correctly as well.
Typically, the clock also gets chosen by the eMMC peripheral. This
probably masked the problem so far. However, when booting from a SD
card the clock seems to get disabled which leads to a system freeze.
Let's mark this clock as critical, fixing boot from SD card on G12b
platforms.
The i2c-rcar driver utilizes the Generic Reset Controller kernel
feature, so select the RESET_CONTROLLER option when the I2C_RCAR
option is selected with a Gen3 SoC.
Fixes: 2b16fd63059ab9 ("i2c: rcar: handle RXDMA HW behaviour on Gen3") Signed-off-by: Dirk Behme <dirk.behme@de.bosch.com> Signed-off-by: Andy Lowe <andy_lowe@mentor.com>
[erosca: Add "if ARCH_RCAR_GEN3" per Wolfram's request] Signed-off-by: Eugeniu Rosca <erosca@de.adit-jv.com> Signed-off-by: Wolfram Sang <wsa@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
There are more differences than what we initially thought.
Let's keeps things clear and separate the axg and g12a regmap tables of the
audio clock controller.
If the txdone is done by polling, it is possible for msg_submit() to start
the timer while txdone_hrtimer() callback is running. If the timer needs
recheduling, it could already be enqueued by the time hrtimer_forward_now()
is called, leading hrtimer to loudly complain.
This can be fixed by not starting the timer from the callback path. Which
requires the timer reloading as long as any message is queued on the
channel, and not just when current tx is not done yet.
rio_dma_transfer() attempts to clamp the return value of
pin_user_pages_fast() to be >= 0. However, the attempt fails because
nr_pages is overridden a few lines later, and restored to the undesirable
-ERRNO value.
The return value is ultimately stored in nr_pages, which in turn is passed
to unpin_user_pages(), which expects nr_pages >= 0, else, disaster.
Fix this by fixing the nesting of the assignment to nr_pages: nr_pages
should be clamped to zero if pin_user_pages_fast() returns -ERRNO, or set
to the return value of pin_user_pages_fast(), otherwise.
[jhubbard@nvidia.com: new changelog]
Fixes: e8de370188d09 ("rapidio: add mport char device driver") Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Alexandre Bounine <alex.bou9@gmail.com> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lkml.kernel.org/r/1600227737-20785-1-git-send-email-jrdr.linux@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
ramfs needs to check that pages are both physically contiguous and
contiguous in the file. If the page cache happens to have, eg, page A for
index 0 of the file, no page for index 1, and page A+1 for index 2, then
an mmap of the first two pages of the file will succeed when it should
fail.
Fixes: 642fb4d1f1dd ("[PATCH] NOMMU: Provide shared-writable mmap support on ramfs") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: David Howells <dhowells@redhat.com> Link: https://lkml.kernel.org/r/20200914122239.GO6583@casper.infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>