kabylake_ssp_fixup function uses snd_soc_dpcm to identify the
codecs DAIs. The HW parameters are changed based on the codec DAI of the
stream. The earlier approach to get snd_soc_dpcm was using container_of()
macro on snd_pcm_hw_params.
The structures have been modified over time and snd_soc_dpcm does not have
snd_pcm_hw_params as a reference but as a copy. This causes the current
driver to crash when used.
This patch changes the way snd_soc_dpcm is extracted. snd_soc_pcm_runtime
holds 2 dpcm instances (one for playback and one for capture). 2 codecs
on the SSP are dmic (capture) and speakers (playback). Based on the
stream direction, snd_soc_dpcm is extracted from snd_soc_pcm_runtime.
Tested for all use cases of the driver.
Based on similar fix in kbl_rt5663_rt5514_max98927.c
from Harsha Priya <harshapriya.n@intel.com> and
Vamshi Krishna Gopal <vamshi.krishna.gopal@intel.com>
sound/soc/samsung/tm2_wm5110.c:605:6: style: Variable 'ret' is
reassigned a value before the old one has been
used. [redundantAssignment]
ret = devm_snd_soc_register_component(dev, &tm2_component,
^
sound/soc/samsung/tm2_wm5110.c:554:7: note: ret is assigned
ret = of_parse_phandle_with_args(dev->of_node, "i2s-controller",
^
sound/soc/samsung/tm2_wm5110.c:605:6: note: ret is overwritten
ret = devm_snd_soc_register_component(dev, &tm2_component,
^
The args is a stack variable, so it could have junk (uninitialized)
therefore args.np could have a non-NULL and random value even though
property was missing. Later could trigger invalid pointer dereference.
There's no need to check for args.np because args.np won't be
initialized on errors.
Fixes: 8d1513cef51a ("ASoC: samsung: Add support for HDMI audio on TM2 board") Cc: <stable@vger.kernel.org> Suggested-by: Krzysztof Kozlowski <krzk@kernel.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Reviewed-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Link: https://lore.kernel.org/r/20210312180231.2741-2-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When the USB headset is plug into an external hub, sometimes
can't set config due to not enough bandwidth, so need improve
LS/FS INT/ISOC bandwidth scheduling with TT.
Side effect may happen if use or operator to set schedule parameters
when the parameters are already set before. Set them directly due to
other bits are reserved.
The XR21V141X does not have a 5- or 6-bit mode, but the current
implementation failed to properly restore the old setting when CS5 or
CS6 was requested. Instead an invalid request would be sent to the
device.
Fixes: c2d405aa86b4 ("USB: serial: add MaxLinear/Exar USB to Serial driver") Reviewed-by: Manivannan Sadhasivam <mani@kernel.org> Cc: stable@vger.kernel.org # 5.12 Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
power_supply_changed needs to be called to notify clients
after the partner accepts the requested values for the pps
case.
Also, remove the redundant power_supply_changed at the end
of the tcpm_reset_port as power_supply_changed is already
called right after usb_type is changed.
tcpm_pd_select_pps_apdo overwrites port->pps_data.min_volt,
port->pps_data.max_volt, port->pps_data.max_curr even before
port partner accepts the requests. This leaves incorrect values
in current_limit and supply_voltage that get exported by
"tcpm-source-psy-". Solving this problem by caching the request
values in req_min_volt, req_max_volt, req_max_curr, req_out_volt,
req_op_curr. min_volt, max_volt, max_curr gets updated once the
partner accepts the request. current_limit, supply_voltage gets updated
once local port's tcpm enters SNK_TRANSITION_SINK when the accepted
current_limit and supply_voltage is enforced.
tcpm_pd_build_request overwrites current_limit and supply_voltage
even before port partner accepts the requests. This leaves stale
values in current_limit and supply_voltage that get exported by
"tcpm-source-psy-". Solving this problem by caching the request
values of current limit/supply voltage in req_current_limit
and req_supply_voltage. current_limit/supply_voltage gets updated
once the port partner accepts the request.
The port close_delay and closing wait parameters set by TIOCSSERIAL are
specified in jiffies, while the values returned by TIOCGSERIAL are
specified in centiseconds.
Add the missing conversions so that TIOCSSERIAL works as expected also
when HZ is not 100.
usb_role_switch_find_by_fwnode() returns a reference to the role-switch
which must be put by calling usb_role_switch_put().
usb_role_switch_put() calls module_put(sw->dev.parent->driver->owner),
add a matching try_module_get() to usb_role_switch_find_by_fwnode(),
making it behave the same as the other usb_role_switch functions
which return a reference.
This avoids a WARN_ON being hit at kernel/module.c:1158 due to the
module-refcount going below 0.
Fixes: c6919d5e0cd1 ("usb: roles: Add usb_role_switch_find_by_fwnode()") Cc: stable <stable@vger.kernel.org> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20210409124136.65591-1-hdegoede@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The offending commit claimed that trying to set the values reported back
by TIOCGSERIAL as a regular user could result in an -EPERM error when HZ
is 250, but that was never the case.
With HZ=250, the default 0.5 second value of close_delay is converted to
125 jiffies when set and is converted back to 50 centiseconds by
TIOCGSERIAL as expected (not 12 cs as was claimed, even if that was the
case before an earlier fix).
Comparing the internal current and new jiffies values is just fine to
determine if the value is about to change so drop the bogus workaround
(which was also backported to stable).
For completeness: With different default values for these parameters or
with a HZ value not divisible by two, the lack of rounding when setting
the default values in tty_port_init() could result in an -EPERM being
returned, but this is hardly something we need to worry about.
Cc: Anthony Mallet <anthony.mallet@laas.fr> Cc: stable@vger.kernel.org Acked-by: Oliver Neukum <oneukum@suse.com> Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20210408131602.27956-2-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commits 8a4cd82d ("nfc: fix refcount leak in llcp_sock_connect()")
and c33b1cc62 ("nfc: fix refcount leak in llcp_sock_bind()")
fixed a refcount leak bug in bind/connect but introduced a
use-after-free if the same local is assigned to 2 different sockets.
This can be triggered by the following simple program:
int sock1 = socket( AF_NFC, SOCK_STREAM, NFC_SOCKPROTO_LLCP );
int sock2 = socket( AF_NFC, SOCK_STREAM, NFC_SOCKPROTO_LLCP );
memset( &addr, 0, sizeof(struct sockaddr_nfc_llcp) );
addr.sa_family = AF_NFC;
addr.nfc_protocol = NFC_PROTO_NFC_DEP;
bind( sock1, (struct sockaddr*) &addr, sizeof(struct sockaddr_nfc_llcp) )
bind( sock2, (struct sockaddr*) &addr, sizeof(struct sockaddr_nfc_llcp) )
close(sock1);
close(sock2);
Fix this by assigning NULL to llcp_sock->local after calling
nfc_llcp_local_put.
This addresses CVE-2021-23134.
Reported-by: Or Cohen <orcohen@paloaltonetworks.com> Reported-by: Nadav Markus <nmarkus@paloaltonetworks.com> Fixes: c33b1cc62 ("nfc: fix refcount leak in llcp_sock_bind()") Signed-off-by: Or Cohen <orcohen@paloaltonetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is a possible race condition vulnerability between issuing a HCI
command and removing the cont. Specifically, functions hci_req_sync()
and hci_dev_do_close() can race each other like below:
thread-A in hci_req_sync() | thread-B in hci_dev_do_close()
| hci_req_sync_lock(hdev);
test_bit(HCI_UP, &hdev->flags); |
... | test_and_clear_bit(HCI_UP, &hdev->flags)
hci_req_sync_lock(hdev); |
|
In this commit we alter the sequence in function hci_req_sync(). Hence,
the thread-A cannot issue th.
Signed-off-by: Lin Ma <linma@zju.edu.cn> Cc: Marcel Holtmann <marcel@holtmann.org> Fixes: 7c6a329e4447 ("[Bluetooth] Fix regression from using default link policy") Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
hci_chan can be created in 2 places: hci_loglink_complete_evt() if
it is an AMP hci_chan, or l2cap_conn_add() otherwise. In theory,
Only AMP hci_chan should be removed by a call to
hci_disconn_loglink_complete_evt(). However, the controller might mess
up, call that function, and destroy an hci_chan which is not initiated
by hci_loglink_complete_evt().
This patch adds a verification that the destroyed hci_chan must have
been init'd by hci_loglink_complete_evt().
Memory state around the buggy address: ffff8881d7af9080: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb ffff8881d7af9100: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
>ffff8881d7af9180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^ ffff8881d7af9200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff8881d7af9280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
The tz->lock must be hold during the looping over the instances in that
thermal zone. This lock was missing in the governor code since the
beginning, so it's hard to point into a particular commit.
CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210422153624.6074-2-lukasz.luba@arm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Slab OOB issue is scanned by KASAN in cpu_power_to_freq().
If power is limited below the power of OPP0 in EM table,
it will cause slab out-of-bound issue with negative array
index.
Return the lowest frequency if limited power cannot found
a suitable OPP in EM table to fix this issue.
Fixes: 371a3bc79c11b ("thermal/drivers/cpufreq_cooling: Fix wrong frequency converted from power") Signed-off-by: brian-sy yang <brian-sy.yang@mediatek.com> Signed-off-by: Michael Kao <michael.kao@mediatek.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Cc: stable@vger.kernel.org #v5.7 Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20201229050831.19493-1-michael.kao@mediatek.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 9af7706492f9 ("lib/vsprintf: Remove support for %pF and %pf in
favour of %pS and %ps") removed support for %pF and %pf, and correctly
removed the handling of those cases in vbin_printf(). However, the
corresponding cases in bstr_printf() were left behind.
In the same series, %pf was re-purposed for dealing with
fwnodes (3bd32d6a2ee6, "lib/vsprintf: Add %pfw conversion specifier
for printing fwnode names").
So should anyone use %pf with the binary printf routines,
vbin_printf() would (correctly, as it involves dereferencing the
pointer) do the string formatting to the u32 array, but bstr_printf()
would not copy the string from the u32 array, but instead interpret
the first sizeof(void*) bytes of the formatted string as a pointer -
which generally won't end well (also, all subsequent get_args would be
out of sync).
Fixes: 9af7706492f9 ("lib/vsprintf: Remove support for %pF and %pf in favour of %pS and %ps") Cc: stable@vger.kernel.org Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20210423094529.1862521-1-linux@rasmusvillemoes.dk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When loading a device-mapper table for a request-based mapped device,
and the allocation/initialization of the blk_mq_tag_set for the device
fails, a following device remove will cause a double free.
When allocation/initialization of the blk_mq_tag_set fails in
dm_mq_init_request_queue(), it is uninitialized/freed, but the pointer
is not reset to NULL; so when dev_remove() later gets into
dm_mq_cleanup_mapped_device() it sees the pointer and tries to
uninitialize and free it again.
Fix this by setting the pointer to NULL in dm_mq_init_request_queue()
error-handling. Also set it to NULL in dm_mq_cleanup_mapped_device().
Cc: <stable@vger.kernel.org> # 4.6+ Fixes: 1c357a1e86a4 ("dm: allocate blk_mq_tag_set rather than embed in mapped_device") Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This division bug meant the search for free metadata space could skip
the final allocation bitmap's worth of entries. Fix affects DM thinp,
cache and era targets.
Cc: stable@vger.kernel.org Signed-off-by: Joe Thornber <ejt@redhat.com> Tested-by: Ming-Hung Tsai <mtsai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It was reported that a fix to the ring buffer recursion detection would
cause a hung machine when performing suspend / resume testing. The
following backtrace was extracted from debugging that case:
Since the fix to the recursion detection would allow a single recursion to
happen while tracing, this lead to the trace_clock_global() taking a spin
lock and then trying to take it again:
ring_buffer_lock_reserve() {
trace_clock_global() {
arch_spin_lock() {
queued_spin_lock_slowpath() {
/* lock taken */
(something else gets traced by function graph tracer)
ring_buffer_lock_reserve() {
trace_clock_global() {
arch_spin_lock() {
queued_spin_lock_slowpath() {
/* DEAD LOCK! */
Tracing should *never* block, as it can lead to strange lockups like the
above.
Restructure the trace_clock_global() code to instead of simply taking a
lock to update the recorded "prev_time" simply use it, as two events
happening on two different CPUs that calls this at the same time, really
doesn't matter which one goes first. Use a trylock to grab the lock for
updating the prev_time, and if it fails, simply try again the next time.
If it failed to be taken, that means something else is already updating
it.
Link: https://lkml.kernel.org/r/20210430121758.650b6e8a@gandalf.local.home Cc: stable@vger.kernel.org Tested-by: Konstantin Kharlamov <hi-angel@yandex.ru> Tested-by: Todd Brandt <todd.e.brandt@linux.intel.com> Fixes: b02414c8f045 ("ring-buffer: Fix recursion protection transitions between interrupt context") # started showing the problem Fixes: 14131f2f98ac3 ("tracing: implement trace_clock_*() APIs") # where the bug happened
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=212761 Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The default max PID is set by PID_MAX_DEFAULT, and the tracing
infrastructure uses this number to map PIDs to the comm names of the
tasks, such output of the trace can show names from the recorded PIDs in
the ring buffer. This mapping is also exported to user space via the
"saved_cmdlines" file in the tracefs directory.
But currently the mapping expects the PIDs to be less than
PID_MAX_DEFAULT, which is the default maximum and not the real maximum.
Recently, systemd will increases the maximum value of a PID on the system,
and when tasks are traced that have a PID higher than PID_MAX_DEFAULT, its
comm is not recorded. This leads to the entire trace to have "<...>" as
the comm name, which is pretty useless.
Instead, keep the array mapping the size of PID_MAX_DEFAULT, but instead
of just mapping the index to the comm, map a mask of the PID
(PID_MAX_DEFAULT - 1) to the comm, and find the full PID from the
map_cmdline_to_pid array (that already exists).
This bug goes back to the beginning of ftrace, but hasn't been an issue
until user space started increasing the maximum value of PIDs.
The idx_to_offset() function returns type int (32-bit signed), but
MSR_PKG_ENERGY_STAT is u32 and would be interpreted as a negative number.
The end result is that it hits the if (offset < 0) check in update_msr_sum()
which prevents the timer callback from updating the stat in the background when
long durations are used. The similar issue exists in offset_to_idx() and
update_msr_sum(). Fix this issue by converting the 'int' to 'off_t' accordingly.
Fixes: 9972d5d84d76 ("tools/power turbostat: Enable accumulate RAPL display") Signed-off-by: Calvin Walton <calvin.walton@kepstin.ca> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The rsi_resume() does access the bus to enable interrupts on the RSI
SDIO WiFi card, however when calling sdio_claim_host() in the resume
path, it is possible the bus is already claimed and sdio_claim_host()
spins indefinitelly. Enable the SDIO card interrupts in resume_noirq
instead to prevent anything else from claiming the SDIO bus first.
Fixes: 20db07332736 ("rsi: sdio suspend and resume support") Signed-off-by: Marek Vasut <marex@denx.de> Cc: Amitkumar Karwar <amit.karwar@redpinesignals.com> Cc: Angus Ainslie <angus@akkea.ca> Cc: David S. Miller <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Kalle Valo <kvalo@codeaurora.org> Cc: Karun Eagalapati <karun256@gmail.com> Cc: Martin Kepplinger <martink@posteo.de> Cc: Sebastian Krzyszkowiak <sebastian.krzyszkowiak@puri.sm> Cc: Siva Rebbagondla <siva8118@gmail.com> Cc: netdev@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20210327235932.175896-1-marex@denx.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
syzbot reported memory leak in tty/vt.
The problem was in VT_DISALLOCATE ioctl cmd.
After allocating unimap with PIO_UNIMAP it wasn't
freed via VT_DISALLOCATE, but vc_cons[currcons].d was
zeroed.
dw_pcie_ep_init() depends on the detected iATU region numbers to allocate
the in/outbound window management bitmap. It fails after 281f1f99cf3a
("PCI: dwc: Detect number of iATU windows").
Move the iATU region detection into a new function, move the detection to
the very beginning of dw_pcie_host_init() and dw_pcie_ep_init(). Also
remove it from the dw_pcie_setup(), since it's more like a software
initialization step than hardware setup.
Link: https://lore.kernel.org/r/20210125044803.4310-1-Zhiqiang.Hou@nxp.com Link: https://lore.kernel.org/linux-pci/20210407131255.702054-1-dmitry.baryshkov@linaro.org Link: https://lore.kernel.org/r/20210413142219.2301430-1-dmitry.baryshkov@linaro.org Fixes: 281f1f99cf3a ("PCI: dwc: Detect number of iATU windows") Tested-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Tested-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>
[DB: moved dw_pcie_iatu_detect to happen after host_init callback] Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Rob Herring <robh@kernel.org> Cc: stable@vger.kernel.org # v5.11+ Cc: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
According to the programming guide, to switch mode for DRD controller,
the driver needs to do the following.
To switch from device to host:
1. Reset controller with GCTL.CoreSoftReset
2. Set GCTL.PrtCapDir(host mode)
3. Reset the host with USBCMD.HCRESET
4. Then follow up with the initializing host registers sequence
To switch from host to device:
1. Reset controller with GCTL.CoreSoftReset
2. Set GCTL.PrtCapDir(device mode)
3. Reset the device with DCTL.CSftRst
4. Then follow up with the initializing registers sequence
Currently we're missing step 1) to do GCTL.CoreSoftReset and step 3) of
switching from host to device. John Stult reported a lockup issue seen
with HiKey960 platform without these steps[1]. Similar issue is observed
with Ferry's testing platform[2].
So, apply the required steps along with some fixes to Yu Chen's and John
Stultz's version. The main fixes to their versions are the missing wait
for clocks synchronization before clearing GCTL.CoreSoftReset and only
apply DCTL.CSftRst when switching from host to device.
The START_TRANSFER command needs to be executed while in ON/U0 link
state (with an exception during register initialization). Don't use
dwc->link_state to check this since the driver only tracks the link
state when the link state change interrupt is enabled. Check the link
state from DSTS register instead.
Note that often the host already brings the device out of low power
before it sends/requests the next transfer. So, the user won't see any
issue when the device starts transfer then. This issue is more
noticeable in cases when the device delays starting transfer, which can
happen during delayed control status after the host put the device in
low power.
The programming guide incorrectly stated that the DCFG.bInterval_m1 must
be set to 0 when operating in fullspeed. There's no such limitation for
all IPs. See DWC_usb3x programming guide section 3.2.2.1.
Fixes bug with the handling of more than one language in
the string table in f_fs.c.
str_count was not reset for subsequent language codes.
str_count-- "rolls under" and processes u32 max strings on
the processing of the second language entry.
The existing bug can be reproduced by adding a second language table
to the structure "strings" in tools/usb/ffs-test.c.
Upon driver unbind usb_free_all_descriptors() function frees all
speed descriptor pointers without setting them to NULL. In case
gadget speed changes (i.e from super speed plus to super speed)
after driver unbind only upto super speed descriptor pointers get
populated. Super speed plus desc still holds the stale (already
freed) pointer. Fix this issue by setting all descriptor pointers
to NULL after freeing them in usb_free_all_descriptors().
Fixes: f5c61225cf29 ("usb: gadget: Update function for SuperSpeedPlus")
cc: stable@vger.kernel.org Reviewed-by: Peter Chen <peter.chen@kernel.org> Signed-off-by: Hemant Kumar <hemantk@codeaurora.org> Signed-off-by: Wesley Cheng <wcheng@codeaurora.org> Link: https://lore.kernel.org/r/1619034452-17334-1-git-send-email-wcheng@codeaurora.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fix a general protection fault reported by syzbot due to a race between
gadget_setup() and gadget_unbind() in raw_gadget.
The gadget core is supposed to guarantee that there won't be any more
callbacks to the gadget driver once the driver's unbind routine is
called. That guarantee is enforced in usb_gadget_remove_driver as
follows:
usb_gadget_disconnect(udc->gadget);
if (udc->gadget->irq)
synchronize_irq(udc->gadget->irq);
udc->driver->unbind(udc->gadget);
usb_gadget_udc_stop(udc);
usb_gadget_disconnect turns off the pullup resistor, telling the host
that the gadget is no longer connected and preventing the transmission
of any more USB packets. Any packets that have already been received
are sure to processed by the UDC driver's interrupt handler by the time
synchronize_irq returns.
But this doesn't work with dummy_hcd, because dummy_hcd doesn't use
interrupts; it uses a timer instead. It does have code to emulate the
effect of synchronize_irq, but that code doesn't get invoked at the
right time -- it currently runs in usb_gadget_udc_stop, after the unbind
callback instead of before. Indeed, there's no way for
usb_gadget_remove_driver to invoke this code before the unbind callback.
To fix this, move the synchronize_irq() emulation code to dummy_pullup
so that it runs before unbind. Also, add a comment explaining why it is
necessary to have it there.
syzkaller identified KASAN: null-ptr-deref Write in
io_uring_cancel_sqpoll.
io_uring_cancel_sqpoll is called by io_sq_thread before calling
io_uring_alloc_task_context. This leads to current->io_uring being NULL.
io_uring_cancel_sqpoll should not have to deal with threads where
current->io_uring is NULL.
In order to cast a wider safety net, perform input sanitisation directly
in io_uring_cancel_sqpoll and return for NULL value of current->io_uring.
This is safe since if current->io_uring isn't set, then there's no way
for the task to have submitted any requests.
After closing an SQPOLL ring, io_ring_exit_work() kicks in and starts
doing cancellations via io_uring_try_cancel_requests(). It will go
through io_uring_try_cancel_iowq(), which uses ctx->tctx_list, but as
SQPOLL task don't have a ctx note, its io-wq won't be reachable and so
is left not cancelled.
It will eventually cancelled when one of the tasks dies, but if a thread
group survives for long and changes rings, it will spawn lots of
unreclaimed resources and live locked works.
Cancel SQPOLL task's io-wq separately in io_ring_exit_work().
[ 736.982891] INFO: task iou-sqp-4294:4295 blocked for more than 122 seconds.
[ 736.982897] Call Trace:
[ 736.982901] schedule+0x68/0xe0
[ 736.982903] io_uring_cancel_sqpoll+0xdb/0x110
[ 736.982908] io_sqpoll_cancel_cb+0x24/0x30
[ 736.982911] io_run_task_work_head+0x28/0x50
[ 736.982913] io_sq_thread+0x4e3/0x720
We call io_uring_cancel_sqpoll() one by one for each ctx either in
sq_thread() itself or via task works, and it's intended to cancel all
requests of a specified context. However the function uses per-task
counters to track the number of inflight requests, so it counts more
requests than available via currect io_uring ctx and goes to sleep for
them to appear (e.g. from IRQ), that will never happen.
Cancel a bit more than before, i.e. all ctxs that share sqpoll
and continue to use shared counters. Don't forget that we should not
remove ctx from the list before running that task_work sqpoll-cancel,
otherwise the function wouldn't be able to find the context and will
hang.
SQPOLL task won't submit requests for a context that is currently dying,
so no need to remove ctx from sqd_list prior the main loop of
io_ring_exit_work(). Kill it, will be removed by io_sq_thread_finish()
and only brings confusion and lockups.
The inst function argument is != NULL only for Venus v1 and
we did not migrate v1 to a hfi_platform abstraction yet. So
check for instance != NULL only after hfi_platform_get returns
no error.
The Venus v1 behaves differently comparing with the other Venus
version in respect to capability parsing and when they are send
to the driver. So we don't need to initialize hfi parser for
multiple invocations like what we do for > v1 Venus versions.
It is observed that on Venus v1 the default header-mode is producing
a bitstream which is not playble. Change the default header-mode to
joined with 1st frame.
The rate of the core clock is set through devm_pm_opp_set_rate and
to avoid errors from it we have to set the name of the clock via
dev_pm_opp_set_clkname.
Commit b2d3bef1aa78 ("media: coda: Add a V4L2 user for control error
macroblocks count") add the control for the decoder devices. But
during streamon() this ioctl gets called for all (encoder and decoder)
devices and on encoder devices this causes a null pointer exception.
Fix this by setting the control only if it is really accessible.
Fixes: b2d3bef1aa78 ("media: coda: Add a V4L2 user for control error macroblocks count") Signed-off-by: Marco Felsch <m.felsch@pengutronix.de> Cc: <stable@vger.kernel.org> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When controls are used together with the Request API, then for
each request a v4l2_ctrl_handler struct is allocated. This contains
the controls that can be set in a request. If a control is *not* set in
the request, then the value used in the most recent previous request
must be used, or the current value if it is not found in any outstanding
requests.
The framework tried to find such a previous request and it would set
the 'req' pointer in struct v4l2_ctrl_ref to the v4l2_ctrl_ref of the
control in such a previous request. So far, so good. However, when that
previous request was applied to the hardware, returned to userspace, and
then userspace would re-init or free that request, any 'ref' pointer in
still-queued requests would suddenly point to freed memory.
This was not noticed before since the drivers that use this expected
that each request would always have the controls set, so there was
never any need to find a control in older requests. This requirement
was relaxed, and now this bug surfaced.
It was also made worse by changeset 2fae4d6aabc8 ("media: v4l2-ctrls: v4l2_ctrl_request_complete() should always set ref->req")
which increased the chance of this happening.
The use of the 'req' pointer in v4l2_ctrl_ref was very fragile, so
drop this entirely. Instead add a valid_p_req bool to indicate that
p_req contains a valid value for this control. And if it is false,
then just use the current value of the control.
Note that VIDIOC_G_EXT_CTRLS will always return -EACCES when attempting
to get a control from a request until the request is completed. And in
that case, all controls in the request will have the control value set
(i.e. valid_p_req is true). This means that the whole 'find the most
recent previous request containing a control' idea is pointless, and
the code can be simplified considerably.
The v4l2_g_ext_ctrls_common() function was refactored a bit to make
it more understandable. It also avoids updating volatile controls
in a completed request since that was already done when the request
was completed.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Fixes: 2fae4d6aabc8 ("media: v4l2-ctrls: v4l2_ctrl_request_complete() should always set ref->req") Fixes: 6fa6f831f095 ("media: v4l2-ctrls: add core request support") Cc: <stable@vger.kernel.org> # for v5.9 and up Tested-by: Alexandre Courbot <acourbot@chromium.org> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Do not modify imgu_pipe->nodes[inode].vdev_fmt.fmt.pix_mp, until the
format has been correctly validated.
Otherwise, even if we use a backup variable, there is a period of time
where imgu_pipe->nodes[inode].vdev_fmt.fmt.pix_mp might have an invalid
value that can be used by other functions.
If there in an error during a set_fmt, do not overwrite the previous
sizes with the invalid config.
Without this patch, v4l2-compliance ends up allocating 4GiB of RAM and
causing the following OOPs
[ 38.662975] ipu3-imgu 0000:00:05.0: swiotlb buffer is full (sz: 4096 bytes)
[ 38.662980] DMA: Out of SW-IOMMU space for 4096 bytes at device 0000:00:05.0
[ 38.663010] general protection fault: 0000 [#1] PREEMPT SMP
dvb_usb_device_init() allocates a dvb_usb_device object, but it
doesn't release the object by itself even at errors. The object is
released in the callee side (dvb_usb_init()) in some error cases via
dvb_usb_exit() call, but it also missed the object free in other error
paths. And, the caller (it's only dvb_usb_device_init()) doesn't seem
caring the resource management as well, hence those memories are
leaked.
This patch assures releasing the memory at the error path in
dvb_usb_device_init(). Now dvb_usb_init() frees the resources it
allocated but leaves the passed dvb_usb_device object intact. In
turn, the dvb_usb_device object is released in dvb_usb_device_init()
instead.
We could use dvb_usb_exit() function for releasing everything in the
callee (as it was used for some error cases in the original code), but
releasing the passed object in the callee is non-intuitive and
error-prone. So I took this approach (which is more standard in Linus
kernel code) although it ended with a bit more open codes.
Along with the change, the patch makes sure that USB intfdata is reset
and don't return the bogus pointer to the caller of
dvb_usb_device_init() at the error path, too.
Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
dvb_usb_device_init() copies the properties to the own data, so that
the callers can release the original properties later (as done in the
commit 299c7007e936 ("media: dw2102: Fix memleak on sequence of
probes")). However, it also stores dev->desc pointer that is a
reference to the original properties data. Since dev->desc is
referred later, it may result in use-after-free, in the worst case,
leading to a kernel Oops as reported.
This patch addresses the problem by allocating and copying the
properties at first, then get the desc from the copied properties.
Reported-and-tested-by: Stefan Seyfried <seife+kernel@b1-systems.com> BugLink: http://bugzilla.opensuse.org/show_bug.cgi?id=1181104 Reviewed-by: Robert Foss <robert.foss@linaro.org> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
dvb_media_device_free() is leaking memory. Free `dvbdev->adapter->conn`
before setting it to NULL, as documented in include/media/media-device.h:
"The media_entity instance itself must be freed explicitly by the driver
if required."
Eric has noticed that after pagecache read rework, generic/418 is
occasionally failing for ext4 when blocksize < pagesize. In fact, the
pagecache rework just made hard to hit race in ext4 more likely. The
problem is that since ext4 conversion of direct IO writes to iomap
framework (commit 378f32bab371), we update inode size after direct IO
write only after invalidating page cache. Thus if buffered read sneaks
at unfortunate moment like:
CPU1 - write at offset 1k CPU2 - read from offset 0
iomap_dio_rw(..., IOMAP_DIO_FORCE_WAIT);
ext4_readpage();
ext4_handle_inode_extension()
the read will zero out tail of the page as it still sees smaller inode
size and thus page cache becomes inconsistent with on-disk contents with
all the consequences.
Fix the problem by moving inode size update into end_io handler which
gets called before the page cache is invalidated.
Reported-and-tested-by: Eric Whitney <enwlinux@gmail.com> Fixes: 378f32bab371 ("ext4: introduce direct I/O write using iomap infrastructure") CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Acked-by: Dave Chinner <dchinner@redhat.com> Link: https://lore.kernel.org/r/20210415155417.4734-1-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This is needed to allow generic/607 to pass for file systems with the
inline data_feature enabled, and it allows the use of file systems
where the directories use inline_data, while the files are accessed
via DAX.
Fix As write_mmp_block() so that it returns -EIO instead of 1, so that
the correct error gets saved into the superblock.
Cc: stable@kernel.org Fixes: 54d3adbc29f0 ("ext4: save all error info in save_error_info() and drop ext4_set_errno()") Reported-by: Liu Zhi Qiang <liuzhiqiang26@huawei.com> Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/20210406025331.148343-1-yebin10@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We should set the error code when ext4_commit_super check argument failed.
Found in code review. Fixes: c4be0c1dc4cdc ("filesystem freeze: add error handling of write_super_lockfs/unlockfs"). Cc: stable@kernel.org Signed-off-by: Fengnan Chang <changfengnan@vivo.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/20210402101631.561-1-changfengnan@vivo.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Before commit 014c9caa29d3 ("ext4: make ext4_abort() use
__ext4_error()"), the following series of commands would trigger a
panic:
1. mount /dev/sda -o ro,errors=panic test
2. mount /dev/sda -o remount,abort test
After commit 014c9caa29d3, remounting a file system using the test
mount option "abort" will no longer trigger a panic. This commit will
restore the behaviour immediately before commit 014c9caa29d3.
(However, note that the Linux kernel's behavior has not been
consistent; some previous kernel versions, including 5.4 and 4.19
similarly did not panic after using the mount option "abort".)
This also makes a change to long-standing behaviour; namely, the
following series commands will now cause a panic, when previously it
did not:
1. mount /dev/sda -o ro,errors=panic test
2. echo test > /sys/fs/ext4/sda/trigger_fs_error
However, this makes ext4's behaviour much more consistent, so this is
a good thing.
Cc: stable@kernel.org Fixes: 014c9caa29d3 ("ext4: make ext4_abort() use __ext4_error()") Signed-off-by: Ye Bin <yebin10@huawei.com> Link: https://lore.kernel.org/r/20210401081903.3421208-1-yebin10@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When CONFIG_QUOTA is enabled, if we failed to mount the filesystem due
to some error happens behind ext4_orphan_cleanup(), it will end up
triggering a after free issue of super_block. The problem is that
ext4_orphan_cleanup() will set SB_ACTIVE flag if CONFIG_QUOTA is
enabled, after we cleanup the truncated inodes, the last iput() will put
them into the lru list, and these inodes' pages may probably dirty and
will be write back by the writeback thread, so it could be raced by
freeing super_block in the error path of mount_bdev().
After check the setting of SB_ACTIVE flag in ext4_orphan_cleanup(), it
was used to ensure updating the quota file properly, but evict inode and
trash data immediately in the last iput does not affect the quotafile,
so setting the SB_ACTIVE flag seems not required[1]. Fix this issue by
just remove the SB_ACTIVE setting.
Cc: stable@kernel.org Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Tested-by: Jan Kara <jack@suse.cz> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20210331033138.918975-1-yi.zhang@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit <50122847007> ("ext4: fix check to prevent initializing reserved
inodes") check the block group zero and prevent initializing reserved
inodes. But in some special cases, the reserved inode may not all belong
to the group zero, it may exist into the second group if we format
filesystem below.
So, it will end up triggering a false positive report of a corrupted
file system. This patch fix it by avoid check reserved inodes if no free
inode blocks will be zeroed.
Cc: stable@kernel.org Fixes: 50122847007 ("ext4: fix check to prevent initializing reserved inodes") Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Suggested-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20210331121516.2243099-1-yi.zhang@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Assertion checks in jbd2_journal_dirty_metadata() are known to be racy
but we don't want to be grabbing locks just for them. We thus recheck
them under b_state_lock only if it looks like they would fail. Annotate
the checks with data_race().
Cc: stable@kernel.org Reported-by: Hao Sun <sunhao.th@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20210406161804.20150-2-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Access to journal->j_running_transaction is not protected by appropriate
lock and thus is racy. We are well aware of that and the code handles
the race properly. Just add a comment and data_race() annotation.
Cc: stable@kernel.org Reported-by: syzbot+30774a6acf6a2cf6d535@syzkaller.appspotmail.com Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20210406161804.20150-1-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If the timestamp of the .config file is updated, config_data.gz is
regenerated, then vmlinux is re-linked. This occurs even if the content
of the .config has not changed at all.
This issue was mitigated by commit 67424f61f813 ("kconfig: do not write
.config if the content is the same"); Kconfig does not update the
.config when it ends up with the identical configuration.
The issue is remaining when the .config is created by *_defconfig with
some config fragment(s) applied on top.
This is typical for powerpc and mips, where several *_defconfig targets
are constructed by using merge_config.sh.
One workaround is to have the copy of the .config. The filechk rule
updates the copy, kernel/config_data, by checking the content instead
of the timestamp.
With this commit, the second run with the same configuration avoids
the needless rebuilds.
$ make ARCH=mips defconfig all
[ snip ]
$ make ARCH=mips defconfig all
*** Default configuration is based on target '32r2el_defconfig'
Using ./arch/mips/configs/generic_defconfig as base
Merging arch/mips/configs/generic/32r2.config
Merging arch/mips/configs/generic/el.config
Merging ./arch/mips/configs/generic/board-boston.config
Merging ./arch/mips/configs/generic/board-ni169445.config
Merging ./arch/mips/configs/generic/board-ocelot.config
Merging ./arch/mips/configs/generic/board-ranchu.config
Merging ./arch/mips/configs/generic/board-sead-3.config
Merging ./arch/mips/configs/generic/board-xilfpga.config
#
# configuration written to .config
#
SYNC include/config/auto.conf
CALL scripts/checksyscalls.sh
CALL scripts/atomic/check-atomics.sh
CHK include/generated/compile.h
CHK include/generated/autoksyms.h
Initialize MSR_TSC_AUX with CPU node information if RDTSCP or RDPID is
supported. This fixes a bug where vdso_read_cpunode() will read garbage
via RDPID if RDPID is supported but RDTSCP is not. While no known CPU
supports RDPID but not RDTSCP, both Intel's SDM and AMD's APM allow for
RDPID to exist without RDTSCP, e.g. it's technically a legal CPU model
for a virtual machine.
Note, technically MSR_TSC_AUX could be initialized if and only if RDPID
is supported since RDTSCP is currently not used to retrieve the CPU node.
But, the cost of the superfluous WRMSR is negigible, whereas leaving
MSR_TSC_AUX uninitialized is just asking for future breakage if someone
decides to utilize RDTSCP.
Fixes: a582c540ac1b ("x86/vdso: Use RDPID in preference to LSL when available") Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210504225632.1532621-2-seanjc@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
FUTEX_LOCK_PI does not require to have the FUTEX_CLOCK_REALTIME bit set
because it has been using CLOCK_REALTIME based absolute timeouts
forever. Due to that, the time namespace adjustment which is applied when
FUTEX_CLOCK_REALTIME is not set, will wrongly take place for FUTEX_LOCK_PI
and wreckage the timeout.
Exclude it from that procedure.
Fixes: c2f7d08cccf4 ("futex: Adjust absolute futex timeouts with per time namespace offset") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210422194704.984540159@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The FUTEX_WAIT operand has historically a relative timeout which means that
the clock id is irrelevant as relative timeouts on CLOCK_REALTIME are not
subject to wall clock changes and therefore are mapped by the kernel to
CLOCK_MONOTONIC for simplicity.
If a caller would set FUTEX_CLOCK_REALTIME for FUTEX_WAIT the timeout is
still treated relative vs. CLOCK_MONOTONIC and then the wait arms that
timeout based on CLOCK_REALTIME which is broken and obviously has never
been used or even tested.
Reject any attempt to use FUTEX_CLOCK_REALTIME with FUTEX_WAIT again.
The desired functionality can be achieved with FUTEX_WAIT_BITSET and a
FUTEX_BITSET_MATCH_ANY argument.
Fixes: 337f13046ff0 ("futex: Allow FUTEX_CLOCK_REALTIME with FUTEX_WAIT op") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210422194704.834797921@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Mounting with "multichannel" is obviously implied if user requested
more than one channel on mount (ie mount parm max_channels>1).
Currently both have to be specified. Fix that so that if max_channels
is greater than 1 on mount, enable multichannel rather than silently
falling back to non-multichannel.
Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-By: Tom Talpey <tom@talpey.com> Cc: <stable@vger.kernel.org> # v5.11+ Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the SMB3/SMB3.1.1 negotiate protocol request, we are supposed to
advertise CAP_MULTICHANNEL capability when establishing multiple
channels has been requested by the user doing the mount. See MS-SMB2
sections 2.2.3 and 3.2.5.2
Without setting it there is some risk that multichannel could fail
if the server interpreted the field strictly.
Reviewed-By: Tom Talpey <tom@talpey.com> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Cc: <stable@vger.kernel.org> # v5.8+ Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It seems like Fedora 34 ends up enabling a few new gcc warnings, notably
"-Wstringop-overread" and "-Warray-parameter".
Both of them cause what seem to be valid warnings in the kernel, where
we have array size mismatches in function arguments (that are no longer
just silently converted to a pointer to element, but actually checked).
This fixes most of the trivial ones, by making the function declaration
match the function definition, and in the case of intel_pm.c, removing
the over-specified array size from the argument declaration.
At least one 'stringop-overread' warning remains in the i915 driver, but
that one doesn't have the same obvious trivial fix, and may or may not
actually be indicative of a bug.
[ It was a mistake to upgrade one of my machines to Fedora 34 while
being busy with the merge window, but if this is the extent of the
compiler upgrade problems, things are better than usual - Linus ]
gcc-11 introdces a harmless warning for cap_inode_getsecurity:
security/commoncap.c: In function ‘cap_inode_getsecurity’:
security/commoncap.c:440:33: error: ‘memcpy’ reading 16 bytes from a region of size 0 [-Werror=stringop-overread]
440 | memcpy(&nscap->data, &cap->data, sizeof(__le32) * 2 * VFS_CAP_U32);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The problem here is that tmpbuf is initialized to NULL, so gcc assumes
it is not accessible unless it gets set by vfs_getxattr_alloc(). This is
a legitimate warning as far as I can tell, but the code is correct since
it correctly handles the error when that function fails.
Add a separate NULL check to tell gcc about it as well.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: James Morris <jamorris@linux.microsoft.com> Cc: Andrey Zhizhikin <andrey.z@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This sequence of events can lead to a failure to requeue a CPU's
->nocb_timer:
1. There are no callbacks queued for any CPU covered by CPU 0-2's
->nocb_gp_kthread. Note that ->nocb_gp_kthread is associated
with CPU 0.
2. CPU 1 enqueues its first callback with interrupts disabled, and
thus must defer awakening its ->nocb_gp_kthread. It therefore
queues its rcu_data structure's ->nocb_timer. At this point,
CPU 1's rdp->nocb_defer_wakeup is RCU_NOCB_WAKE.
3. CPU 2, which shares the same ->nocb_gp_kthread, also enqueues a
callback, but with interrupts enabled, allowing it to directly
awaken the ->nocb_gp_kthread.
4. The newly awakened ->nocb_gp_kthread associates both CPU 1's
and CPU 2's callbacks with a future grace period and arranges
for that grace period to be started.
5. This ->nocb_gp_kthread goes to sleep waiting for the end of this
future grace period.
6. This grace period elapses before the CPU 1's timer fires.
This is normally improbably given that the timer is set for only
one jiffy, but timers can be delayed. Besides, it is possible
that kernel was built with CONFIG_RCU_STRICT_GRACE_PERIOD=y.
7. The grace period ends, so rcu_gp_kthread awakens the
->nocb_gp_kthread, which in turn awakens both CPU 1's and
CPU 2's ->nocb_cb_kthread. Then ->nocb_gb_kthread sleeps
waiting for more newly queued callbacks.
8. CPU 1's ->nocb_cb_kthread invokes its callback, then sleeps
waiting for more invocable callbacks.
9. Note that neither kthread updated any ->nocb_timer state,
so CPU 1's ->nocb_defer_wakeup is still set to RCU_NOCB_WAKE.
10. CPU 1 enqueues its second callback, this time with interrupts
enabled so it can wake directly ->nocb_gp_kthread.
It does so with calling wake_nocb_gp() which also cancels the
pending timer that got queued in step 2. But that doesn't reset
CPU 1's ->nocb_defer_wakeup which is still set to RCU_NOCB_WAKE.
So CPU 1's ->nocb_defer_wakeup and its ->nocb_timer are now
desynchronized.
11. ->nocb_gp_kthread associates the callback queued in 10 with a new
grace period, arranges for that grace period to start and sleeps
waiting for it to complete.
12. The grace period ends, rcu_gp_kthread awakens ->nocb_gp_kthread,
which in turn wakes up CPU 1's ->nocb_cb_kthread which then
invokes the callback queued in 10.
13. CPU 1 enqueues its third callback, this time with interrupts
disabled so it must queue a timer for a deferred wakeup. However
the value of its ->nocb_defer_wakeup is RCU_NOCB_WAKE which
incorrectly indicates that a timer is already queued. Instead,
CPU 1's ->nocb_timer was cancelled in 10. CPU 1 therefore fails
to queue the ->nocb_timer.
14. CPU 1 has its pending callback and it may go unnoticed until
some other CPU ever wakes up ->nocb_gp_kthread or CPU 1 ever
calls an explicit deferred wakeup, for example, during idle entry.
This commit fixes this bug by resetting rdp->nocb_defer_wakeup everytime
we delete the ->nocb_timer.
It is quite possible that there is a similar scenario involving
->nocb_bypass_timer and ->nocb_defer_wakeup. However, despite some
effort from several people, a failure scenario has not yet been located.
However, that by no means guarantees that no such scenario exists.
Finding a failure scenario is left as an exercise for the reader, and the
"Fixes:" tag below relates to ->nocb_bypass_timer instead of ->nocb_timer.
efx->xdp_tx_queue_count is initially initialized to num_possible_cpus() and is
later used to allocate and traverse efx->xdp_tx_queues lookup array. However,
we may end up not initializing all the array slots with real queues during
probing. This results, for example, in a NULL pointer dereference, when running
"# ethtool -S <iface>", similar to below
Fix this by adjusting efx->xdp_tx_queue_count after probing to reflect the true
value of initialized slots in efx->xdp_tx_queues.
Signed-off-by: Ignat Korchagin <ignat@cloudflare.com> Fixes: e26ca4b53582 ("sfc: reduce the number of requested xdp ev queues") Cc: <stable@vger.kernel.org> # 5.12.x Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We're starting from a TXQ label, not a TXQ type, so
efx_channel_get_tx_queue() is inappropriate (and could return NULL,
leading to panics).
Fixes: 12804793b17c ("sfc: decouple TXQ type from label") Cc: stable@vger.kernel.org Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We're starting from a TXQ instance number ('qid'), not a TXQ type, so
efx_get_tx_queue() is inappropriate (and could return NULL, leading
to panics).
Fixes: 12804793b17c ("sfc: decouple TXQ type from label") Reported-by: Trevor Hemsley <themsley@voiceflex.com> Cc: stable@vger.kernel.org Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If mounted with discard option, exFAT issues discard command when clear
cluster bit to remove file. But the input parameter of cluster-to-sector
calculation is abnormally added by reserved cluster size which is 2,
leading to discard unrelated sectors included in target+2 cluster.
With fixing this, remove the wrong comments in set/clear/find bitmap
functions.
On !ARCH_SUPPORTS_DEBUG_PAGEALLOC (like ia64) debug_pagealloc=1 implies
page_poison=on:
if (page_poisoning_enabled() ||
(!IS_ENABLED(CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC) &&
debug_pagealloc_enabled()))
static_branch_enable(&_page_poisoning_enabled);
page_poison=on needs to override init_on_free=1.
Before the change it did not work as expected for the following case:
- have PAGE_POISONING=y
- have page_poison unset
- have !ARCH_SUPPORTS_DEBUG_PAGEALLOC arch (like ia64)
- have init_on_free=1
- have debug_pagealloc=1
That way we get both keys enabled:
- static_branch_enable(&init_on_free);
- static_branch_enable(&_page_poisoning_enabled);
which leads to poisoned pages returned for __GFP_ZERO pages.
After the change we execute only:
- static_branch_enable(&_page_poisoning_enabled);
and ignore init_on_free=1.
Link: https://lkml.kernel.org/r/20210329222555.3077928-1-slyfox@gentoo.org Link: https://lkml.org/lkml/2021/3/26/443 Fixes: 8db26a3d4735 ("mm, page_poison: use static key more efficiently") Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are two modes for write(2) and friends in fuse:
a) write through (update page cache, send sync WRITE request to userspace)
b) buffered write (update page cache, async writeout later)
The write through method kept all the page cache pages locked that were
used for the request. Keeping more than one page locked is deadlock prone
and Qian Cai demonstrated this with trinity fuzzing.
The reason for keeping the pages locked is that concurrent mapped reads
shouldn't try to pull possibly stale data into the page cache.
For full page writes, the easy way to fix this is to make the cached page
be the authoritative source by marking the page PG_uptodate immediately.
After this the page can be safely unlocked, since mapped/cached reads will
take the written data from the cache.
Concurrent mapped writes will now cause data in the original WRITE request
to be updated; this however doesn't cause any data inconsistency and this
scenario should be exceedingly rare anyway.
If the WRITE request returns with an error in the above case, currently the
page is not marked uptodate; this means that a concurrent read will always
read consistent data. After this patch the page is uptodate between
writing to the cache and receiving the error: there's window where a cached
read will read the wrong data. While theoretically this could be a
regression, it is unlikely to be one in practice, since this is normal for
buffered writes.
In case of a partial page write to an already uptodate page the locking is
also unnecessary, with the above caveats.
Partial write of a not uptodate page still needs to be handled. One way
would be to read the complete page before doing the write. This is not
possible, since it might break filesystems that don't expect any READ
requests when the file was opened O_WRONLY.
The other solution is to serialize the synchronous write with reads from
the partial pages. The easiest way to do this is to keep the partial pages
locked. The problem is that a write() may involve two such pages (one head
and one tail). This patch fixes it by only locking the partial tail page.
If there's a partial head page as well, then split that off as a separate
WRITE request.
If fast table reloads occur during an ongoing reshape of raid4/5/6
devices the target may race reading a superblock vs the the MD resync
thread; causing an inconclusive reshape state to be read in its
constructor.
lvm2 test lvconvert-raid-reshape-stripes-load-reload.sh can cause
BUG_ON() to trigger in md_run(), e.g.:
"kernel BUG at drivers/md/raid5.c:7567!".
Scenario triggering the bug:
1. the MD sync thread calls end_reshape() from raid5_sync_request()
when done reshaping. However end_reshape() _only_ updates the
reshape position to MaxSector keeping the changed layout
configuration though (i.e. any delta disks, chunk sector or RAID
algorithm changes). That inconclusive configuration is stored in
the superblock.
2. dm-raid constructs a mapping, loading named inconsistent superblock
as of step 1 before step 3 is able to finish resetting the reshape
state completely, and calls md_run() which leads to mentioned bug
in raid5.c.
3. the MD RAID personality's finish_reshape() is called; which resets
the reshape information on chunk sectors, delta disks, etc. This
explains why the bug is rarely seen on multi-core machines, as MD's
finish_reshape() superblock update races with the dm-raid
constructor's superblock load in step 2.
Fix identifies inconclusive superblock content in the dm-raid
constructor and resets it before calling md_run(), factoring out
identifying checks into rs_is_layout_change() to share in existing
rs_reshape_requested() and new rs_reset_inclonclusive_reshape(). Also
enhance a comment and remove an empty line.
This patch addresses a data corruption bug in raid1 arrays using bitmaps.
Without this fix, the bitmap bits for the failed I/O end up being cleared.
Since we are in the failure leg of raid1_end_write_request, the request
either needs to be retried (R1BIO_WriteError) or failed (R1BIO_Degraded).
Fixes: eeba6809d8d5 ("md/raid1: end bio when the device faulty") Cc: stable@vger.kernel.org # v5.2+ Signed-off-by: Paul Clements <paul.clements@us.sios.com> Signed-off-by: Song Liu <song@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
crypto_stats_get() is a no-op when the kernel is compiled without
CONFIG_CRYPTO_STATS, so pairing it with crypto_alg_put() unconditionally
(as crypto_rng_reset() does) is wrong.
Fix this by moving the call to crypto_stats_get() to just before the
actual algorithm operation which might need it. This makes it always
paired with crypto_stats_rng_seed().
Fixes: eed74b3eba9e ("crypto: rng - Fix a refcounting bug in crypto_rng_reset()") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Debian's clang carries a patch that makes the default FPU mode
'vfp3-d16' instead of 'neon' for 'armv7-a' to avoid generating NEON
instructions on hardware that does not support them:
Shuffle the order of the '.arch' and '.fpu' directives so that the code
builds regardless of the default FPU mode. This has been tested against
both clang with and without Debian's patch and GCC.
Cc: stable@vger.kernel.org Fixes: d8f1308a025f ("crypto: arm/curve25519 - wire up NEON implementation") Link: https://github.com/ClangBuiltLinux/continuous-integration2/issues/118 Reported-by: Arnd Bergmann <arnd@arndb.de> Suggested-by: Arnd Bergmann <arnd@arndb.de> Suggested-by: Jessica Clarke <jrtc27@jrtc27.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Acked-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Avoid allocating memory and reading the host log when a virtual device
is used since this log is of no use to that driver. A virtual
device can be identified through the flag TPM_CHIP_FLAG_VIRTUAL, which
is only set for the tpm_vtpm_proxy driver.
Cc: stable@vger.kernel.org Fixes: 6f99612e2500 ("tpm: Proxy driver for supporting multiple emulated TPMs") Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When tpm_read_log_efi is called multiple times, which happens when
one loads and unloads a TPM2 driver multiple times, then the global
variable efi_tpm_final_log_size will at some point become a negative
number due to the subtraction of final_events_preboot_size occurring
each time. Use a local variable to avoid this integer underflow.
The following issue is now resolved:
Mar 8 15:35:12 hibinst kernel: Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
Mar 8 15:35:12 hibinst kernel: Workqueue: tpm-vtpm vtpm_proxy_work [tpm_vtpm_proxy]
Mar 8 15:35:12 hibinst kernel: RIP: 0010:__memcpy+0x12/0x20
Mar 8 15:35:12 hibinst kernel: Code: 00 b8 01 00 00 00 85 d2 74 0a c7 05 44 7b ef 00 0f 00 00 00 c3 cc cc cc 66 66 90 66 90 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 <f3> 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 f3 a4
Mar 8 15:35:12 hibinst kernel: RSP: 0018:ffff9ac4c0fcfde0 EFLAGS: 00010206
Mar 8 15:35:12 hibinst kernel: RAX: ffff88f878cefed5 RBX: ffff88f878ce9000 RCX: 1ffffffffffffe0f
Mar 8 15:35:12 hibinst kernel: RDX: 0000000000000003 RSI: ffff9ac4c003bff9 RDI: ffff88f878cf0e4d
Mar 8 15:35:12 hibinst kernel: RBP: ffff9ac4c003b000 R08: 0000000000001000 R09: 000000007e9d6073
Mar 8 15:35:12 hibinst kernel: R10: ffff9ac4c003b000 R11: ffff88f879ad3500 R12: 0000000000000ed5
Mar 8 15:35:12 hibinst kernel: R13: ffff88f878ce9760 R14: 0000000000000002 R15: ffff88f77de7f018
Mar 8 15:35:12 hibinst kernel: FS: 0000000000000000(0000) GS:ffff88f87bd00000(0000) knlGS:0000000000000000
Mar 8 15:35:12 hibinst kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 8 15:35:12 hibinst kernel: CR2: ffff9ac4c003c000 CR3: 00000001785a6004 CR4: 0000000000060ee0
Mar 8 15:35:12 hibinst kernel: Call Trace:
Mar 8 15:35:12 hibinst kernel: tpm_read_log_efi+0x152/0x1a7
Mar 8 15:35:12 hibinst kernel: tpm_bios_log_setup+0xc8/0x1c0
Mar 8 15:35:12 hibinst kernel: tpm_chip_register+0x8f/0x260
Mar 8 15:35:12 hibinst kernel: vtpm_proxy_work+0x16/0x60 [tpm_vtpm_proxy]
Mar 8 15:35:12 hibinst kernel: process_one_work+0x1b4/0x370
Mar 8 15:35:12 hibinst kernel: worker_thread+0x53/0x3e0
Mar 8 15:35:12 hibinst kernel: ? process_one_work+0x370/0x370
Cc: stable@vger.kernel.org Fixes: 166a2809d65b ("tpm: Don't duplicate events from the final event log in the TCG2 log") Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>