The non-MMU m68k pagetable ZERO_PAGE() macro is being set to the
somewhat non-sensical value of "virt_to_page(0)". The zeroth page
is not in any way guaranteed to be a page full of "0". So the result
is that ZERO_PAGE() will almost certainly contain random values.
We already allocate a real "empty_zero_page" in the mm setup code shared
between MMU m68k and non-MMU m68k. It is just not hooked up to the
ZERO_PAGE() macro for the non-MMU m68k case.
Fix ZERO_PAGE() to use the allocated "empty_zero_page" pointer.
I am not aware of any specific issues caused by the old code.
Timeout as 1 second sets an upper limit on the length
of the transfer executed, but there is no maximum length
of a write or read message set in i2c_adapter_quirks for
this controller.
This upper limit affects devices that require sending
large firmware blobs over I2C.
To remove that limitation, calculate the minimal time
necessary, plus some wiggle room, for every message and
use it instead of the default one second, if more than
one second.
Signed-off-by: Lucas Tanure <tanureal@opensource.cirrus.com> Acked-by: Michal Simek <michal.simek@xilinx.com> Signed-off-by: Wolfram Sang <wsa@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
To prepare for support asynchronous tracer_init_tracefs initcall,
avoid calling create_trace_option_files before __update_tracer_options.
Otherwise, create_trace_option_files will show warning because
some tracers in trace_types list are already in tr->topts.
For example, hwlat_tracer call register_tracer in late_initcall,
and global_trace.dir is already created in tracing_init_dentry,
hwlat_tracer will be put into tr->topts.
Then if the __update_tracer_options is executed after hwlat_tracer
registered, create_trace_option_files find that hwlat_tracer is
already in tr->topts.
When setting bootparams="trace_event=initcall:initcall_start tp_printk=1" in the
cmdline, the output_printk() was called, and the spin_lock_irqsave() was called in the
atomic and irq disable interrupt context suitation. On the PREEMPT_RT kernel,
these locks are replaced with sleepable rt-spinlock, so the stack calltrace will
be triggered.
Fix it by raw_spin_lock_irqsave when PREEMPT_RT and "trace_event=initcall:initcall_start
tp_printk=1" enabled.
The function percent_rmt_hitm_cmp() wrongly uses local HITMs for
sorting remote HITMs.
Since this function is to sort cache lines for remote HITMs, this patch
changes to use 'rmt_hitm' field for correct sorting.
Fixes: 9cb3500afc0980c5 ("perf c2c report: Add hitm/store percent related sort keys") Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Joe Mario <jmario@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220530084253.750190-1-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
In AFS, a directory is handled as a file that the client downloads and
parses locally for the purposes of performing lookup and getdents
operations. The in-kernel afs filesystem has a number of functions that
do this.
A directory file is arranged as a series of 2K blocks divided into
32-byte slots, where a directory entry occupies one or more slots, plus
each block starts with one or more metadata blocks.
When parsing a block, if the last slots are occupied by a dirent that
occupies more than a single slot and the file position points at a slot
that's not the initial one, the logic in afs_dir_iterate_block() that
skips over it won't advance the file pointer to the end of it. This
will cause an infinite loop in getdents() as it will keep retrying that
block and failing to advance beyond the final entry.
Fix this by advancing the file pointer if the next entry will be beyond
it when we skip a block.
This was found by the generic/676 xfstest but can also be triggered with
something like:
1) A new passive FastOpen TCP socket is created.
This FO socket waits for an ACK coming from client to be a complete
ESTABLISHED one.
2) A socket operation on this socket goes through lock_sock()
release_sock() dance.
3) While the socket is owned by the user in step 2),
a retransmit of the SYN is received and stored in socket backlog.
4) At release_sock() time, the socket backlog is processed while
in process context.
5) A SYNACK packet is cooked in response of the SYN retransmit.
6) -> tcp_rtx_synack() is called in process context.
Before blamed commit, tcp_rtx_synack() was always called from BH handler,
from a timer handler.
Fix this by using TCP_INC_STATS() & NET_INC_STATS()
which do not assume caller is in non preemptible context.
Some features (LRO, HW GRO) conflict with XDP. If there is an attempt to
enable such features while XDP is active, they will be set to `off
[requested on]`. In order to activate these features after XDP is turned
off, the driver needs to call netdev_update_features(). This commit adds
this missing call after XDP state changes.
Fixes: cf6e34c8c22f ("net/mlx5e: Properly block LRO when XDP is enabled") Fixes: b0617e7b3500 ("net/mlx5e: Properly block HW GRO when XDP is enabled") Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
With the `-z unique-symbol` linker flag or any similar mechanism,
it is possible to trigger the following:
ERROR: modpost: "param_set_uint.0" [vmlinux] is a static EXPORT_SYMBOL
The reason is that for now the condition from remove_dot():
if (m && (s[n + m] == '.' || s[n + m] == 0))
which was designed to test if it's a dot or a '\0' after the suffix
is never satisfied.
This is due to that `s[n + m]` always points to the last digit of a
numeric suffix, not on the symbol next to it (from a custom debug
print added to modpost):
param_set_uint.0, s[n + m] is '0', s[n + m + 1] is '\0'
So it's off-by-one and was like that since 2014.
Fix this for the sake of any potential upcoming features, but don't
bother stable-backporting, as it's well hidden -- apart from that
LD flag, it can be triggered only with GCC LTO which never landed
upstream.
of_get_child_by_name() returns a node pointer with refcount
incremented, we should use of_node_put() on it when done.
mv88e6xxx_mdio_register() pass the device node to of_mdiobus_register().
We don't need the device node after it.
Add missing of_node_put() to avoid refcount leak.
Fixes: a3c53be55c95 ("net: dsa: mv88e6xxx: Support multiple MDIO busses") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Reviewed-by: Marek Behún <kabel@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The "fsp->location" variable comes from user via ethtool_get_rxnfc().
Check that it is valid to prevent an out of bounds read.
Fixes: 7aab747e5563 ("net: ethernet: mediatek: add ethtool functions to configure RX flows of HW LRO") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The argument of scatterwalk_unmap() is supposed to be the void* that was
returned by the previous scatterwalk_map() call.
The s390 AES-GCM implementation was instead passing the pointer to the
struct scatter_walk.
This doesn't actually break anything because scatterwalk_unmap() only uses
its argument under CONFIG_HIGHMEM and ARCH_HAS_FLUSH_ON_KUNMAP.
We can get "failed to disable" clock_unprepare warnings on unbind at least
for the serial console device if the unbind is done before the device has
been idled.
As some devices are using deferred idle, we must check the status for
pending idle work to idle the device.
Add CSIZE sanitization for unsupported CSIZE configurations. In
addition, if parity is asked for but CSx was unsupported, the sensible
result is CS8+parity which requires setting USART_CR1_M0 like with 9
bits.
Incorrect CSIZE results in miscalculation of the frame bits in
tty_get_char_size() or in its predecessor where the roughly the same
code is directly within uart_update_timeout().
Fixes: c8a9d043947b (serial: stm32: fix word length configuration) Cc: Erwan Le Ray <erwan.leray@st.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20220519081808.3776-9-ilpo.jarvinen@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Only CS7 and CS8 seem supported but CSIZE is not sanitized from CS5 or
CS6 to CS8. In addition, ASC_CTL_MODE_7BIT_PAR suggests that CS7 has
to have parity, thus add PARENB.
Incorrect CSIZE results in miscalculation of the frame bits in
tty_get_char_size() or in its predecessor where the roughly the same
code is directly within uart_update_timeout().
Only CS7 and CS8 seem supported but CSIZE is not sanitized from
CS5 or CS6 to CS8.
Set CSIZE correctly so that userspace knows the effective value.
Incorrect CSIZE also results in miscalculation of the frame bits in
tty_get_char_size() or in its predecessor where the roughly the same
code is directly within uart_update_timeout().
Only CS7 and CS8 are supported but CSIZE is not sanitized with
CS5 or CS6 to CS8.
Set CSIZE correctly so that userspace knows the effective value.
Incorrect CSIZE also results in miscalculation of the frame bits in
tty_get_char_size() or in its predecessor where the roughly the same
code is directly within uart_update_timeout().
Only CS7 and CS8 seem supported but CSIZE is not sanitized to CS8 in
the default: block.
Set CSIZE correctly so that userspace knows the effective value.
Incorrect CSIZE also results in miscalculation of the frame bits in
tty_get_char_size() or in its predecessor where the roughly the same
code is directly within uart_update_timeout().
SER_RS485_RTS_ON_SEND and SER_RS485_RTS_AFTER_SEND relate to behavior
within RS485 operation. The driver checks if they have the same value
which is not possible to realize with the hardware. The check is taken
regardless of SER_RS485_ENABLED flag and -EINVAL is returned when the
check fails, which creates problems.
This check makes it unnecessarily complicated to turn RS485 mode off as
simple zeroed serial_rs485 struct will trigger that equal values check.
In addition, the driver itself memsets its rs485 structure to zero when
RS485 is disabled but if userspace would try to make an TIOCSRS485
ioctl() call with the very same struct, it would end up failing with
-EINVAL which doesn't make much sense.
Resolve the problem by moving the check inside SER_RS485_ENABLED block.
The uart_ops startup() callback is called without interrupts
disabled and without port->lock locked, relatively late during the
boot process (from the call path of console_on_rootfs()). If the
device is a console, it was already previously registered and could
be actively printing messages.
Since the startup() callback is reading/writing registers used by
the console write() callback (AML_UART_CONTROL), its access must
be synchronized using the port->lock. Currently it is not.
The startup() callback is the only function that explicitly enables
interrupts. Without the synchronization, it is possible that
interrupts become accidentally permanently disabled.
Some implementations of the SBI time extension depend on hart-local
state (for example, CSRs) that are lost or hardware that is powered
down when a CPU is suspended. To be safe, the clockevents driver
cannot assume that timer IRQs will be received during CPU suspend.
Fixes: 62b019436814 ("clocksource: new RISC-V SBI timer driver") Signed-off-by: Samuel Holland <samuel@sholland.org> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20220509012121.40031-1-samuel@sholland.org Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
of_find_matching_node_and_match returns a node pointer with refcount
incremented, we should use of_node_put() on it when done.
Add missing of_node_put() to avoid refcount leak.
The panic notifier infrastructure executes registered callbacks when
a panic event happens - such callbacks are executed in atomic context,
with interrupts and preemption disabled in the running CPU and all other
CPUs disabled. That said, mutexes in such context are not a good idea.
This patch replaces a regular mutex with a mutex_trylock safer approach;
given the nature of the mutex used in the driver, it should be pretty
uncommon being unable to acquire such mutex in the panic path, hence
no functional change should be observed (and if it is, that would be
likely a deadlock with the regular mutex).
Fixes: 2227b7c74634 ("coresight: add support for CPU debug module") Cc: Leo Yan <leo.yan@linaro.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20220427224924.592546-10-gpiccoli@igalia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
If the device is already in a runtime PM enabled state
pm_runtime_get_sync() will return 1, so a test for negative
value should be used to check for errors.
The hardware only supports periods <= 1.6 ms and if a bigger period is
requested it is clamped to 1.6 ms. In this case duty_cycle might be bigger
than 1.6 ms and then the duty cycle register is written with a value
bigger than LP3943_MAX_DUTY. So clamp duty_cycle accordingly.
The function documentation of usb_set_configuration says that its
callers should hold the device lock. This lock is held for all
callsites except tweak_set_configuration_cmd. The code path can be
executed for example when attaching a remote USB device.
The solution is to surround the call by the device lock.
This bug was found using my experimental own-developed static analysis
tool, which reported the missing lock on v5.17.2. I manually verified
this bug report by doing code review as well. I runtime checked that
the required lock is not held. I compiled and runtime tested this on
x86_64 with a USB mouse. After applying this patch, my analyser no
longer reports this potential bug.
In goldfish_tty_probe(), the port initialized through tty_port_init()
should be destroyed in error paths.In goldfish_tty_remove(), qtty->port
also should be destroyed or else might leak resources.
If the list does not exit early then data == NULL and 'module' does not
point to a valid list element.
Using 'module' in such a case is not valid and was therefore removed.
Fixes: 6dd67645f22c ("greybus: audio: Use single codec driver registration") Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Vaibhav Agarwal <vaibhav.sr@gmail.com> Reviewed-by: Mark Greer <mgreer@animalcreek.com> Signed-off-by: Jakob Koschel <jakobkoschel@gmail.com> Link: https://lore.kernel.org/r/20220321123626.3068639-1-jakobkoschel@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
When the MIPS_ALCHEMY board selection is MIPS_XXS1500 instead of
MIPS_DB1XXX, the PCMCIA driver 'db1xxx_ss' has build errors due
to missing DB1XXX symbols. The PCMCIA driver should be restricted
to MIPS_DB1XXX instead of MIPS_ALCHEMY to fix this build error.
Remove unsupported forcing of `cpu_has_fpu' to 1, which makes the `nofpu'
kernel parameter non-functional, and also causes a link error:
ld: arch/mips/kernel/traps.o: in function `trap_init':
./arch/mips/include/asm/msa.h:(.init.text+0x348): undefined reference to `handle_fpe'
ld: ./arch/mips/include/asm/msa.h:(.init.text+0x354): undefined reference to `handle_fpe'
ld: ./arch/mips/include/asm/msa.h:(.init.text+0x360): undefined reference to `handle_fpe'
where the CONFIG_MIPS_FP_SUPPORT configuration option has been disabled.
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Reported-by: Stephen Zhang <starzhangzsd@gmail.com> Fixes: 0ebb2f4159af ("MIPS: IP27: Update/restructure CPU overrides") Cc: stable@vger.kernel.org # v4.2+ Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Current rxe_requester() doesn't generate a completion when processing an
unsupported/invalid opcode. If rxe driver doesn't support a new opcode
(e.g. RDMA Atomic Write) and RDMA library supports it, an application
using the new opcode can reproduce this issue. Fix the issue by calling
"goto err;".
iolatency needs to track the number of inflight IOs per cgroup. As this
tracking can be expensive, it is disabled when no cgroup has iolatency
configured for the device. To ensure that the inflight counters stay
balanced, iolatency_set_limit() freezes the request_queue while manipulating
the enabled counter, which ensures that no IO is in flight and thus all
counters are zero.
Unfortunately, iolatency_set_limit() isn't the only place where the enabled
counter is manipulated. iolatency_pd_offline() can also dec the counter and
trigger disabling. As this disabling happens without freezing the q, this
can easily happen while some IOs are in flight and thus leak the counts.
This can be easily demonstrated by turning on iolatency on an one empty
cgroup while IOs are in flight in other cgroups and then removing the
cgroup. Note that iolatency shouldn't have been enabled elsewhere in the
system to ensure that removing the cgroup disables iolatency for the whole
device.
The following keeps flipping on and off iolatency on sda:
If a cgroup with stuck inflight ends up getting throttled, the throttled IOs
will never get issued as there's no completion event to wake it up leading
to an indefinite hang.
This patch fixes the bug by unifying enable handling into a work item which
is automatically kicked off from iolatency_set_min_lat_nsec() which is
called from both iolatency_set_limit() and iolatency_pd_offline() paths.
Punting to a work item is necessary as iolatency_pd_offline() is called
under spinlocks while freezing a request_queue requires a sleepable context.
This also simplifies the code reducing LOC sans the comments and avoids the
unnecessary freezes which were happening whenever a cgroup's latency target
is newly set or cleared.
Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Liu Bo <bo.liu@linux.alibaba.com> Fixes: 8c772a9bfc7c ("blk-iolatency: fix IO hang due to negative inflight counter") Cc: stable@vger.kernel.org # v5.0+ Link: https://lore.kernel.org/r/Yn9ScX6Nx2qIiQQi@slm.duckdns.org Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The list iterator value 'crtc' will *always* be set and non-NULL by
list_for_each_entry(), so it is incorrect to assume that the iterator
value will be NULL if the list is empty or no element is found.
To fix the bug, return 'crtc' when found, otherwise return NULL.
If the previous list_for_each_entry_continue_rcu() don't exit early
(no goto hit inside the loop), the iterator 'cvif' after the loop
will be a bogus pointer to an invalid structure object containing
the HEAD (&ar->vif_list). As a result, the use of 'cvif' after that
will lead to a invalid memory access (i.e., 'cvif->id': the invalid
pointer dereference when return back to/after the callsite in the
carl9170_update_beacon()).
The original intention should have been to return the valid 'cvif'
when found in list, NULL otherwise. So just return NULL when no
entry found, to fix this bug.
Cc: stable@vger.kernel.org Fixes: 1f1d9654e183c ("carl9170: refactor carl9170_update_beacon") Signed-off-by: Xiaomeng Tong <xiam0nd.tong@gmail.com> Acked-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://lore.kernel.org/r/20220328122820.1004-1-xiam0nd.tong@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The driver has a custom put function for "DSP Voice Wake Up" which does
not generate event notifications on change, instead returning 0. Since we
already exit early in the case that there is no change this can be fixed
by unconditionally returning 1 at the end of the function.
Using not existing queues can panic the kernel with rtl8180/rtl8185 cards.
Ignore the skb priority for those cards, they only have one tx queue. Pierre
Asselin (pa@panix.com) reported the kernel crash in the Gentoo forum:
He also confirmed that this patch fixes the issue. In summary this happened:
After updating wpa_supplicant from 2.9 to 2.10 the kernel crashed with a
"divide error: 0000" when connecting to an AP. Control port tx now tries to
use IEEE80211_AC_VO for the priority, which wpa_supplicants starts to use in
2.10.
Since only the rtl8187se part of the driver supports QoS, the priority
of the skb is set to IEEE80211_AC_BE (2) by mac80211 for rtl8180/rtl8185
cards.
rtl8180 is then unconditionally reading out the priority and finally crashes on
drivers/net/wireless/realtek/rtl818x/rtl8180/dev.c line 544 without this
patch:
idx = (ring->idx + skb_queue_len(&ring->queue)) % ring->entries
"ring->entries" is zero for rtl8180/rtl8185 cards, tx_ring[2] never got
initialized.
Cc: stable@vger.kernel.org Reported-by: pa@panix.com Tested-by: pa@panix.com Signed-off-by: Alexander Wetzel <alexander@wetzel-home.de> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20220422145228.7567-1-alexander@wetzel-home.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The routine huge_pmd_unshare() is passed a pointer to an address
associated with an area which may be unshared. If unshare is successful
this address is updated to 'optimize' callers iterating over huge page
addresses. For the optimization to work correctly, address should be
updated to the last huge page in the unmapped/unshared area. However, in
the common case where the passed address is PUD_SIZE aligned, the address
is incorrectly updated to the address of the preceding huge page. That
wastes CPU cycles as the unmapped/unshared range is scanned twice.
Link: https://lkml.kernel.org/r/20220524205003.126184-1-mike.kravetz@oracle.com Fixes: 39dde65c9940 ("shared page table for hugetlb page") Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Acked-by: Muchun Song <songmuchun@bytedance.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
With gcc version 12.0.1 20220401 (Red Hat 12.0.1-0), building with
defconfig results in the following compilation error:
| CC mm/swapfile.o
| mm/swapfile.c: In function `setup_swap_info':
| mm/swapfile.c:2291:47: error: array subscript -1 is below array bounds
| of `struct plist_node[]' [-Werror=array-bounds]
| 2291 | p->avail_lists[i].prio = 1;
| | ~~~~~~~~~~~~~~^~~
| In file included from mm/swapfile.c:16:
| ./include/linux/swap.h:292:27: note: while referencing `avail_lists'
| 292 | struct plist_node avail_lists[]; /*
| | ^~~~~~~~~~~
This is due to the compiler detecting that the mask in
node_states[__state] could theoretically be zero, which would lead to
first_node() returning -1 through find_first_bit.
I believe that the warning/error is legitimate. I first tried adding a
test to check that the node mask is not emtpy, since a similar test exists
in the case where MAX_NUMNODES == 1.
However, adding the if statement causes other warnings to appear in
for_each_cpu_node_but, because it introduces a dangling else ambiguity.
And unfortunately, GCC is not smart enough to detect that the added test
makes the case where (node) == -1 impossible, so it still complains with
the same message.
This is why I settled on replacing that with a harmless, but relatively
useless (node) >= 0 test. Based on the warning for the dangling else, I
also decided to fix the case where MAX_NUMNODES == 1 by moving the
condition inside the for loop. It will still only be tested once. This
ensures that the meaning of an else following for_each_node_mask or
derivatives would not silently have a different meaning depending on the
configuration.
Link: https://lkml.kernel.org/r/20220414150855.2407137-3-dinechin@redhat.com Signed-off-by: Christophe de Dinechin <christophe@dinechin.org> Signed-off-by: Christophe de Dinechin <dinechin@redhat.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Ben Segall <bsegall@google.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Daniel Bristot de Oliveira <bristot@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Zhen Lei <thunder.leizhen@huawei.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The bug is here:
if (!iommu || iommu->dev->of_node != spec->np) {
The list iterator value 'iommu' will *always* be set and non-NULL by
list_for_each_entry(), so it is incorrect to assume that the iterator
value will be NULL if the list is empty or no element is found (in fact,
it will point to a invalid structure object containing HEAD).
To fix the bug, use a new value 'iter' as the list iterator, while use
the old value 'iommu' as a dedicated variable to point to the found one,
and remove the unneeded check for 'iommu->dev->of_node != spec->np'
outside the loop.
syscall_stub_data() expects the data_count parameter to be the number of
longs, not bytes.
==================================================================
BUG: KASAN: stack-out-of-bounds in syscall_stub_data+0x70/0xe0
Read of size 128 at addr 000000006411f6f0 by task swapper/1
The previous fix here was only partially correct, it did
result in returning a proper error value in case of error,
but it also clobbered the pid that we need to return from
this function (not just zero for success).
As a result, it returned 0 here, but later this is treated
as a pid and used to kill the process, but since it's now
0 we kill(0, SIGKILL), which makes UML kill itself rather
than just the helper thread.
Fix that and make it more obvious by using a separate
variable for the pid.
Fixes: ccf1236ecac4 ("um: fix error return code in winch_tramp()") Reported-and-tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In client mode, we can't connect to hidden SSID APs or SSIDs not advertised
in beacons on DFS channels, since we're forced to passive scan. Fix this by
sending out a probe request immediately after the first beacon, if active
scan was requested by the user.
Cc: stable@vger.kernel.org Reported-by: Catrinel Catrinescu <cc@80211.de> Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20220420104907.36275-1-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When irq-xtensa-mx chip is used in non-SMP configuration its
irq_set_affinity callback is not called leaving IRQ affinity set empty.
As a result IRQ delivery does not work in that configuration.
Initialize IRQ affinity of the xtensa MX interrupt distributor to CPU 0
for all external IRQ lines.
Cc: stable@vger.kernel.org Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Register ARMADA_370_XP_INT_FABRIC_MASK_OFFS is Armada 370 and XP specific
and on new Armada platforms it has different meaning. It does not configure
Performance Counter Overflow interrupt masking. So do not touch this
register on non-A370/XP platforms (A375, A38x and A39x).
Signed-off-by: Pali Rohár <pali@kernel.org> Cc: stable@vger.kernel.org Fixes: 28da06dfd9e4 ("irqchip: armada-370-xp: Enable the PMU interrupts") Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220425113706.29310-1-pali@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When multiplying of different types, an overflow is possible even when
storing the result in a larger type. This is because the conversion is
done after the multiplication. So arithmetic overflow and thus in
incorrect value is possible.
Correct an instance of this in the inter packet delay calculation. Fix by
ensuring one of the operands is u64 which will promote the other to u64 as
well ensuring no overflow.
Cc: stable@vger.kernel.org Fixes: 7724105686e7 ("IB/hfi1: add driver files") Link: https://lore.kernel.org/r/20220520183712.48973.29855.stgit@awfm-01.cornelisnetworks.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add H264 level 1.0, 4.1, 4.2 to the list of supported formats.
While the hardware does not fully support these levels, it does support
most of them. The constraints on frame size and pixel formats already
cover the limitation.
This fixes negotiation of level on GStreamer 1.17.1.
The CODA960 manual states that ASO/FMO features of baseline are not
supported, so for this reason this driver should only report
constrained baseline support.
This fixes negotiation issue with constrained baseline content
on GStreamer 1.17.1.
ASO/FMO features are unsupported for the encoder and untested for the
decoder because there is currently no userspace support. Neither GStreamer
parsers nor FFMPEG parsers support ASO/FMO.
The bug is here:
if (!rdev || rdev->desc_nr != nr) {
The list iterator value 'rdev' will *always* be set and non-NULL
by rdev_for_each_rcu(), so it is incorrect to assume that the
iterator value will be NULL if the list is empty or no element
found (In fact, it will be a bogus pointer to an invalid struct
object containing the HEAD). Otherwise it will bypass the check
and lead to invalid memory access passing the check.
To fix the bug, use a new variable 'iter' as the list iterator,
while using the original variable 'pdev' as a dedicated pointer to
point to the found element.
Cc: stable@vger.kernel.org Fixes: 70bcecdb1534 ("md-cluster: Improve md_reload_sb to be less error prone") Signed-off-by: Xiaomeng Tong <xiam0nd.tong@gmail.com> Signed-off-by: Song Liu <song@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The list iterator value 'rdev' will *always* be set and non-NULL
by rdev_for_each(), so it is incorrect to assume that the iterator
value will be NULL if the list is empty or no element found.
Otherwise it will bypass the NULL check and lead to invalid memory
access passing the check.
To fix the bug, use a new variable 'iter' as the list iterator,
while using the original variable 'rdev' as a dedicated pointer to
point to the found element.
Cc: stable@vger.kernel.org Fixes: 2aa82191ac36 ("md-cluster: Perform a lazy update") Acked-by: Guoqing Jiang <guoqing.jiang@linux.dev> Signed-off-by: Xiaomeng Tong <xiam0nd.tong@gmail.com> Acked-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Song Liu <song@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If the display is not enable()d, then we aren't holding a runtime PM
reference here. Thus, it's easy to accidentally cause a hang, if user
space is poking around at /dev/drm_dp_aux0 at the "wrong" time.
Let's get a runtime PM reference, and check that we "see" the panel.
Don't force any panel power-up, etc., because that can be intrusive, and
that's not what other drivers do (see
drivers/gpu/drm/bridge/ti-sn65dsi86.c and
drivers/gpu/drm/bridge/parade-ps8640.c.)
Fixes: 0d97ad03f422 ("drm/bridge: analogix_dp: Remove duplicated code") Cc: <stable@vger.kernel.org> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Signed-off-by: Brian Norris <briannorris@chromium.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20220301181107.v4.1.I773a08785666ebb236917b0c8e6c05e3de471e75@changeid Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The bug is here:
if (nvkm_cstate_valid(clk, cstate, max_volt, clk->temp))
return cstate;
The list iterator value 'cstate' will *always* be set and non-NULL
by list_for_each_entry_from_reverse(), so it is incorrect to assume
that the iterator value will be unchanged if the list is empty or no
element is found (In fact, it will be a bogus pointer to an invalid
structure object containing the HEAD). Also it missed a NULL check
at callsite and may lead to invalid memory access after that.
To fix this bug, just return 'encoder' when found, otherwise return
NULL. And add the NULL check.
Cc: stable@vger.kernel.org Fixes: 1f7f3d91ad38a ("drm/nouveau/clk: Respect voltage limits in nvkm_cstate_prog") Signed-off-by: Xiaomeng Tong <xiam0nd.tong@gmail.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220327075824.11806-1-xiam0nd.tong@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In ufs_qcom_dev_ref_clk_ctrl(), it was noted that the ref_clk needs to be
stable for at least 1us. Even though there is wmb() to make sure the write
gets "completed", there is no guarantee that the write actually reached the
UFS device. There is a good chance that the write could be stored in a
Write Buffer (WB). In that case, even though the CPU waits for 1us, the
ref_clk might not be stable for that period.
So lets do a readl() to make sure that the previous write has reached the
UFS device before udelay().
Also, the wmb() after writel_relaxed() is not really needed. Both writel()
and readl() are ordered on all architectures and the CPU won't speculate
instructions after readl() due to the in-built control dependency with read
value on weakly ordered architectures. So it can be safely removed.
Link: https://lore.kernel.org/r/20220504084212.11605-4-manivannan.sadhasivam@linaro.org Fixes: f06fcc7155dc ("scsi: ufs-qcom: add QUniPro hardware support and power optimizations") Cc: stable@vger.kernel.org Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The list iterator 'p' will point to a bogus position containing HEAD if the
list is empty or no element is found. This case must be checked before any
use of the iterator, otherwise it will lead to an invalid memory access.
To fix this bug, add a check. Use a new variable 'iter' as the list
iterator, and use the original variable 'p' as a dedicated pointer to point
to the found element.
Link: https://lore.kernel.org/r/20220414040231.2662-1-xiam0nd.tong@gmail.com Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Xiaomeng Tong <xiam0nd.tong@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When user_dlm_destroy_lock failed, it didn't clean up the flags it set
before exit. For USER_LOCK_IN_TEARDOWN, if this function fails because of
lock is still in used, next time when unlink invokes this function, it
will return succeed, and then unlink will remove inode and dentry if lock
is not in used(file closed), but the dlm lock is still linked in dlm lock
resource, then when bast come in, it will trigger a panic due to
user-after-free. See the following panic call trace. To fix this,
USER_LOCK_IN_TEARDOWN should be reverted if fail. And also error should
be returned if USER_LOCK_IN_TEARDOWN is set to let user know that unlink
fail.
For the case of ocfs2_dlm_unlock failure, besides USER_LOCK_IN_TEARDOWN,
USER_LOCK_BUSY is also required to be cleared. Even though spin lock is
released in between, but USER_LOCK_IN_TEARDOWN is still set, for
USER_LOCK_BUSY, if before every place that waits on this flag,
USER_LOCK_IN_TEARDOWN is checked to bail out, that will make sure no flow
waits on the busy flag set by user_dlm_destroy_lock(), then we can
simplely revert USER_LOCK_BUSY when ocfs2_dlm_unlock fails. Fix
user_dlm_cluster_lock() which is the only function not following this.
Link: https://lkml.kernel.org/r/20220518235224.87100-2-junxiao.bi@oracle.com Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Joseph Qi <jiangqi903@gmail.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We always call hold_lkb(lkb) if we increment lkb->lkb_wait_count.
So, we always need to call unhold_lkb(lkb) if we decrement
lkb->lkb_wait_count. This patch will add missing unhold_lkb(lkb) if we
decrement lkb->lkb_wait_count. In case of setting lkb->lkb_wait_count to
zero we need to countdown until reaching zero and call unhold_lkb(lkb).
The waiters list unhold_lkb(lkb) can be removed because it's done for
the last lkb_wait_count decrement iteration as it's done in
_remove_from_waiters().
This issue was discovered by a dlm gfs2 test case which use excessively
dlm_unlock(LKF_CANCEL) feature. Probably the lkb->lkb_wait_count value
never reached above 1 if this feature isn't used and so it was not
discovered before.
The testcase ended in a rsb on the rsb keep data structure with a
refcount of 1 but no lkb was associated with it, which is itself
an invalid behaviour. A side effect of that was a condition in which
the dlm was sending remove messages in a looping behaviour. With this
patch that has not been reproduced.
Cc: stable@vger.kernel.org Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch fixes an invalid read showed by KASAN. A unlock will allocate a
"struct plock_op" and a followed send_op() will append it to a global
send_list data structure. In some cases a followed dev_read() moves it
to recv_list and dev_write() will cast it to "struct plock_xop" and access
fields which are only available in those structures. At this point an
invalid read happens by accessing those fields.
To fix this issue the "callback" field is moved to "struct plock_op" to
indicate that a cast to "plock_xop" is allowed and does the additional
"plock_xop" handling if set.
Example of the KASAN output which showed the invalid read:
[ 2064.296453] ==================================================================
[ 2064.304852] BUG: KASAN: slab-out-of-bounds in dev_write+0x52b/0x5a0 [dlm]
[ 2064.306491] Read of size 8 at addr ffff88800ef227d8 by task dlm_controld/7484
[ 2064.308168]
[ 2064.308575] CPU: 0 PID: 7484 Comm: dlm_controld Kdump: loaded Not tainted 5.14.0+ #9
[ 2064.310292] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 2064.311618] Call Trace:
[ 2064.312218] dump_stack_lvl+0x56/0x7b
[ 2064.313150] print_address_description.constprop.8+0x21/0x150
[ 2064.314578] ? dev_write+0x52b/0x5a0 [dlm]
[ 2064.315610] ? dev_write+0x52b/0x5a0 [dlm]
[ 2064.316595] kasan_report.cold.14+0x7f/0x11b
[ 2064.317674] ? dev_write+0x52b/0x5a0 [dlm]
[ 2064.318687] dev_write+0x52b/0x5a0 [dlm]
[ 2064.319629] ? dev_read+0x4a0/0x4a0 [dlm]
[ 2064.320713] ? bpf_lsm_kernfs_init_security+0x10/0x10
[ 2064.321926] vfs_write+0x17e/0x930
[ 2064.322769] ? __fget_light+0x1aa/0x220
[ 2064.323753] ksys_write+0xf1/0x1c0
[ 2064.324548] ? __ia32_sys_read+0xb0/0xb0
[ 2064.325464] do_syscall_64+0x3a/0x80
[ 2064.326387] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 2064.327606] RIP: 0033:0x7f807e4ba96f
[ 2064.328470] Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 39 87 f8 ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 7c 87 f8 ff 48
[ 2064.332902] RSP: 002b:00007ffd50cfe6e0 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
[ 2064.334658] RAX: ffffffffffffffda RBX: 000055cc3886eb30 RCX: 00007f807e4ba96f
[ 2064.336275] RDX: 0000000000000040 RSI: 00007ffd50cfe7e0 RDI: 0000000000000010
[ 2064.337980] RBP: 00007ffd50cfe7e0 R08: 0000000000000000 R09: 0000000000000001
[ 2064.339560] R10: 000055cc3886eb30 R11: 0000000000000293 R12: 000055cc3886eb80
[ 2064.341237] R13: 000055cc3886eb00 R14: 000055cc3886f590 R15: 0000000000000001
[ 2064.342857]
[ 2064.343226] Allocated by task 12438:
[ 2064.344057] kasan_save_stack+0x1c/0x40
[ 2064.345079] __kasan_kmalloc+0x84/0xa0
[ 2064.345933] kmem_cache_alloc_trace+0x13b/0x220
[ 2064.346953] dlm_posix_unlock+0xec/0x720 [dlm]
[ 2064.348811] do_lock_file_wait.part.32+0xca/0x1d0
[ 2064.351070] fcntl_setlk+0x281/0xbc0
[ 2064.352879] do_fcntl+0x5e4/0xfe0
[ 2064.354657] __x64_sys_fcntl+0x11f/0x170
[ 2064.356550] do_syscall_64+0x3a/0x80
[ 2064.358259] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 2064.360745]
[ 2064.361511] Last potentially related work creation:
[ 2064.363957] kasan_save_stack+0x1c/0x40
[ 2064.365811] __kasan_record_aux_stack+0xaf/0xc0
[ 2064.368100] call_rcu+0x11b/0xf70
[ 2064.369785] dlm_process_incoming_buffer+0x47d/0xfd0 [dlm]
[ 2064.372404] receive_from_sock+0x290/0x770 [dlm]
[ 2064.374607] process_recv_sockets+0x32/0x40 [dlm]
[ 2064.377290] process_one_work+0x9a8/0x16e0
[ 2064.379357] worker_thread+0x87/0xbf0
[ 2064.381188] kthread+0x3ac/0x490
[ 2064.383460] ret_from_fork+0x22/0x30
[ 2064.385588]
[ 2064.386518] Second to last potentially related work creation:
[ 2064.389219] kasan_save_stack+0x1c/0x40
[ 2064.391043] __kasan_record_aux_stack+0xaf/0xc0
[ 2064.393303] call_rcu+0x11b/0xf70
[ 2064.394885] dlm_process_incoming_buffer+0x47d/0xfd0 [dlm]
[ 2064.397694] receive_from_sock+0x290/0x770 [dlm]
[ 2064.399932] process_recv_sockets+0x32/0x40 [dlm]
[ 2064.402180] process_one_work+0x9a8/0x16e0
[ 2064.404388] worker_thread+0x87/0xbf0
[ 2064.406124] kthread+0x3ac/0x490
[ 2064.408021] ret_from_fork+0x22/0x30
[ 2064.409834]
[ 2064.410599] The buggy address belongs to the object at ffff88800ef22780
[ 2064.410599] which belongs to the cache kmalloc-96 of size 96
[ 2064.416495] The buggy address is located 88 bytes inside of
[ 2064.416495] 96-byte region [ffff88800ef22780, ffff88800ef227e0)
[ 2064.422045] The buggy address belongs to the page:
[ 2064.424635] page:00000000b6bef8bc refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xef22
[ 2064.428970] flags: 0xfffffc0000200(slab|node=0|zone=1|lastcpupid=0x1fffff)
[ 2064.432515] raw: 000fffffc0000200ffffea0000d68b800000001400000014ffff888001041780
[ 2064.436110] raw: 0000000000000000000000008020002000000001ffffffff0000000000000000
[ 2064.439813] page dumped because: kasan: bad access detected
[ 2064.442548]
[ 2064.443310] Memory state around the buggy address:
[ 2064.445988] ffff88800ef22680: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc
[ 2064.449444] ffff88800ef22700: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc
[ 2064.452941] >ffff88800ef22780: 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc fc
[ 2064.456383] ^
[ 2064.459386] ffff88800ef22800: 00 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc
[ 2064.462788] ffff88800ef22880: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc
[ 2064.466239] ==================================================================
reproducer in python:
import argparse
import struct
import fcntl
import os
parser = argparse.ArgumentParser()
parser.add_argument('-f', '--file',
help='file to use fcntl, must be on dlm lock filesystem e.g. gfs2')
Drop the leftover pm_runtime_disable() calls from the late probe error
paths that would, for example, prevent runtime PM from being reenabled
after a probe deferral.
92597f97a40b ("PCI/PM: Avoid putting Elo i2 PCIe Ports in D3cold") omitted
braces around the new Elo i2 entry, so it overwrote the existing Gigabyte
X299 entry. Add the appropriate braces.
Found by:
$ make W=1 drivers/pci/pci.o
CC drivers/pci/pci.o
drivers/pci/pci.c:2974:12: error: initialized field overwritten [-Werror=override-init]
2974 | .ident = "Elo i2",
| ^~~~~~~~
In create_var_ref(), init_var_ref() is called to initialize the fields
of variable ref_field, which is allocated in the previous function call
to create_hist_field(). Function init_var_ref() allocates the
corresponding fields such as ref_field->system, but frees these fields
when the function encounters an error. The caller later calls
destroy_hist_field() to conduct error handling, which frees the fields
and the variable itself. This results in double free of the fields which
are already freed in the previous function.
Fix this by storing NULL to the corresponding fields when they are freed
in init_var_ref().
A maliciously corrupted filesystem can contain cycles in the h-tree
stored inside a directory. That can easily lead to the kernel corrupting
tree nodes that were already verified under its hands while doing a node
split and consequently accessing unallocated memory. Fix the problem by
verifying traversed block numbers are unique.
It may happen as follows:
1. write inline_data inode
vfs_write
new_sync_write
ext4_file_write_iter
ext4_buffered_write_iter
generic_perform_write
ext4_da_write_begin
ext4_da_write_inline_data_begin -> If inline data size too
small will allocate block to write, then mapping will has
dirty page
ext4_da_convert_inline_data_to_extent ->clear EXT4_STATE_MAY_INLINE_DATA
2. fallocate
do_vfs_ioctl
ioctl_preallocate
vfs_fallocate
ext4_fallocate
ext4_convert_inline_data
ext4_convert_inline_data_nolock
ext4_map_blocks -> fail will goto restore data
ext4_restore_inline_data
ext4_create_inline_data
ext4_write_inline_data
ext4_set_inode_state -> set inode EXT4_STATE_MAY_INLINE_DATA
3. writepages
__ext4_ioctl
ext4_alloc_da_blocks
filemap_flush
filemap_fdatawrite_wbc
do_writepages
ext4_writepages
if (ext4_has_inline_data(inode))
BUG_ON(ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA))
The root cause of this issue is we destory inline data until call
ext4_writepages under delay allocation mode. But there maybe already
convert from inline to extent. To solve this issue, we call
filemap_flush first..
Cc: stable@kernel.org Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220516122634.1690462-1-yebin10@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reason is first directory entry which 'rec_len' is 34478, then will get illegal
parent entry. Now, we do not check directory entry after read directory block
in 'ext4_get_first_dir_block'.
To solve this issue, check directory entry in 'ext4_get_first_dir_block'.
[ Trigger an ext4_error() instead of just warning if the directory is
missing a '.' or '..' entry. Also make sure we return an error code
if the file system is corrupted. -TYT ]
Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220414025223.4113128-1-yebin10@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since 3e135cd499bf ("netfilter: nft_dynset: dynamic stateful expression
instantiation"), it is possible to attach stateful expressions to set
elements.
cd5125d8f518 ("netfilter: nf_tables: split set destruction in deactivate
and destroy phase") introduces conditional destruction on the object to
accomodate transaction semantics.
nft_expr_init() calls expr->ops->init() first, then check for
NFT_STATEFUL_EXPR, this stills allows to initialize a non-stateful
lookup expressions which points to a set, which might lead to UAF since
the set is not properly detached from the set->binding for this case.
Anyway, this combination is non-sense from nf_tables perspective.
This patch fixes this problem by checking for NFT_STATEFUL_EXPR before
expr->ops->init() is called.
The reporter provides a KASAN splat and a poc reproducer (similar to
those autogenerated by syzbot to report use-after-free errors). It is
unknown to me if they are using syzbot or if they use similar automated
tool to locate the bug that they are reporting.
For the record, this is the KASAN splat.
[ 85.431824] ==================================================================
[ 85.432901] BUG: KASAN: use-after-free in nf_tables_bind_set+0x81b/0xa20
[ 85.433825] Write of size 8 at addr ffff8880286f0e98 by task poc/776
[ 85.434756]
[ 85.434999] CPU: 1 PID: 776 Comm: poc Tainted: G W 5.18.0+ #2
[ 85.436023] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
Fixes: 0b2d8a7b638b ("netfilter: nf_tables: add helper functions for expression handling") Reported-and-tested-by: Aaron Adams <edg-e@nccgroup.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
[Ajay: Regenerated the patch for v4.19.y] Signed-off-by: Ajay Kaher <akaher@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 505a666ee3fc ("writeback: plug writeback in wb_writeback() and
writeback_inodes_wb()") has us holding a plug during wb_writeback, which
may cause a potential ABBA dead lock:
wb_writeback fat_file_fsync
blk_start_plug(&plug)
for (;;) {
iter i-1: some reqs have been added into plug->mq_list // LOCK A
iter i:
progress = __writeback_inodes_wb(wb, work)
. writeback_sb_inodes // fat's bdev
. __writeback_single_inode
. . generic_writepages
. . __block_write_full_page
. . . . __generic_file_fsync
. . . . sync_inode_metadata
. . . . writeback_single_inode
. . . . __writeback_single_inode
. . . . fat_write_inode
. . . . __fat_write_inode
. . . . sync_dirty_buffer // fat's bdev
. . . . lock_buffer(bh) // LOCK B
. . . . submit_bh
. . . . blk_mq_get_tag // LOCK A
. . . trylock_buffer(bh) // LOCK B
. . . redirty_page_for_writepage
. . . wbc->pages_skipped++
. . --wbc->nr_to_write
. wrote += write_chunk - wbc.nr_to_write // wrote > 0
. requeue_inode
. redirty_tail_locked
if (progress) // progress > 0
continue;
iter i+1:
queue_io
// similar process with iter i, infinite for-loop !
}
blk_finish_plug(&plug) // flush plug won't be called
In my test, 'need_resched()' (which is imported by 590dca3a71 "fs-writeback:
unplug before cond_resched in writeback_sb_inodes") in function
'writeback_sb_inodes()' seldom comes true, unless cond_resched() is deleted
from write_cache_pages().
Fix it by correcting wrote number according number of skipped pages
in writeback_sb_inodes().
In ieee80211_vif_use_reserved_context(), when we have an
old context and the new context's replace_state is set to
IEEE80211_CHANCTX_REPLACE_NONE, we free the old context
in ieee80211_vif_use_reserved_reassign(). Therefore, we
cannot check the old_ctx anymore, so we should set it to
NULL after this point.
However, since the new_ctx replace state is clearly not
IEEE80211_CHANCTX_REPLACES_OTHER, we're not going to do
anything else in this function and can just return to
avoid accessing the freed old_ctx.
The root cause is: in a very small sized image, it's very easy to
exceed threshold of foreground GC, if we calculate free space and
dirty data based on section granularity, in corner case,
has_not_enough_free_secs() will always return true, result in
deadloop in f2fs_gc().
So this patch refactors has_not_enough_free_secs() as below to fix
this issue:
1. calculate needed space based on block granularity, and separate
all blocks to two parts, section part, and block part, comparing
section part to free section, and comparing block part to free space
in openned log.
2. account F2FS_DIRTY_NODES, F2FS_DIRTY_IMETA and F2FS_DIRTY_DENTS
as node block consumer;
3. account F2FS_DIRTY_DENTS as data block consumer;
Cc: stable@vger.kernel.org Reported-by: Ming Yan <yanming@tju.edu.cn> Signed-off-by: Chao Yu <chao.yu@oppo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the origin code, when "ExtSel" is 1, the eventcode will change to
"eventcode |= 1 << 21”. For event “UNC_Q_RxL_CREDITS_CONSUMED_VN0.DRS",
its "ExtSel" is "1", its eventcode will change from 0x1E to 0x20001E,
but in fact the eventcode should <=0x1FF, so this will cause the parse
fail:
# perf stat -e "UNC_Q_RxL_CREDITS_CONSUMED_VN0.DRS" -a sleep 0.1
event syntax error: '.._RxL_CREDITS_CONSUMED_VN0.DRS'
\___ value too big for format, maximum is 511
On the perf kernel side, the kernel assumes the valid bits are continuous.
It will adjust the 0x100 (bit 8 for perf tool) to bit 21 in HW.
If the slang lib is not installed on the system, perf c2c tool disables TUI
mode and roll back to use stdio mode; but the flag 'c2c.use_stdio' is
missed to set true and thus it wrongly applies UI quirks in the function
ui_quirks().
This commit forces to use stdio interface if slang is not supported, and
it can avoid to apply the UI quirks and show the correct metric header.
Fixes: 5a1a99cd2e4e1557 ("perf c2c report: Add main TUI browser") Reported-by: Joe Mario <jmario@redhat.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20220526145400.611249-1-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
On some systems it can take a long time for the hardware to enable the
GA log of the AMD IOMMU. The current wait time is only 0.1ms, but
testing showed that it can take up to 14ms for the GA log to enter
running state after it has been enabled.
Sometimes the long delay happens when booting the system, sometimes
only on resume. Adjust the timeout accordingly to not print a warning
when hardware takes a longer than usual.
There has already been an attempt to fix this with commit
9b45a7738eec ("iommu/amd: Fix loop timeout issue in iommu_ga_log_enable()")
But that commit was based on some wrong math and did not fix the issue
in all cases.
GISR1 was described in a not up-to-date documentation when the stm32-mdma
driver has been developed. This register has not been added in reference
manual of STM32 SoC with MDMA, which have only 32 MDMA channels.
So remove it from stm32-mdma driver.
of_parse_phandle() returns a node pointer with refcount incremented, we should
use of_node_put() on it when not need anymore. Add missing of_node_put() to
avoid refcount leak.
Fixes: d10715be03bd ("video: ARM CLCD: Add DT support") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit 587f03deb69b caused pnfs_update_layout() to stop returning ENOMEM
when the memory allocation fails, and hence causes it to fall back to
trying to do I/O through the MDS. There is no guarantee that this will
fare any better. If we're failing the pNFS layout allocation, then we
should just redirty the page and retry later.
drivers/i2c/busses/i2c-at91-master.c:707:6: warning: variable 'dma_buf' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
if (dev->use_dma) {
^~~~~~~~~~~~
drivers/i2c/busses/i2c-at91-master.c:717:27: note: uninitialized use occurs here
i2c_put_dma_safe_msg_buf(dma_buf, m_start, !ret);
^~~~~~~
Initialize dma_buf to NULL, as i2c_put_dma_safe_msg_buf() is a no-op
when the first argument is NULL, which will work for the !dev->use_dma
case.
Fixes: 03fbb903c8bf ("i2c: at91: use dma safe buffers") Link: https://github.com/ClangBuiltLinux/linux/issues/1629 Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Michael Walle <michael@walle.cc> Signed-off-by: Wolfram Sang <wsa@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The supplied buffer might be on the stack and we get the following error
message:
[ 3.312058] at91_i2c e0070600.i2c: rejecting DMA map of vmalloc memory
Use i2c_{get,put}_dma_safe_msg_buf() to get a DMA-able memory region if
necessary.
Lack the list_del in the mtk_iommu_remove, and remove
bus_set_iommu(*, NULL) since there may be several iommu HWs.
we can not bus_set_iommu null when one iommu driver unbind.
This could be a fix for mt2712 which support 2 M4U HW and list them.
Fixes: 7c3a2ec02806 ("iommu/mediatek: Merge 2 M4U HWs into one iommu domain") Signed-off-by: Yong Wu <yong.wu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/20220503071427.2285-6-yong.wu@mediatek.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
The list iterator variable will be a bogus pointer if no break was hit.
Dereferencing it (cur->page in this case) could load an out-of-bounds/undefined
value making it unsafe to use that in the comparision to determine if the
specific element was found.
Since 'cur->page' *can* be out-ouf-bounds it cannot be guaranteed that
by chance (or intention of an attacker) it matches the value of 'page'
even though the correct element was not found.
This is fixed by using a separate list iterator variable for the loop
and only setting the original variable if a suitable element was found.
Then determing if the element was found is simply checking if the
variable is set.
Fixes: 8c242db9b8c0 ("f2fs: fix stale ATOMIC_WRITTEN_PAGE private pointer") Signed-off-by: Jakob Koschel <jakobkoschel@gmail.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>