We're passing "&posn" instead of "posn" so it ends up corrupting
memory instead of doing something useful.
Fixes: 53e0c72d98ba ("ASoC: SOF: Add support for IPC IO between DSP and Host") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Link: https://lore.kernel.org/r/20200303101858.ytehbrivocyp3cnf@kili.mountain Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In case of ABI version mismatch, _manifest needs to be freed as
it is just a copy of the original topology manifest. However, if
a driver manifest handler is defined, that would get executed and
the cleanup is never reached. Fix that by getting the return status
of manifest() instead of returning directly.
Fixes: 583958fa2e52 ("ASoC: topology: Make manifest backward compatible from ABI v4") Signed-off-by: Dragos Tarcatu <dragos_tarcatu@mentor.com> Link: https://lore.kernel.org/r/20200207185325.22320-3-dragos_tarcatu@mentor.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If soc_tplg_link_config() fails, _link needs to be freed in case of
topology ABI version mismatch. However the current code is returning
directly and ends up leaking memory in this case.
This patch fixes that.
Fixes: 593d9e52f9bb ("ASoC: topology: Add support to configure existing physical DAI links") Signed-off-by: Dragos Tarcatu <dragos_tarcatu@mentor.com> Link: https://lore.kernel.org/r/20200207185325.22320-2-dragos_tarcatu@mentor.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The previous code was not thread safe and caused
undefined behavior from spurious duplicate resource IDs.
In this patch, an atomic_t is used instead. We no longer
see any duplicate IDs in tests with this change.
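A minimal sketch of the pattern (illustrative names, not the exact driver
code): IDs come from an atomic counter, so concurrent callers can never
observe the same value.

#include <linux/atomic.h>
#include <linux/types.h>

static atomic_t resource_id_seq = ATOMIC_INIT(0);

static u32 next_resource_id(void)
{
	/* atomic read-modify-write: no two callers get the same ID,
	 * unlike the previous plain counter */
	return (u32)atomic_inc_return(&resource_id_seq);
}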
Fixes: 16065fcdd19d ("drm/virtio: do NOT reuse resource ids") Signed-off-by: John Bates <jbates@chromium.org> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Link: http://patchwork.freedesktop.org/patch/msgid/20200220225319.45621-1-jbates@chromium.org Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The purpose of commit 0fd85869c2a9 ("spi/bcm63xx-hsspi: keep pll clk enabled")
was to keep the pll clk enabled through the lifetime of the device.
In order to do that, some 'clk_prepare_enable()'/'clk_disable_unprepare()'
calls have been added in the error handling path of the probe function, in
the remove function and in the suspend and resume functions.
However, a 'clk_disable_unprepare()' call has been unfortunately left in
the probe function. So the commit seems to be more or less a no-op.
Axe it now, so that the pll clk is left enabled through the lifetime of
the device, as described in the commit.
Fixes: 0fd85869c2a9 ("spi/bcm63xx-hsspi: keep pll clk enabled") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Jonas Gorski <jonas.gorski@gmail.com> Link: https://lore.kernel.org/r/20200228213838.7124-1-christophe.jaillet@wanadoo.fr Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The difference between "fsl,etsec2-mdio" and "gianfar" has to do with
the .get_tbipa function, which calculates the address of the TBIPA
register automatically, if not explicitly specified. [ see
drivers/net/ethernet/freescale/fsl_pq_mdio.c ]. On LS1021A, the TBIPA
register is at offset 0x30 within the port register block, which is what
the "gianfar" method of calculating addresses actually does.
Luckily, the bad "compatible" is inconsequential for ls1021a.dtsi,
because the TBIPA register is explicitly specified via the second "reg"
(<0x0 0x2d10030 0x0 0x4>), so the "get_tbipa" function is dead code.
Nonetheless it's good to restore it to its correct value.
We neither assign congested_fn for request-based blk-mq devices nor
implement it correctly. So fix both.
Also, remove incorrect comment from dm_init_normal_md_queue and rename
it to dm_init_congested_fn.
Fixes: 4aa9c692e052 ("bdi: separate out congested state into a separate struct") Cc: stable@vger.kernel.org Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dm-zoned initializes reference counters of new chunk works with zero
value and refcount_inc() is called to increment the counter. However, the
refcount_inc() function handles the addition to zero value as an error
and triggers the warning as follows:
After this warning, subsequent refcount API calls for the counter all fail
to change the counter value.
Fix this by initializing the reference counter of new chunk works to one
instead of zero, and by not calling refcount_inc() via
dmz_get_chunk_work() for the newly created chunk works.
The failure was observed with linux version 5.4 with CONFIG_REFCOUNT_FULL
enabled. Refcount rework was merged to linux version 5.5 by the
commit 168829ad09ca ("Merge branch 'locking-core-for-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip"). After this
commit, CONFIG_REFCOUNT_FULL was removed and the failure was observed
regardless of kernel configuration.
Linux version 4.20 merged the commit 092b5648760a ("dm zoned: target: use
refcount_t for dm zoned reference counters"). Before this commit, dm
zoned used atomic_t APIs, which do not check for addition to zero, so this
fix is not necessary there.
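A sketch of the pattern with a hypothetical chunk-work struct (not the
actual dm-zoned code): start the refcount at one for the creator's
reference and call refcount_inc() only for additional references.

#include <linux/refcount.h>

struct chunk_work_sketch {
	refcount_t refcount;
	/* ... */
};

static void chunk_work_init(struct chunk_work_sketch *cw)
{
	/* the creator holds the first reference */
	refcount_set(&cw->refcount, 1);
}

static void chunk_work_get(struct chunk_work_sketch *cw)
{
	/* only used for additional references, never 0 -> 1 */
	refcount_inc(&cw->refcount);
}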
Fixes: 092b5648760a ("dm zoned: target: use refcount_t for dm zoned reference counters") Cc: stable@vger.kernel.org # 5.4+ Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The function dm_suspended returns true if the target is suspended.
However, when the target is being suspended during unload, it returns
false.
An example where this is a problem: the test "!dm_suspended(wc->ti)" in
writecache_writeback is not sufficient, because dm_suspended returns
zero while writecache_suspend is in progress. As is, without an
enhanced dm_suspended, simply switching from flush_workqueue to
drain_workqueue still emits warnings:
workqueue writecache-writeback: drain_workqueue() isn't complete after 10 tries
workqueue writecache-writeback: drain_workqueue() isn't complete after 100 tries
workqueue writecache-writeback: drain_workqueue() isn't complete after 200 tries
workqueue writecache-writeback: drain_workqueue() isn't complete after 300 tries
workqueue writecache-writeback: drain_workqueue() isn't complete after 400 tries
writecache_suspend calls flush_workqueue(wc->writeback_wq) - this function
flushes the current work. However, the workqueue may re-queue itself and
flush_workqueue doesn't wait for re-queued works to finish. Because of
this - the function writecache_writeback continues execution after the
device was suspended and then concurrently with writecache_dtr, causing
a crash in writecache_writeback.
We must use drain_workqueue - that waits until the work and all re-queued
works finish.
As a prereq for switching to drain_workqueue, this commit fixes
dm_suspended to return true after the presuspend hook and before the
postsuspend hook - just like during a normal suspend. It allows
simplifying the dm-integrity and dm-writecache targets so that they
don't have to maintain suspended flags on their own.
With this change, drain_workqueue() can be used effectively. This change
was tested with the lvm2 and cryptsetup testsuites and there are no
regressions.
The crash can be reproduced by running the lvm2 testsuite test
lvconvert-thin-external-cache.sh for several minutes, e.g.:
while :; do make check T=shell/lvconvert-thin-external-cache.sh; done
The crash happens in this call chain:
do_waker -> policy_tick -> smq_tick -> end_hotspot_period -> clear_bitset
-> memset -> __memset -- which accesses an invalid pointer in the vmalloc
area.
The work entry on the workqueue is executed even after the bitmap was
freed. The problem is that cancel_delayed_work doesn't wait for the
running work item to finish, so the work item can continue running and
re-submitting itself even after cache_postsuspend. In order to make sure
that the work item won't be running, we must use cancel_delayed_work_sync.
Also, change flush_workqueue to drain_workqueue, so that if some work item
submits itself or another work item, we are properly waiting for both of
them.
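A sketch of the teardown ordering described above, with hypothetical
names:

#include <linux/workqueue.h>

static void cache_postsuspend_sketch(struct delayed_work *waker,
				     struct workqueue_struct *wq)
{
	/* waits for a currently running instance, unlike cancel_delayed_work() */
	cancel_delayed_work_sync(waker);
	/* waits for work items that re-queue themselves or queue other items */
	drain_workqueue(wq);
}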
If the flag SB_FLAG_RECALCULATE is present in the superblock, but it was
not specified on the command line (i.e. ic->recalculate_flag is false),
dm-integrity would return an invalid table line - the reported number of
arguments would not match the real number.
Fixes: 468dfca38b1a ("dm integrity: add a bitmap mode") Cc: stable@vger.kernel.org # v5.2+ Reported-by: Ondrej Kozina <okozina@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If we need to perform synchronous I/O in dm_integrity_map_continue(),
we must make sure that we are not in the map function - in order to
avoid the deadlock due to bio queuing in generic_make_request. To
avoid the deadlock, we offload the request to metadata_wq.
However, metadata_wq also processes metadata updates for write requests.
If there are too many requests that get offloaded to metadata_wq at the
beginning of dm_integrity_map_continue, the workqueue metadata_wq
becomes clogged and the system is incapable of processing any metadata
updates.
This causes a deadlock because all the requests that need to do metadata
updates wait for metadata_wq to proceed and metadata_wq waits inside
wait_and_add_new_range until some existing request releases its range
lock (which doesn't happen because the range lock is released after
metadata update).
In order to fix the deadlock, we create a new workqueue offload_wq and
offload requests to it - so that processing of offload_wq is independent
from processing of metadata_wq.
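A sketch of the idea (names are illustrative, not the exact dm-integrity
code): the offloaded requests get their own workqueue so they cannot clog
metadata_wq.

#include <linux/errno.h>
#include <linux/workqueue.h>

struct integrity_sketch {
	struct workqueue_struct *metadata_wq;
	struct workqueue_struct *offload_wq;
};

static int create_offload_wq(struct integrity_sketch *ic)
{
	/* separate queue: its progress no longer depends on metadata_wq */
	ic->offload_wq = alloc_workqueue("dm-integrity-offload",
					 WQ_MEM_RECLAIM, 0);
	if (!ic->offload_wq)
		return -ENOMEM;
	return 0;
}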
If we resume a device in bitmap mode and the on-disk format is in journal
mode, we must recalculate anything above ic->sb->recalc_sector. Otherwise,
there would be non-recalculated blocks which would cause I/O errors.
The interrupt handler puts a half-completed DMA descriptor on a free list
and then schedules a tasklet to process the bottom half of the descriptor,
which executes the client's callback. This creates the possibility of
picking up a still-busy descriptor from the free list. Thus, let's
disallow re-use of a descriptor until it is fully processed.
I was doing some experiments with I2C and noticed that Tegra APB DMA
driver crashes sometime after I2C DMA transfer termination. The crash
happens because tegra_dma_terminate_all() bails out immediately if the
pending list is empty, and thus it doesn't release the half-completed
descriptors which are getting re-used before the ISR tasklet kicks in.
tegra-i2c 7000c400.i2c: DMA transfer timeout
elants_i2c 0-0010: elants_i2c_irq: failed to read data: -110
------------[ cut here ]------------
WARNING: CPU: 0 PID: 142 at lib/list_debug.c:45 __list_del_entry_valid+0x45/0xac
list_del corruption, ddbaac44->next is LIST_POISON1 (00000100)
Modules linked in:
CPU: 0 PID: 142 Comm: kworker/0:2 Not tainted 5.5.0-rc2-next-20191220-00175-gc3605715758d-dirty #538
Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
Workqueue: events_freezable_power_ thermal_zone_device_check
[<c010e5c5>] (unwind_backtrace) from [<c010a1c5>] (show_stack+0x11/0x14)
[<c010a1c5>] (show_stack) from [<c0973925>] (dump_stack+0x85/0x94)
[<c0973925>] (dump_stack) from [<c011f529>] (__warn+0xc1/0xc4)
[<c011f529>] (__warn) from [<c011f7e9>] (warn_slowpath_fmt+0x61/0x78)
[<c011f7e9>] (warn_slowpath_fmt) from [<c042497d>] (__list_del_entry_valid+0x45/0xac)
[<c042497d>] (__list_del_entry_valid) from [<c047a87f>] (tegra_dma_tasklet+0x5b/0x154)
[<c047a87f>] (tegra_dma_tasklet) from [<c0124799>] (tasklet_action_common.constprop.0+0x41/0x7c)
[<c0124799>] (tasklet_action_common.constprop.0) from [<c01022ab>] (__do_softirq+0xd3/0x2a8)
[<c01022ab>] (__do_softirq) from [<c0124683>] (irq_exit+0x7b/0x98)
[<c0124683>] (irq_exit) from [<c0168c19>] (__handle_domain_irq+0x45/0x80)
[<c0168c19>] (__handle_domain_irq) from [<c043e429>] (gic_handle_irq+0x45/0x7c)
[<c043e429>] (gic_handle_irq) from [<c0101aa5>] (__irq_svc+0x65/0x94)
Exception stack(0xde2ebb90 to 0xde2ebbd8)
On i.MX6UL/ULL and i.MX6SX the DMA event id for the RX channel of
UART6 is '0'. To fix the broken DMA support for UART6, we change
the check for event_id0 to include '0' as a valid id.
There is a DMA problem with the serial ports on i.MX6.
When the following sequence is performed:
1) Open a port
2) Write some data
3) Close the port
4) Open a *different* port
5) Write some data
6) Close the port
The second write sends nothing and the second close hangs.
If the first close() is omitted it works.
Adding logs to the UART driver shows that the DMA is being setup but
the callback is never invoked for the second write.
This used to work in 4.19.
Git bisect leads to: ad0d92d: "dmaengine: imx-sdma: refine to load context only once"
This commit adds a "context_loaded" flag used to avoid unnecessary context
setups.
However the flag is only reset in sdma_channel_terminate_work(),
which is only invoked in a worker triggered by sdma_terminate_all() IF
there is an active descriptor.
So, if no active descriptor remains when the channel is terminated, the
flag is not reset and, when the channel is later reused the old context
is used.
Fix the problem by always resetting the flag in sdma_free_chan_resources().
Cc: stable@vger.kernel.org Signed-off-by: Martin Fuzzey <martin.fuzzey@flowbird.group> Fixes: ad0d92d7ba6a ("dmaengine: imx-sdma: refine to load context only once") Reviewed-by: Fabio Estevam <festevam@gmail.com> Link: https://lore.kernel.org/r/1580305274-27274-1-git-send-email-martin.fuzzey@flowbird.group Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This fix handles some MP1 firmware issues, such as the mclk DPM table on
Renoir having a reversed DPM clock layout and a zero-frequency DPM level,
as in the following case.
Add map_cached bool to drm_gem_shmem_object, to request cached mappings
on a per-object basis. Check the flag before adding writecombine to
pgprot bits.
Stefan reported a strange kernel fault which turned out to be due to a
missing KUAP disable in flush_coherent_icache() called from
flush_icache_range().
This path through handle_rt_signal64() to setup_trampoline() and
flush_icache_range() is only triggered by 64-bit processes that have
unmapped their VDSO, which is rare.
flush_icache_range() takes a range of addresses to flush. In
flush_coherent_icache() we implement an optimisation for CPUs where we
know we don't actually have to flush the whole range, we just need to
do a single icbi.
However we still execute the icbi on the user address of the start of
the range we're flushing. On CPUs that also implement KUAP (Power9)
that leads to the spurious fault above.
We should be able to pass any address, including a kernel address, to
the icbi on these CPUs, which would avoid any interaction with KUAP.
But I don't want to make that change in a bug fix, just in case it
surfaces some strange behaviour on some CPU.
So for now just disable KUAP around the icbi. Note the icbi is treated
as a load, so we allow read access, not write as you'd expect.
Fixes: 890274c2dc4c ("powerpc/64s: Implement KUAP for Radix MMU") Cc: stable@vger.kernel.org # v5.2+ Reported-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200303235708.26004-1-mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On s390 there currently is no implementation of pud_write(). That was ok
as long as we had our own implementation of get_user_pages_fast() which
checked for pud protection by testing the bit directly w/o using
pud_write(). The other callers of pud_write() are not reachable on s390.
After commit 1a42010cdc26 ("s390/mm: convert to the generic
get_user_pages_fast code") we use the generic get_user_pages_fast(), which
does call pud_write() in pud_access_permitted() for FOLL_WRITE access on
a large pud. Without an s390 specific pud_write(), the generic version is
called, which contains a BUG() statement to remind us that we don't have a
proper implementation. This results in a kernel panic.
Fix this by providing an implementation of pud_write().
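A minimal sketch of such an implementation, assuming the write bit of a
large pud (region-third-table entry) is the one named _REGION3_ENTRY_WRITE
as used by the other s390 pud helpers; illustrative only, not necessarily
the exact patch:

/* arch/s390/include/asm/pgtable.h (sketch) */
#define pud_write pud_write
static inline int pud_write(pud_t pud)
{
	return (pud_val(pud) & _REGION3_ENTRY_WRITE) != 0;
}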
Since commit 71ba41c9b1d9 ("s390/pci: provide support for MIO instructions"),
zpci_map_resource() and zpci_setup_resources() default to using the
mio_wb address as the resource's start address. This means users of the
mapping, which includes most drivers, will get write combining on PCI
Stores. This may lead to problems when drivers expect write through
behavior when not using an explicit ioremap_wc().
Cc: stable@vger.kernel.org Fixes: 71ba41c9b1d9 ("s390/pci: provide support for MIO instructions") Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Explicitly set X86_FEATURE_OSPKE via set_cpu_cap() instead of calling
get_cpu_cap() to pull the feature bit from CPUID after enabling CR4.PKE.
Invoking get_cpu_cap() effectively wipes out any {set,clear}_cpu_cap()
changes that were made between this_cpu->c_init() and setup_pku(), as
all non-synthetic feature words are reinitialized from the CPU's CPUID
values.
Blasting away capability updates manifests most visibly when running
on a VMX capable CPU, but with VMX disabled by BIOS. To indicate that
VMX is disabled, init_ia32_feat_ctl() clears X86_FEATURE_VMX, using
clear_cpu_cap() instead of setup_clear_cpu_cap() so that KVM can report
which CPU is misconfigured (KVM needs to probe every CPU anyways).
Restoring X86_FEATURE_VMX from CPUID causes KVM to think VMX is enabled,
ultimately leading to an unexpected #GP when KVM attempts to do VMXON.
Arguably, init_ia32_feat_ctl() should use setup_clear_cpu_cap() and let
KVM figure out a different way to report the misconfigured CPU, but VMX
is not the only feature bit that is affected, i.e. there is precedent
that tweaking feature bits via {set,clear}_cpu_cap() after ->c_init()
is expected to work. Most notably, x86_init_rdrand()'s clearing of
X86_FEATURE_RDRAND when RDRAND malfunctions is also overwritten.
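A sketch of the change described above (simplified, not the full
setup_pku()):

#include <asm/processor.h>
#include <asm/cpufeature.h>
#include <asm/tlbflush.h>

static void setup_pku_sketch(struct cpuinfo_x86 *c)
{
	cr4_set_bits(X86_CR4_PKE);

	/*
	 * was: get_cpu_cap(c); -- which re-reads CPUID and throws away any
	 * earlier {set,clear}_cpu_cap() adjustments, e.g. a cleared
	 * X86_FEATURE_VMX or X86_FEATURE_RDRAND.
	 */
	set_cpu_cap(c, X86_FEATURE_OSPKE);
}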
Fixes: 0697694564c8 ("x86/mm/pkeys: Actually enable Memory Protection Keys in the CPU") Reported-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Tested-by: Jacob Keller <jacob.e.keller@intel.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20200226231615.13664-1-sean.j.christopherson@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit f3186dd87669 ("spi: Optionally use GPIO descriptors for CS GPIOs")
amended of_spi_parse_dt() to always set SPI_CS_HIGH for SPI slaves whose
Chip Select is defined by a "cs-gpios" devicetree property.
This change broke userspace applications which issue an SPI_IOC_WR_MODE
ioctl() to an spidev: Chip Select polarity will be incorrect unless the
application is changed to set SPI_CS_HIGH. And once changed, it will be
incompatible with kernels not containing the commit.
Fix by setting SPI_CS_HIGH in spidev_ioctl() (under the same conditions
as in of_spi_parse_dt()).
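A sketch of the idea, with a hypothetical helper name (simplified from
the SPI_IOC_WR_MODE handling):

#include <linux/spi/spi.h>

static u32 spidev_fixup_mode(struct spi_device *spi, u32 requested_mode)
{
	/*
	 * When the chip select is a GPIO descriptor, of_spi_parse_dt()
	 * forces SPI_CS_HIGH, so keep the bit set regardless of what
	 * mode bits userspace asked for.
	 */
	if (spi->cs_gpiod)
		requested_mode |= SPI_CS_HIGH;
	return requested_mode;
}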
Fixes: f3186dd87669 ("spi: Optionally use GPIO descriptors for CS GPIOs") Reported-by: Simon Han <z.han@kunbus.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/fca3ba7cdc930cd36854666ceac4fbcf01b89028.1582027457.git.lukas@wunner.de Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org # v5.1+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In __cmd_record(), when receiving SIGINT(ctrl + c), a 'done' flag will
be set and the event list will be disabled by evlist__disable() once.
While in auxtrace_record.read_finish(), the related events will be
enabled again; if they are continuous, the recording seems to be
endless.
If the cs_etm event is disabled, we don't enable it again here.
Note: This patch is NOT tested since I don't have a machine with the
coresight feature, but the code seems to have the same bug as arm-spe and
intel-pt.
Tester notes:
Thanks for looping, Adrian. Applied this patch and tested with
CoreSight on juno board, it works well.
Signed-off-by: Wei Li <liwei391@huawei.com> Reviewed-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Tan Xiaojun <tanxiaojun@huawei.com> Cc: stable@vger.kernel.org # 5.4+ Link: http://lore.kernel.org/lkml/20200214132654.20395-4-adrian.hunter@intel.com
[ahunter: removed redundant 'else' after 'return'] Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In __cmd_record(), when receiving SIGINT(ctrl + c), a 'done' flag will
be set and the event list will be disabled by evlist__disable() once.
While in auxtrace_record.read_finish(), the related events will be
enabled again; if they are continuous, the recording seems to be endless.
If the intel_pt event is disabled, we don't enable it again here.
Before the patch:
huawei@huawei-2288H-V5:~/linux-5.5-rc4/tools/perf$ ./perf record -e \
intel_pt//u -p 46803
^C^C^C^C^C^C
After the patch:
huawei@huawei-2288H-V5:~/linux-5.5-rc4/tools/perf$ ./perf record -e \
intel_pt//u -p 48591
^C[ perf record: Woken up 0 times to write data ]
Warning:
AUX data lost 504 times out of 4816!
[ perf record: Captured and wrote 2024.405 MB perf.data ]
Signed-off-by: Wei Li <liwei391@huawei.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Tan Xiaojun <tanxiaojun@huawei.com> Cc: stable@vger.kernel.org # 5.4+ Link: http://lore.kernel.org/lkml/20200214132654.20395-2-adrian.hunter@intel.com
[ ahunter: removed redundant 'else' after 'return' ] Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The topology that v4l2_m2m_register_media_controller() created for a
processing block actually had a source-to-source link and a sink-to-sink
link instead of two source-to-sink links.
Unfortunately v4l2-compliance never checked for such bad links, so this
went unreported for quite some time.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Reported-by: Nicolas Dufresne <nicolas@ndufresne.ca> Cc: <stable@vger.kernel.org> # for v4.19 and up Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Only ARGB32-type pixelformats were assumed to have 4 components, which is
wrong since RGB32-type pixelformats may have an alpha channel, so they
should also be assumed to have 4 color components.
The XRGB32-type pixelformats really have only 3 color components, but this
complicated matters since that creates strides that are sometimes width * 3
and sometimes width * 4, and in fact this can result in buffer overflows.
Keep things simple by just always processing all 4 color components.
In the future we might want to optimize this again for the XRGB32-type
pixelformats, but for now keep it simple and robust.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Cc: <stable@vger.kernel.org> # for v5.4 and up Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
These are bits, so to test whether a pad is a sink you use & rather than ==.
It looks like the only reason this hasn't caused problems before is that
media_get_pad_index() is currently only used with pads that do not set the
MEDIA_PAD_FL_MUST_CONNECT flag. So a pad really had only the SINK or SOURCE
flag set and nothing else.
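A minimal illustration of the difference (hypothetical helper):

#include <media/media-entity.h>

static bool pad_is_sink(const struct media_pad *pad)
{
	/*
	 * Wrong: pad->flags == MEDIA_PAD_FL_SINK breaks as soon as another
	 * flag such as MEDIA_PAD_FL_MUST_CONNECT is also set.
	 */
	return pad->flags & MEDIA_PAD_FL_SINK;
}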
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Cc: <stable@vger.kernel.org> # for v5.3 and up Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We need to nest the console lock in sel_lock, so we have to push it down
a bit. Fortunately, the callers of set_selection_* just lock the console
lock around the function call. So moving it down is easy.
syzkaller reported this UAF:
BUG: KASAN: use-after-free in n_tty_receive_buf_common+0x2481/0x2940 drivers/tty/n_tty.c:1741
Read of size 1 at addr ffff8880089e40e9 by task syz-executor.1/13184
It is due to a race between parallel paste_selection (TIOCL_PASTESEL)
and set_selection_user (TIOCL_SETSEL) invocations. One uses sel_buffer,
while the other frees it and reallocates a new one for another
selection. Add a mutex to close this race.
The mutex properly protects only sel_buffer and sel_buffer_lth. The
other selection global variables (like sel_start, sel_end, and sel_cons)
are protected only in set_selection_user. The other functions need quite
a bit more work to close the races on those variables there. This is going
to happen later.
This likely fixes (I am unsure as there is no reproducer provided) bug
206361 too. It was marked as CVE-2020-8648.
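A simplified sketch of the locking pattern (not the exact vt code): both
paths take sel_lock around any use of sel_buffer.

#include <linux/mutex.h>
#include <linux/slab.h>

static DEFINE_MUTEX(sel_lock);
static char *sel_buffer;
static int sel_buffer_lth;

/* set_selection path: replace the buffer under the lock */
static void set_selection_buffer(char *new_buf, int len)
{
	mutex_lock(&sel_lock);
	kfree(sel_buffer);	/* safe: the paste path also holds sel_lock */
	sel_buffer = new_buf;
	sel_buffer_lth = len;
	mutex_unlock(&sel_lock);
}

/* paste_selection path: the buffer cannot be freed underneath us */
static int selection_length(void)
{
	int len;

	mutex_lock(&sel_lock);
	len = sel_buffer ? sel_buffer_lth : 0;
	mutex_unlock(&sel_lock);
	return len;
}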
On Apple devices the _CRS method returns an empty resource template, and
the resource settings are instead provided by the _DSM method. But
commit 33364d63c75d6182fa369cea80315cf1bb0ee38e (serdev: Add ACPI
devices by ResourceSource field) changed the search for serdev devices
to require valid, non-empty resource template, thereby breaking Apple
devices and causing bluetooth devices to not be found.
This expands the check so that if we don't find a valid template, and
we're on an Apple machine, then just check for the device being an
immediate child of the controller and having a "baud" property.
Since commit 3bc3206e1c0f ("serial: fsl_lpuart: Remove the alias node
dependence") the port line number can also be allocated by IDA, but in
case of an error the ID will not be removed again. More importantly, any
ID will be freed in remove(), even if it wasn't allocated but instead
fetched by of_alias_get_id(). If it was not allocated by IDA there will
be a warning:
WARN(1, "ida_free called for id=%d which is not allocated.\n", id);
Move the ID allocation more to the end of the probe() so that we still
can use plain return in the first error cases.
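A sketch of the resulting probe()/remove() pattern, with illustrative
names: fall back to the IDA only when there is no alias, and return the
ID to the IDA only if it actually came from there.

#include <linux/gfp.h>
#include <linux/idr.h>
#include <linux/of.h>

static DEFINE_IDA(port_ida);

static int get_port_line(struct device_node *np, bool *from_ida)
{
	int id = of_alias_get_id(np, "serial");

	*from_ida = false;
	if (id < 0) {		/* no alias: allocate late in probe() */
		id = ida_alloc(&port_ida, GFP_KERNEL);
		*from_ida = true;
	}
	return id;
}

static void put_port_line(int id, bool from_ida)
{
	if (from_ida)		/* never ida_free() an alias-based ID */
		ida_free(&port_ida, id);
}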
Fixes: 3bc3206e1c0f ("serial: fsl_lpuart: Remove the alias node dependence") Signed-off-by: Michael Walle <michael@walle.cc> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200303174306.6015-3-michael@walle.cc Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
dra76x is not affected by i887 which requires the mmc3 node to be limited
to a max frequency of 64 MHz. Fix this by overwriting the correct value in
the dra76-specific dtsi.
Fixes: 895bd4b3e5ec ("ARM: dts: Add support for dra76-evm") Cc: stable@vger.kernel.org Signed-off-by: Faiz Abbas <faiz_abbas@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
btrfs_lookup_and_bind_dio_csum() does pointer arithmetic which assumes
32-bit checksums. If using a larger checksum, this leads to spurious
failures when a direct I/O read crosses a stripe. This is easy
to reproduce:
When we get an error in the middle of reading an inode, some fields in the
inode might still not be initialized, and then the evict_inode path may
access those fields via iput().
To fix this, make sure that the inode fields are initialized.
Commit cd02cf1aceea ("mm/hotplug: fix an imbalance with DEBUG_PAGEALLOC")
fixed memory hotplug with debug_pagealloc enabled, where onlining a page
goes through page freeing, which removes the direct mapping. Some arches
don't like when the page is not mapped in the first place, so
generic_online_page() maps it first. This is somewhat wasteful, but
better than special casing page freeing fast paths.
The commit however missed that DEBUG_PAGEALLOC being configured doesn't
mean it's actually enabled. One has to test debug_pagealloc_enabled()
since commit 031bc5743f15 ("mm/debug-pagealloc: make debug-pagealloc
boottime configurable"), or alternatively debug_pagealloc_enabled_static()
since commit 8e57f8acbbd1 ("mm, debug_pagealloc: don't rely on static keys
too early"), but this is not done.
As a result, a s390 kernel with DEBUG_PAGEALLOC configured but not enabled
will crash:
Fix this by checking debug_pagealloc_enabled_static() before calling
kernel_map_pages(). Backports for kernel before 5.5 should use
debug_pagealloc_enabled() instead. Also add comments.
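A sketch of the fixed check (simplified from generic_online_page()):

#include <linux/mm.h>

static void online_page_sketch(struct page *page, unsigned int order)
{
	/*
	 * Freeing the page removes it from the direct map only when
	 * debug_pagealloc is enabled at runtime, so only then does it need
	 * to be mapped first. Kernels before 5.5 should test
	 * debug_pagealloc_enabled() instead.
	 */
	if (debug_pagealloc_enabled_static())
		kernel_map_pages(page, 1 << order, 1);
	__free_pages_core(page, order);
}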
Fixes: cd02cf1aceea ("mm/hotplug: fix an imbalance with DEBUG_PAGEALLOC") Reported-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: <stable@vger.kernel.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Qian Cai <cai@lca.pw> Link: http://lkml.kernel.org/r/20200224094651.18257-1-vbabka@suse.cz Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In set_pmd_migration_entry(), pmdp_invalidate() is used to change PMD
atomically. But the PMD is read before that with an ordinary memory
reading. If the THP (transparent huge page) is written between the PMD
reading and pmdp_invalidate(), the PMD dirty bit may be lost, and cause
data corruption. The race window is quite small, but still possible in
theory, so need to be fixed.
The race is fixed via using the return value of pmdp_invalidate() to get
the original content of PMD, which is a read/modify/write atomic
operation. So no THP writing can occur in between.
The race has been introduced when the THP migration support is added in
the commit 616b8371539a ("mm: thp: enable thp migration in generic path").
But this fix depends on the commit d52605d7cb30 ("mm: do not lose dirty
and accessed bits in pmdp_invalidate()"). So it's easy to be backported
after v4.16. But the race window is really small, so it may be fine not
to backport the fix at all.
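A sketch of the fixed sequence (simplified from set_pmd_migration_entry()):
the old PMD value comes from pmdp_invalidate() itself rather than from an
earlier plain read.

#include <linux/mm.h>

static void capture_old_pmd(struct vm_area_struct *vma, unsigned long address,
			    pmd_t *pmdp, struct page *page)
{
	/*
	 * Atomic read/modify/write: the returned value is the entry that
	 * was actually replaced, so a concurrent write cannot make us lose
	 * the dirty bit.
	 */
	pmd_t pmdval = pmdp_invalidate(vma, address, pmdp);

	if (pmd_dirty(pmdval))
		set_page_dirty(page);
}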
Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: <stable@vger.kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Michal Hocko <mhocko@kernel.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Link: http://lkml.kernel.org/r/20200220075220.2327056-1-ying.huang@intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
: A user reported a bug against a distribution kernel while running a
: proprietary workload described as "memory intensive that is not swapping"
: that is expected to apply to mainline kernels. The workload is
: read/write/modifying ranges of memory and checking the contents. They
: reported that within a few hours that a bad PMD would be reported followed
: by a memory corruption where expected data was all zeros. A partial
: report of the bad PMD looked like
:
: [ 5195.338482] ../mm/pgtable-generic.c:33: bad pmd ffff8888157ba008(000002e0396009e2)
: [ 5195.341184] ------------[ cut here ]------------
: [ 5195.356880] kernel BUG at ../mm/pgtable-generic.c:35!
: ....
: [ 5195.410033] Call Trace:
: [ 5195.410471] [<ffffffff811bc75d>] change_protection_range+0x7dd/0x930
: [ 5195.410716] [<ffffffff811d4be8>] change_prot_numa+0x18/0x30
: [ 5195.410918] [<ffffffff810adefe>] task_numa_work+0x1fe/0x310
: [ 5195.411200] [<ffffffff81098322>] task_work_run+0x72/0x90
: [ 5195.411246] [<ffffffff81077139>] exit_to_usermode_loop+0x91/0xc2
: [ 5195.411494] [<ffffffff81003a51>] prepare_exit_to_usermode+0x31/0x40
: [ 5195.411739] [<ffffffff815e56af>] retint_user+0x8/0x10
:
: Decoding revealed that the PMD was a valid prot_numa PMD and the bad PMD
: was a false detection. The bug does not trigger if automatic NUMA
: balancing or transparent huge pages is disabled.
:
: The bug is due a race in change_pmd_range between a pmd_trans_huge and
: pmd_none_or_clear_bad check without any locks held. During the
: pmd_trans_huge check, a parallel protection update under lock can have
: cleared the PMD and filled it with a prot_numa entry between the transhuge
: check and the pmd_none_or_clear_bad check.
:
: While this could be fixed with heavy locking, it's only necessary to make
: a copy of the PMD on the stack during change_pmd_range and avoid races. A
: new helper is created for this as the check is quite subtle and the
: existing similar helper is not suitable. This passed 154 hours of
: testing (usually triggers between 20 minutes and 24 hours) without
: detecting bad PMDs or corruption. A basic test of an autonuma-intensive
: workload showed no significant change in behaviour.
Although Mel withdrew the patch in the face of the LKML comment
https://lkml.org/lkml/2017/4/10/922, the aforementioned race window is
still open, and we have reports of Linpack test reporting bad residuals
after the bad PMD warning is observed. In addition to that, bad
rss-counter and non-zero pgtables assertions are triggered on mm teardown
for the task hitting the bad PMD.
host kernel: mm/pgtable-generic.c:40: bad pmd 00000000b3152f68(8000000d2d2008e7)
....
host kernel: BUG: Bad rss-counter state mm:00000000b583043d idx:1 val:512
host kernel: BUG: non-zero pgtables_bytes on freeing mm: 4096
The issue is observed on a v4.18-based distribution kernel, but the race
window is expected to be applicable to mainline kernels, as well.
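A sketch of such a helper (the name here is illustrative): the checks
operate on a single atomically-read copy of the PMD, so a parallel
prot_numa update cannot be misread as a bad PMD.

#include <linux/mm.h>

static inline int pmd_none_or_clear_bad_unless_huge_sketch(pmd_t *pmd)
{
	pmd_t pmdval = pmd_read_atomic(pmd);	/* one consistent snapshot */

	if (pmd_none(pmdval))
		return 1;
	if (pmd_trans_huge(pmdval))
		return 0;			/* caller handles huge PMDs */
	if (unlikely(pmd_bad(pmdval))) {
		pmd_clear_bad(pmd);
		return 1;
	}
	return 0;
}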
[akpm@linux-foundation.org: fix comment typo, per Rafael] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Rafael Aquini <aquini@redhat.com> Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Cc: <stable@vger.kernel.org> Cc: Zi Yan <zi.yan@cs.rutgers.edu> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Michal Hocko <mhocko@suse.com> Link: http://lkml.kernel.org/r/20200216191800.22423-1-aquini@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When running syzkaller tests, there is a UAF:
BUG: KASan: use after free in vgacon_invert_region+0x9d/0x110 at addr ffff880000100000
Read of size 2 by task syz-executor.1/16489
page:ffffea0000004000 count:0 mapcount:-127 mapping: (null)
index:0x0
page flags: 0xfffff00000000()
page dumped because: kasan: bad access detected
CPU: 1 PID: 16489 Comm: syz-executor.1 Not tainted
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
Call Trace:
[<ffffffffb119f309>] dump_stack+0x1e/0x20
[<ffffffffb04af957>] kasan_report+0x577/0x950
[<ffffffffb04ae652>] __asan_load2+0x62/0x80
[<ffffffffb090f26d>] vgacon_invert_region+0x9d/0x110
[<ffffffffb0a39d95>] invert_screen+0xe5/0x470
[<ffffffffb0a21dcb>] set_selection+0x44b/0x12f0
[<ffffffffb0a3bfae>] tioclinux+0xee/0x490
[<ffffffffb0a1d114>] vt_ioctl+0xff4/0x2670
[<ffffffffb0a0089a>] tty_ioctl+0x46a/0x1a10
[<ffffffffb052db3d>] do_vfs_ioctl+0x5bd/0xc40
[<ffffffffb052e2f2>] SyS_ioctl+0x132/0x170
[<ffffffffb11c9b1b>] system_call_fastpath+0x22/0x27
Memory state around the buggy address:
 ffff8800000fff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffff8800000fff80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffff880000100000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
It can be reproduced in the Linux mainline by the program:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/ioctl.h>
#include <linux/vt.h>
struct tiocl_selection {
unsigned short xs; /* X start */
unsigned short ys; /* Y start */
unsigned short xe; /* X end */
unsigned short ye; /* Y end */
unsigned short sel_mode; /* selection mode */
};
When resizing the screen, 'vc->vc_size_row' is updated to the new_row_size,
but when 'set_origin' is done in 'vgacon_set_origin', vgacon uses
'vga_vram_base' for 'vc_origin' and 'vc_visible_origin', not 'vc_screenbuf',
and it may be smaller than 'vc_screenbuf'. On TIOCLINUX, the new_row_size
is used to calculate the offset, which may be larger than vga_vram_size in
the vgacon driver, leading to a bad access.
Also, if a larger screenbuf is set first and then an even larger screenbuf
is set, a bad access may happen when copying old_origin to new_origin.
So, if the screen size is larger than vga_vram, resizing the screen should
fail. This also fixes CVE-2020-8649 and CVE-2020-8647.
Linus pointed out that overflow checking seems absent. We're saved by
the existing bounds checks in vc_do_resize() with rather strict
limits:
if (cols > VC_RESIZE_MAXCOL || lines > VC_RESIZE_MAXROW)
return -EINVAL;
Fixes: 0aec4867dca14 ("[PATCH] SVGATextMode fix")
Reference: CVE-2020-8647 and CVE-2020-8649 Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
[danvet: augment commit message to point out overflow safety] Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20200304022429.37738-1-zhangxiaoxu5@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewing a fresh portion of coverity defects in USB core
(specifically CID 1458999), Alan Stern noted below in [1]:
On Tue, Feb 25, 2020 at 02:39:23PM -0500, Alan Stern wrote:
> A revised search finds line 997 in drivers/usb/core/hub.c and lines
> 216, 269 in drivers/usb/core/port.c. (I didn't try looking in any
> other directories.) AFAICT all three of these should check the
> return value, although a error message in the kernel log probably
> isn't needed.
Factor out the usb_port_runtime_{resume,suspend}() changes into a
standalone patch to allow conflict-free porting on top of stable v3.9+.
Reviewing a fresh portion of coverity defects in USB core
(specifically CID 1458999), Alan Stern noted below in [1]:
On Tue, Feb 25, 2020 at 02:39:23PM -0500, Alan Stern wrote:
> A revised search finds line 997 in drivers/usb/core/hub.c and lines
> 216, 269 in drivers/usb/core/port.c. (I didn't try looking in any
> other directories.) AFAICT all three of these should check the
> return value, although a error message in the kernel log probably
> isn't needed.
Factor out the usb_remove_device() change into a standalone patch to
allow conflict-free integration on top of the earliest stable branches.
If there are TRBs pending during a reset endpoint operation, the
DMA will advance after the reset operation, which is not expected,
since the data is not yet available (for OUT, the data is not
yet available). After the data is ready, there won't be any
interrupt since EP_TRADDR already points to the next TRB entry
and the doorbell is not set.
To fix it, toggle the cycle bit before the reset operation and restore
it after the reset; this avoids an unexpected DMA advance because the
cycle bit belongs to software during the endpoint reset operation.
The dequeue TRB has been marked as a link TRB, but its next segment
pointer still points to itself, which causes the transfer to stall.
Fix it by setting its pointer to the TRB address for the next request.
If scatter-gather operation is allowed, a large USB request is split
into multiple TRBs. When preparing TRBs for the sg list, the driver
iterates over the list and creates a TRB for each sg, marking the chain
bit false for the last sg. The current IOMMU driver is clubbing the list
of sgs which share a page boundary into one and giving it to the USB
driver. With this, the number of sgs mapped is not equal to the number
of sgs passed. Because of this the USB driver is not marking the chain
bit false since it couldn't iterate to the last sg. This patch addresses
this issue by marking the chain bit false if it is the last mapped sg.
At a practical level, this patch resolves USB transfer stalls
seen with adb on dwc3 based db845c, pixel3 and other qcom
hardware after functionfs gadget added scatter-gather support
around v4.20.
Credit also to Anurag Kumar Vulisha <anurag.kumar.vulisha@xilinx.com>
who implemented a very similar fix to this issue.
Cc: Felipe Balbi <balbi@kernel.org> Cc: Yang Fei <fei.yang@intel.com> Cc: Thinh Nguyen <thinhn@synopsys.com> Cc: Tejas Joglekar <tejas.joglekar@synopsys.com> Cc: Andrzej Pietrasiewicz <andrzej.p@collabora.com> Cc: Jack Pham <jackp@codeaurora.org> Cc: Todd Kjos <tkjos@google.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Linux USB List <linux-usb@vger.kernel.org> Cc: stable <stable@vger.kernel.org> #4.20+ Signed-off-by: Pratham Pratap <prathampratap@codeaurora.org>
[jstultz: Slight tweak to remove sg_is_last() usage, reworked
commit message, minor comment tweak] Signed-off-by: John Stultz <john.stultz@linaro.org> Link: https://lore.kernel.org/r/20200302214443.55783-1-john.stultz@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 4d7201cda226 ("usb: usb251xb: add vdd supply support") didn't
cover the non-DT use-case and so the regulator_enable() call during
probe will fail on those platforms. Also the commit didn't handle the
error case correctly.
Move devm_regulator_get() out of usb251xb_get_ofdata() to address the
1st issue. This can be done without worries because devm_regulator_get()
handles the non-DT use-case too. Add devm_add_action_or_reset() to
address the 2nd bug.
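A sketch of the resulting probe-time pattern (illustrative names):
devm_regulator_get() also works without DT by returning a dummy regulator,
and devm_add_action_or_reset() takes care of the error and remove paths.

#include <linux/device.h>
#include <linux/err.h>
#include <linux/regulator/consumer.h>

static void vdd_disable_action(void *data)
{
	regulator_disable(data);
}

static int enable_vdd(struct device *dev)
{
	struct regulator *vdd = devm_regulator_get(dev, "vdd");
	int err;

	if (IS_ERR(vdd))
		return PTR_ERR(vdd);

	err = regulator_enable(vdd);
	if (err)
		return err;

	/* undo the enable on probe failure or device removal */
	return devm_add_action_or_reset(dev, vdd_disable_action, vdd);
}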
The current driver has 240 (USB 2.0) and 2048 (USB 3.0) as max_sectors,
e.g., /sys/bus/scsi/devices/0:0:0:0/max_sectors
If data access times out, the driver's error handling will issue a port
reset.
Sometimes the Samsung Fit (090C:1000) flash disk will not respond to a
later Set Address or Get Descriptor command.
Add this quirk to limit max_sectors to 64 sectors to avoid the issue
occurring.
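Such quirks live in unusual_devs.h; a sketch of what the entry looks like
(the bcdDevice range shown is illustrative, not taken from the actual
patch):

UNUSUAL_DEV(0x090c, 0x1000, 0x0000, 0x9999,
		"Samsung",
		"Flash Drive FIT",
		USB_SC_DEVICE, USB_PR_DEVICE, NULL,
		US_FL_MAX_SECTORS_64),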
To rename a file in SMB2 we open it with the DELETE access and do a
special SetInfo on it. If the handle is missing the DELETE bit the
server will fail the SetInfo with STATUS_ACCESS_DENIED.
We currently try to reuse any existing opened handle we have with
cifs_get_writable_path(). That function looks for handles with WRITE
access but doesn't check for DELETE, making rename() fail if it finds
a handle to reuse. Simple reproducer below.
To select handles with the DELETE bit, this patch adds a flag argument
to cifs_get_writable_path() and find_writable_file() and the existing
'bool fsuid_only' argument is converted to a flag.
The cifsFileInfo struct only stores the UNIX open mode but not the
original SMB access flags. Since the DELETE bit is not mapped in that
mode, this patch stores the access mask in cifs_fid on file open,
which is accessible from cifsFileInfo.
int main(int argc, char *argv[])
{
int fd, ret;
if (argc != 3) {
fprintf(stderr, "Usage: %s A B\n"
"create&open A in write mode, "
"rename A to B, close A\n", argv[0]);
return 0;
}
Fixes: 8de9e86c67ba ("cifs: create a helper to find a writeable handle by path name") CC: Stable <stable@vger.kernel.org> Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If from cifs_revalidate_dentry_attr() the SMB2/QUERY_INFO call fails with an
error, such as STATUS_SESSION_EXPIRED, causing the session to be reconnected,
it is possible we will leak -EAGAIN back to the application even for
system calls such as stat() where this is not a valid error.
Fix this by re-trying the operation from within cifs_revalidate_dentry_attr()
if cifs_get_inode_info*() returns -EAGAIN.
This fixes stat() and possibly also other system calls that uses
cifs_revalidate_dentry*().
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com> CC: Stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The initial patch that added sync_state() support didn't handle the case
where a supplier has no consumers. This was because when a device is
successfully bound with a driver, only its suppliers were checked to see
if they are eligible to get a sync_state(). This is not sufficient for
devices that have no consumers but still need to do device state clean
up. So fix this.
The Gigabyte X570 Aorus Master motherboard with ALC1220 codec
requires a similar workaround for Clevo laptops to enforce the
DAC/mixer connection path. Set up a quirk entry for that.
This is a necessary follow up to the first fix I proposed and we merged
in 2669b8b0c79 ("binder: prevent UAF for binderfs devices"). I have been
overly optimistic that the simple fix I proposed would work. But alas,
ihold() + iput() won't work since the inodes won't survive the
destruction of the superblock.
So all we get with my prior fix is a different race with a tinier
race-window but it doesn't solve the issue. Fwiw, the problem lies with
generic_shutdown_super(). It even has this cozy Al-style comment:
if (!list_empty(&sb->s_inodes)) {
printk("VFS: Busy inodes after unmount of %s. "
"Self-destruct in 5 seconds. Have a nice day...\n",
sb->s_id);
}
On binder_release(), binder_defer_work(proc, BINDER_DEFERRED_RELEASE) is
called which punts the actual cleanup operation to a workqueue. At some
point, binder_deferred_func() will be called which will end up calling
binder_deferred_release() which will retrieve and clean up the
binder_context attached to this struct binder_proc.
If we trace back where this binder_context is attached to binder_proc we
see that it is set in binder_open() and is taken from the struct
binder_device it is associated with. This obviously assumes that the
struct binder_device that context is attached to is _never_ freed. While
that might be true for devtmpfs binder devices it is most certainly
wrong for binderfs binder devices.
So, assume binder_open() is called on a binderfs binder device. We now
stash away the struct binder_context associated with that struct
binder_device:
proc->context = &binder_dev->context;
/* binderfs stashes devices in i_private */
if (is_binderfs_device(nodp)) {
binder_dev = nodp->i_private;
info = nodp->i_sb->s_fs_info;
binder_binderfs_dir_entry_proc = info->proc_log_dir;
} else {
.
.
.
proc->context = &binder_dev->context;
Now let's assume that the binderfs instance for that binder device is
shut down via umount() and/or the mount namespace associated with it goes
away. As long as there is still an fd open for that binderfs binder
device things are fine. But let's assume we now close the last fd for
that binderfs binder device. Now binder_release() is called and punts to
the workqueue. Assume that the workqueue has quite a bit of stuff to do
and doesn't get to cleaning up the struct binder_proc and the associated
struct binder_context with it for that binderfs binder device right
away. In the meantime, the VFS is killing the super block and is
ultimately calling sb->evict_inode() which means it will call
binderfs_evict_inode() which does:
thereby freeing the struct binder_device including struct
binder_context.
Now the workqueue finally has time to get around to cleaning up struct
binder_proc and is now trying to access the associated struct
binder_context. Since it's already freed it will OOPS.
Fix this by introducing a refcount on binder devices.
This is an alternative fix to 51d8a7eca677 ("binder: prevent UAF read in
print_binder_transaction_log_entry()").
Fixes: 3ad20fe393b3 ("binder: implement binderfs") Fixes: 2669b8b0c798 ("binder: prevent UAF for binderfs devices") Fixes: 03e2e07e3814 ("binder: Make transaction_log available in binderfs")
Related : 51d8a7eca677 ("binder: prevent UAF read in print_binder_transaction_log_entry()") Cc: stable@vger.kernel.org Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Acked-by: Todd Kjos <tkjos@google.com> Link: https://lore.kernel.org/r/20200303164340.670054-1-christian.brauner@ubuntu.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On binder_release(), binder_defer_work(proc, BINDER_DEFERRED_RELEASE) is
called which punts the actual cleanup operation to a workqueue. At some
point, binder_deferred_func() will be called which will end up calling
binder_deferred_release() which will retrieve and clean up the
binder_context attached to this struct binder_proc.
If we trace back where this binder_context is attached to binder_proc we
see that it is set in binder_open() and is taken from the struct
binder_device it is associated with. This obviously assumes that the
struct binder_device that context is attached to is _never_ freed. While
that might be true for devtmpfs binder devices it is most certainly
wrong for binderfs binder devices.
So, assume binder_open() is called on a binderfs binder device. We now
stash away the struct binder_context associated with that struct
binder_device:
proc->context = &binder_dev->context;
/* binderfs stashes devices in i_private */
if (is_binderfs_device(nodp)) {
binder_dev = nodp->i_private;
info = nodp->i_sb->s_fs_info;
binder_binderfs_dir_entry_proc = info->proc_log_dir;
} else {
.
.
.
proc->context = &binder_dev->context;
Now let's assume that the binderfs instance for that binder device is
shut down via umount() and/or the mount namespace associated with it goes
away. As long as there is still an fd open for that binderfs binder
device things are fine. But let's assume we now close the last fd for
that binderfs binder device. Now binder_release() is called and punts to
the workqueue. Assume that the workqueue has quite a bit of stuff to do
and doesn't get to cleaning up the struct binder_proc and the associated
struct binder_context with it for that binderfs binder device right
away. In the meantime, the VFS is killing the super block and is
ultimately calling sb->evict_inode() which means it will call
binderfs_evict_inode() which does:
thereby freeing the struct binder_device including struct
binder_context.
Now the workqueue finally has time to get around to cleaning up struct
binder_proc and is now trying to access the associated struct
binder_context. Since it's already freed it will OOPS.
Fix this by holding an additional reference to the inode that is only
released once the workqueue is done cleaning up struct binder_proc. This
is an easy alternative to introducing separate refcounting on struct
binder_device which we can always do later if it becomes necessary.
This is an alternative fix to 51d8a7eca677 ("binder: prevent UAF read in
print_binder_transaction_log_entry()").
Fixes: 3ad20fe393b3 ("binder: implement binderfs") Fixes: 03e2e07e3814 ("binder: Make transaction_log available in binderfs")
Related : 51d8a7eca677 ("binder: prevent UAF read in print_binder_transaction_log_entry()") Cc: stable@vger.kernel.org Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Acked-by: Todd Kjos <tkjos@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The SCU requires that all message words are written sequentially, but the
Linux MU driver implements multiple independent channels for each register,
so ordering between different channels must be ensured by the SCU API
interface.
Wait for tx_done before every send to ensure that no queueing happens at the
mailbox channel level.
0x11 and 0x12 set the ECN bits based on RFC2474, it would be better to avoid
that. 0x14 and 0x18 would be better and works as well.
Reported-by: Petr Machata <petrm@mellanox.com> Fixes: 4e867c9a50ff ("selftests: forwarding: vxlan_bridge_1d: fix tos value") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
During ftrace init, Linux will replace all function prologues
(call_mcount) with nops, but this needs flush_dcache and
invalidate_icache to make it work. So the flush_cache functions must not
themselves be traced (nested-called) by the ftrace framework.
In the past, we didn't care about the kernel sp when saving pt_regs. But in
some cases, we still need pt_regs->usp to represent the kernel stack before
entering the exception.
For cmpxchg in atomic.S, we need to save and restore usp for the above.
There is no present bit in the csky pmd hardware, so we need to prepare
invalid_pte_table for empty pmd entries, and the functions (pmd_none &
pmd_present) in pgtable.h need invalid_pte_table to get their result. If a
module uses these functions, we need to export the symbol for it.
While it is not yet understood why a TX underflow can easily occur
for SGMII interfaces, resulting in a TX wedge, it has been found that
disabling/re-enabling the LMAC resolves the issue.
Signed-off-by: Tim Harvey <tharvey@gateworks.com> Reviewed-by: Robert Jones <rjones@gateworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Variables declared in a switch statement before any case statements
cannot be automatically initialized with compiler instrumentation (as
they are not part of any execution flow). With GCC's proposed automatic
stack variable initialization feature, this triggers a warning (and they
don't get initialized). Clang's automatic stack variable initialization
(via CONFIG_INIT_STACK_ALL=y) doesn't throw a warning, but it also
doesn't initialize such variables[1]. Note that these warnings (or silent
skipping) happen before the dead-store elimination optimization phase,
so even when the automatic initializations are later elided in favor of
direct initializations, the warnings remain.
To avoid these problems, move such variables into the "case" where
they're used or lift them up into the main function body.
arch/x86/xen/enlighten_pv.c: In function ‘xen_write_msr_safe’:
arch/x86/xen/enlighten_pv.c:904:12: warning: statement will never be executed [-Wswitch-unreachable]
904 | unsigned which;
| ^~~~~
[1] https://bugs.llvm.org/show_bug.cgi?id=44916
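An illustrative example of the pattern (not the actual Xen code):

/* Illustrative only. */
static int classify_msr(unsigned int msr)
{
	unsigned int which = 0;	/* lifted out of the switch body */

	switch (msr) {
	/*
	 * A declaration placed here, before the first case label, is not on
	 * any execution path, so an automatic initializer attached to it
	 * would never run -- hence GCC's -Wswitch-unreachable warning.
	 */
	case 0x174:		/* MSR_IA32_SYSENTER_CS, for example */
		which = 1;
		break;
	default:
		which = 2;
		break;
	}
	return which;
}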
Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20200220062318.69299-1-keescook@chromium.org Reviewed-by: Juergen Gross <jgross@suse.com>
[boris: made @which an 'unsigned int'] Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Remove the code that tells the OS that the link is going down when the
user changes flow control via ethtool. When the link is up it isn't
certain that the link goes down after the 0x0605 AQ command. If the link
doesn't go down, the OS thinks that the link is down, but the physical
link is up. To reset this state the user has to take the interface down
and up.
If the link does go down after the 0x0605 command, the FW sends
information about that and after that the driver tells the OS that the
link goes down. So this code in ethtool is unnecessary.
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
gcc may detect a false positive on nvme using an uninitialized variable
if setting features fails. Since this is not a fast path, explicitly
initialize this variable to suppress the warning.
Reported-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
GCC 10 changed the default to -fno-common, which leads to
LD arch/x86/boot/compressed/vmlinux
ld: arch/x86/boot/compressed/pgtable_64.o:(.bss+0x0): multiple definition of `__force_order'; \
arch/x86/boot/compressed/kaslr_64.o:(.bss+0x0): first defined here
make[2]: *** [arch/x86/boot/compressed/Makefile:119: arch/x86/boot/compressed/vmlinux] Error 1
Since __force_order is already provided in pgtable_64.c, there is no
need to declare __force_order in kaslr_64.c.
The Samsung SSD SM981/PM981 and Toshiba SSD KBG40ZNT256G on the Lenovo
C640 platform experience runtime resume issues when the SSDs are kept in
sleep/suspend mode for long time.
This patch applies the 'Simple Suspend' quirk to these configurations.
With this patch, the issue had not been observed in a 1+ day test.
Reviewed-by: Jon Derrick <jonathan.derrick@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Shyjumon N <shyjumon.n@intel.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
If CONFIG_LOAD_UEFI_KEYS is enabled, the kernel attempts to load the certs
from the db, dbx and MokListRT EFI variables into the appropriate keyrings.
But it just assumes that the variables will be present and prints an error
if the certs can't be loaded, even when it is possible that the variables may
not exist. For example the MokListRT variable will only be present if shim
is used.
So only print an error message about failing to get the certs list from an
EFI variable if the variable is found. Otherwise these printed errors just pollute
the kernel log ring buffer with confusing messages like the following:
[ 5.427251] Couldn't get size: 0x800000000000000e
[ 5.427261] MODSIGN: Couldn't get UEFI db list
[ 5.428012] Couldn't get size: 0x800000000000000e
[ 5.428023] Couldn't get UEFI MokListRT
Reported-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Tested-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
For the same reason as commit 19514fc665ff ("arm, kbuild: make "make
install" not depend on vmlinux"), the install targets should never
trigger the rebuild of the kernel.
The variable, CONFIGURE, is not set by anyone. Remove it as well.
The da9062 hw has a minimum ping cool down phase of at least 200ms. The
driver takes that into account by setting the min_hw_heartbeat_ms to
300ms and the core guarantees that the hw limit is observed for the
ping() calls. But the core can't guarantee the required minimum ping
cool down phase if a stop() command is sent immediately after the ping()
command. So it is not allowed to ping the watchdog within the stop()
command as the driver does. Removing the ping can be done without doubts
because the watchdog gets disabled anyway and a (re)start resets the
watchdog counter too.
The call to of_get_mac_address() can return -EPROBE_DEFER, for instance
when the MAC address is read from a NVMEM driver that did not probe yet.
Cc: H. Nikolaus Schaller <hns@goldelico.com> Cc: Mathieu Malaterre <malat@debian.org> Signed-off-by: Paul Cercueil <paul@crapouillou.net> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The Micrel KSZ8851-16MLLI datasheet DS00002357B page 12 states that
BE[3:0] signals are active high. This contradicts the measurements
of the behavior of the actual chip, where these signals behave as
active low. For example, to read the CIDER register, the bus must
expose 0xc0c0 during the address phase, which means BE[3:0]=4'b1100.
Signed-off-by: Marek Vasut <marex@denx.de> Cc: David S. Miller <davem@davemloft.net> Cc: Lukas Wunner <lukas@wunner.de> Cc: Petr Stetiar <ynezz@true.cz> Cc: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The packet data written to and read from Micrel KSZ8851-16MLLI must be
byte-swapped in 16-bit mode, add this byte-swapping.
Signed-off-by: Marek Vasut <marex@denx.de> Cc: David S. Miller <davem@davemloft.net> Cc: Lukas Wunner <lukas@wunner.de> Cc: Petr Stetiar <ynezz@true.cz> Cc: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
This driver is mixing 8-bit and 16-bit bus accessors for reasons unknown,
however the speculation is that this was some sort of attempt to support
the 8-bit bus mode.
As per the KS8851-16MLL documentation, both registers accessed via the
8-bit accessors are internally 16-bit registers, so reading them using
16-bit accessors is fine. The KS_CCR read can be converted to 16-bit read
outright, as it is already a concatenation of two 8-bit reads of that
register. The KS_RXQCR accesses are 8-bit only, however writing the top
8 bits of the register is OK as well, since the driver caches the entire
16-bit register value anyway.
Finally, the driver is not used by any hardware in the kernel right now.
The only hardware available to me is one with 16-bit bus, so I have no
way to test the 8-bit bus mode, however it is unlikely this ever really
worked anyway. If the 8-bit bus mode is ever required, it can be easily
added by adjusting the 16-bit accessors to do 2 consecutive accesses,
which is how this should have been done from the beginning.
Signed-off-by: Marek Vasut <marex@denx.de> Cc: David S. Miller <davem@davemloft.net> Cc: Lukas Wunner <lukas@wunner.de> Cc: Petr Stetiar <ynezz@true.cz> Cc: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>