If two or more mappings are adjacent to each other, they can be passed
into io_uring and registered as a single registered buffer. That even
works if the mappings come from different sources, e.g. anon pages can
be mixed this way with pages from shmem or hugetlb. That is not a
problem per se, but it would be less error prone if we forbid such mixing.
If the kernel is configured with CONFIG_PREEMPT_NONE, we could be
sitting in a tight loop reaping events but never giving them a chance to
finish. This results in a trace like:
Just like for task_work, set the task state to TASK_RUNNING before doing
potential resume work. We're not holding any locks at this point,
but we may have already set the task state to TASK_INTERRUPTIBLE in
preparation for going to sleep waiting for events. Ensure that we set it
back to TASK_RUNNING if we have work to process, to avoid warnings on
calling blocking operations with !TASK_RUNNING.
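A minimal sketch of the idea, assuming the TIF_NOTIFY_RESUME handling added by the commit referenced in the Fixes tag below; this is not the exact io_uring code:

  if (test_thread_flag(TIF_NOTIFY_RESUME)) {
          /* we may already be TASK_INTERRUPTIBLE from preparing to wait */
          __set_current_state(TASK_RUNNING);
          resume_user_mode_work(NULL);
  }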
Fixes: b5d3ae202fbf ("io_uring: handle TIF_NOTIFY_RESUME when checking for task_work") Reported-by: kernel test robot <oliver.sang@intel.com> Link: https://lore.kernel.org/oe-lkp/202302062208.24d3e563-oliver.sang@intel.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If TIF_NOTIFY_RESUME is set, then we need to call resume_user_mode_work()
for PF_IO_WORKER threads. They never return to usermode, hence never get
a chance to process any items that are marked by this flag. Most notably
this includes the final put of files, but also any throttling markers set
by block cgroups.
When preparing an AES-CTR request, the driver copies the key provided by
the user into a data structure that is accessible by the firmware.
If the target device is QAT GEN4, the key size is rounded up by 16 since
a rounded up size is expected by the device.
If the key size is rounded up before the copy, the size used for copying
the key might be bigger than the size of the region containing the key,
causing an out-of-bounds read.
Fix by doing the copy first and then updating the keylen.
This is to fix the following warning reported by KASAN:
[ 138.150574] BUG: KASAN: global-out-of-bounds in qat_alg_skcipher_init_com.isra.0+0x197/0x250 [intel_qat]
[ 138.150641] Read of size 32 at addr ffffffff88c402c0 by task cryptomgr_test/2340
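A sketch of the reordering described above; the structure and flag names (cd->ucs_aes.key, aes_v2_capable, the CTR-mode check) are assumptions about the driver's internals, not the exact diff:

  /* copy using the caller-provided key length first ... */
  memcpy(cd->ucs_aes.key, key, keylen);

  /* ... and only then round the length up to 16 for QAT GEN4 */
  if (aes_v2_capable && mode == ICP_QAT_HW_CIPHER_CTR_MODE)
          keylen = round_up(keylen, 16);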
Hierarchical domains created using irq_domain_create_hierarchy() are
currently added to the domain list before having been fully initialised.
This specifically means that a racing allocation request might fail to
allocate irq data for the inner domains of a hierarchy in case the
parent domain pointer has not yet been set up.
Note that this is not really an issue for irqchip drivers that are
registered early (e.g. via IRQCHIP_DECLARE() or IRQCHIP_ACPI_DECLARE())
but could potentially cause trouble with drivers that are registered
later (e.g. modular drivers using IRQCHIP_PLATFORM_DRIVER_BEGIN(),
gpiochip drivers, etc.).
Fixes: afb7da83b9f4 ("irqdomain: Introduce helper function irq_domain_add_hierarchy()") Cc: stable@vger.kernel.org # 3.19 Signed-off-by: Marc Zyngier <maz@kernel.org>
[ johan: add commit message ] Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230213104302.17307-8-johan+linaro@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Avoid looking for an existing mapping twice when creating a new mapping
using irq_create_fwspec_mapping() by factoring out the actual allocation
which is shared with irq_create_mapping_affinity().
The new helper function will also be used to fix a shared-interrupt
mapping race, hence the Fixes tag.
Fixes: b62b2cf5759b ("irqdomain: Fix handling of type settings for existing mappings") Cc: stable@vger.kernel.org # 4.8 Tested-by: Hsin-Yi Wang <hsinyi@chromium.org> Tested-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230213104302.17307-5-johan+linaro@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The sanity check for an already mapped virq is done outside of the
irq_domain_mutex-protected section which means that an (unlikely) racing
association may not be detected.
Fix this by factoring out the association implementation, which will
also be used in a follow-on change to fix a shared-interrupt mapping
race.
Commit 98de59bfe4b2f ("take calculation of final prot in
security_mmap_file() into a helper") moved the code to update prot, to be
the actual protections applied to the kernel, to a new helper called
mmap_prot().
However, while without the helper ima_file_mmap() was getting the updated
prot, with the helper ima_file_mmap() gets the original prot, which
contains the protections requested by the application.
A possible consequence of this change is that, if an application calls
mmap() with only PROT_READ, and the kernel applies PROT_EXEC in addition,
that application would have access to executable memory without having this
event recorded in the IMA measurement list. This situation would occur for
example if the application, before mmap(), calls the personality() system
call with READ_IMPLIES_EXEC as the first argument.
Align ima_file_mmap() parameters with those of the mmap_file LSM hook, so
that IMA can receive both the requested prot and the final prot. Since the
requested protections are stored in a new variable, and the final
protections are stored in the existing variable, this effectively restores
the original behavior of the MMAP_CHECK hook.
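A sketch of the aligned hook shape described above (argument names are illustrative): IMA now sees both the protections the application asked for and the final ones computed by the kernel, and bases the MMAP_CHECK measurement on the final prot.

  int ima_file_mmap(struct file *file, unsigned long reqprot,
                    unsigned long prot, unsigned long flags);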
Cc: stable@vger.kernel.org Fixes: 98de59bfe4b2 ("take calculation of final prot in security_mmap_file() into a helper") Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When plain IBRS is enabled (not enhanced IBRS), the logic in
spectre_v2_user_select_mitigation() determines that STIBP is not needed.
The IBRS bit implicitly protects against cross-thread branch target
injection. However, with legacy IBRS, the IBRS bit is cleared on
returning to userspace for performance reasons which leaves userspace
threads vulnerable to cross-thread branch target injection against which
STIBP protects.
Exclude IBRS from the spectre_v2_in_ibrs_mode() check to allow for
enabling STIBP (through seccomp/prctl() by default or always-on, if
selected by spectre_v2_user kernel cmdline parameter).
The AMD side of the loader has always claimed to support mixed
steppings. But somewhere along the way, it broke that by assuming that
the cached patch blob is a single one instead of it being one per
*node*.
So turn it into a per-node one so that each node can stash the blob
relevant for it.
[ NB: Fixes tag is not really the exactly correct one but it is good
enough. ]
Fixes: fe055896c040 ("x86/microcode: Merge the early microcode loader") Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: <stable@kernel.org> # 2355370cd941 ("x86/microcode/amd: Remove load_microcode_amd()'s bsp parameter") Cc: <stable@kernel.org> # a5ad92134bd1 ("x86/microcode/AMD: Add a @cpu parameter to the reloading functions") Link: https://lore.kernel.org/r/20230130161709.11615-4-bp@alien8.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When arch_prepare_optimized_kprobe calculates the jump destination address,
it copies the original instructions from the jmp-optimized kprobe (see
__recover_optprobed_insn) and computes the destination based on the length
of the original instruction.
arch_check_optimized_kprobe does not check KPROBE_FLAG_OPTIMIZED when
checking whether a jmp-optimized kprobe exists.
As a result, setup_detour_execution may jump into a range that has been
overwritten by the jump destination address, resulting in an invalid opcode error.
For example, assume that we register two kprobes, at <func+9> and <func+11>,
in the "func" function.
1. Register the kprobe for <func+11>, assume that is kp1; the corresponding optimized_kprobe is op1, which becomes jmp-optimized.
2. Register the kprobe for <func+9>, assume that is kp2, corresponding optimized_kprobe is op2.
register_kprobe(kp2)
register_aggr_kprobe
alloc_aggr_kprobe
__prepare_optimized_kprobe
arch_prepare_optimized_kprobe
__recover_optprobed_insn // copy original bytes from kp1->optinsn.copied_insn,
// jump address = <func+14>
3. disable kp1:
disable_kprobe(kp1)
__disable_kprobe
...
if (p == orig_p || aggr_kprobe_disabled(orig_p)) {
ret = disarm_kprobe(orig_p, true) // add op1 in unoptimizing_list, not unoptimized
orig_p->flags |= KPROBE_FLAG_DISABLED; // op1->flags == KPROBE_FLAG_OPTIMIZED | KPROBE_FLAG_DISABLED
...
4. unregister kp2
__unregister_kprobe_top
...
if (!kprobe_disabled(ap) && !kprobes_all_disarmed) {
optimize_kprobe(op)
...
if (arch_check_optimized_kprobe(op) < 0) // because op1 has KPROBE_FLAG_DISABLED, this does not return here
return;
p->kp.flags |= KPROBE_FLAG_OPTIMIZED; // now op2 has KPROBE_FLAG_OPTIMIZED
}
commit f66c0447cca1 ("kprobes: Set unoptimized flag after unoptimizing code")
modified the update timing of the KPROBE_FLAG_OPTIMIZED, a optimized_kprobe
may be in the optimizing or unoptimizing state when op.kp->flags
has KPROBE_FLAG_OPTIMIZED and op->list is not empty.
The __recover_optprobed_insn check logic is incorrect, a kprobe in the
unoptimizing state may be incorrectly determined as unoptimizing.
As a result, incorrect instructions are copied.
The optprobe_queued_unopt() function needs to be exported so that it can be
invoked from the arch directory.
Link: https://lore.kernel.org/all/20230216034247.32348-2-yangjihong1@huawei.com/ Fixes: f66c0447cca1 ("kprobes: Set unoptimized flag after unoptimizing code") Cc: stable@vger.kernel.org Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Disable SVM and more importantly force GIF=1 when halting a CPU or
rebooting the machine. Similar to VMX, SVM allows software to block
INITs via CLGI, and thus can be problematic for a crash/reboot. The
window for failure is smaller with SVM as INIT is only blocked while
GIF=0, i.e. between CLGI and STGI, but the window does exist.
Fixes: fba4f472b33a ("x86/reboot: Turn off KVM when halting a CPU") Cc: stable@vger.kernel.org Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20221130233650.1404148-5-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Disable SVM on all CPUs via NMI shootdown during an emergency reboot.
Like VMX, SVM can block INIT, e.g. if the emergency reboot is triggered
between CLGI and STGI, and thus can prevent bringing up other CPUs via
INIT-SIPI-SIPI.
Disable virtualization in crash_nmi_callback() and rework the
emergency_vmx_disable_all() path to do an NMI shootdown if and only if a
shootdown has not already occurred. NMI crash shootdown fundamentally
can't support multiple invocations as responding CPUs are deliberately
put into halt state without unblocking NMIs. But, the emergency reboot
path doesn't have any work of its own, it simply cares about disabling
virtualization, i.e. so long as a shootdown occurred, emergency reboot
doesn't care who initiated the shootdown, or when.
If "crash_kexec_post_notifiers" is specified on the kernel command line,
panic() will invoke crash_smp_send_stop() and result in a second call to
nmi_shootdown_cpus() during native_machine_emergency_restart().
Invoke the callback _before_ disabling virtualization, as the current
VMCS needs to be cleared before doing VMXOFF. Note, this results in a
subtle change in ordering between disabling virtualization and stopping
Intel PT on the responding CPUs. While VMX and Intel PT do interact,
VMXOFF and writes to MSR_IA32_RTIT_CTL do not induce faults between one
another, which is all that matters when panicking.
Harden nmi_shootdown_cpus() against multiple invocations to try and
capture any such kernel bugs via a WARN instead of hanging the system
during a crash/dump, e.g. prior to the recent hardening of
register_nmi_handler(), re-registering the NMI handler would trigger a
double list_add() and hang the system if CONFIG_BUG_ON_DATA_CORRUPTION=y.
Extract the disabling logic to a common helper to deduplicate code, and
to prepare for doing the shootdown in the emergency reboot path if SVM
is supported.
Note, prior to commit ed72736183c4 ("x86/reboot: Force all cpus to exit
VMX root if VMX is supported"), nmi_shootdown_cpus() was subtly protected
against a second invocation by a cpu_vmx_enabled() check as the kdump
handler would disable VMX if it ran first.
Fixes: ed72736183c4 ("x86/reboot: Force all cpus to exit VMX root if VMX is supported") Cc: stable@vger.kernel.org Reported-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/all/20220427224924.592546-2-gpiccoli@igalia.com Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20221130233650.1404148-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Set GIF=1 prior to disabling SVM to ensure that INIT is recognized if the
kernel is disabling SVM in an emergency, e.g. if the kernel is about to
jump into a crash kernel or may reboot without doing a full CPU RESET.
If GIF is left cleared, the new kernel (or firmware) will be unable to
wake the APs. Eat faults on STGI (due to EFER.SVME=0) as it's possible
that SVM could be disabled via NMI shootdown between reading EFER.SVME
and executing STGI.
Migration mode is a VM attribute which enables tracking of changes in
storage attributes (PGSTE). It assumes dirty tracking is enabled on all
memslots to keep a dirty bitmap of pages with changed storage attributes.
When enabling migration mode, we currently check that dirty tracking is
enabled for all memslots. However, userspace can disable dirty tracking
without disabling migration mode.
Since migration mode is pointless with dirty tracking disabled, disable
migration mode whenever userspace disables dirty tracking on any slot.
Also update the documentation to clarify that dirty tracking must be
enabled when enabling migration mode, which is already enforced by the
code in kvm_s390_vm_start_migration().
Also highlight in the documentation for KVM_S390_GET_CMMA_BITS that it
can now fail with -EINVAL when dirty tracking is disabled while
migration mode is on. Move all the error codes to a table so this stays
readable.
To disable migration mode, slots_lock should be held, which is taken
in kvm_set_memory_region() and thus held in
kvm_arch_prepare_memory_region().
Restructure the prepare code a bit so all the sanity checking is done
before disabling migration mode. This ensures migration mode isn't
disabled when some sanity check fails.
This "(unknown) (section: .init.data)" all refer to svm_x86_ops.
Tag svm_hv_hardware_setup() with __init to fix a modpost warning as the
non-stub implementation accesses __initdata (svm_x86_ops), i.e. would
generate a use-after-free if svm_hv_hardware_setup() were actually invoked
post-init. The helper is only called from svm_hardware_setup(), which is
also __init, i.e. lack of __init is benign other than the modpost warning.
Fixes: 1e0c7d40758b ("KVM: SVM: hyper-v: Remote TLB flush for SVM") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Vineeth Pillai <viremana@linux.microsoft.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: kvm@vger.kernel.org Cc: stable@vger.kernel.org Reviewed-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20230222073315.9081-1-rdunlap@infradead.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
KVM_SEV_SEND_UPDATE_DATA and KVM_SEV_RECEIVE_UPDATE_DATA have an integer
overflow issue. params.guest_len and offset are both 32 bits wide; with a
large params.guest_len, the check to confirm a page boundary is not
crossed can falsely pass:
/* Check if we are crossing the page boundary */
offset = params.guest_uaddr & (PAGE_SIZE - 1);
if ((params.guest_len + offset > PAGE_SIZE))
Add an additional check to confirm that params.guest_len itself is not
greater than PAGE_SIZE.
Note, this isn't a security concern as overflow can happen if and only if
params.guest_len is greater than 0xfffff000, and the FW spec says these
commands fail with lengths greater than 16KB, i.e. the PSP will detect
KVM's goof.
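A small standalone demonstration of the wrap-around (the values are hypothetical, chosen only to show that a guest_len above 0xfffff000 defeats the original check):

  #include <stdio.h>
  #include <stdint.h>

  #define PAGE_SIZE 4096u

  int main(void)
  {
          uint32_t guest_len = 0xfffff800u;        /* > 0xfffff000, hypothetical */
          uint32_t offset    = 0x900u;             /* guest_uaddr & (PAGE_SIZE - 1) */
          uint32_t sum       = guest_len + offset; /* wraps to 0x100 in 32 bits */

          if (sum > PAGE_SIZE)
                  puts("old check: crossing detected");
          else
                  puts("old check: falsely passes because of the 32-bit wrap");

          if (guest_len > PAGE_SIZE)
                  puts("added check: request rejected");

          return 0;
  }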
Fixes: 15fb7de1a7f5 ("KVM: SVM: Add KVM_SEV_RECEIVE_UPDATE_DATA command") Fixes: d3d1af85e2c7 ("KVM: SVM: Add KVM_SEND_UPDATE_DATA command") Reported-by: Andy Nguyen <theflow@google.com> Suggested-by: Thomas Lendacky <thomas.lendacky@amd.com> Signed-off-by: Peter Gonda <pgonda@google.com> Cc: David Rientjes <rientjes@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sean Christopherson <seanjc@google.com> Cc: kvm@vger.kernel.org Cc: stable@vger.kernel.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://lore.kernel.org/r/20230207171354.4012821-1-pgonda@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Destroy and free the target coalesced MMIO device if unregistering said
device fails. As clearly noted in the code, kvm_io_bus_unregister_dev()
does not destroy the target device.
When we append a new block just after the end of a preallocated extent, the
code in inode_getblk() wrongly determined that we're going to use the
preallocated extent, which resulted in adding the block at a wrong logical
offset in the file. A sequence like this manifests it:
The code that determined the use of preallocated extent is actually
stale because udf_do_extend_file() does not create preallocation anymore
so after calling that function we are sure there's no usable
preallocation. Just remove the faulty condition.
CC: stable@vger.kernel.org Fixes: 16d055656814 ("udf: Discard preallocation before extending file with a hole") Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When UDF filesystem is corrupted, hidden system inodes can be linked
into directory hierarchy which is an avenue for further serious
corruption of the filesystem and kernel confusion as noticed by syzbot
fuzzed images. Refuse to access system inodes linked into directory
hierarchy and vice versa.
CC: stable@vger.kernel.org Reported-by: syzbot+38695a20b8addcbc1084@syzkaller.appspotmail.com Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
System files in a UDF filesystem have link count 0. To avoid confusing the
VFS, we fudge the link count to 1 when reading such inodes; however, we
forget to restore the link count of 0 when writing them back. Fix that.
CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When a write to an inline file fails (or succeeds only partially), we still
update the length of the inline data as if the whole write succeeded. Fix
the update of the inline data length to happen only if the write succeeds.
Reported-by: syzbot+0937935b993956ba28ab@syzkaller.appspotmail.com CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When merging very long extents, we try to push as much length as possible
to the first extent. However, this is unnecessarily complicated and not
really worth the trouble. Furthermore, there was a bug in the logic
resulting in corrupted extents in the file, as a syzbot reproducer shows.
So just don't bother merging extents whose combined length would be too
long.
CC: stable@vger.kernel.org Reported-by: syzbot+60f291a24acecb3c2bd5@syzkaller.appspotmail.com Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When a file expansion fails because we don't have enough space for
indirect extents, make sure we truncate the extents created so far so that
we don't leave extents beyond EOF.
CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Update ptrace tests according to all potential Yama security policies.
This is required to make such tests pass even if Yama is enabled.
Tests are not skipped but they now check both Landlock and Yama boundary
restrictions at run time to keep a maximum test coverage (i.e. positive
and negative testing).
Signed-off-by: Jeff Xu <jeffxu@google.com> Link: https://lore.kernel.org/r/20230114020306.1407195-2-jeffxu@google.com Cc: stable@vger.kernel.org
[mic: Add curly braces around EXPECT_EQ() to make it build, and improve
commit message] Co-developed-by: Mickaël Salaün <mic@digikod.net> Signed-off-by: Mickaël Salaün <mic@digikod.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
overlayfs may be disabled in the kernel configuration, causing related
tests to fail. Check that overlayfs is supported at runtime, so we can
skip layout2_overlay.* accordingly.
This fixes three issues in the move extents ioctl when auto defrag is not used:
a) In ocfs2_find_victim_alloc_group(), we have to convert bits to blocks
first in the case of the global bitmap.
b) In ocfs2_probe_alloc_group(), when enough bits are found in the block
group bitmap, we have to back off move_len to the start position as well,
otherwise it may corrupt the filesystem.
c) In ocfs2_ioctl_move_extents(), set me_threshold for both the non-auto
and auto defrag paths. Otherwise move_max_hop will be set to 0, finally
causing an unexpected ENOSPC error.
Currently there are no tools triggering the above issues since
defragfs.ocfs2 enables auto defrag by default. Tested by manually
changing defragfs.ocfs2 to run the non-auto defrag path.
Link: https://lkml.kernel.org/r/20230220050526.22020-1-heming.zhao@suse.com Signed-off-by: Heming Zhao <heming.zhao@suse.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This bug has the same root cause as commit 7f27ec978b0e ("ocfs2: call
ocfs2_journal_access_di() before ocfs2_journal_dirty() in
ocfs2_write_end_nolock()"). For this bug, jbd2_journal_restart() is
called by ocfs2_split_extent() during defragmenting.
How to fix:
ocfs2_split_extent() can handle journal operations entirely by itself.
The caller doesn't need to issue the journal access/dirty pair; it only
needs the journal start/stop pair. So the fix is to remove the journal
access/dirty calls from __ocfs2_move_extent().
The discussion for this patch:
https://oss.oracle.com/pipermail/ocfs2-devel/2023-February/000647.html
Link: https://lkml.kernel.org/r/20230217003717.32469-1-heming.zhao@suse.com Signed-off-by: Heming Zhao <heming.zhao@suse.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When writing a page from an encrypted file that is using
filesystem-layer encryption (not inline encryption), f2fs encrypts the
pagecache page into a bounce page, then writes the bounce page.
It also passes the bounce page to wbc_account_cgroup_owner(). That's
incorrect, because the bounce page is a newly allocated temporary page
that doesn't have the memory cgroup of the original pagecache page.
This makes wbc_account_cgroup_owner() not account the I/O to the owner
of the pagecache page as it should.
Fix this by always passing the pagecache page to
wbc_account_cgroup_owner().
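A sketch of the shape of the fix, assuming f2fs's usual f2fs_io_info fields (fio->io_wbc, fio->page, fio->op); not the exact diff:

  /* charge the write to the owner of the original pagecache page,
   * never to the temporary bounce page that holds the encrypted data
   */
  if (fio->io_wbc && !is_read_io(fio->op))
          wbc_account_cgroup_owner(fio->io_wbc, fio->page, PAGE_SIZE);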
When converting an inline directory to a regular one, f2fs is leaking
uninitialized memory to disk because it doesn't initialize the entire
directory block. Fix this by zero-initializing the block.
This bug was introduced by commit 4ec17d688d74 ("f2fs: avoid unneeded
initializing when converting inline dentry"), which didn't consider the
security implications of leaking uninitialized memory to disk.
This was found by running xfstest generic/435 on a KMSAN-enabled kernel.
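A sketch of the idea (the block pointer name and size macro are assumptions):

  /* zero the whole destination block up front so that no uninitialized
   * memory from the allocation can ever be written to disk
   */
  memset(dentry_blk, 0, F2FS_BLKSIZE);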
Fixes: 4ec17d688d74 ("f2fs: avoid unneeded initializing when converting inline dentry") Cc: <stable@vger.kernel.org> # v4.3+ Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch sends an ack back for a received FIN message only when we are
in a valid state. In other cases, if there is a sender waiting for an ack,
we just let it time out on the sender's side, and hopefully all other
cleanups will remove the FIN message from its sending queue. As an example,
we should never send out an ACK while in LAST_ACK state, and we cannot
assume working socket communication when we are in CLOSED state.
Cc: stable@vger.kernel.org Fixes: 489d8e559c65 ("fs: dlm: add reliable connection if reconnect") Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch moves the send-FIN handling, which should occur on a specific
state change, into the state change handling while the per-node
state_lock is held. I experienced issues with other messages because
we changed the state and a FIN message was then sent out in a different state.
Cc: stable@vger.kernel.org Fixes: 489d8e559c65 ("fs: dlm: add reliable connection if reconnect") Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Similar to the stop tx flag, the rx flag should warn about a dlm message
being received at the DLM_FIN state change, when we assume there are no
other dlm application messages. If we receive a FIN message and we are in
the state DLM_FIN_WAIT2, we call midcomms_node_reset(), which puts the
midcomms node into the DLM_CLOSED state. Afterwards we should not set the
DLM_NODE_FLAG_STOP_RX flag any more. This patch changes the setting of
DLM_NODE_FLAG_STOP_RX to happen only in those state changes where we
receive a FIN message and we assume there will be no other dlm application
messages received until we hit the DLM_CLOSED state.
Cc: stable@vger.kernel.org Fixes: 489d8e559c65 ("fs: dlm: add reliable connection if reconnect") Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When a file or a directory is deleted, the hint for the cluster of
its parent directory in its in-memory inode is set to DIR_DELETED.
Therefore, DIR_DELETED must be an invalid cluster number. According
to the exFAT specification, a volume can have at most 2^32-11 clusters.
However, DIR_DELETED is wrongly defined as 0xFFFF0321, which could be
a valid cluster number. To fix it, let's redefine DIR_DELETED as
0xFFFFFFF7, the bad cluster number.
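In sketch form, the change described above:

  /*
   * before: 0xFFFF0321 can collide with a real cluster number, since a
   * volume may contain up to 2^32 - 11 clusters:
   *
   *	#define DIR_DELETED	0xFFFF0321
   *
   * after: reuse the bad-cluster value, which is never a valid cluster number:
   */
  #define DIR_DELETED	0xFFFFFFF7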
Fixes: 1acf1a564b60 ("exfat: add in-memory and on-disk structures and headers") Cc: stable@vger.kernel.org # v5.7+ Reported-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If the position is not aligned with the dentry size, the return
value of readdir() will be NULL and errno is 0, which means the
end of the directory stream is reached.
If the position is aligned with the dentry size but there is no file
or directory at that position, exfat_readdir() will continue to fetch
the next dentry. So the dentry returned by readdir() may not be the one
at the requested position.
After this commit, if the position is not aligned with the dentry
size, round the position up to the dentry size and continue to get
the dentry.
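A sketch of the new behaviour (the position variable name is illustrative; exFAT directory entries are 32 bytes):

  /* an unaligned position no longer means end-of-directory:
   * round it up to the next directory entry and keep reading
   */
  cpos = round_up(cpos, DENTRY_SIZE);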
Fixes: ca06197382bd ("exfat: add directory operations") Cc: stable@vger.kernel.org # v5.7+ Reported-by: Wang Yugui <wangyugui@e16-tech.com> Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Reviewed-by: Andy Wu <Andy.Wu@sony.com> Reviewed-by: Aoyama Wataru <wataru.aoyama@sony.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since seekdir() does not check whether the position is valid, the
position may exceed the size of the directory. We found that for
a directory with discontinuous clusters, if the position exceeds
the size of the directory and the excess size is greater than or
equal to the cluster size, exfat_readdir() will return -EIO,
causing a file system error and making the file system unavailable.
The current hfsplus_put_super first calls hfs_btree_close on
sbi->ext_tree, then invokes iput on sbi->hidden_dir, resulting in a
use-after-free issue in hfsplus_release_folio.
As shown in hfsplus_fill_super, the error handling code also calls iput
before hfs_btree_close.
To fix this error, we move all iput calls before hfs_btree_close.
Note that this patch is tested on Syzbot.
Link: https://lkml.kernel.org/r/20230226124948.3175736-1-mudongliangabcd@gmail.com Reported-by: syzbot+57e3e98f7e3b80f64d56@syzkaller.appspotmail.com Tested-by: Dongliang Mu <mudongliangabcd@gmail.com> Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
/* Dispose of resources used by a node */
void hfs_bnode_put(struct hfs_bnode *node)
{
if (node) {
<skipped>
BUG_ON(!atomic_read(&node->refcnt)); <- we have issue here!!!!
<skipped>
}
}
By tracing the refcnt, I found that the node is created by hfs_bmap_alloc()
with refcnt 1. Then the node is used by hfs_btree_write(). A call to
hfs_bnode_get() is missing after the node is found. The issue happens in
the following path:
<alloc>
hfs_bmap_alloc
hfs_bnode_find
__hfs_bnode_create <- allocate a new node with refcnt 1.
hfs_bnode_put <- decrease the refcnt
<write>
hfs_btree_write
hfs_bnode_find
__hfs_bnode_create
hfs_bnode_findhash <- find the node without refcnt increased.
hfs_bnode_put <- trigger the BUG_ON() since refcnt is 0.
Link: https://lkml.kernel.org/r/20221212021627.3766829-1-liushixin2@huawei.com Reported-by: syzbot+5b04b49a7ec7226c7426@syzkaller.appspotmail.com Signed-off-by: Liu Shixin <liushixin2@huawei.com> Cc: Fabio M. De Francesco <fmdefrancesco@gmail.com> Cc: Viacheslav Dubeyko <slava@dubeyko.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ksmbd allowed the actual frame length to be smaller than the rfc1002
length. If this is allowed, it is possible to allocate a large amount of
memory; even though this can be limited by credit management, it can
eventually cause a memory exhaustion problem. This patch does not allow it,
except for the SMB2 Negotiate request, which will be validated when message
handling proceeds. Also, allow a message that is padded to an 8-byte boundary.
Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3") Cc: stable@vger.kernel.org Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A non-first waiter can potentially spin in the for loop of
rwsem_down_write_slowpath() without sleeping but fail to acquire the
lock even if the rwsem is free if the following sequence happens:
Non-first RT waiter First waiter Lock holder
------------------- ------------ -----------
Acquire wait_lock
rwsem_try_write_lock():
Set handoff bit if RT or
wait too long
Set waiter->handoff_set
Release wait_lock
Acquire wait_lock
Inherit waiter->handoff_set
Release wait_lock
Clear owner
Release lock
if (waiter.handoff_set) {
rwsem_spin_on_owner(();
if (OWNER_NULL)
goto trylock_again;
}
trylock_again:
Acquire wait_lock
rwsem_try_write_lock():
if (first->handoff_set && (waiter != first))
return false;
Release wait_lock
A non-first waiter cannot really acquire the rwsem even if it mistakenly
believes that it can spin on the OWNER_NULL value. If that waiter happens
to be an RT task running on the same CPU as the first waiter, it can
block the first waiter from acquiring the rwsem, leading to a live lock.
Fix this problem by making sure that a non-first waiter cannot spin in
the slowpath loop without sleeping.
Fixes: d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more consistent") Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Mukesh Ojha <quic_mojha@quicinc.com> Reviewed-by: Mukesh Ojha <quic_mojha@quicinc.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230126003628.365092-2-longman@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Async discard does not acquire the block group reference count while it
holds a reference on the discard list. This is generally OK, as the
paths which destroy block groups tend to try to synchronize on
cancelling async discard work. However, relying on cancelling work
requires careful analysis to be sure it is safe from races with
unpinning scheduling more work.
While I am unable to find a race with unpinning in the current code for
either the unused bgs or relocation paths, I believe we have one in an
older version of auto relocation in a Meta internal build. This suggests
that this is in fact an error prone model, and could be fragile to
future changes to these bg deletion paths.
To make this ownership more clear, add a refcount for async discard. If
work is queued for a block group, its refcount should be incremented,
and when work is completed or canceled, it should be decremented.
CC: stable@vger.kernel.org # 5.15+ Signed-off-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Recent test_kprobe_missed kprobes kunit test uncovers the following
problem. Once a kprobe is triggered from another kprobe (kprobe reenter),
all subsequent kprobes on this cpu are considered kprobe reenters, thus
pre_handler and post_handler are not called and the kprobes are counted
as "missed".
Commit b9599798f953 ("[S390] kprobes: activation and deactivation")
introduced a simpler scheme for kprobes (de)activation and status
tracking by using push_kprobe/pop_kprobe, which is supposed to work for
both the initial kprobe entry and kprobe reentry, and helps to avoid
handling those two cases differently. The problem is that the sequence of
calls in the kprobe reenter case:
push_kprobe() <- NULL (current_kprobe)
push_kprobe() <- kprobe1 (current_kprobe)
pop_kprobe() -> kprobe1 (current_kprobe)
pop_kprobe() -> kprobe1 (current_kprobe)
leaves "kprobe1" as "current_kprobe" on this cpu, instead of setting it
to NULL. In fact push_kprobe/pop_kprobe can only store a single state
(there is just one prev_kprobe in kprobe_ctlblk), which is a hack but
sufficient; there is no need to have another prev_kprobe just to store
NULL. To make a simple and backportable fix, simply reset "prev_kprobe"
when a kprobe is popped from this "stack". No need to worry about
"kprobe_status" in this case, because its value is only checked when
current_kprobe != NULL.
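A sketch of the resulting pop_kprobe(), close to but not necessarily identical with the actual s390 code:

  static void pop_kprobe(struct kprobe_ctlblk *kcb)
  {
          __this_cpu_write(current_kprobe, kcb->prev_kprobe.kp);
          kcb->kprobe_status = kcb->prev_kprobe.status;
          /* the fix: reset the single-slot "stack" so the next pop yields NULL */
          kcb->prev_kprobe.kp = NULL;
  }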
Recent test_kprobe_missed kprobes kunit test uncovers the following error
(reported when CONFIG_DEBUG_ATOMIC_SLEEP is enabled):
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580
in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 662, name: kunit_try_catch
preempt_count: 0, expected: 0
RCU nest depth: 0, expected: 0
no locks held by kunit_try_catch/662.
irq event stamp: 280
hardirqs last enabled at (279): [<00000003e60a3d42>] __do_pgm_check+0x17a/0x1c0
hardirqs last disabled at (280): [<00000003e3bd774a>] kprobe_exceptions_notify+0x27a/0x318
softirqs last enabled at (0): [<00000003e3c5c890>] copy_process+0x14a8/0x4c80
softirqs last disabled at (0): [<0000000000000000>] 0x0
CPU: 46 PID: 662 Comm: kunit_try_catch Tainted: G N 6.2.0-173644-g44c18d77f0c0 #2
Hardware name: IBM 3931 A01 704 (LPAR)
Call Trace:
[<00000003e60a3a00>] dump_stack_lvl+0x120/0x198
[<00000003e3d02e82>] __might_resched+0x60a/0x668
[<00000003e60b9908>] __mutex_lock+0xc0/0x14e0
[<00000003e60bad5a>] mutex_lock_nested+0x32/0x40
[<00000003e3f7b460>] unregister_kprobe+0x30/0xd8
[<00000003e51b2602>] test_kprobe_missed+0xf2/0x268
[<00000003e51b5406>] kunit_try_run_case+0x10e/0x290
[<00000003e51b7dfa>] kunit_generic_run_threadfn_adapter+0x62/0xb8
[<00000003e3ce30f8>] kthread+0x2d0/0x398
[<00000003e3b96afa>] __ret_from_fork+0x8a/0xe8
[<00000003e60ccada>] ret_from_fork+0xa/0x40
The reason for this error report is that kprobes handling code failed
to restore irqs.
The problem is that, when a kprobe is triggered from another kprobe's
post_handler, the current sequence of enable_singlestep / disable_singlestep
is the following:
enable_singlestep <- original kprobe (saves kprobe_saved_imask)
enable_singlestep <- kprobe triggered from post_handler (clobbers kprobe_saved_imask)
disable_singlestep <- kprobe triggered from post_handler (restores kprobe_saved_imask)
disable_singlestep <- original kprobe (restores wrong clobbered kprobe_saved_imask)
There is just one kprobe_ctlblk per cpu, and both calls save and
load the irq mask to/from kprobe_saved_imask. To fix the problem, simply
move resume_execution (which calls disable_singlestep) before calling
post_handler. This also fixes the problem that post_handler is called
with pt_regs that have not yet been adjusted after single-stepping.
When debugging vmlinux with QEMU + GDB, the following GDB error may
occur:
(gdb) c
Continuing.
Warning:
Cannot insert breakpoint -1.
Cannot access memory at address 0xffffffffffff95c0
Command aborted.
(gdb)
The reason is that, when .interp section is present, GDB tries to
locate the file specified in it in memory and put a number of
breakpoints there (see enable_break() function in gdb/solib-svr4.c).
Sometimes GDB finds a bogus location that matches its heuristics,
fails to set a breakpoint and stops. This makes further debugging
impossible.
The .interp section contains misleading information anyway (vmlinux
does not need ld.so), so fix by discarding it.
Commit f05f62d04271f ("s390/vmem: get rid of memory segment list")
reshuffled the call to vmem_add_mapping() in __segment_load(), which now
overwrites rc after it was set to contain the segment type code.
As result, __segment_load() will now always return 0 on success, which
corresponds to the segment type code SEG_TYPE_SW, i.e. a writeable
segment. This results in a kernel crash when loading a read-only segment
as dcssblk block device, and trying to write to it.
Instead of reshuffling code again, make sure to return the segment type
on success, and also describe this rather delicate and unexpected logic
in the function comment. Also initialize new segtype variable with
invalid value, to prevent possible future confusion.
The resend_msg() function cannot fail, but there was error handling
around using it. Rework the handling of the error, and fix the out of
retries debug reporting that was wrong around this, too.
Make sure to disable the alarm before updating the four alarm time
registers to avoid spurious alarms during the update.
Note that the disable needs to be done outside of the ctrl_reg_lock
section to prevent a racing alarm interrupt from disabling the newly set
alarm when the lock is released.
Fixes: 9a9a54ad7aa2 ("drivers/rtc: add support for Qualcomm PMIC8xxx RTC") Cc: stable@vger.kernel.org # 3.1 Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: David Collins <quic_collinsd@quicinc.com> Link: https://lore.kernel.org/r/20230202155448.6715-2-johan+linaro@kernel.org Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If we're doing a large IO request which needs to be split into multiple
bios for issue, then we can run into the same situation as the below
marked commit fixes - parts will complete just fine, one or more parts
will fail to allocate a request. This will result in a partially
completed read or write request, where the caller gets EAGAIN even though
parts of the IO completed just fine.
Do the same for large bios as we do for splits - fail a NOWAIT request
with EAGAIN. This isn't technically fixing an issue in the below marked
patch, but for stable purposes, we should have either none of them or
both.
This depends on: 613b14884b85 ("block: handle bio_split_to_limits() NULL return")
Cc: stable@vger.kernel.org # 5.15+ Fixes: 9cea62b2cbab ("block: don't allow splitting of a REQ_NOWAIT bio") Link: https://github.com/axboe/liburing/issues/766 Reported-and-tested-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The coreboot framebuffer doesn't support transparency, its 'reserved'
bit field is merely padding for byte/word alignment of pixel colors [1].
When trying to match the framebuffer to a simplefb format, the kernel
driver unnecessarily requires the format's transparency bit field to
exactly match this padding, even if the former is zero-width.
Due to a coreboot bug [2] (fixed upstream), some boards misreport the
reserved field's size as equal to its position (0x18 for both on a
'Lick' Chromebook), and the driver fails to probe where it would have
otherwise worked fine with e.g. the a8r8g8b8 or x8r8g8b8 formats.
Remove the transparency comparison with reserved bits. When the
bits-per-pixel and other color components match, transparency will
already be in a subset of the reserved field. Not forcing it to match
reserved bits allows the driver to work on the boards which misreport
the reserved field. It also enables using simplefb formats that don't
have transparency bits, although this doesn't currently happen due to
format support and ordering in linux/platform_data/simplefb.h.
The referenced commit added a wrapper for drm_gem_shmem_get_pages_sgt(),
but in the process it accidentally changed the export type from GPL to
non-GPL. Switch it back to GPL.
At first, I thought this might be a source of nfsd_file overputs, but
the current callers seem to avoid an extra put when nfsd4_verify_copy
returns an error.
Still, it's "bad form" to leave the pointers filled out when we don't
have a reference to them anymore, and that might lead to bugs later.
Zero them out as a defensive coding measure.
Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
When calling debugfs_lookup() the result must have dput() called on it,
otherwise the memory will leak over time. To make things simpler, just
call debugfs_lookup_and_remove() instead which handles all of the logic at
once.
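The pattern being replaced, in sketch form ("node" and parent are placeholders):

  /* before: the dentry returned by debugfs_lookup() is never dput(), so it leaks */
  debugfs_remove(debugfs_lookup("node", parent));

  /* after: look up, remove, and drop the reference in one call */
  debugfs_lookup_and_remove("node", parent);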
Use devm_kasprintf() instead of kasprintf() to avoid any potential
leaks. At the moment the drivers have no remove functionality, thus
there is no need for a Fixes tag.
Coretemp's platform driver is unconventional. All the real work is done
globally by the initcall and CPU hotplug notifiers, while the "driver"
effectively just wraps an allocation and the registration of the hwmon
interface in a long-winded round-trip through the driver core. The whole
logic of dynamically creating and destroying platform devices to bring
the interfaces up and down is error prone, since it assumes
platform_device_add() will synchronously bind the driver and set drvdata
before it returns, thus results in a NULL dereference if drivers_autoprobe
is turned off for the platform bus. Furthermore, the unusual approach of
doing that from within a CPU hotplug notifier, already commented in the
code that it deadlocks suspend, also causes lockdep issues for other
drivers or subsystems which may want to legitimately register a CPU
hotplug notifier from a platform bus notifier.
All of these issues can be solved by ripping this unusual behaviour out
completely, simply tying the platform devices to the lifetime of the
module itself, and directly managing the hwmon interfaces from the
hotplug notifiers. There is a slight user-visible change in that
/sys/bus/platform/drivers/coretemp will no longer appear, and
/sys/devices/platform/coretemp.n will remain present if package n is
hotplugged off, but hwmon users should really only be looking for the
presence of the hwmon interfaces, whose behaviour remains unchanged.
In gfs2_make_fs_rw(), make sure to call gfs2_consist() to report an
inconsistency and mark the filesystem as withdrawn when
gfs2_find_jhead() fails.
At the end of gfs2_make_fs_rw(), when we discover that the filesystem
has been withdrawn, make sure we report an error. This also replaces
the gfs2_withdrawn() check after gfs2_find_jhead().
The compiler has no way to know if "id" is within the array bounds of
the regulators array. Add a check for this and a build-time check that
the regulators and reg_voltage_map arrays are sized the same. Seen with
GCC 13:
../drivers/regulator/s5m8767.c: In function 's5m8767_pmic_probe':
../drivers/regulator/s5m8767.c:936:35: warning: array subscript [0, 36] is outside array bounds of 'struct regulator_desc[37]' [-Warray-bounds=]
936 | regulators[id].vsel_reg =
| ~~~~~~~~~~^~~~
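A sketch of the two checks (the skip-on-error handling and the assigned value are illustrative, not the exact driver diff):

  /* build time: the two parallel tables must have the same number of entries */
  BUILD_BUG_ON(ARRAY_SIZE(regulators) != ARRAY_SIZE(reg_voltage_map));

  /* run time: never index regulators[] with an out-of-range id */
  if (id < 0 || id >= ARRAY_SIZE(regulators))
          continue;

  regulators[id].vsel_reg = reg;	/* 'reg' stands in for the computed register */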
Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Cc: Liam Girdwood <lgirdwood@gmail.com> Cc: Mark Brown <broonie@kernel.org> Cc: linux-samsung-soc@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20230128005358.never.313-kees@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Walking the dram->cs array was seen as accesses beyond the first array
item by the compiler. Instead, use the array index directly. This allows
for run-time bounds checking under CONFIG_UBSAN_BOUNDS as well. Seen
with GCC 13 with -fstrict-flex-arrays:
../sound/soc/kirkwood/kirkwood-dma.c: In function
'kirkwood_dma_conf_mbus_windows.constprop':
../sound/soc/kirkwood/kirkwood-dma.c:90:24: warning: array subscript 0 is outside array bounds of 'const struct mbus_dram_window[0]' [-Warray-bounds=]
90 | if ((cs->base & 0xffff0000) < (dma & 0xffff0000)) {
| ~~^~~~~~
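A sketch of the change (the body that programs the window is elided):

  for (i = 0; i < dram->num_cs; i++) {
          /* index the cs[] array directly instead of walking a pointer */
          const struct mbus_dram_window *cs = &dram->cs[i];

          if ((cs->base & 0xffff0000) < (dma & 0xffff0000)) {
                  /* program the MBus window for this chip select (elided) */
          }
  }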
Cc: Liam Girdwood <lgirdwood@gmail.com> Cc: Mark Brown <broonie@kernel.org> Cc: Jaroslav Kysela <perex@perex.cz> Cc: Takashi Iwai <tiwai@suse.com> Cc: alsa-devel@alsa-project.org Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230127224128.never.410-kees@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
If panic_on_warn is set and a compress stream (DPCM) is started,
then a kernel panic occurs because card->pcm_mutex isn't held appropriately.
In the following functions, a warning was issued at the
"snd_soc_dpcm_mutex_assert_held" line:
static int dpcm_be_connect(struct snd_soc_pcm_runtime *fe,
struct snd_soc_pcm_runtime *be, int stream)
{
...
snd_soc_dpcm_mutex_assert_held(fe);
...
}
Always free the console font when deinitializing the framebuffer
console. Subsequent framebuffer consoles will then use the default
font. Rely on userspace to load any user-configured font for these
consoles.
Commit ae1287865f53 ("fbcon: don't lose the console font across
generic->chip driver switch") was introduced to work around losing
the font during graphics-device handover. [1][2] It kept a dangling
pointer with the font data between loading the two consoles, which is
a fairly adventurous hack. It also never covered cases where other
consoles, such as VGA text mode, were involved.
The problem has meanwhile been solved in userspace. Systemd comes
with a udev rule that re-installs the configured font when a console
comes up. [3] So the kernel workaround can be removed.
This also removes one of the two special cases triggered by setting
FBINFO_MISC_FIRMWARE in an fbdev driver.
Tested during device handover from efifb and simpledrm to radeon. Udev
reloads the configured console font for the new driver's terminal.
During the sysfs firmware write process, a use-after-free read warning is
logged from the lpfc_wr_object() routine:
BUG: KFENCE: use-after-free read in lpfc_wr_object+0x235/0x310 [lpfc]
Use-after-free read at 0x0000000000cf164d (in kfence-#111):
lpfc_wr_object+0x235/0x310 [lpfc]
lpfc_write_firmware.cold+0x206/0x30d [lpfc]
lpfc_sli4_request_firmware_update+0xa6/0x100 [lpfc]
lpfc_request_firmware_upgrade_store+0x66/0xb0 [lpfc]
kernfs_fop_write_iter+0x121/0x1b0
new_sync_write+0x11c/0x1b0
vfs_write+0x1ef/0x280
ksys_write+0x5f/0xe0
do_syscall_64+0x59/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
The driver accessed wr_object pointer data, which was initialized into
mailbox payload memory, after the mailbox object was released back to the
mailbox pool.
Fix by moving the mailbox free calls to the end of the routine ensuring
that we don't reference internal mailbox memory after release.
Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
iio is allocated in atom_index_iio(), called by atom_parse(),
but it is not released when the driver is shut down.
Fix this kmemleak by freeing it in radeon_atombios_fini().
Signed-off-by: Liwei Song <liwei.song@windriver.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The pixel data for the ILI9486 is always 16-bits wide and it must be
sent over the SPI bus. When the controller is only able to deal with
8-bit transfers, this 16-bits data needs to be swapped before the
sending to account for the big endian bus, this is on the contrary not
needed when the SPI controller already supports 16-bits transfers.
The decision about swapping the pixel data or not is taken in the MIPI
DBI code by probing the controller capabilities: if the controller only
supports 8-bit transfers the data is swapped, otherwise it is not.
This swapping/non-swapping is relying on the assumption that when the
controller does support 16-bit transactions then the data is sent
unswapped in 16-bits-per-word over SPI.
The problem with the ILI9486 driver is that it is forcing 8-bit
transactions also for controllers supporting 16-bits, violating the
assumption and corrupting the pixel data.
Align the driver to what is done in the MIPI DBI code by adjusting the
transfer size to the maximum allowed by the SPI controller.
[Why]
Fixing smatch error:
dm_resume() error: we previously assumed 'aconnector->dc_link' could be null
[How]
Check if dc_link is null at the beginning of the loop,
so further checks can be dropped.
Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Wayne Lin <Wayne.Lin@amd.com> Acked-by: Jasdeep Dhillon <jdhillon@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
[WHY]
It causes a regression: the AMD source will not write DPCD 340.
Reviewed-by: Wayne Lin <Wayne.Lin@amd.com> Acked-by: Jasdeep Dhillon <jdhillon@amd.com> Signed-off-by: Ian Chen <ian.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
This is a followup of commit 2558b8039d05 ("net: use a bounce
buffer for copying skb->mark")
x86 and powerpc define user_access_begin, meaning
that they are not able to perform user copy checks
when using user_write_access_begin() / unsafe_copy_to_user()
and friends [1]
Instead of waiting for bugs to trigger on other arches,
add a check_object_size() in put_cmsg() to make sure
that new code tested on x86 with CONFIG_HARDENED_USERCOPY=y
will perform more security checks.
[1] We can not generically call check_object_size() from
unsafe_copy_to_user() because UACCESS is enabled at this point.
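A sketch of where the added check sits, before the user-access window is opened (the surrounding put_cmsg() code is abbreviated):

  /* validate the kernel source object while hardened usercopy checks are
   * still possible; once the UACCESS window is open, unsafe_copy_to_user()
   * can no longer perform them
   */
  check_object_size(data, cmlen - sizeof(struct cmsghdr), true);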
Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Kees Cook <keescook@chromium.org> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Completion responses to SEND_RNDIS_PKT messages are currently processed
regardless of the status in the response, so that resources associated
with the request are freed. While this is appropriate, code bugs that
cause sending a malformed message, or errors on the Hyper-V host, go
undetected. Fix this by checking the status and outputting a rate-limited
message if there is an error.
When calling debugfs_lookup() the result must have dput() called on it,
otherwise the memory will leak over time. To make things simpler, just
call debugfs_lookup_and_remove() instead which handles all of the logic
at once.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
When calling debugfs_lookup() the result must have dput() called on it,
otherwise the memory will leak over time. To make things simpler, just
call debugfs_lookup_and_remove() instead which handles all of the logic
at once.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
When calling debugfs_lookup() the result must have dput() called on it,
otherwise the memory will leak over time. To make things simpler, just
call debugfs_lookup_and_remove() instead which handles all of the logic at
once.
While there is logic about the difference between ksize and usize,
copy_struct_from_user() didn't check the size of the destination buffer
(when it was known) against ksize. Add this check so there is an upper
bounds check on the possible memset() call, otherwise lower bounds
checks made by callers will trigger bounds warnings under -Warray-bounds.
Seen under GCC 13:
In function 'copy_struct_from_user',
inlined from 'iommufd_fops_ioctl' at
../drivers/iommu/iommufd/main.c:333:8:
../include/linux/fortify-string.h:59:33: warning: '__builtin_memset' offset [57, 4294967294] is out of the bounds [0, 56] of object 'buf' with type 'union ucmd_buffer' [-Warray-bounds=]
59 | #define __underlying_memset __builtin_memset
| ^
../include/linux/fortify-string.h:453:9: note: in expansion of macro '__underlying_memset'
453 | __underlying_memset(p, c, __fortify_size); \
| ^~~~~~~~~~~~~~~~~~~
../include/linux/fortify-string.h:461:25: note: in expansion of macro '__fortify_memset_chk'
461 | #define memset(p, c, s) __fortify_memset_chk(p, c, s, \
| ^~~~~~~~~~~~~~~~~~~~
../include/linux/uaccess.h:334:17: note: in expansion of macro 'memset'
334 | memset(dst + size, 0, rest);
| ^~~~~~
../drivers/iommu/iommufd/main.c: In function 'iommufd_fops_ioctl':
../drivers/iommu/iommufd/main.c:311:27: note: 'buf' declared here
311 | union ucmd_buffer buf;
| ^~~
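A sketch of such an upper-bounds check, assuming the helper keeps its existing min/zero-fill structure (not necessarily the exact patch):

  size_t size = min(ksize, usize);
  size_t rest = max(ksize, usize) - size;

  /* when the compiler knows the size of *dst, refuse a larger ksize;
   * __builtin_object_size() returns (size_t)-1 when the size is unknown,
   * so the check is a no-op in that case
   */
  if (WARN_ON_ONCE(ksize > __builtin_object_size(dst, 1)))
          return -E2BIG;

  /* trailing kernel bytes the user did not supply are still zeroed */
  if (usize < ksize)
          memset(dst + size, 0, rest);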
GCC does not like having a partially allocated object, since it cannot
reason about it for bounds checking when it is passed to other code.
Instead, fully allocate sig_inputArgs. (Alternatively, sig_inputArgs
should be defined as a struct coda_in_hdr, if it is actually not using
any other part of the union.) Seen under GCC 13:
../fs/coda/upcall.c: In function 'coda_upcall':
../fs/coda/upcall.c:801:22: warning: array subscript 'union inputArgs[0]' is partly outside array bounds of 'unsigned char[20]' [-Warray-bounds=]
801 | sig_inputArgs->ih.opcode = CODA_SIGNAL;
| ^~