git.ipfire.org Git - thirdparty/kernel/stable.git/log

fbdev: zero-fill colormap in fbcmap.c

commit 19ab233989d0f7ab1de19a036e247afa4a0a1e9c upstream.

Use kzalloc() rather than kmalloc() for the dynamically allocated parts
of the colormap in fb_alloc_cmap_gfp, to prevent a leak of random kernel
data to userspace under certain circumstances.

Fixes a KMSAN-found infoleak bug reported by syzbot at:
https://syzkaller.appspot.com/bug?id=741578659feabd108ad9e06696f0c1f2e69c4b6e

Reported-by: syzbot+47fa9c9c648b765305b9@syzkaller.appspotmail.com
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Phillip Potter <phil@philpotter.co.uk>
Link: https://lore.kernel.org/r/20210331220719.1499743-1-phil@philpotter.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

posix-timers: Preserve return value in clock_adjtime32()

commit 2d036dfa5f10df9782f5278fc591d79d283c1fad upstream.

The return value on success (>= 0) is overwritten by the return value of
put_old_timex32(). That works correct in the fault case, but is wrong for
the success case where put_old_timex32() returns 0.

Just check the return value of put_old_timex32() and return -EFAULT in case
it is not zero.

[ tglx: Massage changelog ]

Fixes: 3a4d44b61625 ("ntp: Move adjtimex related compat syscalls to native counterparts")
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Richard Cochran <richardcochran@gmail.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210414030449.90692-1-chenjun102@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

btrfs: zoned: fail mount if the device does not support zone append

commit 1d68128c107a0b8c0c9125cb05d4771ddc438369 upstream.

For zoned btrfs, zone append is mandatory to write to a sequential write
only zone, otherwise parallel writes to the same zone could result in
unaligned write errors.

If a zoned block device does not support zone append (e.g. a dm-crypt
zoned device using a non-NULL IV cypher), fail to mount.

CC: stable@vger.kernel.org # 5.12
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

btrfs: zoned: fix unpaired block group unfreeze during device replace

commit 0dc16ef4f6c2708407fab6d141908d46a3b737bc upstream.

When doing a device replace on a zoned filesystem, if we find a block
group with ->to_copy == 0, we jump to the label 'done', which will result
in later calling btrfs_unfreeze_block_group(), even though at this point
we never called btrfs_freeze_block_group().

Since at this point we have neither turned the block group to RO mode nor
made any progress, we don't need to jump to the label 'done'. So fix this
by jumping instead to the label 'skip' and dropping our reference on the
block group before the jump.

Fixes: 78ce9fc269af6e ("btrfs: zoned: mark block groups to copy for device-replace")
CC: stable@vger.kernel.org # 5.12
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

btrfs: fix race between transaction aborts and fsyncs leading to use-after-free

commit 061dde8245356d8864d29e25207aa4daa0be4d3c upstream.

There is a race between a task aborting a transaction during a commit,
a task doing an fsync and the transaction kthread, which leads to an
use-after-free of the log root tree. When this happens, it results in a
stack trace like the following:

  BTRFS info (device dm-0): forced readonly
  BTRFS warning (device dm-0): Skipping commit of aborted transaction.
  BTRFS: error (device dm-0) in cleanup_transaction:1958: errno=-5 IO failure
  BTRFS warning (device dm-0): lost page write due to IO error on /dev/mapper/error-test (-5)
  BTRFS warning (device dm-0): Skipping commit of aborted transaction.
  BTRFS warning (device dm-0): direct IO failed ino 261 rw 0,0 sector 0xa4e8 len 4096 err no 10
  BTRFS error (device dm-0): error writing primary super block to device 1
  BTRFS warning (device dm-0): direct IO failed ino 261 rw 0,0 sector 0x12e000 len 4096 err no 10
  BTRFS warning (device dm-0): direct IO failed ino 261 rw 0,0 sector 0x12e008 len 4096 err no 10
  BTRFS warning (device dm-0): direct IO failed ino 261 rw 0,0 sector 0x12e010 len 4096 err no 10
  BTRFS: error (device dm-0) in write_all_supers:4110: errno=-5 IO failure (1 errors while writing supers)
  BTRFS: error (device dm-0) in btrfs_sync_log:3308: errno=-5 IO failure
  general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6b68: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI
  CPU: 2 PID: 2458471 Comm: fsstress Not tainted 5.12.0-rc5-btrfs-next-84 #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
  RIP: 0010:__mutex_lock+0x139/0xa40
  Code: c0 74 19 (...)
  RSP: 0018:ffff9f18830d7b00 EFLAGS: 00010202
  RAX: 6b6b6b6b6b6b6b68 RBX: 0000000000000001 RCX: 0000000000000002
  RDX: ffffffffb9c54d13 RSI: 0000000000000000 RDI: 0000000000000000
  RBP: ffff9f18830d7bc0 R08: 0000000000000000 R09: 0000000000000000
  R10: ffff9f18830d7be0 R11: 0000000000000001 R12: ffff8c6cd199c040
  R13: ffff8c6c95821358 R14: 00000000fffffffb R15: ffff8c6cbcf01358
  FS:  00007fa9140c2b80(0000) GS:ffff8c6fac600000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007fa913d52000 CR3: 000000013d2b4003 CR4: 0000000000370ee0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   ? __btrfs_handle_fs_error+0xde/0x146 [btrfs]
   ? btrfs_sync_log+0x7c1/0xf20 [btrfs]
   ? btrfs_sync_log+0x7c1/0xf20 [btrfs]
   btrfs_sync_log+0x7c1/0xf20 [btrfs]
   btrfs_sync_file+0x40c/0x580 [btrfs]
   do_fsync+0x38/0x70
   __x64_sys_fsync+0x10/0x20
   do_syscall_64+0x33/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xae
  RIP: 0033:0x7fa9142a55c3
  Code: 8b 15 09 (...)
  RSP: 002b:00007fff26278d48 EFLAGS: 00000246 ORIG_RAX: 000000000000004a
  RAX: ffffffffffffffda RBX: 0000563c83cb4560 RCX: 00007fa9142a55c3
  RDX: 00007fff26278cb0 RSI: 00007fff26278cb0 RDI: 0000000000000005
  RBP: 0000000000000005 R08: 0000000000000001 R09: 00007fff26278d5c
  R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000340
  R13: 00007fff26278de0 R14: 00007fff26278d96 R15: 0000563c83ca57c0
  Modules linked in: btrfs dm_zero dm_snapshot dm_thin_pool (...)
  ---[ end trace ee2f1b19327d791d ]---

The steps that lead to this crash are the following:

1) We are at transaction N;

2) We have two tasks with a transaction handle attached to transaction N.
   Task A and Task B. Task B is doing an fsync;

3) Task B is at btrfs_sync_log(), and has saved fs_info->log_root_tree
   into a local variable named 'log_root_tree' at the top of
   btrfs_sync_log(). Task B is about to call write_all_supers(), but
   before that...

4) Task A calls btrfs_commit_transaction(), and after it sets the
   transaction state to TRANS_STATE_COMMIT_START, an error happens before
   it waits for the transaction's 'num_writers' counter to reach a value
   of 1 (no one else attached to the transaction), so it jumps to the
   label "cleanup_transaction";

5) Task A then calls cleanup_transaction(), where it aborts the
   transaction, setting BTRFS_FS_STATE_TRANS_ABORTED on fs_info->fs_state,
   setting the ->aborted field of the transaction and the handle to an
   errno value and also setting BTRFS_FS_STATE_ERROR on fs_info->fs_state.

   After that, at cleanup_transaction(), it deletes the transaction from
   the list of transactions (fs_info->trans_list), sets the transaction
   to the state TRANS_STATE_COMMIT_DOING and then waits for the number
   of writers to go down to 1, as it's currently 2 (1 for task A and 1
   for task B);

6) The transaction kthread is running and sees that BTRFS_FS_STATE_ERROR
   is set in fs_info->fs_state, so it calls btrfs_cleanup_transaction().

   There it sees the list fs_info->trans_list is empty, and then proceeds
   into calling btrfs_drop_all_logs(), which frees the log root tree with
   a call to btrfs_free_log_root_tree();

7) Task B calls write_all_supers() and, shortly after, under the label
   'out_wake_log_root', it deferences the pointer stored in
   'log_root_tree', which was already freed in the previous step by the
   transaction kthread. This results in a use-after-free leading to a
   crash.

Fix this by deleting the transaction from the list of transactions at
cleanup_transaction() only after setting the transaction state to
TRANS_STATE_COMMIT_DOING and waiting for all existing tasks that are
attached to the transaction to release their transaction handles.
This makes the transaction kthread wait for all the tasks attached to
the transaction to be done with the transaction before dropping the
log roots and doing other cleanups.

Fixes: ef67963dac255b ("btrfs: drop logs when we've aborted a transaction")
CC: stable@vger.kernel.org # 5.10+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

intel_th: pci: Add Rocket Lake CPU support

commit 9f7f2a5e01ab4ee56b6d9c0572536fe5fd56e376 upstream.

This adds support for the Trace Hub in Rocket Lake CPUs.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: stable <stable@vger.kernel.org> # v4.14+
Link: https://lore.kernel.org/r/20210414171251.14672-7-alexander.shishkin@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

btrfs: fix metadata extent leak after failure to create subvolume

commit 67addf29004c5be9fa0383c82a364bb59afc7f84 upstream.

When creating a subvolume we allocate an extent buffer for its root node
after starting a transaction. We setup a root item for the subvolume that
points to that extent buffer and then attempt to insert the root item into
the root tree - however if that fails, due to ENOMEM for example, we do
not free the extent buffer previously allocated and we do not abort the
transaction (as at that point we did nothing that can not be undone).

This means that we effectively do not return the metadata extent back to
the free space cache/tree and we leave a delayed reference for it which
causes a metadata extent item to be added to the extent tree, in the next
transaction commit, without having backreferences. When this happens
'btrfs check' reports the following:

  $ btrfs check /dev/sdi
  Opening filesystem to check...
  Checking filesystem on /dev/sdi
  UUID: dce2cb9d-025f-4b05-a4bf-cee0ad3785eb
  [1/7] checking root items
  [2/7] checking extents
  ref mismatch on [30425088 16384] extent item 1, found 0
  backref 30425088 root 256 not referenced back 0x564a91c23d70
  incorrect global backref count on 30425088 found 1 wanted 0
  backpointer mismatch on [30425088 16384]
  owner ref check failed [30425088 16384]
  ERROR: errors found in extent allocation tree or chunk allocation
  [3/7] checking free space cache
  [4/7] checking fs roots
  [5/7] checking only csums items (without verifying data)
  [6/7] checking root refs
  [7/7] checking quota groups skipped (not enabled on this FS)
  found 212992 bytes used, error(s) found
  total csum bytes: 0
  total tree bytes: 131072
  total fs tree bytes: 32768
  total extent tree bytes: 16384
  btree space waste bytes: 124669
  file data blocks allocated: 65536
   referenced 65536

So fix this by freeing the metadata extent if btrfs_insert_root() returns
an error.

CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/build: Disable HIGHMEM64G selection for M486SX

commit 0ef3439cd80ba7770723edb0470d15815914bb62 upstream.

Fix a regression caused by making the 486SX separately selectable in
Kconfig, for which the HIGHMEM64G setting has not been updated and
therefore has become exposed as a user-selectable option for the M486SX
configuration setting unlike with original M486 and all the other
settings that choose non-PAE-enabled processors:

  High Memory Support
  > 1. off (NOHIGHMEM)
    2. 4GB (HIGHMEM4G)
    3. 64GB (HIGHMEM64G)
  choice[1-3?]:

With the fix in place the setting is now correctly removed:

  High Memory Support
  > 1. off (NOHIGHMEM)
    2. 4GB (HIGHMEM4G)
  choice[1-2?]:

[ bp: Massage commit message. ]

Fixes: 87d6021b8143 ("x86/math-emu: Limit MATH_EMULATION to 486SX compatibles")
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: stable@vger.kernel.org # v5.5+
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.2104141221340.44318@angie.orcam.me.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

btrfs: handle remount to no compress during compression

commit 1d8ba9e7e785b6625f4d8e978e8a284b144a7077 upstream.

[BUG]
When running btrfs/071 with inode_need_compress() removed from
compress_file_range(), we got the following crash:

  BUG: kernel NULL pointer dereference, address: 0000000000000018
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  Workqueue: btrfs-delalloc btrfs_work_helper [btrfs]
  RIP: 0010:compress_file_range+0x476/0x7b0 [btrfs]
  Call Trace:
   ? submit_compressed_extents+0x450/0x450 [btrfs]
   async_cow_start+0x16/0x40 [btrfs]
   btrfs_work_helper+0xf2/0x3e0 [btrfs]
   process_one_work+0x278/0x5e0
   worker_thread+0x55/0x400
   ? process_one_work+0x5e0/0x5e0
   kthread+0x168/0x190
   ? kthread_create_worker_on_cpu+0x70/0x70
   ret_from_fork+0x22/0x30
  ---[ end trace 65faf4eae941fa7d ]---

This is already after the patch "btrfs: inode: fix NULL pointer
dereference if inode doesn't need compression."

[CAUSE]
@pages is firstly created by kcalloc() in compress_file_extent():
                pages = kcalloc(nr_pages, sizeof(struct page *), GFP_NOFS);

Then passed to btrfs_compress_pages() to be utilized there:

                ret = btrfs_compress_pages(...
                                           pages,
                                           &nr_pages,
                                           ...);

btrfs_compress_pages() will initialize each page as output, in
zlib_compress_pages() we have:

                        pages[nr_pages] = out_page;
                        nr_pages++;

Normally this is completely fine, but there is a special case which
is in btrfs_compress_pages() itself:

        switch (type) {
        default:
                return -E2BIG;
        }

In this case, we didn't modify @pages nor @out_pages, leaving them
untouched, then when we cleanup pages, the we can hit NULL pointer
dereference again:

        if (pages) {
                for (i = 0; i < nr_pages; i++) {
                        WARN_ON(pages[i]->mapping);
                        put_page(pages[i]);
                }
        ...
        }

Since pages[i] are all initialized to zero, and btrfs_compress_pages()
doesn't change them at all, accessing pages[i]->mapping would lead to
NULL pointer dereference.

This is not possible for current kernel, as we check
inode_need_compress() before doing pages allocation.
But if we're going to remove that inode_need_compress() in
compress_file_extent(), then it's going to be a problem.

[FIX]
When btrfs_compress_pages() hits its default case, modify @out_pages to
0 to prevent such problem from happening.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=212331
CC: stable@vger.kernel.org # 5.10+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

smb2: fix use-after-free in smb2_ioctl_query_info()

commit ccd48ec3d4a6cc595b2d9c5146e63b6c23546701 upstream.

* rqst[1,2,3] is allocated in vars
* each rqst->rq_iov is also allocated in vars or using pooled memory

SMB2_open_free, SMB2_ioctl_free, SMB2_query_info_free are iterating on
each rqst after vars has been freed (use-after-free), and they are
freeing the kvec a second time (double-free).

How to trigger:

* compile with KASAN
* mount a share

$ smbinfo quota /mnt/foo
Segmentation fault
$ dmesg

==================================================================
BUG: KASAN: use-after-free in SMB2_open_free+0x1c/0xa0
Read of size 8 at addr ffff888007b10c00 by task python3/1200

CPU: 2 PID: 1200 Comm: python3 Not tainted 5.12.0-rc6+ #107
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a-rebuilt.opensuse.org 04/01/2014
Call Trace:
  dump_stack+0x93/0xc2
  print_address_description.constprop.0+0x18/0x130
  ? SMB2_open_free+0x1c/0xa0
  ? SMB2_open_free+0x1c/0xa0
  kasan_report.cold+0x7f/0x111
  ? smb2_ioctl_query_info+0x240/0x990
  ? SMB2_open_free+0x1c/0xa0
  SMB2_open_free+0x1c/0xa0
  smb2_ioctl_query_info+0x2bf/0x990
  ? smb2_query_reparse_tag+0x600/0x600
  ? cifs_mapchar+0x250/0x250
  ? rcu_read_lock_sched_held+0x3f/0x70
  ? cifs_strndup_to_utf16+0x12c/0x1c0
  ? rwlock_bug.part.0+0x60/0x60
  ? rcu_read_lock_sched_held+0x3f/0x70
  ? cifs_convert_path_to_utf16+0xf8/0x140
  ? smb2_check_message+0x6f0/0x6f0
  cifs_ioctl+0xf18/0x16b0
  ? smb2_query_reparse_tag+0x600/0x600
  ? cifs_readdir+0x1800/0x1800
  ? selinux_bprm_creds_for_exec+0x4d0/0x4d0
  ? do_user_addr_fault+0x30b/0x950
  ? __x64_sys_openat+0xce/0x140
  __x64_sys_ioctl+0xb9/0xf0
  do_syscall_64+0x33/0x40
  entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fdcf1f4ba87
Code: b3 66 90 48 8b 05 11 14 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e1 13 2c 00 f7 d8 64 89 01 48
RSP: 002b:00007ffef1ce7748 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00000000c018cf07 RCX: 00007fdcf1f4ba87
RDX: 0000564c467c5590 RSI: 00000000c018cf07 RDI: 0000000000000003
RBP: 00007ffef1ce7770 R08: 00007ffef1ce7420 R09: 00007fdcf0e0562b
R10: 0000000000000100 R11: 0000000000000246 R12: 0000000000004018
R13: 0000000000000001 R14: 0000000000000003 R15: 0000564c467c5590

Allocated by task 1200:
  kasan_save_stack+0x1b/0x40
  __kasan_kmalloc+0x7a/0x90
  smb2_ioctl_query_info+0x10e/0x990
  cifs_ioctl+0xf18/0x16b0
  __x64_sys_ioctl+0xb9/0xf0
  do_syscall_64+0x33/0x40
  entry_SYSCALL_64_after_hwframe+0x44/0xae

Freed by task 1200:
  kasan_save_stack+0x1b/0x40
  kasan_set_track+0x1c/0x30
  kasan_set_free_info+0x20/0x30
  __kasan_slab_free+0xe5/0x110
  slab_free_freelist_hook+0x53/0x130
  kfree+0xcc/0x320
  smb2_ioctl_query_info+0x2ad/0x990
  cifs_ioctl+0xf18/0x16b0
  __x64_sys_ioctl+0xb9/0xf0
  do_syscall_64+0x33/0x40
  entry_SYSCALL_64_after_hwframe+0x44/0xae

The buggy address belongs to the object at ffff888007b10c00
  which belongs to the cache kmalloc-512 of size 512
The buggy address is located 0 bytes inside of
  512-byte region [ffff888007b10c00, ffff888007b10e00)
The buggy address belongs to the page:
page:0000000044e14b75 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x7b10
head:0000000044e14b75 order:2 compound_mapcount:0 compound_pincount:0
flags: 0x100000000010200(slab|head)
raw: 0100000000010200 ffffea000015f500 0000000400000004 ffff888001042c80
raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
  ffff888007b10b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
  ffff888007b10b80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff888007b10c00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                    ^
  ffff888007b10c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff888007b10d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

Signed-off-by: Aurelien Aptel <aaptel@suse.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

cifs: fix regression when mounting shares with prefix paths

commit 5c1acf3fe05ce443edba5e2110c9e581765f66a8 upstream.

The commit 315db9a05b7a ("cifs: fix leak in cifs_smb3_do_mount() ctx")
revealed an existing bug when mounting shares that contain a prefix
path or DFS links.

cifs_setup_volume_info() requires the @devname to contain the full
path (UNC + prefix) to update the fs context with the new UNC and
prepath values, however we were passing only the UNC
path (old_ctx->UNC) in @device thus discarding any prefix paths.

Instead of concatenating both old_ctx->{UNC,prepath} and pass it in
@devname, just keep the dup'ed values of UNC and prepath in
cifs_sb->ctx after calling smb3_fs_context_dup(), and fix
smb3_parse_devname() to correctly parse and not leak the new UNC and
prefix paths.

Cc: <stable@vger.kernel.org> # v5.11+
Fixes: 315db9a05b7a ("cifs: fix leak in cifs_smb3_do_mount() ctx")
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Acked-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

cifs: detect dead connections only when echoes are enabled.

commit f4916649f98e2c7bdba38c6597a98c456c17317d upstream.

We can detect server unresponsiveness only if echoes are enabled.
Echoes can be disabled under two scenarios:
1. The connection is low on credits, so we've disabled echoes/oplocks.
2. The connection has not seen any request till now (other than
negotiate/sess-setup), which is when we enable these two, based on
the credits available.

So this fix will check for dead connection, only when echo is enabled.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
CC: <stable@vger.kernel.org> # v5.8+
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

cifs: fix leak in cifs_smb3_do_mount() ctx

commit 315db9a05b7a56810728589baa930864107e4634 upstream.

cifs_smb3_do_mount() calls smb3_fs_context_dup() and then
cifs_setup_volume_info(). The latter's subsequent smb3_parse_devname()
call overwrites the cifs_sb->ctx->UNC string already dup'ed by
smb3_fs_context_dup(), resulting in a leak. E.g.

unreferenced object 0xffff888002980420 (size 32):
  comm "mount", pid 160, jiffies 4294892541 (age 30.416s)
  hex dump (first 32 bytes):
    5c 5c 31 39 32 2e 31 36 38 2e 31 37 34 2e 31 30  \\192.168.174.10
    34 5c 72 61 70 69 64 6f 2d 73 68 61 72 65 00 00  4\rapido-share..
  backtrace:
    [<00000000069e12f6>] kstrdup+0x28/0x50
    [<00000000b61f4032>] smb3_fs_context_dup+0x127/0x1d0 [cifs]
    [<00000000c6e3e3bf>] cifs_smb3_do_mount+0x77/0x660 [cifs]
    [<0000000063467a6b>] smb3_get_tree+0xdf/0x220 [cifs]
    [<00000000716f731e>] vfs_get_tree+0x1b/0x90
    [<00000000491d3892>] path_mount+0x62a/0x910
    [<0000000046b2e774>] do_mount+0x50/0x70
    [<00000000ca7b64dd>] __x64_sys_mount+0x81/0xd0
    [<00000000b5122496>] do_syscall_64+0x33/0x40
    [<000000002dd397af>] entry_SYSCALL_64_after_hwframe+0x44/0xae

This change is a bandaid until the cifs_setup_volume_info() TODO and
error handling issues are resolved.

Signed-off-by: David Disseldorp <ddiss@suse.de>
Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
CC: <stable@vger.kernel.org> # v5.11+
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

cifs: fix out-of-bound memory access when calling smb3_notify() at mount point

commit a637f4ae037e1e0604ac008564934d63261a8fd1 upstream.

If smb3_notify() is called at mount point of CIFS, build_path_from_dentry()
returns the pointer to kmalloc-ed memory with terminating zero (this is
empty FileName to be passed to SMB2 CREATE request). This pointer is assigned
to the `path` variable.
Then `path + 1` (to skip first backslash symbol) is passed to
cifs_convert_path_to_utf16(). This is incorrect for empty path and causes
out-of-bound memory access.

Get rid of this "increase by one". cifs_convert_path_to_utf16() already
contains the check for leading backslash in the path.

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=212693
CC: <stable@vger.kernel.org> # v5.6+
Signed-off-by: Eugene Korenevsky <ekorenevsky@astralinux.ru>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

cifs: Return correct error code from smb2_get_enc_key

commit 83728cbf366e334301091d5b808add468ab46b27 upstream.

Avoid a warning if the error percolates back up:

[440700.376476] CIFS VFS: \\otters.example.com crypt_message: Could not get encryption key
[440700.386947] ------------[ cut here ]------------
[440700.386948] err = 1
[440700.386977] WARNING: CPU: 11 PID: 2733 at /build/linux-hwe-5.4-p6lk6L/linux-hwe-5.4-5.4.0/lib/errseq.c:74 errseq_set+0x5c/0x70
...
[440700.397304] CPU: 11 PID: 2733 Comm: tar Tainted: G           OE     5.4.0-70-generic #78~18.04.1-Ubuntu
...
[440700.397334] Call Trace:
[440700.397346]  __filemap_set_wb_err+0x1a/0x70
[440700.397419]  cifs_writepages+0x9c7/0xb30 [cifs]
[440700.397426]  do_writepages+0x4b/0xe0
[440700.397444]  __filemap_fdatawrite_range+0xcb/0x100
[440700.397455]  filemap_write_and_wait+0x42/0xa0
[440700.397486]  cifs_setattr+0x68b/0xf30 [cifs]
[440700.397493]  notify_change+0x358/0x4a0
[440700.397500]  utimes_common+0xe9/0x1c0
[440700.397510]  do_utimes+0xc5/0x150
[440700.397520]  __x64_sys_utimensat+0x88/0xd0

Fixes: 61cfac6f267d ("CIFS: Fix possible use after free in demultiplex thread")
Signed-off-by: Paul Aurich <paul@darkrain42.org>
CC: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

irqchip/gic-v3: Do not enable irqs when handling spurious interrups

commit a97709f563a078e259bf0861cd259aa60332890a upstream.

We triggered the following error while running our 4.19 kernel
with the pseudo-NMI patches backported to it:

[   14.816231] ------------[ cut here ]------------
[   14.816231] kernel BUG at irq.c:99!
[   14.816232] Internal error: Oops - BUG: 0 [#1] SMP
[   14.816232] Process swapper/0 (pid: 0, stack limit = 0x(____ptrval____))
[   14.816233] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G           O      4.19.95.aarch64 #14
[   14.816233] Hardware name: evb (DT)
[   14.816234] pstate: 80400085 (Nzcv daIf +PAN -UAO)
[   14.816234] pc : asm_nmi_enter+0x94/0x98
[   14.816235] lr : asm_nmi_enter+0x18/0x98
[   14.816235] sp : ffff000008003c50
[   14.816235] pmr_save: 00000070
[   14.816237] x29: ffff000008003c50 x28: ffff0000095f56c0
[   14.816238] x27: 0000000000000000 x26: ffff000008004000
[   14.816239] x25: 00000000015e0000 x24: ffff8008fb916000
[   14.816240] x23: 0000000020400005 x22: ffff0000080817cc
[   14.816241] x21: ffff000008003da0 x20: 0000000000000060
[   14.816242] x19: 00000000000003ff x18: ffffffffffffffff
[   14.816243] x17: 0000000000000008 x16: 003d090000000000
[   14.816244] x15: ffff0000095ea6c8 x14: ffff8008fff5ab40
[   14.816244] x13: ffff8008fff58b9d x12: 0000000000000000
[   14.816245] x11: ffff000008c8a200 x10: 000000008e31fca5
[   14.816246] x9 : ffff000008c8a208 x8 : 000000000000000f
[   14.816247] x7 : 0000000000000004 x6 : ffff8008fff58b9e
[   14.816248] x5 : 0000000000000000 x4 : 0000000080000000
[   14.816249] x3 : 0000000000000000 x2 : 0000000080000000
[   14.816250] x1 : 0000000000120000 x0 : ffff0000095f56c0
[   14.816251] Call trace:
[   14.816251]  asm_nmi_enter+0x94/0x98
[   14.816251]  el1_irq+0x8c/0x180                    (IRQ C)
[   14.816252]  gic_handle_irq+0xbc/0x2e4
[   14.816252]  el1_irq+0xcc/0x180                    (IRQ B)
[   14.816253]  arch_timer_handler_virt+0x38/0x58
[   14.816253]  handle_percpu_devid_irq+0x90/0x240
[   14.816253]  generic_handle_irq+0x34/0x50
[   14.816254]  __handle_domain_irq+0x68/0xc0
[   14.816254]  gic_handle_irq+0xf8/0x2e4
[   14.816255]  el1_irq+0xcc/0x180                    (IRQ A)
[   14.816255]  arch_cpu_idle+0x34/0x1c8
[   14.816255]  default_idle_call+0x24/0x44
[   14.816256]  do_idle+0x1d0/0x2c8
[   14.816256]  cpu_startup_entry+0x28/0x30
[   14.816256]  rest_init+0xb8/0xc8
[   14.816257]  start_kernel+0x4c8/0x4f4
[   14.816257] Code: 940587f1 d5384100 b9401001 36a7fd01 (d4210000)
[   14.816258] Modules linked in: start_dp(O) smeth(O)
[   15.103092] ---[ end trace 701753956cb14aa8 ]---
[   15.103093] Kernel panic - not syncing: Fatal exception in interrupt
[   15.103099] SMP: stopping secondary CPUs
[   15.103100] Kernel Offset: disabled
[   15.103100] CPU features: 0x36,a2400218
[   15.103100] Memory Limit: none

which is cause by a 'BUG_ON(in_nmi())' in nmi_enter().

From the call trace, we can find three interrupts (noted A, B, C above):
interrupt (A) is preempted by (B), which is further interrupted by (C).

Subsequent investigations show that (B) results in nmi_enter() being
called, but that it actually is a spurious interrupt. Furthermore,
interrupts are reenabled in the context of (B), and (C) fires with
NMI priority. We end-up with a nested NMI situation, something
we definitely do not want to (and cannot) handle.

The bug here is that spurious interrupts should never result in any
state change, and we should just return to the interrupted context.
Moving the handling of spurious interrupts as early as possible in
the GICv3 handler fixes this issue.

Fixes: 3f1f3234bc2d ("irqchip/gic-v3: Switch to PMR masking before calling IRQ handler")
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: He Ying <heying24@huawei.com>
[maz: rewrote commit message, corrected Fixes: tag]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210423083516.170111-1-heying24@huawei.com
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mmc: core: Fix hanging on I/O during system suspend for removable cards

commit 17a17bf50612e6048a9975450cf1bd30f93815b5 upstream.

The mmc core uses a PM notifier to temporarily during system suspend, turn
off the card detection mechanism for removal/insertion of (e)MMC/SD/SDIO
cards. Additionally, the notifier may be used to remove an SDIO card
entirely, if a corresponding SDIO functional driver don't have the system
suspend/resume callbacks assigned. This behaviour has been around for a
very long time.

However, a recent bug report tells us there are problems with this
approach. More precisely, when receiving the PM_SUSPEND_PREPARE
notification, we may end up hanging on I/O to be completed, thus also
preventing the system from getting suspended.

In the end what happens, is that the cancel_delayed_work_sync() in
mmc_pm_notify() ends up waiting for mmc_rescan() to complete - and since
mmc_rescan() wants to claim the host, it needs to wait for the I/O to be
completed first.

Typically, this problem is triggered in Android, if there is ongoing I/O
while the user decides to suspend, resume and then suspend the system
again. This due to that after the resume, an mmc_rescan() work gets punted
to the workqueue, which job is to verify that the card remains inserted
after the system has resumed.

To fix this problem, userspace needs to become frozen to suspend the I/O,
prior to turning off the card detection mechanism. Therefore, let's drop
the PM notifiers for mmc subsystem altogether and rely on the card
detection to be turned off/on as a part of the system_freezable_wq, that we
are already using.

Moreover, to allow and SDIO card to be removed during system suspend, let's
manage this from a ->prepare() callback, assigned at the mmc_host_class
level. In this way, we can use the parent device (the mmc_host_class
device), to remove the card device that is the child, in the
device_prepare() phase.

Reported-by: Kiwoong Kim <kwmad.kim@samsung.com>
Cc: stable@vger.kernel.org # v4.5+
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20210310152900.149380-1-ulf.hansson@linaro.org
Reviewed-by: Kiwoong Kim <kwmad.kim@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mmc: core: Set read only for SD cards with permanent write protect bit

commit 917a5336f2c27928be270226ab374ed0cbf3805d upstream.

Some of SD cards sets permanent write protection bit in their CSD register,
due to lifespan or internal problem. To avoid unnecessary I/O write
operations, let's parse the bits in the CSD during initialization and mark
the card as read only for this case.

Signed-off-by: Seunghui Lee <sh043.lee@samsung.com>
Link: https://lore.kernel.org/r/20210222083156.19158-1-sh043.lee@samsung.com
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mmc: core: Do a power cycle when the CMD11 fails

commit 147186f531ae49c18b7a9091a2c40e83b3d95649 upstream.

A CMD11 is sent to the SD/SDIO card to start the voltage switch procedure
into 1.8V I/O. According to the SD spec a power cycle is needed of the
card, if it turns out that the CMD11 fails. Let's fix this, to allow a
retry of the initialization without the voltage switch, to succeed.

Note that, whether it makes sense to also retry with the voltage switch
after the power cycle is a bit more difficult to know. At this point, we
treat it like the CMD11 isn't supported and therefore we skip it when
retrying.

Signed-off-by: DooHyun Hwang <dh0421.hwang@samsung.com>
Link: https://lore.kernel.org/r/20210210045936.7809-1-dh0421.hwang@samsung.com
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mmc: block: Issue a cache flush only when it's enabled

commit 97fce126e279690105ee15be652b465fd96f9997 upstream.

In command queueing mode, the cache isn't flushed via the mmc_flush_cache()
function, but instead by issuing a CMDQ_TASK_MGMT (CMD48) with a
FLUSH_CACHE opcode. In this path, we need to check if cache has been
enabled, before deciding to flush the cache, along the lines of what's
being done in mmc_flush_cache().

To fix this problem, let's add a new bus ops callback ->cache_enabled() and
implement it for the mmc bus type. In this way, the mmc block device driver
can call it to know whether cache flushing should be done.

Fixes: 1e8e55b67030 (mmc: block: Add CQE support)
Cc: stable@vger.kernel.org
Reported-by: Brendan Peter <bpeter@lytx.com>
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Tested-by: Brendan Peter <bpeter@lytx.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20210425060207.2591-2-avri.altman@wdc.com
Link: https://lore.kernel.org/r/20210425060207.2591-3-avri.altman@wdc.com
[Ulf: Squashed the two patches and made some minor updates]
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mmc: block: Update ext_csd.cache_ctrl if it was written

commit aea0440ad023ab0662299326f941214b0d7480bd upstream.

The cache function can be turned ON and OFF by writing to the CACHE_CTRL
byte (EXT_CSD byte [33]). However, card->ext_csd.cache_ctrl is only
set on init if cache size > 0.

Fix that by explicitly setting ext_csd.cache_ctrl on ext-csd write.

Signed-off-by: Avri Altman <avri.altman@wdc.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210420134641.57343-3-avri.altman@wdc.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mmc: sdhci-tegra: Add required callbacks to set/clear CQE_EN bit

commit 5ec6fa5a6dc5e42a4aa782f3a81d5f08b0fac1e6 upstream.

CMD8 is not supported with Command Queue Enabled. Add required callback
to clear CQE_EN and CQE_INTR fields in the host controller register
before sending CMD8. Add corresponding callback in the CQHCI resume path
to re-enable CQE_EN and CQE_INTR fields.

Reported-by: Kamal Mostafa <kamal@canonical.com>
Tested-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Aniruddha Tvs Rao <anrao@nvidia.com>
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20210407094617.770495-1-jonathanh@nvidia.com
Cc: stable@vger.kernel.org # v5.10+
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mmc: sdhci-pci: Fix initialization of some SD cards for Intel BYT-based controllers

commit 2970134b927834e9249659a70aac48e62dff804a upstream.

Bus power may control card power, but the full reset done by SDHCI at
initialization still may not reset the power, whereas a direct write to
SDHCI_POWER_CONTROL can. That might be needed to initialize correctly, if
the card was left powered on previously.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210331081752.23621-1-adrian.hunter@intel.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mmc: sdhci: Check for reset prior to DMA address unmap

commit 21e35e898aa9ef7781632959db8613a5380f2eae upstream.

For data read commands, SDHC may initiate data transfers even before it
completely process the command response. In case command itself fails,
driver un-maps the memory associated with data transfer but this memory
can still be accessed by SDHC for the already initiated data transfer.
This scenario can lead to un-mapped memory access error.

To avoid this scenario, reset SDHC (when command fails) prior to
un-mapping memory. Resetting SDHC ensures that all in-flight data
transfers are either aborted or completed. So we don't run into this
scenario.

Swap the reset, un-map steps sequence in sdhci_request_done().

Suggested-by: Veerabhadrarao Badiganti <vbadigan@codeaurora.org>
Signed-off-by: Pradeep P V K <pragalla@codeaurora.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/1614760331-43499-1-git-send-email-pragalla@qti.qualcomm.com
Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mmc: uniphier-sd: Fix a resource leak in the remove function

commit e29c84857e2d51aa017ce04284b962742fb97d9e upstream.

A 'tmio_mmc_host_free()' call is missing in the remove function, in order
to balance a 'tmio_mmc_host_alloc()' call in the probe.
This is done in the error handling path of the probe, but not in the remove
function.

Add the missing call.

Fixes: 3fd784f745dd ("mmc: uniphier-sd: add UniPhier SD/eMMC controller driver")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20210220142953.918608-1-christophe.jaillet@wanadoo.fr
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mmc: uniphier-sd: Fix an error handling path in uniphier_sd_probe()

commit b03aec1c1f337dfdae44cdb0645ecac34208ae0a upstream.

A 'uniphier_sd_clk_enable()' call should be balanced by a corresponding
'uniphier_sd_clk_disable()' call.
This is done in the remove function, but not in the error handling path of
the probe.

Add the missing call.

Fixes: 3fd784f745dd ("mmc: uniphier-sd: add UniPhier SD/eMMC controller driver")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20210220142935.918554-1-christophe.jaillet@wanadoo.fr
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

scsi: mpt3sas: Block PCI config access from userspace during reset

commit 3c8604691d2acc7b7d4795d9695070de9eaa5828 upstream.

While diag reset is in progress there is short duration where all access to
controller's PCI config space from the host needs to be blocked. This is
due to a hardware limitation of the IOC controllers.

Block all access to controller's config space from userland applications by
calling pci_cfg_access_lock() while diag reset is in progress and unlocking
it again after the controller comes back to ready state.

Link: https://lore.kernel.org/r/20210330105137.20728-1-sreekanth.reddy@broadcom.com
Cc: stable@vger.kernel.org #v5.4.108+
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

scsi: mpt3sas: Only one vSES is present even when IOC has multi vSES

commit 4c51f956965120b3441cdd39c358b87daba13e19 upstream.

Whenever the driver is adding a vSES to virtual-phys list it is
reinitializing the list head. Hence those vSES devices which were added
previously are lost.

Stop reinitializing the list every time a new vSES device is added.

Link: https://lore.kernel.org/r/20210330105004.20413-1-sreekanth.reddy@broadcom.com
Cc: stable@vger.kernel.org #v5.11.10+
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

scsi: qla2xxx: Fix crash in qla2xxx_mqueuecommand()

commit 6641df81ab799f28a5d564f860233dd26cca0d93 upstream.

    RIP: 0010:kmem_cache_free+0xfa/0x1b0
    Call Trace:
       qla2xxx_mqueuecommand+0x2b5/0x2c0 [qla2xxx]
       scsi_queue_rq+0x5e2/0xa40
       __blk_mq_try_issue_directly+0x128/0x1d0
       blk_mq_request_issue_directly+0x4e/0xb0

Fix incorrect call to free srb in qla2xxx_mqueuecommand(), as srb is now
allocated by upper layers. This fixes smatch warning of srb unintended
free.

Link: https://lore.kernel.org/r/20210329085229.4367-7-njavali@marvell.com
Fixes: af2a0c51b120 ("scsi: qla2xxx: Fix SRB leak on switch command timeout")
Cc: stable@vger.kernel.org # 5.5
Reported-by: Laurence Oberman <loberman@redhat.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

scsi: lpfc: Fix rmmod crash due to bad ring pointers to abort_iotag

commit 078c68b87a717b9fcd8e0f2109f73456fbc55490 upstream.

Rmmod on SLI-4 adapters is sometimes hitting a bad ptr dereference in
lpfc_els_free_iocb().

A prior patch refactored the lpfc_sli_abort_iocb() routine. One of the
changes was to convert from building/sending an abort within the routine to
using a common routine. The reworked routine passes, without modification,
the pring ptr to the new common routine. The older routine had logic to
check SLI-3 vs SLI-4 and adapt the pring ptr if necessary as callers were
passing SLI-3 pointers even when not on an SLI-4 adapter. The new routine
is missing this check and adapt, so the SLI-3 ring pointers are being used
in SLI-4 paths.

Fix by cleaning up the calling routines. In review, there is no need to
pass the ring ptr argument to abort_iocb at all. The routine can look at
the adapter type itself and reference the proper ring.

Link: https://lore.kernel.org/r/20210412013127.2387-2-jsmart2021@gmail.com
Fixes: db7531d2b377 ("scsi: lpfc: Convert abort handling to SLI-3 and SLI-4 handlers")
Cc: <stable@vger.kernel.org> # v5.11+
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

scsi: qla2xxx: Reserve extra IRQ vectors

commit f02d4086a8f36a0e1aaebf559b54cf24a177a486 upstream.

Commit a6dcfe08487e ("scsi: qla2xxx: Limit interrupt vectors to number of
CPUs") lowers the number of allocated MSI-X vectors to the number of CPUs.

That breaks vector allocation assumptions in qla83xx_iospace_config(),
qla24xx_enable_msix() and qla2x00_iospace_config(). Either of the functions
computes maximum number of qpairs as:

  ha->max_qpairs = ha->msix_count - 1 (MB interrupt) - 1 (default
                   response queue) - 1 (ATIO, in dual or pure target mode)

max_qpairs is set to zero in case of two CPUs and initiator mode. The
number is then used to allocate ha->queue_pair_map inside
qla2x00_alloc_queues(). No allocation happens and ha->queue_pair_map is
left NULL but the driver thinks there are queue pairs available.

qla2xxx_queuecommand() tries to find a qpair in the map and crashes:

  if (ha->mqenable) {
          uint32_t tag;
          uint16_t hwq;
          struct qla_qpair *qpair = NULL;

          tag = blk_mq_unique_tag(cmd->request);
          hwq = blk_mq_unique_tag_to_hwq(tag);
          qpair = ha->queue_pair_map[hwq]; # <- HERE

          if (qpair)
                  return qla2xxx_mqueuecommand(host, cmd, qpair);
  }

  BUG: kernel NULL pointer dereference, address: 0000000000000000
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: 0000 [#1] SMP PTI
  CPU: 0 PID: 72 Comm: kworker/u4:3 Tainted: G        W         5.10.0-rc1+ #25
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
  Workqueue: scsi_wq_7 fc_scsi_scan_rport [scsi_transport_fc]
  RIP: 0010:qla2xxx_queuecommand+0x16b/0x3f0 [qla2xxx]
  Call Trace:
   scsi_queue_rq+0x58c/0xa60
   blk_mq_dispatch_rq_list+0x2b7/0x6f0
   ? __sbitmap_get_word+0x2a/0x80
   __blk_mq_sched_dispatch_requests+0xb8/0x170
   blk_mq_sched_dispatch_requests+0x2b/0x50
   __blk_mq_run_hw_queue+0x49/0xb0
   __blk_mq_delay_run_hw_queue+0xfb/0x150
   blk_mq_sched_insert_request+0xbe/0x110
   blk_execute_rq+0x45/0x70
   __scsi_execute+0x10e/0x250
   scsi_probe_and_add_lun+0x228/0xda0
   __scsi_scan_target+0xf4/0x620
   ? __pm_runtime_resume+0x4f/0x70
   scsi_scan_target+0x100/0x110
   fc_scsi_scan_rport+0xa1/0xb0 [scsi_transport_fc]
   process_one_work+0x1ea/0x3b0
   worker_thread+0x28/0x3b0
   ? process_one_work+0x3b0/0x3b0
   kthread+0x112/0x130
   ? kthread_park+0x80/0x80
   ret_from_fork+0x22/0x30

The driver should allocate enough vectors to provide every CPU it's own HW
queue and still handle reserved (MB, RSP, ATIO) interrupts.

The change fixes the crash on dual core VM and prevents unbalanced QP
allocation where nr_hw_queues is two less than the number of CPUs.

Link: https://lore.kernel.org/r/20210412165740.39318-1-r.bolshakov@yadro.com
Fixes: a6dcfe08487e ("scsi: qla2xxx: Limit interrupt vectors to number of CPUs")
Cc: Daniel Wagner <daniel.wagner@suse.com>
Cc: Himanshu Madhani <himanshu.madhani@oracle.com>
Cc: Quinn Tran <qutran@marvell.com>
Cc: Nilesh Javali <njavali@marvell.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: stable@vger.kernel.org # 5.11+
Reported-by: Aleksandr Volkov <a.y.volkov@yadro.com>
Reported-by: Aleksandr Miloserdov <a.miloserdov@yadro.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

libceph: allow addrvecs with a single NONE/blank address

commit 3f1c6f2122fc780560f09735b6d1dbf39b44eb0f upstream.

Normally, an unused OSD id/slot is represented by an empty addrvec.
However, it also appears to be possible to generate an osdmap where
an unused OSD id/slot has an addrvec with a single blank address of
type NONE. Allow such addrvecs and make the end result be exactly
the same as for the empty addrvec case -- leave addr intact.

Cc: stable@vger.kernel.org # 5.11+
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

libceph: bump CephXAuthenticate encoding version

commit 7807dafda21a549403d922da98dde0ddfeb70d08 upstream.

A dummy v3 encoding (exactly the same as v2) was introduced so that
the monitors can distinguish broken clients that may not include their
auth ticket in CEPHX_GET_AUTH_SESSION_KEY request on reconnects, thus
failing to prove previous possession of their global_id (one part of
CVE-2021-20288).

The kernel client has always included its auth ticket, so it is
compatible with enforcing mode as is. However we want to bump the
encoding version to avoid having to authenticate twice on the initial
connect -- all legacy (CephXAuthenticate < v3) are now forced do so in
order to expose insecure global_id reclaim.

Marking for stable since at least for 5.11 and 5.12 it is trivial
(v2 -> v3).

Cc: stable@vger.kernel.org # 5.11+
URL: https://tracker.ceph.com/issues/50452
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

spi: spi-ti-qspi: Free DMA resources

commit 1d309cd688a76fb733f0089d36dc630327b32d59 upstream.

Release the RX channel and free the dma coherent memory when
devm_spi_register_master() fails.

Fixes: 5720ec0a6d26 ("spi: spi-ti-qspi: Add DMA support for QSPI mmap read")
Cc: stable@vger.kernel.org
Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Link: https://lore.kernel.org/r/20210218130950.90155-1-tudor.ambarus@microchip.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

spi: stm32-qspi: fix pm_runtime usage_count counter

commit 102e9d1936569d43f55dd1ea89be355ad207143c upstream.

pm_runtime usage_count counter is not well managed.
pm_runtime_put_autosuspend callback drops the usage_counter but this
one has never been increased. Add pm_runtime_get_sync callback to bump up
the usage counter. It is also needed to use pm_runtime_force_suspend and
pm_runtime_force_resume APIs to handle properly the clock.

Fixes: 9d282c17b023 ("spi: stm32-qspi: Add pm_runtime support")
Signed-off-by: Christophe Kerello <christophe.kerello@foss.st.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210419121541.11617-2-patrice.chotard@foss.st.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

erofs: add unsupported inode i_format check

commit 24a806d849c0b0c1d0cd6a6b93ba4ae4c0ec9f08 upstream.

If any unknown i_format fields are set (may be of some new incompat
inode features), mark such inode as unsupported.

Just in case of any new incompat i_format fields added in the future.

Link: https://lore.kernel.org/r/20210329003614.6583-1-hsiangkao@aol.com
Fixes: 431339ba9042 ("staging: erofs: add inode operations")
Cc: <stable@vger.kernel.org> # 4.19+
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mtd: physmap: physmap-bt1-rom: Fix unintentional stack access

commit 683313993dbe1651c7aa00bb42a041d70e914925 upstream.

Cast &data to (char *) in order to avoid unintentionally accessing
the stack.

Notice that data is of type u32, so any increment to &data
will be in the order of 4-byte chunks, and this piece of code
is actually intended to be a byte offset.

Fixes: b3e79e7682e0 ("mtd: physmap: Add Baikal-T1 physically mapped ROM support")
Addresses-Coverity-ID: 1497765 ("Out-of-bounds access")
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Acked-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20210212104022.GA242669@embeddedor
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mtd: rawnand: atmel: Update ecc_stats.corrected counter

commit 33cebf701e98dd12b01d39d1c644387b27c1a627 upstream.

Update MTD ECC statistics with the number of corrected bits.

Fixes: f88fc122cc34 ("mtd: nand: Cleanup/rework the atmel_nand driver")
Cc: stable@vger.kernel.org
Signed-off-by: Kai Stuhlemmer (ebee Engineering) <kai.stuhlemmer@ebee.de>
Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20210322150714.101585-1-tudor.ambarus@microchip.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mtd: spinand: core: add missing MODULE_DEVICE_TABLE()

commit 25fefc88c71f47db0466570335e3f75f10952e7a upstream.

The module misses MODULE_DEVICE_TABLE() for both SPI and OF ID tables
and thus never autoloads on ID matches.
Add the missing declarations.
Present since day-0 of spinand framework introduction.

Fixes: 7529df465248 ("mtd: nand: Add core infrastructure to support SPI NANDs")
Cc: stable@vger.kernel.org # 4.19+
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20210323173714.317884-1-alobakin@pm.me
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Revert "mtd: spi-nor: macronix: Add support for mx25l51245g"

commit 46094049a49be777f12a9589798f7c70b90cd03f upstream.

This reverts commit 04b8edad262eec0d153005973dfbdd83423c0dcb.

mx25l51245g and mx66l51235l have the same flash ID. The flash
detection returns the first entry in the flash_info array that
matches the flash ID that was read, thus for the 0xc2201a ID,
mx25l51245g was always hit, introducing a regression for
mx66l51235l.

If one wants to differentiate the flash names, a better fix would be
to differentiate between the two at run-time, depending on SFDP,
and choose the correct name from a list of flash names, depending on
the SFDP differentiator.

Fixes: 04b8edad262e ("mtd: spi-nor: macronix: Add support for mx25l51245g")
Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Acked-by: Pratyush Yadav <p.yadav@ti.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210402082031.19055-2-tudor.ambarus@microchip.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mtd: spi-nor: core: Fix an issue of releasing resources during read/write

commit be94215be1ab19e5d38f50962f611c88d4bfc83a upstream.

If rmmod the driver during read or write, the driver will release the
resources which are used during read or write, so it is possible to
refer to NULL pointer.

Use the testcase "mtd_debug read /dev/mtd0 0xc00000 0x400000 dest_file &
sleep 0.5;rmmod spi_hisi_sfc_v3xx.ko", the issue can be reproduced in
hisi_sfc_v3xx driver.

To avoid the issue, fill the interface _get_device and _put_device of
mtd_info to grab the reference to the spi controller driver module, so
the request of rmmod the driver is rejected before read/write is finished.

Fixes: b199489d37b2 ("mtd: spi-nor: add the framework for SPI NOR")
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Tested-by: Michael Walle <michael@walle.cc>
Tested-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Reviewed-by: Michael Walle <michael@walle.cc>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/1617262486-4223-1-git-send-email-yangyicong@hisilicon.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

reset: add missing empty function reset_control_rearm()

commit 48582b2e3b87b794a9845d488af2c76ce055502b upstream.

All other functions are defined for when CONFIG_RESET_CONTROLLER
is not set.

Fixes: 557acb3d2cd9 ("reset: make shared pulsed reset controls re-triggerable")
Link: https://lore.kernel.org/r/20210430152156.21162-2-jim2101024@gmail.com
Signed-off-by: Jim Quinlan <jim2101024@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # v5.11+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

fs/epoll: restore waking from ep_done_scan()

commit 7fab29e356309ff93a4b30ecc466129682ec190b upstream.

Commit 339ddb53d373 ("fs/epoll: remove unnecessary wakeups of nested
epoll") changed the userspace visible behavior of exclusive waiters
blocked on a common epoll descriptor upon a single event becoming ready.

Previously, all tasks doing epoll_wait would awake, and now only one is
awoken, potentially causing missed wakeups on applications that rely on
this behavior, such as Apache Qpid.

While the aforementioned commit aims at having only a wakeup single path
in ep_poll_callback (with the exceptions of epoll_ctl cases), we need to
restore the wakeup in what was the old ep_scan_ready_list() such that
the next thread can be awoken, in a cascading style, after the waker's
corresponding ep_send_events().

Link: https://lkml.kernel.org/r/20210405231025.33829-3-dave@stgolabs.net
Fixes: 339ddb53d373 ("fs/epoll: remove unnecessary wakeups of nested epoll")
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Roman Penyaev <rpenyaev@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ecryptfs: fix kernel panic with null dev_name

commit 9046625511ad8dfbc8c6c2de16b3532c43d68d48 upstream.

When mounting eCryptfs, a null "dev_name" argument to ecryptfs_mount()
causes a kernel panic if the parsed options are valid. The easiest way to
reproduce this is to call mount() from userspace with an existing
eCryptfs mount's options and a "source" argument of 0.

Error out if "dev_name" is null in ecryptfs_mount()

Fixes: 237fead61998 ("[PATCH] ecryptfs: fs/Makefile and fs/Kconfig")
Cc: stable@vger.kernel.org
Signed-off-by: Jeffrey Mitchell <jeffrey.mitchell@starlab.io>
Signed-off-by: Tyler Hicks <code@tyhicks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: dts: mt8173: fix property typo of 'phys' in dsi node

commit e4e5d030bd779fb8321d3b8bd65406fbe0827037 upstream.

Use 'phys' instead of 'phy'.

Fixes: 81ad4dbaf7af ("arm64: dts: mt8173: Add display subsystem related nodes")
Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com>
Reviewed-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210316092232.9806-5-chunfeng.yun@mediatek.com
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: dts: marvell: armada-37xx: add syscon compatible to NB clk node

commit 1d88358a89dbac9c7d4559548b9a44840456e6fb upstream.

Add "syscon" compatible to the North Bridge clocks node to allow the
cpufreq driver to access these registers via syscon API.

This is needed for a fix of cpufreq driver.

Signed-off-by: Marek Behún <kabel@kernel.org>
Fixes: e8d66e7927b2 ("arm64: dts: marvell: armada-37xx: add nodes...")
Cc: stable@vger.kernel.org
Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ARM: 9056/1: decompressor: fix BSS size calculation for LLVM ld.lld

commit c4e792d1acce31c2eb7b9193ab06ab94de05bf42 upstream.

The LLVM ld.lld linker uses a different symbol type for __bss_start,
resulting in the calculation of KBSS_SZ to be thrown off. Up until now,
this has gone unnoticed as it only affects the appended DTB case, but
pending changes for ARM in the way the decompressed kernel is cleaned
from the caches has uncovered this problem.

On a ld.lld build:

  $ nm vmlinux |grep bss_
  c1c22034 D __bss_start
  c1c86e98 B __bss_stop

resulting in

  $ readelf -s arch/arm/boot/compressed/vmlinux | grep bss_size
  433: c1c86e98     0 NOTYPE  GLOBAL DEFAULT  ABS _kernel_bss_size

which is obviously incorrect, and may cause the cache clean to access
unmapped memory, or cause the size calculation to wrap, resulting in no
cache clean to be performed at all.

Fix this by updating the sed regex to take D type symbols into account.

Link: https://lore.kernel.org/linux-arm-kernel/6c65bcef-d4e7-25fa-43cf-2c435bb61bb9@collabora.com/
Link: https://lore.kernel.org/linux-arm-kernel/20210205085220.31232-1-ardb@kernel.org/
Cc: <stable@vger.kernel.org> # v4.19+
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Reported-by: Guillaume Tucker <guillaume.tucker@collabora.com>
Reported-by: "kernelci.org bot" <bot@kernelci.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ftrace: Handle commands when closing set_ftrace_filter file

commit 8c9af478c06bb1ab1422f90d8ecbc53defd44bc3 upstream.

# echo switch_mm:traceoff > /sys/kernel/tracing/set_ftrace_filter

will cause switch_mm to stop tracing by the traceoff command.

# echo -n switch_mm:traceoff > /sys/kernel/tracing/set_ftrace_filter

does nothing.

The reason is that the parsing in the write function only processes
commands if it finished parsing (there is white space written after the
command). That's to handle:

write(fd, "switch_mm:", 10);
write(fd, "traceoff", 8);

cases, where the command is broken over multiple writes.

The problem is if the file descriptor is closed, then the write call is
not processed, and the command needs to be processed in the release code.
The release code can handle matching of functions, but does not handle
commands.

Cc: stable@vger.kernel.org
Fixes: eda1e32855656 ("tracing: handle broken names in ftrace filter")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ACPI: custom_method: fix a possible memory leak

commit 1cfd8956437f842836e8a066b40d1ec2fc01f13e upstream.

In cm_write(), if the 'buf' is allocated memory but not fully consumed,
it is possible to reallocate the buffer without freeing it by passing
'*ppos' as 0 on a subsequent call.

Add an explicit kfree() before kzalloc() to prevent the possible memory
leak.

Fixes: 526b4af47f44 ("ACPI: Split out custom_method functionality into an own driver")
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ACPI: custom_method: fix potential use-after-free issue

commit e483bb9a991bdae29a0caa4b3a6d002c968f94aa upstream.

In cm_write(), buf is always freed when reaching the end of the
function. If the requested count is less than table.length, the
allocated buffer will be freed but subsequent calls to cm_write() will
still try to access it.

Remove the unconditional kfree(buf) at the end of the function and
set the buf to NULL in the -EINVAL error path to match the rest of
function.

Fixes: 03d1571d9513 ("ACPI: custom_method: fix memory leaks")
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

tpm: acpi: Check eventlog signature before using it

commit 3dcd15665aca80197333500a4be3900948afccc1 upstream.

Check the eventlog signature before using it. This avoids using an
empty log, as may be the case when QEMU created the ACPI tables,
rather than probing the EFI log next. This resolves an issue where
the EFI log was empty since an empty ACPI log was used.

Cc: stable@vger.kernel.org
Fixes: 85467f63a05c ("tpm: Add support for event log pointer found in TPM2 ACPI table")
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

vhost-vdpa: fix vm_flags for virtqueue doorbell mapping

commit 3a3e0fad16d40a2aa68ddf7eea4acdf48b22dd44 upstream.

The virtqueue doorbell is usually implemented via registeres but we
don't provide the necessary vma->flags like VM_PFNMAP. This may cause
several issues e.g when userspace tries to map the doorbell via vhost
IOTLB, kernel may panic due to the page is not backed by page
structure. This patch fixes this by setting the necessary
vm_flags. With this patch, try to map doorbell via IOTLB will fail
with bad address.

Cc: stable@vger.kernel.org
Fixes: ddd89d0a059d ("vhost_vdpa: support doorbell mapping via mmap")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20210413091557.29008-1-jasowang@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

s390/cio: remove invalid condition on IO_SCH_UNREG

commit 2f7484fd73729f89085fe08d683f5a8d9e17fe99 upstream.

The condition to check the cdev pointer validity on
css_sch_device_unregister() is a leftover from the 'commit 8cc0dcfdc1c0
("s390/cio: remove pm support from ccw bus driver")'. This could lead to a
situation, where detaching the device is not happening completely. Remove
this invalid condition in the IO_SCH_UNREG case.

Link: https://lore.kernel.org/r/20210423100843.2230969-1-vneethv@linux.ibm.com
Fixes: 8cc0dcfdc1c0 ("s390/cio: remove pm support from ccw bus driver")
Reported-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Suggested-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Vineeth Vijayan <vneethv@linux.ibm.com>
Tested-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Tested-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks

commit 0cc00c8d40500c4c8fe058dc014bdaf44a82f4f7 upstream.

This patch fixes a lockdep splat introduced by commit f21916ec4826
("s390/vfio-ap: clean up vfio_ap resources when KVM pointer invalidated").
The lockdep splat only occurs when starting a Secure Execution guest.
Crypto virtualization (vfio_ap) is not yet supported for SE guests;
however, in order to avoid this problem when support becomes available,
this fix is being provided.

The circular locking dependency was introduced when the setting of the
masks in the guest's APCB was executed while holding the matrix_dev->lock.
While the lock is definitely needed to protect the setting/unsetting of the
matrix_mdev->kvm pointer, it is not necessarily critical for setting the
masks; so, the matrix_dev->lock will be released while the masks are being
set or cleared.

Keep in mind, however, that another process that takes the matrix_dev->lock
can get control while the masks in the guest's APCB are being set or
cleared as a result of the driver being notified that the KVM pointer
has been set or unset. This could result in invalid access to the
matrix_mdev->kvm pointer by the intervening process. To avoid this
scenario, two new fields are being added to the ap_matrix_mdev struct:

struct ap_matrix_mdev {
...
bool kvm_busy;
wait_queue_head_t wait_for_kvm;
...
};

The functions that handle notification that the KVM pointer value has
been set or cleared will set the kvm_busy flag to true until they are done
processing at which time they will set it to false and wake up the tasks on
the matrix_mdev->wait_for_kvm wait queue. Functions that require
access to matrix_mdev->kvm will sleep on the wait queue until they are
awakened at which time they can safely access the matrix_mdev->kvm
field.

Fixes: f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM pointer invalidated")
Cc: stable@vger.kernel.org
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

s390/zcrypt: fix zcard and zqueue hot-unplug memleak

commit 70fac8088cfad9f3b379c9082832b4d7532c16c2 upstream.

Tests with kvm and a kmemdebug kernel showed, that on hot unplug the
zcard and zqueue structs for the unplugged card or queue are not
properly freed because of a mismatch with get/put for the embedded
kref counter.

This fix now adjusts the handling of the kref counters. With init the
kref counter starts with 1. This initial value needs to drop to zero
with the unregister of the card or queue to trigger the release and
free the object.

Fixes: 29c2680fd2bf ("s390/ap: fix ap devices reference counting")
Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Cc: stable@vger.kernel.org
Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

s390/disassembler: increase ebpf disasm buffer size

commit 6f3353c2d2b3eb4de52e9704cb962712033db181 upstream.

Current ebpf disassembly buffer size of 64 is too small. E.g. this line
takes 65 bytes:
01fffff8005822e: ec8100ed8065\tclgrj\t%r8,%r1,8,001fffff80058408\n\0

Double the buffer size like it is done for the kernel disassembly buffer.

Fixes the following KASAN finding:

UG: KASAN: stack-out-of-bounds in print_fn_code+0x34c/0x380
Write of size 1 at addr 001fff800ad5f970 by task test_progs/853

CPU: 53 PID: 853 Comm: test_progs Not tainted
5.12.0-rc7-23786-g23457d86b1f0-dirty #19
Hardware name: IBM 3906 M04 704 (LPAR)
Call Trace:
[<0000000cd8e0538a>] show_stack+0x17a/0x1668
[<0000000cd8e2a5d8>] dump_stack+0x140/0x1b8
[<0000000cd8e16e74>] print_address_description.constprop.0+0x54/0x260
[<0000000cd75a8698>] kasan_report+0xc8/0x130
[<0000000cd6e26da4>] print_fn_code+0x34c/0x380
[<0000000cd6ea0f4e>] bpf_int_jit_compile+0xe3e/0xe58
[<0000000cd72c4c88>] bpf_prog_select_runtime+0x5b8/0x9c0
[<0000000cd72d1bf8>] bpf_prog_load+0xa78/0x19c0
[<0000000cd72d7ad6>] __do_sys_bpf.part.0+0x18e/0x768
[<0000000cd6e0f392>] do_syscall+0x12a/0x220
[<0000000cd8e333f8>] __do_syscall+0x98/0xc8
[<0000000cd8e54834>] system_call+0x6c/0x94
1 lock held by test_progs/853:
#0: 0000000cd9bf7460 (report_lock){....}-{2:2}, at:
kasan_report+0x96/0x130

addr 001fff800ad5f970 is located in stack of task test_progs/853 at
offset 96 in frame:
print_fn_code+0x0/0x380
this frame has 1 object:
[32, 96) 'buffer'

Memory state around the buggy address:
001fff800ad5f800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
001fff800ad5f880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>001fff800ad5f900: 00 00 f1 f1 f1 f1 00 00 00 00 00 00 00 00 f3 f3
^
001fff800ad5f980: f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 00 00
001fff800ad5fa00: 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00 00

Cc: <stable@vger.kernel.org>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dyndbg: fix parsing file query without a line-range suffix

commit 7b1ae248279bea33af9e797a93c35f49601cb8a0 upstream.

Query like 'file tcp_input.c line 1234 +p' was broken by
commit aaebe329bff0 ("dyndbg: accept 'file foo.c:func1' and 'file
foo.c:10-100'") because a file name without a ':' now makes the loop in
ddebug_parse_query() exits early before parsing the 'line 1234' part.
As a result, all pr_debug() in tcp_input.c will be enabled, instead of only
the one on line 1234. Changing 'break' to 'continue' fixes this.

Fixes: aaebe329bff0 ("dyndbg: accept 'file foo.c:func1' and 'file foo.c:10-100'")
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Shuo Chen <shuochen@google.com>
Acked-by: Jason Baron <jbaron@akamai.com>
Link: https://lore.kernel.org/r/20210414212400.2927281-1-giantchen@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

nitro_enclaves: Fix stale file descriptors on failed usercopy

commit f1ce3986baa62cffc3c5be156994de87524bab99 upstream.

A failing usercopy of the slot uid will lead to a stale entry in the
file descriptor table as put_unused_fd() won't release it. This enables
userland to refer to a dangling 'file' object through that still valid
file descriptor, leading to all kinds of use-after-free exploitation
scenarios.

Exchanging put_unused_fd() for close_fd(), ksys_close() or alike won't
solve the underlying issue, as the file descriptor might have been
replaced in the meantime, e.g. via userland calling close() on it
(leading to a NULL pointer dereference in the error handling code as
'fget(enclave_fd)' will return a NULL pointer) or by dup2()'ing a
completely different file object to that very file descriptor, leading
to the same situation: a dangling file descriptor pointing to a freed
object -- just in this case to a file object of user's choosing.

Generally speaking, after the call to fd_install() the file descriptor
is live and userland is free to do whatever with it. We cannot rely on
it to still refer to our enclave object afterwards. In fact, by abusing
userfaultfd() userland can hit the condition without any racing and
abuse the error handling in the nitro code as it pleases.

To fix the above issues, defer the call to fd_install() until all
possible errors are handled. In this case it's just the usercopy, so do
it directly in ne_create_vm_ioctl() itself.

Signed-off-by: Mathias Krause <minipli@grsecurity.net>
Signed-off-by: Andra Paraschiv <andraprs@amazon.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210429165941.27020-2-andraprs@amazon.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bus: mhi: core: Fix invalid error returning in mhi_queue

commit 0ecc1c70dcd32c0f081b173a1a5d89952686f271 upstream.

mhi_queue returns an error when the doorbell is not accessible in
the current state. This can happen when the device is in non M0
state, like M3, and needs to be waken-up prior ringing the DB. This
case is managed earlier by triggering an asynchronous M3 exit via
controller resume/suspend callbacks, that in turn will cause M0
transition and DB update.

So, since it's not an error but just delaying of doorbell update, there
is no reason to return an error.

This also fixes a use after free error for skb case, indeed a caller
queuing skb will try to free the skb if the queueing fails, but in
that case queueing has been done.

Fixes: a8f75cb348fd ("mhi: core: Factorize mhi queuing")
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Reviewed-by: Jeffrey Hugo <jhugo@codeaurora.org>
Reviewed-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/1614336782-5809-1-git-send-email-loic.poulain@linaro.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bus: mhi: core: Fix MHI runtime_pm behavior

commit 4547a749be997eb12ea7edcf361ec2a5329f7aec upstream.

This change ensures that PM reference is always get during packet
queueing and released either after queuing completion (RX) or once
the buffer has been consumed (TX). This guarantees proper update for
underlying MHI controller runtime status (e.g. last_busy timestamp)
and prevents suspend to be triggered while TX packets are flying,
or before we completed update of the RX ring.

Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/1617700315-12492-1-git-send-email-loic.poulain@linaro.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bus: mhi: pci_generic: Remove WQ_MEM_RECLAIM flag from state workqueue

commit 0fccbf0a3b690b162f53b13ed8bc442ea33437dc upstream.

A recent change created a dedicated workqueue for the state-change work
with WQ_HIGHPRI (no strong reason for that) and WQ_MEM_RECLAIM flags,
but the state-change work (mhi_pm_st_worker) does not guarantee forward
progress under memory pressure, and will even wait on various memory
allocations when e.g. creating devices, loading firmware, etc... The
work is then not part of a memory reclaim path...

Moreover, this causes a warning in check_flush_dependency() since we end
up in code that flushes a non-reclaim workqueue:

[   40.969601] workqueue: WQ_MEM_RECLAIM mhi_hiprio_wq:mhi_pm_st_worker [mhi] is flushing !WQ_MEM_RECLAIM events_highpri:flush_backlog
[   40.969612] WARNING: CPU: 4 PID: 158 at kernel/workqueue.c:2607 check_flush_dependency+0x11c/0x140
[   40.969733] Call Trace:
[   40.969740]  __flush_work+0x97/0x1d0
[   40.969745]  ? wake_up_process+0x15/0x20
[   40.969749]  ? insert_work+0x70/0x80
[   40.969750]  ? __queue_work+0x14a/0x3e0
[   40.969753]  flush_work+0x10/0x20
[   40.969756]  rollback_registered_many+0x1c9/0x510
[   40.969759]  unregister_netdevice_queue+0x94/0x120
[   40.969761]  unregister_netdev+0x1d/0x30
[   40.969765]  mhi_net_remove+0x1a/0x40 [mhi_net]
[   40.969770]  mhi_driver_remove+0x124/0x250 [mhi]
[   40.969776]  device_release_driver_internal+0xf0/0x1d0
[   40.969778]  device_release_driver+0x12/0x20
[   40.969782]  bus_remove_device+0xe1/0x150
[   40.969786]  device_del+0x17b/0x3e0
[   40.969791]  mhi_destroy_device+0x9a/0x100 [mhi]
[   40.969796]  ? mhi_unmap_single_use_bb+0x50/0x50 [mhi]
[   40.969799]  device_for_each_child+0x5e/0xa0
[   40.969804]  mhi_pm_st_worker+0x921/0xf50 [mhi]

Fixes: 8f7039787687 ("bus: mhi: core: Move to using high priority workqueue")
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Reviewed-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/1614161930-8513-1-git-send-email-loic.poulain@linaro.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bus: mhi: core: Add missing checks for MMIO register entries

commit 8de5ad99414347ad08e6ebc2260be1d2e009cb9a upstream.

As per documentation, fields marked as (required) in an MHI
controller structure need to be populated by the controller driver
before calling mhi_register_controller(). Ensure all required
pointers and non-zero fields are present in the controller before
proceeding with the registration.

Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
Reviewed-by: Jeffrey Hugo <jhugo@codeaurora.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/1615315490-36017-1-git-send-email-bbhatt@codeaurora.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bus: mhi: core: Sanity check values from remote device before use

commit ec32332df7645e0ba463a08d483fe97665167071 upstream.

When parsing the structures in the shared memory, there are values which
come from the remote device.  For example, a transfer completion event
will have a pointer to the tre in the relevant channel's transfer ring.
As another example, event ring elements may specify a channel in which
the event occurred, however the specified channel value may not be valid
as no channel is defined at that index even though the index may be less
than the maximum allowed index.  Such values should be considered to be
untrusted, and validated before use.  If we blindly use such values, we
may access invalid data or crash if the values are corrupted.

If validation fails, drop the relevant event.

Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Hemant Kumar <hemantk@codeaurora.org>
Link: https://lore.kernel.org/r/1615411855-15053-1-git-send-email-jhugo@codeaurora.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bus: mhi: core: Clear configuration from channel context during reset

commit 47705c08465931923e2f2b506986ca0bdf80380d upstream.

When clearing up the channel context after client drivers are
done using channels, the configuration is currently not being
reset entirely. Ensure this is done to appropriately handle
issues where clients unaware of the context state end up calling
functions which expect a context.

Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
Reviewed-by: Hemant Kumar <hemantk@codeaurora.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/1617311778-1254-7-git-send-email-bbhatt@codeaurora.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bus: mhi: core: Fix check for syserr at power_up

commit 6403298c58d4858d93648f553abf0bcbd2dfaca2 upstream.

The check to see if we have reset the device after detecting syserr at
power_up is inverted.  wait_for_event_timeout() returns 0 on failure,
and a positive value on success.  The check is looking for non-zero
as a failure, which is likely to incorrectly cause a device init failure
if syserr was detected at power_up.  Fix this.

Fixes: e18d4e9fa79b ("bus: mhi: core: Handle syserr during power_up")
Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org>
Reviewed-by: Loic Poulain <loic.poulain@linaro.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/1613165243-23359-1-git-send-email-jhugo@codeaurora.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Linux 5.12.2

Tested-by: Fox Chen <foxhlchen@gmail.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Jason Self <jason@bluehome.net>
Link: https://lore.kernel.org/r/20210505112324.956720416@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

perf/core: Fix unconditional security_locked_down() call

commit 08ef1af4de5fe7de9c6d69f1e22e51b66e385d9b upstream.

Currently, the lockdown state is queried unconditionally, even though
its result is used only if the PERF_SAMPLE_REGS_INTR bit is set in
attr.sample_type. While that doesn't matter in case of the Lockdown LSM,
it causes trouble with the SELinux's lockdown hook implementation.

SELinux implements the locked_down hook with a check whether the current
task's type has the corresponding "lockdown" class permission
("integrity" or "confidentiality") allowed in the policy. This means
that calling the hook when the access control decision would be ignored
generates a bogus permission check and audit record.

Fix this by checking sample_type first and only calling the hook when
its result would be honored.

Fixes: b0c8fdc7fdb7 ("lockdown: Lock down perf when in confidentiality mode")
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Paul Moore <paul@paul-moore.com>
Link: https://lkml.kernel.org/r/20210224215628.192519-1-omosnace@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

platform/x86: thinkpad_acpi: Correct thermal sensor allocation

commit 6759e18e5cd8745a5dfc5726e4a3db5281ec1639 upstream.

On recent Thinkpad platforms it was reported that temp sensor 11 was
always incorrectly displaying 66C. It turns out the reason for this is
that this location in EC RAM is not a temperature sensor but is the
power supply ID (offset 0xC2).

Based on feedback from the Lenovo firmware team the EC RAM version can
be determined and for the current version (3) only the 0x78 to 0x7F
range is used for temp sensors. I don't have any details for earlier
versions so I have left the implementation unaltered there.

Note - in this block only 0x78 and 0x79 are officially designated (CPU &
GPU sensors). The use of the other locations in the block will vary from
platform to platform; but the existing logic to detect a sensor presence
holds.

Signed-off-by: Mark Pearson <markpearson@lenovo.com>
Link: https://lore.kernel.org/r/20210407212015.298222-1-markpearson@lenovo.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

USB: Add reset-resume quirk for WD19's Realtek Hub

commit ca91fd8c7643d93bfc18a6fec1a0d3972a46a18a upstream.

Realtek Hub (0bda:5487) in Dell Dock WD19 sometimes fails to work
after the system resumes from suspend with remote wakeup enabled
device connected:
[ 1947.640907] hub 5-2.3:1.0: hub_ext_port_status failed (err = -71)
[ 1947.641208] usb 5-2.3-port5: cannot disable (err = -71)
[ 1947.641401] hub 5-2.3:1.0: hub_ext_port_status failed (err = -71)
[ 1947.641450] usb 5-2.3-port4: cannot reset (err = -71)

Information of this hub:
T:  Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 10 Spd=480  MxCh= 5
D:  Ver= 2.10 Cls=09(hub  ) Sub=00 Prot=02 MxPS=64 #Cfgs=  1
P:  Vendor=0bda ProdID=5487 Rev= 1.47
S:  Manufacturer=Dell Inc.
S:  Product=Dell dock
C:* #Ifs= 1 Cfg#= 1 Atr=e0 MxPwr=  0mA
I:  If#= 0 Alt= 0 #EPs= 1 Cls=09(hub  ) Sub=00 Prot=01 Driver=hub
E:  Ad=81(I) Atr=03(Int.) MxPS=   1 Ivl=256ms
I:* If#= 0 Alt= 1 #EPs= 1 Cls=09(hub  ) Sub=00 Prot=02 Driver=hub
E:  Ad=81(I) Atr=03(Int.) MxPS=   1 Ivl=256ms

The failure results from the ETIMEDOUT by chance when turning on
the suspend feature for the specified port of the hub. The port
seems to be in an unknown state so the hub_activate during resume
fails the hub_port_status, then the hub will fail to work.

The quirky hub needs the reset-resume quirk to function correctly.

Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Chris Chiu <chris.chiu@canonical.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210420174651.6202-1-chris.chiu@canonical.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

USB: Add LPM quirk for Lenovo ThinkPad USB-C Dock Gen2 Ethernet

commit 8f23fe35ff1e5491b4d279323a8209a31f03ae65 upstream.

This is another branded 8153 device that doesn't work well with LPM
enabled:
[ 400.597506] r8152 5-1.1:1.0 enx482ae3a2a6f0: Tx status -71

So disable LPM to resolve the issue.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1922651
Link: https://lore.kernel.org/r/20210412135455.791971-1-kai.heng.feng@canonical.com
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: usb-audio: Fix implicit sync clearance at stopping stream

commit 988cc17552606be67a956cf8cd6ff504cfc5d643 upstream.

The recent endpoint management change for implicit feedback mode added
a clearance of ep->sync_sink (formerly ep->sync_slave) pointer at
snd_usb_endpoint_stop() to assure no leftover for the feedback from
the already stopped capture stream. This turned out to cause a
regression, however, when full-duplex streams were running and only a
capture was stopped. Because of the above clearance of ep->sync_sink
pointer, no more feedback is done, hence the playback will stall.

This patch fixes the ep->sync_sink clearance to be done only after all
endpoints are released, for addressing the regression.

Reported-and-tested-by: Lucas Endres <jaffa225man@gmail.com>
Fixes: bf6313a0ff76 ("ALSA: usb-audio: Refactor endpoint management")
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210426063349.18601-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: usb-audio: Add MIDI quirk for Vox ToneLab EX

commit 64f40f9be14106e7df0098c427cb60be645bddb7 upstream.

ToneLab EX guitar pedal device requires the same quirk like ToneLab ST
for supporting the MIDI.

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=212593
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210407144549.1530-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ovl: allow upperdir inside lowerdir

commit 708fa01597fa002599756bf56a96d0de1677375c upstream.

Commit 146d62e5a586 ("ovl: detect overlapping layers") made sure we don't
have overlapping layers, but it also broke the arguably valid use case of

mount -olowerdir=/,upperdir=/subdir,..

where upperdir overlaps lowerdir on the same filesystem. This has been
causing regressions.

Revert the check, but only for the specific case where upperdir and/or
workdir are subdirectories of lowerdir. Any other overlap (e.g. lowerdir
is subdirectory of upperdir, etc) case is crazy, so leave the check in
place for those.

Overlaps are detected at lookup time too, so reverting the mount time check
should be safe.

Fixes: 146d62e5a586 ("ovl: detect overlapping layers")
Cc: <stable@vger.kernel.org> # v5.2
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ovl: fix leaked dentry

commit eaab1d45cdb4bb0c846bd23c3d666d5b90af7b41 upstream.

Since commit 6815f479ca90 ("ovl: use only uppermetacopy state in
ovl_lookup()"), overlayfs doesn't put temporary dentry when there is a
metacopy error, which leads to dentry leaks when shutting down the related
superblock:

  overlayfs: refusing to follow metacopy origin for (/file0)
  ...
  BUG: Dentry (____ptrval____){i=3f33,n=file3}  still in use (1) [unmount of overlay overlay]
  ...
  WARNING: CPU: 1 PID: 432 at umount_check.cold+0x107/0x14d
  CPU: 1 PID: 432 Comm: unmount-overlay Not tainted 5.12.0-rc5 #1
  ...
  RIP: 0010:umount_check.cold+0x107/0x14d
  ...
  Call Trace:
   d_walk+0x28c/0x950
   ? dentry_lru_isolate+0x2b0/0x2b0
   ? __kasan_slab_free+0x12/0x20
   do_one_tree+0x33/0x60
   shrink_dcache_for_umount+0x78/0x1d0
   generic_shutdown_super+0x70/0x440
   kill_anon_super+0x3e/0x70
   deactivate_locked_super+0xc4/0x160
   deactivate_super+0xfa/0x140
   cleanup_mnt+0x22e/0x370
   __cleanup_mnt+0x1a/0x30
   task_work_run+0x139/0x210
   do_exit+0xb0c/0x2820
   ? __kasan_check_read+0x1d/0x30
   ? find_held_lock+0x35/0x160
   ? lock_release+0x1b6/0x660
   ? mm_update_next_owner+0xa20/0xa20
   ? reacquire_held_locks+0x3f0/0x3f0
   ? __sanitizer_cov_trace_const_cmp4+0x22/0x30
   do_group_exit+0x135/0x380
   __do_sys_exit_group.isra.0+0x20/0x20
   __x64_sys_exit_group+0x3c/0x50
   do_syscall_64+0x45/0x70
   entry_SYSCALL_64_after_hwframe+0x44/0xae
  ...
  VFS: Busy inodes after unmount of overlay. Self-destruct in 5 seconds.  Have a nice day...

This fix has been tested with a syzkaller reproducer.

Cc: Amir Goldstein <amir73il@gmail.com>
Cc: <stable@vger.kernel.org> # v5.8+
Reported-by: syzbot <syzkaller@googlegroups.com>
Fixes: 6815f479ca90 ("ovl: use only uppermetacopy state in ovl_lookup()")
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Link: https://lore.kernel.org/r/20210329164907.2133175-1-mic@digikod.net
Reviewed-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: qrtr: Avoid potential use after free in MHI send

commit 47a017f33943278570c072bc71681809b2567b3a upstream.

It is possible that the MHI ul_callback will be invoked immediately
following the queueing of the skb for transmission, leading to the
callback decrementing the refcount of the associated sk and freeing the
skb.

As such the dereference of skb and the increment of the sk refcount must
happen before the skb is queued, to avoid the skb to be used after free
and potentially the sk to drop its last refcount..

Fixes: 6e728f321393 ("net: qrtr: Add MHI transport layer")
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bpf: Fix leakage of uninitialized bpf stack under speculation

commit 801c6058d14a82179a7ee17a4b532cac6fad067f upstream.

The current implemented mechanisms to mitigate data disclosure under
speculation mainly address stack and map value oob access from the
speculative domain. However, Piotr discovered that uninitialized BPF
stack is not protected yet, and thus old data from the kernel stack,
potentially including addresses of kernel structures, could still be
extracted from that 512 bytes large window. The BPF stack is special
compared to map values since it's not zero initialized for every
program invocation, whereas map values /are/ zero initialized upon
their initial allocation and thus cannot leak any prior data in either
domain. In the non-speculative domain, the verifier ensures that every
stack slot read must have a prior stack slot write by the BPF program
to avoid such data leaking issue.

However, this is not enough: for example, when the pointer arithmetic
operation moves the stack pointer from the last valid stack offset to
the first valid offset, the sanitation logic allows for any intermediate
offsets during speculative execution, which could then be used to
extract any restricted stack content via side-channel.

Given for unprivileged stack pointer arithmetic the use of unknown
but bounded scalars is generally forbidden, we can simply turn the
register-based arithmetic operation into an immediate-based arithmetic
operation without the need for masking. This also gives the benefit
of reducing the needed instructions for the operation. Given after
the work in 7fedb63a8307 ("bpf: Tighten speculative pointer arithmetic
mask"), the aux->alu_limit already holds the final immediate value for
the offset register with the known scalar. Thus, a simple mov of the
immediate to AX register with using AX as the source for the original
instruction is sufficient and possible now in this case.

Reported-by: Piotr Krysiuk <piotras@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Piotr Krysiuk <piotras@gmail.com>
Reviewed-by: Piotr Krysiuk <piotras@gmail.com>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bpf: Fix masking negation logic upon negative dst register

commit b9b34ddbe2076ade359cd5ce7537d5ed019e9807 upstream.

The negation logic for the case where the off_reg is sitting in the
dst register is not correct given then we cannot just invert the add
to a sub or vice versa. As a fix, perform the final bitwise and-op
unconditionally into AX from the off_reg, then move the pointer from
the src to dst and finally use AX as the source for the original
pointer arithmetic operation such that the inversion yields a correct
result. The single non-AX mov in between is possible given constant
blinding is retaining it as it's not an immediate based operation.

Fixes: 979d63d50c0c ("bpf: prevent out of bounds speculation on pointer arithmetic")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Piotr Krysiuk <piotras@gmail.com>
Reviewed-by: Piotr Krysiuk <piotras@gmail.com>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/i915: Disable runtime power management during shutdown

commit 7962893ecb853aa7c8925ce237ab6c4274cfc1c7 upstream.

At least on some TGL platforms PUNIT wants to access some display HW
registers, but it doesn't handle display power management (disabling DC
states as required) and so this register access will lead to a hang. To
prevent this disable runtime power management for poweroff and reboot.

v2:
- Add code comment clarifying the requirement of display power states.
(Ville)

Reported-and-tested-by: Khaled Almahallawy <khaled.almahallawy@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Khaled Almahallawy <khaled.almahallawy@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210127181909.128094-1-imre.deak@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: usb: ax88179_178a: initialize local variables before use

commit bd78980be1a68d14524c51c4b4170782fada622b upstream.

Use memset to initialize local array in drivers/net/usb/ax88179_178a.c, and
also set a local u16 and u32 variable to 0. Fixes a KMSAN found uninit-value bug
reported by syzbot at:
https://syzkaller.appspot.com/bug?id=00371c73c72f72487c1d0bfe0cc9d00de339d5aa

Reported-by: syzbot+4993e4a0e237f1b53747@syzkaller.appspotmail.com
Signed-off-by: Phillip Potter <phil@philpotter.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

netfilter: conntrack: Make global sysctls readonly in non-init netns

commit 2671fa4dc0109d3fb581bc3078fdf17b5d9080f6 upstream.

These sysctls point to global variables:
- NF_SYSCTL_CT_MAX (&nf_conntrack_max)
- NF_SYSCTL_CT_EXPECT_MAX (&nf_ct_expect_max)
- NF_SYSCTL_CT_BUCKETS (&nf_conntrack_htable_size_user)

Because their data pointers are not updated to point to per-netns
structures, they must be marked read-only in a non-init_net ns.
Otherwise, changes in any net namespace are reflected in (leaked into)
all other net namespaces. This problem has existed since the
introduction of net namespaces.

The current logic marks them read-only only if the net namespace is
owned by an unprivileged user (other than init_user_ns).

Commit d0febd81ae77 ("netfilter: conntrack: re-visit sysctls in
unprivileged namespaces") "exposes all sysctls even if the namespace is
unpriviliged." Since we need to mark them readonly in any case, we can
forego the unprivileged user check altogether.

Fixes: d0febd81ae77 ("netfilter: conntrack: re-visit sysctls in unprivileged namespaces")
Signed-off-by: Jonathon Reinhart <Jonathon.Reinhart@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mips: Do not include hi and lo in clobber list for R6

commit 1d7ba0165d8206ac073f7ac3b14fc0836b66eae7 upstream.

From [1]
"GCC 10 (PR 91233) won't silently allow registers that are not
architecturally available to be present in the clobber list anymore,
resulting in build failure for mips*r6 targets in form of:
...
.../sysdep.h:146:2: error: the register ‘lo’ cannot be clobbered in ‘asm’ for the current target
146 | __asm__ volatile ( \
| ^~~~~~~

This is because base R6 ISA doesn't define hi and lo registers w/o DSP
extension. This patch provides the alternative clobber list for r6 targets
that won't include those registers."

Since kernel 5.4 and mips support for generic vDSO [2], the kernel fail to
build for mips r6 cpus with gcc 10 for the same reason as glibc.

[1] https://sourceware.org/git/?p=glibc.git;a=commit;h=020b2a97bb15f807c0482f0faee2184ed05bcad8
[2] '24640f233b46 ("mips: Add support for generic vDSO")'

Signed-off-by: Romain Naour <romain.naour@gmail.com>
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Linux 5.12.1

Tested-by: Fox Chen <foxhlchen@gmail.com>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20210430141910.899518186@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mei: me: add Alder Lake P device id.

commit 0df74278faedf20f9696bf2755cf0ce34afa4c3a upstream.

Add Alder Lake P device ID.

Cc: <stable@vger.kernel.org>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Link: https://lore.kernel.org/r/20210414045200.3498241-1-tomas.winkler@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

cfg80211: fix locking in netlink owner interface destruction

commit ea6b2098dd02789f68770fd3d5a373732207be2f upstream.

Harald Arnesen reported [1] a deadlock at reboot time, and after
he captured a stack trace a picture developed of what's going on:

The distribution he's using is using iwd (not wpa_supplicant) to
manage wireless. iwd will usually use the "socket owner" option
when it creates new interfaces, so that they're automatically
destroyed when it quits (unexpectedly or otherwise). This is also
done by wpa_supplicant, but it doesn't do it for the normal one,
only for additional ones, which is different with iwd.

Anyway, during shutdown, iwd quits while the netdev is still UP,
i.e. IFF_UP is set. This causes the stack trace that Linus so
nicely transcribed from the pictures:

cfg80211_destroy_iface_wk() takes wiphy_lock
-> cfg80211_destroy_ifaces()
  ->ieee80211_del_iface
    ->ieeee80211_if_remove
      ->cfg80211_unregister_wdev
        ->unregister_netdevice_queue
          ->dev_close_many
            ->__dev_close_many
              ->raw_notifier_call_chain
                ->cfg80211_netdev_notifier_call
and that last call tries to take wiphy_lock again.

In commit a05829a7222e ("cfg80211: avoid holding the RTNL when
calling the driver") I had taken into account the possibility of
recursing from cfg80211 into cfg80211_netdev_notifier_call() via
the network stack, but only for NETDEV_UNREGISTER, not for what
happens here, NETDEV_GOING_DOWN and NETDEV_DOWN notifications.

Additionally, while this worked still back in commit 78f22b6a3a92
("cfg80211: allow userspace to take ownership of interfaces"), it
missed another corner case: unregistering a netdev will cause
dev_close() to be called, and thus stop wireless operations (e.g.
disconnecting), but there are some types of virtual interfaces in
wifi that don't have a netdev - for that we need an additional
call to cfg80211_leave().

So, to fix this mess, change cfg80211_destroy_ifaces() to not
require the wiphy_lock(), but instead make it acquire it, but
only after it has actually closed all the netdevs on the list,
and then call cfg80211_leave() as well before removing them
from the driver, to fix the second issue. The locking change in
this requires modifying the nl80211 call to not get the wiphy
lock passed in, but acquire it by itself after flushing any
potentially pending destruction requests.

[1] https://lore.kernel.org/r/09464e67-f3de-ac09-28a3-e27b7914ee7d@skogtun.org

Cc: stable@vger.kernel.org # 5.12
Reported-by: Harald Arnesen <harald@skogtun.org>
Fixes: 776a39b8196d ("cfg80211: call cfg80211_destroy_ifaces() with wiphy lock held")
Fixes: 78f22b6a3a92 ("cfg80211: allow userspace to take ownership of interfaces")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Tested-by: Harald Arnesen <harald@skogtun.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

iwlwifi: Fix softirq/hardirq disabling in iwl_pcie_gen2_enqueue_hcmd()

commit e7020bb068d8be50a92f48e36b236a1a1ef9282e upstream.

Analogically to what we did in 2800aadc18a6 ("iwlwifi: Fix softirq/hardirq
disabling in iwl_pcie_enqueue_hcmd()"), we must apply the same fix to
iwl_pcie_gen2_enqueue_hcmd(), as it's being called from exactly the same
contexts.

Reported-by: Heiner Kallweit <hkallweit1@gmail.com
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/nycvar.YFH.7.76.2104171112390.18270@cbobk.fhfr.pm
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

USB: CDC-ACM: fix poison/unpoison imbalance

commit a8b3b519618f30a87a304c4e120267ce6f8dc68a upstream.

suspend() does its poisoning conditionally, resume() does it
unconditionally. On a device with combined interfaces this
will balance, on a device with two interfaces the counter will
go negative and resubmission will fail.

Both actions need to be done conditionally.

Fixes: 6069e3e927c8f ("USB: cdc-acm: untangle a circular dependency between callback and softint")
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210421074513.4327-1-oneukum@suse.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: hso: fix NULL-deref on disconnect regression

commit 2ad5692db72874f02b9ad551d26345437ea4f7f3 upstream.

Commit 8a12f8836145 ("net: hso: fix null-ptr-deref during tty device
unregistration") fixed the racy minor allocation reported by syzbot, but
introduced an unconditional NULL-pointer dereference on every disconnect
instead.

Specifically, the serial device table must no longer be accessed after
the minor has been released by hso_serial_tty_unregister().

Fixes: 8a12f8836145 ("net: hso: fix null-ptr-deref during tty device unregistration")
Cc: stable@vger.kernel.org
Cc: Anirudh Rayabharam <mail@anirudhrb.com>
Reported-by: Leonardo Antoniazzi <leoanto@aruba.it>
Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Anirudh Rayabharam <mail@anirudhrb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Linux 5.12

Merge tag 'perf-tools-fixes-for-v5.12-2021-04-25' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tools fixes from Arnaldo Carvalho de Melo:

- Fix potential NULL pointer dereference in the auxtrace option parser

- Fix access to PID in an array when setting a PID filter in 'perf ftrace'

- Fix error return code in the 'perf data' tool and in maps__clone(),
   found using a static analysis tool from Huawei

* tag 'perf-tools-fixes-for-v5.12-2021-04-25' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf map: Fix error return code in maps__clone()
  perf ftrace: Fix access to pid in array when setting a pid filter
  perf auxtrace: Fix potential NULL pointer dereference
  perf data: Fix error return code in perf_data__create_dir()

Merge tag 'perf_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 perf fixes from Borislav Petkov:

- Fix Broadwell Xeon's stepping in the PEBS isolation table of CPUs

- Fix a panic when initializing perf uncore machinery on Haswell and
   Broadwell servers

* tag 'perf_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/kvm: Fix Broadwell Xeon stepping in isolation_ucodes[]
  perf/x86/intel/uncore: Remove uncore extra PCI dev HSWEP_PCI_PCU_3

Merge tag 'locking_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fix from Borislav Petkov:
"Fix ordering in the queued writer lock's slowpath"

* tag 'locking_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/qrwlock: Fix ordering in queued_write_lock_slowpath()

Merge tag 'sched_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fix from Borislav Petkov:
"Fix a typo in a macro ifdeffery"

* tag 'sched_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
preempt/dynamic: Fix typo in macro conditional statement

Merge tag 'x86_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fix from Borislav Petkov:
"Fix an out-of-bounds memory access when setting up a crash kernel with
kexec"

* tag 'x86_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/crash: Fix crash_setup_memmap_entries() out-of-bounds access

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fix from Paolo Bonzini:
"Fix SRCU bug introduced in the merge window"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86/xen: Take srcu lock when accessing kvm_memslots()

Revert "net/rds: Avoid potential use after free in rds_send_remove_from_sock"

This reverts commit 0c85a7e87465f2d4cbc768e245f4f45b2f299b05.

The games with 'rm' are on (two separate instances) of a local variable,
and make no difference.

Quoting Aditya Pakki:
"I was the author of the patch and it was the cause of the giant UMN
  revert.

  The patch is garbage and I was unaware of the steps involved in
  retracting it. I *believed* the maintainers would pull it, given it
  was already under Greg's list. The patch does not introduce any bugs
  but is pointless and is stupid. I accept my incompetence and for not
  requesting a revert earlier."

Link: https://lwn.net/Articles/854319/
Requested-by: Aditya Pakki <pakki001@umn.edu>
Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'pinctrl-v5.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin control fixes from Linus Walleij:
"Late pin control fixes, would have been in the main pull request
  normally but hey I got lucky and we got another week to polish up
  v5.12 so here we go.

  One driver fix and one making the core debugfs work:

   - Fix the number of pins in the community of the Intel Lewisburg SoC

   - Show pin numbers for controllers with base = 0 in the new debugfs
     feature"

* tag 'pinctrl-v5.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: core: Show pin numbers for the controllers with base = 0
  pinctrl: lewisburg: Update number of pins in community

Merge branch 'akpm' (patches from Andrew)

Merge misc fixes from Andrew Morton:
"5 patches.

  Subsystems affected by this patch series: coda, overlayfs, and
  mm (pagecache and memcg)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  tools/cgroup/slabinfo.py: updated to work on current kernel
  mm/filemap: fix mapping_seek_hole_data on THP & 32-bit
  mm/filemap: fix find_lock_entries hang on 32-bit THP
  ovl: fix reference counting in ovl_mmap error path
  coda: fix reference counting in coda_file_mmap error path

Merge tag 'block-5.12-2021-04-23' of git://git.kernel.dk/linux-block

Pull block fix from Jens Axboe:
"A single fix for a behavioral regression in this series, when
re-reading the partition table with partitions open"

* tag 'block-5.12-2021-04-23' of git://git.kernel.dk/linux-block:
block: return -EBUSY when there are open partitions in blkdev_reread_part

tools/cgroup/slabinfo.py: updated to work on current kernel

slabinfo.py script does not work with actual kernel version.

First, it was unable to recognise SLUB susbsytem, and when I specified
it manually it failed again with

  AttributeError: 'struct page' has no member 'obj_cgroups'

.. and then again with

  File "tools/cgroup/memcg_slabinfo.py", line 221, in main
    memcg.kmem_caches.address_of_(),
  AttributeError: 'struct mem_cgroup' has no member 'kmem_caches'

Link: https://lkml.kernel.org/r/cec1a75e-43b4-3d64-2084-d9f98fda037f@virtuozzo.com
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Tested-by: Roman Gushchin <guro@fb.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/filemap: fix mapping_seek_hole_data on THP & 32-bit

No problem on 64-bit, or without huge pages, but xfstests generic/285
and other SEEK_HOLE/SEEK_DATA tests have regressed on huge tmpfs, and on
32-bit architectures, with the new mapping_seek_hole_data(). Several
different bugs turned out to need fixing.

u64 cast to stop losing bits when converting unsigned long to loff_t
(and let's use shifts throughout, rather than mixed with * and /).

Use round_up() when advancing pos, to stop assuming that pos was already
THP-aligned when advancing it by THP-size. (This use of round_up()
assumes that any THP has THP-aligned index: true at present and true
going forward, but could be recoded to avoid the assumption.)

Use xas_set() when iterating away from a THP, so that xa_index stays in
synch with start, instead of drifting away to return bogus offset.

Check start against end to avoid wrapping 32-bit xa_index to 0 (and to
handle these additional cases, seek_data or not, it's easier to break
the loop than goto: so rearrange exit from the function).

[hughd@google.com: remove unneeded u64 casts, per Matthew]
Link: https://lkml.kernel.org/r/alpine.LSU.2.11.2104221347240.1170@eggly.anvils
Link: https://lkml.kernel.org/r/alpine.LSU.2.11.2104211737410.3299@eggly.anvils
Fixes: 41139aa4c3a3 ("mm/filemap: add mapping_seek_hole_data")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>