From: Greg Kroah-Hartman Date: Wed, 15 Sep 2021 11:38:20 +0000 (+0200) Subject: 5.4-stable patches X-Git-Tag: v5.14.5~49 X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=5c84d6b45626040664783920c2993b75d14402cc;p=thirdparty%2Fkernel%2Fstable-queue.git 5.4-stable patches added patches: 9p-xen-fix-end-of-loop-tests-for-list_for_each_entry.patch blk-zoned-allow-blkreportzone-without-cap_sys_admin.patch blk-zoned-allow-zone-management-send-operations-without-cap_sys_admin.patch btrfs-reset-replace-target-device-to-allocation-state-on-close.patch include-linux-list.h-add-a-macro-to-test-if-entry-is-pointing-to-the-head.patch pci-msi-skip-masking-msi-x-on-xen-pv.patch powerpc-perf-hv-gpci-fix-counter-value-parsing.patch xen-fix-setting-of-max_pfn-in-shared_info.patch --- diff --git a/queue-5.4/9p-xen-fix-end-of-loop-tests-for-list_for_each_entry.patch b/queue-5.4/9p-xen-fix-end-of-loop-tests-for-list_for_each_entry.patch new file mode 100644 index 00000000000..bd280e9137f --- /dev/null +++ b/queue-5.4/9p-xen-fix-end-of-loop-tests-for-list_for_each_entry.patch @@ -0,0 +1,46 @@ +From 732b33d0dbf17e9483f0b50385bf606f724f50a2 Mon Sep 17 00:00:00 2001 +From: Harshvardhan Jha +Date: Tue, 27 Jul 2021 05:37:10 +0530 +Subject: 9p/xen: Fix end of loop tests for list_for_each_entry + +From: Harshvardhan Jha + +commit 732b33d0dbf17e9483f0b50385bf606f724f50a2 upstream. + +This patch addresses the following problems: + - priv can never be NULL, so this part of the check is useless + - if the loop ran through the whole list, priv->client is invalid and +it is more appropriate and sufficient to check for the end of +list_for_each_entry loop condition. 
+ +Link: http://lkml.kernel.org/r/20210727000709.225032-1-harshvardhan.jha@oracle.com +Signed-off-by: Harshvardhan Jha +Reviewed-by: Stefano Stabellini +Tested-by: Stefano Stabellini +Cc: +Signed-off-by: Dominique Martinet +Signed-off-by: Greg Kroah-Hartman +--- + net/9p/trans_xen.c | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +--- a/net/9p/trans_xen.c ++++ b/net/9p/trans_xen.c +@@ -138,7 +138,7 @@ static bool p9_xen_write_todo(struct xen + + static int p9_xen_request(struct p9_client *client, struct p9_req_t *p9_req) + { +- struct xen_9pfs_front_priv *priv = NULL; ++ struct xen_9pfs_front_priv *priv; + RING_IDX cons, prod, masked_cons, masked_prod; + unsigned long flags; + u32 size = p9_req->tc.size; +@@ -151,7 +151,7 @@ static int p9_xen_request(struct p9_clie + break; + } + read_unlock(&xen_9pfs_lock); +- if (!priv || priv->client != client) ++ if (list_entry_is_head(priv, &xen_9pfs_devs, list)) + return -EINVAL; + + num = p9_req->tc.tag % priv->num_rings; diff --git a/queue-5.4/blk-zoned-allow-blkreportzone-without-cap_sys_admin.patch b/queue-5.4/blk-zoned-allow-blkreportzone-without-cap_sys_admin.patch new file mode 100644 index 00000000000..1dc4b099e61 --- /dev/null +++ b/queue-5.4/blk-zoned-allow-blkreportzone-without-cap_sys_admin.patch @@ -0,0 +1,45 @@ +From 4d643b66089591b4769bcdb6fd1bfeff2fe301b8 Mon Sep 17 00:00:00 2001 +From: Niklas Cassel +Date: Wed, 11 Aug 2021 11:05:19 +0000 +Subject: blk-zoned: allow BLKREPORTZONE without CAP_SYS_ADMIN + +From: Niklas Cassel + +commit 4d643b66089591b4769bcdb6fd1bfeff2fe301b8 upstream. + +A user space process should not need the CAP_SYS_ADMIN capability set +in order to perform a BLKREPORTZONE ioctl. + +Getting the zone report is required in order to get the write pointer. +Neither read() nor write() requires CAP_SYS_ADMIN, so it is reasonable +that a user space process that can read/write from/to the device, also +can get the write pointer. (Since e.g. writes have to be at the write +pointer.) 
+ +Fixes: 3ed05a987e0f ("blk-zoned: implement ioctls") +Signed-off-by: Niklas Cassel +Reviewed-by: Damien Le Moal +Reviewed-by: Aravind Ramesh +Reviewed-by: Adam Manzanares +Reviewed-by: Himanshu Madhani +Reviewed-by: Johannes Thumshirn +Cc: stable@vger.kernel.org # v4.10+ +Link: https://lore.kernel.org/r/20210811110505.29649-3-Niklas.Cassel@wdc.com +Signed-off-by: Jens Axboe +Signed-off-by: Greg Kroah-Hartman +--- + block/blk-zoned.c | 3 --- + 1 file changed, 3 deletions(-) + +--- a/block/blk-zoned.c ++++ b/block/blk-zoned.c +@@ -316,9 +316,6 @@ int blkdev_report_zones_ioctl(struct blo + if (!blk_queue_is_zoned(q)) + return -ENOTTY; + +- if (!capable(CAP_SYS_ADMIN)) +- return -EACCES; +- + if (copy_from_user(&rep, argp, sizeof(struct blk_zone_report))) + return -EFAULT; + diff --git a/queue-5.4/blk-zoned-allow-zone-management-send-operations-without-cap_sys_admin.patch b/queue-5.4/blk-zoned-allow-zone-management-send-operations-without-cap_sys_admin.patch new file mode 100644 index 00000000000..c3dc257dfeb --- /dev/null +++ b/queue-5.4/blk-zoned-allow-zone-management-send-operations-without-cap_sys_admin.patch @@ -0,0 +1,51 @@ +From ead3b768bb51259e3a5f2287ff5fc9041eb6f450 Mon Sep 17 00:00:00 2001 +From: Niklas Cassel +Date: Wed, 11 Aug 2021 11:05:18 +0000 +Subject: blk-zoned: allow zone management send operations without CAP_SYS_ADMIN + +From: Niklas Cassel + +commit ead3b768bb51259e3a5f2287ff5fc9041eb6f450 upstream. + +Zone management send operations (BLKRESETZONE, BLKOPENZONE, BLKCLOSEZONE +and BLKFINISHZONE) should be allowed under the same permissions as write(). +(write() does not require CAP_SYS_ADMIN). + +Additionally, other ioctls like BLKSECDISCARD and BLKZEROOUT only check if +the fd was successfully opened with FMODE_WRITE. +(They do not require CAP_SYS_ADMIN). + +Currently, zone management send operations require both CAP_SYS_ADMIN +and that the fd was successfully opened with FMODE_WRITE. 
+ +Remove the CAP_SYS_ADMIN requirement, so that zone management send +operations match the access control requirement of write(), BLKSECDISCARD +and BLKZEROOUT. + +Fixes: 3ed05a987e0f ("blk-zoned: implement ioctls") +Signed-off-by: Niklas Cassel +Reviewed-by: Damien Le Moal +Reviewed-by: Aravind Ramesh +Reviewed-by: Adam Manzanares +Reviewed-by: Himanshu Madhani +Reviewed-by: Johannes Thumshirn +Cc: stable@vger.kernel.org # v4.10+ +Link: https://lore.kernel.org/r/20210811110505.29649-2-Niklas.Cassel@wdc.com +Signed-off-by: Jens Axboe +Signed-off-by: Greg Kroah-Hartman +--- + block/blk-zoned.c | 3 --- + 1 file changed, 3 deletions(-) + +--- a/block/blk-zoned.c ++++ b/block/blk-zoned.c +@@ -374,9 +374,6 @@ int blkdev_reset_zones_ioctl(struct bloc + if (!blk_queue_is_zoned(q)) + return -ENOTTY; + +- if (!capable(CAP_SYS_ADMIN)) +- return -EACCES; +- + if (!(mode & FMODE_WRITE)) + return -EBADF; + diff --git a/queue-5.4/btrfs-reset-replace-target-device-to-allocation-state-on-close.patch b/queue-5.4/btrfs-reset-replace-target-device-to-allocation-state-on-close.patch new file mode 100644 index 00000000000..d5a7bc82166 --- /dev/null +++ b/queue-5.4/btrfs-reset-replace-target-device-to-allocation-state-on-close.patch @@ -0,0 +1,125 @@ +From 0d977e0eba234e01a60bdde27314dc21374201b3 Mon Sep 17 00:00:00 2001 +From: Desmond Cheong Zhi Xi +Date: Sat, 21 Aug 2021 01:50:40 +0800 +Subject: btrfs: reset replace target device to allocation state on close + +From: Desmond Cheong Zhi Xi + +commit 0d977e0eba234e01a60bdde27314dc21374201b3 upstream. 
+ +This crash was observed with a failed assertion on device close: + + BTRFS: Transaction aborted (error -28) + WARNING: CPU: 1 PID: 3902 at fs/btrfs/extent-tree.c:2150 btrfs_run_delayed_refs+0x1d2/0x1e0 [btrfs] + Modules linked in: btrfs blake2b_generic libcrc32c crc32c_intel xor zstd_decompress zstd_compress xxhash lzo_compress lzo_decompress raid6_pq loop + CPU: 1 PID: 3902 Comm: kworker/u8:4 Not tainted 5.14.0-rc5-default+ #1532 + Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014 + Workqueue: events_unbound btrfs_async_reclaim_metadata_space [btrfs] + RIP: 0010:btrfs_run_delayed_refs+0x1d2/0x1e0 [btrfs] + RSP: 0018:ffffb7a5452d7d80 EFLAGS: 00010282 + RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000 + RDX: 0000000000000001 RSI: ffffffffabee13c4 RDI: 00000000ffffffff + RBP: ffff97834176a378 R08: 0000000000000001 R09: 0000000000000001 + R10: 0000000000000000 R11: 0000000000000001 R12: ffff97835195d388 + R13: 0000000005b08000 R14: ffff978385484000 R15: 000000000000016c + FS: 0000000000000000(0000) GS:ffff9783bd800000(0000) knlGS:0000000000000000 + CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 + CR2: 000056190d003fe8 CR3: 000000002a81e005 CR4: 0000000000170ea0 + Call Trace: + flush_space+0x197/0x2f0 [btrfs] + btrfs_async_reclaim_metadata_space+0x139/0x300 [btrfs] + process_one_work+0x262/0x5e0 + worker_thread+0x4c/0x320 + ? process_one_work+0x5e0/0x5e0 + kthread+0x144/0x170 + ? 
set_kthread_struct+0x40/0x40 + ret_from_fork+0x1f/0x30 + irq event stamp: 19334989 + hardirqs last enabled at (19334997): [] console_unlock+0x2b7/0x400 + hardirqs last disabled at (19335006): [] console_unlock+0x33d/0x400 + softirqs last enabled at (19334900): [] __do_softirq+0x30d/0x574 + softirqs last disabled at (19334893): [] irq_exit_rcu+0x12c/0x140 + ---[ end trace 45939e308e0dd3c7 ]--- + BTRFS: error (device vdd) in btrfs_run_delayed_refs:2150: errno=-28 No space left + BTRFS info (device vdd): forced readonly + BTRFS warning (device vdd): failed setting block group ro: -30 + BTRFS info (device vdd): suspending dev_replace for unmount + assertion failed: !test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state), in fs/btrfs/volumes.c:1150 + ------------[ cut here ]------------ + kernel BUG at fs/btrfs/ctree.h:3431! + invalid opcode: 0000 [#1] PREEMPT SMP + CPU: 1 PID: 3982 Comm: umount Tainted: G W 5.14.0-rc5-default+ #1532 + Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014 + RIP: 0010:assertfail.constprop.0+0x18/0x1a [btrfs] + RSP: 0018:ffffb7a5454c7db8 EFLAGS: 00010246 + RAX: 0000000000000068 RBX: ffff978364b91c00 RCX: 0000000000000000 + RDX: 0000000000000000 RSI: ffffffffabee13c4 RDI: 00000000ffffffff + RBP: ffff9783523a4c00 R08: 0000000000000001 R09: 0000000000000001 + R10: 0000000000000000 R11: 0000000000000001 R12: ffff9783523a4d18 + R13: 0000000000000000 R14: 0000000000000004 R15: 0000000000000003 + FS: 00007f61c8f42800(0000) GS:ffff9783bd800000(0000) knlGS:0000000000000000 + CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 + CR2: 000056190cffa810 CR3: 0000000030b96002 CR4: 0000000000170ea0 + Call Trace: + btrfs_close_one_device.cold+0x11/0x55 [btrfs] + close_fs_devices+0x44/0xb0 [btrfs] + btrfs_close_devices+0x48/0x160 [btrfs] + generic_shutdown_super+0x69/0x100 + kill_anon_super+0x14/0x30 + btrfs_kill_super+0x12/0x20 [btrfs] + deactivate_locked_super+0x2c/0xa0 + 
cleanup_mnt+0x144/0x1b0 + task_work_run+0x59/0xa0 + exit_to_user_mode_loop+0xe7/0xf0 + exit_to_user_mode_prepare+0xaf/0xf0 + syscall_exit_to_user_mode+0x19/0x50 + do_syscall_64+0x4a/0x90 + entry_SYSCALL_64_after_hwframe+0x44/0xae + +This happens when close_ctree is called while a dev_replace hasn't +completed. In close_ctree, we suspend the dev_replace, but keep the +replace target around so that we can resume the dev_replace procedure +when we mount the root again. This is the call trace: + + close_ctree(): + btrfs_dev_replace_suspend_for_unmount(); + btrfs_close_devices(): + btrfs_close_fs_devices(): + btrfs_close_one_device(): + ASSERT(!test_bit(BTRFS_DEV_STATE_REPLACE_TGT, + &device->dev_state)); + +However, since the replace target sticks around, there is a device +with BTRFS_DEV_STATE_REPLACE_TGT set on close, and we fail the +assertion in btrfs_close_one_device. + +To fix this, if we come across the replace target device when +closing, we should properly reset it back to allocation state. This +fix also ensures that if a non-target device has a corrupted state and +has the BTRFS_DEV_STATE_REPLACE_TGT bit set, the assertion will still +catch the error. 
+ +Reported-by: David Sterba +Fixes: b2a616676839 ("btrfs: fix rw device counting in __btrfs_free_extra_devids") +CC: stable@vger.kernel.org # 4.19+ +Reviewed-by: Anand Jain +Signed-off-by: Desmond Cheong Zhi Xi +Reviewed-by: David Sterba +Signed-off-by: David Sterba +Signed-off-by: Greg Kroah-Hartman +--- + fs/btrfs/volumes.c | 3 +++ + 1 file changed, 3 insertions(+) + +--- a/fs/btrfs/volumes.c ++++ b/fs/btrfs/volumes.c +@@ -1311,6 +1311,9 @@ static void btrfs_close_one_device(struc + fs_devices->rw_devices--; + } + ++ if (device->devid == BTRFS_DEV_REPLACE_DEVID) ++ clear_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state); ++ + if (test_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state)) + fs_devices->missing_devices--; + diff --git a/queue-5.4/include-linux-list.h-add-a-macro-to-test-if-entry-is-pointing-to-the-head.patch b/queue-5.4/include-linux-list.h-add-a-macro-to-test-if-entry-is-pointing-to-the-head.patch new file mode 100644 index 00000000000..a951ea43543 --- /dev/null +++ b/queue-5.4/include-linux-list.h-add-a-macro-to-test-if-entry-is-pointing-to-the-head.patch @@ -0,0 +1,142 @@ +From e130816164e244b692921de49771eeb28205152d Mon Sep 17 00:00:00 2001 +From: Andy Shevchenko +Date: Thu, 15 Oct 2020 20:11:31 -0700 +Subject: include/linux/list.h: add a macro to test if entry is pointing to the head + +From: Andy Shevchenko + +commit e130816164e244b692921de49771eeb28205152d upstream. + +Add a macro to test if entry is pointing to the head of the list which is +useful in cases like: + + list_for_each_entry(pos, &head, member) { + if (cond) + break; + } + if (list_entry_is_head(pos, &head, member)) + return -ERRNO; + +that allows to avoid additional variable to be added to track if loop has +not been stopped in the middle. + +While here, convert list_for_each_entry*() family of macros to use a new one. 
+ +Signed-off-by: Andy Shevchenko +Signed-off-by: Andrew Morton +Reviewed-by: Cezary Rojewski +Link: https://lkml.kernel.org/r/20200929134342.51489-1-andriy.shevchenko@linux.intel.com +Signed-off-by: Linus Torvalds +Signed-off-by: Greg Kroah-Hartman +--- + include/linux/list.h | 29 +++++++++++++++++++---------- + 1 file changed, 19 insertions(+), 10 deletions(-) + +--- a/include/linux/list.h ++++ b/include/linux/list.h +@@ -568,6 +568,15 @@ static inline void list_splice_tail_init + pos = n, n = pos->prev) + + /** ++ * list_entry_is_head - test if the entry points to the head of the list ++ * @pos: the type * to cursor ++ * @head: the head for your list. ++ * @member: the name of the list_head within the struct. ++ */ ++#define list_entry_is_head(pos, head, member) \ ++ (&pos->member == (head)) ++ ++/** + * list_for_each_entry - iterate over list of given type + * @pos: the type * to use as a loop cursor. + * @head: the head for your list. +@@ -575,7 +584,7 @@ static inline void list_splice_tail_init + */ + #define list_for_each_entry(pos, head, member) \ + for (pos = list_first_entry(head, typeof(*pos), member); \ +- &pos->member != (head); \ ++ !list_entry_is_head(pos, head, member); \ + pos = list_next_entry(pos, member)) + + /** +@@ -586,7 +595,7 @@ static inline void list_splice_tail_init + */ + #define list_for_each_entry_reverse(pos, head, member) \ + for (pos = list_last_entry(head, typeof(*pos), member); \ +- &pos->member != (head); \ ++ !list_entry_is_head(pos, head, member); \ + pos = list_prev_entry(pos, member)) + + /** +@@ -611,7 +620,7 @@ static inline void list_splice_tail_init + */ + #define list_for_each_entry_continue(pos, head, member) \ + for (pos = list_next_entry(pos, member); \ +- &pos->member != (head); \ ++ !list_entry_is_head(pos, head, member); \ + pos = list_next_entry(pos, member)) + + /** +@@ -625,7 +634,7 @@ static inline void list_splice_tail_init + */ + #define list_for_each_entry_continue_reverse(pos, head, member) \ + for (pos = 
list_prev_entry(pos, member); \ +- &pos->member != (head); \ ++ !list_entry_is_head(pos, head, member); \ + pos = list_prev_entry(pos, member)) + + /** +@@ -637,7 +646,7 @@ static inline void list_splice_tail_init + * Iterate over list of given type, continuing from current position. + */ + #define list_for_each_entry_from(pos, head, member) \ +- for (; &pos->member != (head); \ ++ for (; !list_entry_is_head(pos, head, member); \ + pos = list_next_entry(pos, member)) + + /** +@@ -650,7 +659,7 @@ static inline void list_splice_tail_init + * Iterate backwards over list of given type, continuing from current position. + */ + #define list_for_each_entry_from_reverse(pos, head, member) \ +- for (; &pos->member != (head); \ ++ for (; !list_entry_is_head(pos, head, member); \ + pos = list_prev_entry(pos, member)) + + /** +@@ -663,7 +672,7 @@ static inline void list_splice_tail_init + #define list_for_each_entry_safe(pos, n, head, member) \ + for (pos = list_first_entry(head, typeof(*pos), member), \ + n = list_next_entry(pos, member); \ +- &pos->member != (head); \ ++ !list_entry_is_head(pos, head, member); \ + pos = n, n = list_next_entry(n, member)) + + /** +@@ -679,7 +688,7 @@ static inline void list_splice_tail_init + #define list_for_each_entry_safe_continue(pos, n, head, member) \ + for (pos = list_next_entry(pos, member), \ + n = list_next_entry(pos, member); \ +- &pos->member != (head); \ ++ !list_entry_is_head(pos, head, member); \ + pos = n, n = list_next_entry(n, member)) + + /** +@@ -694,7 +703,7 @@ static inline void list_splice_tail_init + */ + #define list_for_each_entry_safe_from(pos, n, head, member) \ + for (n = list_next_entry(pos, member); \ +- &pos->member != (head); \ ++ !list_entry_is_head(pos, head, member); \ + pos = n, n = list_next_entry(n, member)) + + /** +@@ -710,7 +719,7 @@ static inline void list_splice_tail_init + #define list_for_each_entry_safe_reverse(pos, n, head, member) \ + for (pos = list_last_entry(head, typeof(*pos), member), \ + 
n = list_prev_entry(pos, member); \ +- &pos->member != (head); \ ++ !list_entry_is_head(pos, head, member); \ + pos = n, n = list_prev_entry(n, member)) + + /** diff --git a/queue-5.4/pci-msi-skip-masking-msi-x-on-xen-pv.patch b/queue-5.4/pci-msi-skip-masking-msi-x-on-xen-pv.patch new file mode 100644 index 00000000000..df03e255970 --- /dev/null +++ b/queue-5.4/pci-msi-skip-masking-msi-x-on-xen-pv.patch @@ -0,0 +1,55 @@ +From 1a519dc7a73c977547d8b5108d98c6e769c89f4b Mon Sep 17 00:00:00 2001 +From: =?UTF-8?q?Marek=20Marczykowski-G=C3=B3recki?= + +Date: Thu, 26 Aug 2021 19:03:42 +0200 +Subject: PCI/MSI: Skip masking MSI-X on Xen PV +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +From: Marek Marczykowski-Górecki + +commit 1a519dc7a73c977547d8b5108d98c6e769c89f4b upstream. + +When running as Xen PV guest, masking MSI-X is a responsibility of the +hypervisor. The guest has no write access to the relevant BAR at all - when +it tries to, it results in a crash like this: + + BUG: unable to handle page fault for address: ffffc9004069100c + #PF: supervisor write access in kernel mode + #PF: error_code(0x0003) - permissions violation + RIP: e030:__pci_enable_msix_range.part.0+0x26b/0x5f0 + e1000e_set_interrupt_capability+0xbf/0xd0 [e1000e] + e1000_probe+0x41f/0xdb0 [e1000e] + local_pci_probe+0x42/0x80 + (...) + +The recently introduced function msix_mask_all() does not check the global +variable pci_msi_ignore_mask which is set by XEN PV to bypass the masking +of MSI[-X] interrupts. + +Add the check to make this function XEN PV compatible. 
+ +Fixes: 7d5ec3d36123 ("PCI/MSI: Mask all unused MSI-X entries") +Signed-off-by: Marek Marczykowski-Górecki +Signed-off-by: Thomas Gleixner +Acked-by: Bjorn Helgaas +Cc: stable@vger.kernel.org +Link: https://lore.kernel.org/r/20210826170342.135172-1-marmarek@invisiblethingslab.com +Signed-off-by: Greg Kroah-Hartman +--- + drivers/pci/msi.c | 3 +++ + 1 file changed, 3 insertions(+) + +--- a/drivers/pci/msi.c ++++ b/drivers/pci/msi.c +@@ -782,6 +782,9 @@ static void msix_mask_all(void __iomem * + u32 ctrl = PCI_MSIX_ENTRY_CTRL_MASKBIT; + int i; + ++ if (pci_msi_ignore_mask) ++ return; ++ + for (i = 0; i < tsize; i++, base += PCI_MSIX_ENTRY_SIZE) + writel(ctrl, base + PCI_MSIX_ENTRY_VECTOR_CTRL); + } diff --git a/queue-5.4/powerpc-perf-hv-gpci-fix-counter-value-parsing.patch b/queue-5.4/powerpc-perf-hv-gpci-fix-counter-value-parsing.patch new file mode 100644 index 00000000000..db61e20c502 --- /dev/null +++ b/queue-5.4/powerpc-perf-hv-gpci-fix-counter-value-parsing.patch @@ -0,0 +1,67 @@ +From f9addd85fbfacf0d155e83dbee8696d6df5ed0c7 Mon Sep 17 00:00:00 2001 +From: Kajol Jain +Date: Fri, 13 Aug 2021 13:51:58 +0530 +Subject: powerpc/perf/hv-gpci: Fix counter value parsing + +From: Kajol Jain + +commit f9addd85fbfacf0d155e83dbee8696d6df5ed0c7 upstream. + +H_GetPerformanceCounterInfo (0xF080) hcall returns the counter data in +the result buffer. Result buffer has specific format defined in the PAPR +specification. One of the fields is counter offset and width of the +counter data returned. + +Counter data are returned in a unsigned char array in big endian byte +order. To get the final counter data, the values must be left shifted +byte at a time. But commit 220a0c609ad17 ("powerpc/perf: Add support for +the hv gpci (get performance counter info) interface") made the shifting +bitwise and also assumed little endian order. Because of that, hcall +counters values are reported incorrectly. 
+ +In particular this can lead to counters go backwards which messes up the +counter prev vs now calculation and leads to huge counter value +reporting: + + #: perf stat -e hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ + -C 0 -I 1000 + time counts unit events + 1.000078854 18,446,744,073,709,535,232 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ + 2.000213293 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ + 3.000320107 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ + 4.000428392 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ + 5.000537864 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ + 6.000649087 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ + 7.000760312 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ + 8.000865218 16,448 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ + 9.000978985 18,446,744,073,709,535,232 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ + 10.001088891 16,384 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ + 11.001201435 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ + 12.001307937 18,446,744,073,709,535,232 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ + +Fix the shifting logic to correct match the format, ie. read bytes in +big endian order. 
+ +Fixes: e4f226b1580b ("powerpc/perf/hv-gpci: Increase request buffer size") +Cc: stable@vger.kernel.org # v4.6+ +Reported-by: Nageswara R Sastry +Signed-off-by: Kajol Jain +Tested-by: Nageswara R Sastry +Signed-off-by: Michael Ellerman +Link: https://lore.kernel.org/r/20210813082158.429023-1-kjain@linux.ibm.com +Signed-off-by: Greg Kroah-Hartman +--- + arch/powerpc/perf/hv-gpci.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +--- a/arch/powerpc/perf/hv-gpci.c ++++ b/arch/powerpc/perf/hv-gpci.c +@@ -164,7 +164,7 @@ static unsigned long single_gpci_request + */ + count = 0; + for (i = offset; i < offset + length; i++) +- count |= arg->bytes[i] << (i - offset); ++ count |= (u64)(arg->bytes[i]) << ((length - 1 - (i - offset)) * 8); + + *value = count; + out: diff --git a/queue-5.4/series b/queue-5.4/series index 5a4a8e6cc9e..cf8b548e5a1 100644 --- a/queue-5.4/series +++ b/queue-5.4/series @@ -1,2 +1,10 @@ rtc-tps65910-correct-driver-module-alias.patch btrfs-wake-up-async_delalloc_pages-waiters-after-submit.patch +btrfs-reset-replace-target-device-to-allocation-state-on-close.patch +blk-zoned-allow-zone-management-send-operations-without-cap_sys_admin.patch +blk-zoned-allow-blkreportzone-without-cap_sys_admin.patch +pci-msi-skip-masking-msi-x-on-xen-pv.patch +powerpc-perf-hv-gpci-fix-counter-value-parsing.patch +xen-fix-setting-of-max_pfn-in-shared_info.patch +include-linux-list.h-add-a-macro-to-test-if-entry-is-pointing-to-the-head.patch +9p-xen-fix-end-of-loop-tests-for-list_for_each_entry.patch diff --git a/queue-5.4/xen-fix-setting-of-max_pfn-in-shared_info.patch b/queue-5.4/xen-fix-setting-of-max_pfn-in-shared_info.patch new file mode 100644 index 00000000000..88f1ddbea02 --- /dev/null +++ b/queue-5.4/xen-fix-setting-of-max_pfn-in-shared_info.patch @@ -0,0 +1,51 @@ +From 4b511d5bfa74b1926daefd1694205c7f1bcf677f Mon Sep 17 00:00:00 2001 +From: Juergen Gross +Date: Fri, 30 Jul 2021 11:26:21 +0200 +Subject: xen: fix setting of max_pfn in shared_info + 
+From: Juergen Gross + +commit 4b511d5bfa74b1926daefd1694205c7f1bcf677f upstream. + +Xen PV guests are specifying the highest used PFN via the max_pfn +field in shared_info. This value is used by the Xen tools when saving +or migrating the guest. + +Unfortunately this field is misnamed, as in reality it is specifying +the number of pages (including any memory holes) of the guest, so it +is the highest used PFN + 1. Renaming isn't possible, as this is a +public Xen hypervisor interface which needs to be kept stable. + +The kernel will set the value correctly initially at boot time, but +when adding more pages (e.g. due to memory hotplug or ballooning) a +real PFN number is stored in max_pfn. This is done when expanding the +p2m array, and the PFN stored there is even possibly wrong, as it +should be the last possible PFN of the just added P2M frame, and not +one which led to the P2M expansion. + +Fix that by setting shared_info->max_pfn to the last possible PFN + 1. + +Fixes: 98dd166ea3a3c3 ("x86/xen/p2m: hint at the last populated P2M entry") +Cc: stable@vger.kernel.org +Signed-off-by: Juergen Gross +Reviewed-by: Jan Beulich +Link: https://lore.kernel.org/r/20210730092622.9973-2-jgross@suse.com +Signed-off-by: Juergen Gross +Signed-off-by: Greg Kroah-Hartman +--- + arch/x86/xen/p2m.c | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +--- a/arch/x86/xen/p2m.c ++++ b/arch/x86/xen/p2m.c +@@ -622,8 +622,8 @@ int xen_alloc_p2m_entry(unsigned long pf + } + + /* Expanded the p2m? */ +- if (pfn > xen_p2m_last_pfn) { +- xen_p2m_last_pfn = pfn; ++ if (pfn >= xen_p2m_last_pfn) { ++ xen_p2m_last_pfn = ALIGN(pfn + 1, P2M_PER_PAGE); + HYPERVISOR_shared_info->arch.max_pfn = xen_p2m_last_pfn; + } +
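The rounding in the xen max_pfn fix above can be sketched in plain user-space C. This is an illustration only, not the kernel code: `P2M_PER_PAGE` is assumed to be 512 (4 KiB pages holding 8-byte entries), `ALIGN_UP` is a local stand-in for the kernel's `ALIGN()` helper, and the two static variables are hypothetical stand-ins for `xen_p2m_last_pfn` and `HYPERVISOR_shared_info->arch.max_pfn`.

```c
#include <stdio.h>

/* Assumption: 4 KiB pages and 8-byte p2m entries, i.e. 512 entries per P2M frame. */
#define P2M_PER_PAGE 512UL

/* Round x up to the next multiple of a (a must be a power of two).
 * Local stand-in for the kernel's ALIGN() macro. */
#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* Hypothetical stand-ins for xen_p2m_last_pfn and shared_info->arch.max_pfn. */
static unsigned long p2m_last_pfn;
static unsigned long shared_max_pfn;

/* Mirrors the fixed branch of the patch: when pfn reaches or passes the
 * recorded limit, store "last possible PFN of the new P2M frame + 1",
 * not the PFN that happened to trigger the expansion. */
static void note_p2m_expansion(unsigned long pfn)
{
	if (pfn >= p2m_last_pfn) {
		p2m_last_pfn = ALIGN_UP(pfn + 1, P2M_PER_PAGE);
		shared_max_pfn = p2m_last_pfn;
	}
}

/* Small helper to show the effect of one call. */
static void show(unsigned long pfn)
{
	note_p2m_expansion(pfn);
	printf("pfn=%lu -> max_pfn=%lu\n", pfn, shared_max_pfn);
}
```

Under these assumptions, calling `show(1000)` records a max_pfn of 1024 rather than 1000. A later `show(1023)` leaves the value untouched, and `show(1024)` moves it to 1536, which is why the comparison in the patch changed from `>` to `>=`.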