From: Greg Kroah-Hartman Date: Mon, 17 Feb 2020 19:20:33 +0000 (+0100) Subject: 4.19-stable patches X-Git-Tag: v4.19.105~23 X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=4164e3da3ba715239929c37c8f07344a1ac37b6a;p=thirdparty%2Fkernel%2Fstable-queue.git 4.19-stable patches added patches: arm-npcm-bring-back-gpiolib-support.patch arm64-ssbs-fix-context-switch-when-ssbs-is-present-on-all-cpus.patch btrfs-fix-race-between-using-extent-maps-and-merging-them.patch btrfs-log-message-when-rw-remount-is-attempted-with-unclean-tree-log.patch btrfs-print-message-when-tree-log-replay-starts.patch btrfs-ref-verify-fix-memory-leaks.patch ext4-add-cond_resched-to-ext4_protect_reserved_inode.patch ext4-don-t-assume-that-mmp_nodename-bdevname-have-nul.patch ext4-fix-checksum-errors-with-indexed-dirs.patch ext4-fix-support-for-inode-sizes-1024-bytes.patch ext4-improve-explanation-of-a-mount-failure-caused-by-a-misconfigured-kernel.patch kvm-nvmx-use-correct-root-level-for-nested-ept-shadow-page-tables.patch perf-x86-amd-add-missing-l2-misses-event-spec-to-amd-family-17h-s-event-map.patch --- diff --git a/queue-4.19/arm-npcm-bring-back-gpiolib-support.patch b/queue-4.19/arm-npcm-bring-back-gpiolib-support.patch new file mode 100644 index 00000000000..a72e045cfd7 --- /dev/null +++ b/queue-4.19/arm-npcm-bring-back-gpiolib-support.patch @@ -0,0 +1,35 @@ +From e383e871ab54f073c2a798a9e0bde7f1d0528de8 Mon Sep 17 00:00:00 2001 +From: Krzysztof Kozlowski +Date: Thu, 30 Jan 2020 20:55:24 +0100 +Subject: ARM: npcm: Bring back GPIOLIB support + +From: Krzysztof Kozlowski + +commit e383e871ab54f073c2a798a9e0bde7f1d0528de8 upstream. + +The CONFIG_ARCH_REQUIRE_GPIOLIB is gone since commit 65053e1a7743 +("gpio: delete ARCH_[WANTS_OPTIONAL|REQUIRE]_GPIOLIB") and all platforms +should explicitly select GPIOLIB to have it. 
+ +Link: https://lore.kernel.org/r/20200130195525.4525-1-krzk@kernel.org +Cc: +Fixes: 65053e1a7743 ("gpio: delete ARCH_[WANTS_OPTIONAL|REQUIRE]_GPIOLIB") +Signed-off-by: Krzysztof Kozlowski +Signed-off-by: Olof Johansson +Signed-off-by: Greg Kroah-Hartman + +--- + arch/arm/mach-npcm/Kconfig | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +--- a/arch/arm/mach-npcm/Kconfig ++++ b/arch/arm/mach-npcm/Kconfig +@@ -10,7 +10,7 @@ config ARCH_NPCM7XX + depends on ARCH_MULTI_V7 + select PINCTRL_NPCM7XX + select NPCM7XX_TIMER +- select ARCH_REQUIRE_GPIOLIB ++ select GPIOLIB + select CACHE_L2X0 + select ARM_GIC + select HAVE_ARM_TWD if SMP diff --git a/queue-4.19/arm64-ssbs-fix-context-switch-when-ssbs-is-present-on-all-cpus.patch b/queue-4.19/arm64-ssbs-fix-context-switch-when-ssbs-is-present-on-all-cpus.patch new file mode 100644 index 00000000000..3835b3b456e --- /dev/null +++ b/queue-4.19/arm64-ssbs-fix-context-switch-when-ssbs-is-present-on-all-cpus.patch @@ -0,0 +1,46 @@ +From fca3d33d8ad61eb53eca3ee4cac476d1e31b9008 Mon Sep 17 00:00:00 2001 +From: Will Deacon +Date: Thu, 6 Feb 2020 10:42:58 +0000 +Subject: arm64: ssbs: Fix context-switch when SSBS is present on all CPUs + +From: Will Deacon + +commit fca3d33d8ad61eb53eca3ee4cac476d1e31b9008 upstream. + +When all CPUs in the system implement the SSBS extension, the SSBS field +in PSTATE is the definitive indication of the mitigation state. Further, +when the CPUs implement the SSBS manipulation instructions (advertised +to userspace via an HWCAP), EL0 can toggle the SSBS field directly and +so we cannot rely on any shadow state such as TIF_SSBD at all. + +Avoid forcing the SSBS field in context-switch on such a system, and +simply rely on the PSTATE register instead. 
+ +Cc: +Cc: Catalin Marinas +Cc: Srinivas Ramana +Fixes: cbdf8a189a66 ("arm64: Force SSBS on context switch") +Reviewed-by: Marc Zyngier +Signed-off-by: Will Deacon +Signed-off-by: Greg Kroah-Hartman + +--- + arch/arm64/kernel/process.c | 7 +++++++ + 1 file changed, 7 insertions(+) + +--- a/arch/arm64/kernel/process.c ++++ b/arch/arm64/kernel/process.c +@@ -414,6 +414,13 @@ static void ssbs_thread_switch(struct ta + if (unlikely(next->flags & PF_KTHREAD)) + return; + ++ /* ++ * If all CPUs implement the SSBS extension, then we just need to ++ * context-switch the PSTATE field. ++ */ ++ if (cpu_have_feature(cpu_feature(SSBS))) ++ return; ++ + /* If the mitigation is enabled, then we leave SSBS clear. */ + if ((arm64_get_ssbd_state() == ARM64_SSBD_FORCE_ENABLE) || + test_tsk_thread_flag(next, TIF_SSBD)) diff --git a/queue-4.19/btrfs-fix-race-between-using-extent-maps-and-merging-them.patch b/queue-4.19/btrfs-fix-race-between-using-extent-maps-and-merging-them.patch new file mode 100644 index 00000000000..2e06fbe987c --- /dev/null +++ b/queue-4.19/btrfs-fix-race-between-using-extent-maps-and-merging-them.patch @@ -0,0 +1,128 @@ +From ac05ca913e9f3871126d61da275bfe8516ff01ca Mon Sep 17 00:00:00 2001 +From: Filipe Manana +Date: Fri, 31 Jan 2020 14:06:07 +0000 +Subject: Btrfs: fix race between using extent maps and merging them + +From: Filipe Manana + +commit ac05ca913e9f3871126d61da275bfe8516ff01ca upstream. + +We have a few cases where we allow an extent map that is in an extent map +tree to be merged with other extents in the tree. Such cases include the +unpinning of an extent after the respective ordered extent completed or +after logging an extent during a fast fsync. This can lead to subtle and +dangerous problems because when doing the merge some other task might be +using the same extent map and as consequence see an inconsistent state of +the extent map - for example sees the new length but has seen the old start +offset. 
+ +With luck this triggers a BUG_ON(), and not some silent bug, such as the +following one in __do_readpage(): + + $ cat -n fs/btrfs/extent_io.c + 3061 static int __do_readpage(struct extent_io_tree *tree, + 3062 struct page *page, + (...) + 3127 em = __get_extent_map(inode, page, pg_offset, cur, + 3128 end - cur + 1, get_extent, em_cached); + 3129 if (IS_ERR_OR_NULL(em)) { + 3130 SetPageError(page); + 3131 unlock_extent(tree, cur, end); + 3132 break; + 3133 } + 3134 extent_offset = cur - em->start; + 3135 BUG_ON(extent_map_end(em) <= cur); + (...) + +Consider the following example scenario, where we end up hitting the +BUG_ON() in __do_readpage(). + +We have an inode with a size of 8KiB and 2 extent maps: + + extent A: file offset 0, length 4KiB, disk_bytenr = X, persisted on disk by + a previous transaction + + extent B: file offset 4KiB, length 4KiB, disk_bytenr = X + 4KiB, not yet + persisted but writeback started for it already. The extent map + is pinned since there's writeback and an ordered extent in + progress, so it can not be merged with extent map A yet + +The following sequence of steps leads to the BUG_ON(): + +1) The ordered extent for extent B completes, the respective page gets its + writeback bit cleared and the extent map is unpinned, at that point it + is not yet merged with extent map A because it's in the list of modified + extents; + +2) Due to memory pressure, or some other reason, the MM subsystem releases + the page corresponding to extent B - btrfs_releasepage() is called and + returns 1, meaning the page can be released as it's not dirty, not under + writeback anymore and the extent range is not locked in the inode's + iotree. 
However the extent map is not released, either because we are + not in a context that allows memory allocations to block or because the + inode's size is smaller than 16MiB - in this case our inode has a size + of 8KiB; + +3) Task B needs to read extent B and ends up in __do_readpage() through the + btrfs_readpage() callback. At __do_readpage() it gets a reference to + extent map B; + +4) Task A, doing a fast fsync, calls clear_em_logging() against extent map B + while holding the write lock on the inode's extent map tree - this + results in try_merge_map() being called and since it's possible to merge + extent map B with extent map A now (the extent map B was removed from + the list of modified extents), the merging begins - it sets extent map + B's start offset to 0 (was 4KiB), but before it increments the map's + length to 8KiB (4KiB + 4KiB), task B is at: + + BUG_ON(extent_map_end(em) <= cur); + + The call to extent_map_end() sees the extent map has a start of 0 + and a length still at 4KiB, so it returns 4KiB and 'cur' is 4KiB, so + the BUG_ON() is triggered. + +So it's dangerous to modify an extent map that is in the tree, because some +other task might have got a reference to it before and still be using it, and +needs to see a consistent map while using it. Generally this is very rare +since most paths that look up and use extent maps also have the file range +locked in the inode's iotree. The fsync path is pretty much the only +exception where we don't do it to avoid serialization with concurrent +reads. + +Fix this by not allowing an extent map to be merged if it's being used +by tasks other than the one attempting to merge the extent map (when the +reference count of the extent map is greater than 2). 
+ +Reported-by: ryusuke1925 +Reported-by: Koki Mitani +Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=206211 +CC: stable@vger.kernel.org # 4.4+ +Reviewed-by: Josef Bacik +Signed-off-by: Filipe Manana +Signed-off-by: David Sterba +Signed-off-by: Greg Kroah-Hartman + +--- + fs/btrfs/extent_map.c | 11 +++++++++++ + 1 file changed, 11 insertions(+) + +--- a/fs/btrfs/extent_map.c ++++ b/fs/btrfs/extent_map.c +@@ -228,6 +228,17 @@ static void try_merge_map(struct extent_ + struct extent_map *merge = NULL; + struct rb_node *rb; + ++ /* ++ * We can't modify an extent map that is in the tree and that is being ++ * used by another task, as it can cause that other task to see it in ++ * inconsistent state during the merging. We always have 1 reference for ++ * the tree and 1 for this task (which is unpinning the extent map or ++ * clearing the logging flag), so anything > 2 means it's being used by ++ * other tasks too. ++ */ ++ if (refcount_read(&em->refs) > 2) ++ return; ++ + if (em->start != 0) { + rb = rb_prev(&em->rb_node); + if (rb) diff --git a/queue-4.19/btrfs-log-message-when-rw-remount-is-attempted-with-unclean-tree-log.patch b/queue-4.19/btrfs-log-message-when-rw-remount-is-attempted-with-unclean-tree-log.patch new file mode 100644 index 00000000000..f86b15a0853 --- /dev/null +++ b/queue-4.19/btrfs-log-message-when-rw-remount-is-attempted-with-unclean-tree-log.patch @@ -0,0 +1,37 @@ +From 10a3a3edc5b89a8cd095bc63495fb1e0f42047d9 Mon Sep 17 00:00:00 2001 +From: David Sterba +Date: Wed, 5 Feb 2020 17:12:28 +0100 +Subject: btrfs: log message when rw remount is attempted with unclean tree-log + +From: David Sterba + +commit 10a3a3edc5b89a8cd095bc63495fb1e0f42047d9 upstream. + +A remount to a read-write filesystem is not safe when there's tree-log +to be replayed. Files that could be opened until now might be affected +by the changes in the tree-log. 
+ +A regular mount is needed to replay the log so the filesystem presents +a consistent view with the pending changes included. + +CC: stable@vger.kernel.org # 4.4+ +Reviewed-by: Anand Jain +Reviewed-by: Johannes Thumshirn +Signed-off-by: David Sterba +Signed-off-by: Greg Kroah-Hartman + +--- + fs/btrfs/super.c | 2 ++ + 1 file changed, 2 insertions(+) + +--- a/fs/btrfs/super.c ++++ b/fs/btrfs/super.c +@@ -1857,6 +1857,8 @@ static int btrfs_remount(struct super_bl + } + + if (btrfs_super_log_root(fs_info->super_copy) != 0) { ++ btrfs_warn(fs_info, ++ "mount required to replay tree-log, cannot remount read-write"); + ret = -EINVAL; + goto restore; + } diff --git a/queue-4.19/btrfs-print-message-when-tree-log-replay-starts.patch b/queue-4.19/btrfs-print-message-when-tree-log-replay-starts.patch new file mode 100644 index 00000000000..f0737f0f9af --- /dev/null +++ b/queue-4.19/btrfs-print-message-when-tree-log-replay-starts.patch @@ -0,0 +1,34 @@ +From e8294f2f6aa6208ed0923aa6d70cea3be178309a Mon Sep 17 00:00:00 2001 +From: David Sterba +Date: Wed, 5 Feb 2020 17:12:16 +0100 +Subject: btrfs: print message when tree-log replay starts + +From: David Sterba + +commit e8294f2f6aa6208ed0923aa6d70cea3be178309a upstream. + +There's no logged information about tree-log replay although this is +something that points to a previous unclean unmount. Other filesystems +report that as well. 
+ +Suggested-by: Chris Murphy +CC: stable@vger.kernel.org # 4.4+ +Reviewed-by: Anand Jain +Reviewed-by: Johannes Thumshirn +Signed-off-by: David Sterba +Signed-off-by: Greg Kroah-Hartman + +--- + fs/btrfs/disk-io.c | 1 + + 1 file changed, 1 insertion(+) + +--- a/fs/btrfs/disk-io.c ++++ b/fs/btrfs/disk-io.c +@@ -3117,6 +3117,7 @@ retry_root_backup: + /* do not make disk changes in broken FS or nologreplay is given */ + if (btrfs_super_log_root(disk_super) != 0 && + !btrfs_test_opt(fs_info, NOLOGREPLAY)) { ++ btrfs_info(fs_info, "start tree-log replay"); + ret = btrfs_replay_log(fs_info, fs_devices); + if (ret) { + err = ret; diff --git a/queue-4.19/btrfs-ref-verify-fix-memory-leaks.patch b/queue-4.19/btrfs-ref-verify-fix-memory-leaks.patch new file mode 100644 index 00000000000..5c47fdbb21c --- /dev/null +++ b/queue-4.19/btrfs-ref-verify-fix-memory-leaks.patch @@ -0,0 +1,66 @@ +From f311ade3a7adf31658ed882aaab9f9879fdccef7 Mon Sep 17 00:00:00 2001 +From: Wenwen Wang +Date: Sat, 1 Feb 2020 20:38:38 +0000 +Subject: btrfs: ref-verify: fix memory leaks + +From: Wenwen Wang + +commit f311ade3a7adf31658ed882aaab9f9879fdccef7 upstream. + +In btrfs_ref_tree_mod(), 'ref' and 'ra' are allocated through kzalloc() and +kmalloc(), respectively. In the following code, if an error occurs, the +execution will be redirected to 'out' or 'out_unlock' and the function will +be exited. However, on some of the paths, 'ref' and 'ra' are not +deallocated, leading to memory leaks. For example, if 'action' is +BTRFS_ADD_DELAYED_EXTENT, add_block_entry() will be invoked. If the return +value indicates an error, the execution will be redirected to 'out'. But, +'ref' is not deallocated on this path, causing a memory leak. + +To fix the above issues, deallocate both 'ref' and 'ra' before exiting from +the function when an error is encountered. 
+ +CC: stable@vger.kernel.org # 4.15+ +Signed-off-by: Wenwen Wang +Reviewed-by: David Sterba +Signed-off-by: David Sterba +Signed-off-by: Greg Kroah-Hartman + +--- + fs/btrfs/ref-verify.c | 5 +++++ + 1 file changed, 5 insertions(+) + +--- a/fs/btrfs/ref-verify.c ++++ b/fs/btrfs/ref-verify.c +@@ -747,6 +747,7 @@ int btrfs_ref_tree_mod(struct btrfs_root + */ + be = add_block_entry(root->fs_info, bytenr, num_bytes, ref_root); + if (IS_ERR(be)) { ++ kfree(ref); + kfree(ra); + ret = PTR_ERR(be); + goto out; +@@ -760,6 +761,8 @@ int btrfs_ref_tree_mod(struct btrfs_root + "re-allocated a block that still has references to it!"); + dump_block_entry(fs_info, be); + dump_ref_action(fs_info, ra); ++ kfree(ref); ++ kfree(ra); + goto out_unlock; + } + +@@ -822,6 +825,7 @@ int btrfs_ref_tree_mod(struct btrfs_root + "dropping a ref for a existing root that doesn't have a ref on the block"); + dump_block_entry(fs_info, be); + dump_ref_action(fs_info, ra); ++ kfree(ref); + kfree(ra); + goto out_unlock; + } +@@ -837,6 +841,7 @@ int btrfs_ref_tree_mod(struct btrfs_root + "attempting to add another ref for an existing ref on a tree block"); + dump_block_entry(fs_info, be); + dump_ref_action(fs_info, ra); ++ kfree(ref); + kfree(ra); + goto out_unlock; + } diff --git a/queue-4.19/ext4-add-cond_resched-to-ext4_protect_reserved_inode.patch b/queue-4.19/ext4-add-cond_resched-to-ext4_protect_reserved_inode.patch new file mode 100644 index 00000000000..9473487daa1 --- /dev/null +++ b/queue-4.19/ext4-add-cond_resched-to-ext4_protect_reserved_inode.patch @@ -0,0 +1,64 @@ +From af133ade9a40794a37104ecbcc2827c0ea373a3c Mon Sep 17 00:00:00 2001 +From: Shijie Luo +Date: Mon, 10 Feb 2020 20:17:52 -0500 +Subject: ext4: add cond_resched() to ext4_protect_reserved_inode + +From: Shijie Luo + +commit af133ade9a40794a37104ecbcc2827c0ea373a3c upstream. 
+ +When journal size is set too big by "mkfs.ext4 -J size=", or when +we mount a crafted image to make journal inode->i_size too big, +the loop, "while (i < num)", holds cpu too long. This could cause +soft lockup. + +[ 529.357541] Call trace: +[ 529.357551] dump_backtrace+0x0/0x198 +[ 529.357555] show_stack+0x24/0x30 +[ 529.357562] dump_stack+0xa4/0xcc +[ 529.357568] watchdog_timer_fn+0x300/0x3e8 +[ 529.357574] __hrtimer_run_queues+0x114/0x358 +[ 529.357576] hrtimer_interrupt+0x104/0x2d8 +[ 529.357580] arch_timer_handler_virt+0x38/0x58 +[ 529.357584] handle_percpu_devid_irq+0x90/0x248 +[ 529.357588] generic_handle_irq+0x34/0x50 +[ 529.357590] __handle_domain_irq+0x68/0xc0 +[ 529.357593] gic_handle_irq+0x6c/0x150 +[ 529.357595] el1_irq+0xb8/0x140 +[ 529.357599] __ll_sc_atomic_add_return_acquire+0x14/0x20 +[ 529.357668] ext4_map_blocks+0x64/0x5c0 [ext4] +[ 529.357693] ext4_setup_system_zone+0x330/0x458 [ext4] +[ 529.357717] ext4_fill_super+0x2170/0x2ba8 [ext4] +[ 529.357722] mount_bdev+0x1a8/0x1e8 +[ 529.357746] ext4_mount+0x44/0x58 [ext4] +[ 529.357748] mount_fs+0x50/0x170 +[ 529.357752] vfs_kern_mount.part.9+0x54/0x188 +[ 529.357755] do_mount+0x5ac/0xd78 +[ 529.357758] ksys_mount+0x9c/0x118 +[ 529.357760] __arm64_sys_mount+0x28/0x38 +[ 529.357764] el0_svc_common+0x78/0x130 +[ 529.357766] el0_svc_handler+0x38/0x78 +[ 529.357769] el0_svc+0x8/0xc +[ 541.356516] watchdog: BUG: soft lockup - CPU#0 stuck for 23s! 
[mount:18674] + +Link: https://lore.kernel.org/r/20200211011752.29242-1-luoshijie1@huawei.com +Reviewed-by: Jan Kara +Signed-off-by: Shijie Luo +Signed-off-by: Theodore Ts'o +Cc: stable@kernel.org +Signed-off-by: Greg Kroah-Hartman + +--- + fs/ext4/block_validity.c | 1 + + 1 file changed, 1 insertion(+) + +--- a/fs/ext4/block_validity.c ++++ b/fs/ext4/block_validity.c +@@ -203,6 +203,7 @@ static int ext4_protect_reserved_inode(s + return PTR_ERR(inode); + num = (inode->i_size + sb->s_blocksize - 1) >> sb->s_blocksize_bits; + while (i < num) { ++ cond_resched(); + map.m_lblk = i; + map.m_len = num - i; + n = ext4_map_blocks(NULL, inode, &map, 0); diff --git a/queue-4.19/ext4-don-t-assume-that-mmp_nodename-bdevname-have-nul.patch b/queue-4.19/ext4-don-t-assume-that-mmp_nodename-bdevname-have-nul.patch new file mode 100644 index 00000000000..931d990df6f --- /dev/null +++ b/queue-4.19/ext4-don-t-assume-that-mmp_nodename-bdevname-have-nul.patch @@ -0,0 +1,58 @@ +From 14c9ca0583eee8df285d68a0e6ec71053efd2228 Mon Sep 17 00:00:00 2001 +From: Andreas Dilger +Date: Sun, 26 Jan 2020 15:03:34 -0700 +Subject: ext4: don't assume that mmp_nodename/bdevname have NUL + +From: Andreas Dilger + +commit 14c9ca0583eee8df285d68a0e6ec71053efd2228 upstream. + +Don't assume that the mmp_nodename and mmp_bdevname strings are NUL +terminated, since they are filled in by snprintf(), which is not +guaranteed to do so. 
+ +Link: https://lore.kernel.org/r/1580076215-1048-1-git-send-email-adilger@dilger.ca +Signed-off-by: Andreas Dilger +Signed-off-by: Theodore Ts'o +Cc: stable@kernel.org +Signed-off-by: Greg Kroah-Hartman + +--- + fs/ext4/mmp.c | 12 +++++++----- + 1 file changed, 7 insertions(+), 5 deletions(-) + +--- a/fs/ext4/mmp.c ++++ b/fs/ext4/mmp.c +@@ -120,10 +120,10 @@ void __dump_mmp_msg(struct super_block * + { + __ext4_warning(sb, function, line, "%s", msg); + __ext4_warning(sb, function, line, +- "MMP failure info: last update time: %llu, last update " +- "node: %s, last update device: %s", +- (long long unsigned int) le64_to_cpu(mmp->mmp_time), +- mmp->mmp_nodename, mmp->mmp_bdevname); ++ "MMP failure info: last update time: %llu, last update node: %.*s, last update device: %.*s", ++ (unsigned long long)le64_to_cpu(mmp->mmp_time), ++ (int)sizeof(mmp->mmp_nodename), mmp->mmp_nodename, ++ (int)sizeof(mmp->mmp_bdevname), mmp->mmp_bdevname); + } + + /* +@@ -154,6 +154,7 @@ static int kmmpd(void *data) + mmp_check_interval = max(EXT4_MMP_CHECK_MULT * mmp_update_interval, + EXT4_MMP_MIN_CHECK_INTERVAL); + mmp->mmp_check_interval = cpu_to_le16(mmp_check_interval); ++ BUILD_BUG_ON(sizeof(mmp->mmp_bdevname) < BDEVNAME_SIZE); + bdevname(bh->b_bdev, mmp->mmp_bdevname); + + memcpy(mmp->mmp_nodename, init_utsname()->nodename, +@@ -375,7 +376,8 @@ skip: + /* + * Start a kernel thread to update the MMP block periodically. 
+ */ +- EXT4_SB(sb)->s_mmp_tsk = kthread_run(kmmpd, mmpd_data, "kmmpd-%s", ++ EXT4_SB(sb)->s_mmp_tsk = kthread_run(kmmpd, mmpd_data, "kmmpd-%.*s", ++ (int)sizeof(mmp->mmp_bdevname), + bdevname(bh->b_bdev, + mmp->mmp_bdevname)); + if (IS_ERR(EXT4_SB(sb)->s_mmp_tsk)) { diff --git a/queue-4.19/ext4-fix-checksum-errors-with-indexed-dirs.patch b/queue-4.19/ext4-fix-checksum-errors-with-indexed-dirs.patch new file mode 100644 index 00000000000..dc9a7476968 --- /dev/null +++ b/queue-4.19/ext4-fix-checksum-errors-with-indexed-dirs.patch @@ -0,0 +1,125 @@ +From 48a34311953d921235f4d7bbd2111690d2e469cf Mon Sep 17 00:00:00 2001 +From: Jan Kara +Date: Mon, 10 Feb 2020 15:43:16 +0100 +Subject: ext4: fix checksum errors with indexed dirs + +From: Jan Kara + +commit 48a34311953d921235f4d7bbd2111690d2e469cf upstream. + +DIR_INDEX has been introduced as a compat ext4 feature. That means that +even kernels / tools that don't understand the feature may modify the +filesystem. This works because for kernels not understanding indexed dir +format, internal htree nodes appear just as empty directory entries. +Index dir aware kernels then check the htree structure is still +consistent before using the data. This all worked reasonably well until +metadata checksums were introduced. The problem is that these +effectively made DIR_INDEX only ro-compatible because internal htree +nodes store checksums in a different place than normal directory blocks. +Thus any modification ignorant to DIR_INDEX (or just clearing +EXT4_INDEX_FL from the inode) will effectively cause checksum mismatch +and trigger kernel errors. So we have to be more careful when dealing +with indexed directories on filesystems with checksumming enabled. + +1) We just disallow loading any directory inodes with EXT4_INDEX_FL when +DIR_INDEX is not enabled. 
This is harsh but it should be very rare (it +means someone disabled DIR_INDEX on an existing filesystem and didn't run +e2fsck), e2fsck can fix the problem, and we don't want to answer the +difficult question: "Should we rather corrupt the directory more or +should we ignore that DIR_INDEX feature is not set?" + +2) When we find out the htree structure is corrupted (but the filesystem and +the directory should support htrees), we continue just ignoring htree +information for reading but we refuse to add new entries to the +directory to avoid corrupting it more. + +Link: https://lore.kernel.org/r/20200210144316.22081-1-jack@suse.cz +Fixes: dbe89444042a ("ext4: Calculate and verify checksums for htree nodes") +Reviewed-by: Andreas Dilger +Signed-off-by: Jan Kara +Signed-off-by: Theodore Ts'o +Cc: stable@kernel.org +Signed-off-by: Greg Kroah-Hartman + +--- + fs/ext4/dir.c | 14 ++++++++------ + fs/ext4/ext4.h | 5 ++++- + fs/ext4/inode.c | 12 ++++++++++++ + fs/ext4/namei.c | 7 +++++++ + 4 files changed, 31 insertions(+), 7 deletions(-) + +--- a/fs/ext4/dir.c ++++ b/fs/ext4/dir.c +@@ -126,12 +126,14 @@ static int ext4_readdir(struct file *fil + if (err != ERR_BAD_DX_DIR) { + return err; + } +- /* +- * We don't set the inode dirty flag since it's not +- * critical that it get flushed back to the disk. +- */ +- ext4_clear_inode_flag(file_inode(file), +- EXT4_INODE_INDEX); ++ /* Can we just clear INDEX flag to ignore htree information? */ ++ if (!ext4_has_metadata_csum(sb)) { ++ /* ++ * We don't set the inode dirty flag since it's not ++ * critical that it gets flushed back to the disk. 
++ */ ++ ext4_clear_inode_flag(inode, EXT4_INODE_INDEX); ++ } + } + + if (ext4_has_inline_data(inode)) { +--- a/fs/ext4/ext4.h ++++ b/fs/ext4/ext4.h +@@ -2375,8 +2375,11 @@ void ext4_insert_dentry(struct inode *in + struct ext4_filename *fname); + static inline void ext4_update_dx_flag(struct inode *inode) + { +- if (!ext4_has_feature_dir_index(inode->i_sb)) ++ if (!ext4_has_feature_dir_index(inode->i_sb)) { ++ /* ext4_iget() should have caught this... */ ++ WARN_ON_ONCE(ext4_has_feature_metadata_csum(inode->i_sb)); + ext4_clear_inode_flag(inode, EXT4_INODE_INDEX); ++ } + } + static const unsigned char ext4_filetype_table[] = { + DT_UNKNOWN, DT_REG, DT_DIR, DT_CHR, DT_BLK, DT_FIFO, DT_SOCK, DT_LNK +--- a/fs/ext4/inode.c ++++ b/fs/ext4/inode.c +@@ -4975,6 +4975,18 @@ struct inode *__ext4_iget(struct super_b + ret = -EFSCORRUPTED; + goto bad_inode; + } ++ /* ++ * If dir_index is not enabled but there's dir with INDEX flag set, ++ * we'd normally treat htree data as empty space. But with metadata ++ * checksumming that corrupts checksums so forbid that. ++ */ ++ if (!ext4_has_feature_dir_index(sb) && ext4_has_metadata_csum(sb) && ++ ext4_test_inode_flag(inode, EXT4_INODE_INDEX)) { ++ ext4_error_inode(inode, function, line, 0, ++ "iget: Dir with htree data on filesystem without dir_index feature."); ++ ret = -EFSCORRUPTED; ++ goto bad_inode; ++ } + ei->i_disksize = inode->i_size; + #ifdef CONFIG_QUOTA + ei->i_reserved_quota = 0; +--- a/fs/ext4/namei.c ++++ b/fs/ext4/namei.c +@@ -2085,6 +2085,13 @@ static int ext4_add_entry(handle_t *hand + retval = ext4_dx_add_entry(handle, &fname, dir, inode); + if (!retval || (retval != ERR_BAD_DX_DIR)) + goto out; ++ /* Can we just ignore htree data? 
*/ ++ if (ext4_has_metadata_csum(sb)) { ++ EXT4_ERROR_INODE(dir, ++ "Directory has corrupted htree index."); ++ retval = -EFSCORRUPTED; ++ goto out; ++ } + ext4_clear_inode_flag(dir, EXT4_INODE_INDEX); + dx_fallback++; + ext4_mark_inode_dirty(handle, dir); diff --git a/queue-4.19/ext4-fix-support-for-inode-sizes-1024-bytes.patch b/queue-4.19/ext4-fix-support-for-inode-sizes-1024-bytes.patch new file mode 100644 index 00000000000..c5ca42c2402 --- /dev/null +++ b/queue-4.19/ext4-fix-support-for-inode-sizes-1024-bytes.patch @@ -0,0 +1,72 @@ +From 4f97a68192bd33b9963b400759cef0ca5963af00 Mon Sep 17 00:00:00 2001 +From: Theodore Ts'o +Date: Thu, 6 Feb 2020 17:35:01 -0500 +Subject: ext4: fix support for inode sizes > 1024 bytes + +From: Theodore Ts'o + +commit 4f97a68192bd33b9963b400759cef0ca5963af00 upstream. + +A recent commit, 9803387c55f7 ("ext4: validate the +debug_want_extra_isize mount option at parse time"), moved mount-time +checks around. One of those changes moved the inode size check before +the blocksize variable was set to the blocksize of the file system. +After 9803387c55f7, blocksize was still set to the minimum allowable +blocksize at that point, which +in practice on most systems would be 1024 bytes. 
This caused file +systems with inode sizes larger than 1024 bytes to be rejected with a +message: + +EXT4-fs (sdXX): unsupported inode size: 4096 + +Fixes: 9803387c55f7 ("ext4: validate the debug_want_extra_isize mount option at parse time") +Link: https://lore.kernel.org/r/20200206225252.GA3673@mit.edu +Reported-by: Herbert Poetzl +Signed-off-by: Theodore Ts'o +Cc: stable@kernel.org +Signed-off-by: Greg Kroah-Hartman + +--- + fs/ext4/super.c | 18 ++++++++++-------- + 1 file changed, 10 insertions(+), 8 deletions(-) + +--- a/fs/ext4/super.c ++++ b/fs/ext4/super.c +@@ -3727,6 +3727,15 @@ static int ext4_fill_super(struct super_ + */ + sbi->s_li_wait_mult = EXT4_DEF_LI_WAIT_MULT; + ++ blocksize = BLOCK_SIZE << le32_to_cpu(es->s_log_block_size); ++ if (blocksize < EXT4_MIN_BLOCK_SIZE || ++ blocksize > EXT4_MAX_BLOCK_SIZE) { ++ ext4_msg(sb, KERN_ERR, ++ "Unsupported filesystem blocksize %d (%d log_block_size)", ++ blocksize, le32_to_cpu(es->s_log_block_size)); ++ goto failed_mount; ++ } ++ + if (le32_to_cpu(es->s_rev_level) == EXT4_GOOD_OLD_REV) { + sbi->s_inode_size = EXT4_GOOD_OLD_INODE_SIZE; + sbi->s_first_ino = EXT4_GOOD_OLD_FIRST_INO; +@@ -3744,6 +3753,7 @@ static int ext4_fill_super(struct super_ + ext4_msg(sb, KERN_ERR, + "unsupported inode size: %d", + sbi->s_inode_size); ++ ext4_msg(sb, KERN_ERR, "blocksize: %d", blocksize); + goto failed_mount; + } + /* +@@ -3907,14 +3917,6 @@ static int ext4_fill_super(struct super_ + if (!ext4_feature_set_ok(sb, (sb_rdonly(sb)))) + goto failed_mount; + +- blocksize = BLOCK_SIZE << le32_to_cpu(es->s_log_block_size); +- if (blocksize < EXT4_MIN_BLOCK_SIZE || +- blocksize > EXT4_MAX_BLOCK_SIZE) { +- ext4_msg(sb, KERN_ERR, +- "Unsupported filesystem blocksize %d (%d log_block_size)", +- blocksize, le32_to_cpu(es->s_log_block_size)); +- goto failed_mount; +- } + if (le32_to_cpu(es->s_log_block_size) > + (EXT4_MAX_BLOCK_LOG_SIZE - EXT4_MIN_BLOCK_LOG_SIZE)) { + ext4_msg(sb, KERN_ERR, diff --git 
a/queue-4.19/ext4-improve-explanation-of-a-mount-failure-caused-by-a-misconfigured-kernel.patch b/queue-4.19/ext4-improve-explanation-of-a-mount-failure-caused-by-a-misconfigured-kernel.patch new file mode 100644 index 00000000000..fec3c43d0e3 --- /dev/null +++ b/queue-4.19/ext4-improve-explanation-of-a-mount-failure-caused-by-a-misconfigured-kernel.patch @@ -0,0 +1,55 @@ +From d65d87a07476aa17df2dcb3ad18c22c154315bec Mon Sep 17 00:00:00 2001 +From: Theodore Ts'o +Date: Fri, 14 Feb 2020 18:11:19 -0500 +Subject: ext4: improve explanation of a mount failure caused by a misconfigured kernel + +From: Theodore Ts'o + +commit d65d87a07476aa17df2dcb3ad18c22c154315bec upstream. + +If CONFIG_QFMT_V2 is not enabled, but CONFIG_QUOTA is enabled, when a +user tries to mount a file system with the quota or project quota +enabled, the kernel will emit a very confusing message: + + EXT4-fs warning (device vdc): ext4_enable_quotas:5914: Failed to enable quota tracking (type=0, err=-3). Please run e2fsck to fix. + EXT4-fs (vdc): mount failed + +We will now report an explanatory message indicating which kernel +configuration options have to be enabled, to avoid customer/sysadmin +confusion. 
+ +Link: https://lore.kernel.org/r/20200215012738.565735-1-tytso@mit.edu +Google-Bug-Id: 149093531 +Fixes: 7c319d328505b778 ("ext4: make quota as first class supported feature") +Signed-off-by: Theodore Ts'o +Cc: stable@kernel.org +Signed-off-by: Greg Kroah-Hartman + +--- + fs/ext4/super.c | 14 ++++---------- + 1 file changed, 4 insertions(+), 10 deletions(-) + +--- a/fs/ext4/super.c ++++ b/fs/ext4/super.c +@@ -2923,17 +2923,11 @@ static int ext4_feature_set_ok(struct su + return 0; + } + +-#ifndef CONFIG_QUOTA +- if (ext4_has_feature_quota(sb) && !readonly) { ++#if !defined(CONFIG_QUOTA) || !defined(CONFIG_QFMT_V2) ++ if (!readonly && (ext4_has_feature_quota(sb) || ++ ext4_has_feature_project(sb))) { + ext4_msg(sb, KERN_ERR, +- "Filesystem with quota feature cannot be mounted RDWR " +- "without CONFIG_QUOTA"); +- return 0; +- } +- if (ext4_has_feature_project(sb) && !readonly) { +- ext4_msg(sb, KERN_ERR, +- "Filesystem with project quota feature cannot be mounted RDWR " +- "without CONFIG_QUOTA"); ++ "The kernel was not built with CONFIG_QUOTA and CONFIG_QFMT_V2"); + return 0; + } + #endif /* CONFIG_QUOTA */ diff --git a/queue-4.19/kvm-nvmx-use-correct-root-level-for-nested-ept-shadow-page-tables.patch b/queue-4.19/kvm-nvmx-use-correct-root-level-for-nested-ept-shadow-page-tables.patch new file mode 100644 index 00000000000..78e8b3f43f9 --- /dev/null +++ b/queue-4.19/kvm-nvmx-use-correct-root-level-for-nested-ept-shadow-page-tables.patch @@ -0,0 +1,37 @@ +From 148d735eb55d32848c3379e460ce365f2c1cbe4b Mon Sep 17 00:00:00 2001 +From: Sean Christopherson +Date: Fri, 7 Feb 2020 09:37:41 -0800 +Subject: KVM: nVMX: Use correct root level for nested EPT shadow page tables + +From: Sean Christopherson + +commit 148d735eb55d32848c3379e460ce365f2c1cbe4b upstream. + +Hardcode the EPT page-walk level for L2 to be 4 levels, as KVM's MMU +currently also hardcodes the page walk level for nested EPT to be 4 +levels. 
The L2 guest is all but guaranteed to soft hang on its first +instruction when L1 is using EPT, as KVM will construct 4-level page +tables and then tell hardware to use 5-level page tables. + +Fixes: 855feb673640 ("KVM: MMU: Add 5 level EPT & Shadow page table support.") +Cc: stable@vger.kernel.org +Signed-off-by: Sean Christopherson +Signed-off-by: Paolo Bonzini +Signed-off-by: Greg Kroah-Hartman + +--- + arch/x86/kvm/vmx/vmx.c | 3 +++ + 1 file changed, 3 insertions(+) + +--- a/arch/x86/kvm/vmx/vmx.c ++++ b/arch/x86/kvm/vmx/vmx.c +@@ -2968,6 +2968,9 @@ void vmx_set_cr0(struct kvm_vcpu *vcpu, + + static int get_ept_level(struct kvm_vcpu *vcpu) + { ++ /* Nested EPT currently only supports 4-level walks. */ ++ if (is_guest_mode(vcpu) && nested_cpu_has_ept(get_vmcs12(vcpu))) ++ return 4; + if (cpu_has_vmx_ept_5levels() && (cpuid_maxphyaddr(vcpu) > 48)) + return 5; + return 4; diff --git a/queue-4.19/perf-x86-amd-add-missing-l2-misses-event-spec-to-amd-family-17h-s-event-map.patch b/queue-4.19/perf-x86-amd-add-missing-l2-misses-event-spec-to-amd-family-17h-s-event-map.patch new file mode 100644 index 00000000000..cfe91bfc78d --- /dev/null +++ b/queue-4.19/perf-x86-amd-add-missing-l2-misses-event-spec-to-amd-family-17h-s-event-map.patch @@ -0,0 +1,74 @@ +From 25d387287cf0330abf2aad761ce6eee67326a355 Mon Sep 17 00:00:00 2001 +From: Kim Phillips +Date: Tue, 21 Jan 2020 11:12:31 -0600 +Subject: perf/x86/amd: Add missing L2 misses event spec to AMD Family 17h's event map + +From: Kim Phillips + +commit 25d387287cf0330abf2aad761ce6eee67326a355 upstream. + +Commit 3fe3331bb285 ("perf/x86/amd: Add event map for AMD Family 17h"), +claimed L2 misses were unsupported, due to them not being found in its +referenced documentation, whose link has now moved [1]. + +That old documentation listed PMCx064 unit mask bit 3 as: + + "LsRdBlkC: LS Read Block C S L X Change to X Miss." 
+ +and bit 0 as: + + "IcFillMiss: IC Fill Miss" + +We now have new public documentation [2] with improved descriptions, that +clearly indicate what events those unit mask bits represent: + +Bit 3 now clearly states: + + "LsRdBlkC: Data Cache Req Miss in L2 (all types)" + +and bit 0 is: + + "IcFillMiss: Instruction Cache Req Miss in L2." + +So we can now add support for L2 misses in perf's genericised events as +PMCx064 with both the above unit masks. + +[1] The commit's original documentation reference, "Processor Programming + Reference (PPR) for AMD Family 17h Model 01h, Revision B1 Processors", + originally available here: + + https://www.amd.com/system/files/TechDocs/54945_PPR_Family_17h_Models_00h-0Fh.pdf + + is now available here: + + https://developer.amd.com/wordpress/media/2017/11/54945_PPR_Family_17h_Models_00h-0Fh.pdf + +[2] "Processor Programming Reference (PPR) for Family 17h Model 31h, + Revision B0 Processors", available here: + + https://developer.amd.com/wp-content/resources/55803_0.54-PUB.pdf + +Fixes: 3fe3331bb285 ("perf/x86/amd: Add event map for AMD Family 17h") +Reported-by: Babu Moger +Signed-off-by: Kim Phillips +Signed-off-by: Peter Zijlstra (Intel) +Signed-off-by: Ingo Molnar +Tested-by: Babu Moger +Cc: stable@vger.kernel.org +Link: https://lkml.kernel.org/r/20200121171232.28839-1-kim.phillips@amd.com +Signed-off-by: Greg Kroah-Hartman + +--- + arch/x86/events/amd/core.c | 1 + + 1 file changed, 1 insertion(+) + +--- a/arch/x86/events/amd/core.c ++++ b/arch/x86/events/amd/core.c +@@ -245,6 +245,7 @@ static const u64 amd_f17h_perfmon_event_ + [PERF_COUNT_HW_CPU_CYCLES] = 0x0076, + [PERF_COUNT_HW_INSTRUCTIONS] = 0x00c0, + [PERF_COUNT_HW_CACHE_REFERENCES] = 0xff60, ++ [PERF_COUNT_HW_CACHE_MISSES] = 0x0964, + [PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = 0x00c2, + [PERF_COUNT_HW_BRANCH_MISSES] = 0x00c3, + [PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] = 0x0287, diff --git a/queue-4.19/series b/queue-4.19/series index 56446228342..45e16d512a2 100644 --- 
a/queue-4.19/series +++ b/queue-4.19/series @@ -8,3 +8,16 @@ arm64-cpufeature-set-the-fp-simd-compat-hwcap-bits-p.patch arm64-nofpsmid-handle-tif_foreign_fpstate-flag-clean.patch alsa-usb-audio-sound-usb-usb-true-false-for-bool-return-type.patch alsa-usb-audio-add-clock-validity-quirk-for-denon-mc7000-mcx8000.patch +ext4-don-t-assume-that-mmp_nodename-bdevname-have-nul.patch +ext4-fix-support-for-inode-sizes-1024-bytes.patch +ext4-fix-checksum-errors-with-indexed-dirs.patch +ext4-add-cond_resched-to-ext4_protect_reserved_inode.patch +ext4-improve-explanation-of-a-mount-failure-caused-by-a-misconfigured-kernel.patch +btrfs-fix-race-between-using-extent-maps-and-merging-them.patch +btrfs-ref-verify-fix-memory-leaks.patch +btrfs-print-message-when-tree-log-replay-starts.patch +btrfs-log-message-when-rw-remount-is-attempted-with-unclean-tree-log.patch +arm-npcm-bring-back-gpiolib-support.patch +arm64-ssbs-fix-context-switch-when-ssbs-is-present-on-all-cpus.patch +kvm-nvmx-use-correct-root-level-for-nested-ept-shadow-page-tables.patch +perf-x86-amd-add-missing-l2-misses-event-spec-to-amd-family-17h-s-event-map.patch
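Note on the perf/x86/amd patch queued above: the value 0x0964 it adds for PERF_COUNT_HW_CACHE_MISSES can be read as event select 0x64 (PMCx064) combined with unit mask bits 0 (IcFillMiss) and 3 (LsRdBlkC). The sketch below shows that arithmetic. It assumes the layout the other table entries suggest, with the event select in bits 7:0 and the unit mask in bits 15:8, and it only covers event codes that fit in one byte. The macro and function names are hypothetical and are not kernel API.

```c
#include <stdint.h>

/* Hypothetical names for illustration only; none of these exist in the kernel. */
#define AMD_L2_MISS_EVENT_SELECT 0x64u      /* PMCx064 */
#define AMD_UMASK_IC_FILL_MISS   (1u << 0)  /* bit 0: IcFillMiss */
#define AMD_UMASK_LS_RD_BLK_C    (1u << 3)  /* bit 3: LsRdBlkC */

/*
 * Assumed layout: event select in bits 7:0, unit mask in bits 15:8.
 * Only valid for event codes that fit in 8 bits.
 */
static inline uint64_t amd_raw_event(uint8_t event_select, uint8_t unit_mask)
{
	return ((uint64_t)unit_mask << 8) | event_select;
}

/* Combine both unit mask bits named in the commit message for PMCx064. */
static inline uint64_t amd_f17h_l2_misses(void)
{
	return amd_raw_event(AMD_L2_MISS_EVENT_SELECT,
			     AMD_UMASK_IC_FILL_MISS | AMD_UMASK_LS_RD_BLK_C);
}
```

Under the same assumption, the neighbouring entries decompose the same way: 0x0076 is event 0x76 with an empty unit mask, and 0xff60 is event 0x60 with all eight unit mask bits set.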