From: Sasha Levin Date: Mon, 15 Mar 2021 03:01:10 +0000 (-0400) Subject: Fixes for 5.4 X-Git-Tag: v4.4.262~40 X-Git-Url: http://git.ipfire.org/gitweb.cgi?a=commitdiff_plain;h=8a169f32272ff90b502b790f5def6dbc32ed6b8c;p=thirdparty%2Fkernel%2Fstable-queue.git Fixes for 5.4 Signed-off-by: Sasha Levin --- diff --git a/queue-5.4/arm64-mm-fix-pfn_valid-for-zone_device-based-memory.patch b/queue-5.4/arm64-mm-fix-pfn_valid-for-zone_device-based-memory.patch new file mode 100644 index 00000000000..ee37ced9f23 --- /dev/null +++ b/queue-5.4/arm64-mm-fix-pfn_valid-for-zone_device-based-memory.patch @@ -0,0 +1,77 @@ +From e8e8bcdba53a0ddb816ef4933994c7b1437c11b8 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 5 Mar 2021 10:54:57 +0530 +Subject: arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory + +From: Anshuman Khandual + +[ Upstream commit eeb0753ba27b26f609e61f9950b14f1b934fe429 ] + +pfn_valid() validates a pfn but basically it checks for a valid struct page +backing for that pfn. It should always return positive for memory ranges +backed with struct page mapping. But currently pfn_valid() fails for all +ZONE_DEVICE based memory types even though they have struct page mapping. + +pfn_valid() asserts that there is a memblock entry for a given pfn without +MEMBLOCK_NOMAP flag being set. The problem with ZONE_DEVICE based memory is +that they do not have memblock entries. Hence memblock_is_map_memory() will +invariably fail via memblock_search() for a ZONE_DEVICE based address. This +eventually fails pfn_valid() which is wrong. memblock_is_map_memory() needs +to be skipped for such memory ranges. As ZONE_DEVICE memory gets hotplugged +into the system via memremap_pages() called from a driver, their respective +memory sections will not have SECTION_IS_EARLY set. + +Normal hotplug memory will never have MEMBLOCK_NOMAP set in their memblock +regions. Because the flag MEMBLOCK_NOMAP was specifically designed and set +for firmware reserved memory regions. 
memblock_is_map_memory() can just be +skipped as its always going to be positive and that will be an optimization +for the normal hotplug memory. Like ZONE_DEVICE based memory, all normal +hotplugged memory too will not have SECTION_IS_EARLY set for their sections + +Skipping memblock_is_map_memory() for all non early memory sections would +fix pfn_valid() problem for ZONE_DEVICE based memory and also improve its +performance for normal hotplug memory as well. + +Cc: Catalin Marinas +Cc: Will Deacon +Cc: Ard Biesheuvel +Cc: Robin Murphy +Cc: linux-arm-kernel@lists.infradead.org +Cc: linux-kernel@vger.kernel.org +Acked-by: David Hildenbrand +Fixes: 73b20c84d42d ("arm64: mm: implement pte_devmap support") +Signed-off-by: Anshuman Khandual +Acked-by: Catalin Marinas +Link: https://lore.kernel.org/r/1614921898-4099-2-git-send-email-anshuman.khandual@arm.com +Signed-off-by: Will Deacon +Signed-off-by: Sasha Levin +--- + arch/arm64/mm/init.c | 12 ++++++++++++ + 1 file changed, 12 insertions(+) + +diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c +index 602bd19630ff..cbcac03c0e0d 100644 +--- a/arch/arm64/mm/init.c ++++ b/arch/arm64/mm/init.c +@@ -245,6 +245,18 @@ int pfn_valid(unsigned long pfn) + + if (!valid_section(__nr_to_section(pfn_to_section_nr(pfn)))) + return 0; ++ ++ /* ++ * ZONE_DEVICE memory does not have the memblock entries. ++ * memblock_is_map_memory() check for ZONE_DEVICE based ++ * addresses will always fail. Even the normal hotplugged ++ * memory will never have MEMBLOCK_NOMAP flag set in their ++ * memblock entries. Skip memblock search for all non early ++ * memory sections covering all of hotplug memory including ++ * both normal and ZONE_DEVICE based. 
++ */ ++ if (!early_section(__pfn_to_section(pfn))) ++ return pfn_section_valid(__pfn_to_section(pfn), pfn); + #endif + return memblock_is_map_memory(addr); + } +-- +2.30.1 + diff --git a/queue-5.4/arm64-mm-use-a-48-bit-id-map-when-possible-on-52-bit.patch b/queue-5.4/arm64-mm-use-a-48-bit-id-map-when-possible-on-52-bit.patch new file mode 100644 index 00000000000..9ffe3b4c7ba --- /dev/null +++ b/queue-5.4/arm64-mm-use-a-48-bit-id-map-when-possible-on-52-bit.patch @@ -0,0 +1,87 @@ +From 65fe43f84bceaf6e1d3b796231f81d2470a9e9e1 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 10 Mar 2021 18:15:11 +0100 +Subject: arm64: mm: use a 48-bit ID map when possible on 52-bit VA builds + +From: Ard Biesheuvel + +[ Upstream commit 7ba8f2b2d652cd8d8a2ab61f4be66973e70f9f88 ] + +52-bit VA kernels can run on hardware that is only 48-bit capable, but +configure the ID map as 52-bit by default. This was not a problem until +recently, because the special T0SZ value for a 52-bit VA space was never +programmed into the TCR register anyway, and because a 52-bit ID map +happens to use the same number of translation levels as a 48-bit one. + +This behavior was changed by commit 1401bef703a4 ("arm64: mm: Always update +TCR_EL1 from __cpu_set_tcr_t0sz()"), which causes the unsupported T0SZ +value for a 52-bit VA to be programmed into TCR_EL1. While some hardware +simply ignores this, Mark reports that Amberwing systems choke on this, +resulting in a broken boot. But even before that commit, the unsupported +idmap_t0sz value was exposed to KVM and used to program TCR_EL2 incorrectly +as well. + +Given that we already have to deal with address spaces being either 48-bit +or 52-bit in size, the cleanest approach seems to be to simply default to +a 48-bit VA ID map, and only switch to a 52-bit one if the placement of the +kernel in DRAM requires it. This is guaranteed not to happen unless the +system is actually 52-bit VA capable.
+ +Fixes: 90ec95cda91a ("arm64: mm: Introduce VA_BITS_MIN") +Reported-by: Mark Salter +Link: http://lore.kernel.org/r/20210310003216.410037-1-msalter@redhat.com +Signed-off-by: Ard Biesheuvel +Link: https://lore.kernel.org/r/20210310171515.416643-2-ardb@kernel.org +Signed-off-by: Will Deacon +Signed-off-by: Sasha Levin +--- + arch/arm64/include/asm/mmu_context.h | 5 +---- + arch/arm64/kernel/head.S | 2 +- + arch/arm64/mm/mmu.c | 2 +- + 3 files changed, 3 insertions(+), 6 deletions(-) + +diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h +index 3827ff4040a3..3a5d9f1c91b6 100644 +--- a/arch/arm64/include/asm/mmu_context.h ++++ b/arch/arm64/include/asm/mmu_context.h +@@ -63,10 +63,7 @@ extern u64 idmap_ptrs_per_pgd; + + static inline bool __cpu_uses_extended_idmap(void) + { +- if (IS_ENABLED(CONFIG_ARM64_VA_BITS_52)) +- return false; +- +- return unlikely(idmap_t0sz != TCR_T0SZ(VA_BITS)); ++ return unlikely(idmap_t0sz != TCR_T0SZ(vabits_actual)); + } + + /* +diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S +index 438de2301cfe..a2e0b3754943 100644 +--- a/arch/arm64/kernel/head.S ++++ b/arch/arm64/kernel/head.S +@@ -337,7 +337,7 @@ __create_page_tables: + */ + adrp x5, __idmap_text_end + clz x5, x5 +- cmp x5, TCR_T0SZ(VA_BITS) // default T0SZ small enough? ++ cmp x5, TCR_T0SZ(VA_BITS_MIN) // default T0SZ small enough? + b.ge 1f // .. 
then skip VA range extension + + adr_l x6, idmap_t0sz +diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c +index d10247fab0fd..99bc0289ab2b 100644 +--- a/arch/arm64/mm/mmu.c ++++ b/arch/arm64/mm/mmu.c +@@ -38,7 +38,7 @@ + #define NO_BLOCK_MAPPINGS BIT(0) + #define NO_CONT_MAPPINGS BIT(1) + +-u64 idmap_t0sz = TCR_T0SZ(VA_BITS); ++u64 idmap_t0sz = TCR_T0SZ(VA_BITS_MIN); + u64 idmap_ptrs_per_pgd = PTRS_PER_PGD; + + u64 __section(".mmuoff.data.write") vabits_actual; +-- +2.30.1 + diff --git a/queue-5.4/block-rsxx-fix-error-return-code-of-rsxx_pci_probe.patch b/queue-5.4/block-rsxx-fix-error-return-code-of-rsxx_pci_probe.patch new file mode 100644 index 00000000000..bf6f2cc1f24 --- /dev/null +++ b/queue-5.4/block-rsxx-fix-error-return-code-of-rsxx_pci_probe.patch @@ -0,0 +1,39 @@ +From ff4dc10118a8fe2f0cb34b5b1763088327316993 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 9 Mar 2021 19:30:17 -0800 +Subject: block: rsxx: fix error return code of rsxx_pci_probe() + +From: Jia-Ju Bai + +[ Upstream commit df66617bfe87487190a60783d26175b65d2502ce ] + +When create_singlethread_workqueue returns NULL to card->event_wq, no +error return code of rsxx_pci_probe() is assigned. + +To fix this bug, st is assigned with -ENOMEM in this case. 
+ +Fixes: 8722ff8cdbfa ("block: IBM RamSan 70/80 device driver") +Reported-by: TOTE Robot +Signed-off-by: Jia-Ju Bai +Link: https://lore.kernel.org/r/20210310033017.4023-1-baijiaju1990@gmail.com +Signed-off-by: Jens Axboe +Signed-off-by: Sasha Levin +--- + drivers/block/rsxx/core.c | 1 + + 1 file changed, 1 insertion(+) + +diff --git a/drivers/block/rsxx/core.c b/drivers/block/rsxx/core.c +index 804d28faa97b..a1824bb08044 100644 +--- a/drivers/block/rsxx/core.c ++++ b/drivers/block/rsxx/core.c +@@ -869,6 +869,7 @@ static int rsxx_pci_probe(struct pci_dev *dev, + card->event_wq = create_singlethread_workqueue(DRIVER_NAME"_event"); + if (!card->event_wq) { + dev_err(CARD_TO_DEV(card), "Failed card event setup.\n"); ++ st = -ENOMEM; + goto failed_event_handler; + } + +-- +2.30.1 + diff --git a/queue-5.4/configfs-fix-a-use-after-free-in-__configfs_open_fil.patch b/queue-5.4/configfs-fix-a-use-after-free-in-__configfs_open_fil.patch new file mode 100644 index 00000000000..2ca02c41c22 --- /dev/null +++ b/queue-5.4/configfs-fix-a-use-after-free-in-__configfs_open_fil.patch @@ -0,0 +1,132 @@ +From 95ffd4306b969f6a7fa126370950b9a88d65e21d Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Mon, 1 Mar 2021 14:10:53 +0800 +Subject: configfs: fix a use-after-free in __configfs_open_file + +From: Daiyue Zhang + +[ Upstream commit 14fbbc8297728e880070f7b077b3301a8c698ef9 ] + +Commit b0841eefd969 ("configfs: provide exclusion between IO and removals") +uses ->frag_dead to mark the fragment state, thus no bothering with extra +refcount on config_item when opening a file. The configfs_get_config_item +was removed in __configfs_open_file, but not with config_item_put. So the +refcount on config_item will lost its balance, causing use-after-free +issues in some occasions like this: + +Test: +1. 
Mount configfs on /config with read-only items: +drwxrwx--- 289 root root 0 2021-04-01 11:55 /config +drwxr-xr-x 2 root root 0 2021-04-01 11:54 /config/a +--w--w--w- 1 root root 4096 2021-04-01 11:53 /config/a/1.txt +...... + +2. Then run: +for file in /config +do +echo $file +grep -R 'key' $file +done + +3. __configfs_open_file will be called in parallel, the first one +got called will do: +if (file->f_mode & FMODE_READ) { + if (!(inode->i_mode & S_IRUGO)) + goto out_put_module; + config_item_put(buffer->item); + kref_put() + package_details_release() + kfree() + +the other one will run into use-after-free issues like this: +BUG: KASAN: use-after-free in __configfs_open_file+0x1bc/0x3b0 +Read of size 8 at addr fffffff155f02480 by task grep/13096 +CPU: 0 PID: 13096 Comm: grep VIP: 00 Tainted: G W 4.14.116-kasan #1 +TGID: 13096 Comm: grep +Call trace: +dump_stack+0x118/0x160 +kasan_report+0x22c/0x294 +__asan_load8+0x80/0x88 +__configfs_open_file+0x1bc/0x3b0 +configfs_open_file+0x28/0x34 +do_dentry_open+0x2cc/0x5c0 +vfs_open+0x80/0xe0 +path_openat+0xd8c/0x2988 +do_filp_open+0x1c4/0x2fc +do_sys_open+0x23c/0x404 +SyS_openat+0x38/0x48 + +Allocated by task 2138: +kasan_kmalloc+0xe0/0x1ac +kmem_cache_alloc_trace+0x334/0x394 +packages_make_item+0x4c/0x180 +configfs_mkdir+0x358/0x740 +vfs_mkdir2+0x1bc/0x2e8 +SyS_mkdirat+0x154/0x23c +el0_svc_naked+0x34/0x38 + +Freed by task 13096: +kasan_slab_free+0xb8/0x194 +kfree+0x13c/0x910 +package_details_release+0x524/0x56c +kref_put+0xc4/0x104 +config_item_put+0x24/0x34 +__configfs_open_file+0x35c/0x3b0 +configfs_open_file+0x28/0x34 +do_dentry_open+0x2cc/0x5c0 +vfs_open+0x80/0xe0 +path_openat+0xd8c/0x2988 +do_filp_open+0x1c4/0x2fc +do_sys_open+0x23c/0x404 +SyS_openat+0x38/0x48 +el0_svc_naked+0x34/0x38 + +To fix this issue, remove the config_item_put in +__configfs_open_file to balance the refcount of config_item. 
+ +Fixes: b0841eefd969 ("configfs: provide exclusion between IO and removals") +Signed-off-by: Daiyue Zhang +Signed-off-by: Yi Chen +Signed-off-by: Ge Qiu +Reviewed-by: Chao Yu +Acked-by: Al Viro +Signed-off-by: Christoph Hellwig +Signed-off-by: Sasha Levin +--- + fs/configfs/file.c | 6 ++---- + 1 file changed, 2 insertions(+), 4 deletions(-) + +diff --git a/fs/configfs/file.c b/fs/configfs/file.c +index fb65b706cc0d..84b4d58fc65f 100644 +--- a/fs/configfs/file.c ++++ b/fs/configfs/file.c +@@ -378,7 +378,7 @@ static int __configfs_open_file(struct inode *inode, struct file *file, int type + + attr = to_attr(dentry); + if (!attr) +- goto out_put_item; ++ goto out_free_buffer; + + if (type & CONFIGFS_ITEM_BIN_ATTR) { + buffer->bin_attr = to_bin_attr(dentry); +@@ -391,7 +391,7 @@ static int __configfs_open_file(struct inode *inode, struct file *file, int type + /* Grab the module reference for this attribute if we have one */ + error = -ENODEV; + if (!try_module_get(buffer->owner)) +- goto out_put_item; ++ goto out_free_buffer; + + error = -EACCES; + if (!buffer->item->ci_type) +@@ -435,8 +435,6 @@ static int __configfs_open_file(struct inode *inode, struct file *file, int type + + out_put_module: + module_put(buffer->owner); +-out_put_item: +- config_item_put(buffer->item); + out_free_buffer: + up_read(&frag->frag_sem); + kfree(buffer); +-- +2.30.1 + diff --git a/queue-5.4/hrtimer-update-softirq_expires_next-correctly-after-.patch b/queue-5.4/hrtimer-update-softirq_expires_next-correctly-after-.patch new file mode 100644 index 00000000000..89ff17e2ae2 --- /dev/null +++ b/queue-5.4/hrtimer-update-softirq_expires_next-correctly-after-.patch @@ -0,0 +1,155 @@ +From b984670daf7191167f26f59c448508bcca946bbc Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 23 Feb 2021 17:02:40 +0100 +Subject: hrtimer: Update softirq_expires_next correctly after + __hrtimer_get_next_event() + +From: Anna-Maria Behnsen + +[ Upstream commit 46eb1701c046cc18c032fa68f3c8ccbf24483ee4 ] + 
+hrtimer_force_reprogram() and hrtimer_interrupt() invoke +__hrtimer_get_next_event() to find the earliest expiry time of hrtimer +bases. __hrtimer_get_next_event() does not update +cpu_base::[softirq_]_expires_next to preserve reprogramming logic. That +needs to be done at the callsites. + +hrtimer_force_reprogram() updates cpu_base::softirq_expires_next only when +the first expiring timer is a softirq timer and the soft interrupt is not +activated. That's wrong because cpu_base::softirq_expires_next is left +stale when the first expiring timer of all bases is a timer which expires +in hard interrupt context. hrtimer_interrupt() never updates +cpu_base::softirq_expires_next which is wrong too. + +That becomes a problem when clock_settime() sets CLOCK_REALTIME forward and +the first soft expiring timer is in the CLOCK_REALTIME_SOFT base. Setting +CLOCK_REALTIME forward moves the clock MONOTONIC based expiry time of that +timer before the stale cpu_base::softirq_expires_next. + +cpu_base::softirq_expires_next is cached to make the check for raising the +soft interrupt fast. In the above case the soft interrupt won't be raised +until clock monotonic reaches the stale cpu_base::softirq_expires_next +value. That's incorrect, but what's worse is that if the softirq timer +becomes the first expiring timer of all clock bases after the hard expiry +timer has been handled the reprogramming of the clockevent from +hrtimer_interrupt() will result in an interrupt storm. That happens because +the reprogramming does not use cpu_base::softirq_expires_next, it uses +__hrtimer_get_next_event() which returns the actual expiry time. Once clock +MONOTONIC reaches cpu_base::softirq_expires_next the soft interrupt is +raised and the storm subsides.
+ +Change the logic in hrtimer_force_reprogram() to evaluate the soft and hard +bases separately, update softirq_expires_next and handle the case when a +soft expiring timer is the first of all bases by comparing the expiry times +and updating the required cpu base fields. Split this functionality into a +separate function to be able to use it in hrtimer_interrupt() as well +without copy paste. + +Fixes: 5da70160462e ("hrtimer: Implement support for softirq based hrtimers") +Reported-by: Mikael Beckius +Suggested-by: Thomas Gleixner +Tested-by: Mikael Beckius +Signed-off-by: Anna-Maria Behnsen +Signed-off-by: Thomas Gleixner +Signed-off-by: Ingo Molnar +Link: https://lore.kernel.org/r/20210223160240.27518-1-anna-maria@linutronix.de +Signed-off-by: Sasha Levin +--- + kernel/time/hrtimer.c | 60 ++++++++++++++++++++++++++++--------------- + 1 file changed, 39 insertions(+), 21 deletions(-) + +diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c +index 7f31932216a1..299a4c5b6cf8 100644 +--- a/kernel/time/hrtimer.c ++++ b/kernel/time/hrtimer.c +@@ -547,8 +547,11 @@ static ktime_t __hrtimer_next_event_base(struct hrtimer_cpu_base *cpu_base, + } + + /* +- * Recomputes cpu_base::*next_timer and returns the earliest expires_next but +- * does not set cpu_base::*expires_next, that is done by hrtimer_reprogram. ++ * Recomputes cpu_base::*next_timer and returns the earliest expires_next ++ * but does not set cpu_base::*expires_next, that is done by ++ * hrtimer[_force]_reprogram and hrtimer_interrupt only. When updating ++ * cpu_base::*expires_next right away, reprogramming logic would no longer ++ * work.
+ * + * When a softirq is pending, we can ignore the HRTIMER_ACTIVE_SOFT bases, + * those timers will get run whenever the softirq gets handled, at the end of +@@ -589,6 +592,37 @@ __hrtimer_get_next_event(struct hrtimer_cpu_base *cpu_base, unsigned int active_ + return expires_next; + } + ++static ktime_t hrtimer_update_next_event(struct hrtimer_cpu_base *cpu_base) ++{ ++ ktime_t expires_next, soft = KTIME_MAX; ++ ++ /* ++ * If the soft interrupt has already been activated, ignore the ++ * soft bases. They will be handled in the already raised soft ++ * interrupt. ++ */ ++ if (!cpu_base->softirq_activated) { ++ soft = __hrtimer_get_next_event(cpu_base, HRTIMER_ACTIVE_SOFT); ++ /* ++ * Update the soft expiry time. clock_settime() might have ++ * affected it. ++ */ ++ cpu_base->softirq_expires_next = soft; ++ } ++ ++ expires_next = __hrtimer_get_next_event(cpu_base, HRTIMER_ACTIVE_HARD); ++ /* ++ * If a softirq timer is expiring first, update cpu_base->next_timer ++ * and program the hardware with the soft expiry time. ++ */ ++ if (expires_next > soft) { ++ cpu_base->next_timer = cpu_base->softirq_next_timer; ++ expires_next = soft; ++ } ++ ++ return expires_next; ++} ++ + static inline ktime_t hrtimer_update_base(struct hrtimer_cpu_base *base) + { + ktime_t *offs_real = &base->clock_base[HRTIMER_BASE_REALTIME].offset; +@@ -629,23 +663,7 @@ hrtimer_force_reprogram(struct hrtimer_cpu_base *cpu_base, int skip_equal) + { + ktime_t expires_next; + +- /* +- * Find the current next expiration time. +- */ +- expires_next = __hrtimer_get_next_event(cpu_base, HRTIMER_ACTIVE_ALL); +- +- if (cpu_base->next_timer && cpu_base->next_timer->is_soft) { +- /* +- * When the softirq is activated, hrtimer has to be +- * programmed with the first hard hrtimer because soft +- * timer interrupt could occur too late. 
+- */ +- if (cpu_base->softirq_activated) +- expires_next = __hrtimer_get_next_event(cpu_base, +- HRTIMER_ACTIVE_HARD); +- else +- cpu_base->softirq_expires_next = expires_next; +- } ++ expires_next = hrtimer_update_next_event(cpu_base); + + if (skip_equal && expires_next == cpu_base->expires_next) + return; +@@ -1640,8 +1658,8 @@ void hrtimer_interrupt(struct clock_event_device *dev) + + __hrtimer_run_queues(cpu_base, now, flags, HRTIMER_ACTIVE_HARD); + +- /* Reevaluate the clock bases for the next expiry */ +- expires_next = __hrtimer_get_next_event(cpu_base, HRTIMER_ACTIVE_ALL); ++ /* Reevaluate the clock bases for the [soft] next expiry */ ++ expires_next = hrtimer_update_next_event(cpu_base); + /* + * Store the new expiry value so the migration code can verify + * against it. +-- +2.30.1 + diff --git a/queue-5.4/include-linux-sched-mm.h-use-rcu_dereference-in-in_v.patch b/queue-5.4/include-linux-sched-mm.h-use-rcu_dereference-in-in_v.patch new file mode 100644 index 00000000000..1b978237dbc --- /dev/null +++ b/queue-5.4/include-linux-sched-mm.h-use-rcu_dereference-in-in_v.patch @@ -0,0 +1,43 @@ +From 94babab6a31ffdcf9a9650d74646159ba9b1fdec Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 12 Mar 2021 21:08:03 -0800 +Subject: include/linux/sched/mm.h: use rcu_dereference in in_vfork() + +From: Matthew Wilcox (Oracle) + +[ Upstream commit 149fc787353f65b7e72e05e7b75d34863266c3e2 ] + +Fix a sparse warning by using rcu_dereference(). Technically this is a +bug and a sufficiently aggressive compiler could reload the `real_parent' +pointer outside the protection of the rcu lock (and access freed memory), +but I think it's pretty unlikely to happen. 
+ +Link: https://lkml.kernel.org/r/20210221194207.1351703-1-willy@infradead.org +Fixes: b18dc5f291c0 ("mm, oom: skip vforked tasks from being selected") +Signed-off-by: Matthew Wilcox (Oracle) +Reviewed-by: Miaohe Lin +Acked-by: Michal Hocko +Signed-off-by: Andrew Morton +Signed-off-by: Linus Torvalds +Signed-off-by: Sasha Levin +--- + include/linux/sched/mm.h | 3 ++- + 1 file changed, 2 insertions(+), 1 deletion(-) + +diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h +index a132d875d351..3a1d899019af 100644 +--- a/include/linux/sched/mm.h ++++ b/include/linux/sched/mm.h +@@ -167,7 +167,8 @@ static inline bool in_vfork(struct task_struct *tsk) + * another oom-unkillable task does this it should blame itself. + */ + rcu_read_lock(); +- ret = tsk->vfork_done && tsk->real_parent->mm == tsk->mm; ++ ret = tsk->vfork_done && ++ rcu_dereference(tsk->real_parent)->mm == tsk->mm; + rcu_read_unlock(); + + return ret; +-- +2.30.1 + diff --git a/queue-5.4/net-bonding-fix-error-return-code-of-bond_neigh_init.patch b/queue-5.4/net-bonding-fix-error-return-code-of-bond_neigh_init.patch new file mode 100644 index 00000000000..60465507b45 --- /dev/null +++ b/queue-5.4/net-bonding-fix-error-return-code-of-bond_neigh_init.patch @@ -0,0 +1,47 @@ +From 496c2c03992a018b54092bee78b0ea9f5ba27fa9 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Sun, 7 Mar 2021 19:11:02 -0800 +Subject: net: bonding: fix error return code of bond_neigh_init() + +From: Jia-Ju Bai + +[ Upstream commit 2055a99da8a253a357bdfd359b3338ef3375a26c ] + +When slave is NULL or slave_ops->ndo_neigh_setup is NULL, no error +return code of bond_neigh_init() is assigned. +To fix this bug, ret is assigned with -EINVAL in these cases. + +Fixes: 9e99bfefdbce ("bonding: fix bond_neigh_init()") +Reported-by: TOTE Robot +Signed-off-by: Jia-Ju Bai +Signed-off-by: David S. 
Miller +Signed-off-by: Sasha Levin +--- + drivers/net/bonding/bond_main.c | 8 ++++++-- + 1 file changed, 6 insertions(+), 2 deletions(-) + +diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c +index 2bc4cb9e3095..da6600255577 100644 +--- a/drivers/net/bonding/bond_main.c ++++ b/drivers/net/bonding/bond_main.c +@@ -3703,11 +3703,15 @@ static int bond_neigh_init(struct neighbour *n) + + rcu_read_lock(); + slave = bond_first_slave_rcu(bond); +- if (!slave) ++ if (!slave) { ++ ret = -EINVAL; + goto out; ++ } + slave_ops = slave->dev->netdev_ops; +- if (!slave_ops->ndo_neigh_setup) ++ if (!slave_ops->ndo_neigh_setup) { ++ ret = -EINVAL; + goto out; ++ } + + /* TODO: find another way [1] to implement this. + * Passing a zeroed structure is fragile, +-- +2.30.1 + diff --git a/queue-5.4/nfs-don-t-gratuitously-clear-the-inode-cache-when-lo.patch b/queue-5.4/nfs-don-t-gratuitously-clear-the-inode-cache-when-lo.patch new file mode 100644 index 00000000000..5c985fff5ee --- /dev/null +++ b/queue-5.4/nfs-don-t-gratuitously-clear-the-inode-cache-when-lo.patch @@ -0,0 +1,54 @@ +From cf8f02f2bf35b4d0dbec8341c4027741d001cd4e Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Mon, 8 Mar 2021 14:42:52 -0500 +Subject: NFS: Don't gratuitously clear the inode cache when lookup failed + +From: Trond Myklebust + +[ Upstream commit 47397915ede0192235474b145ebcd81b37b03624 ] + +The fact that the lookup revalidation failed, does not mean that the +inode contents have changed. 
+ +Fixes: 5ceb9d7fdaaf ("NFS: Refactor nfs_lookup_revalidate()") +Signed-off-by: Trond Myklebust +Signed-off-by: Anna Schumaker +Signed-off-by: Sasha Levin +--- + fs/nfs/dir.c | 20 ++++++++------------ + 1 file changed, 8 insertions(+), 12 deletions(-) + +diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c +index 59092d2780a3..e7c0790308fe 100644 +--- a/fs/nfs/dir.c ++++ b/fs/nfs/dir.c +@@ -1116,18 +1116,14 @@ nfs_lookup_revalidate_done(struct inode *dir, struct dentry *dentry, + __func__, dentry); + return 1; + case 0: +- if (inode && S_ISDIR(inode->i_mode)) { +- /* Purge readdir caches. */ +- nfs_zap_caches(inode); +- /* +- * We can't d_drop the root of a disconnected tree: +- * its d_hash is on the s_anon list and d_drop() would hide +- * it from shrink_dcache_for_unmount(), leading to busy +- * inodes on unmount and further oopses. +- */ +- if (IS_ROOT(dentry)) +- return 1; +- } ++ /* ++ * We can't d_drop the root of a disconnected tree: ++ * its d_hash is on the s_anon list and d_drop() would hide ++ * it from shrink_dcache_for_unmount(), leading to busy ++ * inodes on unmount and further oopses. 
++ */ ++ if (inode && IS_ROOT(dentry)) ++ return 1; + dfprintk(LOOKUPCACHE, "NFS: %s(%pd2) is invalid\n", + __func__, dentry); + return 0; +-- +2.30.1 + diff --git a/queue-5.4/nfs-don-t-revalidate-the-directory-permissions-on-a-.patch b/queue-5.4/nfs-don-t-revalidate-the-directory-permissions-on-a-.patch new file mode 100644 index 00000000000..edda4b7f6da --- /dev/null +++ b/queue-5.4/nfs-don-t-revalidate-the-directory-permissions-on-a-.patch @@ -0,0 +1,90 @@ +From 65086feb2b6efea33ed90ff688b1082e4255df00 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Mon, 8 Mar 2021 14:42:51 -0500 +Subject: NFS: Don't revalidate the directory permissions on a lookup failure + +From: Trond Myklebust + +[ Upstream commit 82e7ca1334ab16e2e04fafded1cab9dfcdc11b40 ] + +There should be no reason to expect the directory permissions to change +just because the directory contents changed or a negative lookup timed +out. So let's avoid doing a full call to nfs_mark_for_revalidate() in +that case. +Furthermore, if this is a negative dentry, and we haven't actually done +a new lookup, then we have no reason yet to believe the directory has +changed at all. So let's remove the gratuitous directory inode +invalidation altogether when called from +nfs_lookup_revalidate_negative(). 
+ +Reported-by: Geert Jansen +Fixes: 5ceb9d7fdaaf ("NFS: Refactor nfs_lookup_revalidate()") +Signed-off-by: Trond Myklebust +Signed-off-by: Anna Schumaker +Signed-off-by: Sasha Levin +--- + fs/nfs/dir.c | 20 +++++++++++++++++--- + 1 file changed, 17 insertions(+), 3 deletions(-) + +diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c +index 188b17a3b19e..59092d2780a3 100644 +--- a/fs/nfs/dir.c ++++ b/fs/nfs/dir.c +@@ -1073,6 +1073,15 @@ int nfs_lookup_verify_inode(struct inode *inode, unsigned int flags) + goto out; + } + ++static void nfs_mark_dir_for_revalidate(struct inode *inode) ++{ ++ struct nfs_inode *nfsi = NFS_I(inode); ++ ++ spin_lock(&inode->i_lock); ++ nfsi->cache_validity |= NFS_INO_REVAL_PAGECACHE; ++ spin_unlock(&inode->i_lock); ++} ++ + /* + * We judge how long we want to trust negative + * dentries by looking at the parent inode mtime. +@@ -1107,7 +1116,6 @@ nfs_lookup_revalidate_done(struct inode *dir, struct dentry *dentry, + __func__, dentry); + return 1; + case 0: +- nfs_mark_for_revalidate(dir); + if (inode && S_ISDIR(inode->i_mode)) { + /* Purge readdir caches. */ + nfs_zap_caches(inode); +@@ -1188,6 +1196,13 @@ nfs_lookup_revalidate_dentry(struct inode *dir, struct dentry *dentry, + nfs_free_fattr(fattr); + nfs_free_fhandle(fhandle); + nfs4_label_free(label); ++ ++ /* ++ * If the lookup failed despite the dentry change attribute being ++ * a match, then we should revalidate the directory cache. 
++ */ ++ if (!ret && nfs_verify_change_attribute(dir, dentry->d_time)) ++ nfs_mark_dir_for_revalidate(dir); + return nfs_lookup_revalidate_done(dir, dentry, inode, ret); + } + +@@ -1230,7 +1245,7 @@ nfs_do_lookup_revalidate(struct inode *dir, struct dentry *dentry, + error = nfs_lookup_verify_inode(inode, flags); + if (error) { + if (error == -ESTALE) +- nfs_zap_caches(dir); ++ nfs_mark_dir_for_revalidate(dir); + goto out_bad; + } + nfs_advise_use_readdirplus(dir); +@@ -1725,7 +1740,6 @@ nfs_add_or_obtain(struct dentry *dentry, struct nfs_fh *fhandle, + dput(parent); + return d; + out_error: +- nfs_mark_for_revalidate(dir); + d = ERR_PTR(error); + goto out; + } +-- +2.30.1 + diff --git a/queue-5.4/nfsv4.2-fix-return-value-of-_nfs4_get_security_label.patch b/queue-5.4/nfsv4.2-fix-return-value-of-_nfs4_get_security_label.patch new file mode 100644 index 00000000000..7273aa60766 --- /dev/null +++ b/queue-5.4/nfsv4.2-fix-return-value-of-_nfs4_get_security_label.patch @@ -0,0 +1,43 @@ +From b045348c73db45087912352c883b27eec127453f Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 15 Jan 2021 18:43:56 +0100 +Subject: NFSv4.2: fix return value of _nfs4_get_security_label() + +From: Ondrej Mosnacek + +[ Upstream commit 53cb245454df5b13d7063162afd7a785aed6ebf2 ] + +An xattr 'get' handler is expected to return the length of the value on +success, yet _nfs4_get_security_label() (and consequently also +nfs4_xattr_get_nfs4_label(), which is used as an xattr handler) returns +just 0 on success. + +Fix this by returning label.len instead, which contains the length of +the result. 
+ +Fixes: aa9c2669626c ("NFS: Client implementation of Labeled-NFS") +Signed-off-by: Ondrej Mosnacek +Reviewed-by: James Morris +Reviewed-by: Paul Moore +Signed-off-by: Anna Schumaker +Signed-off-by: Sasha Levin +--- + fs/nfs/nfs4proc.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c +index 30e44b33040a..b2119159dead 100644 +--- a/fs/nfs/nfs4proc.c ++++ b/fs/nfs/nfs4proc.c +@@ -5830,7 +5830,7 @@ static int _nfs4_get_security_label(struct inode *inode, void *buf, + return ret; + if (!(fattr.valid & NFS_ATTR_FATTR_V4_SECURITY_LABEL)) + return -ENOENT; +- return 0; ++ return label.len; + } + + static int nfs4_get_security_label(struct inode *inode, void *buf, +-- +2.30.1 + diff --git a/queue-5.4/prctl-fix-pr_set_mm_auxv-kernel-stack-leak.patch b/queue-5.4/prctl-fix-pr_set_mm_auxv-kernel-stack-leak.patch new file mode 100644 index 00000000000..d59bef17c54 --- /dev/null +++ b/queue-5.4/prctl-fix-pr_set_mm_auxv-kernel-stack-leak.patch @@ -0,0 +1,45 @@ +From 00ed686852c3c89957a82274bd1cae70422dc256 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Sun, 14 Mar 2021 23:51:14 +0300 +Subject: prctl: fix PR_SET_MM_AUXV kernel stack leak + +From: Alexey Dobriyan + +[ Upstream commit c995f12ad8842dbf5cfed113fb52cdd083f5afd1 ] + +Doing a + + prctl(PR_SET_MM, PR_SET_MM_AUXV, addr, 1); + +will copy 1 byte from userspace to (quite big) on-stack array +and then stash everything to mm->saved_auxv. +AT_NULL terminator will be inserted at the very end. + +/proc/*/auxv handler will find that AT_NULL terminator +and copy original stack contents to userspace. + +This devious scheme requires CAP_SYS_RESOURCE. 
+ +Signed-off-by: Alexey Dobriyan +Signed-off-by: Linus Torvalds +Signed-off-by: Sasha Levin +--- + kernel/sys.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/kernel/sys.c b/kernel/sys.c +index 3459a5ce0da0..867ec3e003fd 100644 +--- a/kernel/sys.c ++++ b/kernel/sys.c +@@ -2062,7 +2062,7 @@ static int prctl_set_auxv(struct mm_struct *mm, unsigned long addr, + * up to the caller to provide sane values here, otherwise userspace + * tools which use this vector might be unhappy. + */ +- unsigned long user_auxv[AT_VECTOR_SIZE]; ++ unsigned long user_auxv[AT_VECTOR_SIZE] = {}; + + if (len > sizeof(user_auxv)) + return -EINVAL; +-- +2.30.1 + diff --git a/queue-5.4/series b/queue-5.4/series index 50d2a39a965..2d736660dc4 100644 --- a/queue-5.4/series +++ b/queue-5.4/series @@ -156,3 +156,17 @@ staging-comedi-dmm32at-fix-endian-problem-for-ai-command-data.patch staging-comedi-me4000-fix-endian-problem-for-ai-command-data.patch staging-comedi-pcl711-fix-endian-problem-for-ai-command-data.patch staging-comedi-pcl818-fix-endian-problem-for-ai-command-data.patch +sh_eth-fix-trscer-mask-for-r7s72100.patch +arm64-mm-fix-pfn_valid-for-zone_device-based-memory.patch +net-bonding-fix-error-return-code-of-bond_neigh_init.patch +sunrpc-set-memalloc_nofs_save-for-sync-tasks.patch +nfs-don-t-revalidate-the-directory-permissions-on-a-.patch +nfs-don-t-gratuitously-clear-the-inode-cache-when-lo.patch +nfsv4.2-fix-return-value-of-_nfs4_get_security_label.patch +block-rsxx-fix-error-return-code-of-rsxx_pci_probe.patch +configfs-fix-a-use-after-free-in-__configfs_open_fil.patch +arm64-mm-use-a-48-bit-id-map-when-possible-on-52-bit.patch +hrtimer-update-softirq_expires_next-correctly-after-.patch +stop_machine-mark-helpers-__always_inline.patch +include-linux-sched-mm.h-use-rcu_dereference-in-in_v.patch +prctl-fix-pr_set_mm_auxv-kernel-stack-leak.patch diff --git a/queue-5.4/sh_eth-fix-trscer-mask-for-r7s72100.patch b/queue-5.4/sh_eth-fix-trscer-mask-for-r7s72100.patch 
new file mode 100644
index 00000000000..af803c21bf1
--- /dev/null
+++ b/queue-5.4/sh_eth-fix-trscer-mask-for-r7s72100.patch
@@ -0,0 +1,38 @@
+From 8c24f0f304d3b54a099e0896686dbf3983fd8ba9 Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Sun, 28 Feb 2021 23:26:34 +0300
+Subject: sh_eth: fix TRSCER mask for R7S72100
+
+From: Sergey Shtylyov
+
+[ Upstream commit 75be7fb7f978202c4c3a1a713af4485afb2ff5f6 ]
+
+According to the RZ/A1H Group, RZ/A1M Group User's Manual: Hardware,
+Rev. 4.00, the TRSCER register has bit 9 reserved, hence we can't use
+the driver's default TRSCER mask. Add the explicit initializer for
+sh_eth_cpu_data::trscer_err_mask for R7S72100.
+
+Fixes: db893473d313 ("sh_eth: Add support for r7s72100")
+Signed-off-by: Sergey Shtylyov
+Signed-off-by: David S. Miller
+Signed-off-by: Sasha Levin
+---
+ drivers/net/ethernet/renesas/sh_eth.c | 2 ++
+ 1 file changed, 2 insertions(+)
+
+diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c
+index 91d234b18195..a042f4607b0d 100644
+--- a/drivers/net/ethernet/renesas/sh_eth.c
++++ b/drivers/net/ethernet/renesas/sh_eth.c
+@@ -610,6 +610,8 @@ static struct sh_eth_cpu_data r7s72100_data = {
+ 			  EESR_TDE,
+ 	.fdr_value = 0x0000070f,
+ 
++	.trscer_err_mask = DESC_I_RINT8 | DESC_I_RINT5,
++
+ 	.no_psr = 1,
+ 	.apr = 1,
+ 	.mpr = 1,
+-- 
+2.30.1
+
diff --git a/queue-5.4/stop_machine-mark-helpers-__always_inline.patch b/queue-5.4/stop_machine-mark-helpers-__always_inline.patch
new file mode 100644
index 00000000000..52f7a981239
--- /dev/null
+++ b/queue-5.4/stop_machine-mark-helpers-__always_inline.patch
@@ -0,0 +1,83 @@
+From e406443cca920748a474bfb85d0611e270e17e71 Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Fri, 12 Mar 2021 21:07:04 -0800
+Subject: stop_machine: mark helpers __always_inline
+
+From: Arnd Bergmann
+
+[ Upstream commit cbf78d85079cee662c45749ef4f744d41be85d48 ]
+
+With clang-13, some functions only get partially inlined, with a
+specialized version referring to a global variable. This triggers a
+harmless build-time check for the intel-rng driver:
+
+WARNING: modpost: drivers/char/hw_random/intel-rng.o(.text+0xe): Section mismatch in reference from the function stop_machine() to the function .init.text:intel_rng_hw_init()
+The function stop_machine() references
+the function __init intel_rng_hw_init().
+This is often because stop_machine lacks a __init
+annotation or the annotation of intel_rng_hw_init is wrong.
+
+In this instance, an easy workaround is to force the stop_machine()
+function to be inline, along with related interfaces that did not show the
+same behavior at the moment, but theoretically could.
+
+The combination of the two patches listed below triggers the behavior in
+clang-13, but individually these commits are correct.
+
+Link: https://lkml.kernel.org/r/20210225130153.1956990-1-arnd@kernel.org
+Fixes: fe5595c07400 ("stop_machine: Provide stop_machine_cpuslocked()")
+Fixes: ee527cd3a20c ("Use stop_machine_run in the Intel RNG driver")
+Signed-off-by: Arnd Bergmann
+Cc: Nathan Chancellor
+Cc: Nick Desaulniers
+Cc: Thomas Gleixner
+Cc: Sebastian Andrzej Siewior
+Cc: "Paul E. McKenney"
+Cc: Ingo Molnar
+Cc: Prarit Bhargava
+Cc: Daniel Bristot de Oliveira
+Cc: Peter Zijlstra
+Cc: Valentin Schneider
+Signed-off-by: Andrew Morton
+Signed-off-by: Linus Torvalds
+Signed-off-by: Sasha Levin
+---
+ include/linux/stop_machine.h | 11 ++++++-----
+ 1 file changed, 6 insertions(+), 5 deletions(-)
+
+diff --git a/include/linux/stop_machine.h b/include/linux/stop_machine.h
+index f9a0c6189852..69998fc5ffe9 100644
+--- a/include/linux/stop_machine.h
++++ b/include/linux/stop_machine.h
+@@ -139,7 +139,7 @@ int stop_machine_from_inactive_cpu(cpu_stop_fn_t fn, void *data,
+ 				   const struct cpumask *cpus);
+ #else /* CONFIG_SMP || CONFIG_HOTPLUG_CPU */
+ 
+-static inline int stop_machine_cpuslocked(cpu_stop_fn_t fn, void *data,
++static __always_inline int stop_machine_cpuslocked(cpu_stop_fn_t fn, void *data,
+ 					  const struct cpumask *cpus)
+ {
+ 	unsigned long flags;
+@@ -150,14 +150,15 @@ static inline int stop_machine_cpuslocked(cpu_stop_fn_t fn, void *data,
+ 	return ret;
+ }
+ 
+-static inline int stop_machine(cpu_stop_fn_t fn, void *data,
+-			       const struct cpumask *cpus)
++static __always_inline int
++stop_machine(cpu_stop_fn_t fn, void *data, const struct cpumask *cpus)
+ {
+ 	return stop_machine_cpuslocked(fn, data, cpus);
+ }
+ 
+-static inline int stop_machine_from_inactive_cpu(cpu_stop_fn_t fn, void *data,
+-						 const struct cpumask *cpus)
++static __always_inline int
++stop_machine_from_inactive_cpu(cpu_stop_fn_t fn, void *data,
++			       const struct cpumask *cpus)
+ {
+ 	return stop_machine(fn, data, cpus);
+ }
+-- 
+2.30.1
+
diff --git a/queue-5.4/sunrpc-set-memalloc_nofs_save-for-sync-tasks.patch b/queue-5.4/sunrpc-set-memalloc_nofs_save-for-sync-tasks.patch
new file mode 100644
index 00000000000..9e9117f11ae
--- /dev/null
+++ b/queue-5.4/sunrpc-set-memalloc_nofs_save-for-sync-tasks.patch
@@ -0,0 +1,41 @@
+From 96898493deb839afe5fb0f7482ef374d12896e93 Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Wed, 3 Mar 2021 08:47:16 -0500
+Subject: SUNRPC: Set memalloc_nofs_save() for sync tasks
+
+From: Benjamin Coddington
+
+[ Upstream commit f0940f4b3284a00f38a5d42e6067c2aaa20e1f2e ]
+
+We could recurse into NFS doing memory reclaim while sending a sync task,
+which might result in a deadlock. Set memalloc_nofs_save for sync task
+execution.
+
+Fixes: a1231fda7e94 ("SUNRPC: Set memalloc_nofs_save() on all rpciod/xprtiod jobs")
+Signed-off-by: Benjamin Coddington
+Signed-off-by: Anna Schumaker
+Signed-off-by: Sasha Levin
+---
+ net/sunrpc/sched.c | 5 ++++-
+ 1 file changed, 4 insertions(+), 1 deletion(-)
+
+diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
+index 7afbf15bcbd9..4beb6d2957c3 100644
+--- a/net/sunrpc/sched.c
++++ b/net/sunrpc/sched.c
+@@ -990,8 +990,11 @@ void rpc_execute(struct rpc_task *task)
+ 
+ 	rpc_set_active(task);
+ 	rpc_make_runnable(rpciod_workqueue, task);
+-	if (!is_async)
++	if (!is_async) {
++		unsigned int pflags = memalloc_nofs_save();
+ 		__rpc_execute(task);
++		memalloc_nofs_restore(pflags);
++	}
+ }
+ 
+ static void rpc_async_schedule(struct work_struct *work)
+-- 
+2.30.1
+