From: Sasha Levin Date: Fri, 22 May 2020 00:42:55 +0000 (-0400) Subject: Fixes for 4.9 X-Git-Tag: v4.4.225~62 X-Git-Url: http://git.ipfire.org/gitweb.cgi?a=commitdiff_plain;h=14689dcbea535942393d99c6bb65b32e6582ff9f;p=thirdparty%2Fkernel%2Fstable-queue.git Fixes for 4.9 Signed-off-by: Sasha Levin --- diff --git a/queue-4.9/arm64-fix-the-flush_icache_range-arguments-in-machin.patch b/queue-4.9/arm64-fix-the-flush_icache_range-arguments-in-machin.patch new file mode 100644 index 00000000000..0703851b4e6 --- /dev/null +++ b/queue-4.9/arm64-fix-the-flush_icache_range-arguments-in-machin.patch @@ -0,0 +1,37 @@ +From cbdc39e8626fc33bbc879a6238474b430718ed6f Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 21 May 2020 15:44:34 +0100 +Subject: arm64: fix the flush_icache_range arguments in machine_kexec + +From: Christoph Hellwig + +Commit d51c214541c5154dda3037289ee895ea3ded5ebd upstream. + +The second argument is the end "pointer", not the length. + +Fixes: d28f6df1305a ("arm64/kexec: Add core kexec support") +Cc: # 4.8.x- +Signed-off-by: Christoph Hellwig +Signed-off-by: Catalin Marinas +Signed-off-by: Sasha Levin +--- + arch/arm64/kernel/machine_kexec.c | 3 ++- + 1 file changed, 2 insertions(+), 1 deletion(-) + +diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c +index bc96c8a7fc79..3e4b778f16a5 100644 +--- a/arch/arm64/kernel/machine_kexec.c ++++ b/arch/arm64/kernel/machine_kexec.c +@@ -177,7 +177,8 @@ void machine_kexec(struct kimage *kimage) + /* Flush the reboot_code_buffer in preparation for its execution. */ + __flush_dcache_area(reboot_code_buffer, arm64_relocate_new_kernel_size); + flush_icache_range((uintptr_t)reboot_code_buffer, +- arm64_relocate_new_kernel_size); ++ (uintptr_t)reboot_code_buffer + ++ arm64_relocate_new_kernel_size); + + /* Flush the kimage list and its buffers. */ + kexec_list_flush(kimage); +-- +2.25.1 + diff --git a/queue-4.9/i2c-dev-fix-the-race-between-the-release-of-i2c_dev-.patch b/queue-4.9/i2c-dev-fix-the-race-between-the-release-of-i2c_dev-.patch new file mode 100644 index 00000000000..21c587ac73a --- /dev/null +++ b/queue-4.9/i2c-dev-fix-the-race-between-the-release-of-i2c_dev-.patch @@ -0,0 +1,187 @@ +From bb9dfcadb5f51ea3c7913f9229dbd82fca0a63e4 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 11 Oct 2019 23:00:14 +0800 +Subject: i2c: dev: Fix the race between the release of i2c_dev and cdev + +From: Kevin Hao + +[ Upstream commit 1413ef638abae4ab5621901cf4d8ef08a4a48ba6 ] + +The struct cdev is embedded in the struct i2c_dev. In the current code, +we would free the i2c_dev struct directly in put_i2c_dev(), but the +cdev is manged by a kobject, and the release of it is not predictable. +So it is very possible that the i2c_dev is freed before the cdev is +entirely released. We can easily get the following call trace with +CONFIG_DEBUG_KOBJECT_RELEASE and CONFIG_DEBUG_OBJECTS_TIMERS enabled. 
+ ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x38 + WARNING: CPU: 19 PID: 1 at lib/debugobjects.c:325 debug_print_object+0xb0/0xf0 + Modules linked in: + CPU: 19 PID: 1 Comm: swapper/0 Tainted: G W 5.2.20-yocto-standard+ #120 + Hardware name: Marvell OcteonTX CN96XX board (DT) + pstate: 80c00089 (Nzcv daIf +PAN +UAO) + pc : debug_print_object+0xb0/0xf0 + lr : debug_print_object+0xb0/0xf0 + sp : ffff00001292f7d0 + x29: ffff00001292f7d0 x28: ffff800b82151788 + x27: 0000000000000001 x26: ffff800b892c0000 + x25: ffff0000124a2558 x24: 0000000000000000 + x23: ffff00001107a1d8 x22: ffff0000116b5088 + x21: ffff800bdc6afca8 x20: ffff000012471ae8 + x19: ffff00001168f2c8 x18: 0000000000000010 + x17: 00000000fd6f304b x16: 00000000ee79de43 + x15: ffff800bc0e80568 x14: 79616c6564203a74 + x13: 6e6968207473696c x12: 5f72656d6974203a + x11: ffff0000113f0018 x10: 0000000000000000 + x9 : 000000000000001f x8 : 0000000000000000 + x7 : ffff0000101294cc x6 : 0000000000000000 + x5 : 0000000000000000 x4 : 0000000000000001 + x3 : 00000000ffffffff x2 : 0000000000000000 + x1 : 387fc15c8ec0f200 x0 : 0000000000000000 + Call trace: + debug_print_object+0xb0/0xf0 + __debug_check_no_obj_freed+0x19c/0x228 + debug_check_no_obj_freed+0x1c/0x28 + kfree+0x250/0x440 + put_i2c_dev+0x68/0x78 + i2cdev_detach_adapter+0x60/0xc8 + i2cdev_notifier_call+0x3c/0x70 + notifier_call_chain+0x8c/0xe8 + blocking_notifier_call_chain+0x64/0x88 + device_del+0x74/0x380 + device_unregister+0x54/0x78 + i2c_del_adapter+0x278/0x2d0 + unittest_i2c_bus_remove+0x3c/0x80 + platform_drv_remove+0x30/0x50 + device_release_driver_internal+0xf4/0x1c0 + driver_detach+0x58/0xa0 + bus_remove_driver+0x84/0xd8 + driver_unregister+0x34/0x60 + platform_driver_unregister+0x20/0x30 + of_unittest_overlay+0x8d4/0xbe0 + of_unittest+0xae8/0xb3c + do_one_initcall+0xac/0x450 + do_initcall_level+0x208/0x224 + kernel_init_freeable+0x2d8/0x36c + kernel_init+0x18/0x108 + ret_from_fork+0x10/0x1c + irq event stamp: 3934661 + hardirqs last enabled at (3934661): [] debug_exception_exit+0x4c/0x58 + hardirqs last disabled at (3934660): [] debug_exception_enter+0xa4/0xe0 + softirqs last enabled at (3934654): [] __do_softirq+0x46c/0x628 + softirqs last disabled at (3934649): [] irq_exit+0x104/0x118 + +This is a common issue when using cdev embedded in a struct. +Fortunately, we already have a mechanism to solve this kind of issue. +Please see commit 233ed09d7fda ("chardev: add helper function to +register char devs with a struct device") for more detail. + +In this patch, we choose to embed the struct device into the i2c_dev, +and use the API provided by the commit 233ed09d7fda to make sure that +the release of i2c_dev and cdev are in sequence. 
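+
+For reference, the general shape of that pattern (a minimal sketch using
+hypothetical "foo" names rather than the actual i2c-dev code; the kernel
+helpers device_initialize(), cdev_init(), cdev_device_add(),
+cdev_device_del() and put_device() are the real APIs referenced above)
+looks roughly like this:
+
+	#include <linux/cdev.h>
+	#include <linux/device.h>
+	#include <linux/fs.h>
+	#include <linux/module.h>
+	#include <linux/slab.h>
+
+	/* Embed both the device and the cdev in the containing object. */
+	struct foo_dev {
+		struct device dev;
+		struct cdev cdev;
+	};
+
+	static void foo_dev_release(struct device *dev)
+	{
+		/* Runs only once the last reference to dev has been dropped. */
+		kfree(container_of(dev, struct foo_dev, dev));
+	}
+
+	static int foo_dev_add(const struct file_operations *fops, dev_t devt)
+	{
+		struct foo_dev *foo;
+		int err;
+
+		foo = kzalloc(sizeof(*foo), GFP_KERNEL);
+		if (!foo)
+			return -ENOMEM;
+
+		device_initialize(&foo->dev);
+		foo->dev.devt = devt;
+		foo->dev.release = foo_dev_release;
+		dev_set_name(&foo->dev, "foo%d", MINOR(devt));
+
+		cdev_init(&foo->cdev, fops);
+		foo->cdev.owner = THIS_MODULE;
+
+		/* Ties the cdev's lifetime to the device's kobject. */
+		err = cdev_device_add(&foo->cdev, &foo->dev);
+		if (err)
+			put_device(&foo->dev);	/* foo_dev_release() frees foo */
+		return err;
+	}
+
+	/*
+	 * Teardown mirrors the setup:
+	 *	cdev_device_del(&foo->cdev, &foo->dev);
+	 *	put_device(&foo->dev);
+	 */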
+ +Signed-off-by: Kevin Hao +Signed-off-by: Wolfram Sang +Signed-off-by: Sasha Levin +--- + drivers/i2c/i2c-dev.c | 48 +++++++++++++++++++++++-------------------- + 1 file changed, 26 insertions(+), 22 deletions(-) + +diff --git a/drivers/i2c/i2c-dev.c b/drivers/i2c/i2c-dev.c +index eaa312bc3a3c..c4066276eb7b 100644 +--- a/drivers/i2c/i2c-dev.c ++++ b/drivers/i2c/i2c-dev.c +@@ -47,7 +47,7 @@ + struct i2c_dev { + struct list_head list; + struct i2c_adapter *adap; +- struct device *dev; ++ struct device dev; + struct cdev cdev; + }; + +@@ -91,12 +91,14 @@ static struct i2c_dev *get_free_i2c_dev(struct i2c_adapter *adap) + return i2c_dev; + } + +-static void put_i2c_dev(struct i2c_dev *i2c_dev) ++static void put_i2c_dev(struct i2c_dev *i2c_dev, bool del_cdev) + { + spin_lock(&i2c_dev_list_lock); + list_del(&i2c_dev->list); + spin_unlock(&i2c_dev_list_lock); +- kfree(i2c_dev); ++ if (del_cdev) ++ cdev_device_del(&i2c_dev->cdev, &i2c_dev->dev); ++ put_device(&i2c_dev->dev); + } + + static ssize_t name_show(struct device *dev, +@@ -542,6 +544,14 @@ static const struct file_operations i2cdev_fops = { + + static struct class *i2c_dev_class; + ++static void i2cdev_dev_release(struct device *dev) ++{ ++ struct i2c_dev *i2c_dev; ++ ++ i2c_dev = container_of(dev, struct i2c_dev, dev); ++ kfree(i2c_dev); ++} ++ + static int i2cdev_attach_adapter(struct device *dev, void *dummy) + { + struct i2c_adapter *adap; +@@ -558,27 +568,23 @@ static int i2cdev_attach_adapter(struct device *dev, void *dummy) + + cdev_init(&i2c_dev->cdev, &i2cdev_fops); + i2c_dev->cdev.owner = THIS_MODULE; +- res = cdev_add(&i2c_dev->cdev, MKDEV(I2C_MAJOR, adap->nr), 1); +- if (res) +- goto error_cdev; +- +- /* register this i2c device with the driver core */ +- i2c_dev->dev = device_create(i2c_dev_class, &adap->dev, +- MKDEV(I2C_MAJOR, adap->nr), NULL, +- "i2c-%d", adap->nr); +- if (IS_ERR(i2c_dev->dev)) { +- res = PTR_ERR(i2c_dev->dev); +- goto error; ++ ++ device_initialize(&i2c_dev->dev); ++ i2c_dev->dev.devt = MKDEV(I2C_MAJOR, adap->nr); ++ i2c_dev->dev.class = i2c_dev_class; ++ i2c_dev->dev.parent = &adap->dev; ++ i2c_dev->dev.release = i2cdev_dev_release; ++ dev_set_name(&i2c_dev->dev, "i2c-%d", adap->nr); ++ ++ res = cdev_device_add(&i2c_dev->cdev, &i2c_dev->dev); ++ if (res) { ++ put_i2c_dev(i2c_dev, false); ++ return res; + } + + pr_debug("i2c-dev: adapter [%s] registered as minor %d\n", + adap->name, adap->nr); + return 0; +-error: +- cdev_del(&i2c_dev->cdev); +-error_cdev: +- put_i2c_dev(i2c_dev); +- return res; + } + + static int i2cdev_detach_adapter(struct device *dev, void *dummy) +@@ -594,9 +600,7 @@ static int i2cdev_detach_adapter(struct device *dev, void *dummy) + if (!i2c_dev) /* attach_adapter must have failed */ + return 0; + +- cdev_del(&i2c_dev->cdev); +- put_i2c_dev(i2c_dev); +- device_destroy(i2c_dev_class, MKDEV(I2C_MAJOR, adap->nr)); ++ put_i2c_dev(i2c_dev, true); + + pr_debug("i2c-dev: adapter [%s] unregistered\n", adap->name); + return 0; +-- +2.25.1 + diff --git a/queue-4.9/padata-initialize-pd-cpu-with-effective-cpumask.patch b/queue-4.9/padata-initialize-pd-cpu-with-effective-cpumask.patch new file mode 100644 index 00000000000..54fb1f532d4 --- /dev/null +++ b/queue-4.9/padata-initialize-pd-cpu-with-effective-cpumask.patch @@ -0,0 +1,76 @@ +From 6635574a35f2e7a40d42354d444db8b09715f6b1 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 21 May 2020 16:48:46 -0400 +Subject: padata: initialize pd->cpu with effective cpumask + +From: Daniel Jordan + +[ Upstream commit 
ec9c7d19336ee98ecba8de80128aa405c45feebb ] + +Exercising CPU hotplug on a 5.2 kernel with recent padata fixes from +cryptodev-2.6.git in an 8-CPU kvm guest... + + # modprobe tcrypt alg="pcrypt(rfc4106(gcm(aes)))" type=3 + # echo 0 > /sys/devices/system/cpu/cpu1/online + # echo c > /sys/kernel/pcrypt/pencrypt/parallel_cpumask + # modprobe tcrypt mode=215 + +...caused the following crash: + + BUG: kernel NULL pointer dereference, address: 0000000000000000 + #PF: supervisor read access in kernel mode + #PF: error_code(0x0000) - not-present page + PGD 0 P4D 0 + Oops: 0000 [#1] SMP PTI + CPU: 2 PID: 134 Comm: kworker/2:2 Not tainted 5.2.0-padata-base+ #7 + Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0- + Workqueue: pencrypt padata_parallel_worker + RIP: 0010:padata_reorder+0xcb/0x180 + ... + Call Trace: + padata_do_serial+0x57/0x60 + pcrypt_aead_enc+0x3a/0x50 [pcrypt] + padata_parallel_worker+0x9b/0xe0 + process_one_work+0x1b5/0x3f0 + worker_thread+0x4a/0x3c0 + ... + +In padata_alloc_pd, pd->cpu is set using the user-supplied cpumask +instead of the effective cpumask, and in this case cpumask_first picked +an offline CPU. + +The offline CPU's reorder->list.next is NULL in padata_reorder because +the list wasn't initialized in padata_init_pqueues, which only operates +on CPUs in the effective mask. + +Fix by using the effective mask in padata_alloc_pd. + +Fixes: 6fc4dbcf0276 ("padata: Replace delayed timer with immediate workqueue in padata_reorder") +Signed-off-by: Daniel Jordan +Cc: Herbert Xu +Cc: Steffen Klassert +Cc: linux-crypto@vger.kernel.org +Cc: linux-kernel@vger.kernel.org +Signed-off-by: Herbert Xu +Signed-off-by: Daniel Jordan +Signed-off-by: Sasha Levin +--- + kernel/padata.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/kernel/padata.c b/kernel/padata.c +index 0b9c39730d6d..1030e6cfc08c 100644 +--- a/kernel/padata.c ++++ b/kernel/padata.c +@@ -450,7 +450,7 @@ static struct parallel_data *padata_alloc_pd(struct padata_instance *pinst, + atomic_set(&pd->refcnt, 1); + pd->pinst = pinst; + spin_lock_init(&pd->lock); +- pd->cpu = cpumask_first(pcpumask); ++ pd->cpu = cpumask_first(pd->cpumask.pcpu); + INIT_WORK(&pd->reorder_work, invoke_padata_reorder); + + return pd; +-- +2.25.1 + diff --git a/queue-4.9/padata-purge-get_cpu-and-reorder_via_wq-from-padata_.patch b/queue-4.9/padata-purge-get_cpu-and-reorder_via_wq-from-padata_.patch new file mode 100644 index 00000000000..306e74593ba --- /dev/null +++ b/queue-4.9/padata-purge-get_cpu-and-reorder_via_wq-from-padata_.patch @@ -0,0 +1,68 @@ +From 5bf7aabd7348b97b354c7fdd15bd928f52fff90d Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 21 May 2020 16:48:47 -0400 +Subject: padata: purge get_cpu and reorder_via_wq from padata_do_serial + +From: Daniel Jordan + +[ Upstream commit 065cf577135a4977931c7a1e1edf442bfd9773dd ] + +With the removal of the padata timer, padata_do_serial no longer +needs special CPU handling, so remove it. 
+ +Signed-off-by: Daniel Jordan +Cc: Herbert Xu +Cc: Steffen Klassert +Cc: linux-crypto@vger.kernel.org +Cc: linux-kernel@vger.kernel.org +Signed-off-by: Herbert Xu +Signed-off-by: Daniel Jordan +Signed-off-by: Sasha Levin +--- + kernel/padata.c | 23 +++-------------------- + 1 file changed, 3 insertions(+), 20 deletions(-) + +diff --git a/kernel/padata.c b/kernel/padata.c +index 1030e6cfc08c..e82f066d63ac 100644 +--- a/kernel/padata.c ++++ b/kernel/padata.c +@@ -323,24 +323,9 @@ static void padata_serial_worker(struct work_struct *serial_work) + */ + void padata_do_serial(struct padata_priv *padata) + { +- int cpu; +- struct padata_parallel_queue *pqueue; +- struct parallel_data *pd; +- int reorder_via_wq = 0; +- +- pd = padata->pd; +- +- cpu = get_cpu(); +- +- /* We need to enqueue the padata object into the correct +- * per-cpu queue. +- */ +- if (cpu != padata->cpu) { +- reorder_via_wq = 1; +- cpu = padata->cpu; +- } +- +- pqueue = per_cpu_ptr(pd->pqueue, cpu); ++ struct parallel_data *pd = padata->pd; ++ struct padata_parallel_queue *pqueue = per_cpu_ptr(pd->pqueue, ++ padata->cpu); + + spin_lock(&pqueue->reorder.lock); + list_add_tail(&padata->list, &pqueue->reorder.list); +@@ -354,8 +339,6 @@ void padata_do_serial(struct padata_priv *padata) + */ + smp_mb__after_atomic(); + +- put_cpu(); +- + padata_reorder(pd); + } + EXPORT_SYMBOL(padata_do_serial); +-- +2.25.1 + diff --git a/queue-4.9/padata-replace-delayed-timer-with-immediate-workqueu.patch b/queue-4.9/padata-replace-delayed-timer-with-immediate-workqueu.patch new file mode 100644 index 00000000000..6c8cc3f9d6b --- /dev/null +++ b/queue-4.9/padata-replace-delayed-timer-with-immediate-workqueu.patch @@ -0,0 +1,308 @@ +From 7387f263a6958000ed47d39bb27551c12f3439a9 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 21 May 2020 16:48:45 -0400 +Subject: padata: Replace delayed timer with immediate workqueue in + padata_reorder + +From: Herbert Xu + +[ Upstream commit 6fc4dbcf0276279d488c5fbbfabe94734134f4fa ] + +The function padata_reorder will use a timer when it cannot progress +while completed jobs are outstanding (pd->reorder_objects > 0). This +is suboptimal as if we do end up using the timer then it would have +introduced a gratuitous delay of one second. + +In fact we can easily distinguish between whether completed jobs +are outstanding and whether we can make progress. All we have to +do is look at the next pqueue list. + +This patch does that by replacing pd->processed with pd->cpu so +that the next pqueue is more accessible. + +A work queue is used instead of the original try_again to avoid +hogging the CPU. + +Note that we don't bother removing the work queue in +padata_flush_queues because the whole premise is broken. You +cannot flush async crypto requests so it makes no sense to even +try. A subsequent patch will fix it by replacing it with a ref +counting scheme. 
+ +Signed-off-by: Herbert Xu +[dj: - adjust context + - corrected setup_timer -> timer_setup to delete hunk + - skip padata_flush_queues() hunk, function already removed + in 4.9] +Signed-off-by: Daniel Jordan +Signed-off-by: Sasha Levin +--- + include/linux/padata.h | 13 ++---- + kernel/padata.c | 95 ++++++++---------------------------------- + 2 files changed, 22 insertions(+), 86 deletions(-) + +diff --git a/include/linux/padata.h b/include/linux/padata.h +index 86c885f90878..3afa17ed59da 100644 +--- a/include/linux/padata.h ++++ b/include/linux/padata.h +@@ -24,7 +24,6 @@ + #include + #include + #include +-#include + #include + #include + +@@ -85,18 +84,14 @@ struct padata_serial_queue { + * @serial: List to wait for serialization after reordering. + * @pwork: work struct for parallelization. + * @swork: work struct for serialization. +- * @pd: Backpointer to the internal control structure. + * @work: work struct for parallelization. +- * @reorder_work: work struct for reordering. + * @num_obj: Number of objects that are processed by this cpu. + * @cpu_index: Index of the cpu. + */ + struct padata_parallel_queue { + struct padata_list parallel; + struct padata_list reorder; +- struct parallel_data *pd; + struct work_struct work; +- struct work_struct reorder_work; + atomic_t num_obj; + int cpu_index; + }; +@@ -122,10 +117,10 @@ struct padata_cpumask { + * @reorder_objects: Number of objects waiting in the reorder queues. + * @refcnt: Number of objects holding a reference on this parallel_data. + * @max_seq_nr: Maximal used sequence number. ++ * @cpu: Next CPU to be processed. + * @cpumask: The cpumasks in use for parallel and serial workers. ++ * @reorder_work: work struct for reordering. + * @lock: Reorder lock. +- * @processed: Number of already processed objects. +- * @timer: Reorder timer. + */ + struct parallel_data { + struct padata_instance *pinst; +@@ -134,10 +129,10 @@ struct parallel_data { + atomic_t reorder_objects; + atomic_t refcnt; + atomic_t seq_nr; ++ int cpu; + struct padata_cpumask cpumask; ++ struct work_struct reorder_work; + spinlock_t lock ____cacheline_aligned; +- unsigned int processed; +- struct timer_list timer; + }; + + /** +diff --git a/kernel/padata.c b/kernel/padata.c +index 52a1d3fd13b5..0b9c39730d6d 100644 +--- a/kernel/padata.c ++++ b/kernel/padata.c +@@ -166,23 +166,12 @@ EXPORT_SYMBOL(padata_do_parallel); + */ + static struct padata_priv *padata_get_next(struct parallel_data *pd) + { +- int cpu, num_cpus; +- unsigned int next_nr, next_index; + struct padata_parallel_queue *next_queue; + struct padata_priv *padata; + struct padata_list *reorder; ++ int cpu = pd->cpu; + +- num_cpus = cpumask_weight(pd->cpumask.pcpu); +- +- /* +- * Calculate the percpu reorder queue and the sequence +- * number of the next object. 
+- */ +- next_nr = pd->processed; +- next_index = next_nr % num_cpus; +- cpu = padata_index_to_cpu(pd, next_index); + next_queue = per_cpu_ptr(pd->pqueue, cpu); +- + reorder = &next_queue->reorder; + + spin_lock(&reorder->lock); +@@ -193,7 +182,8 @@ static struct padata_priv *padata_get_next(struct parallel_data *pd) + list_del_init(&padata->list); + atomic_dec(&pd->reorder_objects); + +- pd->processed++; ++ pd->cpu = cpumask_next_wrap(cpu, pd->cpumask.pcpu, -1, ++ false); + + spin_unlock(&reorder->lock); + goto out; +@@ -216,6 +206,7 @@ static void padata_reorder(struct parallel_data *pd) + struct padata_priv *padata; + struct padata_serial_queue *squeue; + struct padata_instance *pinst = pd->pinst; ++ struct padata_parallel_queue *next_queue; + + /* + * We need to ensure that only one cpu can work on dequeueing of +@@ -247,7 +238,6 @@ static void padata_reorder(struct parallel_data *pd) + * so exit immediately. + */ + if (PTR_ERR(padata) == -ENODATA) { +- del_timer(&pd->timer); + spin_unlock_bh(&pd->lock); + return; + } +@@ -266,70 +256,29 @@ static void padata_reorder(struct parallel_data *pd) + + /* + * The next object that needs serialization might have arrived to +- * the reorder queues in the meantime, we will be called again +- * from the timer function if no one else cares for it. ++ * the reorder queues in the meantime. + * +- * Ensure reorder_objects is read after pd->lock is dropped so we see +- * an increment from another task in padata_do_serial. Pairs with ++ * Ensure reorder queue is read after pd->lock is dropped so we see ++ * new objects from another task in padata_do_serial. Pairs with + * smp_mb__after_atomic in padata_do_serial. + */ + smp_mb(); +- if (atomic_read(&pd->reorder_objects) +- && !(pinst->flags & PADATA_RESET)) +- mod_timer(&pd->timer, jiffies + HZ); +- else +- del_timer(&pd->timer); + +- return; ++ next_queue = per_cpu_ptr(pd->pqueue, pd->cpu); ++ if (!list_empty(&next_queue->reorder.list)) ++ queue_work(pinst->wq, &pd->reorder_work); + } + + static void invoke_padata_reorder(struct work_struct *work) + { +- struct padata_parallel_queue *pqueue; + struct parallel_data *pd; + + local_bh_disable(); +- pqueue = container_of(work, struct padata_parallel_queue, reorder_work); +- pd = pqueue->pd; ++ pd = container_of(work, struct parallel_data, reorder_work); + padata_reorder(pd); + local_bh_enable(); + } + +-static void padata_reorder_timer(unsigned long arg) +-{ +- struct parallel_data *pd = (struct parallel_data *)arg; +- unsigned int weight; +- int target_cpu, cpu; +- +- cpu = get_cpu(); +- +- /* We don't lock pd here to not interfere with parallel processing +- * padata_reorder() calls on other CPUs. We just need any CPU out of +- * the cpumask.pcpu set. It would be nice if it's the right one but +- * it doesn't matter if we're off to the next one by using an outdated +- * pd->processed value. +- */ +- weight = cpumask_weight(pd->cpumask.pcpu); +- target_cpu = padata_index_to_cpu(pd, pd->processed % weight); +- +- /* ensure to call the reorder callback on the correct CPU */ +- if (cpu != target_cpu) { +- struct padata_parallel_queue *pqueue; +- struct padata_instance *pinst; +- +- /* The timer function is serialized wrt itself -- no locking +- * needed. 
+- */ +- pinst = pd->pinst; +- pqueue = per_cpu_ptr(pd->pqueue, target_cpu); +- queue_work_on(target_cpu, pinst->wq, &pqueue->reorder_work); +- } else { +- padata_reorder(pd); +- } +- +- put_cpu(); +-} +- + static void padata_serial_worker(struct work_struct *serial_work) + { + struct padata_serial_queue *squeue; +@@ -383,9 +332,8 @@ void padata_do_serial(struct padata_priv *padata) + + cpu = get_cpu(); + +- /* We need to run on the same CPU padata_do_parallel(.., padata, ..) +- * was called on -- or, at least, enqueue the padata object into the +- * correct per-cpu queue. ++ /* We need to enqueue the padata object into the correct ++ * per-cpu queue. + */ + if (cpu != padata->cpu) { + reorder_via_wq = 1; +@@ -395,12 +343,12 @@ void padata_do_serial(struct padata_priv *padata) + pqueue = per_cpu_ptr(pd->pqueue, cpu); + + spin_lock(&pqueue->reorder.lock); +- atomic_inc(&pd->reorder_objects); + list_add_tail(&padata->list, &pqueue->reorder.list); ++ atomic_inc(&pd->reorder_objects); + spin_unlock(&pqueue->reorder.lock); + + /* +- * Ensure the atomic_inc of reorder_objects above is ordered correctly ++ * Ensure the addition to the reorder list is ordered correctly + * with the trylock of pd->lock in padata_reorder. Pairs with smp_mb + * in padata_reorder. + */ +@@ -408,13 +356,7 @@ void padata_do_serial(struct padata_priv *padata) + + put_cpu(); + +- /* If we're running on the wrong CPU, call padata_reorder() via a +- * kernel worker. +- */ +- if (reorder_via_wq) +- queue_work_on(cpu, pd->pinst->wq, &pqueue->reorder_work); +- else +- padata_reorder(pd); ++ padata_reorder(pd); + } + EXPORT_SYMBOL(padata_do_serial); + +@@ -470,14 +412,12 @@ static void padata_init_pqueues(struct parallel_data *pd) + continue; + } + +- pqueue->pd = pd; + pqueue->cpu_index = cpu_index; + cpu_index++; + + __padata_list_init(&pqueue->reorder); + __padata_list_init(&pqueue->parallel); + INIT_WORK(&pqueue->work, padata_parallel_worker); +- INIT_WORK(&pqueue->reorder_work, invoke_padata_reorder); + atomic_set(&pqueue->num_obj, 0); + } + } +@@ -505,12 +445,13 @@ static struct parallel_data *padata_alloc_pd(struct padata_instance *pinst, + + padata_init_pqueues(pd); + padata_init_squeues(pd); +- setup_timer(&pd->timer, padata_reorder_timer, (unsigned long)pd); + atomic_set(&pd->seq_nr, -1); + atomic_set(&pd->reorder_objects, 0); + atomic_set(&pd->refcnt, 1); + pd->pinst = pinst; + spin_lock_init(&pd->lock); ++ pd->cpu = cpumask_first(pcpumask); ++ INIT_WORK(&pd->reorder_work, invoke_padata_reorder); + + return pd; + +-- +2.25.1 + diff --git a/queue-4.9/padata-set-cpu_index-of-unused-cpus-to-1.patch b/queue-4.9/padata-set-cpu_index-of-unused-cpus-to-1.patch new file mode 100644 index 00000000000..43cc38fd048 --- /dev/null +++ b/queue-4.9/padata-set-cpu_index-of-unused-cpus-to-1.patch @@ -0,0 +1,50 @@ +From 22e591286da7a218256fee150d72945e124a9353 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 21 May 2020 16:48:44 -0400 +Subject: padata: set cpu_index of unused CPUs to -1 + +From: Mathias Krause + +[ Upstream commit 1bd845bcb41d5b7f83745e0cb99273eb376f2ec5 ] + +The parallel queue per-cpu data structure gets initialized only for CPUs +in the 'pcpu' CPU mask set. This is not sufficient as the reorder timer +may run on a different CPU and might wrongly decide it's the target CPU +for the next reorder item as per-cpu memory gets memset(0) and we might +be waiting for the first CPU in cpumask.pcpu, i.e. cpu_index 0. 
+ +Make the '__this_cpu_read(pd->pqueue->cpu_index) == next_queue->cpu_index' +compare in padata_get_next() fail in this case by initializing the +cpu_index member of all per-cpu parallel queues. Use -1 for unused ones. + +Signed-off-by: Mathias Krause +Signed-off-by: Herbert Xu +Signed-off-by: Daniel Jordan +Signed-off-by: Sasha Levin +--- + kernel/padata.c | 8 +++++++- + 1 file changed, 7 insertions(+), 1 deletion(-) + +diff --git a/kernel/padata.c b/kernel/padata.c +index 693536efccf9..52a1d3fd13b5 100644 +--- a/kernel/padata.c ++++ b/kernel/padata.c +@@ -462,8 +462,14 @@ static void padata_init_pqueues(struct parallel_data *pd) + struct padata_parallel_queue *pqueue; + + cpu_index = 0; +- for_each_cpu(cpu, pd->cpumask.pcpu) { ++ for_each_possible_cpu(cpu) { + pqueue = per_cpu_ptr(pd->pqueue, cpu); ++ ++ if (!cpumask_test_cpu(cpu, pd->cpumask.pcpu)) { ++ pqueue->cpu_index = -1; ++ continue; ++ } ++ + pqueue->pd = pd; + pqueue->cpu_index = cpu_index; + cpu_index++; +-- +2.25.1 + diff --git a/queue-4.9/series b/queue-4.9/series index df065bb2b30..22b54555dfe 100644 --- a/queue-4.9/series +++ b/queue-4.9/series @@ -17,3 +17,9 @@ ceph-fix-double-unlock-in-handle_cap_export.patch usb-core-fix-misleading-driver-bug-report.patch platform-x86-asus-nb-wmi-do-not-load-on-asus-t100ta-.patch arm-futex-address-build-warning.patch +i2c-dev-fix-the-race-between-the-release-of-i2c_dev-.patch +padata-set-cpu_index-of-unused-cpus-to-1.patch +padata-replace-delayed-timer-with-immediate-workqueu.patch +padata-initialize-pd-cpu-with-effective-cpumask.patch +padata-purge-get_cpu-and-reorder_via_wq-from-padata_.patch +arm64-fix-the-flush_icache_range-arguments-in-machin.patch