From: Paul E. McKenney
Date: Fri, 22 Jan 2021 23:26:44 +0000 (-0800)
Subject: Merge branches 'doc.2021.01.06a', 'fixes.2021.01.04b', 'kfree_rcu.2021.01.04a', ...
X-Git-Tag: v5.12-rc1~150^2~1^2
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=0d2460ba61841e5c2e64e77f7a84d3fc69cfe899;p=thirdparty%2Flinux.git

Merge branches 'doc.2021.01.06a', 'fixes.2021.01.04b', 'kfree_rcu.2021.01.04a', 'mmdumpobj.2021.01.22a', 'nocb.2021.01.06a', 'rt.2021.01.04a', 'stall.2021.01.06a', 'torture.2021.01.12a' and 'tortureall.2021.01.06a' into HEAD

doc.2021.01.06a: Documentation updates.
fixes.2021.01.04b: Miscellaneous fixes.
kfree_rcu.2021.01.04a: kfree_rcu() updates.
mmdumpobj.2021.01.22a: Dump allocation point for memory blocks.
nocb.2021.01.06a: RCU callback offload updates and cblist segment lengths.
rt.2021.01.04a: Real-time updates.
stall.2021.01.06a: RCU CPU stall warning updates.
torture.2021.01.12a: Torture-test updates and polling SRCU grace-period API.
tortureall.2021.01.06a: Torture-test script updates.
---

0d2460ba61841e5c2e64e77f7a84d3fc69cfe899
diff --cc Documentation/RCU/Design/Requirements/Requirements.rst
index 42a81e30619ef,1ae79a10a8de6,e8c84fcc05071,e8c84fcc05071,e8c84fcc05071,e8c84fcc05071,e8c84fcc05071,93a189ae85924,e8c84fcc05071..0da9133fa13ab
--- a/Documentation/RCU/Design/Requirements/Requirements.rst
+++ b/Documentation/RCU/Design/Requirements/Requirements.rst
@@@@@@@@@@ -1927,46 -1929,16 -1929,46 -1929,46 -1929,46 -1929,46 -1929,46 -1929,46 -1929,46 +1927,46 @@@@@@@@@@ The Linux-kernel CPU-hotplug implementa
 to allow the various kernel subsystems (including RCU) to respond
 appropriately to a given CPU-hotplug operation. Most RCU operations may
 be invoked from CPU-hotplug notifiers, including even synchronous
- grace-period operations such as ``synchronize_rcu()`` and
- ``synchronize_rcu_expedited()``.
-
- However, all-callback-wait operations such as ``rcu_barrier()`` are also
- not supported, due to the fact that there are phases of CPU-hotplug
- operations where the outgoing CPU's callbacks will not be invoked until
- after the CPU-hotplug operation ends, which could also result in
- deadlock. Furthermore, ``rcu_barrier()`` blocks CPU-hotplug operations
- during its execution, which results in another type of deadlock when
- invoked from a CPU-hotplug notifier.
-------grace-period operations such as (``synchronize_rcu()`` and
-------``synchronize_rcu_expedited()``). However, these synchronous operations
++++++++grace-period operations such as (synchronize_rcu() and
++++++++synchronize_rcu_expedited()). However, these synchronous operations
+ do block and therefore cannot be invoked from notifiers that execute via
-------``stop_machine()``, specifically those between the ``CPUHP_AP_OFFLINE``
++++++++stop_machine(), specifically those between the ``CPUHP_AP_OFFLINE``
+ and ``CPUHP_AP_ONLINE`` states.
+
-------In addition, all-callback-wait operations such as ``rcu_barrier()`` may
++++++++In addition, all-callback-wait operations such as rcu_barrier() may
+ not be invoked from any CPU-hotplug notifier. This restriction is due
+ to the fact that there are phases of CPU-hotplug operations where the
+ outgoing CPU's callbacks will not be invoked until after the CPU-hotplug
+ operation ends, which could also result in deadlock. Furthermore,
-------``rcu_barrier()`` blocks CPU-hotplug operations during its execution,
++++++++rcu_barrier() blocks CPU-hotplug operations during its execution,
+ which results in another type of deadlock when invoked from a CPU-hotplug
+ notifier.
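As an aside illustrating the rule stated in the documentation text above (this sketch is not part of the patch; the callback name and the use of a dynamically allocated CPUHP_BP_PREPARE_DYN state are hypothetical), a blocking CPU-hotplug callback may wait synchronously for a grace period but must not invoke rcu_barrier():

    #include <linux/cpuhotplug.h>
    #include <linux/rcupdate.h>

    /*
     * Hypothetical teardown callback registered with cpuhp_setup_state()
     * for a CPUHP_BP_PREPARE_DYN state, which runs in blocking context
     * rather than under stop_machine().
     */
    static int example_rcu_user_dead(unsigned int cpu)
    {
            /* OK: a blocking grace-period wait from a blocking hotplug callback. */
            synchronize_rcu();

            /*
             * NOT OK: rcu_barrier() waits for callbacks that cannot run until
             * the hotplug operation completes, and it also blocks CPU-hotplug
             * operations, so invoking it here risks deadlock.
             */
            /* rcu_barrier(); */

            return 0;
    }
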
+
+ Finally, RCU must avoid deadlocks due to interaction between hotplug,
+ timers and grace period processing. It does so by maintaining its own set
+ of books that duplicate the centrally maintained ``cpu_online_mask``,
+ and also by reporting quiescent states explicitly when a CPU goes
+ offline. This explicit reporting of quiescent states avoids any need
+ for the force-quiescent-state loop (FQS) to report quiescent states for
+ offline CPUs. However, as a debugging measure, the FQS loop does splat
+ if offline CPUs block an RCU grace period for too long.
+
+ An offline CPU's quiescent state will be reported either:
+
-------1. As the CPU goes offline using RCU's hotplug notifier (``rcu_report_dead()``).
-------2. When grace period initialization (``rcu_gp_init()``) detects a
++++++++1. As the CPU goes offline using RCU's hotplug notifier (rcu_report_dead()).
++++++++2. When grace period initialization (rcu_gp_init()) detects a
+    race either with CPU offlining or with a task unblocking on a leaf
+    ``rcu_node`` structure whose CPUs are all offline.
+
-------The CPU-online path (``rcu_cpu_starting()``) should never need to report
++++++++The CPU-online path (rcu_cpu_starting()) should never need to report
+ a quiescent state for an offline CPU. However, as a debugging measure,
+ it does emit a warning if a quiescent state was not already reported
+ for that CPU.
+
+ During the checking/modification of RCU's hotplug bookkeeping, the
+ corresponding CPU's leaf node lock is held. This avoids race conditions
+ between RCU's hotplug notifier hooks, the grace period initialization
+ code, and the FQS loop, all of which refer to or modify this bookkeeping.

 Scheduler and RCU
 ~~~~~~~~~~~~~~~~~
@@@@@@@@@@ -2590,14 -2562,14 -2592,14 -2592,14 -2592,14 -2592,14 -2592,14 -2592,32 -2592,14 +2590,32 @@@@@@@@@@ of your CPUs and the size of your memor
 The `SRCU API `__
--------includes ``srcu_read_lock()``, ``srcu_read_unlock()``,
--------``srcu_dereference()``, ``srcu_dereference_check()``,
--------``synchronize_srcu()``, ``synchronize_srcu_expedited()``,
--------``call_srcu()``, ``srcu_barrier()``, and ``srcu_read_lock_held()``. It
--------also includes ``DEFINE_SRCU()``, ``DEFINE_STATIC_SRCU()``, and
--------``init_srcu_struct()`` APIs for defining and initializing
++++++++includes srcu_read_lock(), srcu_read_unlock(),
++++++++srcu_dereference(), srcu_dereference_check(),
++++++++synchronize_srcu(), synchronize_srcu_expedited(),
++++++++call_srcu(), srcu_barrier(), and srcu_read_lock_held(). It
++++++++also includes DEFINE_SRCU(), DEFINE_STATIC_SRCU(), and
++++++++init_srcu_struct() APIs for defining and initializing
 ``srcu_struct`` structures.
+++++++ +More recently, the SRCU API has added polling interfaces:
+++++++ +
+++++++ +#. start_poll_synchronize_srcu() returns a cookie identifying
+++++++ +   the completion of a future SRCU grace period and ensures
+++++++ +   that this grace period will be started.
+++++++ +#. poll_state_synchronize_srcu() returns ``true`` iff the
+++++++ +   specified cookie corresponds to an already-completed
+++++++ +   SRCU grace period.
+++++++ +#. get_state_synchronize_srcu() returns a cookie just like
+++++++ +   start_poll_synchronize_srcu() does, but differs in that
+++++++ +   it does nothing to ensure that any future SRCU grace period
+++++++ +   will be started.
+++++++ +
+++++++ +These functions are used to avoid unnecessary SRCU grace periods in
+++++++ +certain types of buffer-cache algorithms having multi-stage age-out
+++++++ +mechanisms.  The idea is that by the time the block has aged completely
+++++++ +from the cache, an SRCU grace period will be very likely to have elapsed.
+++++++ +

 Tasks RCU
 ~~~~~~~~~
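As an aside to the polled-interface documentation added above, here is a minimal usage sketch; it is not part of the patch, and the srcu_struct, the structure, and the age-out staging are hypothetical:

    #include <linux/slab.h>
    #include <linux/srcu.h>

    /* Hypothetical srcu_struct and cache block, used only for illustration. */
    static DEFINE_SRCU(example_srcu);

    struct example_block {
            unsigned long gp_cookie;
            /* ... cached data ... */
    };

    /* Early age-out stage: record a cookie and ensure a grace period starts. */
    static void example_begin_age_out(struct example_block *blk)
    {
            blk->gp_cookie = start_poll_synchronize_srcu(&example_srcu);
    }

    /* Final age-out stage: free only once that grace period has elapsed. */
    static bool example_try_final_free(struct example_block *blk)
    {
            if (!poll_state_synchronize_srcu(&example_srcu, blk->gp_cookie))
                    return false;  /* Not yet safe; retry on a later pass. */
            kfree(blk);
            return true;
    }

Where some other activity is known to start the needed grace period, get_state_synchronize_srcu() can supply the cookie instead of start_poll_synchronize_srcu().
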
diff --cc kernel/locking/locktorture.c
index fd838cea39349,af99e9ca285a6,fd838cea39349,fd838cea39349,fd838cea39349,fd838cea39349,fd838cea39349,fd838cea39349,fd838cea39349..0ab94e1f1276a
--- a/kernel/locking/locktorture.c
+++ b/kernel/locking/locktorture.c
@@@@@@@@@@ -27,9 -27,7 -27,9 -27,9 -27,9 -27,9 -27,9 -27,9 -27,9 +27,8 @@@@@@@@@@
 #include
 #include
 #include
- -------#include
 #include
+ #include
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Paul E. McKenney ");
diff --cc kernel/rcu/rcutorture.c
index 528ed10b78fdc,916ea4f66e4b2,528ed10b78fdc,528ed10b78fdc,b9dd63c166b9b,528ed10b78fdc,528ed10b78fdc,a816df4e86e00,528ed10b78fdc..99657ffa66887
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@@@@@@@@@ -1018,42 -1008,40 -1018,42 -1018,42 -1024,42 -1018,42 -1018,42 -1062,26 -1018,42 +1068,26 @@@@@@@@@@ rcu_torture_fqs(void *arg
         return 0;
 }
+++++++ +// Used by writers to randomly choose from the available grace-period
+++++++ +// primitives.  The only purpose of the initialization is to size the array.
+++++++ +static int synctype[] = { RTWS_DEF_FREE, RTWS_EXP_SYNC, RTWS_COND_GET, RTWS_POLL_GET, RTWS_SYNC };
+++++++ +static int nsynctypes;
+++++++ +
 /*
------- - * RCU torture writer kthread.  Repeatedly substitutes a new structure
------- - * for that pointed to by rcu_torture_current, freeing the old structure
------- - * after a series of grace periods (the "pipeline").
+++++++ + * Determine which grace-period primitives are available.
  */
------- -static int
------- -rcu_torture_writer(void *arg)
+++++++ +static void rcu_torture_write_types(void)
 {
------- -        bool can_expedite = !rcu_gp_is_expedited() && !rcu_gp_is_normal();
------- -        int expediting = 0;
------- -        unsigned long gp_snap;
         bool gp_cond1 = gp_cond, gp_exp1 = gp_exp, gp_normal1 = gp_normal;
------- -        bool gp_sync1 = gp_sync;
------- -        int i;
- ----- -        int oldnice = task_nice(current);
------- -        struct rcu_torture *rp;
------- -        struct rcu_torture *old_rp;
------- -        static DEFINE_TORTURE_RANDOM(rand);
- ----- -        bool stutter_waited;
------- -        int synctype[] = { RTWS_DEF_FREE, RTWS_EXP_SYNC,
------- -                           RTWS_COND_GET, RTWS_SYNC };
------- -        int nsynctypes = 0;
------- -
------- -        VERBOSE_TOROUT_STRING("rcu_torture_writer task started");
------- -        if (!can_expedite)
------- -                pr_alert("%s" TORTURE_FLAG
------- -                         " GP expediting controlled from boot/sysfs for %s.\n",
------- -                         torture_type, cur_ops->name);
+++++++ +        bool gp_poll1 = gp_poll, gp_sync1 = gp_sync;
         /* Initialize synctype[] array.  If none set, take default. */
------- -        if (!gp_cond1 && !gp_exp1 && !gp_normal1 && !gp_sync1)
------- -                gp_cond1 = gp_exp1 = gp_normal1 = gp_sync1 = true;
------- -        if (gp_cond1 && cur_ops->get_state && cur_ops->cond_sync) {
+++++++ +        if (!gp_cond1 && !gp_exp1 && !gp_normal1 && !gp_poll1 && !gp_sync1)
+++++++ +                gp_cond1 = gp_exp1 = gp_normal1 = gp_poll1 = gp_sync1 = true;
+++++++ +        if (gp_cond1 && cur_ops->get_gp_state && cur_ops->cond_sync) {
                 synctype[nsynctypes++] = RTWS_COND_GET;
                 pr_info("%s: Testing conditional GPs.\n", __func__);
------- -        } else if (gp_cond && (!cur_ops->get_state || !cur_ops->cond_sync)) {
+++++++ +        } else if (gp_cond && (!cur_ops->get_gp_state || !cur_ops->cond_sync)) {
                 pr_alert("%s: gp_cond without primitives.\n", __func__);
         }
         if (gp_exp1 && cur_ops->exp_sync) {
@@@@@@@@@@ -1155,8 -1143,7 -1155,8 -1155,8 -1161,8 -1155,8 -1155,8 -1243,9 -1155,8 +1249,9 @@@@@@@@@@ rcu_torture_writer(void *arg
                                 !rcu_gp_is_normal();
         }
         rcu_torture_writer_state = RTWS_STUTTER;
-        if (stutter_wait("rcu_torture_writer") &&
+++++++ +        boot_ended = rcu_inkernel_boot_has_ended();
+        stutter_waited = stutter_wait("rcu_torture_writer");
+        if (stutter_waited &&
             !READ_ONCE(rcu_fwd_cb_nodelay) &&
             !cur_ops->slow_gps &&
             !torture_must_stop() &&
@@@@@@@@@@ -2505,13 -2484,13 -2505,13 -2505,13 -2569,13 -2505,13 -2505,13 -2681,15 -2505,13 +2745,15 @@@@@@@@@@ rcu_torture_cleanup(void
                         torture_stop_kthread(rcu_torture_reader,
                                              reader_tasks[i]);
                 kfree(reader_tasks);
+                reader_tasks = NULL;
         }
+++++++ +        kfree(rcu_torture_reader_mbchk);
+++++++ +        rcu_torture_reader_mbchk = NULL;
         if (fakewriter_tasks) {
-                for (i = 0; i < nfakewriters; i++) {
+                for (i = 0; i < nfakewriters; i++)
                         torture_stop_kthread(rcu_torture_fakewriter,
                                              fakewriter_tasks[i]);
-                }
                 kfree(fakewriter_tasks);
                 fakewriter_tasks = NULL;
         }
diff --cc kernel/rcu/tasks.h
index 35bdcfd84d428,74767d365752c,35bdcfd84d428,35bdcfd84d428,35bdcfd84d428,35bdcfd84d428,35bdcfd84d428,35bdcfd84d428,35bdcfd84d428..af7c19439f4ec
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@@@@@@@@@ -564,10 -569,9 -564,10 -564,10 -564,10 -564,10 -564,10 -564,10 -564,10 +564,9 @@@@@@@@@@ static int __init rcu_spawn_tasks_kthre
         rcu_spawn_tasks_kthread_generic(&rcu_tasks);
         return 0;
 }
- -------core_initcall(rcu_spawn_tasks_kthread);
- #ifndef CONFIG_TINY_RCU
- static void show_rcu_tasks_classic_gp_kthread(void)
+ #if !defined(CONFIG_TINY_RCU)
+ void show_rcu_tasks_classic_gp_kthread(void)
 {
         show_rcu_tasks_generic_gp_kthread(&rcu_tasks, "");
 }
@@@@@@@@@@ -692,10 -696,9 -692,10 -692,10 -692,10 -692,10 -692,10 -692,10 -692,10 +691,9 @@@@@@@@@@ static int __init rcu_spawn_tasks_rude_
         rcu_spawn_tasks_kthread_generic(&rcu_tasks_rude);
         return 0;
 }
- -------core_initcall(rcu_spawn_tasks_rude_kthread);
- #ifndef CONFIG_TINY_RCU
- static void show_rcu_tasks_rude_gp_kthread(void)
+ #if !defined(CONFIG_TINY_RCU)
+ void show_rcu_tasks_rude_gp_kthread(void)
 {
         show_rcu_tasks_generic_gp_kthread(&rcu_tasks_rude, "");
 }
@@@@@@@@@@ -1193,10 -1203,9 -1193,10 -1193,10 -1193,10 -1193,10 -1193,10 -1193,10 -1193,10 +1196,9 @@@@@@@@@@ static int __init rcu_spawn_tasks_trace
         rcu_spawn_tasks_kthread_generic(&rcu_tasks_trace);
         return 0;
 }
- -------core_initcall(rcu_spawn_tasks_trace_kthread);
- #ifndef CONFIG_TINY_RCU
- static void show_rcu_tasks_trace_gp_kthread(void)
+ #if !defined(CONFIG_TINY_RCU)
+ void show_rcu_tasks_trace_gp_kthread(void)
 {
         char buf[64];
diff --cc kernel/rcu/tree.c
index 40e5e3dd253e0,f70634f7c3aa4,2db736cbe3422,84513c52d9b07,e6dee714efe0a,d60903581300d,e918f100cc347,40e5e3dd253e0,40e5e3dd253e0..0f4a6a3c057b0
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@@@@@@@@@ -1765,14 -1756,8 -1765,14 -1765,14 -1772,14 -1767,14 -1765,14 -1765,14 -1765,14 +1774,14 @@@@@@@@@@ static bool rcu_gp_init(void
          * go offline later.  Please also refer to "Hotplug CPU" section
          * of RCU's Requirements documentation.
          */
------ --        rcu_state.gp_state = RCU_GP_ONOFF;
++++++ ++        WRITE_ONCE(rcu_state.gp_state, RCU_GP_ONOFF);
         rcu_for_each_leaf_node(rnp) {
+                smp_mb(); // Pair with barriers used when updating ->ofl_seq to odd values.
+                firstseq = READ_ONCE(rnp->ofl_seq);
+                if (firstseq & 0x1)
+                        while (firstseq == READ_ONCE(rnp->ofl_seq))
+                                schedule_timeout_idle(1);  // Can't wake unless RCU is watching.
+                smp_mb(); // Pair with barriers used when updating ->ofl_seq to even values.
                 raw_spin_lock(&rcu_state.ofl_lock);
                 raw_spin_lock_irq_rcu_node(rnp);
                 if (rnp->qsmaskinit == rnp->qsmaskinitnext &&
@@@@@@@@@@ -2430,11 -2416,12 -2430,11 -2430,11 -2437,12 -2432,11 -2435,11 -2430,11 -2430,11 +2444,12 @@@@@@@@@@ int rcutree_dead_cpu(unsigned int cpu
 static void rcu_do_batch(struct rcu_data *rdp)
 {
         int div;
++++ ++++        bool __maybe_unused empty;
         unsigned long flags;
-        const bool offloaded = IS_ENABLED(CONFIG_RCU_NOCB_CPU) &&
-                               rcu_segcblist_is_offloaded(&rdp->cblist);
+        const bool offloaded = rcu_segcblist_is_offloaded(&rdp->cblist);
         struct rcu_head *rhp;
         struct rcu_cblist rcl = RCU_CBLIST_INITIALIZER(rcl);
---- ----        long bl, count;
++++ ++++        long bl, count = 0;
         long pending, tlimit = 0;
         /* If no callbacks are ready, just return. */
@@@@@@@@@@ -2688,7 -2677,8 -2688,7 -2688,7 -2699,7 -2690,7 -2693,7 -2688,7 -2688,7 +2708,7 @@@@@@@@@@ static __latent_entropy void rcu_core(v
         unsigned long flags;
         struct rcu_data *rdp = raw_cpu_ptr(&rcu_data);
         struct rcu_node *rnp = rdp->mynode;
- -- ----        const bool offloaded = rcu_segcblist_is_offloaded(&rdp->cblist);
-        const bool offloaded = IS_ENABLED(CONFIG_RCU_NOCB_CPU) &&
-                               rcu_segcblist_is_offloaded(&rdp->cblist);
++++ ++++        const bool do_batch = !rcu_segcblist_completely_offloaded(&rdp->cblist);
         if (cpu_is_offline(smp_processor_id()))
                 return;
@@@@@@@@@@ -2989,8 -2979,9 -2989,8 -2992,8 -3000,10 -2991,8 -2994,8 -2989,8 -2989,8 +3012,10 @@@@@@@@@@ __call_rcu(struct rcu_head *head, rcu_c
                 trace_rcu_callback(rcu_state.name, head,
                                    rcu_segcblist_n_cbs(&rdp->cblist));
++++ ++++        trace_rcu_segcb_stats(&rdp->cblist, TPS("SegCBQueued"));
++++ ++++
         /* Go handle any RCU core processing required. */
-        if (IS_ENABLED(CONFIG_RCU_NOCB_CPU) &&
-            unlikely(rcu_segcblist_is_offloaded(&rdp->cblist))) {
+        if (unlikely(rcu_segcblist_is_offloaded(&rdp->cblist))) {
                 __call_rcu_nocb_wake(rdp, was_alldone, flags); /* unlocks */
         } else {
                 __call_rcu_core(rdp, head, flags);
@@@@@@@@@@ -3498,10 -3454,12 -3498,11 -3501,10 -3511,10 -3500,10 -3503,10 -3498,10 -3498,10 +3523,11 @@@@@@@@@@ void kvfree_call_rcu(struct rcu_head *h
                 goto unlock_return;
         }
-        /*
-         * Under high memory pressure GFP_NOWAIT can fail,
-         * in that case the emergency path is maintained.
-         */
++ ++++++        kasan_record_aux_stack(ptr);
         success = kvfree_call_rcu_add_ptr_to_bulk(krcp, ptr);
         if (!success) {
+                run_page_cache_worker(krcp);
+
                 if (head == NULL) // Inline if kvfree_rcu(one_arg) call.
                         goto unlock_return;
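The rcu_gp_init() hunk near the top of the kernel/rcu/tree.c diff polls an odd/even ->ofl_seq counter before sampling RCU's hotplug bookkeeping. As an illustrative sketch only (the helper below is invented; the real updates live in RCU's CPU-hotplug paths such as rcu_cpu_starting() and rcu_report_dead() mentioned in the documentation hunk), the updater side of that handshake looks roughly like this:

    /*
     * Schematic updater side of the odd/even ->ofl_seq protocol that
     * rcu_gp_init() waits on above: the counter is bumped to an odd value
     * before the hotplug bookkeeping is touched and back to an even value
     * afterwards, with full barriers pairing against the smp_mb() calls in
     * the grace-period initialization loop.  Types and fields are
     * RCU-internal; this helper exists only for illustration.
     */
    static void example_ofl_seq_update(struct rcu_node *rnp)
    {
            WRITE_ONCE(rnp->ofl_seq, rnp->ofl_seq + 1);  /* Odd: update in progress. */
            smp_mb();  /* Pairs with rcu_gp_init()'s first smp_mb(). */

            /* ... update ->qsmaskinitnext and report a quiescent state ... */

            smp_mb();  /* Pairs with rcu_gp_init()'s second smp_mb(). */
            WRITE_ONCE(rnp->ofl_seq, rnp->ofl_seq + 1);  /* Even: update complete. */
            WARN_ON_ONCE(rnp->ofl_seq & 0x1);
    }

Because rcu_gp_init() spins only while the sampled value is odd and unchanged, a grace period never begins while such an update is in flight, which is what lets the explicit quiescent-state reports replace FQS-loop scanning of offline CPUs.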