Commit | Line | Data |
---|---|---|
12eb7899 GKH |
1 | From foo@baz Thu Oct 4 12:38:43 PDT 2018 |
2 | From: Emmanuel Grumbach <emmanuel.grumbach@intel.com> | |
3 | Date: Fri, 31 Aug 2018 11:31:06 +0300 | |
4 | Subject: mac80211: fix a race between restart and CSA flows | |
5 | ||
6 | From: Emmanuel Grumbach <emmanuel.grumbach@intel.com> | |
7 | ||
8 | [ Upstream commit f3ffb6c3a28963657eb8b02a795d75f2ebbd5ef4 ] | |
9 | ||
10 | We hit a problem with iwlwifi that was caused by a bug in | |
11 | mac80211. A bug in iwlwifi caused the firmware to crash in | |
12 | certain cases in channel switch. Because of that bug, | |
13 | drv_pre_channel_switch would fail and trigger the restart | |
14 | flow. | |
15 | Now we had the hw restart worker, which runs on the system's | |
16 | workqueue, and the csa_connection_drop_work worker, which runs | |
17 | on mac80211's workqueue, and the two can run together. This is | |
18 | obviously problematic since the restart work wants to | |
19 | reconfigure the connection, while the csa_connection_drop_work | |
20 | worker does the exact opposite: it tries to disconnect. | |
21 | ||
22 | Fix this by cancelling the csa_connection_drop_work worker | |
23 | in the restart worker. | |
24 | ||
25 | Note that this can sound racy, since we could have: | |
26 | ||
27 | driver iface_work CSA_work restart_work | |
28 | +++++++++++++++++++++++++++++++++++++++++++++ | |
29 | | | |
30 | <--drv_cs ---| | |
31 | <FW CRASH!> | |
32 | -CS FAILED--> | |
33 | | | | |
34 | | cancel_work(CSA) | |
35 | schedule | | |
36 | CSA work | | |
37 | | | | |
38 | Race between those 2 | |
39 | ||
40 | But this is not possible because we flush the workqueue | |
41 | in the restart worker before we cancel the CSA worker. | |
42 | That would be bulletproof if we could guarantee that | |
43 | we schedule the CSA worker only from the iface_work | |
44 | which runs on the workqueue (and not on the system's | |
45 | workqueue), but unfortunately we do have an instance | |
46 | in which we schedule the CSA work outside the context | |
47 | of the workqueue (ieee80211_chswitch_done). | |
48 | ||
49 | Note also that we should probably cancel other workers | |
50 | like beacon_connection_loss_work and possibly others | |
51 | for different types of interfaces; at the very least, | |
52 | IBSS should suffer from the exact same problem, but for | |
53 | now, do the minimum to fix the bug that was actually | |
54 | experienced and reproduced. | |
55 | ||
56 | Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> | |
57 | Signed-off-by: Luca Coelho <luciano.coelho@intel.com> | |
58 | Signed-off-by: Johannes Berg <johannes.berg@intel.com> | |
59 | Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> | |
60 | Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> | |
61 | --- | |
62 | net/mac80211/main.c | 21 ++++++++++++++++++++- | |
63 | 1 file changed, 20 insertions(+), 1 deletion(-) | |
64 | ||
65 | --- a/net/mac80211/main.c | |
66 | +++ b/net/mac80211/main.c | |
67 | @@ -254,8 +254,27 @@ static void ieee80211_restart_work(struc | |
68 | "%s called with hardware scan in progress\n", __func__); | |
69 | ||
70 | rtnl_lock(); | |
71 | - list_for_each_entry(sdata, &local->interfaces, list) | |
72 | + list_for_each_entry(sdata, &local->interfaces, list) { | |
73 | + /* | |
74 | + * XXX: there may be more work for other vif types and even | |
75 | + * for station mode: a good thing would be to run most of | |
76 | + * the iface type's dependent _stop (ieee80211_mg_stop, | |
77 | + * ieee80211_ibss_stop) etc... | |
78 | + * For now, fix only the specific bug that was seen: race | |
79 | + * between csa_connection_drop_work and us. | |
80 | + */ | |
81 | + if (sdata->vif.type == NL80211_IFTYPE_STATION) { | |
82 | + /* | |
83 | + * This worker is scheduled from the iface worker that | |
84 | + * runs on mac80211's workqueue, so we can't be | |
85 | + * scheduling this worker after the cancel right here. | |
86 | + * The exception is ieee80211_chswitch_done. | |
87 | + * Then we can have a race... | |
88 | + */ | |
89 | + cancel_work_sync(&sdata->u.mgd.csa_connection_drop_work); | |
90 | + } | |
91 | flush_delayed_work(&sdata->dec_tailroom_needed_wk); | |
92 | + } | |
93 | ieee80211_scan_cancel(local); | |
94 | ||
95 | /* make sure any new ROC will consider local->in_reconfig */ |
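
The ordering argument in the commit message (flush the workqueue first, then cancel the CSA worker) can be illustrated with a small user-space model. This is only a sketch under stated assumptions: it is single-threaded, every `toy_*` name is hypothetical, and none of it is mac80211 or kernel workqueue code. It assumes the CSA work is scheduled only by the iface worker, which is exactly the case the commit message calls safe; the `ieee80211_chswitch_done` exception is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a work item: queued or not, plus a run count. */
struct toy_work {
    bool pending;
    int runs;
};

struct toy_state {
    struct toy_work iface; /* models the iface worker */
    struct toy_work csa;   /* models csa_connection_drop_work */
};

/* In this model, running the iface worker is what schedules the CSA work. */
static void toy_run_iface(struct toy_state *s)
{
    s->iface.runs++;
    s->csa.pending = true;
}

/* Models a flush: only items that were pending when the flush began run. */
static void toy_flush(struct toy_state *s)
{
    bool iface_was_pending = s->iface.pending;
    bool csa_was_pending = s->csa.pending;

    if (iface_was_pending) {
        s->iface.pending = false;
        toy_run_iface(s);
    }
    if (csa_was_pending) {
        s->csa.pending = false;
        s->csa.runs++;
    }
}

/* Models a cancel: drop the item if it is still queued. */
static void toy_cancel(struct toy_work *w)
{
    w->pending = false;
}

/* Returns true if the CSA work is still queued after the restart path. */
bool toy_restart(bool flush_first)
{
    struct toy_state s = {
        .iface = { .pending = true, .runs = 0 },
        .csa = { .pending = false, .runs = 0 },
    };

    if (flush_first) {
        toy_flush(&s);      /* iface worker finishes and queues CSA */
        toy_cancel(&s.csa); /* CSA is dropped before it can run */
    } else {
        toy_cancel(&s.csa); /* nothing queued yet, so this is a no-op */
        toy_flush(&s);      /* iface worker queues CSA after the cancel */
    }
    return s.csa.pending;
}

/* Checks both orderings of flush and cancel. */
void toy_demo(void)
{
    assert(!toy_restart(true)); /* flush, then cancel: no CSA work left */
    assert(toy_restart(false)); /* cancel, then flush: CSA work survives */
}
```

Calling `toy_demo()` exercises both orderings. In the model, cancelling before the flush leaves the CSA work queued, which corresponds to the window the commit message describes for work scheduled outside the workqueue.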