From stable+bounces-209991-greg=kroah.com@vger.kernel.org Fri Jan 16 07:57:03 2026
From: Rajani Kantha <681739313@139.com>
Date: Fri, 16 Jan 2026 14:53:33 +0800
Subject: net: phy: allow MDIO bus PM ops to start/stop state machine for phylink-controlled PHY
To: vladimir.oltean@nxp.com, rmk+kernel@armlinux.org.uk, kuba@kernel.org, stable@vger.kernel.org
Message-ID: <20260116065334.18180-2-681739313@139.com>

From: Vladimir Oltean <vladimir.oltean@nxp.com>

[ Upstream commit fc75ea20ffb452652f0d4033f38fe88d7cfdae35 ]

DSA has 2 kinds of drivers:

1. Those who call dsa_switch_suspend() and dsa_switch_resume() from
   their device PM ops: qca8k-8xxx, bcm_sf2, microchip ksz
2. Those who don't: all others. The above methods should be optional.

For type 1, dsa_switch_suspend() calls dsa_user_suspend() -> phylink_stop(),
and dsa_switch_resume() calls dsa_user_resume() -> phylink_start().
These seem good candidates for setting mac_managed_pm = true because
that is essentially its definition [1], but that does not seem to be the
biggest problem for now, and is not what this change focuses on.

Talking strictly about the 2nd category of DSA drivers here (which
do not have MAC managed PM, meaning that for their attached PHYs,
mdio_bus_phy_suspend() and mdio_bus_phy_resume() should run in full),
I have noticed that the following warning from mdio_bus_phy_resume() is
triggered:

	WARN_ON(phydev->state != PHY_HALTED && phydev->state != PHY_READY &&
		phydev->state != PHY_UP);

because the PHY state machine is running.

It's running as a result of a previous dsa_user_open() -> ... ->
phylink_start() -> phy_start() having been initiated by the user.

The previous mdio_bus_phy_suspend() was supposed to have called
phy_stop_machine(), but it didn't. So this is why the PHY is in state
PHY_NOLINK by the time mdio_bus_phy_resume() runs.

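As a rough userspace illustration of that check (not kernel code: the
enum below is a trimmed, hypothetical stand-in for enum phy_state, and
resume_state_is_expected() is an invented helper name), the condition
guarded by the WARN_ON() can be modelled like this:

```c
#include <stdbool.h>

/*
 * Trimmed, hypothetical stand-in for the kernel's enum phy_state.
 * Only the names matter here; the numeric values are illustrative.
 */
enum phy_state_model {
	MODEL_PHY_DOWN,
	MODEL_PHY_READY,
	MODEL_PHY_HALTED,
	MODEL_PHY_UP,
	MODEL_PHY_RUNNING,
	MODEL_PHY_NOLINK,
};

/*
 * Hypothetical helper: returns true when the state is one of the three
 * that the WARN_ON() in mdio_bus_phy_resume() accepts, i.e. when the
 * warning would stay silent.
 */
static bool resume_state_is_expected(enum phy_state_model state)
{
	return state == MODEL_PHY_HALTED || state == MODEL_PHY_READY ||
	       state == MODEL_PHY_UP;
}
```

In this model, resume_state_is_expected(MODEL_PHY_NOLINK) is false, which
corresponds to the warning firing when the state machine was left running
across suspend.
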
mdio_bus_phy_suspend() did not call phy_stop_machine() because for
phylink, the phydev->adjust_link function pointer is NULL. This seems a
technicality introduced by commit fddd91016d16 ("phylib: fix PAL state
machine restart on resume"). That commit was written before phylink
existed, and was intended to avoid crashing with consumer drivers which
don't use the PHY state machine - phylink always does, when using a PHY.
But phylink itself has historically not been developed with
suspend/resume in mind, and apparently not tested too much in that
scenario, allowing this bug to exist unnoticed for so long. Plus, prior
to the WARN_ON(), it would have likely been invisible.

This issue is not in fact restricted to type 2 DSA drivers (according to
the above ad-hoc classification), but can be extrapolated to any MAC
driver with phylink and MDIO-bus-managed PHY PM ops. DSA is just where
the issue was reported. Assuming mac_managed_pm is set correctly, a
quick search indicates the following other drivers might be affected:

$ grep -Zlr PHYLINK_NETDEV drivers/ | xargs -0 grep -L mac_managed_pm
drivers/net/ethernet/atheros/ag71xx.c
drivers/net/ethernet/microchip/sparx5/sparx5_main.c
drivers/net/ethernet/microchip/lan966x/lan966x_main.c
drivers/net/ethernet/freescale/dpaa2/dpaa2-mac.c
drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c
drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
drivers/net/ethernet/freescale/ucc_geth.c
drivers/net/ethernet/freescale/enetc/enetc_pf_common.c
drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
drivers/net/ethernet/marvell/mvneta.c
drivers/net/ethernet/marvell/prestera/prestera_main.c
drivers/net/ethernet/mediatek/mtk_eth_soc.c
drivers/net/ethernet/altera/altera_tse_main.c
drivers/net/ethernet/wangxun/txgbe/txgbe_phy.c
drivers/net/ethernet/meta/fbnic/fbnic_phylink.c
drivers/net/ethernet/tehuti/tn40_phy.c
drivers/net/ethernet/mscc/ocelot_net.c

Make the existing conditions dependent on the PHY device having a
phydev->phy_link_change() implementation equal to the default
phy_link_change() provided by phylib. Otherwise, we implicitly know that
the phydev has the phylink-provided phylink_phy_change() callback, and
when phylink is used, the PHY state machine always needs to be stopped/
started on the suspend/resume path. The code is structured such that
if phydev->phy_link_change() is absent, it is only a matter of time until
the kernel crashes - no need to further complicate the test.

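The difference between the old and the new test can be sketched in plain
userspace C. This is only a model under stated assumptions: struct
phy_model is a hypothetical, heavily trimmed stand-in for struct
phy_device, and every model_*, old_* and new_* name below is invented
for illustration rather than taken from the kernel.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily trimmed stand-in for struct phy_device. */
struct phy_model {
	/* Non-NULL once the PHY is attached to a network device. */
	void *attached_dev;
	/* phylib consumer hook; left NULL when phylink drives the PHY. */
	void (*adjust_link)(void *dev);
	/* Link change notifier: the phylib default or the phylink one. */
	void (*phy_link_change)(struct phy_model *phy, bool up);
};

/* Stand-in for a MAC driver's adjust_link() callback. */
static void model_adjust_link(void *dev)
{
	(void)dev;
}

/* Stand-in for the default phy_link_change() provided by phylib. */
static void model_phy_link_change(struct phy_model *phy, bool up)
{
	(void)phy;
	(void)up;
}

/* Stand-in for phylink_phy_change(), the callback phylink installs. */
static void model_phylink_phy_change(struct phy_model *phy, bool up)
{
	(void)phy;
	(void)up;
}

/* Old test: only looks at attached_dev and adjust_link. */
static bool old_uses_state_machine(const struct phy_model *phy)
{
	return phy->attached_dev && phy->adjust_link;
}

/* New test: the old condition applies only to the phylib default hook. */
static bool new_uses_state_machine(const struct phy_model *phy)
{
	if (phy->phy_link_change == model_phy_link_change)
		return phy->attached_dev && phy->adjust_link;

	/* Any other hook is assumed to be the phylink one. */
	return true;
}
```

For a modelled phylink PHY (attached_dev set, adjust_link NULL,
phy_link_change pointing at model_phylink_phy_change), the old test
returns false while the new one returns true. For phylib-controlled PHYs
the two tests agree.
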
Thus, for the situation where the PM is not managed by the MAC, we will
make the MDIO bus PM ops treat phylink-controlled PHYs identically to
phylib-controlled PHYs for which an adjust_link() callback is
supplied. In both cases, the MDIO bus PM ops should stop and restart the
PHY state machine.

[1] https://lore.kernel.org/netdev/Z-1tiW9zjcoFkhwc@shell.armlinux.org.uk/

Fixes: 744d23c71af3 ("net: phy: Warn about incorrect mdio_bus_phy_resume() state")
Reported-by: Wei Fang <wei.fang@nxp.com>
Tested-by: Wei Fang <wei.fang@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20250407094042.2155633-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Rajani Kantha <681739313@139.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/phy/phy_device.c | 31 +++++++++++++++++++++++++++++--
 1 file changed, 29 insertions(+), 2 deletions(-)

--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -281,6 +281,33 @@ static void phy_link_change(struct phy_d
 		phydev->mii_ts->link_state(phydev->mii_ts, phydev);
 }
 
+/**
+ * phy_uses_state_machine - test whether consumer driver uses PAL state machine
+ * @phydev: the target PHY device structure
+ *
+ * Ultimately, this aims to indirectly determine whether the PHY is attached
+ * to a consumer which uses the state machine by calling phy_start() and
+ * phy_stop().
+ *
+ * When the PHY driver consumer uses phylib, it must have previously called
+ * phy_connect_direct() or one of its derivatives, so that phy_prepare_link()
+ * has set up a hook for monitoring state changes.
+ *
+ * When the PHY driver is used by the MAC driver consumer through phylink (the
+ * only other provider of a phy_link_change() method), using the PHY state
+ * machine is not optional.
+ *
+ * Return: true if consumer calls phy_start() and phy_stop(), false otherwise.
+ */
+static bool phy_uses_state_machine(struct phy_device *phydev)
+{
+	if (phydev->phy_link_change == phy_link_change)
+		return phydev->attached_dev && phydev->adjust_link;
+
+	/* phydev->phy_link_change is implicitly phylink_phy_change() */
+	return true;
+}
+
 static bool mdio_bus_phy_may_suspend(struct phy_device *phydev)
 {
 	struct device_driver *drv = phydev->mdio.dev.driver;
@@ -341,7 +368,7 @@ static __maybe_unused int mdio_bus_phy_s
 	 * may call phy routines that try to grab the same lock, and that may
 	 * lead to a deadlock.
 	 */
-	if (phydev->attached_dev && phydev->adjust_link)
+	if (phy_uses_state_machine(phydev))
 		phy_stop_machine(phydev);
 
 	if (!mdio_bus_phy_may_suspend(phydev))
@@ -395,7 +422,7 @@ no_resume:
 	WARN_ON(phydev->state != PHY_HALTED && phydev->state != PHY_READY &&
 		phydev->state != PHY_UP);
 
-	if (phydev->attached_dev && phydev->adjust_link)
+	if (phy_uses_state_machine(phydev))
 		phy_start_machine(phydev);