=====================
PHY Abstraction Layer
=====================

Purpose
=======

Most network devices consist of a set of registers which provide an interface
to a MAC layer, which communicates with the physical connection through a
PHY.  The PHY concerns itself with negotiating link parameters with the link
partner on the other side of the network connection (typically, an ethernet
cable), and provides a register interface to allow drivers to determine what
settings were chosen, and to configure what settings are allowed.

While these devices are distinct from the network devices, and conform to a
standard layout for the registers, it has been common practice to integrate
the PHY management code with the network driver.  This has resulted in large
amounts of redundant code.  Also, on embedded systems with multiple (and
sometimes quite different) ethernet controllers connected to the same
management bus, it is difficult to ensure safe use of the bus.

Since the PHYs are devices, and the management busses through which they are
accessed are, in fact, busses, the PHY Abstraction Layer (PAL) treats them as
such.  In doing so, it has these goals:

#. Increase code-reuse
#. Increase overall code-maintainability
#. Speed development time for new network drivers, and for new systems

Basically, this layer is meant to provide an interface to PHY devices which
allows network driver writers to write as little code as possible, while
still providing a full feature set.

The MDIO bus
============

Most network devices are connected to a PHY by means of a management bus.
Different devices use different busses (though some share common interfaces).
In order to take advantage of the PAL, each bus interface needs to be
registered as a distinct device.

#. read and write functions must be implemented. Their prototypes are::

        int write(struct mii_bus *bus, int mii_id, int regnum, u16 value);
        int read(struct mii_bus *bus, int mii_id, int regnum);

   mii_id is the address on the bus for the PHY, and regnum is the register
   number.  These functions are guaranteed not to be called from interrupt
   time, so it is safe for them to block, waiting for an interrupt to signal
   the operation is complete.

#. A reset function is optional. This is used to return the bus to an
   initialized state.

#. A probe function is needed.  This function should set up anything the bus
   driver needs, set up the mii_bus structure, and register with the PAL using
   mdiobus_register.  Similarly, there's a remove function to undo all of
   that (use mdiobus_unregister).

#. Like any driver, the device_driver structure must be configured, and init
   and exit functions are used to register the driver.

#. The bus must also be declared somewhere as a device, and registered.

For an example of how one driver implements an MDIO bus driver, see
drivers/net/ethernet/freescale/fsl_pq_mdio.c and an associated DTS file
for one of the users (e.g. "git grep fsl,.*-mdio arch/powerpc/boot/dts/").
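
To make the shape of the read and write callbacks concrete, here is a minimal
userspace sketch. It is not kernel code: ``struct ex_mii_bus``, its in-memory
register array, the ``ex_mdio_*`` names and the ``-1`` error return are all
assumptions made for illustration. A real bus driver fills in a
``struct mii_bus``, talks to hardware, and returns negative errno values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_MAX_ADDR 32   /* an MDIO bus addresses up to 32 PHYs */
#define EX_MAX_REG  32   /* Clause 22 exposes 32 registers per PHY */

/* Hypothetical bus state: registers are kept in memory, not in hardware. */
struct ex_mii_bus {
    uint16_t regs[EX_MAX_ADDR][EX_MAX_REG];
};

/* Reject PHY addresses and register numbers that are out of range. */
static int ex_mdio_in_range(int mii_id, int regnum)
{
    return mii_id >= 0 && mii_id < EX_MAX_ADDR &&
           regnum >= 0 && regnum < EX_MAX_REG;
}

/* Same argument order as the write prototype; returns 0, or -1 on error. */
int ex_mdio_write(struct ex_mii_bus *bus, int mii_id, int regnum,
                  uint16_t value)
{
    assert(bus != NULL);
    if (!ex_mdio_in_range(mii_id, regnum))
        return -1;
    bus->regs[mii_id][regnum] = value;
    return 0;
}

/* Returns the 16-bit value as a non-negative int, or -1 on error. */
int ex_mdio_read(struct ex_mii_bus *bus, int mii_id, int regnum)
{
    assert(bus != NULL);
    if (!ex_mdio_in_range(mii_id, regnum))
        return -1;
    return bus->regs[mii_id][regnum];
}
```

Returning an ``int`` from the read function leaves room for both a 16-bit
register value and a negative error indication, which is why the read
prototype does not return ``u16``.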

(RG)MII/electrical interface considerations
===========================================

The Reduced Gigabit Medium Independent Interface (RGMII) is a 12-pin
electrical signal interface using a synchronous 125MHz clock signal and several
data lines. Due to this design decision, a 1.5ns to 2ns delay must be added
between the clock line (RXC or TXC) and the data lines to let the PHY (clock
sink) have a large enough setup and hold time to sample the data lines correctly. The
PHY library offers different types of PHY_INTERFACE_MODE_RGMII* values to let
the PHY driver, and optionally the MAC driver, implement the required delay. The
values of phy_interface_t must be understood from the perspective of the PHY
device itself, leading to the following:

* PHY_INTERFACE_MODE_RGMII: the PHY is not responsible for inserting any
  internal delay by itself; it assumes that either the Ethernet MAC (if capable)
  or the PCB traces insert the correct 1.5-2ns delay

* PHY_INTERFACE_MODE_RGMII_TXID: the PHY should insert an internal delay
  for the transmit data lines (TXD[3:0]) processed by the PHY device

* PHY_INTERFACE_MODE_RGMII_RXID: the PHY should insert an internal delay
  for the receive data lines (RXD[3:0]) processed by the PHY device

* PHY_INTERFACE_MODE_RGMII_ID: the PHY should insert internal delays for
  both transmit AND receive data lines from/to the PHY device

Whenever possible, use the PHY side RGMII delay for these reasons:

* PHY devices may offer sub-nanosecond granularity in how they allow a
  receiver/transmitter side delay (e.g. 0.5, 1.0, 1.5ns) to be specified. Such
  precision may be required to account for differences in PCB trace lengths

* PHY devices are typically qualified for a large range of applications
  (industrial, medical, automotive...), and they provide a constant and
  reliable delay across temperature/pressure/voltage ranges

* Because PHY device drivers in PHYLIB are reusable by nature, a driver that
  can correctly configure a specified delay enables more designs with similar
  delay requirements to operate correctly

For cases where the PHY is not capable of providing this delay, but the
Ethernet MAC driver is capable of doing so, the correct phy_interface_t value
should be PHY_INTERFACE_MODE_RGMII, and the Ethernet MAC driver should be
configured correctly in order to provide the required transmit and/or receive
side delay from the perspective of the PHY device. Conversely, if the Ethernet
MAC driver looks at the phy_interface_t value, for any other mode but
PHY_INTERFACE_MODE_RGMII, it should make sure that the MAC-level delays are
disabled.
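
The division of responsibility described above can be summarized in a small
lookup. This is a userspace sketch under stated assumptions: the ``ex_``
enum and struct are hypothetical stand-ins for the four RGMII values of
phy_interface_t, and the function only restates the rules from this section.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the four PHY_INTERFACE_MODE_RGMII* values. */
enum ex_rgmii_mode {
    EX_RGMII,       /* PHY adds no delay */
    EX_RGMII_TXID,  /* PHY delays the transmit path */
    EX_RGMII_RXID,  /* PHY delays the receive path */
    EX_RGMII_ID     /* PHY delays both paths */
};

struct ex_delay_plan {
    bool phy_tx_delay;        /* PHY inserts the TX delay */
    bool phy_rx_delay;        /* PHY inserts the RX delay */
    bool mac_delays_allowed;  /* MAC (or PCB) may supply the delay */
};

/* Map an interface mode to who is expected to insert the delay. */
struct ex_delay_plan ex_rgmii_plan(enum ex_rgmii_mode mode)
{
    struct ex_delay_plan plan = { false, false, false };

    switch (mode) {
    case EX_RGMII:
        /* Only in this mode may MAC-level delays stay enabled. */
        plan.mac_delays_allowed = true;
        break;
    case EX_RGMII_TXID:
        plan.phy_tx_delay = true;
        break;
    case EX_RGMII_RXID:
        plan.phy_rx_delay = true;
        break;
    case EX_RGMII_ID:
        plan.phy_tx_delay = true;
        plan.phy_rx_delay = true;
        break;
    default:
        assert(!"unknown RGMII mode");
    }
    return plan;
}
```

A MAC driver following this section would consult ``mac_delays_allowed`` and
switch its own delay lines off for every mode other than plain RGMII.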

If neither the Ethernet MAC nor the PHY is capable of providing the
required delays, as defined by the RGMII standard, several options may be
available:

* Some SoCs may offer a pin pad/mux/controller capable of configuring a given
  set of pins' strength, delays, and voltage; and it may be a suitable
  option to insert the expected 2ns RGMII delay.

* Modifying the PCB design to include a fixed delay (e.g. using a specifically
  designed serpentine), which may not require software configuration at all.

Common problems with RGMII delay mismatch
-----------------------------------------

When there is an RGMII delay mismatch between the Ethernet MAC and the PHY, the
clock and data line signals will most likely be unstable at the moment
the PHY or MAC takes a snapshot of these signals to translate them into logical
1 or 0 states and reconstruct the data being transmitted/received. Typical
symptoms include:

* Transmission/reception partially works, and there is frequent or occasional
  packet loss observed

* Ethernet MAC may report some or all packets ingressing with an FCS/CRC error,
  or just discard them all

* Switching to lower speeds such as 10/100Mbits/sec makes the problem go away
  (since there is enough setup/hold time in that case)
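
The last symptom follows from simple arithmetic, sketched below in userspace
C. The clock rates used for 100 and 10 Mbit/s (25MHz and 2.5MHz, with one
nibble per clock cycle) are taken from the RGMII specification rather than
from this document, and the ``ex_`` function names are hypothetical.

```c
#include <assert.h>

/*
 * Time, in picoseconds, that one nibble stays valid on the data lines.
 * Returns 0 for a speed RGMII does not define.
 */
long ex_rgmii_data_window_ps(int speed_mbps)
{
    switch (speed_mbps) {
    case 1000:
        return 8000 / 2;  /* 125MHz clock (8ns period), both edges used */
    case 100:
        return 40000;     /* 25MHz clock, one nibble per cycle */
    case 10:
        return 400000;    /* 2.5MHz clock, one nibble per cycle */
    default:
        return 0;
    }
}

/* Share of the data window, in whole percent, eaten by a timing error. */
long ex_rgmii_skew_percent(int speed_mbps, long skew_ps)
{
    long window = ex_rgmii_data_window_ps(speed_mbps);

    assert(window > 0);
    return skew_ps * 100 / window;
}
```

A missing 2ns delay consumes half of the 4ns window at gigabit speed, but
only about 5% of the 40ns window at 100Mbit/s, so the slower link still
samples stable data.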

Connecting to a PHY
===================

Sometime during startup, the network driver needs to establish a connection
between the PHY device and the network device.  At this time, the PHY's bus
and drivers need to all have been loaded, so that the PHY is ready for the
connection.  At this point, there are several ways to connect to the PHY:

#. The PAL handles everything, and only calls the network driver when
   the link state changes, so it can react.

#. The PAL handles everything except interrupts (usually because the
   controller has the interrupt registers).

#. The PAL handles everything, but checks in with the driver every second,
   allowing the network driver to react first to any changes before the PAL
   does.

#. The PAL serves only as a library of functions, with the network device
   manually calling functions to update status, and configure the PHY.


Letting the PHY Abstraction Layer do Everything
===============================================

If you choose option 1 (the hope is that every driver can, while the PAL
remains useful to drivers that can't), connecting to the PHY is simple:

First, you need a function to react to changes in the link state.  This
function has the following prototype::

        static void adjust_link(struct net_device *dev);

Next, you need to know the device name of the PHY connected to this device.
The name will look something like, "0:00", where the first number is the
bus id, and the second is the PHY's address on that bus.  Typically,
the bus is responsible for making its ID unique.
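
As a sketch of how such a name can be assembled, the userspace helper below
joins a bus id and a PHY address. The function name and the ``-1`` error
return are hypothetical, and printing the address as two hexadecimal digits
is an assumption that matches the "0:00" example above.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write "<bus id>:<two-digit hex address>" into buf.
 * Returns 0 on success, or -1 if the name does not fit in len bytes.
 */
int ex_format_phy_name(char *buf, size_t len, const char *bus_id,
                       unsigned int addr)
{
    int n;

    assert(buf != NULL && bus_id != NULL);
    n = snprintf(buf, len, "%s:%02x", bus_id, addr);
    if (n < 0 || (size_t)n >= len)
        return -1;  /* encoding error or truncated output */
    return 0;
}
```

With bus id "0" and address 0 this yields the name "0:00"; the resulting
string is what gets passed as the PHY name when connecting.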

Now, to connect, just call this function::

        phydev = phy_connect(dev, phy_name, &adjust_link, interface);

*phydev* is a pointer to the phy_device structure which represents the PHY.
If phy_connect is successful, it will return the pointer.  dev, here, is the
pointer to your net_device.  Once done, this function will have started the
PHY's software state machine, and registered for the PHY's interrupt, if it
has one.  The phydev structure will be populated with information about the
current state, though the PHY will not yet be truly operational at this
point.

PHY-specific flags should be set in phydev->dev_flags prior to the call
to phy_connect() such that the underlying PHY driver can check for flags
and perform specific operations based on them.
This is useful if the system has put hardware restrictions on
the PHY/controller, of which the PHY needs to be aware.

*interface* is a phy_interface_t value which specifies the connection type
used between the controller and the PHY.  Examples are GMII, MII,
RGMII, and SGMII.  See "PHY interface modes" below.  For a full
list, see include/linux/phy.h.

Now just make sure that phydev->supported and phydev->advertising have any
values pruned from them which don't make sense for your controller (a 10/100
controller may be connected to a gigabit capable PHY, so you would need to
mask off SUPPORTED_1000baseT*).  See include/linux/ethtool.h for definitions
for these bitfields. Note that you should not SET any bits, except the
SUPPORTED_Pause and SUPPORTED_Asym_Pause bits (see below), or the PHY may get
put into an unsupported state.
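
The pruning rule can be illustrated with plain bit masks. This is a userspace
sketch, not driver code: ``struct ex_link_modes`` and the ``EX_`` constants
are hypothetical, and their bit positions are illustrative values rather than
something to copy; real code takes the definitions from
include/linux/ethtool.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative bit positions only; use the ethtool.h definitions. */
#define EX_1000baseT_Half  (1u << 4)
#define EX_1000baseT_Full  (1u << 5)
#define EX_Pause           (1u << 13)
#define EX_Asym_Pause      (1u << 14)

/* Hypothetical stand-in for the two masks kept in the phydev structure. */
struct ex_link_modes {
    uint32_t supported;
    uint32_t advertising;
};

/* A 10/100 MAC clears the gigabit modes; it never adds link modes. */
void ex_limit_to_100(struct ex_link_modes *m)
{
    const uint32_t gigabit = EX_1000baseT_Half | EX_1000baseT_Full;

    assert(m != NULL);
    m->supported &= ~gigabit;
    m->advertising &= ~gigabit;
}

/* The pause bits are the only ones the MAC driver is allowed to set. */
void ex_allow_pause(struct ex_link_modes *m)
{
    assert(m != NULL);
    m->supported |= EX_Pause | EX_Asym_Pause;
    m->advertising |= EX_Pause | EX_Asym_Pause;
}
```

Clearing bits can only narrow what the PHY negotiates, which is why it is
safe, whereas setting a link mode the PHY never reported could request
something the hardware cannot do.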

Lastly, once the controller is ready to handle network traffic, you call
phy_start(phydev).  This tells the PAL that you are ready, and configures the
PHY to connect to the network. If the MAC interrupt of your network driver
also handles PHY status changes, just set phydev->irq to PHY_MAC_INTERRUPT
before you call phy_start and use phy_mac_interrupt() from the network
driver. If you don't want to use interrupts, set phydev->irq to PHY_POLL.
phy_start() enables the PHY interrupts (if applicable) and starts the
phylib state machine.

When you want to disconnect from the network (even if just briefly), you call
phy_stop(phydev). This function also stops the phylib state machine and
disables PHY interrupts.

PHY interface modes
===================

The PHY interface mode supplied in the phy_connect() family of functions
defines the initial operating mode of the PHY interface.  This is not
guaranteed to remain constant; there are PHYs which dynamically change
their interface mode without software interaction depending on the
negotiation results.

Some of the interface modes are described below:

``PHY_INTERFACE_MODE_SMII``
    This is serial MII, clocked at 125MHz, supporting 100M and 10M speeds.
    Some details can be found in
    https://opencores.org/ocsvn/smii/smii/trunk/doc/SMII.pdf

``PHY_INTERFACE_MODE_1000BASEX``
    This defines the 1000BASE-X single-lane serdes link as defined by the
    802.3 standard section 36.  The link operates at a fixed bit rate of
    1.25Gbaud using an 8B/10B encoding scheme, resulting in an underlying
    data rate of 1Gbps.  Embedded in the data stream is a 16-bit control
    word which is used to negotiate the duplex and pause modes with the
    remote end.  This does not include "up-clocked" variants such as 2.5Gbps
    speeds (see below).

``PHY_INTERFACE_MODE_2500BASEX``
    This defines a variant of 1000BASE-X which is clocked 2.5 times as fast
    as the 802.3 standard, giving a fixed bit rate of 3.125Gbaud.

``PHY_INTERFACE_MODE_SGMII``
    This is used for Cisco SGMII, which is a modification of 1000BASE-X
    as defined by the 802.3 standard.  The SGMII link consists of a single
    serdes lane running at a fixed bit rate of 1.25Gbaud with 8B/10B
    encoding.  The underlying data rate is 1Gbps, with the slower speeds of
    100Mbps and 10Mbps being achieved through replication of each data symbol.
    The 802.3 control word is re-purposed to send the negotiated speed and
    duplex information from the PHY to the MAC, and for the MAC to acknowledge
    receipt.  This does not include "up-clocked" variants such as 2.5Gbps
    speeds.

    Note: mismatched SGMII vs 1000BASE-X configuration on a link can
    successfully pass data in some circumstances, but the 16-bit control
    word will not be correctly interpreted, which may cause mismatches in
    duplex, pause or other settings.  This is dependent on the MAC and/or
    PHY behaviour.

``PHY_INTERFACE_MODE_5GBASER``
    This is the IEEE 802.3 Clause 129 defined 5GBASE-R protocol. It is
    identical to the 10GBASE-R protocol defined in Clause 49, with the
    exception that it operates at half the frequency. Please refer to the
    IEEE standard for the definition.

``PHY_INTERFACE_MODE_10GBASER``
    This is the IEEE 802.3 Clause 49 defined 10GBASE-R protocol used with
    various different mediums. Please refer to the IEEE standard for a
    definition of this.

    Note: 10GBASE-R is just one protocol that can be used with XFI and SFI.
    XFI and SFI permit multiple protocols over a single SERDES lane, and
    also define the electrical characteristics of the signals with a host
    compliance board plugged into the host XFP/SFP connector. Therefore,
    XFI and SFI are not PHY interface types in their own right.

``PHY_INTERFACE_MODE_10GKR``
    This is the IEEE 802.3 Clause 49 defined 10GBASE-R with Clause 73
    autonegotiation. Please refer to the IEEE standard for further
    information.

    Note: due to legacy usage, some 10GBASE-R usage incorrectly makes
    use of this definition.

``PHY_INTERFACE_MODE_25GBASER``
    This is the IEEE 802.3 PCS Clause 107 defined 25GBASE-R protocol.
    The PCS is identical to 10GBASE-R, i.e. 64B/66B encoded
    running 2.5 times as fast, giving a fixed bit rate of 25.78125 Gbaud.
    Please refer to the IEEE standard for further information.

``PHY_INTERFACE_MODE_100BASEX``
    This defines IEEE 802.3 Clause 24.  The link operates at a fixed bit
    rate of 125Mbaud using a 4B/5B encoding scheme, resulting in an underlying
    data rate of 100Mbps.
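
The relationship between the line rates and data rates quoted above is a
single multiplication by the coding efficiency. The helper below is a
userspace sketch with a hypothetical name; it works in kbaud and kbit/s so
that all the figures from this section stay exact integers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Payload rate in kbit/s for a line running at line_kbaud with a block
 * code that carries data_bits of payload in every code_bits on the wire.
 */
uint64_t ex_payload_kbps(uint64_t line_kbaud, unsigned int data_bits,
                         unsigned int code_bits)
{
    assert(code_bits != 0 && data_bits <= code_bits);
    return line_kbaud * data_bits / code_bits;
}
```

For example, 1.25Gbaud with 8B/10B coding gives 1,000,000 kbit/s, and
25.78125Gbaud with 64B/66B coding gives 25,000,000 kbit/s.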

Pause frames / flow control
===========================

The PHY does not participate directly in flow control/pause frames except by
making sure that the SUPPORTED_Pause and SUPPORTED_Asym_Pause bits are set in
MII_ADVERTISE to indicate to the link partner that the Ethernet MAC
controller supports flow control. Since flow control/pause frame generation
involves the Ethernet MAC driver, it is recommended that this driver take care
of properly indicating advertisement and support for such features by setting
the SUPPORTED_Pause and SUPPORTED_Asym_Pause bits accordingly. This can be done
either before or after phy_connect() and/or as a result of implementing the
ethtool::set_pauseparam feature.


Keeping Close Tabs on the PAL
=============================

It is possible that the PAL's built-in state machine needs a little help to
keep your network device and the PHY properly in sync.  If so, you can
register a helper function when connecting to the PHY, which will be called
every second before the state machine reacts to any changes.  To do this, you
need to manually call phy_attach() and phy_prepare_link(), and then call
phy_start_machine() with the second argument set to point to your special
handler.

Currently there are no examples of how to use this functionality, and testing
on it has been limited because the author does not have any drivers which use
it (they all use option 1).  So Caveat Emptor.

Doing it all yourself
=====================

There's a remote chance that the PAL's built-in state machine cannot track
the complex interactions between the PHY and your network device.  If this is
so, you can simply call phy_attach(), and not call phy_start_machine or
phy_prepare_link().  This will mean that phydev->state is entirely yours to
handle (phy_start and phy_stop toggle between some of the states, so you
might need to avoid them).

An effort has been made to make sure that useful functionality can be
accessed without the state-machine running, and most of these functions are
descended from functions which did not interact with a complex state-machine.
However, again, no effort has been made so far to test running without the
state machine, so tryer beware.

Here is a brief rundown of the functions::

 int phy_read(struct phy_device *phydev, u16 regnum);
 int phy_write(struct phy_device *phydev, u16 regnum, u16 val);

Simple read/write primitives.  They invoke the bus's read/write function
pointers.
::

 void phy_print_status(struct phy_device *phydev);

A convenience function to print out the PHY status neatly.
::

 void phy_request_interrupt(struct phy_device *phydev);

Requests the IRQ for the PHY interrupts.
::

 struct phy_device * phy_attach(struct net_device *dev, const char *phy_id,
                                phy_interface_t interface);

Attaches a network device to a particular PHY, binding the PHY to a generic
driver if none was found during bus initialization.
::

 int phy_start_aneg(struct phy_device *phydev);

Using variables inside the phydev structure, either configures advertising
and resets autonegotiation, or disables autonegotiation, and configures
forced settings.
::

 static inline int phy_read_status(struct phy_device *phydev);

Fills the phydev structure with up-to-date information about the current
settings in the PHY.
::

 int phy_ethtool_ksettings_set(struct phy_device *phydev,
                               const struct ethtool_link_ksettings *cmd);

Ethtool convenience functions.
::

 int phy_mii_ioctl(struct phy_device *phydev,
                   struct mii_ioctl_data *mii_data, int cmd);

The MII ioctl.  Note that this function will completely screw up the state
machine if you write registers like BMCR, BMSR, ADVERTISE, etc.  Best to
use this only to write registers which are not standard, and don't set off
a renegotiation.

PHY Device Drivers
==================

With the PHY Abstraction Layer, adding support for new PHYs is
quite easy. In some cases, no work is required at all! However,
many PHYs require a little hand-holding to get up-and-running.

Generic PHY driver
------------------

If the desired PHY doesn't have any errata, quirks, or special
features you want to support, then it may be best to not add
support, and let the PHY Abstraction Layer's Generic PHY Driver
do all of the work.

Writing a PHY driver
--------------------

If you do need to write a PHY driver, the first thing to do is
make sure it can be matched with an appropriate PHY device.
This is done during bus initialization by reading the device's
UID (stored in registers 2 and 3), then comparing it to each
driver's phy_id field by ANDing it with each driver's
phy_id_mask field.  Also, it needs a name.  Here's an example::

   static struct phy_driver dm9161_driver = {
         .phy_id         = 0x0181b880,
         .name           = "Davicom DM9161E",
         .phy_id_mask    = 0x0ffffff0,
         ...
   };
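
The masked comparison can be tried out in userspace with the sketch below.
The ``ex_`` function names are hypothetical, and placing register 2 in the
upper half of the UID and register 3 in the lower half is an assumption
about how the two 16-bit registers are combined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Build the 32-bit UID: register 2 is the high half, register 3 the low. */
uint32_t ex_phy_uid_from_regs(uint16_t reg2, uint16_t reg3)
{
    return ((uint32_t)reg2 << 16) | reg3;
}

/* A driver matches when the UID and its phy_id agree on every masked bit. */
bool ex_phy_uid_matches(uint32_t uid, uint32_t phy_id, uint32_t phy_id_mask)
{
    assert(phy_id_mask != 0);  /* an empty mask would match every PHY */
    return (uid & phy_id_mask) == (phy_id & phy_id_mask);
}
```

With the DM9161E values above, the mask 0x0ffffff0 ignores the lowest four
bits, so a device reporting 0x0181b881 still binds to the driver while one
reporting 0x0181b8a0 does not.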

Next, you need to specify what features (speed, duplex, autoneg,
etc) your PHY device and driver support.  Most PHYs support
PHY_BASIC_FEATURES, but you can look in include/mii.h for other
features.

Each driver consists of a number of function pointers, documented
in include/linux/phy.h under the phy_driver structure.

Of these, only config_aneg and read_status are required to be
assigned by the driver code.  The rest are optional.  Also, it is
preferred to use the generic phy driver's versions of these two
functions if at all possible: genphy_read_status and
genphy_config_aneg.  If this is not possible, it is likely that
you only need to perform some actions before and after invoking
these functions, and so your functions will wrap the generic
ones.
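
The wrapping pattern looks roughly like the userspace model below. Every name
in it is hypothetical: ``ex_generic_config_aneg()`` merely stands in for
genphy_config_aneg(), and the counters exist only so the call order can be
observed. It is a sketch of the structure, not a working PHY driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical PHY state; the counters only record what was called. */
struct ex_phy_dev {
    int quirk_steps;
    int generic_calls;
};

/* Stand-in for genphy_config_aneg(); the real one programs the PHY. */
int ex_generic_config_aneg(struct ex_phy_dev *phy)
{
    assert(phy != NULL);
    phy->generic_calls++;
    return 0;
}

/* Hypothetical vendor-specific step that must run before autoneg setup. */
static int ex_vendor_quirk(struct ex_phy_dev *phy)
{
    phy->quirk_steps++;
    return 0;
}

/* Driver callback: do the extra work, then defer to the generic code. */
int ex_vendor_config_aneg(struct ex_phy_dev *phy)
{
    int err;

    assert(phy != NULL);
    err = ex_vendor_quirk(phy);
    if (err)
        return err;
    return ex_generic_config_aneg(phy);
}

/* Hypothetical, much reduced counterpart of struct phy_driver. */
struct ex_phy_driver {
    const char *name;
    int (*config_aneg)(struct ex_phy_dev *phy);
};

const struct ex_phy_driver ex_vendor_driver = {
    .name        = "Example PHY",
    .config_aneg = ex_vendor_config_aneg,
};
```

Keeping the vendor-specific part small and delegating the rest means fixes
to the generic code reach the driver without further changes.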

Feel free to look at the Marvell, Cicada, and Davicom drivers in
drivers/net/phy/ for examples (the lxt and qsemi drivers have
not been tested as of this writing).

The PHY's MMD register accesses are handled by the PAL framework
by default, but can be overridden by a specific PHY driver if
required. This could be the case if a PHY was released for
manufacturing before the MMD PHY register definitions were
standardized by the IEEE. Most modern PHYs will be able to use
the generic PAL framework for accessing the PHY's MMD registers.
An example of such usage is for Energy Efficient Ethernet support,
implemented in the PAL. This support uses the PAL to access MMD
registers for EEE query and configuration if the PHY supports
the IEEE standard access mechanisms, or can use the PHY's specific
access interfaces if overridden by the specific PHY driver. See
the Micrel driver in drivers/net/phy/ for an example of how this
can be implemented.

Board Fixups
============

Sometimes the specific interaction between the platform and the PHY requires
special handling, for instance to change where the PHY's clock input is,
or to add a delay to account for latency issues in the data path.  In order
to support such contingencies, the PHY Layer allows platform code to register
fixups to be run when the PHY is brought up (or subsequently reset).

When the PHY Layer brings up a PHY it checks to see if there are any fixups
registered for it, matching based on UID (contained in the PHY device's phy_id
field) and the bus identifier (contained in phydev->dev.bus_id).  Both must
match; however, two constants, PHY_ANY_ID and PHY_ANY_UID, are provided as
wildcards for the bus ID and UID, respectively.

When a match is found, the PHY layer will invoke the run function associated
with the fixup.  This function is passed a pointer to the phy_device of
interest.  It should therefore only operate on that PHY.

The platform code can either register the fixup using phy_register_fixup()::

        int phy_register_fixup(const char *phy_id,
                u32 phy_uid, u32 phy_uid_mask,
                int (*run)(struct phy_device *));

Or using one of the two stubs, phy_register_fixup_for_uid() and
phy_register_fixup_for_id()::

 int phy_register_fixup_for_uid(u32 phy_uid, u32 phy_uid_mask,
                int (*run)(struct phy_device *));
 int phy_register_fixup_for_id(const char *phy_id,
                int (*run)(struct phy_device *));

The stubs set one of the two matching criteria, and set the other one to
match anything.
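
The two-part match with wildcards can be modeled in userspace as follows.
This is a sketch under stated assumptions: ``struct ex_fixup`` is a
hypothetical reduction of a registered fixup, and the two ``EX_ANY_*`` values
are illustrative stand-ins for PHY_ANY_ID and PHY_ANY_UID, not their actual
definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Illustrative wildcard values standing in for PHY_ANY_ID / PHY_ANY_UID. */
#define EX_ANY_ID   "MATCH ANY PHY"
#define EX_ANY_UID  0xffffffffu

/* Hypothetical record of one registered fixup (run callback omitted). */
struct ex_fixup {
    const char *bus_id;
    uint32_t uid;
    uint32_t uid_mask;
};

/* Both criteria must match, unless the fixup holds a wildcard for one. */
bool ex_fixup_applies(const struct ex_fixup *f, const char *bus_id,
                      uint32_t uid)
{
    bool id_ok, uid_ok;

    assert(f != NULL && bus_id != NULL);
    id_ok = strcmp(f->bus_id, EX_ANY_ID) == 0 ||
            strcmp(f->bus_id, bus_id) == 0;
    uid_ok = f->uid == EX_ANY_UID ||
             ((uid ^ f->uid) & f->uid_mask) == 0;
    return id_ok && uid_ok;
}
```

In this model a fixup registered by UID carries the bus wildcard and applies
on every bus, while one registered by bus id carries the UID wildcard and
applies to whatever PHY sits at that address.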

When phy_register_fixup() or \*_for_uid()/\*_for_id() is called at module load
time, the module needs to unregister the fixup and free allocated memory when
it's unloaded.

Call one of the following functions before unloading the module::

 int phy_unregister_fixup(const char *phy_id, u32 phy_uid, u32 phy_uid_mask);
 int phy_unregister_fixup_for_uid(u32 phy_uid, u32 phy_uid_mask);
 int phy_unregister_fixup_for_id(const char *phy_id);

Standards
=========

IEEE Standard 802.3: CSMA/CD Access Method and Physical Layer Specifications, Section Two:
http://standards.ieee.org/getieee802/download/802.3-2008_section2.pdf

RGMII v1.3:
http://web.archive.org/web/20160303212629/http://www.hp.com/rnd/pdfs/RGMIIv1_3.pdf

RGMII v2.0:
http://web.archive.org/web/20160303171328/http://www.hp.com/rnd/pdfs/RGMIIv2_0_final_hp.pdf