PHY link topology

Overview

The PHY link topology representation in the networking stack aims at representing the hardware layout for any given Ethernet link.

From userspace’s point of view, an Ethernet interface is nothing but a struct net_device, which exposes configuration options through the legacy ioctls and the ethtool netlink commands. The base assumption when designing these configuration APIs was that the link looks something like:

+-----------------------+        +----------+      +--------------+
| Ethernet Controller / |        | Ethernet |      | Connector /  |
|       MAC             | ------ |   PHY    | ---- |    Port      | ---... to LP
+-----------------------+        +----------+      +--------------+
struct net_device               struct phy_device

Commands that need to configure the PHY go through the net_device.phydev field to reach the PHY and perform the relevant configuration.

This assumption falls apart in more complex topologies that can arise when, for example, using SFP transceivers (although that’s not the only specific case).

Here, we have two basic scenarios. In the first scenario, the MAC is able to output a serialized interface, such as SGMII, 1000BaseX or 10GBaseR, that can be fed directly to an SFP cage.

The link topology then looks like this (when an SFP module is inserted):

+-----+  SGMII  +------------+
| MAC | ------- | SFP Module |
+-----+         +------------+

Since some modules embed a PHY, the actual link looks more like:

+-----+  SGMII   +--------------+
| MAC | -------- | PHY (on SFP) |
+-----+          +--------------+

In this case, the SFP PHY is handled by phylib, and registered by phylink through its SFP upstream ops.

In the second scenario, the Ethernet controller isn’t able to output a serialized interface, so it can’t be connected directly to an SFP cage. However, some PHYs can be used as media converters, translating the non-serialized MAC MII interface to a serialized MII interface fed to the SFP:

+-----+  RGMII  +-----------------------+  SGMII  +--------------+
| MAC | ------- | PHY (media converter) | ------- | PHY (on SFP) |
+-----+         +-----------------------+         +--------------+

This is where the model of having a single net_device.phydev pointer shows its limitations, as we now have 2 PHYs on the link.

The phy_link topology framework aims at providing a way to keep track of every PHY on the link, for use by kernel drivers and subsystems, and to report the topology to userspace so that individual PHYs can be targeted in configuration commands.
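The media-converter case can be illustrated with a small userspace model. This is only a sketch and not the kernel's data structures: the toy_phy type, the toy_upstream enum and both helper functions are hypothetical names invented for this example. Each PHY records what sits directly upstream of it (the MAC or another PHY), which is enough to tell the two PHYs on the link apart and to find the one furthest from the MAC.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: what sits directly upstream of a PHY in this model. */
enum toy_upstream { TOY_UPSTREAM_MAC, TOY_UPSTREAM_PHY };

/* Hypothetical stand-in for a PHY on the link. */
struct toy_phy {
	const char *name;
	enum toy_upstream upstream_type;
	/* Only meaningful when upstream_type == TOY_UPSTREAM_PHY. */
	const struct toy_phy *upstream_phy;
};

/* Number of PHYs between the MAC and this PHY, this PHY included. */
static size_t toy_phy_depth(const struct toy_phy *phy)
{
	size_t depth = 1;

	while (phy->upstream_type == TOY_UPSTREAM_PHY) {
		assert(phy->upstream_phy != NULL);
		phy = phy->upstream_phy;
		depth++;
	}
	return depth;
}

/* Return the PHY furthest from the MAC, or NULL if there are no PHYs. */
static const struct toy_phy *toy_outermost_phy(const struct toy_phy *phys,
					       size_t n)
{
	const struct toy_phy *best = NULL;
	size_t best_depth = 0;

	for (size_t i = 0; i < n; i++) {
		size_t depth = toy_phy_depth(&phys[i]);

		if (depth > best_depth) {
			best_depth = depth;
			best = &phys[i];
		}
	}
	return best;
}
```

With a media converter attached to the MAC and an SFP PHY attached to the media converter, toy_phy_depth() yields 1 and 2 respectively, and toy_outermost_phy() returns the SFP PHY. A single phydev pointer could only name one of the two.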

API

The struct phy_link_topology is a per-netdevice resource that gets initialized at netdevice creation. Once it’s initialized, PHYs can be registered to the topology through:

phy_link_topo_add_phy()

Besides registering the PHY to the topology, this call will also assign a unique index to the PHY, which can then be reported to userspace to refer to this PHY (akin to the ifindex). This index is a u32, ranging from 1 to U32_MAX. The value 0 is reserved to indicate the PHY doesn’t belong to any topology yet.
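The index rules described above (indices start at 1, and 0 means "not in any topology yet") can be sketched as a userspace toy. It is not the kernel implementation: toy_topology, toy_phy, the toy_topo_*() helpers and the fixed-size table are all hypothetical, and resetting the index to 0 on removal is a choice made for this model only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_PHYS 8	/* hypothetical capacity, for this sketch only */

/* Hypothetical stand-in for struct phy_device: only the index matters. */
struct toy_phy {
	uint32_t phyindex;	/* 0 means "not part of any topology yet" */
};

/* Hypothetical stand-in for struct phy_link_topology. */
struct toy_topology {
	struct toy_phy *phys[TOY_MAX_PHYS];
	uint32_t next_index;	/* next index to hand out, never 0 */
};

static void toy_topo_init(struct toy_topology *topo)
{
	for (size_t i = 0; i < TOY_MAX_PHYS; i++)
		topo->phys[i] = NULL;
	topo->next_index = 1;
}

/* Register a PHY and give it a non-zero index. Returns 0 or -1. */
static int toy_topo_add_phy(struct toy_topology *topo, struct toy_phy *phy)
{
	assert(topo != NULL && phy != NULL);

	if (phy->phyindex != 0)
		return -1;	/* already registered */
	if (topo->next_index == 0)
		return -1;	/* index space exhausted in this toy model */

	for (size_t i = 0; i < TOY_MAX_PHYS; i++) {
		if (topo->phys[i] == NULL) {
			topo->phys[i] = phy;
			phy->phyindex = topo->next_index++;
			return 0;
		}
	}
	return -1;		/* toy table is full */
}

/* Remove a PHY; this model marks it unregistered by resetting the index. */
static void toy_topo_del_phy(struct toy_topology *topo, struct toy_phy *phy)
{
	for (size_t i = 0; i < TOY_MAX_PHYS; i++) {
		if (topo->phys[i] == phy) {
			topo->phys[i] = NULL;
			phy->phyindex = 0;
			return;
		}
	}
}

/* Look a PHY up by index; index 0 never matches anything. */
static struct toy_phy *toy_topo_get_phy(const struct toy_topology *topo,
					uint32_t phyindex)
{
	if (phyindex == 0)
		return NULL;
	for (size_t i = 0; i < TOY_MAX_PHYS; i++) {
		if (topo->phys[i] && topo->phys[i]->phyindex == phyindex)
			return topo->phys[i];
	}
	return NULL;
}
```

In this model the first PHY added to a freshly initialized topology gets index 1 and the second gets index 2. An index is not reused after removal, so a stale index held by a caller simply fails to resolve.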

The PHY can then be removed from the topology through

phy_link_topo_del_phy()

These functions are already hooked into the phylib subsystem, so all PHYs that are linked to a net_device through phy_attach_direct() will automatically join the netdev’s topology.

PHYs that are on an SFP module will also be automatically registered, but only if the SFP upstream is phylink (that is, there is no media converter).

PHY drivers that can be used as an SFP upstream need to call phy_sfp_connect_phy() and phy_sfp_disconnect_phy(), which can be used as the .connect_phy / .disconnect_phy implementation for the struct sfp_upstream_ops.

UAPI

A set of netlink commands exists to query the link topology from userspace, see Documentation/networking/ethtool-netlink.rst.

The whole point of having a topology representation is to assign the phyindex field in struct phy_device. This index is reported to userspace using the ETHTOOL_MSG_PHY_GET ethnl command. Performing a DUMP operation lists all PHYs from every net_device. The DUMP command accepts either an ETHTOOL_A_HEADER_DEV_INDEX or an ETHTOOL_A_HEADER_DEV_NAME attribute in the request, to filter the DUMP to a single net_device.
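The DUMP filtering semantics can be pictured with the following userspace sketch. It does not speak netlink and is not how ethtool is implemented: toy_phy_entry and toy_phy_dump() are hypothetical names, and the model assumes that a device index of 0 or a NULL device name means "no filter". It also assumes PHY indices are only unique within one net_device's topology, so two interfaces can each have a PHY with index 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record: one PHY as it might be listed by a dump. */
struct toy_phy_entry {
	int ifindex;		/* ifindex of the owning net_device */
	const char *ifname;	/* name of the owning net_device */
	uint32_t phyindex;	/* PHY index within that device's topology */
};

/*
 * Copy pointers to the matching entries into 'out' (at most out_cap of
 * them) and return how many were stored. dev_index == 0 and
 * dev_name == NULL both mean "do not filter on this field".
 */
static size_t toy_phy_dump(const struct toy_phy_entry *all, size_t n,
			   int dev_index, const char *dev_name,
			   const struct toy_phy_entry **out, size_t out_cap)
{
	size_t count = 0;

	assert(out != NULL || out_cap == 0);

	for (size_t i = 0; i < n && count < out_cap; i++) {
		if (dev_index != 0 && all[i].ifindex != dev_index)
			continue;
		if (dev_name != NULL && strcmp(all[i].ifname, dev_name) != 0)
			continue;
		out[count++] = &all[i];
	}
	return count;
}
```

Given one PHY on a device with ifindex 2 and two PHYs on a device with ifindex 3, an unfiltered call stores all three entries, while filtering on device index 3 stores only the two PHYs of that device.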

The retrieved index can then be passed as a request parameter using the ETHTOOL_A_HEADER_PHY_INDEX field in the following ethnl commands:

  • ETHTOOL_MSG_STRSET_GET to get the stats string set from a given PHY

  • ETHTOOL_MSG_CABLE_TEST_ACT and ETHTOOL_MSG_CABLE_TEST_TDR_ACT, to perform cable testing on a given PHY on the link (most likely the outermost PHY)

  • ETHTOOL_MSG_PSE_SET and ETHTOOL_MSG_PSE_GET for PHY-controlled PoE and PSE settings

  • ETHTOOL_MSG_PLCA_GET_CFG, ETHTOOL_MSG_PLCA_SET_CFG and ETHTOOL_MSG_PLCA_GET_STATUS to get and set the PLCA (Physical Layer Collision Avoidance) parameters and to read the PLCA status

Note that the PHY index can be passed to other requests, which will silently ignore it if present and irrelevant.

© Copyright The kernel development community.