Understanding Skew in 100GBASE-R4 applications
Tim Warland, AppliedMicro
12/14/2011 3:35 PM EST
The 100GBASE-R4 physical layer device converts 10 lanes running at 10Gbps (CAUI) to 4 lanes running at 25Gbps. The conversion process is data agnostic, with no provision for rate adaptation; consequently, skew management is an integral part of end-to-end system performance.
The recently ratified 100 Gigabit Ethernet standard (IEEE 802.3ba) defines the 100GBASE-LR4 and 100GBASE-ER4 interfaces for optical links over single-mode fiber. The architecture includes the Physical Coding Sublayer (PCS), connected on one side to the Reconciliation Sublayer (RS) and MAC, and on the other side to the PMA/PMD. Actual implementations are more likely to combine the PCS and PMA with the RS and MAC, and to interface to a second PMA and PMD (referred to as the PHY).
PCS Generates Alignment Markers
For 100GBASE-R4 implementations, the PCS is responsible for inserting lane alignment markers in the transmit direction and detecting lane alignment markers and aligning recovered data in the receive direction. The alignment process ensures properly formatted data for the MAC. Skew accumulation occurs downstream from the PCS and it is the responsibility of the receive PCS to remove skew and re-align the receive data to the RS.
In the transmit process, the PCS demultiplexes data from the RS, generating 20 logical lanes of data (at 5Gbps each), which are scrambled to ensure sufficient transition density for clock and data recovery (CDR). An alignment marker is inserted into each of the 20 logical lanes after every 16,383 66-bit blocks. The alignment markers are 66-bit blocks that uniquely identify each of the 20 logical lanes.
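The lane distribution and marker insertion described above can be sketched in a few lines of Python. This is a simplified illustration, not the encoding defined by the standard: blocks are represented as hypothetical `("AM", lane)` and `("DATA", payload)` tuples rather than real 66-bit words, scrambling is omitted, and starting each lane with a marker is an assumption made for clarity.

```python
NUM_LANES = 20       # PCS logical lanes, per the article
AM_SPACING = 16383   # data blocks between markers on a lane, per the article


def distribute(blocks, num_lanes=NUM_LANES, am_spacing=AM_SPACING):
    """Deal blocks round-robin onto lanes and insert a lane-unique
    marker on each lane after every am_spacing data blocks."""
    # Assumption: every lane begins with its own marker.
    lanes = [[("AM", lane)] for lane in range(num_lanes)]
    since_marker = [0] * num_lanes
    for i, block in enumerate(blocks):
        lane = i % num_lanes
        if since_marker[lane] == am_spacing:
            lanes[lane].append(("AM", lane))
            since_marker[lane] = 0
        lanes[lane].append(("DATA", block))
        since_marker[lane] += 1
    return lanes


# Small demonstration: 3 lanes, a marker after every 2 data blocks.
demo = distribute(range(12), num_lanes=3, am_spacing=2)
print(demo[0])
```

Because every marker carries its lane identity, the receiver can later recognize each lane no matter which physical path it arrived on.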
PMA Performs Multiplexing
The 20-lane output from the PCS interfaces to the first PMA, which multiplexes the 20 logical lanes to generate the pin-efficient CAUI interface. CAUI defines 10 lanes running at 10.3125Gbps. These lanes contain multiplexed PCS data; consequently, a single lane carries no valid information on its own. For example, a single lane of the CAUI interface cannot be treated as an independent 10GE channel.
For 100GBASE-LR4 and 100GBASE-ER4 applications, a second instance of the PMA layer is implemented with the PMD in a device called a Gearbox, such as the S28010 from AppliedMicro. The PMA in the Gearbox is responsible for multiplexing the 10-lane CAUI interface into 4 lanes running at 25.78125Gbps. This function is typically implemented in a CFP module but can be implemented in a stand-alone Gearbox connected to a CFP2 module. Four electrical lanes operating at 25.78125Gbps from the Gearbox generate four optical lambdas centered on 1300nm for 100GBASE-LR4 and 100GBASE-ER4.
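A toy bit multiplexer shows why one CAUI lane is not meaningful by itself. The sketch below interleaves two logical lanes bit by bit into a single output stream. Which logical lanes are paired and the order of bits in the output are assumptions for illustration only.

```python
def bit_mux(lane_a, lane_b):
    """Interleave two equal-rate bit streams into one stream at twice the rate."""
    out = []
    for a, b in zip(lane_a, lane_b):
        out.extend((a, b))
    return out


def bit_demux(stream):
    """Split an interleaved stream back into its two constituent lanes."""
    return stream[0::2], stream[1::2]


muxed = bit_mux([1, 1, 0], [0, 1, 1])
print(muxed)             # alternating bits from the two lanes
print(bit_demux(muxed))  # the original lanes, recovered
```

The multiplexed stream alternates bits from two different PCS lanes, so it only becomes useful again after demultiplexing and alignment in the receive PCS.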
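The lane rates quoted in this article all follow from one aggregate rate divided by the lane count. As a quick check, assuming the aggregate is 100Gbps of payload carried in 66-bit blocks with 64 payload bits each:

```python
from fractions import Fraction

# 100 Gbps of payload expanded by the 66/64 block-coding overhead.
AGGREGATE_GBPS = Fraction(100) * Fraction(66, 64)


def per_lane_rate_gbps(num_lanes):
    """Signaling rate of each lane when the aggregate is split evenly."""
    return float(AGGREGATE_GBPS / num_lanes)


for name, lanes in (("PCS logical lanes", 20), ("CAUI", 10), ("Gearbox line side", 4)):
    print(f"{name}: {lanes} x {per_lane_rate_gbps(lanes)} Gbps")
```

Since no stage adds or removes bits, the aggregate rate is identical at every interface, which is why the conversion needs no rate adaptation.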
In the receive direction, the opposite process occurs. The optical signals are recovered and converted into four electrical lanes. The Gearbox converts the four lanes from the PMD to the 10-lane CAUI interface. Because of the alignment marker strategy, there is no requirement to map lanes from the far end to the near end. That is, lane 0 in the transmit direction does not have to connect to lane 0 on the receiver. The receiving PMA generates 20 logical lanes that are sent to the PCS, which detects the alignment markers and re-aligns the signal for the RS and MAC. Total skew in the system must be bounded such that lane alignment markers from one lane do not roll into the next group of lanes, since there is no way to detect this error (except at the MAC).
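The freedom to connect any transmit lane to any receive lane can be illustrated with a short sketch. Here each received lane is a list of hypothetical `("AM", lane)` and `("DATA", payload)` tuples, and the receiver uses the first marker it finds to decide where the lane belongs. Real markers are 66-bit patterns, and block lock must be achieved first; both details are omitted.

```python
def reorder_lanes(received):
    """Place each received lane at the logical index named by its marker."""
    ordered = [None] * len(received)
    for lane in received:
        # The first alignment marker seen on the lane identifies it.
        logical = next(block[1] for block in lane if block[0] == "AM")
        if ordered[logical] is not None:
            raise ValueError(f"duplicate marker for logical lane {logical}")
        ordered[logical] = lane
    return ordered


# Lanes arrive in scrambled physical order: 2, 0, 1.
received = [
    [("AM", 2), ("DATA", "c")],
    [("AM", 0), ("DATA", "a")],
    [("AM", 1), ("DATA", "b")],
]
print([lane[1][1] for lane in reorder_lanes(received)])
```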
Two Forms of Skew
There are two forms of skew: fixed skew and dynamic skew. Fixed skew is a constant lane-to-lane mismatch for a given implementation, while dynamic skew is the change in skew as a function of time.
Fixed Skew
Fixed or static skew represents the constant difference in arrival time for two signals generated from the same source. It is caused by physical lane-to-lane differences in the time a signal takes to reach its destination relative to the data on any other lane. Consider, for example, two lanes of the CAUI interface that have different trace lengths. Every two centimeters of trace-length mismatch represents approximately 1UI of propagation delay on the CAUI bus. If lane 0 is 3cm long and lane 1 is 5cm long, the skew from trace-length mismatch alone is 1UI: two bits launched from the transmitter at the same time arrive at the destination 1UI apart.
The PCS-based lane alignment process simplifies board design, since designers are able to concentrate on matching trace length within each differential pair, controlling impedance and minimizing interference instead of matching the length of all 20 wires in each direction.
Skew can also be generated in receive CDR circuits. Each receive CDR independently locks to its input waveform and recovers a local clock. Phase differences between these clocks are a source of skew. An internal FIFO is used to re-align all received data to a common clock. However, the FIFO does not correct skew.
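The trace-length example can be turned into a small calculator. The sketch below takes the article's rule of thumb (about 2cm of mismatch per UI on CAUI) as its default; the actual figure depends on the board material and stack-up, so `cm_per_ui` should be treated as an assumption to be replaced with a measured or simulated value.

```python
CAUI_RATE_GBPS = 10.3125   # per-lane CAUI rate, per the article
CM_PER_UI = 2.0            # article's rule of thumb; board-dependent assumption


def trace_skew_ui(len_a_cm, len_b_cm, cm_per_ui=CM_PER_UI):
    """Lane-to-lane skew, in unit intervals, from trace-length mismatch alone."""
    return abs(len_a_cm - len_b_cm) / cm_per_ui


def ui_to_ps(ui, rate_gbps=CAUI_RATE_GBPS):
    """Convert unit intervals to picoseconds at the given lane rate."""
    return ui * 1000.0 / rate_gbps


skew = trace_skew_ui(3.0, 5.0)
print(f"{skew} UI, about {ui_to_ps(skew):.1f} ps")
```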
The fiber-optic cable is another source of skew. Dispersion over a significant length of fiber is the dominant form of skew in most 100GBASE-R4 applications.
These sources of skew are fixed in a given system and do not change. Trace length, cable length and similar physical properties define the fixed skew for each implementation.
Dynamic Skew or Skew Variation
Skew variation, or dynamic skew, represents the change in skew as a function of time. It can be considered a form of lane-to-lane wander. These variations are a result of physical changes during operation, such as temperature increases or supply voltage fluctuations. For example, a CAUI transmitter that is heating asymmetrically (that is, one side is getting hotter faster than the other) may show a temporary increase in the rate of one of its transmitters relative to another. This variation would not be present once the temperature stabilizes.
Skew in 100Gbps Systems
Lane-to-lane skew, or the fixed delay between any two lanes, can be generated at several points in the 100GBASE-R4 system.
Transmit Sources
The sources of skew from the PCS to the fiber are outlined in Table 1. Lane-to-lane skew variation, or the amount of wander that exists between any two lanes, is also defined. Total (cumulative) skew generated within the PMA and PMD function is defined by the IEEE as follows (see note 1):
Table 1: Transmit Skew Generation Limits

Receive Functions
In the receive direction the opposite process is applied. Each point in the receiver functional chain is provided an allowance for cumulative skew associated with each lane. The IEEE defines these limits as follows:
Table 2: Receive Skew Generation Limits

Total end-to-end skew and skew variation are compensated at the receive PCS. The PCS first achieves block lock to find the lane alignment markers. The lane alignment markers are then used to re-order the recovered data and remove skew so that the 20 recovered logical lanes are re-aligned for the RS. The skew budget for the 100GBASE-R4 PCS receiver is shown in Table 3. FIFOs internal to the PCS buffer each lane to perform alignment to the Reconciliation Sublayer.
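The deskew step can be sketched as follows: once the same alignment marker has been located on every lane, each lane is delayed until it lines up with the latest arrival. The function and parameter names are hypothetical, and `max_skew_ns` stands in for the receiver's skew tolerance, which the caller must supply from the applicable budget.

```python
def deskew_delays(marker_arrival_ns, max_skew_ns):
    """Return the delay each lane's FIFO must add so that all lanes
    line up with the lane whose marker arrived last."""
    latest = max(marker_arrival_ns)
    skew = latest - min(marker_arrival_ns)
    if skew > max_skew_ns:
        raise ValueError(f"skew of {skew} ns exceeds tolerance of {max_skew_ns} ns")
    return [latest - t for t in marker_arrival_ns]


# Three lanes whose markers arrive at different times (illustrative values).
print(deskew_delays([10.0, 12.5, 11.0], max_skew_ns=5.0))
```

The lane that arrives last gets zero added delay, and every other lane is held back by its lead over that lane, so the total buffering grows with the worst-case skew.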
Table 3: Skew Tolerance Requirements

Skew in the Gearbox
The Gearbox is a bi-directional multiplexer/demultiplexer. In the transmit direction, the 10-lane CAUI interface is multiplexed to 4 lanes at 25.78125Gbps. In the receive direction, the 4-lane 25.78125Gbps input from the optics is demultiplexed to the 10-lane CAUI interface.
The bit multiplexing structure defined by the IEEE is not deterministic because it is not possible to know in advance the exact location of a particular input bit relative to all the other input bits in the output stream. However, once a bit position is assigned for bits from any given lane relative to the other bits, that bit position does not change until the device is reset. Any difference between the frequency of the input data and the output data would violate this rule. Therefore the inputs and outputs to the Gearbox are synchronous in either direction although the transmit and receive data paths do not have to be synchronous to each other.
It follows that any skew variation at the input could cause the multiplexer to drop or double-count a bit from a lane, which would violate the IEEE bit multiplexing requirements. The Gearbox is designed with internal FIFOs that compensate for skew variation, up to the maximum defined by the standard, to ensure compliance under worst-case skew variation. This has the added benefit of resetting skew variation to zero at the input to the bit multiplexer or demultiplexer.
Fixed skew in a well-designed Gearbox is very small, typically less than 3UI maximum (at 10Gbps). Trace length mismatches at 10Gbps and 25Gbps are expected to contribute approximately 5UI (at 10Gbps) for well-designed systems. Fiber dispersion is expected to be the dominant source of fixed skew.
Skew Affects Delay
Skew and skew variation have a direct effect on traffic delay. Both fixed skew and skew variation are compensated by FIFOs. Fixed skew is compensated with FIFOs in the receive PCS. For skew variation, the FIFOs in the Gearbox are loaded to the mid-range of the maximum possible skew variation, which allows the lanes to wander relative to each other up to the depth of the FIFO (corresponding to the values in Table 1 or Table 2). Filling each of the FIFOs to the mid-point creates a fixed delay associated with the signal chain.
Typical 100GBASE-R4 architectures will reset the skew variation in each of the PMA layers. The conversion from 20 logical lanes to the CAUI interface uses a FIFO-based implementation with sufficient depth to accommodate skew variation on the input, but the skew variation is not propagated across the interface, since this would cause the multiplexer to drop or add a bit position in extreme cases. Similarly, the Gearbox implements the second PMA to convert from CAUI to 4 lanes at 25Gbps. This conversion terminates skew variation prior to multiplexing (or demultiplexing in the receive direction).
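The trade-off between FIFO depth and delay can be made concrete with a rough model. The sketch below assumes a FIFO whose depth in bits equals the skew variation it must absorb (rounded up to an even number so the mid-point is an integer) and whose resting fill level is exactly half that depth. Real devices will differ in detail, and the input values are illustrative rather than taken from the standard.

```python
import math


def fifo_midpoint_delay(max_variation_ui, rate_gbps=10.3125):
    """Return (depth_bits, midpoint_bits, delay_ps) for a FIFO sized to
    absorb max_variation_ui of wander when loaded to its mid-point."""
    depth = math.ceil(max_variation_ui)
    if depth % 2:
        depth += 1                  # keep the mid-point an integer
    midpoint = depth // 2
    ui_ps = 1000.0 / rate_gbps      # one unit interval in picoseconds
    return depth, midpoint, midpoint * ui_ps


# A deeper FIFO tolerates more wander but adds more fixed delay.
print(fifo_midpoint_delay(8, rate_gbps=10.0))
print(fifo_midpoint_delay(3, rate_gbps=10.0))
```

In this model, halving the wander the FIFO has to absorb halves the fixed delay it adds, which is the motivation for keeping skew variation small by design.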
To minimize delay, both skew and skew variation must be constrained by design. A reasonable Gearbox implementation will be optimized to reduce these elements internally; however, the external elements are beyond the Gearbox's control, so the device must be designed with FIFOs of sufficient depth to accommodate the maximum permissible skew variation, which increases delay.
The S28010 Gearbox from AppliedMicro has a special feature that allows designers to minimize delay for well-engineered links. For designs that are engineered to minimize the amount of skew variation between end-points, the Gearbox can be configured with shallow FIFOs. The FIFOs must still be deep enough to accommodate the maximum skew variation in the link to avoid bit slipping. Reducing the depth of the Gearbox FIFOs reduces end-to-end delay for well-engineered links, which improves network latency.
Summary
The IEEE standard for 100GBASE-LR4 and 100GBASE-ER4 communications defines a lane alignment strategy at the PCS layers that simplifies system design. Both fixed and dynamic skew are compensated at the end points in the network. This simplifies board design by allowing designers to focus on matching trace length within a pair of differential signals instead of matching overall trace lengths for each bus. Similarly, Gearbox designs are simplified since they are not required to terminate the PCS to perform rate adaptation to compensate for skew. Finally, optical dispersion compensation is not required since there is no requirement to align the output data at the fiber.
There are two forms of skew: fixed (or static) skew and variable (or dynamic) skew. Fixed skew is associated with trace length mismatch and optical dispersion. It does not change for any given implementation. Variable skew is the amount a lane will wander relative to any other lane and is usually associated with changing conditions, such as thermal or voltage fluctuations.
The S28010 Gearbox from AppliedMicro is compliant with 100GBASE-LR4 and 100GBASE-ER4 systems. It is used to interface a 10-lane 10Gbps CAUI bus to a 4-lane 25.78125Gbps interface within a CFP module or on a line card with a CFP2 module. The Gearbox is architected to reduce end-to-end skew variation. Designers can take advantage of a feature in the Gearbox that reduces the depth of the skew compensation FIFOs, which decreases latency for well-designed systems.
About the Author
Tim Warland has been the product manager for PHY products at AppliedMicro since 2009. Prior to this, he was involved in a number of start-up companies, including Quake Technologies, where he introduced the industry to 10Gbps networks. Warland was recognized for invaluable contributions to the development of the IEEE Standard for 10GbE. He represents AppliedMicro at IEEE 802.3 and OIF meetings, and holds an MBA from the University of Ottawa.
Note 1: These definitions are reasonable approximations of the skew points defined by the IEEE.


