Design Article
Optimize Energy Efficient Ethernet (IEEE 802.3az) performance in bundled links
Pedro Reviriego, Juan Antonio Maestro
8/10/2011 10:37 AM EDT
The IEEE 802.3az (Energy Efficient Ethernet) standard has recently been approved and will provide significant energy savings as it is deployed in the coming years. Several recent studies have also addressed reducing the energy consumption of the bundled links commonly used in Ethernet. However, they do not consider the use of Energy Efficient Ethernet (EEE). This paper presents a technique that optimizes the use of Energy Efficient Ethernet in bundled links. The technique improves responsiveness compared to previous approaches, so that frame delay and loss are minimized. It can also be implemented by each link endpoint independently, which makes it easy to deploy. Finally, energy efficiency is significantly improved when traffic is asymmetric.
The consumption of energy in computer networks is a topic of growing interest for the computer networking community. A number of efforts are now underway to achieve significant energy savings in computer networks. One example is the Energy Efficient Ethernet standard (IEEE 802.3az), which is predicted to save over 4 TWh/year [1]. The Energy Efficient Ethernet (EEE) standard [2] specifies methods to reduce energy consumption in Ethernet devices by defining low power modes. The idea is that a transceiver that has no frames to transmit can be put into a low power mode. When new frames arrive, it goes back into the active mode very quickly (within a few microseconds). This enables energy savings that are almost transparent to upper protocol layers. The energy savings achieved on a given link are directly related to the time it spends in the low power mode. This time in turn depends on the traffic load, and it has been shown that, due to mode transition overheads, the savings are greatly reduced for loads above a few percent [3]. This means that EEE provides limited savings when the links operate at loads larger than 10%.
In Ethernet it is common to bundle links to provide larger capacity. This has motivated a number of studies to optimize the energy consumption of bundled Ethernet links [4],[5],[6]. In those methods, the number of active links in a bundle is dynamically adapted to the traffic load, so that significant energy can be saved when the bundle is lightly loaded. However, those works do not consider the use of the Energy Efficient Ethernet standard. This paper presents a technique that optimizes the use of Energy Efficient Ethernet in bundled links.
Bundled Links in Ethernet: Link Aggregation
Link aggregation in Ethernet was standardized in IEEE 802.3ad and later transferred to the IEEE 802.1 working group and renamed IEEE 802.1AX [7]. It enables the bundling of multiple Ethernet links into a single logical link known as a Link Aggregation Group (LAG). The link aggregation defined in IEEE 802.3ad is done above the MAC layer, where entire frames are sent to the MAC layer of one of the aggregated links. Link aggregation provides several advantages [8]. For example, the aggregated link capacity is the sum of the capacities of the links. This enables increases in link capacity that are smaller than the usual 10x step between Ethernet technologies. It can also be used to provide larger capacity than that of the latest physical layer (PHY) technology. Another interesting feature of link aggregation is that it increases link availability, as the aggregated link is unavailable only if all of its links are unavailable.
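The capacity and availability arguments above can be illustrated with a small calculation. This is only a sketch: the function names are hypothetical, the 1% per-link unavailability is an invented example value, and link failures are assumed to be independent, which the article does not state.

```python
from math import prod


def lag_capacity_gbps(link_capacities_gbps):
    """Aggregate capacity is the sum of the member link capacities."""
    return sum(link_capacities_gbps)


def lag_unavailability(link_unavailabilities):
    """Probability that every member link is down at the same time.

    Assumes independent link failures (an assumption of this sketch).
    """
    return prod(link_unavailabilities)


# Four 1 Gb/s links give 4 Gb/s, a step between 1 Gb/s and 10 Gb/s.
capacity = lag_capacity_gbps([1, 1, 1, 1])
# Hypothetical 1% unavailability per link.
unavailability = lag_unavailability([0.01] * 4)
```

Under these assumptions the bundle is unavailable far less often than any single member link, since all four links must fail together.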
With arbitrary frame distribution, however, link aggregation can cause problems such as frame reordering [8]. This can occur, for example, when a long frame that arrived first is transmitted over one link and a short frame that arrived shortly afterwards is transmitted over a different link, so that the short frame is received first. Frame reordering can cause problems for higher layer protocols, and the IEEE 802.3ad standard was therefore designed to avoid it. Frame reordering is an issue only for frames that belong to the same conversation. Therefore, algorithms that distribute frames across the links while ensuring that all frames of a conversation are sent over the same link will preserve frame order [8]. These algorithms guarantee that those frames are not reordered, but they limit the maximum bandwidth that a single conversation can achieve to that of one link.
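A minimal sketch of an order-preserving distribution function follows. It assumes a conversation is identified by the pair of source and destination MAC addresses (one of several possible definitions) and uses a CRC-32 hash to pick the link. The function names and the choice of hash are illustrative assumptions, not something prescribed by IEEE 802.3ad.

```python
import zlib


def conversation_key(src_mac: str, dst_mac: str) -> bytes:
    """Build a canonical key for a conversation (hypothetical definition)."""
    return f"{src_mac.lower()}>{dst_mac.lower()}".encode()


def select_link(src_mac: str, dst_mac: str, num_links: int) -> int:
    """Return the index of the link that carries this conversation.

    The same conversation always maps to the same link, so its frames
    cannot overtake each other on different links.
    """
    if num_links < 1:
        raise ValueError("num_links must be at least 1")
    return zlib.crc32(conversation_key(src_mac, dst_mac)) % num_links
```

Because the mapping depends only on the conversation key, every frame of a conversation follows the same link, at the cost of capping that conversation at the capacity of one link.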
Figure 1 shows the block diagram of the link aggregation sublayer as defined in the IEEE 802.3ad standard. The Frame Distribution block is in charge of assigning frames to links for transmission and must ensure that all frames from a conversation are transmitted over the same link. The Frame Collection block receives frames from the different ports and passes them to the MAC client. The Frame Collection block only needs to ensure that frames from the same port are passed in order to the MAC client. This design makes the Frame Collection block independent of the algorithm used in the Frame Distribution block and avoids buffering and reordering in the collector. Fragmentation and reassembly are also avoided in this design.
Other alternatives could be used for link aggregation. For example, in [9] it is proposed that links be aggregated at the physical layer. Frames are fragmented into segments that are distributed among the different links for transmission. At reception, the fragments are reassembled to reconstruct the original frame. This strategy has been widely used in WAN aggregation. In this case, fragments of a conversation are transmitted over all the links, and therefore a single conversation can use all the capacity of the aggregated link. However, this type of design complicates the frame distribution and collection processes and was not selected for the IEEE 802.3ad standard.
To make efficient use of the capacity of the aggregated link, distribution algorithms try to assign conversations to links so that the load on each link is similar. This enables high link utilization when the number of conversations is large. Conversations can be defined at different levels depending on the network topology and traffic. Annex A of the standard discusses different options, for example using the Ethernet source or destination address to identify a conversation. In some cases it is more convenient to use higher layer protocol information to define a conversation. For example, a conversation could be defined to be a TCP connection. The standard also provides mechanisms to reassign conversations to a different link while preserving frame order. This is useful to balance the load of the links dynamically.
As mentioned before, energy efficiency in link aggregation has received interest recently, because significant energy is wasted when, for example, the traffic load is low and all the aggregated links are active. This is because the energy consumption of physical layer devices (PHYs) is today roughly independent of the traffic load once the link is active. In [4] it is proposed that the number of active links be adapted to the traffic load, so that only a few of the links are kept active when there is little traffic. This idea is further studied in [5][6], where some experimental results are presented. Transitions between the active and standby modes of the links require coordination between the link partners using the Link Aggregation Control Protocol (LACP). This means that transitions take a relatively long time, on the order of hundreds of milliseconds, because the link also has to be reestablished, which itself requires a significant amount of time [10]. Additionally, links are managed as a whole: both directions of a link are either active or idle, and it is not possible to have a link active in one direction and idle in the other. This approach can significantly reduce energy consumption in aggregated links, but it does not include coordination with the new Energy Efficient Ethernet standard developed by IEEE 802.3.
Energy Efficient Ethernet in Bundled Links
Once EEE is adopted, it would seem that enabling EEE on each of the aggregated links would ensure good energy efficiency when link utilization is low, as the links would be in low power mode most of the time. This would make schemes that adapt the number of active links to the traffic load unnecessary. However, as discussed before, EEE provides significant savings only when the load is very low. For example, if the links in a bundle are each loaded at 15%, EEE will provide limited savings, whereas significant savings could be achieved if the traffic were concentrated on only some of the links of the bundle. Therefore, even when EEE is adopted, it may be useful to study additional energy efficiency policies for aggregated links that operate at those loads.
From the previous discussion, it becomes apparent that, even when EEE is adopted, achieving optimal energy efficiency in a link aggregation requires concentrating the traffic load on as few active links as possible, because EEE energy efficiency is poor for lightly loaded links. To this end, an algorithm is proposed in the following. One of the advantages of our proposal is that it can put under-utilized links into and out of a low power mode much more quickly than the method described in [4].
In our proposal, all links in the LAG are enabled for EEE. For the links that we select to be in low power mode, simply no conversations are assigned to them. This means that no frames will be transmitted on those links. For most EEE PHYs and MACs the state of link directions can be managed independently meaning that one direction can be in low power mode while the other is in active mode. This simplifies the selection of which links should be in standby mode as the decision can be made locally with no coordination required with the other end to ensure that the links in low power mode are the same.
Two possible configurations are illustrated in Figure 2. It should be noted that the configuration in Figure 2 (left) would not be possible in the method described in [4]. In fact for that configuration additional energy savings can be achieved as there is transmission only in one direction on each link, so there is no need to perform echo and near end crosstalk cancellation in the receivers [11]. This can provide significant energy savings for the proposed scheme when it is implemented in conjunction with a smart management of the PHYs.
The local frame distribution block monitors the traffic load and decides the number of links that have conversations assigned. If traffic decreases, the conversations can be reassigned to fewer links to reduce energy consumption and conversely if traffic increases more links can be used for the existing conversations. Those reassignments can be done in the same way that conversations are dynamically reassigned to balance load in existing implementations. The process is transparent to the collector at the other end. The block diagram of the algorithm is shown in Figure 3 and is similar to the one proposed in [4]. This is the algorithm considered in the rest of the paper. Note that reassigning the conversations is all that is needed in the proposed method to reduce the number of active links. Those with no conversations assigned will automatically enter the EEE low power mode and stay in it until the load increases. At that point some conversations would be assigned to the link and frame transmission can start immediately by entering the EEE active mode.
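The reassignment step described above can be sketched as follows. The sketch assumes conversations are identified by opaque string IDs and are mapped onto the first `active_links` links with a CRC-32 hash. Both choices, and all names, are hypothetical. Frame order preservation during a reassignment is not modeled here.

```python
import zlib


def assign_conversations(conversation_ids, active_links):
    """Map every conversation to one of the first `active_links` links."""
    if active_links < 1:
        raise ValueError("at least one link must stay active")
    return {
        conv: zlib.crc32(conv.encode()) % active_links
        for conv in conversation_ids
    }


def links_without_conversations(assignment, total_links):
    """Links with no conversations assigned carry no frames, so an EEE
    transmitter on them can remain in low power mode."""
    used = set(assignment.values())
    return [link for link in range(total_links) if link not in used]


conversations = [f"conv-{i}" for i in range(100)]
# Traffic dropped: concentrate all conversations on two of four links.
assignment = assign_conversations(conversations, active_links=2)
idle = links_without_conversations(assignment, total_links=4)
```

Nothing is signalled to the link partner in this sketch: the collector at the other end simply stops receiving frames on the links listed in `idle`.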

The main parameter of the algorithm is the load threshold that determines a change in the number of links. Real implementations should define a hysteresis mechanism to avoid frequent changes, for example by using two different thresholds for increasing and decreasing the number of active links. However, as the objective of this paper is only to evaluate whether this approach offers significant potential benefits, a single threshold is sufficient. If the threshold is large, for example a load approaching 100% of the link capacity, the latency of the link would increase due to large queue occupancy, and so would the probability of frames being discarded due to queue overflow. If the threshold is small, links will be lightly loaded, reducing energy efficiency. Therefore, a compromise value of 80% of the link capacity has been selected to illustrate the approach.
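As an illustration of the threshold rule, the sketch below computes the number of active links for a given load with the single 80% threshold, and adds a two-threshold variant of the kind suggested for real implementations. The function names and the 60% lower threshold are assumptions made for this example. The algorithm of Figure 3 may differ in its details.

```python
import math


def links_needed(load_gbps, link_capacity_gbps, total_links, threshold=0.8):
    """Smallest number of links that keeps each one at or below the threshold."""
    usable_per_link = threshold * link_capacity_gbps
    needed = math.ceil(load_gbps / usable_per_link) if load_gbps > 0 else 1
    return max(1, min(total_links, needed))


def update_active_links(current, load_gbps, link_capacity_gbps, total_links,
                        up_threshold=0.8, down_threshold=0.6):
    """One hysteresis step: add a link above `up_threshold`, remove one only
    if the remaining links would stay below `down_threshold` (hypothetical)."""
    per_link_load = load_gbps / (current * link_capacity_gbps)
    if per_link_load > up_threshold and current < total_links:
        return current + 1
    if current > 1:
        load_if_reduced = load_gbps / ((current - 1) * link_capacity_gbps)
        if load_if_reduced < down_threshold:
            return current - 1
    return current
```

For a bundle of four 10 Gb/s links, `links_needed(20.0, 10.0, 4)` yields three links, since two links at 80% carry only 16 Gb/s. With the gap between the two thresholds, a load hovering around a boundary does not cause the link count to change on every measurement.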
The load among the links that are active (have conversations assigned to them) is balanced as in the traditional frame distribution schemes. This ensures that conversations receive a similar service irrespective of the link they are assigned to.
The main benefits of the proposed approach compared to existing techniques are:
* It effectively combines Energy Efficient Ethernet with Link Aggregation, resulting in most cases in larger energy savings than those obtained by using EEE alone.
* The proposed scheme can be implemented independently of the link partner. This means that a manufacturer that implements this algorithm will achieve some energy savings even if the other end does not implement the algorithm.
* Link directions are managed independently. This enables configurations in which a different number of links are active in each direction. This is useful for links that have an asymmetric load such as when one direction carries mostly data frames and the other acknowledgements. In those cases, using a reduced number of links in the acknowledgement direction can save significant energy. In fact, energy optimization techniques based on asymmetric operation of Ethernet links have recently been proposed in other studies [12].
* The changes in the number of links are much faster because a) all decisions are taken locally, with no need to exchange information with the other end, and b) the transitions from low power mode to active mode in EEE are very fast. This improves the responsiveness of the system when the load changes, avoiding frame delay or loss.
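The independent management of link directions listed above can be illustrated with a short sketch in which each endpoint sizes only its own transmit direction. The loads, the function name and the use of the 80% threshold rule are assumptions made for this example.

```python
import math


def tx_links_needed(tx_load_gbps, link_capacity_gbps, total_links, threshold=0.8):
    """Links an endpoint keeps active in its own transmit direction."""
    usable_per_link = threshold * link_capacity_gbps
    needed = math.ceil(tx_load_gbps / usable_per_link) if tx_load_gbps > 0 else 1
    return max(1, min(total_links, needed))


# Hypothetical asymmetric traffic on a bundle of four 10 Gb/s links:
# endpoint A sends 20 Gb/s of data, endpoint B returns 1 Gb/s of acknowledgements.
a_to_b_links = tx_links_needed(20.0, 10.0, 4)
b_to_a_links = tx_links_needed(1.0, 10.0, 4)
```

Each endpoint reaches its decision from local measurements only. In this example A keeps three transmit directions active while B keeps one, a configuration that is not available when both directions of a link must be switched together.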
To evaluate the performance of this algorithm, it is compared with a) existing link aggregation implementations on which EEE is enabled and b) the method proposed in [4]. In a), each of the links that compose the aggregation receives the same load and enters or exits the active state depending on whether it has frames to transmit. Therefore, the power consumption is the number of links in the aggregation times the power consumption of a link at that load level. In b), the power consumption depends directly on the number of active links, as in the proposed approach.
Performance Evaluation
To estimate the potential energy savings of the proposed algorithm, a number of simulation experiments have been performed. The simulations use frames of 1250 bytes. Other frame lengths have also been simulated, and the results are similar to those presented in the following. Frames are assumed to arrive following a Poisson process. As it is widely known that frame arrivals in LAN traffic deviate from a Poisson model, this provides only a rough approximation. However, for LAGs that aggregate traffic from many sources, the Poisson assumption can be valid for short time intervals, which are the time scales relevant for the proposed mechanism [13]. Link directions are assumed to be managed independently, that is, the MACs and PHYs may enter the low power mode asymmetrically. Finally, for the proposed method, power consumption in the low power mode is assumed to be 10% of the power in the active mode, and the power during state transitions is assumed to be the same as in the active state, following previous studies [3]. The durations of the state transitions are taken from the IEEE 802.3az standard [2]. For the method described in [4], the consumption of the links that are put in standby is assumed to be zero. This is because those links are effectively idle, as opposed to the low power mode in EEE, which keeps some elements active and schedules periodic refresh periods.
For the proposed algorithm and for the method described in [4], the load threshold that triggers the activation of a new link is 80%, as discussed before, and the simulations assume a steady state in which the load and the number of active links selected by the algorithm are fixed. The dynamic behaviour of the algorithm is left for future research, as our focus is to evaluate whether the potential savings of the proposed algorithm are enough to justify its implementation.
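The steady-state comparison described above can be approximated with a rough analytical sketch, which is not the simulation used in the article. Power is normalized to one active link. The per-link EEE model pessimistically assumes that every frame pays its own wake and sleep transition, and the transition times are illustrative placeholders that should be checked against IEEE 802.3az. All names are hypothetical.

```python
import math

LOW_POWER = 0.1  # low power mode at 10% of active power (from the text)


def eee_link_power(load_fraction, frame_us=1.0, wake_us=4.48, sleep_us=2.88):
    """Crude EEE link power: each frame is assumed to trigger its own
    wake/sleep pair, with transitions consuming active power.
    frame_us = 1.0 is a 1250-byte frame at 10 Gb/s; wake_us and sleep_us
    are placeholder values, not verified against the standard."""
    busy = min(1.0, load_fraction * (frame_us + wake_us + sleep_us) / frame_us)
    return busy + (1.0 - busy) * LOW_POWER


def links_needed(load_gbps, capacity_gbps, total_links, threshold=0.8):
    needed = math.ceil(load_gbps / (threshold * capacity_gbps)) if load_gbps > 0 else 1
    return max(1, min(total_links, needed))


def compare(load_gbps, capacity_gbps=10.0, total_links=4):
    """Normalized power of the four schemes for one direction of the bundle."""
    k = links_needed(load_gbps, capacity_gbps, total_links)
    return {
        "legacy": float(total_links),
        "eee_only": total_links
        * eee_link_power(load_gbps / (total_links * capacity_gbps)),
        "link_shutdown": float(k),  # method of [4]: standby links consume zero
        "proposed": k * eee_link_power(load_gbps / (k * capacity_gbps))
        + (total_links - k) * LOW_POWER,
    }
```

Calling `compare(6.0)` spreads 15% load over each link in the EEE-only case, which saturates this pessimistic model, while the proposed scheme carries the same traffic on one link and leaves three in low power mode. Because the model ignores frames that share a wake-up, it overstates EEE consumption and should be read only as a qualitative guide.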
Figure 4 shows the results in terms of energy consumption for different loads. In this case the aggregation is composed of four 10GBASE-T links (similar results are obtained for 100BASE-TX and 1000BASE-T links). The plots show the energy consumption for the proposed algorithm, traditional link aggregation with EEE, the method described in [4], and the legacy case in which no energy saving method is used. The results show that the proposed method and the method described in [4] can achieve significant energy savings over a wide range of loads. Both provide similar performance in terms of energy consumption. The main differences are that for low loads the proposed method benefits from EEE savings on the active link, while for larger loads the 10% consumption of the links that are in EEE low power mode becomes more relevant. EEE alone improves energy efficiency compared to legacy Ethernet, but to a lesser extent.




kinnar
8/11/2011 3:07 PM EDT
Good research paper; it also discusses the fundamentals of Link Aggregation very well. The proposed technique really consumes very little power in light load conditions.
Sanjib.Acharya
8/28/2011 12:39 PM EDT
How was the energy saving figure of 4 TWh/year from adopting Energy Efficient Ethernet estimated? Does this estimate include new installations only, or does it also consider upgrading existing installed systems?
If this happens, it would be great!