Advanced Ethernet Switching Technologies

doi:10.1007/978-981-19-3029-4_8

Huawei Technologies Co., Ltd.

Abstract

Switch networking is often designed with a network architecture of dual aggregation layers and dual core layers to avoid single point of failure (e.g., network outage due to equipment damage). In this case, physical loops are formed. Once loops are formed, broadcast storms and MAC address flapping are generated in the network. Spanning tree protocol can prevent the formation of loops by blocking switch interfaces.

Networks formed using switches can configure multiple links into a single logical link to achieve traffic load balance and link redundancy, so as to save equipment costs. This is the link aggregation technology.

In certain network topologies, Smart Link can replace Spanning Tree Protocol (STP). Smart Link is a Huawei proprietary protocol that achieves fast (millisecond-level) link switchover. Configuring Monitor Link on the upstream switch complements Smart Link.

8.1 Spanning Tree Protocol

8.1.1 Loop Problem of Switch Networking

As shown in Fig. 8.1, an enterprise forms a LAN with a Layer 2 architecture in which the access layer switches connect to one aggregation layer switch. If the aggregation layer switch fails, the two access layer switches will not be able to access each other, which is a single point of failure. For some enterprises and organizations, a long network interruption caused by equipment failure is unacceptable. To avoid a single point of failure at the aggregation layer, two aggregation layer switches are usually deployed, as shown in Fig. 8.2. When aggregation layer switch 1 fails, the two access layer switches can still communicate through aggregation layer switch 2.

In this way, the network formed by the switches will then form a loop. As shown in Fig. 8.2, if Computer PC3 in the network sends a broadcast frame, the switch floods when it receives the broadcast frame, so the broadcast frame will continue to be forwarded in the loop, thus occupying the bandwidth of the switch port and consuming the resources of the switch. Computers in the network will keep receiving the frame repeatedly, and are unable to receive frames for normal communication. This is called a broadcast storm.

In a switched network that contains a loop, the MAC address tables of the switches also flap rapidly, as shown in Fig. 8.2. At Time ①, port GE0/0/1 of access layer switch 2 receives a broadcast frame from PC3 and adds an entry mapping MAC3 to port GE0/0/1 to the MAC address table. The broadcast frame is sent out from GE0/0/3 and GE0/0/2 of access layer switch 2. At Time ②, GE0/0/2 of access layer switch 2 receives the broadcast frame from aggregation layer switch 2 and changes the port corresponding to MAC3 in the MAC address table to GE0/0/2. At Time ③, port GE0/0/3 of access layer switch 2 receives the broadcast frame from aggregation layer switch 1 and changes the port corresponding to MAC3 in the MAC address table to GE0/0/3. In this way, the entry for the MAC address of PC3 in the MAC address table of access layer switch 2 changes endlessly and rapidly, which is MAC address flapping. Similarly, the MAC address tables of access layer switch 1 and aggregation layer switches 1 and 2 also flap rapidly. This rapid flapping consumes a lot of the switch's processing resources and may even cause the switch to break down.
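The flapping sequence can be reproduced with a toy model. The following is a minimal sketch, assuming a MAC table that simply rebinds a source address to the latest ingress port; the helper name `learn` and the choice of three laps around the loop are illustrative and not taken from the figure.

```python
# Toy model of MAC address flapping on access layer switch 2.
# Port names and arrival order follow the description of Fig. 8.2;
# the helper `learn` and the number of laps are assumptions.

mac_table = {}   # MAC address -> port on which it was last seen
history = []     # every (mac, port) binding written to the table


def learn(mac, port):
    """Bind the source MAC of a received frame to its ingress port."""
    if mac_table.get(mac) != port:
        mac_table[mac] = port
        history.append((mac, port))


# Time 1: the original broadcast from PC3 arrives on GE0/0/1.
learn("MAC3", "GE0/0/1")

# Times 2 and 3 repeat for as long as the frame circulates in the loop;
# three laps are simulated here.
for _ in range(3):
    learn("MAC3", "GE0/0/2")  # copy returned by aggregation layer switch 2
    learn("MAC3", "GE0/0/3")  # copy returned by aggregation layer switch 1

flaps = len(history) - 1      # how many times the MAC3 entry changed port
print(mac_table["MAC3"], flaps)
```

In a real loop the frame never leaves the network, so the number of rebinds grows without bound until the loop is broken.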

Switches therefore need an effective way to eliminate loops. They use the Spanning Tree Protocol for this purpose, which breaks loops by blocking ports.

8.1.2 Overview of the Spanning Tree Protocol

Spanning Tree Protocol can be applied to the establishment of tree topologies in computer networks, and its main function is to prevent redundant links from forming loops in switch networks. Spanning Tree Protocol is suitable for all vendors’ network devices, which vary in configuration from vendor to vendor, but are consistent in principle and application effect.

By passing Bridge Protocol Data Units (BPDUs) among switches, the Spanning Tree algorithm elects a root bridge, root ports, and designated ports to ultimately form a tree-structured network. The root ports and designated ports are in the forwarding state, while the other ports are blocked. If the network topology changes, the spanning tree is recalculated. Spanning Tree Protocol thus lets core and aggregation layer networks keep their redundant links while solving the “broadcast storm” and MAC address flapping problems caused by the physical loops that those links form.

Spanning Tree Protocol has the following three versions, and we can configure the version for Huawei switches, that is, specify the mode of Spanning Tree.

Spanning Tree Protocol: Spanning Tree Protocol (STP) here refers to a version of Spanning Tree Protocol, which is a data link layer protocol defined in IEEE 802.1D. If the switch runs the Spanning Tree Protocol in STP mode, all traffic will take the same path regardless of how many VLANs are in the switch.
Rapid Spanning Tree Protocol: in an STP network, if a switch is added or removed, or the bridge priority of a switch is changed, or a link fails, it is possible that the STP protocol will reselect the root bridge, reselect root ports for non-root bridges, and reselect the designated port for each link. Those ports that are in a blocking state may become forwarding ports. The process continues for tens of seconds (also known as convergence time), during which network disruptions may occur. To shorten the convergence time, IEEE 802.1w defines Rapid Spanning Tree Protocol (RSTP). This protocol has been improved a lot on the basis of STP to significantly reduce the convergence time to typically only a few seconds. STP is now rarely used in networks in reality and has been replaced by RSTP. One of the most important improvements of RSTP is that there are only three port states: discarding, learning and forwarding.
Multiple Spanning Tree Protocol: both STP and RSTP have the same defect, that is, all VLANs in the LAN share one spanning tree, and a link carries no traffic at all once it is blocked, resulting in wasted bandwidth. Multiple Spanning Tree Protocol (MSTP) is a spanning tree protocol defined in IEEE 802.1s. MSTP introduces the concepts of “Instance” and “Region”. An “instance” is a collection of multiple VLANs, and bundling multiple VLANs into a single instance saves communication cost and resource usage. The topology of each MSTP instance is calculated independently, and load balance can be achieved across these instances. When the protocol is used, multiple VLANs with the same topology can be mapped to one instance, and the forwarding state of these VLANs on a port depends on the forwarding state of the corresponding instance in MSTP.

Huawei switch Spanning Tree Protocol uses MSTP mode by default, and this book will demonstrate how to change it to RSTP mode. Before illustrating the Spanning Tree Protocol, we also need to understand four basic terms, namely, bridge, bridge MAC address, bridge ID (BID), and port ID (PID).

1. Bridge

Due to performance limitations and other factors, early switches generally had only two forwarding ports (with more ports, forwarding would have been too slow to be practical), so such a switch was often called a “network bridge”, or “bridge” for short. In IEEE terminology, the term “bridge” is still used today, but it no longer refers specifically to switches with only two forwarding ports; it refers to switches with any number of ports. At present, “bridge” and “switch” are used interchangeably, and this book does the same.

2. Bridge MAC address

A bridge has multiple forwarding ports, and each port has a MAC address. Usually, we take the MAC address of the port with the smallest port number as the MAC address of the whole bridge.

3. Bridge ID

As shown in Fig. 8.3, the bridge ID of a bridge (switch) consists of two parts. The first two bytes are the bridge priority, and the next six bytes are the bridge MAC address. The value of the bridge priority can be set manually, and the default value is 32,768.

4. Port ID

There are various ways to define the port ID of a port of a bridge (switch), two of which are given in Fig. 8.4. In the first definition, the port ID consists of two bytes, the first byte being the port priority of the port, and the second byte the port number. In the second definition, the port ID consists of 16 bits, the first four bits being the port priority of the port and the next 12 bits the port number. The value of the port priority can be set manually. The PID definition method used by different equipment vendors may vary. The PID of Huawei switches uses the first definition.
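The byte layouts described above can be made concrete with a short sketch. The function names below are invented for illustration; the two MAC addresses are the ones this chapter uses for Switch A and Switch B. Because the priority occupies the leading bytes, comparing the packed values as byte strings compares priority first and MAC address second.

```python
import struct


def bridge_id(priority, mac):
    """Pack a bridge ID: 2-byte bridge priority, then the 6-byte bridge MAC."""
    mac_bytes = bytes.fromhex(mac.replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError("bridge MAC must be 6 bytes")
    return struct.pack(">H", priority) + mac_bytes


def port_id_8_8(priority, number):
    """First definition: 1-byte port priority, 1-byte port number."""
    return struct.pack(">BB", priority, number)


def port_id_4_12(priority, number):
    """Second definition: 4-bit port priority, 12-bit port number."""
    if not (0 <= priority < 16 and 0 <= number < 4096):
        raise ValueError("priority needs 4 bits, port number needs 12 bits")
    return struct.pack(">H", (priority << 12) | number)


bid_a = bridge_id(32768, "4c1f-ccc4-3dad")   # Switch A, default priority
bid_b = bridge_id(32768, "4c1f-cc82-6053")   # Switch B, default priority

# Same priority, so the smaller MAC decides: Switch B has the smaller BID.
print(bid_b < bid_a)
# Lowering the priority of Switch A outweighs its larger MAC address.
print(bridge_id(4096, "4c1f-ccc4-3dad") < bid_b)
```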

8.1.3 Basic Concepts and Working Principles of the Spanning Tree Protocol

The basic principle of Spanning Tree Protocol is that in a switch network with physical loops, the switch automatically generates a network topology without loops by running the STP protocol.

The task of STP is to find all links in the network and close all redundant links to prevent network loops. To this end, STP first needs to elect a root bridge (root switch), which is responsible for deciding the network topology. Once all switches agree to elect a switch as the root bridge, the remaining switches must select a unique root port. STP must also select a designated port for ports at both ends of each link connecting two switches (a network cable is a link), and the ports that are neither the root nor designated ports become the alternate ports, which do not forward frames of computer communication, thus preventing loops.

Next, the network topology shown in Fig. 8.5 is taken as an example to explain the working process of Spanning Tree Protocol. It is divided into four steps: electing the root bridge; selecting the root port (RP) for non-root bridges; selecting a designated port (DP) for the ports at both ends of each link; and blocking alternate ports (AP).

1. Elect the root bridge

The root bridge is the root node of the STP tree. To generate an STP tree, a root bridge must first be identified. The root bridge is the logical center of the entire switch network, but not necessarily its physical center. When the network topology changes, the root bridge may also change.

Switches running the STP protocol (STP switches for short) exchange STP protocol frames with each other, and the payload of these protocol frames is called a bridge protocol data unit (BPDU). Although a BPDU is the payload of an STP protocol frame, it is not a network layer data unit; BPDUs are generated, received, and processed by the STP switches themselves rather than by end computers. A BPDU contains all the information related to the STP protocol, and the BID is one such item.

After the STP switches are first started, they all consider themselves as the root bridge and declare themselves as the root bridge in the BPDUs sent to other switches. When a switch receives BPDUs from other devices in the network, it compares the BID of the root bridge specified in the BPDU with its own BID. Switches continuously exchange BPDUs with each other while comparing BIDs until finally electing a switch with the smallest BID as the root bridge.

The network shown in Fig. 8.5 has five switches, A, B, C, D and E. The one with the smallest BID will be elected as the root bridge.

By default, BPDUs are sent every 2 s. In this example, Switch A and Switch B have the same priority, and the MAC address of Switch B is 4c1f-cc82-6053, which is smaller than that of Switch A, 4c1f-ccc4-3dad, so Switch B is more likely to be elected as the root bridge. In addition, you can specify the preferred switch to become the root bridge and alternate switches by changing the priority of switches. Usually, we designate in advance the switch with better performance and closer to the network center as the root bridge. In this example, it is clearly the optimal choice to make Switch B the preferred switch for the root bridge and Switch A the alternate switch.
2. Select the root port

Once the root bridge is determined, any other switches that do not become the root bridge are referred to as non-root bridges. A non-root bridge may have more than one port connected to the network. In order to ensure that the working path from a non-root bridge to the root bridge is optimal and unique, it is necessary to identify a “root port” from the ports of non-root bridges, and the root port functions as the port for message interaction between non-root bridges and the root bridge.

The first criterion for the election of the root port is the root path cost (RPC), which is used by the STP protocol as an important basis for determining the root port. The smaller the RPC, the more likely the port is to be selected. When the RPCs are the same, the BIDs of the uplink switches are compared, that is, the sender BIDs carried in the BPDUs received on each port, and the port whose uplink switch has the smaller BID is preferred. When the BIDs of the uplink switches are also the same, the PIDs of the sending ports on the uplink switch are compared, and the local port facing the smaller uplink PID is preferred. Only if these are equal as well are the PIDs of the local ports compared, and the port with the smaller PID is selected. There can be at most one root port on a non-root bridge.

In a network running the STP protocol, the cumulative path cost from a switch’s port to the root bridge (i.e., the sum of the path costs of all the links from that port to the root bridge) is referred to as the root path cost (RPC) of that port. The path cost of a link is related to the port bandwidth, and the larger the port bandwidth, the smaller the path cost. The correspondence between port bandwidth and path cost can be found in Table 8.1.

In Fig. 8.5, after identifying Switch B as the root bridge, and Switch A, C, D and E the non-root bridges, each non-root bridge has to choose the port closest to the root bridge (with the least cumulative cost) as the root port. Port G1 of Switch A and port F0 of Switch C, D and E in Fig. 8.5 become the root ports of these switches.

As shown in Fig. 8.6, S1 is the root bridge. Assuming that the costs of Path 1 and Path 2 are the same, then S4 will compare the bridge IDs of uplink devices S2 and S3. If S2’s bridge ID is smaller than S3’s, S4 will identify its G0/0/1 as its root port; if S3’s bridge ID is smaller than S2’s, S4 will identify its G0/0/2 as its root port.

For S5, assuming that the RPC of its port GE0/0/1 is the same as that of port GE0/0/2, since the uplink device of both ports is S4, S5 will also compare the PIDs of S4’s ports GE0/0/3 and GE0/0/4. If the PID of S4’s port GE0/0/3 is smaller than that of GE0/0/4, S5 will identify its GE0/0/1 as the root port. And if the PID of S4’s port GE0/0/4 is smaller than that of GE0/0/3, then S5 will specify its GE0/0/2 as the root port.
3. Select the designated port

The root port ensures a unique and optimal working path between the switch and the root bridge. To prevent working loops, a designated port should also be determined for the ports connected to both ends of the network cable connecting the switch. The designated port is also determined by comparing RPCs, and the port with smaller RPC becomes the designated port; if their RPCs are the same, then BIDs are compared; if the BIDs are the same, PIDs of the devices are then compared, etc.; the one with smaller value becomes the designated port.

As shown in Fig. 8.7, assume that S1 has been elected as the root bridge and that the costs of each link are equal. Obviously, the RPC of S3’s port GE0/0/1 is smaller than that of S3’s port GE0/0/2, so S3 identifies its port GE0/0/1 as its own root port. Similarly, the RPC of S2’s port GE0/0/1 is smaller than that of S2’s port GE0/0/2, so S2 identifies its port GE0/0/1 as its own root port.

For the network segment between S3’s GE0/0/2 and S2’s GE0/0/2, the RPC of S3’s GE0/0/2 port is equal to that of S2’s GE0/0/2 port, so it is necessary to compare S3’s BID with S2’s BID. Assuming that S2’s BID is smaller than S3’s BID, then S2’s GE0/0/2 port will be determined as the designated port for the link between S3’s GE0/0/2 and S2’s GE0/0/2.

For the network segment LAN, if LAN is a network formed with a hub, the hub is equivalent to network cables and does not participate in spanning tree. The only switch connected to LAN is S2. In this case, it is necessary to compare the PID of S2’s port GE0/0/3 with the PID of port GE0/0/4. Assuming that the PID of port GE0/0/3 is smaller than that of port GE0/0/4, S2’s port GE0/0/3 will be determined as the designated port of the network segment LAN.

In the network shown in Fig. 8.5, since the connection bandwidth between Switch A and B is 1000 Mbit/s, then ports F1, F2, and F3 of Switch A have smaller RPCs than port F1 of Switch C, D and E. Therefore, ports F1, F2, and F3 of Switch A become designated ports. All ports of the root bridge are designated ports, and Switch E’s ports F2, F3, and F4 connected to the computer are designated ports.
4. Block alternate ports

After determining the root port and the designated port, the remaining ports are the non-designated ports and non-root ports, which are collectively referred to as alternate ports. STP will logically block these alternate ports. Logical block means that these alternate ports are unable to forward frames generated and sent by the end computer, which are also known as user data frames. However, alternate ports can receive and process STP protocol frames, while the root and designated ports can both send and receive STP protocol frames and forward user data frames.

As shown in Figs. 8.5 and 8.7, once alternate ports are logically blocked, the STP tree (loop-free working topology) generation process is complete.
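The four steps can be sketched in a few lines of code. The following is a simplified model, not an implementation of the protocol: it computes the result centrally instead of exchanging BPDUs, and the topology is a hypothetical triangle modelled on Fig. 8.7. The BIDs, port numbers, port priority and uniform link cost are all assumptions chosen for illustration. Root ports are chosen by comparing, in order, RPC, uplink BID, uplink PID and local PID.

```python
# Hypothetical triangle modelled on Fig. 8.7. BIDs, port numbers and the
# uniform link cost are assumptions for illustration only.
BIDS = {
    "S1": (4096, "4c1f-cc00-0001"),    # (bridge priority, bridge MAC)
    "S2": (32768, "4c1f-cc00-0002"),
    "S3": (32768, "4c1f-cc00-0003"),
}
LINKS = [                               # ((switch, port), (switch, port))
    (("S1", 1), ("S2", 1)),
    (("S1", 2), ("S3", 1)),
    (("S2", 2), ("S3", 2)),
]
LINK_COST = 20000
PORT_PRIORITY = 128


def pid(port):
    """Port ID as a comparable (priority, port number) pair."""
    return (PORT_PRIORITY, port)


def spanning_tree(bids, links, cost):
    # Step 1: the switch with the smallest BID becomes the root bridge.
    root = min(bids, key=bids.get)

    # Root path cost of every switch, by repeated relaxation over the links.
    rpc = {sw: float("inf") for sw in bids}
    rpc[root] = 0
    for _ in bids:
        for (a, _pa), (b, _pb) in links:
            rpc[a] = min(rpc[a], rpc[b] + cost)
            rpc[b] = min(rpc[b], rpc[a] + cost)

    roles = {}
    # Step 2: each non-root bridge picks exactly one root port.
    for sw in bids:
        if sw == root:
            continue
        candidates = []
        for end_a, end_b in links:
            for (local, lport), (peer, pport) in ((end_a, end_b), (end_b, end_a)):
                if local == sw:
                    candidates.append(
                        (rpc[peer] + cost, bids[peer], pid(pport), pid(lport)))
        roles[(sw, min(candidates)[3][1])] = "ROOT"

    # Steps 3 and 4: one designated port per link; what remains is alternate.
    for link in links:
        winner = min(link, key=lambda end: (rpc[end[0]], bids[end[0]], pid(end[1])))
        roles[winner] = "DESI"
        for end in link:
            roles.setdefault(end, "ALTE")
    return root, roles


root, roles = spanning_tree(BIDS, LINKS, LINK_COST)
print(root)
for port, role in sorted(roles.items()):
    print(port, role)
```

With these assumed values, S1 is the root bridge, port 1 on S2 and on S3 become root ports, and on the S2-S3 link the port on S2 wins the designated role because S2 has the smaller BID, leaving port 2 of S3 as the blocked alternate port.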

Table 8.1 Correspondence between port bandwidth and path cost

8.1.4 STP Message Types

The basic principle of STP: the topology of the network is determined by passing a special protocol message, the bridge protocol data unit (BPDU), between switches. STP protocol frames use the IEEE 802.3 encapsulation format, and their load data is called BPDUs. STP switches build and maintain STP trees by exchanging STP protocol frames, and rebuild new STP trees when the physical topology of the network changes. The STP protocol frames are generated, sent, received, and processed by the STP switch. STP protocol frames are a type of multicast frames whose multicast address is 01-80-c2-00-00-00.

There are two types of BPDUs: configuration BPDU and TCN (Topology Change Notification) BPDU. The former is used to calculate the loop-free spanning tree, and the latter is used to shorten the refresh time of MAC table entries (from the default 300 s to 15 s) when the Layer 2 network topology changes.
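To make the two BPDU types more tangible, the sketch below packs each into bytes. The field order, the flag bit values (TC as the lowest bit, TCA as the highest), the type codes and the 1/256 s timer unit are taken from IEEE 802.1D rather than from this chapter, so treat them as assumptions to verify against the standard. The function names are illustrative. The sample values reuse the bridge ID, port ID and timers shown for S1 in this chapter.

```python
import struct

# Configuration BPDU layout assumed from IEEE 802.1D (35 bytes in total):
# protocol ID, version, BPDU type, flags, root ID, root path cost,
# bridge ID, port ID, message age, max age, hello time, forward delay.
CONFIG_BPDU = struct.Struct(">HBBB8sI8sHHHHH")

TYPE_CONFIG, TYPE_TCN = 0x00, 0x80
TC_FLAG, TCA_FLAG = 0x01, 0x80


def ticks(seconds):
    """Timers are carried in units of 1/256 s."""
    return int(seconds * 256)


def build_config_bpdu(root_id, root_path_cost, bridge_id, port_id, flags=0,
                      message_age=0, max_age=20, hello=2, forward_delay=15):
    return CONFIG_BPDU.pack(0, 0, TYPE_CONFIG, flags, root_id, root_path_cost,
                            bridge_id, port_id, ticks(message_age),
                            ticks(max_age), ticks(hello), ticks(forward_delay))


def build_tcn_bpdu():
    """A TCN BPDU carries only protocol ID, version and type."""
    return struct.pack(">HBB", 0, 0, TYPE_TCN)


s1_bid = struct.pack(">H", 32768) + bytes.fromhex("4c1fcc826053")
bpdu = build_config_bpdu(s1_bid, 0, s1_bid, 0x8001, flags=TC_FLAG)
fields = CONFIG_BPDU.unpack(bpdu)
print(len(bpdu), len(build_tcn_bpdu()), bool(fields[3] & TC_FLAG))
```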

During the initial formation of the STP tree, each STP switch actively generates and sends configuration BPDUs at a regular interval (2 s by default). When the STP tree is formed and stabilized, only the root bridge actively generates and sends configuration BPDUs at a regular interval (2 s by default, which is called Hello Time and can be modified on the root switch). Accordingly, the non-root switch periodically receives configuration BPDUs from its own root port and is immediately triggered to generate its own configuration BPDUs, which are sent out from its own designated port. In the process, it seems that the configuration BPDUs sent by the root bridge “pass through” the other switches hop by hop.

If a link in the network fails and the working topology changes, the switch at the point of failure can sense the change directly through the port state, but the other switches cannot. The switch at the point of failure therefore keeps sending TCN BPDUs to its upstream switch through its root port, once per Hello Time, until it receives an acknowledgment configuration BPDU from the upstream switch, that is, a configuration BPDU whose TCA (Topology Change Acknowledgment) flag is set to 1. After receiving the TCN BPDU, the upstream switch replies with the acknowledgment configuration BPDU through its designated port and, at the same time, keeps sending TCN BPDUs to its own upstream switch through its root port, once per Hello Time. This process is repeated until the root bridge receives the TCN BPDU. The root bridge then sends configuration BPDUs with the TC (Topology Change) flag set to 1 to notify all switches that the network topology has changed. Figure 8.8 illustrates this process.

After receiving a configuration BPDU with the TC flag set to 1, a switch knows that the network topology has changed and that the content of its own MAC address table is probably no longer correct. The switch then shortens the aging time of its MAC address table (300 s by default) to the length of the Forward Delay (15 s by default) to accelerate the aging of the existing entries.
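The notification chain can be modelled as a walk up the tree followed by a flood down the tree. This is a minimal sketch under simplifying assumptions: every TCN is acknowledged on the first attempt, so the Hello Time retransmissions are not modelled, and the class, method names and four-switch tree are invented for illustration.

```python
DEFAULT_AGING = 300   # default MAC address aging time, in seconds
FORWARD_DELAY = 15    # default Forward Delay, in seconds


class Switch:
    def __init__(self, name, upstream=None):
        self.name = name
        self.upstream = upstream        # neighbour reached through the root port
        self.downstream = []            # neighbours reached through designated ports
        self.aging_time = DEFAULT_AGING
        if upstream is not None:
            upstream.downstream.append(self)

    def send_tcn(self, log):
        """Report a topology change towards the root bridge."""
        if self.upstream is None:       # this switch is the root bridge
            log.append(f"{self.name}: sends configuration BPDU with TC=1")
            self.receive_tc()
            return
        log.append(f"{self.name} -> {self.upstream.name}: TCN BPDU")
        log.append(f"{self.upstream.name} -> {self.name}: configuration BPDU with TCA=1")
        self.upstream.send_tcn(log)

    def receive_tc(self):
        """Shorten MAC aging and relay the TC flag to downstream switches."""
        self.aging_time = FORWARD_DELAY
        for switch in self.downstream:
            switch.receive_tc()


# Hypothetical tree: S1 is the root; S2 and S4 hang off S1; S3 hangs off S2.
s1 = Switch("S1")
s2 = Switch("S2", upstream=s1)
s3 = Switch("S3", upstream=s2)
s4 = Switch("S4", upstream=s1)

log = []
s3.send_tcn(log)                        # S3 detects a link failure
print("\n".join(log))
print([sw.aging_time for sw in (s1, s2, s3, s4)])
```

Note that S4, which was not on the path of the TCN, still shortens its aging time because the root bridge advertises the TC flag to the whole tree.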

8.1.5 Port States of Spanning Tree

For a bridge or switch running STP, the port state will shift between the following five states.

1. Blocking: a blocked port cannot forward frames and can only listen for BPDUs. The blocking state is used to prevent the use of paths that would form loops. By default, all ports are in the blocking state when the switch is powered up.
2. Listening: the port listens for BPDUs to make sure that no loop will be created on the network before it transmits data frames. A port in the listening state prepares to forward data frames but does not yet build MAC address table entries.
3. Learning: the port continues to listen for BPDUs and learns the paths in the switch network. A port in the learning state builds MAC address table entries but cannot forward data frames. The forward delay is the time a port takes to move from the listening state to the learning state; it is 15 s by default and can be viewed by executing the display stp command.
4. Forwarding: a port in the forwarding state sends and receives all data frames. If the interface is still the designated port or the root port at the end of the learning state, it enters the forwarding state.
5. Disabled: an interface in the disabled state has been shut down administratively; it neither forwards frames nor participates in STP. In the disabled state, the port is essentially non-functional.

In most cases, switch ports are in the blocking or forwarding state. A forwarding port is the port with the lowest cost to the root bridge, but if the network topology changes (perhaps a link fails, or someone adds a new switch), the ports on the switch will be in the listening or learning state.

As mentioned earlier, blocking ports is a strategy to prevent network loops. Once the switch has decided on the optimal path to the root bridge, all other ports will be in the blocking state. The blocked ports can still receive BPDUs, but they cannot send any frames.
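The timing of the transitions can be summarized in a small sketch. It assumes classic STP mode, the default Forward Delay of 15 s, and a port that has just been selected as a root or designated port at t = 0; the function and table names are illustrative.

```python
FORWARD_DELAY = 15  # seconds, default value

# What a port may do in each state:
# (processes BPDUs, learns MAC addresses, forwards user data frames)
CAPABILITIES = {
    "DISABLED":   (False, False, False),
    "BLOCKING":   (True,  False, False),
    "LISTENING":  (True,  False, False),
    "LEARNING":   (True,  True,  False),
    "FORWARDING": (True,  True,  True),
}


def stp_port_state(seconds_since_selected, forward_delay=FORWARD_DELAY):
    """State of a port chosen as root or designated port at t = 0."""
    if seconds_since_selected < forward_delay:
        return "LISTENING"
    if seconds_since_selected < 2 * forward_delay:
        return "LEARNING"
    return "FORWARDING"


for t in (0, 14, 15, 29, 30):
    state = stp_port_state(t)
    print(t, state, CAPABILITIES[state])
```

The port therefore needs two Forward Delay periods, 30 s with default timers, before it forwards user data frames.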

8.1.6 View and Configure STP

Set up an enterprise LAN with three switches, S1, S2 and S3, and the network topology is shown in Fig. 8.9. The operations below will enable the following functions.

1. Enable STP.
2. Determine the root bridge.
3. Check the port states.
4. Configure the STP mode as RSTP.
5. Specify S2 as the root bridge and S1 as the alternate root bridge.

Display the spanning tree operation states on S1.

[S1]display stp    --Display the configuration of STP
-------[CIST Global Info][Mode MSTP]-------    --Global configuration, and the default STP mode is MSTP
CIST Bridge         :32768.4c1f-cc82-6053    --Bridge ID of Switch, and 32768 is the priority
Config Times        :Hello 2s MaxAge 20s FwDly 15s MaxHop 20
Active Times        :Hello 2s MaxAge 20s FwDly 15s MaxHop 20
CIST Root/ERPC      :32768.4c1f-cc82-6053 / 0    --Root bridge ID, and S1 is the root bridge
CIST RegRoot/IRPC   :32768.4c1f-cc82-6053 / 0
CIST RootPortId     :0.0
BPDU-Protection     :Disabled
TC or TCN received  :7
TC count per hello  :0
STP Converge Mode   :Normal
Time since last TC  :0 days 0h:3m:23s
Number of TC        :8
Last TC occurred    :GigabitEthernet0/0/1
----[Port1(GigabitEthernet0/0/1)][FORWARDING]----    --Port GigabitEthernet 0/0/1 is in forwarding state
 Port Protocol           :Enabled
 Port Role               :Designated Port    --Designated port
 Port Priority           :128    --Port priority, and the default value is 128
 Port Cost(Dot1T )       :Config=auto / Active=20000
 Designated Bridge/Port  :32768.4c1f-cc82-6053 / 128.1
 Port Edged              :Config=default / Active=disabled
 Point-to-point          :Config=auto / Active=true
 Transit Limit           :147 packets/hello-time
 Protection Type         :None
 Port STP Mode           :MSTP
 Port Protocol Type      :Config=auto / Active=dot1s
 BPDU Encapsulation      :Config=stp / Active=stp
 PortTimes               :Hello 2s MaxAge 20s FwDly 15s RemHop 20
 TC or TCN send          :1
 TC or TCN received      :0
 BPDU Sent               :96
          TCN: 0, Config: 0, RST: 0, MST: 96
 BPDU Received           :1
          TCN: 0, Config: 0, RST: 0, MST: 1
……

Enter “display stp brief” to display STP port state.

[S1]display stp brief
 MSTID  Port                    Role  STP State   Protection
   0    GigabitEthernet0/0/1    DESI  FORWARDING  NONE    --Designated port, forwarding state
   0    GigabitEthernet0/0/2    DESI  FORWARDING  NONE    --Designated port, forwarding state
   0    GigabitEthernet0/0/3    DESI  FORWARDING  NONE    --Designated port, forwarding state

All ports on the root switch are designated ports, among which GigabitEthernet0/0/3 is connected to the computer and will also participate in the spanning tree protocol.

Note: ① if there is no loop between the switches, you can enter “stp disable” to disable the spanning tree protocol. The ports then enter the forwarding state shortly after the switch is powered on, without going through the spanning tree process.

[S1]stp disable

② Enter “stp enable” to enable spanning tree protocol, which is enabled by default on Huawei switches.

[S1]stp enable

The following commands can be used to view the STP modes supported by Huawei switches and configure the STP mode as RSTP.

[S1]stp mode ?    --View the STP modes supported
  mstp  Multiple Spanning Tree Protocol (MSTP) mode
  rstp  Rapid Spanning Tree Protocol (RSTP) mode
  stp   Spanning Tree Protocol (STP) mode
[S1]stp mode rstp    --Set the STP mode as RSTP

Although STP automatically elects the root bridge, usually, the network administrator will pre-designate the switch with better performance and closer to the network center as the root bridge. You can designate the root bridge and the alternate root bridge by changing the priority of the switches.

The following changes the priority of Switch S2 to make it a preferred choice for the root bridge, and changes the priority of S1 to make it the alternate root bridge.

[S2]stp priority ?    --View the value range of priority
  INTEGER<0-61440>  Bridge priority, in steps of 4096    --Value range of priority, which is multiples of 4096
[S2]stp priority 0    --Set the priority to 0
[S1]stp priority 4096    --Set the priority to 4096

You can also use the following command to set the priority of S2 to 0.

[S2]stp root primary

You can also use the following command to set the priority of S1 to 4096.

[S1]stp root secondary
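The effect of these priority settings on the election can be checked with a short sketch. The validation rule mirrors the value range printed by the switch (0 to 61440 in steps of 4096). The function names are illustrative, and the MAC address given to S3 is hypothetical because the chapter does not list it.

```python
ROOT_PRIMARY = 0       # priority set by "stp root primary"
ROOT_SECONDARY = 4096  # priority set by "stp root secondary"


def validate_bridge_priority(value):
    """Accept only 0..61440 in steps of 4096, as the CLI help indicates."""
    if not 0 <= value <= 61440 or value % 4096 != 0:
        raise ValueError(f"invalid bridge priority: {value}")
    return value


def elect_root(bridges):
    """Return the name of the bridge with the smallest (priority, MAC)."""
    return min(bridges, key=lambda name: bridges[name])


bridges = {
    "S1": (validate_bridge_priority(ROOT_SECONDARY), "4c1f-cc82-6053"),
    "S2": (validate_bridge_priority(ROOT_PRIMARY), "4c1f-ccc4-3dad"),
    "S3": (validate_bridge_priority(32768), "4c1f-cc00-0003"),  # MAC is hypothetical
}

print(elect_root(bridges))                       # S2 wins despite its larger MAC
without_s2 = {k: v for k, v in bridges.items() if k != "S2"}
print(elect_root(without_s2))                    # S1 takes over if S2 fails
```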

View the configuration information of STP on S2 and observe the mode of Spanning Tree Protocol, the root bridge ID and its priority.

[S2]display stp
-------[CIST Global Info][Mode RSTP]-------    --The STP mode is RSTP
CIST Bridge         :0    .4c1f-ccc4-3dad    --Root bridge ID, and the priority is 0
Config Times        :Hello 2s MaxAge 20s FwDly 15s MaxHop 20
Active Times        :Hello 2s MaxAge 20s FwDly 15s MaxHop 20
CIST Root/ERPC      :0    .4c1f-ccc4-3dad / 0
CIST RegRoot/IRPC   :0    .4c1f-ccc4-3dad / 0
…

View the STP brief on S3, from which you can see the role and state of the ports.

<S3>display stp brief
 MSTID  Port                    Role  STP State   Protection
   0    GigabitEthernet0/0/1    ALTE  DISCARDING  NONE
   0    GigabitEthernet0/0/2    ROOT  FORWARDING  NONE
   0    GigabitEthernet0/0/3    DESI  FORWARDING  NONE

You can see that GigabitEthernet0/0/1 is an alternate (ALTE) port in discarding state. GigabitEthernet0/0/2 is a root port in forwarding state. GigabitEthernet 0/0/3 is a designated (DESI) port in forwarding state.

Note: ROOT means the port is a root port; ALTE is short for Alternate, meaning the port is an alternate port; DESI is short for Designated, meaning the port is a designated port.
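When many switches have to be checked, the brief output can be parsed by a script. The sketch below assumes the five-column layout shown for S3 and takes the text as a string; how the text is collected from the device is outside its scope, and the function name is illustrative.

```python
SAMPLE = """\
 MSTID  Port                    Role  STP State   Protection
   0    GigabitEthernet0/0/1    ALTE  DISCARDING  NONE
   0    GigabitEthernet0/0/2    ROOT  FORWARDING  NONE
   0    GigabitEthernet0/0/3    DESI  FORWARDING  NONE
"""


def parse_stp_brief(text):
    """Turn 'display stp brief' output into a list of dictionaries."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly five columns and start with the MSTID number;
        # the header row and any prompt lines are skipped.
        if len(fields) == 5 and fields[0].isdigit():
            mstid, port, role, state, protection = fields
            rows.append({"mstid": int(mstid), "port": port, "role": role,
                         "state": state, "protection": protection})
    return rows


rows = parse_stp_brief(SAMPLE)
blocked = [r["port"] for r in rows if r["state"] == "DISCARDING"]
root_ports = [r["port"] for r in rows if r["role"] == "ROOT"]
print(blocked, root_ports)
```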

The following operation shuts down port GigabitEthernet 0/0/3 of the switch and then re-enables it. You can see that the port starts in the discarding state, enters the learning state 15 s later, and finally enters the forwarding state only after 30 s.

[S3]display stp brief
 MSTID  Port                    Role  STP State   Protection
   0    GigabitEthernet0/0/1    ALTE  DISCARDING  NONE
   0    GigabitEthernet0/0/2    ROOT  FORWARDING  NONE
   0    GigabitEthernet0/0/3    DESI  FORWARDING  NONE    --In forwarding state
[S3]interface GigabitEthernet 0/0/3
[S3-GigabitEthernet0/0/3]shutdown    --Shutdown port
[S3-GigabitEthernet0/0/3]undo shutdown    --Enable port
<S3>display stp brief
 MSTID  Port                    Role  STP State   Protection
   0    GigabitEthernet0/0/1    ALTE  DISCARDING  NONE
   0    GigabitEthernet0/0/2    ROOT  FORWARDING  NONE
   0    GigabitEthernet0/0/3    DESI  DISCARDING  NONE    --Initial state

8.2 Link Aggregation

8.2.1 Basic Concepts of Link Aggregation

First, let’s clarify some common concepts. Readers may often hear such concepts as standard Ethernet port, fast Ethernet (FE) port, 100 Gigabit port, Gigabit Ethernet (GE) port, and 10 Gigabit port. So, what exactly do these concepts mean?

In fact, these concepts are related to the specifications of Ethernet technology, especially to the bandwidth specifications of Ethernet ports. When IEEE develops specifications for the information transmission rate of Ethernet, the rate is almost always increased by a factor of 10. At present, the standardized Ethernet port bandwidths are 10 Mbit/s, 100 Mbit/s, 1000 Mbit/s (1 Gbit/s), 10 Gbit/s and 100 Gbit/s. Increasing the rate by a factor of 10 not only matches the pace of development of microelectronics and optical technology, but also keeps the set of Ethernet rate specifications from becoming confusing. Imagine that IEEE released a specification with a rate of 415 Mbit/s today and another with 624 Mbit/s tomorrow: manufacturers of Ethernet network interface cards would suffer, and when actually building an Ethernet network, matching the bandwidth of the ports at both ends of each link would be a mess.

The concept of an Ethernet link corresponds to the concept of an Ethernet port. For example, if the ports at both ends of a link are GE ports, the link is called a GE link; if the ports at both ends of a link are FE ports, the link is called a FE link; and so on.

Now we introduce link aggregation technology. Figure 8.10 illustrates the network structure of a company, where the access layer switches and aggregation layer switches are connected using GE links. If you intend to increase the connection bandwidth between the access layer switches and the aggregation layer switches, you can in theory add another GE link, but the Spanning Tree Protocol will block a port on one of the two links.

When devices at both ends of a link need to bundle multiple links into one logical link to increase the bandwidth of the link, Ethernet link aggregation (Eth-Trunk) technology is used. Eth-Trunk is also known as Link Aggregation, Link Trunking, and Link Bonding. It is important to note that the link aggregation technologies mentioned here are all for Ethernet links.

A link aggregation port can be used as an ordinary Ethernet port, and its difference with the ordinary Ethernet port is that when forwarding data, the link aggregation port (logical port) needs to select one or more ports from the member ports (physical ports) for data forwarding so as to achieve traffic load balance and link redundancy. As shown in Fig. 8.11, if a 2000 Mbit/s aggregation link built by two 1000 Mbit/s links is enough to meet the requirements, there is no need to purchase equipment for 10,000 Mbit/s interfaces.

8.2.2 Application Scenarios of Link Aggregation Technology

In the example mentioned in the previous section, we applied the link aggregation technology between two switches. In fact, link aggregation technology can also be applied between switches and routers, between routers, between switches and servers, between routers and servers, and between servers, as shown in Fig. 8.12. Note that, in theory, link aggregation can also be used on personal computers (PCs), but in practice this is rarely done because of factors such as cost. In principle, a server is nothing but a high-performance computer. From the point of view of network applications, however, the server is critical, so the connection between the server and other devices must be highly reliable. Therefore, link aggregation technology is often required on servers.

8.2.3 Basic Principles of Link Aggregation

As shown in Fig. 8.13, Switch A and Switch B are connected by three physical links, which are configured as an aggregation link. On each switch, there is an aggregation port at its end of the aggregation link, and each aggregation port has a queue of frames to be sent and a queue of frames that have been received. The following example, in which frames are sent from Computer A to Computer C, shows how an aggregation link achieves traffic load balance.

The process of Computer A sending three frames to Computer C over the aggregation link is as follows: ① Computer A sends three frames to Computer C, namely “I”, “love” and “you”. ② These three frames enter the frame sending queue at the aggregation port of Switch A. ③ The frame distributor then distributes the frames in the sending queue to the three physical ports. ④ The frames received by the three physical ports of Switch B enter the receiving queue at the aggregation port of Switch B. ⑤ The frames in the receiving queue are then sent to the port connected to Computer C. In this way, frames pass through all three links, making full use of their combined bandwidth. This is the basic principle of link aggregation, which is in fact “traffic sharing”. In addition, if a member link of the aggregation link fails and goes down, the total traffic of the aggregation link continues to be shared among the remaining member links.

Link aggregation technology seems simple, but it is not. One of the main problems it has to deal with is frame “disorder”, that is, frames arriving out of order.

As shown in Fig. 8.14, frames in the sending queue of Switch A travel over different physical links, so their order may change by the time they reach Switch B. Some frames are longer and some are shorter, which can lead to the following situation: frame “I” is sent before frame “love”, but frame “love” is shorter and is received first, so the order in the receiving queue becomes “love”, “I”, “you”. Computer C therefore receives the frames in the order “love”, “I”, “you” rather than in the order in which they were sent. This is a harmful disorder.

The problem of harmful disorder can be solved by sending all frames destined for the same MAC address over the same physical link of the aggregation link. As shown in Fig. 8.15, frames to Computer C (frames whose destination MAC address is the MAC address of Computer C) are sent to Switch B through the link on the top, and frames to Computer D are sent to Switch B through the link in the middle. Although frames for different destinations may still be reordered relative to one another on the aggregation link, the frames reaching Computer C are in the correct order, and so are the frames reaching Computer D. This kind of disorder is called a harmless disorder. The price is that load balance is no longer guaranteed across the physical links of the aggregation link: in Fig. 8.15, there is no traffic on the physical link at the bottom.
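
The effect of pinning each destination to one member link can be illustrated with a minimal Python sketch. The XOR-based hash, the MAC addresses and the function names are illustrative assumptions, not the algorithm used by any particular switch; each member link is simply modelled as a first-in, first-out list.

```python
from functools import reduce

def pick_member_link(dst_mac: str, num_links: int) -> int:
    """Toy hash (an assumption): XOR the MAC bytes, then take the remainder."""
    folded = reduce(lambda a, b: a ^ b, bytes.fromhex(dst_mac.replace("-", "")))
    return folded % num_links

MAC_C = "00-00-00-00-00-0c"  # hypothetical address of Computer C
MAC_D = "00-00-00-00-00-0d"  # hypothetical address of Computer D

# Sending queue on Switch A, in transmission order: (destination MAC, payload)
sending_queue = [
    (MAC_C, "I"), (MAC_D, "Hello"), (MAC_C, "love"),
    (MAC_D, "world"), (MAC_C, "you"),
]

# Each member link is modelled as a FIFO list.
links = {i: [] for i in range(3)}
for dst, payload in sending_queue:
    links[pick_member_link(dst, 3)].append((dst, payload))

def delivered_to(dst_mac: str) -> list:
    """Payloads for one destination, in the order they leave its link."""
    link = links[pick_member_link(dst_mac, 3)]
    return [payload for dst, payload in link if dst == dst_mac]

print(delivered_to(MAC_C))  # ['I', 'love', 'you']
print(delivered_to(MAC_D))  # ['Hello', 'world']
```

Because every frame for a given destination travels over one FIFO link, its order is preserved. With these sample addresses the third link stays empty, which mirrors the idle bottom link in Fig. 8.15.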

8.2.4 Modes of Link Aggregation

For a link aggregation port to work properly, the peer ports of all of its member ports must belong to the same device and must have joined the same link aggregation port on that device.

Similar to setting the port bandwidth, there are two ways to establish link aggregation: manual configuration and dynamic negotiation between the two ends. In the context of Huawei Eth-Trunk, the former is called manual mode, while the latter is called LACP mode after the negotiation protocol it uses, the Link Aggregation Control Protocol (LACP).

1. Manual mode

Manual mode means that the administrator creates an Eth-Trunk on a device, adds to it the ports connected to the same peer switch as required, and then performs the corresponding operations on the peer switch. With an Eth-Trunk configured in manual mode, the devices do not exchange any information to establish the Eth-Trunk. Instead, they simply bundle the links according to the administrator’s configuration and then send data over the bundled link with load balancing.

Establishing Eth-Trunk by manual mode is inflexible. It can only determine whether the port is working properly by its physical state, and is unable to detect misconfigurations or incorrect links. If one of the links in a manually-configured Eth-trunk fails, then both devices can detect this and stop using that failed link and continue to use the normal link to send data. Although a portion of the bandwidth is unavailable due to the link failure, the effectiveness of the communication is still ensured, as shown in Fig. 8.16.

As shown in Fig. 8.17, the administrator mistakenly connects port GE0/0/2 of switch SW1 in Fig. 8.16 to switch SW3. SW1 has no way of knowing that the port is connected to another switch and still uses port GE0/0/2 for load balance, so frame “you” cannot be sent to SW2, which results in abnormal communication. If LACP mode is used, SW1 and SW2 automatically negotiate by exchanging LACP protocol frames to ensure that the peer of every member port belongs to the same device and to the same aggregation port.

2. LACP mode

LACP mode is a link aggregation mode using LACP protocol. Devices interact with each other through link aggregation control protocol data unit (LACPDU), and the protocol negotiation ensures that the peer is a member port of the same device and the same aggregation port. It is not complicated to configure Eth-Trunk using LACP mode. The administrator only needs to first create Eth-Trunk ports on devices at both ends, then configure this Eth-Trunk port as LACP mode, and finally add the physical ports to be bundled into this Eth-Trunk.

If low-end devices of older generation do not support LACP protocol, manual mode can be used.
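
The check that LACP negotiation makes possible can be sketched in a few lines of Python. In LACP, each LACPDU carries the sender's system identifier and an aggregation key; the sketch below assumes these have already been learned per local port and keeps only the ports whose peer matches a reference port. The class and function names, the device labels and the rule of using the first port as the reference are simplifying assumptions, not the actual selection logic of the protocol.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass(frozen=True)
class PartnerInfo:
    system_id: str  # identifies the peer device (learned from received LACPDUs)
    key: int        # identifies the aggregation port on the peer device

def select_members(learned: Dict[str, PartnerInfo]) -> List[str]:
    """Return the local ports whose peer matches the reference port's peer."""
    if not learned:
        return []
    ports = sorted(learned)
    reference = learned[ports[0]]  # simplification: first port is the reference
    return [port for port in ports if learned[port] == reference]

# The miscabling of Fig. 8.17: GE0/0/2 of SW1 is plugged into SW3.
learned = {
    "GE0/0/1": PartnerInfo("SW2", 1),
    "GE0/0/2": PartnerInfo("SW3", 1),
    "GE0/0/3": PartnerInfo("SW2", 1),
}
print(select_members(learned))  # ['GE0/0/1', 'GE0/0/3']
```

In manual mode no such information is exchanged, so GE0/0/2 would remain in the bundle and the frames hashed onto it would be lost.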

8.2.5 Load-Balance Mode

Eth-Trunk supports load balance based on the IP address or MAC address of the message. Different modes (locally valid, and effective for outbound messages) can be configured to share the data flow among different member ports.

Common load-balance modes are: source IP address, destination IP address, source MAC address, destination MAC address, source and destination IP address, as well as source and destination MAC address. In actual services, users need to configure the appropriate load-balance mode according to the characteristics of the service traffic. If a certain parameter of the service traffic varies widely (that is, it takes a large number of distinct values), you should select a load-balance mode based on that parameter, because it spreads the traffic more evenly.

If the IP address of the message changes more frequently, then a load-balance mode based on the source IP address, destination IP address or source and destination IP address is more conducive to reasonably balancing the traffic load among physical links.

If the MAC address of the message changes more frequently and the IP address is relatively fixed, then the load-balance mode based on source MAC address, destination MAC address or source and destination MAC address is more conducive to reasonably balancing traffic load among physical links.

If the load-balance mode selected does not match the actual service characteristics, it may lead to uneven traffic sharing, that is, some member links are heavily loaded while the rest of the member links are idle. For example, if source MAC address mode is selected in a scenario where the source IP address of the message changes frequently but the source MAC address is fixed, all traffic will be carried on a single member link.

Let’s look at an example: as shown in Fig. 8.18, computers in Area A access a server in Area B. There are a lot of computers in Area A and therefore a lot of source MAC addresses. The link aggregation port on SW1 is configured to use the load-balance mode based on source MAC address, so that the traffic of computers in Area A accessing the server in Area B is shared relatively evenly among the three physical links. The link aggregation port on SW2, however, should not be configured to use the load-balance mode based on source MAC address. If it is, there is only one source MAC address (that of the one server), and all traffic to Area A goes through only one physical link. For traffic from Area B to Area A, there are a lot of destination MAC addresses, so SW2 is configured to use the load-balance mode based on destination MAC address. In this way, the traffic sent by the server to the computers in Area A is shared comparatively evenly over the three physical links.

Figure 8.19 is similar to Fig. 8.18, and both have Area A. In Fig. 8.19, computers in Area A need to access the Internet through the link aggregation port. How should the load-balance mode be chosen for the link aggregation ports of the two switches SW1 and SW2?

Computers in Area A access the Internet, and there are more computers on the Internet than in Area A. In other words, for the traffic of computers in Area A accessing the Internet, the destination IP address is the parameter with the largest number of distinct values, so the load-balance mode based on destination IP address is configured for the link aggregation port of SW1. For the return traffic, it is the source IP address that varies most, so the load-balance mode based on source IP address is configured for the link aggregation port of SW2.
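
The Fig. 8.18 scenario can be reproduced with a small Python sketch that counts how many member links carry traffic under each mode. The XOR-based hash, the 30 computer addresses and the server address are made-up assumptions for illustration; real devices use their own hash algorithms.

```python
from collections import Counter
from functools import reduce

def member_link(mac: str, num_links: int = 3) -> int:
    """Toy hash (an assumption): XOR the MAC bytes, then take the remainder."""
    return reduce(lambda a, b: a ^ b, bytes.fromhex(mac.replace("-", ""))) % num_links

# Hypothetical addresses: 30 computers in Area A and one server in Area B.
pcs = [f"00-00-00-00-0a-{i:02x}" for i in range(1, 31)]
server = "00-00-00-00-0b-01"

to_server = [{"src-mac": pc, "dst-mac": server} for pc in pcs]  # sent by SW1
to_pcs = [{"src-mac": server, "dst-mac": pc} for pc in pcs]     # sent by SW2

def spread(frames, mode):
    """Count frames per member link for a given load-balance mode."""
    return Counter(member_link(frame[mode]) for frame in frames)

print(len(spread(to_server, "src-mac")))  # 3: all member links carry traffic
print(len(spread(to_server, "dst-mac")))  # 1: everything on a single link
print(len(spread(to_pcs, "src-mac")))     # 1
print(len(spread(to_pcs, "dst-mac")))     # 3
```

The load-balance mode is locally valid, so SW1 and SW2 can each hash on the field that varies most in the direction they transmit.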

8.2.6 An Example of Link Aggregation Configuration

Port bandwidths, duplex modes, and VLAN configurations of the physical ports joining the link aggregation port must be the same. The ports must all be access ports or all be trunk ports. If they are access ports, the default VLAN must be the same, and if they are all trunk ports, then the PVID and the allowed VLANs of the ports must be the same.

As shown in Fig. 8.20, three links connected to GE0/0/1, GE0/0/2, GE0/0/3 of switch SW1 and GE0/0/1, GE0/0/2, GE0/0/3 of switch SW2 are configured as one aggregation link. The load-balance mode is based on the source MAC address.

Create interface Eth-Trunk 1 on SW1, and the interface number should be the same as that of SW2. Configure the working mode of interface Eth-Trunk 1 as manual mode, add interfaces from GE0/0/1 to GE0/0/3 to interface Eth-Trunk 1, and configure Eth-Trunk 1 as a trunk link to allow all VLANs to pass through.

[SW1]interface Eth-Trunk 1
[SW1-Eth-Trunk1]mode ?                      --View working mode supported by aggregation link
  lacp-static  Static working mode
  manual       Manual working mode
[SW1-Eth-Trunk1]mode manual load-balance    --Configure link aggregation mode as manual mode
[SW1-Eth-Trunk1]trunkport GigabitEthernet 0/0/1 to 0/0/3
[SW1-Eth-Trunk1]load-balance ?              --View load-balance modes supported
  dst-ip       According to destination IP hash arithmetic
  dst-mac      According to destination MAC hash arithmetic
  src-dst-ip   According to source/destination IP hash arithmetic
  src-dst-mac  According to source/destination MAC hash arithmetic
  src-ip       According to source IP hash arithmetic
  src-mac      According to source MAC hash arithmetic
[SW1-Eth-Trunk1]load-balance src-mac        --Configure load-balance mode based on source MAC address
[SW1-Eth-Trunk1]port link-type trunk
[SW1-Eth-Trunk1]port trunk allow-pass vlan all
[SW1-Eth-Trunk1]quit

Create interface Eth-Trunk 1 on SW2, and the interface number should be the same as that of SW1. Configure the working mode of interface Eth-Trunk 1 as manual mode, add interfaces from GE0/0/1 to GE0/0/3 to interface Eth-Trunk 1, and configure Eth-Trunk 1 as a trunk link to allow all VLANs to pass through.

[SW2]interface Eth-Trunk 1
[SW2-Eth-Trunk1]mode manual load-balance
[SW2-Eth-Trunk1]trunkport GigabitEthernet 0/0/1 to 0/0/3
[SW2-Eth-Trunk1]load-balance src-mac
[SW2-Eth-Trunk1]port link-type trunk
[SW2-Eth-Trunk1]port trunk allow-pass vlan all
[SW2-Eth-Trunk1]quit

Enter “display eth-trunk 1” to view the configuration information of Eth-Trunk 1.

[SW1]display eth-trunk 1
Eth-Trunk1's state information is:
WorkingMode: NORMAL           Hash arithmetic: According to SA
Least Active-linknumber: 1    Max Bandwidth-affected-linknumber: 8
Operate status: up            Number Of Up Port In Trunk: 3
--------------------------------------------------------------
PortName                      Status      Weight
GigabitEthernet0/0/1          Up          1
GigabitEthernet0/0/2          Up          1
GigabitEthernet0/0/3          Up          1

In the above output, “WorkingMode: NORMAL” indicates that the link aggregation mode of interface Eth-Trunk 1 is NORMAL, that is, manual mode. “Least Active-linknumber: 1” means that the lower threshold for the number of member links in Up state is 1. The minimum number of active interfaces is set to guarantee a minimum bandwidth. When the bandwidth is too small, some services that have a high demand for link bandwidth become abnormal. In this situation, that is, when the number of member links in Up state falls below the threshold, the Eth-Trunk is taken down and the service is switched to other paths through the redundancy of the network itself, so as to ensure the normal operation of the service. “Operate status: up” indicates that the status of interface Eth-Trunk 1 is “Up”. From the information in the bottom lines, you can see that Eth-Trunk 1 contains three member ports, GigabitEthernet0/0/1, GigabitEthernet0/0/2, and GigabitEthernet0/0/3.

8.3 Smart Link

Smart Link private protocol of Huawei can replace the STP protocol in certain scenarios and can achieve fast (millisecond-level) link switching.

8.3.1 Basic Principles of Smart Link

As shown in Fig. 8.21, access layer switch S4 has N user terminals connected to it, and S4 is connected to aggregation layer switches S2 and S3 via Link2-4 and Link 3-4, respectively. S2 and S3 are connected to core layer switch S1 via Link1-2 and Link1-3, respectively, and S1 is connected to the Internet via a router. To eliminate working loops, STP protocol is run on each switch. Assuming that the links in the STP tree contain Link1-2, Link1-3, and Link2-4, then when Link2-4 is disconnected, Link3-4 joins the STP tree, thus ensuring the connectivity of the network.

The convergence of Spanning Tree Protocol is relatively slow, which usually takes seconds. If some links in the network are high-speed links, a large amount of data will be lost when STP switches links. If some services sensitive to packet loss are run in the user terminal, then these services will be seriously impacted.

To address the above problems, Huawei has designed and implemented a private protocol called Smart Link, whose main role is to replace the STP protocol in certain scenarios and enable fast (millisecond-level) link switching. A Smart Link group consists of two interfaces, one of which is the master interface and the other is the slave interface. Under normal conditions, only the master interface is active for forwarding, while the slave interface is blocked and in standby (inactive) state. When the master interface fails, the Smart Link group automatically blocks the master interface and immediately switches the state of the slave interface from inactive to active. Smart Link technology is commonly used in dual uplink networking environments.

As shown in Fig. 8.22, a Smart Link group is configured on switch S4, with GE1/0/1 as its master interface and GE1/0/2 as its slave interface. Under normal circumstances, master interface GE1/0/1 is active for forwarding and slave interface GE1/0/2 is in the standby state, so the links that are really working are Link1-3, Link1-2, and Link2-4, while Link3-4 is in the interrupted state, which prevents loops. If master interface GE1/0/1 suddenly fails, or if master interface GE1/0/1 senses that Link2-4 is interrupted, the Smart Link group immediately sets master interface GE1/0/1 to the blocking state and switches slave interface GE1/0/2 from the standby state to the forwarding state. The links that are really working then immediately become Link1-3, Link1-2, and Link3-4, while Link2-4 is in the interrupted state. In this way, network connectivity is ensured and loops are still prevented. Note that the Smart Link protocol is mutually exclusive with the STP protocol, so there is no STP running in the network shown in Fig. 8.22.

From the above description, we can see that Smart Link technology works in a very simple way. However, the real situation may not be as simple as we think. Next, an example will be used to illustrate the main problems that Smart Link technology needs to solve.

As shown in Fig. 8.23, a Smart Link group is configured on switch S4 with GE1/0/1 as its master interface and GE1/0/2 its slave interface. The network is currently in a normal working state, i.e., Link3-4 is interrupted and Link1-3, Link1-2 and Link2-4 are all working. In addition, we assume that the MAC address of PC1’s network interface is MAC-1.

Suppose that at moment t, PC1 sends a frame to the Internet, then this frame must pass through Link2-4 and Link1-2 and then enter S1 from port GE1/0/3 of switch S1, and then S1 will forward this frame to the router. According to the MAC address learning mechanism of the switch, at moment t (ignoring the time this frame takes to move from PC1 to S1), the table entry about MAC-1 on S1 will become: the corresponding interface is GE1/0/3 and the value of aging timer (countdown timer) is 300 s (the default value).

Then, as shown in Fig. 8.24, we assume that at moment t + 5 s, Link2-4 is interrupted, and master interface GE1/0/1 of S4 is immediately blocked, and slave interface GE1/0/2 is immediately switched to the forwarding state. The working links at this time become Link1-3, Link1-2, Link3-4. Meanwhile, the table entry about MAC-1 on S1 will become: the corresponding interface is GE1/0/3 and the value of aging timer is 295 s.

Now, let’s assume that the time has advanced from t + 5 s to t + 10 s, and that PC1 has not sent any frames during this period. There is then still a table entry about MAC-1 in the MAC address table on S1, in which the interface corresponding to MAC-1 is still GE1/0/3, but the aging timer value has changed to 290 s, as shown in Fig. 8.25. At moment t + 10 s, we assume that S1 receives a frame with the destination MAC address MAC-1 from the router. Obviously, after querying its own MAC address table, S1 will forward this frame out of its interface GE1/0/3 instead of its interface GE1/0/4. However, we know that Link2-4 is interrupted at this point, so the frame cannot be delivered to PC1, which results in exactly the kind of frame loss we want to avoid. In an extreme case, suppose that PC1 does not send any frames during the period from t + 10 s to t + 300 s, so that the interface corresponding to MAC-1 in the MAC address table on S1 remains GE1/0/3; then all frames sent by the router to S1 with MAC-1 as the destination MAC address during this period will be lost.

How does Smart Link avoid the above-mentioned frame loss? To address this problem, Smart Link defines a protocol frame called a flush frame, whose destination MAC address is a multicast MAC address 01-0f-e2-00-00-04. The main purpose of a flush frame is to notify the switch concerned to immediately clear the error table entry in the MAC address table.

As shown in Fig. 8.26, assume that the time reverts to moment t + 5 s. At this moment, Link2-4 is interrupted, and master interface GE1/0/1 of S4 is immediately blocked, while slave interface GE1/0/2 is immediately switched to the forwarding state. At this point the working links become Link1-3, Link1-2, Link3-4. Meanwhile, the table entry about MAC-1 on S1 reads: the corresponding interface is GE1/0/3 and the value of aging timer is 295 s. Now, with the Smart Link protocol, S4 immediately sends a flush frame through its slave interface GE1/0/2. After receiving the flush frame and analyzing it, S1 immediately clears the table entry about MAC-1 in its MAC address table. The structure of the flush frame and the control information it carries are not described here.

Next, assume that the time again progresses from moment t + 5 s to moment t + 10s, and assume that PC1 has not sent any frames during this time, so there will be no table entry about MAC-1 in the MAC address table of S1, as shown in Fig. 8.27. At moment t + 10s, we assume that S1 receives a frame from the router with a destination MAC address of MAC-1. Obviously, S1 cannot find a table entry about MAC-1 in its own MAC address table, so it floods this frame out of its interfaces GE1/0/3 and GE1/0/4. Clearly, the frame with MAC-1 as its destination MAC address going out from S1’s interface GE1/0/3 cannot reach PC1 (because Link2-4 is interrupted), but the frame with MAC-1 as its destination MAC address going out from GE1/0/4 will go through Link1-3 and Link3-4 and reach PC1, thus avoiding frame loss.
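
The two timelines above (without and with the flush frame) can be compared with a minimal Python model of S1. The class, the port label "to-router" and the decision to clear the whole table on a flush are simplifying assumptions; the aging timer and the flush frame format are not modelled.

```python
class Switch:
    """Minimal model of MAC learning and forwarding on S1."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # MAC address -> outgoing interface

    def learn(self, src_mac, in_port):
        self.mac_table[src_mac] = in_port

    def flush(self):
        # Simplification: clear every learned entry when a flush frame arrives.
        self.mac_table.clear()

    def egress_ports(self, dst_mac, in_port):
        if dst_mac in self.mac_table:
            return [self.mac_table[dst_mac]]
        # Unknown unicast: flood out of every port except the ingress port.
        return [p for p in self.ports if p != in_port]


def reaches_pc1(s1, flush_received):
    s1.mac_table.clear()
    s1.learn("MAC-1", "GE1/0/3")  # moment t: PC1's frame arrives via S2
    if flush_received:            # moment t + 5 s: Link2-4 fails, S4 sends a flush frame
        s1.flush()
    out = s1.egress_ports("MAC-1", in_port="to-router")  # moment t + 10 s
    # After the switchover only the path through S3 (GE1/0/4) leads to PC1.
    path_ok = {"GE1/0/3": False, "GE1/0/4": True}
    return any(path_ok.get(port, False) for port in out)


s1 = Switch(["GE1/0/3", "GE1/0/4", "to-router"])
print(reaches_pc1(s1, flush_received=False))  # False: stale entry, frame is lost
print(reaches_pc1(s1, flush_received=True))   # True: flooded copy arrives via S3
```

Once PC1 sends its next frame, S1 learns MAC-1 on GE1/0/4 and stops flooding.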

As we can see from the previous examples, flush frames play a critical role in the Smart Link protocol. In order to control the communication and scope of flush frames, Smart Link specifically defines a VLAN for flush frames, called the control VLAN. Flush frames must carry the control VLAN tag before they are sent. If a device needs to receive and process flush frames, it must be configured accordingly so it can receive, identify, and process frames with control VLAN tags. If a device is not configured as described above, it will directly discard the frames with control VLAN tags when it receives them.

Finally, let’s briefly introduce the restore function of Smart Link. Under normal circumstances, the master interface of Smart Link is active while the slave interface is inactive. When the master interface goes down, it switches to the inactive state and the slave interface switches to the active state. However, after the master interface comes back up, Smart Link does not automatically switch the master interface back to active and the slave interface back to inactive. If we need the master interface to return to active and the slave interface to inactive, we must configure the restore function of Smart Link in advance. In addition, when configuring the restore function, we also need to configure a parameter called “wait to restore time”, whose default value is 60 s. In other words, although the master interface is back up (the master link is reconnected), it has to wait for a period of time (the so-called wait to restore time) before performing the restore action. This is because, although the master interface is back up, its working state may not yet be stable, and it may even be flapping. Therefore, the restore operation should not be performed immediately.
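
The switchover and restore behaviour described above can be summarized as a small state machine. The following Python sketch is a simplified model under stated assumptions: the class and method names are invented for illustration, time is passed in explicitly in seconds, and failure of the slave interface is not modelled.

```python
class SmartLinkGroup:
    """Simplified Smart Link group: one master and one slave interface."""

    def __init__(self, restore=False, wtr=60):
        self.restore = restore        # restore function, disabled unless configured
        self.wtr = wtr                # wait to restore time in seconds
        self.active = "master"
        self.master_up_since = None   # time at which the master last came back up

    def master_down(self):
        self.master_up_since = None   # any pending restore is cancelled
        self.active = "slave"         # switch over immediately

    def master_recovered(self, now):
        self.master_up_since = now    # start (or restart) the wait to restore timer

    def tick(self, now):
        """Re-evaluate the group at time `now` and return the active interface."""
        waited_long_enough = (
            self.master_up_since is not None
            and now - self.master_up_since >= self.wtr
        )
        if self.restore and self.active == "slave" and waited_long_enough:
            self.active = "master"
        return self.active


group = SmartLinkGroup(restore=True, wtr=30)
group.master_down()
group.master_recovered(now=10)
print(group.tick(now=39))  # slave: only 29 s have passed
print(group.tick(now=40))  # master: the wait to restore time has elapsed
```

If the master interface flaps during the waiting period, `master_down` clears the timer, so the restore is postponed until the interface has stayed up for the full wait to restore time.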

8.3.2 An Example of Smart Link Configuration

As shown in Fig. 8.28, S1, S2, S3 and S4 form a loop. We need to configure interfaces GE0/0/1 and GE0/0/2 in a Smart Link group on S4, and make GE0/0/1 the master interface and GE0/0/2 the slave interface.

1. Configuration roadmap
  (a) Create a Smart Link group, add the corresponding interface to the Smart Link group, and specify the interface role.
  (b) Enable the flush frame sending function.
  (c) Enable the flush frame receiving function.
  (d) Enable the Smart Link restore function.
  (e) Enable the Smart Link function.
2. Configuration steps

Since the Smart Link protocol is mutually exclusive with the STP protocol, you need to disable STP on the interfaces concerned before configuring Smart Link, by entering each interface view and using the stp disable command.
[S4]interface GigabitEthernet 0/0/1
[S4-GigabitEthernet0/0/1]stp disable
[S4-GigabitEthernet0/0/1]quit
[S4]interface GigabitEthernet 0/0/2
[S4-GigabitEthernet0/0/2]stp disable
[S4-GigabitEthernet0/0/2]quit

Next, create Smart Link group 1 on S4, and use the port command to configure GE0/0/1 as the master interface of Smart Link group 1 and GE0/0/2 as the slave interface. Then enable the restore function, enable Smart Link, and set the wait to restore time to 30 s.
[S4]smart-link group 1
[S4-smlk-group1]port GigabitEthernet 0/0/1 master
[S4-smlk-group1]port GigabitEthernet 0/0/2 slave
[S4-smlk-group1]restore enable        --Enable restore
[S4-smlk-group1]smart-link enable
[S4-smlk-group1]timer wtr 30          --Set wait to restore time

Then, use the flush send command to enable Smart Link group 1 to send flush frames, with 10 as the control VLAN ID and “huawei” as the password.
[S4-smlk-group1]flush send control-vlan 10 password simple huawei

Use the smart-link flush receive command on S1, S2 and S3 to specify that their interfaces GE0/0/1 and GE0/0/2 can receive and process flush frames carrying control VLAN 10.

Execute the following commands on S2.
[S2]interface GigabitEthernet 0/0/1
[S2-GigabitEthernet0/0/1]smart-link flush receive control-vlan 10 password simple huawei
[S2-GigabitEthernet0/0/1]quit
[S2]interface GigabitEthernet 0/0/2
[S2-GigabitEthernet0/0/2]smart-link flush receive control-vlan 10 password simple huawei
[S2-GigabitEthernet0/0/2]quit

Execute the following commands on S1.
[S1]interface GigabitEthernet 0/0/1
[S1-GigabitEthernet0/0/1]smart-link flush receive control-vlan 10 password simple huawei
[S1-GigabitEthernet0/0/1]quit
[S1]interface GigabitEthernet 0/0/2
[S1-GigabitEthernet0/0/2]smart-link flush receive control-vlan 10 password simple huawei
[S1-GigabitEthernet0/0/2]quit

Execute the following commands on S3.
[S3]interface GigabitEthernet 0/0/1
[S3-GigabitEthernet0/0/1]smart-link flush receive control-vlan 10 password simple huawei
[S3-GigabitEthernet0/0/1]quit
[S3]interface GigabitEthernet 0/0/2
[S3-GigabitEthernet0/0/2]smart-link flush receive control-vlan 10 password simple huawei
[S3-GigabitEthernet0/0/2]quit

Enter “display smart-link group 1” to view the information related to the Smart Link of S4.
<S4>display smart-link group 1
Smart Link group 1 information :
  Smart Link group was enabled
  Wtr-time is: 30 sec.
  There is no Load-Balance
  There is no protected-vlan reference-instance
  DeviceID: 4c1f-cc88-31fe  Control-vlan ID: 10
  Member                  Role    State     Flush Count  Last-Flush-Time
  --------------------------------------------------------------
  GigabitEthernet0/0/1    Master  Active    0            0000/00/00 00:00:00 UTC+00:00
  GigabitEthernet0/0/2    Slave   Inactive  0            0000/00/00 00:00:00 UTC+00:00

You can see that Smart Link group 1 is enabled, GigabitEthernet0/0/1 is active as the master interface, GigabitEthernet0/0/2 is inactive as the slave interface, the ID of the control VLAN is 10, and the wait to restore time is 30 s.

8.4 Monitor Link

8.4.1 Basic Principles of Monitor Link

As shown in Fig. 8.29, a Smart Link group is configured on Switch S4, with GE1/0/1 as the master interface and GE1/0/2 the slave interface. GE1/0/1 is active and GE1/0/2 inactive. If now interface GE1/0/1 of S2 fails, resulting in the interruption of Link1-2, then what will be the consequence? Obviously, it is impossible for S4 to sense the failure of S2’s interface GE1/0/1, so as a result all frames from S4’s master interface GE1/0/1 will be lost.

To address the above problem, Huawei has designed and implemented a private protocol called Monitor Link, which is mainly used in conjunction with Smart Link in certain scenarios to prevent frame loss.

In Fig. 8.29, we can configure a Monitor Link group on S2. This Monitor Link group contains two interfaces: interface GE1/0/1, which functions as an uplink interface, and interface GE1/0/2, which functions as a downlink interface. The working principle of Monitor Link is that: a Monitor Link group consists of one uplink interface and several downlink interfaces; if the uplink interface fails to work properly for various reasons, the state of all of its downlink interfaces must be immediately turned to “Down”. In other words, there is a linkage mechanism between the uplink interface and the downlink interfaces, and the working state of the downlink interfaces should be consistent with that of the uplink interface.

Let’s take another look at Fig. 8.29. Under normal conditions, the links in working state are Link2-4, Link1-2 and Link1-3. If interface GE1/0/1 of S2 fails, because of the Monitor Link protocol, the state of interface GE1/0/2 of S2 will be immediately turned to “Down”. In this way, interface GE1/0/1 of S4 cannot work properly. Therefore, Smart Link of S4 then immediately switches the state of its slave interface GE1/0/2 from inactive to active. Thus, the links in working state become Link1-3 and Link3-4, and the network connectivity is still ensured.

Similarly, to further strengthen the reliability of the network, we can also configure a Monitor Link group on S3, so that interface GE1/0/2 of S3 can be linked with interface GE1/0/1.

Let’s take a look at a more complicated case. As shown in Fig. 8.30, a Smart Link group is configured on S1, S2, and S3, and a Monitor Link group is configured on S2 and S3, respectively. Note that for the Monitor Link group on S2, the entire Smart Link group on S2 is considered its uplink interface, and the state of its downlink interface will be turned to “Down” only if both interfaces of that Smart Link group are not working properly. The situation is the same for S3, so it will not be repeated here.

In Fig. 8.30, if S2’s master interface fails, its slave interface is immediately switched to the working state, and at this time, the Monitor Link group on S2 does not have a linkage effect. If both the master and slave interfaces of S2 fail, then the state of the downlink interface of S2 is turned to “Down”, which triggers the Smart Link group on S1 to perform switching action. This example shows us that a flexible and clever combination of Smart Link and Monitor Link technology can often be used to better satisfy the special needs in complex networking situations.

When the uplink interface of a Monitor Link group does not work properly, all its downlink interfaces will be “down” as a result. If the uplink interface recovers, its downlink interface will also be automatically turned back to “up”, which is the restore function of Monitor Link. Similar to the Smart Link case, we can also configure a suitable wait to restore time for the Monitor Link.
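
The linkage rule, including the case in which a whole Smart Link group acts as the uplink, can be expressed as a short Python sketch. The function names and the port labels are illustrative assumptions, and the wait to restore time is not modelled.

```python
def uplink_is_up(uplink_members):
    """The uplink is a single port or a Smart Link group (port name -> is it up?).

    A Smart Link group used as the uplink counts as failed only when
    all of its member interfaces are down.
    """
    return any(uplink_members.values())

def downlink_states(uplink_members, downlinks):
    """Every downlink interface follows the state of the uplink."""
    state = "up" if uplink_is_up(uplink_members) else "down"
    return {port: state for port in downlinks}

# S2 in Fig. 8.30: the master interface has failed, the slave has taken over.
print(downlink_states({"master": False, "slave": True}, ["downlink-1"]))
# {'downlink-1': 'up'}

# Both Smart Link members have failed, so the downlink is forced down.
print(downlink_states({"master": False, "slave": False}, ["downlink-1"]))
# {'downlink-1': 'down'}
```

Forcing the downlink down is what lets the Smart Link group on the downstream switch detect the fault and switch to its slave interface.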

8.4.2 An Example of Monitor Link Configuration

As shown in Fig. 8.31, a Smart Link group has been configured on S2 and S4, and we now need to configure a Monitor Link group on S2 and S3.

1. Configuration roadmap
  (a) Create a Monitor Link group on S2 and S3, and add the corresponding uplink and downlink ports.
  (b) Configure the recover time of the Monitor Link group on S2 and S3.
2. Configuration steps

Create Monitor Link group 1 on S2, add the already created Smart Link group 1 as the uplink port to Monitor Link group 1, and add port GE2/0/1 as the downlink interface to Monitor Link group 1.
[S2]monitor-link group 1
[S2-mtlk-group1]smart-link group 1 uplink
[S2-mtlk-group1]port GigabitEthernet 0/0/1 downlink ?
  INTEGER<1-24>  Downlink's index, ranging from integer 1 to 24    --Support up to 24 downlink ports
  <cr>
[S2-mtlk-group1]port GigabitEthernet 2/0/1 downlink 1    --Specify the downlink port as 1

Create Monitor Link group 2 on S3, add port GE1/0/1 as the uplink port to Monitor Link group 2, and add port GE2/0/1 as the downlink port to Monitor Link group 2.
[S3]monitor-link group 2
[S3-mtlk-group2]port GigabitEthernet 1/0/1 uplink
[S3-mtlk-group2]port GigabitEthernet 2/0/1 downlink 1

Then, configure the recover time on S2 and S3. Use the timer recover-time command to set the recover time for the Monitor Link group to 10 s.
[S2-mtlk-group1]timer recover-time 10
[S3-mtlk-group2]timer recover-time 10
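
To make the effect of the recover time more concrete, here is a minimal simulation in Python. It rests on an assumption that is not stated in the text: the downlink is brought back up only after the uplink has stayed up continuously for the recover time, and a new uplink failure restarts the wait. The class `MonitorLinkRecovery` and its methods are hypothetical names, and time is passed in explicitly as seconds rather than read from a clock.

```python
class MonitorLinkRecovery:
    """Tracks one Monitor Link group using simulated time in seconds."""

    def __init__(self, recover_time):
        self.recover_time = recover_time
        self.uplink_up = True
        self.downlink_up = True
        self._uplink_up_since = 0.0

    def uplink_event(self, now, up):
        """Record an uplink state change observed at time `now`."""
        if up and not self.uplink_up:
            self._uplink_up_since = now  # start (or restart) the wait
        self.uplink_up = up
        if not up:
            self.downlink_up = False  # linkage: downlinks go down at once

    def tick(self, now):
        """Re-evaluate the downlink state at time `now` and return it."""
        if (self.uplink_up and not self.downlink_up
                and now - self._uplink_up_since >= self.recover_time):
            self.downlink_up = True
        return self.downlink_up


group = MonitorLinkRecovery(recover_time=10)
group.uplink_event(now=100, up=False)   # uplink fails
group.uplink_event(now=130, up=True)    # uplink recovers
state_after_5s = group.tick(now=135)    # still inside the wait
group.uplink_event(now=136, up=False)   # uplink flaps again
group.uplink_event(now=140, up=True)
state_at_145 = group.tick(now=145)      # only 5 s since the last recovery
state_at_150 = group.tick(now=150)      # 10 s of stable uplink
```

Under this assumption, the wait keeps a flapping uplink from repeatedly toggling the downlink, which in turn would cause repeated switching in the downstream Smart Link group.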

Finally, verify the configuration: use the display smart-link group all command to view the information of all Smart Link groups, and the display monitor-link group all command to view the information of all Monitor Link groups.

8.5 Alternatives to STP and Current Networking Recommendations

Four alternatives to STP are listed below.

1. Link aggregation bundles multiple physical interfaces into one logical interface and balances outgoing and incoming traffic across the member interfaces. The switch chooses the member interface on which to send each packet to the peer switch according to the load-balancing policy configured by the user. When the switch detects a link failure on a member interface, it stops sending packets on that interface and reapplies the load-balancing policy to decide which of the remaining links carries the packets. Once the failed interface recovers, it resumes sending and receiving packets. Link aggregation is therefore an important technology for increasing link bandwidth and providing link resilience and redundancy. Because the aggregated links are treated as a single link, they do not form a loop.
2. Smart Link is a solution tailored for dual-uplink networking. A Smart Link group is created on a device and two uplink interfaces are added to it, one designated as the master interface and the other as the slave interface. By default, only the master interface is active and forwards traffic, while the slave interface is blocked, which breaks the Layer 2 loop. When the master interface or its directly connected link fails, Smart Link senses the change immediately and switches over within milliseconds: the slave interface becomes active and starts sending and receiving service traffic. Smart Link is easy to configure and switches quickly. However, because of its working mechanism, it is applicable only to specific networking scenarios.
3. iStack/CSS. iStack is the stacking technology of Huawei box switches: multiple physical switches are connected through dedicated cables and configured so that they logically become one device. Cluster Switch System (CSS) is a similar concept, except that it targets Huawei frame switches. With spanning tree, a Layer 2 loop has to be broken by blocking an interface; stacking/clustering takes a different approach. The switches are connected with stacking cables to form a stacking system, and once the system is established, the two switches operate as one logical switch.
4. A scenario without Layer 2 loops. Any Layer 2 loop in the network is broken manually, which removes the need for spanning tree.
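
As an illustration of the link aggregation behavior in item 1, the sketch below maps a flow to a member interface and re-maps it when that member fails. It is a simplified model under stated assumptions: the hash (CRC32 over the source and destination MAC addresses) stands in for whatever load-balancing policy is configured, and the function name `pick_member`, the interface names, and the MAC addresses are hypothetical.

```python
import zlib


def pick_member(members, src_mac, dst_mac):
    """Map a flow to one active member interface.

    `members` maps interface name -> True if the link is up.
    """
    active = sorted(name for name, up in members.items() if up)
    if not active:
        raise RuntimeError("no active member interface in the aggregation group")
    # Assumed policy: hash on source and destination MAC address.
    key = f"{src_mac}-{dst_mac}".encode()
    return active[zlib.crc32(key) % len(active)]


eth_trunk = {"GE0/0/1": True, "GE0/0/2": True, "GE0/0/3": True}
flow = ("00e0-fc00-0001", "00e0-fc00-0002")

before = pick_member(eth_trunk, *flow)
eth_trunk[before] = False                 # the link carrying this flow fails
during = pick_member(eth_trunk, *flow)    # flow is re-mapped to a surviving link
eth_trunk[before] = True                  # the link recovers
after = pick_member(eth_trunk, *flow)     # the recovered link is eligible again
```

Hashing per flow rather than per packet keeps the frames of one flow on one link, so they are not reordered, while different flows spread across the members.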

8.6 Exercises

1. Which of the following descriptions of the forwarding state in the Spanning Tree Protocol is incorrect? ( )
   A. The port in forwarding state can receive BPDU messages
   B. The port in forwarding state does not learn the source MAC address of the message
   C. The port in forwarding state can forward data packets
   D. The port in forwarding state can send BPDU messages
2. The following information is the port state information displayed on a switch running STP. Based on this information, which of the following descriptions is correct? ( )

<S3>display stp brief
 MSTID  Port                   Role  STP State   Protection
   0    GigabitEthernet0/0/1   ALTE  DISCARDING  NONE
   0    GigabitEthernet0/0/2   ROOT  FORWARDING  NONE
   0    GigabitEthernet0/0/3   DESI  FORWARDING  NONE

   A. This may be the only switch in this network
   B. This switch is the root switch in the network
   C. This switch is a non-root switch in the network
   D. This switch must be connected to three other switches

3. When there are redundant paths in a Layer 2 switch network, what method can be used to prevent loops and improve the reliability of the network? ( )
   A. Spanning Tree Protocol
   B. Split Horizon
   C. Route Poisoning
   D. Triggered Update
4. A user reports that files are transferred at an extremely low speed in the network, and the administrator finds duplicate frames in the network using the Wireshark packet capture tool. Which of the following descriptions of the possible causes or solutions is correct? ( )
   A. The switch floods the data frame when it cannot find the destination MAC address of the frame in the MAC address table
   B. The switching equipment in the network must be upgraded
   C. There are loops in the network at Layer 2
   D. No VLANs are configured in the network
5. (Multi-selection) What is the role of link aggregation? ( )
   A. Increase bandwidth
   B. Enable load balancing
   C. Increase network reliability
   D. Facilitate data analysis
6. How can you ensure that a switch becomes the root switch of the entire network? ( )
   A. Configure an IP address for the switch that is lower than that of other switches
   B. Set the root path cost of the switch to the lowest value
   C. Configure a priority lower than the other switches for the switch
   D. Configure a MAC address lower than the other switches for the switch
7. The port cost calculated by STP has a certain relationship with the port bandwidth, that is, greater bandwidth leads to ( ) cost.
   A. Smaller
   B. Greater
   C. Consistent
   D. Unsure
8. As shown in Fig. 8.32, the network administrator wants to manually aggregate the two physical links between SWA and SWB using Eth-Trunk, with the default configuration otherwise unchanged. Which of the following descriptions is correct? ( )
   A. After aggregation, it can work normally
   B. They can be aggregated, but after aggregation, only Interface G can send and receive data
   C. They can be aggregated, but after aggregation, only Interface E can send and receive data
   D. They cannot be aggregated

Author information

Authors and Affiliations

Consortia

Huawei Technologies Co., Ltd.

Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits any noncommercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if you modified the licensed material. You do not have permission under this license to share adapted material derived from this chapter or parts of it.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

About this chapter

Cite this chapter

Huawei Technologies Co., Ltd.. (2023). Advanced Ethernet Switching Technologies. In: Data Communications and Network Technologies. Springer, Singapore. https://doi.org/10.1007/978-981-19-3029-4_8

DOI: https://doi.org/10.1007/978-981-19-3029-4_8
Published: 22 October 2022
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-3028-7
Online ISBN: 978-981-19-3029-4
eBook Packages: Computer Science, Computer Science (R0)
