1 Introduction

The provisioning of uninterrupted connectivity services is an indispensable feature of optical backbones. Considerable tangible and intangible losses, e.g., legal and reputational damages, can result from a mere cable cut, as high traffic aggregates are nowadays transported over a single fiber strand [1]. Therefore, in order to properly serve new demands from an information-dependent society, networks must be designed using architectures that are not only intrinsically survivable, but also resilient in the face of a reasonable set of fault events.

Survivability is defined in [2, 3] as the ability of networks to fulfill their mission in a timely manner in the presence of attacks, failures, or accidents. Raising the bar for network architects, resilience is defined in [4] as the capability of a network to minimize the effects of link or node failures, thereby ensuring the sustainability of the provided services. For instance, a 16-node ring network survives any single node or link failure, but this network is not resilient: when any link fails, the distance between the corresponding neighbor nodes increases from 1 to 15 hops, and a 2-hop path must be replaced by a 14-hop path after any node failure. Therefore, considerable side effects can be felt at higher network layers, with implications for the service delivered to end users.

Modern optical backbones encompass complex multilayer mechanisms. Wavelength division multiplexing (WDM) transparently provides end-to-end interconnects over the physical topology. The coarse bandwidth granularity and rigid spectrum allocation of WDM have motivated recent developments in elastic networks and flexible transmission technologies, which aim at better grooming and transmitting heterogeneous traffic demands [5]. Indeed, these new spectrum- and bit-rate-manageable solutions provide extra degrees of freedom (i.e., elastic and tunable transponders) for accommodating the extra demands that arise when restoring impacted traffic, over sub- and super-wavelength services, thereby improving overall network resilience [6].

Nevertheless, re-grooming, re-routing, or simply switching traffic demands around faulty elements, with the least impact on both working and protected/restored demands, is still a huge challenge for backbone network architects. The intrinsic complexity of this multilayer problem makes exact optimization approaches unlikely to be solvable in polynomial time. But there is a fundamental aspect that has been overlooked in attempts to improve network resiliency: regardless of the overlay grooming and transmission technologies, the outcomes of traffic engineering optimization tools will always be constrained by the connectivity diversity of the underlay graph at the physical layer.

Nowadays, survivability is a necessary but not a sufficient condition. It is no longer acceptable to design networks that merely provide means of service continuity. Consider, for instance, the challenge of designing optical backbones to support the emerging services of the 5th generation (5G) mobile communications network. Ultra-reliable low-latency communications (URLLC) assume latency-bound and highly dependable underlying connectivity across distributed datacenters [7]. Thus, optical backbones, operating under either normal or faulty conditions, should meet stringent requirements far beyond mere survivability metrics.

Unfortunately, the topological design of backbones is a complex task in its own right, which is why the literature has addressed it using heuristic, rule-of-thumb, and demographic methods, e.g., [8,9,10]. Lately, graph theory has joined forces on this front. By abstracting physical aspects and demand parameters, graph theory focuses simply on identifying potentially good templates for network topologies, rather than chasing them in the extremely large multidimensional search spaces of multi-parameter models. Previous works have investigated Harary graphs for tackling survivability in optical backbones [11], twin graphs in datacenter networks [12], and graph invariant optimization for optical backhaul networks [13].

The contribution of this paper is to propose to the network community an interconnection paradigm that systematically tackles resiliency (and survivability as a consequence) when designing or evolving backbone networks. Grounded in graph theory, our solution exploits a particular 2-geodetically connected (2-GC) graph family called twin graphs. A key point is that such graphs are optimal in providing 2-GC with the least number of links. They can also be built on top of classical ring-based legacy solutions, which were originally designed with only survivability in mind. Thus, backbones designed under this new paradigm can gracefully evolve from survivable rings into intrinsically resilient meshes.

The remainder of the paper is organized as follows. Section 2 discusses the evolution of backbone networks toward resilient and affordable architectures in the face of growing demands from dependable services. We then present in Sect. 3 a formal definition of twin graphs and a constructive method to generate them [14]. In order to demonstrate the practical relevance of twin graphs, we compare them in Sect. 4 with a set of real-world backbone topologies (and rings) with the same number of nodes. Section 5 brings a use case for the direct comparison of a geographically mapped twin topology with a real-world topology. Finally, Sect. 6 presents our concluding remarks and future works.

2 Why Are Rings No Longer Good Enough for Our Networks?

Survivability in ring topologies is due to their 2-connectivity, whereas resilience requires the 2-geodetic connectivity that rings are unable to provide. The difference is that a 2-geodetically connected topology intrinsically provides at least two node-disjoint (working and backup) paths of minimal length (i.e., geodesics) between each node pair of the network, whereas a 2-connected topology provides at least two node-disjoint paths between each node pair, no matter their lengths. Thus, besides the survivability assured by the simple existence of backup paths, backbone networks should ensure that backup paths are not much longer than the working ones, in order to provide service resilience either through protection or restoration strategies. Protection refers to techniques in which backup paths are defined prior to the occurrence of any failure, whereas in restoration techniques the backup paths are only defined when failures occur.

On the other hand, the ring topology is the prototype traditionally used to address the survivability issue with the least number of extra links. The transition from survivable to resilient topologies should take this legacy architecture into account and upgrade it as inexpensively as possible. Service providers cannot afford to abundantly deploy spare resources for protecting network elements in the face of their ever-diminishing revenues.

2.1 Evolving Rings into Meshes

Ring topologies have scalability limitations. Big rings, i.e., many nodes spread across large geographical areas, are neither economically nor technically viable. For instance, SONET/SDH and the Resilient Ethernet Protocol limit their rings to 16 nodes [15]. Currently, interconnected rings are still believed to be the appropriate architectural arrangement for larger networks. Hierarchical structures composing collector and core rings can be built [1], but note that a single node failure can isolate parts of the network. Therefore, the dual-homed ring is the most appropriate approach to follow when building interconnected rings, in spite of its higher deployment cost. Standardization bodies also support the multi-ring/ladder network prototype as a way of building a mesh through interconnected rings, e.g., [15]. Note that such an arrangement also makes network topologies gradually depart from the simple and cost-effective ring-based approach, while demanding more complex and less effective protection schemes [16].

As a result, network architects are already arguing against a rigid (dual-homed) ring hierarchy and advocating for a flexible mesh hierarchy [17]. Note that there is still an evident inclination in those modern proposals toward following classic rules of thumb, such as building meshes with concentric and interconnected rings [1], to ensure survivability. There are initiatives such as the use of bounded rings in [18], which provides an integer linear programming model to design such a topology; some heuristic algorithms are also presented in [19,20,21]. Unfortunately, so far no systematic and scalable topology model has been proposed to grow survivable ring networks into resilient meshes.

2.2 How Can Graph Theory Help?

We argue in this paper that the principle of network resilience has been neglected at the network topology design phase. Resilience has usually been left to be resolved by protection and restoration techniques after a survivable network topology has been put in place.

It is usually assumed that survivability and resiliency are different instances of the network design problem, or at least that tackling both at once is far too complicated. Not only are inefficient solutions being proposed for protecting traffic, but complex and slow traffic restoration mechanisms can also result from an underlying constrained physical topology. Note that a consistent study of graph properties can address resiliency (and survivability as a consequence) and also unlock features to be used by overlay networks embedded in the physical topology.

But to be practical, a topology design coming from graph theory must prove to be geographically viable. Network architects should also be able to expand their networks in a pay-as-you-grow style, without being constrained by modular, “exotic”, or inflexible node interconnection architectures. It is also highly desirable that these graphs resemble (or be compatible with) legacy solutions, such as multi-ring topologies. This may ease their adaptation to existing operators' control and management frameworks.

3 Background: Twin Graphs

Twin graphs are particular 2-connected graphs that have at least two node-disjoint geodesics between every two non-adjacent nodes and require the minimum number of links to do so [22]. In the graph theory literature, they are called minimum-size 2-geodetically-connected graphs. Each twin graph has order \(n \ge 4\) nodes and size \(m=2n-4\) links [22].

Twin graphs are recursively defined as follows [22]: (i) the cycle of order 4 \((C_4)\) is a twin graph; (ii) if G is a twin graph of order n and \((u, v)\) is a twin pair in G (i.e., a pair of nodes that have the same neighbors), then the addition of a new node \(v'\) by means of two new links \(uv'\) and \(vv'\) produces a twin graph \(G'\) of order \(n+1\). Thus, in order to grow a twin graph G, only two steps are needed:

  1. Identify a twin pair \((u, v)\) in G;

  2. Build \(G'\), where \(V(G') = V(G) \cup \{v'\}\) and \(E(G') = E(G) \cup \{ uv', vv'\}\).

Notice that this scaling-up process can also be used to add more than one node at a time to a twin graph. Indeed, one can add a new node at each of the twin pairs of interest in order to build larger twin graphs. Moreover, other graph operations, such as the merging process proposed in [14], can also be used to build larger twin graphs.
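
This recursive construction is straightforward to implement. The following minimal sketch (assuming the networkx library is available; function names are ours, not from [14] or [22]) grows a twin graph from the \(C_4\) and checks the size invariant \(m = 2n-4\):

```python
# A minimal sketch of the recursive construction in this section:
# start from C4 and repeatedly attach a new node to a twin pair.
import networkx as nx
from itertools import combinations

def twin_pairs(G):
    """Twin pairs of G: node pairs with identical neighborhoods (hence non-adjacent)."""
    return [(u, v) for u, v in combinations(G.nodes, 2)
            if set(G[u]) == set(G[v])]

def grow(G, pair, new_node):
    """Attach new_node to both members of a twin pair, yielding a twin graph on n+1 nodes."""
    H = G.copy()
    H.add_edges_from([(pair[0], new_node), (pair[1], new_node)])
    return H

# Grow the C4 into a 7-node twin graph, one node at a time.
G = nx.cycle_graph(4)            # the base case C4
for n in range(4, 7):
    pair = twin_pairs(G)[0]      # any twin pair works; different choices branch the family
    G = grow(G, pair, n)

assert G.number_of_edges() == 2 * G.number_of_nodes() - 4   # size m = 2n - 4
```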

When a twin graph is recursively generated from the \(C_4\), each added node creates new cycles of order 4; thus, all nodes and links of a twin graph belong to cycles of order 4 (see Fig. 1 in Sect. 4).

In summary, twin graphs can be easily generated by a recursive method, which consists of adding nodes by means of twin pairs. One can always grow a twin graph by adding a node to a twin pair, and in general it can be done in several ways, since twin pairs are not unique [22].

4 Results

In this section, practical issues are objectively investigated before suggesting twin topologies to network architects. Although this is not an exhaustive list of practical points to be considered, we expect to provide preliminary and yet solid results supporting twin graphs as a reference model for optical backbone topological design. Cable cuts caused by construction work are the most frequent disruptions to optical backbones [23]. Therefore, this paper focuses only on link failure analysis when testing resiliency and survivability.

4.1 Scalability and Flexibility

To illustrate the scalability of twin topologies discussed in Sect. 3, Fig. 1 presents the growing process (from n to \(n+1\)) for all (non-isomorphic) twin topologies from 4 up to 7 nodes. Starting from the \(C_4\) (first row, first column), by adding an extra node, only one twin topology on 5 nodes can be built (second row, first column). Starting from this topology (first row, second column), two different twin topologies on 6 nodes can be built (second row, second column). These, in turn, generate only two different twin topologies on 7 nodes (second row, third column). Finally, starting from these two topologies (first row, fourth column), one can generate four different twin topologies on 8 nodes (second row, fourth column).

Fig. 1. Growing twin topologies from the \(C_4\). Blank nodes and dashed lines represent all possible (non-isomorphic) ways of growing twin topologies of order \(4 \le n \le 7\) (top row) by means of twin pairs, highlighted in colors, in order to build new twin topologies of order \(n+1\) (bottom row). (Color figure online)

4.2 Topology Diversity

An optical backbone is a complex physical system, and a graph is merely a very simplified abstraction meant just to represent node adjacency; physical distance, link capacity, and other aspects are not represented. A given graph may be infeasible due to geographical obstacles, so topology diversity gives designers options to pick and choose from a set of equivalent and yet different solutions. The importance of topology diversity and graph weighting will be illustrated later in a use case.

Expanding results previously illustrated in Fig. 1 beyond \(n=7\), Table 1 brings the number of twin topologies with \(n \le 17\) nodes. As expected, the number of twin topologies increases with n, since all twin topologies of order \(n+1\) can be generated from a twin topology of order n.

Table 1. Number of twin topologies with \(n \le 17\) nodes.

For each n there is a twin topology of diameter 2, namely, the complete bipartite graph \(K_{2,n-2}\) [22]. This particular instance is also known as dual hub architecture [24]. Besides this, for each \(n \le 17\), we have found at least one twin topology for each diameter in the range from 2 to \(\lfloor n/2 \rfloor \).
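
This property is easy to check programmatically; a minimal sketch (assuming the networkx library) confirms that \(K_{2,n-2}\) attains the twin-graph size \(m = 2n-4\) and diameter 2 for \(4 \le n \le 17\):

```python
# Sanity check: the dual-hub topology K_{2,n-2} has exactly 2n - 4
# links (the twin-graph size) and diameter 2 for every order n >= 4.
import networkx as nx

for n in range(4, 18):
    G = nx.complete_bipartite_graph(2, n - 2)
    assert G.number_of_edges() == 2 * n - 4
    assert nx.diameter(G) == 2
```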

Another important feature of twin topologies is that, given the network order, the number of possible topologies is bounded. For instance, for \(n=17\) there are 310 twin topologies to be investigated. This diversity also helps in solving the physical topology design problem by finding topologies with a good trade-off between diameter and maximum degree, depending on the specificities of the interconnect problem, e.g., [12].

4.3 Resiliency to Link Failures Between Neighbor Nodes

Every twin topology survives single link failures, as does every 2-connected topology. However, a cut between neighboring nodes in any topology always comes with a negative impact on routing: a working path of length one must necessarily be replaced by a longer backup path. We propose the link removal impact, denoted as \(\varDelta _h\), to measure the extent of that impact on routing by summing up, for each pair of nodes, the difference in length between the backup and working paths. More formally, the link removal impact is written as:

$$\begin{aligned} \varDelta _h = \sum ^{n-1}_{u=1}\sum ^{n}_{v=u+1} (h^{b}_{uv} - h^{w}_{uv}), \end{aligned}$$
(1)

where \(h^{b}_{uv}\) and \(h^{w}_{uv}\) are, respectively, the lengths of the backup and working paths, in number of hops, from node u to node v.
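
As an illustration of Eq. (1), the sketch below (assuming the networkx library; the function name is ours) computes \(\varDelta _h\) for a single link cut by comparing all-pairs hop counts before and after the removal:

```python
# Link removal impact (Eq. 1): for one link cut, sum the hop-count
# increase of the backup (post-failure shortest) path over the working
# (pre-failure shortest) path, for all node pairs.
import networkx as nx
from itertools import combinations

def link_removal_impact(G, link):
    """Delta_h for removing one link; assumes G stays connected (2-connectivity)."""
    working = dict(nx.all_pairs_shortest_path_length(G))
    H = G.copy()
    H.remove_edge(*link)
    backup = dict(nx.all_pairs_shortest_path_length(H))
    return sum(backup[u][v] - working[u][v] for u, v in combinations(G.nodes, 2))

# Example: any single cut in a 16-node ring forces long detours (cf. Sect. 1).
print(link_removal_impact(nx.cycle_graph(16), (0, 1)))
```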

Results of the link removal impact can be seen in Fig. 2 for all twin topologies with \(4 \le n \le 17\) nodes, rings with up to 20 nodes, and real-world optical backbone topologies (reported in [25]) of order up to 20. It is noteworthy that in Fig. 2 all twin topologies with a given number of nodes have the same link removal impact. Moreover, compared to the rings and the real-world networks under study, the twin topologies present the lowest link removal impact. Thus, among these sets of networks, the twin topologies are the most resilient regarding cuts between neighboring nodes.

It is important, however, to recall that any pair of non-adjacent nodes in a twin graph has at least one backup path with exactly the same number of hops as the working path, since twins are 2-GC topologies. Hence, in twin topologies, a link removal has absolutely no hop-count impact on routing between non-adjacent node pairs; only the neighboring pair directly connected by the cut link is affected.
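
This claim can be verified computationally; the sketch below (again assuming networkx) checks, for a given twin topology, that every single-link removal leaves the hop count between all non-adjacent node pairs unchanged:

```python
# Check the 2-GC hop invariance: removing any single link of a twin
# graph does not change the distance between non-adjacent node pairs.
import networkx as nx
from itertools import combinations

def non_adjacent_hops_invariant(G):
    d0 = dict(nx.all_pairs_shortest_path_length(G))
    for link in list(G.edges):
        H = G.copy()
        H.remove_edge(*link)
        d1 = dict(nx.all_pairs_shortest_path_length(H))
        if any(d1[u][v] != d0[u][v]
               for u, v in combinations(G.nodes, 2) if not G.has_edge(u, v)):
            return False
    return True

# K_{2,10} is a 12-node twin topology (the dual hub of Sect. 4.2).
assert non_adjacent_hops_invariant(nx.complete_bipartite_graph(2, 10))
```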

Fig. 2. Impact of all possible cuts between neighboring nodes, for twin graphs with up to 17 nodes, rings with up to 20 nodes, and real-world optical backbone topologies (reported in [25]) of order up to 20.

4.4 Survivability to Multiple Link Failures

Any twin topology will survive a single failure, but its 2-GC feature cannot guarantee that it will survive multiple failures. Following the procedure described in [11], we computed the relative number of link cut sets of sizes 2, 3, and 4 for the three sets of topologies. This procedure allows us to analyze twin topologies stressed with multiple failures.

The number of link cut sets of size i, denoted as \(S_i\), gives the number of ways a network becomes disconnected after removing i links. When normalizing \(S_i\) with respect to the total number of sets of i links, we get the relative number of cut sets of size i, which is denoted as \(S_i(\%)\). We can formally present it as:

$$\begin{aligned} S_i~(\%) = \frac{S_i}{\binom{m}{i}} \times 100, \end{aligned}$$
(2)

where \(\binom{m}{i}\) is the number of combinations of m links taken i at a time without repetition.

Figure 3 shows \(S_i~(\%)\) for \(i = 2\), 3, and 4. For instance, \(S_2~(\%) = 25\%\) means that a quarter of all possible simultaneous failures of two links will disconnect the network. Thus, the higher this number, the more likely the network is to be disconnected.
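
To make the computation concrete, the following sketch (assuming networkx; the exhaustive enumeration is feasible only for the small values of m and i used here) computes \(S_i~(\%)\) by testing every set of i links:

```python
# Relative number of cut sets S_i(%) (Eq. 2): enumerate all sets of
# i links and count those whose removal disconnects the network.
import networkx as nx
from itertools import combinations
from math import comb

def relative_cut_sets(G, i):
    m = G.number_of_edges()
    cuts = 0
    for links in combinations(G.edges, i):
        H = G.copy()
        H.remove_edges_from(links)
        if not nx.is_connected(H):
            cuts += 1
    return 100.0 * cuts / comb(m, i)

# Any two simultaneous cuts disconnect a ring, so this prints 100.0.
print(relative_cut_sets(nx.cycle_graph(16), 2))
```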

Results in Fig. 3 show, as expected, that ring topologies become disconnected after removing any 2 or more links. The larger the cut set, the better the twin topologies perform in comparison to the group of real-world topologies. There is a clear and consistent trend: a lower probability of disconnecting the network when larger cut sets are considered for twin topologies in comparison to real-world topologies. This is highlighted by \(S_4~(\%)\) in Fig. 3(c) when \(n \ge 9\). Results for real-world networks are highly topology-dependent and therefore scattered, in contrast with the outcomes for twin topologies, which cluster in a narrow range despite the topology diversity seen in Table 1. This result shows that the overall vulnerability to simultaneous failure events can be significantly reduced by designing physical topologies as twin topologies.

Fig. 3. Relative number of cut sets of sizes (a) 2, (b) 3, and (c) 4 versus the number of nodes, for ring topologies, twin topologies, and real-world backbone topologies.

Fig. 4. Number of links required by twin graphs with up to 17 nodes, rings with up to 20 nodes, and real-world optical backbone topologies (reported in [25]) of order up to 20.

4.5 Number of Additional Links to Provide Intrinsic Resilience

We must test whether twin topologies produce networks compatible with current optical backbone networks. To that end, Fig. 4 presents the number of links versus the number of nodes for all twin topologies, rings, and real-world backbones under study. Evidently, the ring topology provides the lower bound, and it can be seen that twin topologies in general require only a few more links than real-world optical backbones.

The literature shows that real-world optical backbones typically have an average degree (that is, twice the number of links divided by the number of nodes) ranging from 2 to 4 [26]. For twin graphs, the average degree \(\overline{d}\) is given by:

$$\begin{aligned} \overline{d} = \frac{2m}{n} = \frac{4n-8}{n} = 4 - \frac{8}{n}. \end{aligned}$$
(3)

Thus, it is noteworthy that the average node degree is bounded to the interval \(2 \le \overline{d} < 4\) for twin topologies of any order. In other words, the number of links of twin and real-world topologies with n nodes lies between n and 2n.

We can conclude that, for a topology on a given number of nodes, intrinsic resiliency can be provided with almost the same number of links used by real-world topologies. Thus, all the aforementioned advantages of twin topologies may come at a very reasonable cost.

5 Use Case: A Resilient CESNET Redesign

In order to illustrate how our paradigm can be used in practice, we have chosen to redesign the 12-node CESNET network shown in Fig. 5(a) to improve its resilience. Out of the set of real-world networks, it is the one whose average degree is closest to that of a twin graph.

To map nodes from a twin topology to cities at their geographical positions, we exhaustively analyzed the twin topologies with 12 nodes, looking for the minimal total fiber length. The CESNET network redesigned as the twin topology expending the least accumulated fiber length is shown in Fig. 5(b). It is noteworthy that 11 out of the 19 links of the original network remain unchanged (solid lines) in its twin version.

Note that in this paper we initially chose to consider only unweighted links, because we focus on the topological features of twin graphs. However, there is no constraint against considering, a posteriori, links weighted by the geographical distances between nodes in order to optimally map a twin graph onto geographical positions. One may also use link capacities or any other link-weighting parameter.
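
To make the weighting step concrete, the sketch below illustrates the kind of objective we minimized: the total great-circle length of all links under a candidate node-to-city mapping. The coordinates, city labels, and helper function are hypothetical placeholders, not CESNET data:

```python
# Total fiber length of a geographical embedding: sum of great-circle
# link lengths when each graph node is placed at a candidate city.
import math
import networkx as nx

def haversine_km(p, q):
    """Great-circle distance between two (lat, lon) points, in km."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def total_fiber_length(G, mapping, coords):
    """Sum of link lengths when node u of G is placed at city mapping[u]."""
    return sum(haversine_km(coords[mapping[u]], coords[mapping[v]])
               for u, v in G.edges)

# Hypothetical example: a 6-node twin topology over made-up coordinates.
coords = {"A": (50.1, 14.4), "B": (49.2, 16.6), "C": (49.8, 18.3),
          "D": (50.2, 15.8), "E": (49.7, 13.4), "F": (50.8, 15.1)}
G = nx.complete_bipartite_graph(2, 4)        # a 6-node twin topology
mapping = dict(zip(G.nodes, coords))         # one candidate assignment
print(total_fiber_length(G, mapping, coords))
```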

The total fiber lengths of the original CESNET topology and of the corresponding twin topology are about 1842 km and 2137 km, respectively. This \(16\%\) increase is somewhat expected, since the original CESNET topology has 19 links, whereas any twin topology with 12 nodes has 20 links. Nevertheless, the benefit for resilience is clear: the link removal impact \(\varDelta _h\), originally 106, reduces to 40 (\(62\%\) smaller) in our twin-graph-based redesign. In addition, a significant improvement is achieved in the survivability to multiple link failures, measured by the relative number of cut sets of sizes 2, 3, and 4 shown in Table 2. For instance, \(23.3\%\) of all possible simultaneous failures of four links disconnect the original CESNET topology, compared to only \(15.3\%\) for its twin version shown in Fig. 5(b).

Fig. 5. (a) The original CESNET topology; (b) a twin topology labeled according to CESNET nodes in order to minimize the total fiber length. Solid lines represent links that exist in both network versions. Dashed lines represent links removed from (a) and links added in (b), respectively.

Table 2. Comparison of resilience and survivability for topologies shown in Fig. 5.

6 Concluding Remarks

We proposed the twin topology as a new reference model for designing (or redesigning) physical topologies of next-generation optical backbone networks. Twin graphs have an average degree in agreement with that of real-world optical backbone topologies. Besides the intrinsic resilience given by their 2-GC property, we found improved survivability to multiple failures, fine granularity for scalability, and topology diversity for better matching graphs to the real problem.

We also noticed that the twin graph on n nodes minimizing the diameter, i.e., the complete bipartite graph \(K_{2,n-2}\), corresponds to the solution presented in [24] to the problem of designing physical topologies that ensure logical rings of size \(n-2\) can be embedded in a survivable manner. Since solutions to this problem must have at least \(2n-4\) links ([24], Theorem 2.5), it is interesting to investigate which other twin graphs on n nodes also solve it.

Twin topologies solve, for \(\kappa =4\), the problem stated in [18], i.e., the problem of finding a 2-connected network in which each link belongs to a cycle of length at most \(\kappa \). For future work, it would be interesting to investigate topologies based on cycles of different orders (e.g., 3, 5, and 6) in the context of the design of optical backbone networks.

The implications of this new topology design for WDM systems are yet to be investigated. An exhaustive analysis of all 7-node networks has shown that throughput and blocking ratio strongly depend on the physical interconnection topology [27]. These results suggest that wavelength requirements will also be affected by the physical interconnection topology. It is expected that the intrinsic 2-GC features of twin topologies can facilitate wavelength routing and therefore reduce wavelength counts. Future work will also address the impact of our underlay twin topology on multi-layer resilience mechanisms considering elastic and flexible optical transmission technologies.