1 Introduction

In recent decades, fields ranging from statistical physics to computer science and biology have investigated the topological and dynamical aspects of systems from the perspective of Complexity Science. Accordingly, Complex Network Theory (CNT) has been the focus of intensive research (D'Agostino and Scala 2014) for analysing real-world systems (Pagani and Aiello 2013; Fiscon et al. 2018). Complexity Science is an emerging research approach concerned with complex systems, which are characterized by dynamic and nonlinear relationships between their components and exhibit non-trivial global behaviours at larger scales. Water distribution systems (WDSs) can also be modelled as spatial complex networks and studied within the CNT framework (Giudicianni et al. 2018). WDSs are among the most important urban infrastructures: they deliver drinking and industrial water to metropolitan areas, and their operability is crucial to social development and economic prosperity. These infrastructures continually face natural and man-made threats that cause economic and social disruption. Water company operators therefore work continuously on improving safety and security, from the design phase (Marinho et al. 2020) to the efficient management of repair crews in emergency circumstances (Wang et al. 2017) and the speeding up of mitigation actions during disasters (Paez et al. 2018), contamination events (Xin et al. 2017), and cyber-attacks (Taormina and Galelli 2018). In addition, as cities grow, these infrastructures become ever larger and their behaviour ever more complex. Consequently, analysis, management and maintenance become difficult tasks, often exacerbated by the lack of site-specific information.
CNT has proven to be particularly suitable for the analysis of WDSs, especially when information about the system is limited, for unveiling common properties and simplifying management and monitoring (Giudicianni et al. 2020).

1.1 Fractal Geometry Theory and its Applications

To provide a deeper understanding of the mechanisms underlying complex systems, many researchers have studied the possible self-similarity of their structure (i.e. their fractality), which is a remarkable example of a non-local property (Song et al. 2005). Fractal theory has been adopted to study many natural and artificial systems that exhibit similar patterns at different scales. Understanding fractals is crucial for decoding complex systems (Diao et al. 2017), as the behaviour of a large system can be revealed by focusing on only a small part of it. Mandelbrot (1982) defined a fractal as a set whose parts are similar to the whole. The similarity can be exact (i.e. the parts are precise copies of the whole at successive changes of scale) or, especially for natural and man-made systems, statistical (i.e. subsets are similar to the whole, but not identical to it). Such a definition implies scale invariance, i.e. an object looks similar at different scales of observation. As for graphs, a network is fractal if, when groups of adjacent vertices are replaced with hyper-nodes, the resulting network keeps a structure similar to the original one (Akiba et al. 2016). The study of fractality has provided insights into the modelling of network growth processes (Song et al. 2006), spreading processes (Serrano et al. 2011), vulnerability to failures (Hasegawa and Nemoto 2013), and network topology (Goh et al. 2006). However, only a few works have been dedicated to applications of fractal theory in water engineering. Qi et al. (2014) developed a fractal self-growth model to generate a cost-efficient design strategy for a water system. Kowalski et al. (2015) developed a method for the placement of water quality and pressure sensors based on network fractal properties. Di Nardo et al. (2017) applied fractal theory to assess the vulnerability of water networks to single pipe failures. Diao et al. (2017) developed a fractal-based analysis to identify critical pipes in WDSs. A review of the topic can be found in Iwanek et al. (2020), while, recently, Caldarola and Maiolo (2020) investigated the opportunities of using fractal theory for the description and interpretation of data and dynamics of real complex systems. Finally, the fractal dimension of complex networks has been reviewed by Wen and Cheong (2021).

1.2 Vulnerability Assessment for Water Distribution Systems

Vulnerability refers to the inability of a system to withstand the effects of a hostile environment, and vulnerability analysis aims at estimating the impact of the negative consequences of an increasing magnitude strain imposed to the system (disregarding the probabilities or frequencies of the strains affecting the system). It represents an emerging approach used within the management of critical infrastructures. Among them, WDSs – that are required to function properly under both normal and abnormal conditions – are currently undergoing large changes due to both urban population explosion and climate change. In the last decades, great effort has been dedicated to the topic of vulnerability assessment of WDSs. Haimes et al. (1998) reviewed needs and opportunities to reduce the vulnerability of WDSs to potential physical, chemical-biological, and cyber-attacks. Michaud and Apostolakis (2006) presented a methodology for ranking the WDS elements based on multi-attribute utility and graph theory algorithms. Gutiérrez-Pérez et al. (2013) introduced a methodology for an efficient WDS vulnerability analysis based on spectral graph theory to establish the relative importance of areas. Herrera et al. (2016) proposed a topology-based framework for large WDSs to estimate the redundancy of all possible routes from demand nodes to their supply sources. Shuang et al. (2014) evaluated the nodal vulnerability of WDSs under cascading failures in terms of loss of connectivity. Christodoulou and Fragiadakis (2015) proposed a methodology for the reliability assessment of WDSs by incorporating historical data on network performance, network component vulnerability against seismic loading, and topological information. Agathokleous et al. (2017) showed the effect of topology on network vulnerability through a betweenness centrality index.

In this work, the global vulnerability analysis of several WDSs is carried out by removing an increasing number of components and estimating the resulting consequences. This makes it possible to evaluate the ability of WDSs to resist disasters that severely impair their functioning by causing mechanical failures of components (e.g. pipe breakages). Water companies still need strategies to face these emergencies and to prioritize the use of available resources in order to restore service rapidly. Given the increasing interest in the management of WDSs during disasters and the frequent scarcity of site-specific information, simplified methods for a preliminary vulnerability evaluation are of crucial importance. Accordingly, the possibility of topologically ranking WDSs according to their proneness to be damaged by catastrophic events represents a valid tool for prioritizing restorations and better managing crews and resources, avoiding more dramatic consequences. Furthermore, during disaster scenarios, the change of topology dominates over the variation of model boundary conditions (e.g. water demand, initial tank levels), which become highly uncertain and of secondary importance. Consequently, a topological approach to the global vulnerability assessment of WDSs is more robust in the face of the uncertainties about the system, and makes it possible to identify the most disruptive alterations. The definition of such a topology-based tool is the main purpose of this paper. After the fractality of several WDSs is detected by means of a novel algorithm that exploits the community structure principle for the box-covering problem, a novel vulnerability index is proposed that simultaneously considers the global structure of the WDS graphs and their average level of connectivity. This allows for a more comprehensive description of the system behaviour.
The main contribution of this work is the investigation of the relationships between fractal features, topology, and global vulnerability during abnormal extreme events, contributing to the development of topological approaches for the analysis and management of WDSs.

2 Methodology

2.1 Estimating Fractal Dimension from Community Structure

The study of the fractality of a network is based on the so-called box-covering problem (Song et al. 2005). The problem consists of finding the minimum number NB of boxes (clusters) of equal size necessary to cover the whole network, where each box is a set of adjacent vertices such that each node is less than lB edges away from any other node in the box. One seed node is randomly chosen, and a box is formed from the nodes separated by at most lB from the seed node (the size of a box being the number of edges on the shortest path between its two farthest nodes). The procedure is repeated by choosing many seeds at random until the boxes cover the whole network. The process is then repeated for different values of the box size lB. The fractal dimension dB is given by the following power-law relationship (Song et al. 2005):

$${N}_{B}\approx {l}_{B}^{-{d}_{B}}$$
(1)

between the number of boxes NB and the box size lB, which reveals the self-similarity of the network topology. The problem is known to be NP-hard and computationally demanding (Song et al. 2007), and several heuristic algorithms for its approximate solution have been proposed (Schneider et al. 2012). For networks with a uniform node degree distribution (i.e. most nodes have the same number of connections, as in WDSs), fractality can also be defined through the “mass” MB of the boxes (their number of nodes) (Song et al. 2005), and Eq. (1) can be rewritten as:

$${M}_{B}\approx {l}_{B}^{{d}_{B}}$$
(2)

and the exponents of the two power-law relationships in Eqs. (1) and (2) are equal. Starting from these definitions, the idea behind the proposed novel algorithm, which can be regarded as an approximation of the box-covering problem, is to exploit the community structure principle. Rather than finding the minimum number of boxes NB (of a set size lB) necessary to cover the network, a clustering algorithm is run to define the NC clusters (boxes) into which the network can be split. Accordingly, hereinafter it is assumed that NB = NC, and the internal distance lB and the mass MB are renamed lC and MC, respectively. When a clustering algorithm is applied, each cluster may have a different size in terms of both lC and MC (Fig. 1). However, since in uniform networks all nodes have a similar number of connections, the mass MC, as well as the average distance between any pair of nodes (and in turn the internal distance lC), is expected to be similar for each cluster. Accordingly, an average size LC is defined for each clustering layout as the mean of the sizes lC of all its clusters:

Fig. 1 Application of the proposed fractal algorithm to a small example network (EN); clusters are highlighted with different colours. For each cluster, the number of nodes MC and the internal distance lC are reported: (a) EN; (b) EN divided into NC = 4 clusters with LC = 1.00; (c) EN divided into NC = 3 clusters with LC = 1.11; (d) EN divided into NC = 2 clusters with LC = 1.20

$${L}_{C}=mean({l}_{C})$$
(3)

and by replacing lB with LC and NB with NC, Eq. (1) changes:

$${N}_{C}\approx {L}_{C}^{-{d}_{B}}$$
(4)
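As an illustration of Eq. (3), the following minimal Python sketch computes the internal distance lC of each cluster and the average size LC of one clustered layout. It assumes that lC is measured inside the cluster only (shortest paths are not allowed to leave the cluster) and that every cluster is connected; the source does not state either point explicitly. The function names and the six-node path graph are hypothetical and serve only as an example.

```python
from collections import deque


def bfs_distances(adj, source, allowed):
    """Hop distances from `source`, walking only through nodes in `allowed`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v in allowed and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def cluster_size(adj, cluster):
    """Internal distance l_C: longest shortest path between two cluster nodes."""
    cluster = set(cluster)
    return max(max(bfs_distances(adj, s, cluster).values()) for s in cluster)


def average_size(adj, clusters):
    """L_C = mean(l_C) over all clusters of one layout (Eq. 3)."""
    sizes = [cluster_size(adj, c) for c in clusters]
    return sum(sizes) / len(sizes)


# Hypothetical path graph 0-1-2-3-4-5, split into two clusters.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
adj = {i: set() for i in range(6)}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

clusters = [{0, 1, 2, 3}, {4, 5}]
L_C = average_size(adj, clusters)  # l_C is 3 for the first cluster, 1 for the second
```

Here the two clusters have lC = 3 and lC = 1, so LC = 2.0 for this layout.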

The proposed algorithm consists of the following steps:

  1) setting the number of clustered layouts S. The relationship in Eq. (4) is based on a large number of clustered layouts (herein S = 100), each characterized by a different number of clusters NC. Denoting by n the number of nodes of the network, it has been observed for the analysed WDSs that setting the lowest number of clusters to NC,min ≈ 0.04n and the highest to NC,max ≈ 0.20n is enough to detect the power-law relationship;

  2) dividing the network into NC clusters. The heuristic Louvain clustering algorithm (Blondel et al. 2008), one of the best-performing recent algorithms, has been chosen. It is based on the optimization of the modularity (ranging between -1 and 1), which measures the density of links inside clusters as compared to links between clusters;

  3) calculating the size lC of each cluster and the average size LC (Eq. 3);

  4) repeating step 3) for all the S clustered layouts;

  5) fitting the obtained pairs (NC, LC) with the power-law equation (Eq. 4), the exponent of which is the estimated fractal dimension dB.
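Step 5) can be sketched as an ordinary least-squares fit in the log–log plane, where Eq. (4) becomes a straight line with slope -dB. The sketch below is only one plausible way to perform the fit; the function name and the synthetic (NC, LC) pairs are hypothetical, and the clustering step itself (Louvain) is not reproduced here.

```python
import math


def fit_fractal_dimension(n_clusters, avg_sizes):
    """Least-squares line through (log L_C, log N_C); returns (d_B, R^2)."""
    x = [math.log(v) for v in avg_sizes]
    y = [math.log(v) for v in n_clusters]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    r2 = (sxy * sxy) / (sxx * syy)
    return -slope, r2  # Eq. (4): N_C ~ L_C^(-d_B), so d_B = -slope


# Synthetic pairs generated from N_C = 200 * L_C^(-1.5); not data from the paper.
avg_sizes = [2.0, 3.0, 4.0, 6.0, 8.0]
n_clusters = [200.0 * s ** -1.5 for s in avg_sizes]
d_B, r2 = fit_fractal_dimension(n_clusters, avg_sizes)
```

Because the synthetic pairs follow an exact power law, the fit recovers dB = 1.5 with R2 = 1 up to rounding; real clustered layouts would scatter around the regression line.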

It is worth highlighting that the algorithm has been tested on a benchmark network whose fractal dimension is known, having been calculated with heuristic box-covering algorithms. The results are very similar, supporting the efficiency and reliability of the proposed algorithm.

2.2 Vulnerability Assessment to Multiple Random Pipe Failures

The paper investigates the global vulnerability of WDSs (in terms of loss of connectivity) under multiple random pipe failures, which simulate the occurrence of a catastrophic event. The triggering condition considered is a random attack, i.e. one that affects randomly selected network components. The aim is also to simulate the case in which information about the system is not fully available (a common situation for WDSs). Consequently, only network topology is considered, implicitly assuming that all elements are of equal importance. For each WDS, a set of R = 100 pipe failure scenarios has been generated, each by randomly removing an increasing number of pipes until the network is disconnected. For each scenario, the number of pipe failures (pipef) necessary to disconnect the WDS has been calculated and then normalised by the total number of pipes m in the network. For each WDS, the modal value Fmodal of the normalised number of pipe failures required to disconnect the network is then calculated. The idea is that, if a structural self-similarity (fractality) exists in the analysed WDSs, it may also be revealed by a specific number of pipe failures that most likely disconnects the system. This aspect further justifies the random generation of the failure scenarios. Branched structures are expected to need fewer pipe failures than looped ones. Accordingly, the Fmodal value can be used to quantify the global vulnerability of WDSs.
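The failure procedure described above can be sketched in plain Python as follows. This is an illustrative reading of the text, not the WNTR-based implementation used in the study: pipes are removed one at a time in random order until the graph splits, and the mode is taken over the raw counts, with ties resolved in favour of the smallest count (an assumption, since the source does not say how the modal value is extracted). All function names and the two toy graphs are hypothetical.

```python
import random
from collections import Counter


def is_connected(nodes, edges):
    """True if the undirected graph (nodes, edges) forms a single component."""
    adj = {u: [] for u in nodes}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(nodes)


def failures_to_disconnect(nodes, edges, rng):
    """Remove pipes in random order; return the count that first disconnects."""
    order = list(edges)
    rng.shuffle(order)
    for k in range(1, len(order) + 1):
        if not is_connected(nodes, order[k:]):
            return k
    return len(order)


def modal_normalised_failures(nodes, edges, scenarios=100, seed=0):
    """F_modal: most frequent pipe_f over the scenarios, divided by m."""
    rng = random.Random(seed)
    counts = Counter(
        failures_to_disconnect(nodes, edges, rng) for _ in range(scenarios)
    )
    top = max(counts.values())
    pipef_mode = min(k for k, c in counts.items() if c == top)  # tie-break: assumption
    return pipef_mode / len(edges)


# Two toy layouts on six nodes: a single loop and a purely branched path.
nodes = list(range(6))
ring_edges = [(i, (i + 1) % 6) for i in range(6)]
path_edges = [(i, i + 1) for i in range(5)]
ring_F = modal_normalised_failures(nodes, ring_edges)
path_F = modal_normalised_failures(nodes, path_edges)
```

On these toy layouts the loop always needs two failures (Fmodal = 2/6) while the path is disconnected by any single failure (Fmodal = 1/5), in line with the expectation that branched structures need fewer failures than looped ones.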

2.3 Topological Metrics

Some robustness topological metrics, borrowed from Complex Network Theory and adopted for the quantification of the global vulnerability of networked systems, have been used for comparison. WDSs can be modelled as undirected graphs G = (V, E), where V is the set of the n nodes, and E is the set of the m links. A mathematical representation is provided by the Adjacency matrix A (symmetric for undirected networks) whose elements aij = aji = 1 if there is a link between nodes i and j, and aij = aji = 0 otherwise. Subsequently, the Laplacian matrix L = D–A is defined, where D = diag(ki) and ki is the node degree (number of connections). The second smallest eigenvalue of L, called algebraic connectivity λ2, quantifies the proneness of a network to be split into independent components. The distance dij between each pair of nodes is computed as the number of links along the shortest path connecting them. Table 1 lists the adopted robustness topological metrics (see Giudicianni et al. (2018) for a detailed description).

Table 1 Name, formulation, and vulnerability interpretation for the adopted topological robustness metrics
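For orientation, the sketch below builds the adjacency and Laplacian matrices defined above and evaluates three of the metrics for a small graph. The formulas used for link density (q = 2m/(n(n-1))), average node degree (K = 2m/n) and average path length (mean of dij over all node pairs) are the standard graph-theoretic definitions and are assumed, not confirmed, to match those of Table 1. The algebraic connectivity λ2 is omitted because it requires an eigenvalue solver. The function name and example graph are hypothetical.

```python
from collections import deque


def topological_metrics(n, edges):
    """Return link density q, average degree K, average path length l,
    and the Laplacian L = D - A of a connected undirected graph."""
    m = len(edges)
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = A[j][i] = 1
    degree = [sum(row) for row in A]
    L = [[(degree[i] if i == j else 0) - A[i][j] for j in range(n)]
         for i in range(n)]

    q = 2 * m / (n * (n - 1))
    K = 2 * m / n

    total = 0
    for s in range(n):  # breadth-first search from every node
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if A[u][v] and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    l = total / (n * (n - 1))  # mean over ordered pairs; graph assumed connected
    return q, K, l, L


# Hypothetical branched layout: path graph 0-1-2-3.
q, K, l, L = topological_metrics(4, [(0, 1), (1, 2), (2, 3)])
```

For this four-node path, q = 0.5, K = 1.5 and l = 5/3; each row of L sums to zero, as expected for a Laplacian matrix.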

2.4 Software and Tools

The adopted algorithms have been implemented in Python 3.6. In particular, the novel algorithm for estimating the fractal dimension has been developed by iteratively applying the Louvain algorithm using NetworkX 2.4, a Python package for complex network analysis. The same package has been used to calculate the topological metrics. The multiple random failures in WDSs have been simulated with the Python package Water Network Tool for Resilience (WNTR) v0.2.2, and the statistical analysis has been done with the Python library SciPy 1.5.0.

3 Case Study

For testing the proposed approach, nine graphs of WDSs are analyzed; seven out of nine refer to real WDSs and the other two are synthetic (data-file sources in Giudicianni et al. 2018), with size (number of nodes n) ranging between n = 37 and n = 1893 (see Table 2).

Table 2 Characteristics and simulation results of the analysed WDSs: name, number of pipes m, number of nodes n, density q, average node degree K, average path length l, algebraic connectivity λ2, type, layout, fractal dimension dB and modal Fmodal value of normalised failures

4 Results and Discussion

The values of the robustness metrics listed in Table 2 show that, from a topological point of view, the analysed WDSs are very different from each other, especially in terms of link density q and algebraic connectivity λ2. Working on topologically very different WDSs is crucial for further validating the proposed approach. Fig. 2 plots the results obtained by applying the proposed algorithm to estimate the fractal dimension of the WDSs. The existence of power-law relationships between the number of clusters NC and the average size LC for all the analysed WDSs is confirmed by the linear regression lines in the log–log plane, with coefficients of determination R2 > 0.95. This confirms the spatial self-similarity of WDS layouts at different scales and in different areas. Accordingly, it can be supposed that their response (in terms of loss of connectivity) to an event causing diffuse pipe breakages is also self-similar, so that a specific proneness to disconnection can be revealed. The last two columns of Table 2 list the values of dB and Fmodal for the analysed WDSs. The WDSs with a more looped layout, like Fossolo, show the highest values of Fmodal (i.e. the highest number of pipe failures necessary to disconnect the graph). On the contrary, WDSs with a branched layout, like Wolf, show the lowest values of Fmodal. This agrees with the values of the robustness metrics and their interpretations (Table 1). The study of the relationships between the topological robustness metrics and the new index Fmodal, used to express the global vulnerability of WDSs, represents another novel point of the current research.

Fig. 2 Power-law relationships between the number of clusters NC and the average size LC in the log–log plane for the analysed WDSs

Fossolo network is characterised by the highest values of link density q, average node degree K and algebraic connectivity λ2, and the lowest values of average path length l, which lead to suppose that it is the most robust from a topological viewpoint. On the other hand, Wolf network shows the lowest values of q, K, and λ2, and the second highest value of l, in agreement with the lowest value of Fmodal. It is evident that Fossolo also shows the lowest value of dB while Wolf is characterised by the highest one. Without loss of generality, Fig. 3 reports the frequency histograms of the number of normalised failures necessary to disconnect the network for two of the analysed WDSs. The plots show two different shapes for the histograms, pointing out two different behaviours in response to the same stress conditions for topologically dissimilar WDSs. In Fig. 3a, the frequency distribution has the shape of an asymmetric bell, while in Fig. 3b the histogram resembles a geometric type distribution. For the analysed WDSs, it has been noticed that the bell-shaped histograms are typical of densely looped systems (like Parete), while the geometric types characterise branched (or partly looped) networks (like Matamoros). This aspect further highlights the influence of the topology on the behaviour of WDSs with respect to failures. Such a dissimilarity can be explained by the fact that looped WDSs can be disconnected by both small and big number of failures (depending on where they occur), while branched WDSs are more likely disconnected by small numbers of failures, owing to the absence of highly interconnected zones.

Fig. 3 Normalised failure distribution: (a) WDS of Parete; (b) WDS of Matamoros

Since the fractal dimension can be considered a measure of network organization, Fig. 4a shows the relationship between dB and Fmodal for the analysed WDSs. A decreasing trend of dB is evident as the number of required failures increases, which describes well the proneness of WDSs to disconnection. For some networks, however, this consistency is less clear. To strengthen the correlation, the fractal dimension dB has been normalised by the average node degree K, and the trend of dB/K against Fmodal is plotted in Fig. 4b. The average node degree was chosen for the normalisation for several reasons:

  • K represents a crucial topological feature characterising the structure of graphs;

  • K implicitly considers the number of elements composing the network (see Table 1);

  • K shows limited variability for WDSs (between two and four (Giudicianni et al. 2018)).

Fig. 4 (a) Fractal dimension dB versus modal normalised number of pipe failures Fmodal; (b) ratio between fractal dimension dB and average node degree K versus modal normalised number of pipe failures Fmodal

The latter feature assures easily interpretable values of the normalized fractal dimensions. Indeed, using other topological metrics (e.g. average path length, link density, algebraic connectivity, number of nodes composing the graph), for which lower and upper bounds are not well defined, would result in an extremely wide range for the normalized fractal dimension. The normalised fractal dimension dB/K still decreases as the number of normalised failures increases, but with a clearer trend. Accordingly, WDSs characterised by lower values of dB/K are less vulnerable to be disconnected (or, equivalently, to be cut) by randomly generated multiple failures. For this reason, the normalised fractal dimension dB/K can be assumed as a measure of the global vulnerability to pipe failures. Hence, a novel vulnerability index is proposed, called Cut-Vulnerability Vcut, for which higher values indicate higher vulnerability:

$${V}_{cut}= \frac{{d}_{B}}{K}$$
(5)
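A small numerical illustration of Eq. (5) is given below. The three networks and their dB and K values are hypothetical and chosen only to show that two systems with the same average node degree K can receive different Vcut values when their fractal dimensions differ.

```python
def cut_vulnerability(d_B, K):
    """Cut-Vulnerability index V_cut = d_B / K (Eq. 5)."""
    return d_B / K


# Hypothetical (d_B, K) pairs; not values from Table 2.
networks = {"A": (1.2, 2.4), "B": (1.8, 2.4), "C": (1.8, 2.0)}
v_cut = {name: cut_vulnerability(d, k) for name, (d, k) in networks.items()}

# Rank from most to least vulnerable (higher V_cut means higher vulnerability).
ranking = sorted(v_cut, key=v_cut.get, reverse=True)
```

Networks A and B share K = 2.4, yet B ranks as more vulnerable because of its larger dB; C, with the same dB as B but a lower K, ranks highest.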

The suitability of the proposed Cut-Vulnerability index Vcut has been further validated through the analysis of its relationships with the topological robustness metrics.

Figure 5 shows clear and consistent monotonic trends. Fig. 5a shows that Vcut decreases as the link density q increases; Fig. 5b shows that Vcut increases as the average path length l increases; finally, from Fig. 5c, it is clear that Vcut decreases as the algebraic connectivity λ2 increases. To further validate the effectiveness of the proposed vulnerability index Vcut with respect to the other topological metrics used as proxies of vulnerability, the Spearman’s correlation coefficient ρs (Spearman 1904) between them and Fmodal has been calculated. This coefficient quantifies the strength of a monotonic relationship (either decreasing or increasing) linking two data samples when the structure of the relationship is unknown. The values -1 and 1 indicate perfectly negatively and positively correlated variables, respectively. The Spearman’s correlation coefficients ρs between Fmodal and q, K, l, λ2, and Vcut are 0.93, 0.42, -0.78, 0.82, and -0.94, respectively, with a significance level α = 0.01. While K and l show the weakest correlations with Fmodal, Vcut has the highest absolute value of ρs. Furthermore, with respect to the other topological metrics, Vcut index simultaneously considers the global structure of the graphs, i.e. their self-similar feature (through the fractal dimension dB) and their average level of connectivity (through the average node degree K), providing a more comprehensive information about the structure of the WDSs. This new index highlights how crucial is the combination of connectivity and topological organization for defining network behaviour. Networks with the same connectivity K, but with different values of the fractal dimension dB, behave differently with respect to failures. In this regard, another contribution of this paper is the analysis of the mutual influence between topology and mechanical failures, and how topology affects the response to pipe failures.
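The Spearman coefficient used above is the Pearson correlation of the ranks of the two samples. The following sketch implements it directly, with tied values sharing their average rank; the study itself used SciPy, so this is only an explanatory re-implementation. The function names and the two short samples are hypothetical and are not the values of Table 2.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman coefficient: Pearson correlation of the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical samples: V_cut and F_modal for five imaginary networks.
v_cut = [0.9, 0.7, 0.6, 0.4, 0.3]
f_modal = [0.02, 0.05, 0.04, 0.10, 0.20]
rho = spearman_rho(v_cut, f_modal)
```

For these samples the ranks differ from a perfect inversion in one swapped pair only, giving ρs = -0.9; a strictly decreasing relationship would give exactly -1.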

Fig. 5 Relationships between cut-vulnerability index Vcut and topological metrics: (a) Vcut versus link density q; (b) Vcut versus average path length l; (c) Vcut versus algebraic connectivity λ2

5 Conclusions

With the aim of studying the criticality of water distribution networks facing natural disasters in the frequent case of scarce site-specific information, and of defining new modelling tools, the present work brings together different fields, such as the statistical physics of complex networks, the topology of water systems, and fractal theory. The fractal structure of WDS layouts has been detected through a novel algorithm that exploits the community structure principle and can be regarded as an approximation of the classical heuristic box-covering algorithms. It is particularly appealing for application to large systems. The fractal dimension dB is equal to the exponent of the power-law relationship between the number of clusters necessary to cover the network layout and their average size. The fractality of WDS layouts can be read as a mark of their self-similar topological organization and, consequently, it has been interpreted in terms of their self-similar response to mechanical failures, confirming the crucial role of topology in the behaviour of water systems facing disasters. The global vulnerability of WDSs has been assessed under simultaneous failure events, as may occur after major disasters (e.g. landslides, floods, sabotage, earthquakes). For nine WDSs, several failure scenarios have been generated, unveiling a specific vulnerability for each system, quantified through a novel index Fmodal equal to the normalised number of random failures that most frequently leads to the disconnection of the system. Additionally, a novel vulnerability index called Cut-Vulnerability Vcut has been defined as the fractal dimension normalised by the average node degree, dB/K. This index has been shown to describe network topological self-similarity well in terms of global vulnerability, through its consistency with Fmodal and other topological robustness metrics.
The proposed index is of particular interest for a preliminary assessment of the vulnerability of water systems to disasters when geometric and hydraulic information is only partially available. It should also be stressed that, during abnormal circumstances, the variations of the model boundary conditions for the hydraulic simulation become more uncertain and of secondary importance in comparison with the topological changes. This aspect further supports the assessment of the global vulnerability of WDSs in terms of loss of connectivity. In this regard, simplified methods based on topological knowledge alone, like the one proposed in this paper, can represent effective and economical decision-support tools for a preliminary evaluation of WDS vulnerability: they can drive the search for the most disruptive alterations under abnormal events, help prioritize the use of available resources, and consequently help restore service rapidly. Future work will investigate the possibility of applying the novel fractal algorithm to the weighted graph of the water systems, by considering the hydraulic and geometric characteristics of the pipes, in order to better reproduce their physical behaviour. This could reveal hydraulic self-similarity features in the resulting graph and allow targeted attacks to be simulated by considering the different hydraulic importance of network elements.