A Graph-Theoretic Framework for Assessing the Resilience of Sectorised Water Distribution Networks

Water utilities face a challenge in maintaining a good quality of service under a wide range of operational management and failure conditions. Tools for assessing the resilience of water distribution networks are therefore essential for both operational and maintenance optimization. In this paper, a novel graph-theoretic approach for the assessment of resilience for large scale water distribution networks is presented. This is of great importance for the management of large scale water distribution systems, most models containing up to hundreds of thousands of pipes and nodes. The proposed framework is mainly based on quantifying the redundancy and capacity of all possible routes from demand nodes to their supply sources. This approach works well with large network sizes since it does not rely on precise hydraulic simulations, which require complex calibration processes and computation, while remaining meaningful from a physical and a topological point of view. The proposal is also tailored for the analysis of sectorised networks through a novel multiscale method for analysing connectivity, which is successfully tested in operational utility network models made of more than 100,000 nodes and 110,000 pipes.


Introduction
Resilience can be defined as the ability of a system to maintain and adapt its operational performance in the face of failures and other adverse conditions (Laprie 2005;Strigini 2012). Depending on the application domain, different approaches are applied to assess the resilience of man-made systems. There is no single widely used definition for quantifying the resilience of water distribution networks (WDNs), a common method is to formulate the hydraulic resilience as a measure of the ability of a network to maintain supply under failure conditions. Todini (2000) proposed a resilience index based on the steady state flow analysis of WDNs and dissipated energy; consequently, the resilience of a water network was defined using a measure of the available surplus energy. Recent work extends this approach; for example, Raad et al. (2010) modelled the outcomes of this index proposing to work with alternative approaches to the original in Todini (2000); this was done to overcome computational difficulties in optimising a network design problem using Todini's resilience index as an objective function. Therefore, by using parameters of physical significance in the supply as surrogates and not the real hydraulic details, di Nardo et al. (2013a) included these surrogate indices for optimizing the resilience of sectorised water distribution networks. Baños et al. (2011) proposed an extension to Todini (2000) by considering failures under various water demand scenarios. Other approaches based on a steady state hydraulic analysis include Prasad and Park (2004) who adapted Todini's index by incorporating the effects of both surplus pressure and reliability of supply loops assessed by the variability in the diameter of pipes connected to the same node. They also applied the method to a multi-objective problem for the optimal design and rehabilitation of a water distribution network. Jayaram and Srinivasan (2008) used the same method from Prasad and Park (2004) for the analysis of networks with multiple sources. An alternative approach was proposed by Wright et al. (2015) who assess the resilience of networks with dynamically configurable topologies by estimating what percentage of operational demand can be supplied when a disruptive event occurs.
A common constraint of the above approaches is that the combination of possible failure scenarios grows exponentially as the network becomes bigger (Berardi et al. 2014) together with possible inconsistencies and uncertainties associated with hydraulic simulations (Gupta and Bhave 1994). Trifunović (2012) explores hydraulic properties of the network based on the statistical analysis of common parameters under normal operation and proposed them as indices to assess the resilience of a water network. However, these statistical parameters may not be valid under failure conditions (Giustolisi et al. 2008). Another inherent drawback is the dependence on well calibrated (accurate) hydraulic models for large networks. Beyond the implicit non-linearity, the number of parameters involved in the hydraulic equations  and their large number of possible combinations introduce high complexity to the WDN calibration problem (an NP-Hard problem, which cannot be solved in polynomial time Takahashi et al. (2010).) For large WDNs, these have an expensive computational cost since, even for moderate size networks, the number of possible failure scenarios grows exponentially and are approached using either some linearising assumptions or the use of heuristics for failure simulations (Berardi et al. 2014).
This paper assesses the resilience of WDNs from a topological perspective where properties such as network configuration and redundancy in connectivity are taken into account together with physical-based flow properties. This assessment is based on a computationally efficient implementation of the K-shortest paths algorithm (Eppstein 1998). It measures in a statistically robust way how every WDN demand node would be affected by disruptions in supply. This is done by analysing alternative paths between demand nodes and water sources (i.e. tanks and reservoirs). In addition, the paths are weighted by the hydraulic attributes of the supply routes to make the measures physically meaningful. Graph-theoretic indices have been applied to assess the resilience of water networks and used also as a surrogate measure in design problems (Yazdani et al. 2011). These surrogate indices are measures that correlate to physical/hydraulic based indices of resilience but are not necessarily based on physical indices whose parameters are either unavailable or are difficult to compute. These indices can be broadly classified into "statistical" and "spectral" measures (Gutiérrez-Pérez et al. 2013). Among the statistical indices, the meshedness coefficient (Buhl et al. 2006) provides an estimation of the topological redundancy of a network. Central-point dominance (Freeman 1977), gives a measurement of the network vulnerability to failures corresponding to central locations. Flow Entropy (Raad et al. 2010) measures the strength of supply to a node both in terms of the number of connections and their similarity. The so-called demand-adjusted entropic degree in Yazdani and Jeffrey (2012) is another alternative that uses demand on nodes and volume capacity on edges to compute a weighted entropic degree. More recently, (Liu et al. 2014) have extended the flow entropy measure to also consider the impact of pipe diameter on reliability.
The work also extends the proposed adaptation of K-shortest paths algorithm for the analysis of WDNs divided into sectors or District Metered Areas (DMAs) by utilizing multiscale network decomposition (Albert and Barabási 2002;Lee and Maggioni 2011). Demand nodes of a given DMA are aggregated into a sector-node, where a new graph (DMA-graph) is defined with vertices representing DMA areas and edges that abstract the sector-to-sector connectivity (Izquierdo et al. 2011). A mapping function is used to establish a relationship between the network graph of demand nodes and the DMA-graph, where every DMA-vertex (or sector-node) has new features such as number of customers, total demand, and an average pressure aggregated from information of individual nodes of the DMA. The physical connections between sectors, which consist of pipes and boundary valves, form the edges of this DMA-graph and define the level of its connectivity. Resilience assessments carried out on individual demand nodes in the DMAs are also linked to the corresponding sector-nodes, which are then used for a system-level resilience analysis. Considering both the network graph of demand nodes and a DMA-graph, a multiscale analysis of resilience is proposed.

A Graph Theory Perspective on the Definition of WDN Resilience
The network connectivity of a WDN can be modelled as a nearly-planar mathematical graph 1 , G = (V , E), where V (vertices) corresponds to n nodes and E (edges) corresponds to m pipes of the water system. This graph also has the particular characteristics that every node or edge should normally have at least one path of edges connecting it to a source node (tanks and/or reservoirs).
For a given WDN, our approach starts with computing a measure of resilience for each node by identifying the number of water sources a node is connected to and estimating how well that node is supplied by the available water sources. To accomplish this, all routes connecting a node i to all tanks and reservoirs are computed, and used to quantify the capacity or resistance of the corresponding routes. Since computing all routes which link nodes to sources is prohibitively expensive, our approach becomes feasible by restricting the computations to the K 'shortest' routes (Eppstein 1998) for the flow between a node i and its j -th source s j (i). In calculating this index, every route should be weighted by a surrogate measure of flow resistance. For example, the proposed surrogate flow resistance measure depends on the physical characteristics of connecting pipes as water travels from j -th source, s j (i), to i and it can be estimated by a measure proportional to the energy loss associated with a route of supply. The estimation of energy loss in water transportation is commonly done using the Darcy-Weisbach and Hazen-Williams formulae. Although both formulae depend on hydraulic measures like velocity and roughness, only the proportional dependence on topological constants of the network has been considered as no detailed hydraulic modelling is used. Energy loss across a link is proportional to f × L/D; where f is the friction factor, L is the length of the pipe and D represents its diameter. This resistance measure is used to weight every path in the calculus of shortest paths, providing a surrogate measure for potential energy loss associated with each path connecting a node and its water source. The proposed resilience index for a node is where S is the total number of sources in the network and r(k, s) is a surrogate measure of the energy loss (the resistance to water transport) associated with the k th path to source s. One such measure of energy loss could be approximated as: where M is the number of pipes in path k and f (·) estimates the friction factor by pipe age and material (Christensen 2009). The weighted KSP algorithm returns infinity for Eq. 2 if there is no path of water distribution connecting the given node with a supply source. The contribution of such a source to the resilience index for a node in Eq. 1 will be zero. Consequently, both the resistance of paths to sources and redundancy in supply are explicitly taken into account by the resilience index.
In Eq. 2, the flow resistance over K paths is averaged since using the shortest path only may not be sufficient to represent the connectivity and indirectly resilience. By averaging over the complete set of routes (i.e. K → 2 n , where n is a very big number for operational networks (Eppstein 1998)), the true connectivity of a node to a source is measured. However, the computational complexity of this measure grows exponentially (O(2 n n) for the proposed index) and it quickly becomes infeasible for medium to large networks. Averaging over the K-shortest paths is a computationally feasible approximation that takes into account a sufficiently large number of paths in a robust statistical method.

Validation of the Resilience Assessment Method
A previously published network C-Town (Ostfeld et al. 2012) is used as a validation casestudy for the graph-theoretic method proposed. This is a medium size network that consists of 333 nodes, 429 pipes, 4 valves, 5 pump stations, 7 elevated tanks, and 1 reservoir. A graph-theoretical analysis of resilience in WDNs (Yazdani et al. 2011) identifies a node of special importance as those with high betweenness centrality index, which measures the relative number of shortest paths from all vertices to all others that pass through a node. Nodes with betweenness centrality values larger than 2 times the standard deviation (SD) from the network average computed over all nodes will be denominated as high betweenness nodes. This work considers "critical transfer nodes" to those of topological importance with respect to the flow distribution with a WDN. Thus, critical transfer nodes are, a priori, those belonging to the main trunk and those with high betweenness.
In the case of the proposed K-shortest path index, for a demand node i and source s, let represent the inverse of the resistance to supply in Eq. 1 averaged over K paths. Figure 1a shows the difference g(K + 1) − g(K) for two nodes of varying centrality, A and B, where the latter belongs to the set of central nodes. In the two cases, the average of Eq. 3 taking K shortest paths converges when the number of paths is increased. That is, the difference on computing g(K + 1) or g(K) changes significantly as one moves away from using only the shortest path (i.e. K = 1) but not as much when one goes from K to K + 1 paths for larger values of K (over 30 paths in the case of Fig. 1a). Then, this measure is proposed as a decision criterion on the number of paths to be used in Eq. 1 to compute I GT (·).
Using K paths has two main advantages. Firstly, it emphasises the need to consider multiple paths of transport that contribute to the resilient operation of a network and it guarantees a robust statistical estimation since multiple paths are used to capture the average connectivity and capacity instead of the unique shortest path connecting two nodes. Secondly, the sum in Eq. 1 converges for relatively low values of K to obtain a good approximation of the analysis over all possible routes.
Once the tuning process provides an estimate of the K value, the proposal proceeds to compute the inverse of the resistance to supply, g(K), for the chosen K. In Fig. 1b g(30) is plotted for all individual nodes. The same plot also represents the values of g(30) for central nodes (main-trunk nodes and those with a betweenness centrality of at least 2 times the SD of the network average; they are plotted with red triangles). Most of the critical nodes have high g(K) compared to the average node g(K) value. In general, nodes of higher betweenness centrality are those that will appear more frequently in any shortest path, including paths to sources, and so will tend to have higher resilience as defined by Eq. 1.

Comparison of K-Shortest Paths Index (I GT ) with Alternative Approaches
The next stage of the analysis is a comparison of the various indices of resilience under failure conditions in a supply network. Three scenarios generated from C-Town are considered, (Table 1, Fig. 1). The first scenario represents normal operational conditions without failures. The second scenario is created by the failure (removal) of 2.5 % of randomly selected pipes. The third scenario assumes the failure of 5 % of pipes. The resilience indices are expected to degrade as more pipes are taken off. The aim of this example is to evaluate the consistency of the new index, I GT , by validating that it positively correlates with results obtained from hydraulic simulations. The comparison is done with Todini's hydraulic resilience index, I r (Todini 2000). Other graph-theoretic indices, such are centraldominance betweenness (I CB ), meshedness (I m ), and a demand-adjusted entropic index (I S ) (Yazdani and Jeffrey 2012), are also included in this analysis. Table 1 Figure 2a shows that the degradation in resilience is captured by the considered indices to a different degree. The simulation of failure indices in Fig. 2a is repeated 20 times (by randomly sampling the corresponding disrupted pipes at each repeated simulation; in this case, 20 samples is enough to arrange statistically meaningful comparisons between groups/indices). At each iteration, different sets of pipes (2.5 As shown in Fig. 2b, graph-theoretic indices show consistency with the hydraulic index 2 . The index I GT changes in a proportional way to variations in the condition of the water distribution network. The entropic degree, I S , is the measure which provides the poorest estimation of the deterioration in resilience as more pipes are removed, similarly to the results presented by (Greco et al. 2012). Figure 2b also shows that the meshedness index, I m , has a narrow distribution. This is because meshedness is a measure that depends only on the relative number of nodes and pipes, without taking into account their physical or hydraulic properties.

Assessing Resilience for Large Scale Water Networks
The network topology of a WDN includes both tree-like (usually hierarchical and lower redundant networks, i.e. having an equivalent proportion of nodes and links connecting them) and well meshed structures, where there usually exists several paths connecting two nodes and so a greater number of links compared to the number of nodes. These two type of structures can be interconnected by links forming graph-bridges, which transport water from the transmission mains to these distribution areas. One way to analyse large-scale WDNs is by partitioning the network into different components (assets: valves, pumps, pipes, and meters, among others). Alternatively, a WDN can be divided into sub-networks in order to make the analysis simpler than working with the whole system model. This is of interest especially in the case of large-scale WDNs, which can contain hundreds of thousands of nodes and pipes, where consequently the classical methods of analysis are not efficient because of the large dimensions.
Many WDNs have been divided into sectors (DMAs) mainly for leakage management (Tabesh et al. 2009;Gomes et al. 2013;Xin et al. 2014).  Mandel et al. (2015) is the first to propose a sectorisation approach for large-scale water networks. For these previously sectorised networks, the resilience index in Eq. 1 measures the level of connectivity of each demand node through various supply roots. A DMA division of the network automatically allows the use of two different scales of this resilience measure, having approaches of resilience for every individual node or measures summarised by sector. This provides new ways for analysing and managing the system and allows network configurations that have a major impact on resilience to be easily identified; for example multi-feed, single-feed, cascading DMAs.

Multiscale Resilience Analysis
The proposed graph-theoretic index is extended to water networks that are divided into sectors thus enabling resilience analysis. A DMA-based graph is a graph of lower level of detail (or coarser granularity) that captures the topology of existing DMAs. The DMAscale resilience is derived from a user-defined function that transforms the resilience of the demand nodes in a DMA to a scalar representing the DMA resilience. For example, by transforming the index of Eq. 1 to the coarser sector level graph to analyse the relative connectivity of DMAs. Here, it is used the trimmed mean (García-Escudero et al. 2003) to transform the resilience of n * i demand nodes of DMA i to the sector graph as: In calculating the mean, the trimmed approach discards points of very high or very low value before computing the mean for the remaining nodes. Then, the aggregated measure is the summary of the single values for n * i nodes of the DMA, with n * i ≤ n i . As such, a trimmed mean is a more robust centrality measure as the average value is not skewed by extreme low/high value data. For instance, if a DMA with predominantly high resilient nodes has a sub-area that has critically low resilience, the arithmetic average would not identify the presence of these in the DMA resilience index, whereas the trimmed method does. Furthermore, the presence of individual critical customers such as hospitals could significantly affect the system level resilience assessment. The set of excluded nodes in the computation of the trimmed mean, H, are identified to validate the existence of critical nodes among them, and the subsequent impact on the DMA resilience assessment. As a result, Eq. 5 formulates a constitutive rule which is used to rank DMAs as low resilient if they have low values of I GT for critical users (hospitals, schools, or industrial users) regardless of the fact that a DMA as a whole or a majority of its demand nodes might have high resilience.
where H i is the set of critical users in DMA i .
For a DMA with a large variation in the I GT value, it is possible to trim a significant number of nodes (i.e. a number of demand nodes representing a significant number of customers) in computing Eq. 4. To avoid this, the variability of the resilience is used, which is defined here as standard deviation normalized by the mean resilience value: Based on the variability of I GT (·) in each DMA i (Eq. 6), a second constitutive rule is formulated and applied: where t is a user-defined threshold.
When high variability exists, this constitutive rule computes the DMA resilience index over the whole set of demand nodes in the DMA and identifies the existence of low resilient nodes. The two Eqs. 5 and 7 assist operators to identify areas for a further investigation where a threshold of demand nodes or critical customers might not have the required level of resilience.

Multiscale Representation of a WDN
An essential part of the sector-level resilience analysis is the pre-processing of network data of varying types and complexity. For example, geographic information system (GIS) data, telemetry measurements, and hydraulic information is aggregated into an unified system that provides access to heterogeneous data sets. A novel augmented graph is proposed which transforms DMAs into sector-nodes. Each sector-node has aggregated data about the assets it contains, abstracting the high-granularity graph of hydraulic nodes to a lowgranularity graph. The pipes and valves that connect different DMAs represent the edges between sector-nodes and inherit their corresponding properties. A schematic view of this multiscale decomposition process is represented in Fig. 3.
By considering all links hydraulically connecting DMAs, the edges of the new graph are formed by taking into account the open/closed status of valves, and consequently links which can provide additional operational resilience are identified. The weights of the new graph edges are proportional to the aggregated volumetric properties of individual pipes and valves that connect sector-nodes. Sector-nodes can also be used to visualize average levels of demand, pressure, elevation, etc. and the sector-node size can inherit topological characteristics of DMAs (eg. number of nodes or customers).
A multiscale network decomposition to assess the resilience of water systems is approached by studying their performance and network structure at various granularity levels and by analysing topological and physical information inherited from single demand nodes to sectors. Other important details that are obtained in the analysis include the number and location of multi-feed DMAs, the number of valves within each DMA, and the pressure differential between neighbouring DMAs. In addition, the sector-node graph is used to identify which part of the network topology should require further investigation to improve resilience.

Experimental Study
Two operational WDNs were used to analyse the proposed methodology for assessing the resilience of sectorised water distribution networks. Firstly, a medium size WDN divided into 3 DMAs is investigated. A resilience assessment is carried out for each DMA. The second case-study corresponds to a large water supply zone with high population density in England, UK. In both cases, certain sensitive information about the networks is withheld in order to maintain confidentiality.

Case-Study A
WDN-A (Fig. 4a) has 4820 nodes, 1 tank, and 5234 pipes and valves sectorised into 3 DMAs. Detailed information for each DMA is summarised in Table 2. Figure 4a also shows the DMAs layout and the location of critical transfer nodes (i.e. nodes with high values of central-betweenness).
In order to compute the proposed graph-theoretic index, I GT of Eq. 1, a sufficient number of shortest paths K need to be used. The analysis outlined in Fig. 1 was carried out on different nodes and it showed that K = 30 as sufficient for WDN-A. In Fig. 4b, we plot the values of g(30) for each node of the WDN and for demand nodes with high betweenness centrality (labelled as critical nodes and also marked in red in Fig. 4a). For this network, analysis in Fig. 4b shows that the measure g(K) of the most critical nodes exceeds the average for the network.
From Table 2, it can be concluded that DMA-3 is the sector with lowest resilience given the index I GT (DMA 3 ), while I GT (DMA 2 ) index is higher than the other two. DMA-2 is

Case-Study B
WDN-B (shown in Fig. 5) is comprised of 106,115 nodes, 4 constant head sources, 28 reservoirs, 111,096 pipes, 102 pumps, and 1,900 boundary valves (Fig. 5a). GIS models, telemetry data, and physical and hydraulic information are combined in an integral database framework. Figure 5 shows the process of multiscale decomposition from the network of demand nodes to a coarser representation by sector-nodes. Through the graph transformation process, every DMA (sector-node) inherits the information related to the nodes within the area; such as, demand, elevation, pressure, and geographic coordinates. Sector-nodes also gather the information of the pipes and valve: diameter and length, and their own characteristics and the open/closed status for valves. The proposed multiscale decomposition of the network topology by sector-nodes reduces the dimensionality of complex networks. Figure 5b shows the spatial distribution of these sector-nodes together with their connectivity in terms pipes and valves. The volumetric capacity of aggregated links between DMAs are represented by the thickness of the connecting links. The colour code of the sector-nodes identifies whether a DMA is multi-feed (in colour red) or single feed (grey colour). The analysis shows that almost 40 % of the DMAs (i.e. 90 of the 228 DMAs) in WDN-B are fed by more than one supply connection, Fig. 5b. All this information together with the topology of the assets in the DMAs facilitate the estimation of the resilience indices at a DMA level, which is a new property of individual sector-nodes. Any measure based on individual assets within each DMA can be represented in the multiscale DMA-graph. Figure 6a shows the average resilience for each DMA, I GT (DMA i ), using a traffic light system. Red represents low resilience with an I GT value 2 SDs below the network average for all DMAs; in this example, the network average is 0.0395 and 2 SDs below that is 0.02. Similarly, high resilience sectors (with values 2 SDs above average DMA resilience) are represented using green colour, and DMAs in  between the two extremes with orange/amber colour. The augmented DMA-graph can also provide further information which assists operators in understating and analysing topological and operational constraints that affect resilience. Figure 6b highlights the critical transfer nodes of every DMA in distribution using the size of the circle plot associated with sectornodes. In this figure, criticality is determined based on the number of high betweenness nodes.
Resilience and critical transfer measures are simultaneously represented in Fig. 6(c), which combines a traffic light system and the sector-node size. As a result, the centre and centre-North of WDN-B include DMAs that have medium and low resilience measures and medium and high levels of flow transferability for central nodes (see Table 3). All these areas also supply critical customers and could be prioritised for further analysis and maintenance to improve network resilience. The boundaries of the WDN layout have low resilience but also low level of criticality. Table 3 presents a summary of attributes for the top 5 sectors in terms of their resilience and flow transfer critical nodes.
As shown in Table 3, where the DMAs are ranked by resilience, 3 of the 5 DMAs contain more than 2,000 nodes, while only one of them has a size less than 1,000 nodes. Taking into account the number of critical transfer nodes within each sector (column #CNs of Table 3), it is shown that DMA-29, DMA-101, and DMA-208 are sectors with larger number of transfer The resilience index, I GT , is computed for each DMA. The number of critical transfer nodes (i.e. the number of nodes with betweenness centrality values larger than 2 times the SD from the average computed over all the DMA nodes) is given by #CNs nodes. These DMAs also have a varying level of resilience; for example DMA-29 has the smallest resilience index and DMA-208 has the highest.

Conclusions
A novel resilience index based on weighted K-level shortest paths from consumption nodes to water sources has been derived to estimate WDN resilience together with information about critical customers. Computing these K different routes for every node provides numerical robustness to our analysis compared to analysing only the shortest path. The index is shown to be consistent with hydraulically-defined resilience measures. This is because the K-shortest paths are weighted by quantities associated with energy losses of water transport routes. The current proposal has the following advantages: • An inclusion of physical attributes of the water distribution network within graph theoretic indices gives the resilience analysis a hydraulically relevant and easy to quantify measure of how well connected a node is to the available water sources. • Computations for the proposed index scale quasi-linearly as a function of the number of nodes (∼ O(m + n log(n) + kn)). This makes the measure feasible for application to large scale water networks, where the classical critical link hydraulic based analysis of network resilience would scale combinatorially. • Constitutive rules and functions facilitate the inclusion of empirical knowledge.
The proposed graph-theoretic framework is complimented by a novel multiscale decomposition method that converts the original WDN layout to DMA-based graphs. The developed method allows us to work with a WDN at two different levels of abstraction: single demand node and DMA. This enhances the results obtained for assessing the resilience of a WDN, especially in the case of large scale networks.