1 Introduction

The water industry faces a growing number of challenges, such as rapid changes in population, fast-growing energy costs and extensive droughts worldwide (Dahal et al. 2016). Based on the analysis of the World Bank in the OECD states, almost 95% of households have a direct connection to a Water Distribution Network (WDN) (OECD 2015). Nevertheless, the networks keep deteriorating despite all of the utilities’ efforts. The process can be tracked through the amount of non-revenue water (NRW), which can exceed 30% (OECD 2022). This phenomenon not only decreases consumer comfort and network efficiency but also significantly increases the greenhouse gas emissions of these systems (Valdez et al. 2016), and it can even pose a significant risk to continuous service if the loss is too high (Ghandehari et al. 2020).

The reduction of NRW in WDNs is still an open and challenging question (Fontana et al. 2018). The last thirty years brought numerous techniques to reduce the amount of NRW. The methods can be divided into two approaches (van Zyl and Cassa 2014). The first and permanent solution is the identification and repair of local leakages (Sophocleous et al. 2019; Perez et al. 2011). If the network contains large geodetic differences and the utilities have a strict budget for reconstruction, pressure management can be a viable solution (Latifi et al. 2021). The second approach focuses on network-wide pressure management to reduce the amount of NRW indirectly (Fontana et al. 2018; Covelli et al. 2016; Gupta et al. 2018). Although the repair of the larger leakages is inevitable, there is an economic threshold below which smaller leakages might not be efficient to fix individually (Lim et al. 2015), as was also shown in a real-life scenario (Sechi and Zucca 2017). Pressure management can also serve as an additional tool for leakage detection campaigns. Since numerous Hungarian networks could benefit from it, this paper follows the second approach.

The leakage reduction of WDNs starts with the optimisation of district metered areas (DMAs) (Creaco and Haidar 2019). The selection of DMAs is similar to the segmentation of WDNs by isolation valves (Giustolisi and Ridolfi 2014). While the segmentation techniques use the already installed isolation valves, active elements (flow meters, pressure valves) are necessary to create DMAs. On the one hand, the literature presents modularity-based approaches which can make this process fast and reliable (Creaco et al. 2019, 2018). On the other hand, global optimisation techniques can provide a feasible solution for the setting and even the placement optimisation of pressure reducing valves (PRVs) (Creaco and Haidar 2019; Wright et al. 2015; Jafari-Asl et al. 2020). As shown by Covelli et al. (2016), despite the advantages of global optimisation, e.g. the simpler, one-step application of the solver and the precise history of convergence, these approaches have a considerable bottleneck: the rapid expansion of the search space. The growth of the network size, i.e., the number of served inhabitants or the served area, significantly increases the complexity of the fitness function, which might lead to convergence errors or a purely random search (Huzsvar et al. 2020). Since the size and resolution of the networks are continuously increasing, the paper applies a clustering-based approach to detect individual communities which can serve as pressure zones.

The paper introduces a novel two-step method to strategically place Pressure Reducing Valves (PRVs) in water distribution networks to reduce leakage. In the first step, the Leiden algorithm, a technique based on identifying distinct communities, is employed to locate potential pressure zones. In the subsequent step, a differential evolution algorithm is used to determine the best settings for PRVs, effectively reducing leakage while maintaining uninterrupted service.

An important aspect of this paper is its approach involving an in-depth analysis of multiple real-world water distribution networks. This analysis is facilitated by a combination of techniques. Another notable aspect is the comprehensive evaluation of outcomes using a probabilistic framework. While various factors (like the type of damage, pipe material, soil acidity, and nearby traffic density) affect the extent and likelihood of leakages (Yu et al. 2019; Latifi et al. 2022; Schwaller and van Zyl 2015; Mora-Rodríguez et al. 2014), current research often overlooks the uncertainties associated with leakage modelling. The paper includes a sensitivity analysis illustrating how parameters within the leakage model can impact the achieved reduction in leakages within the optimized network.

The paper is structured as follows. The next section introduces the methods of the topology optimisation of WDNs. The third section presents the application of the optimisation technique through an in-depth analysis of three case studies and the general results of four additional real-life WDNs. After the case studies, a detailed sensitivity analysis is presented for the uncertainty of the leakage model. Finally, the last section summarises the results of the investigations and the outlook for further research directions.

Fig. 1 The process scheme of topology optimization

2 Methods

2.1 The Optimisation Workflow

This section presents a combined optimisation technique which comprises leakage model calibration with the Nelder-Mead technique (Gao and Han 2012), clustering with the Leiden algorithm (Traag et al. 2019) and global optimisation with differential evolution (Storn and Price 1997). The steps are the following.

  1. The leakage calibration of the WDN is performed with the Nelder-Mead method, see Fig. 1 inset a).

  2. The network is transformed to a pressure weighted graph, see inset b).

  3. The Leiden algorithm (Traag et al. 2019) identifies the homogeneous pressure zones in the network, where the number of these zones depends on the manually selected \(\gamma\) parameter, see inset c).

  4. PRVs are placed for all of the edges which are connecting different pressure zones, see inset d) and e).

  5. At last, a differential evolution algorithm (Storn and Price 1997) identifies the optimal setting for every PRV, see inset f).
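Step 4 of the list above can be illustrated with a minimal sketch. It assumes that the zone membership of every node is already available as a dictionary; the function name `boundary_edges`, the node labels and the toy network are hypothetical and are not taken from the paper's implementation.

```python
def boundary_edges(edges, zone_of):
    """Return the edges whose end nodes lie in different pressure zones."""
    return [(u, v) for u, v in edges if zone_of[u] != zone_of[v]]


# Hypothetical five-node network split into two zones.
edges = [("n1", "n2"), ("n2", "n3"), ("n3", "n4"), ("n4", "n5"), ("n2", "n5")]
zone_of = {"n1": 0, "n2": 0, "n3": 1, "n4": 1, "n5": 1}

# Every boundary edge is a candidate location for a PRV.
prv_candidates = boundary_edges(edges, zone_of)
print(prv_candidates)  # [('n2', 'n3'), ('n2', 'n5')]
```

In this toy case two pipes connect the zones, so two valves would be passed on to the setting optimisation of step 5.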

2.2 Choice of the Hydraulic Solver

Nowadays, numerous sophisticated, out-of-the-box hydraulic modelling packages are available, e.g., WaterGems (Wu et al. 2006) or Mike (Ekklesia et al. 2015), which are based on the traditional EPANET (Rossman 2000) solver. These programs have numerous advanced features which can increase the modelling efficiency on a larger scale, e.g., pressure-dependent demand modelling (Abbas et al. 2015) or water age calculation (Farmani et al. 2007). Although the EPANET solver is efficient in terms of computational time, extending it with new features, e.g., different pressure-dependent leakage models, can be cumbersome. In this study, we use our in-house, modular, easy-to-extend software STACI, which has already been validated (see Weber and Hos 2020; Weber et al. 2020; Huzsvár et al. 2021 for details). From the computational point of view, STACI is as efficient as the EPANET solver, and due to its modularity and general framework (there is no structural restriction on the modelling equations), implementing new methods can require less effort.

Table 1 The parameters of the analysed networks, the case studies are marked with bold

Seven real-life WDNs have been analysed from the region of Western Hungary. The main parameters of these real-life networks can be seen in Table 1. In all cases, snapshot simulations of an average hydraulic state were used as the basis of the optimisation, and the consumption values were calculated as a one year average.

2.3 Pressure Dependent Leakage Modelling

The leakages in a WDN are often modelled as orifice flows (Araujo et al. 2006; Dragan Savic and Kapelan 2008)

$$\begin{aligned} q_{i} = K_{f} \cdot p^{\beta }_{i}. \end{aligned}$$
(1)

Equation 1 means that the leakage flow rate at the i-th node \((q_i)\) is equal to a flow constant \((K_f)\) multiplied by the pressure \((p_i)\) raised to the power of the leakage exponent \((\beta )\). Two parameters appear in this equation, which describe the type of pipe damage and the behaviour of the leakage. The first one is the leakage exponent \((\beta )\), which can vary between 0.5 and 2.5 based on the material or the geometry of the damage (Gupta et al. 2016; Greyvenstein and van Zyl 2007). However, if the pipe material is highly elastic, extensions should be made to model the leakages properly (Cassa and van Zyl 2014; van Zyl and Cassa 2014; van Zyl 2014). The most problematic part of such a representation is that the proper value of the exponent can be evaluated only when the exact geometry of the damage, the material of the pipe and the exact pressure and soil conditions are known. Since the goal of the paper is to present a pressure management technique that does not require detailed knowledge of the leakages, the authors selected 1.18 as a general approximation of this value (Araujo et al. 2006). However, Section 4 of the paper presents a sensitivity analysis to assess the uncertainty of the exponents.

The second parameter, the flow constant \(K_f\), describes the amount of leakage at a node and depends on the type and size of the damage (De Marchis and Milici 2019). The modelling parameters (\(K_f\) and \(\beta\)) can be node-specific, zone-specific or network-specific based on the available amount of information. This paper uses the network-specific approach since only the water balance for the whole network is known. A Nelder-Mead optimisation technique (Gao and Han 2012) is applied to calculate the value of \(K_f\) for each network. During this optimisation, the fitness function was formulated as

$$\begin{aligned} f(K_f) = \frac{ |L_{rlf} - L_{mdl} |}{ L_{rlf} }. \end{aligned}$$
(2)

The optimised variable is the network-specific leakage constant \(K_f\), while \(L_{rlf}\) is the real-life and \(L_{mdl}\) is the modelled leakage water loss of a network. The value of \(L_{rlf}\) was determined by the utility company through the acoustic analysis of 353 kilometres of pipeline. That is, \(L_{rlf}\) includes only the leakages caused by pipe damage, while other types of NRW are not considered. The optimisation stops when the relative difference between \(L_{rlf}\) and \(L_{mdl}\) decreases below 1%. The resulting leakage constants can be seen in Table 1.
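The calibration of Eqs. 1 and 2 can be sketched as follows. This is a simplified illustration, not the STACI-based procedure of the paper: it assumes that the nodal pressures stay fixed while \(K_f\) changes, whereas a real calibration would rerun the hydraulic solver in every evaluation. The pressures, the measured leakage `l_rlf` and all function names are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

BETA = 1.18  # network-wide leakage exponent used in the paper


def modelled_leakage(k_f, pressures, beta=BETA):
    """Total leakage from Eq. 1 summed over all nodes (pressures held fixed)."""
    return float(np.sum(k_f * np.asarray(pressures) ** beta))


def calibration_error(k_f, pressures, l_rlf):
    """Relative error of Eq. 2 between measured and modelled leakage."""
    return abs(l_rlf - modelled_leakage(k_f, pressures)) / l_rlf


def calibrate_k_f(pressures, l_rlf, k_f_start=0.01):
    """Fit the network-specific K_f with the Nelder-Mead simplex method."""
    result = minimize(
        lambda x: calibration_error(x[0], pressures, l_rlf),
        x0=[k_f_start],
        method="Nelder-Mead",
        options={"xatol": 1e-8, "fatol": 1e-6},
    )
    return float(result.x[0])


# Hypothetical nodal pressures [m] and measured total leakage [m^3/h].
pressures = [30.0, 45.0, 57.0, 40.0]
l_rlf = 2.5

k_f = calibrate_k_f(pressures, l_rlf)
error = calibration_error(k_f, pressures, l_rlf)
print(f"K_f = {k_f:.5f}, relative error = {error:.2e}")
```

Under the fixed-pressure assumption the problem has the closed-form solution \(K_f = L_{rlf} / \sum_i p_i^{\beta}\), which makes the sketch easy to check; the iterative search only becomes necessary once the pressures depend on the leakage.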

2.4 Network Representation

Since the leakages in the network depend on the local pressure, the optimisation of pressure management can decrease the amount of water loss. To reach this goal, a pressure-based representation is created to describe the behaviour of a WDN in graph form. A WDN is represented as an undirected graph, with nodal weights of

$$\begin{aligned} w_n^i = \frac{p_i}{p_{max}}, \end{aligned}$$
(3)

where \(w_n^i\) is the weight of the i-th vertex, \(p_i\) is the pressure at the i-th node and \(p_{max}\) is the maximal pressure in the network. The weight of the edges is defined as

$$\begin{aligned} w_e^i=\frac{ |p_{beginning}-p_{end} |}{\Delta _{max}} \end{aligned}$$
(4)

where \(w_e^i\) stands for the weight of the i-th edge, and \(p_{beginning}\) and \(p_{end}\) are the pressures at the start and end nodes of this edge. Finally, \(\Delta _{max}\) is the maximal edge-specific pressure difference in the network.
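A minimal sketch of the weighting in Eqs. 3 and 4 is given below. The dictionary-based data layout, the function names and the three-node example are assumptions made for illustration, and the handling of a network with uniform pressure is an assumed convention that the text does not cover.

```python
def node_weights(pressure):
    """Eq. 3: nodal pressure normalised by the network maximum."""
    p_max = max(pressure.values())
    return {node: p / p_max for node, p in pressure.items()}


def edge_weights(edges, pressure):
    """Eq. 4: absolute end-to-end pressure difference over the largest one."""
    diffs = {(u, v): abs(pressure[u] - pressure[v]) for u, v in edges}
    delta_max = max(diffs.values())
    if delta_max == 0.0:
        # Assumed convention for a uniform-pressure network.
        return {edge: 0.0 for edge in diffs}
    return {edge: d / delta_max for edge, d in diffs.items()}


# Hypothetical nodal pressures in m.w.c.
pressure = {"A": 60.0, "B": 30.0, "C": 45.0}
edges = [("A", "B"), ("B", "C")]

print(node_weights(pressure))         # {'A': 1.0, 'B': 0.5, 'C': 0.75}
print(edge_weights(edges, pressure))  # {('A', 'B'): 1.0, ('B', 'C'): 0.5}
```

Both weights fall between 0 and 1, so networks of different pressure levels become comparable before the clustering step.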

2.5 Community Detection

During the last two decades, the appearance of social media and the rapid expansion of network theory led to new, efficient cluster detection algorithms (Barabási and Albert 1999), such as the Leiden (Traag et al. 2019) or Louvain (Blondel et al. 2008) algorithms. These techniques can identify communities with similar properties in a network, like users of social media with similar tastes in music, food or political preferences. They can also be used for detecting pressure zones automatically (Creaco et al. 2019). To reach this goal, the algorithms maximise the modularity (Newman and Girvan 2004) of a graph, which is formulated by Clauset et al. (2004) as

$$\begin{aligned} Q = \frac{1}{2m} \sum _{vw} \left( A_{vw}-\frac{k_v k_w}{2 m}\right) \delta \left( c_v,c_w\right) , \end{aligned}$$
(5)

where Q is the modularity and m is the number of edges in the network. \(A_{vw}\) is the corresponding element of the adjacency matrix, while \(k_v\) and \(k_w\) are the degrees of nodes v and w. Furthermore, \(c_v\) is the community to which vertex v belongs, and \(c_w\) is the community to which vertex w belongs. Finally, \(\delta\) is a function which equals 1 if \(c_v = c_w\) and 0 in every other case. This paper applies a new form of modularity (Traag et al. 2019)

$$\begin{aligned} Q = \frac{1}{2m} \sum _{c} \left( e_{c}-\gamma \frac{K_c^2}{2 m}\right) , \end{aligned}$$
(6)

where \(K_c\) is the sum of the degrees of the nodes in community c, and \(e_c\) is the actual number of edges in this community. This representation contains a new parameter, the resolution parameter \(\gamma > 0\). This variable changes the number of communities found during the analysis: if its value is close to zero, fewer segments are created; as it increases, more segments are created, as can be seen in Fig. S1 in the supplementary material. As Fig. S1 depicts, the number of edge cuts increases monotonically with the resolution, as does the number of identified clusters. This parameter can be exploited to control the cluster identification process.
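The effect of the resolution parameter can be checked on a small example by evaluating Eq. 6 directly for two competing partitions. The sketch below is only an evaluation of the quality function as written above, not the Leiden algorithm itself, and it assumes an unweighted graph for brevity although the paper uses pressure-based weights. The function name and the toy graph (two triangles joined by one bridge) are hypothetical.

```python
def leiden_quality(edges, community_of, gamma=1.0):
    """Evaluate Eq. 6 for an unweighted, undirected graph and a given partition."""
    m = len(edges)
    internal_edges = {}  # e_c: number of edges inside community c
    degree_sum = {}      # K_c: sum of node degrees in community c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal_edges[cu] = internal_edges.get(cu, 0) + 1
    total = 0.0
    for c, k_c in degree_sum.items():
        total += internal_edges.get(c, 0) - gamma * k_c ** 2 / (2 * m)
    return total / (2 * m)


# Two triangles (0-1-2 and 3-4-5) connected by the bridge 2-3.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
one_zone = {n: 0 for n in range(6)}
two_zones = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}

for gamma in (0.1, 1.0):
    q1 = leiden_quality(edges, one_zone, gamma)
    q2 = leiden_quality(edges, two_zones, gamma)
    print(f"gamma = {gamma}: one zone {q1:.4f}, two zones {q2:.4f}")
```

For this graph the single community scores higher at \(\gamma = 0.1\), while the split into two triangles scores higher at \(\gamma = 1\), which matches the statement that a larger resolution produces more segments.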

2.6 Valve Setting and Type Optimisation

According to Mala-Jetmarova et al. (2018), the topology optimisation of WDNs has been a well-established research field since the works of Shamir (1974) and Alperovits and Shamir (1977). Although the genetic algorithm can optimise the topology of WDNs due to its usability and versatility (Halhal et al. 1997; Savic and Walters 1997), the continuously growing networks limit its efficiency. The number of possible solutions grows exponentially with the size of the network, leading to a combinatorial explosion (Mala-Jetmarova et al. 2015; Ghassemi et al. 2017; Creaco et al. 2018). Preliminary research also showed that a fitness function describing the setting, placement and cost of the PRVs together is infeasible for a genetic algorithm. Besides, the interference between the PRVs in the system also made the fitness function overly complex for the standard versions of the genetic algorithm.

To overcome this issue, the question of placement and setting is solved in two steps: the Leiden algorithm detects the individual communities, which serve as pressure zones, then a differential evolution method (Storn and Price 1997) finds the ideal setting for the valves. The idea behind the application of differential evolution is the superior convergence property of the technique for widely extended fitness functions due to its unique mutation process (Lampinen 2002). The technique was implemented in Python 3, and the algorithm was used through the SciPy toolkit. The parameters of the applied technique can be found in the supplementary material as Table S1.

The fitness function is formulated as

$$\begin{aligned} f(\vec {x}) = a\cdot L^{norm}_{sum} + b\cdot \beta \end{aligned}$$
(7)

where \(L^{norm}_{sum}\) is formulated as

$$\begin{aligned} L^{norm}_{sum} = \frac{L^{new}_{sum}}{L^{old}_{sum}} \end{aligned}$$
(8)

where \(L^{new}_{sum}\) is the total leakage with the new PRV placement, while \(L^{old}_{sum}\) is the original total leakage. In Eq. 7, \(\beta\) is the relative service loss of the network, not to be confused with the leakage exponent of Eq. 1. The relative loss is calculated as the ratio of the unserved demand to the total demand of the network as

$$\begin{aligned} \beta = \frac{\sum d_i - \sum c_i}{\sum d_i}, \end{aligned}$$
(9)

where \(d_i\) stands for the demand at the i-th node, while \(c_i\) is the actual consumption at the same node. To calculate \(c_i\) at all of the nodes, a pressure-dependent solver is applied (Abdy Sayyed et al. 2015; Weber et al. 2020). The parameters of the solver are set as \(P_{des} = 25\) [m] and \(P_{min} = 10\) [m], and the exponent is \(m = 2.5\). In the compound fitness function of Eq. 7, a and b are dimensionless weighting parameters. These parameters are crucial for eliminating solutions where the leakage decreases considerably but the service loss in the network is unacceptable. During the optimisation, a was kept at 1, while b was set to 100, i.e. even the smallest loss of service means an infeasible solution.
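The interplay of the two terms in Eq. 7 can be illustrated with a toy problem solved by SciPy's `differential_evolution`. Everything below the constants is an assumption for illustration: the single-valve zone, the function `pressures` that stands in for the hydraulic solver, the Wagner-type relation used for the pressure-dependent consumption, and the way the exponent \(m\) enters it. The network-wide \(K_f\) cancels in Eq. 8 and is therefore omitted.

```python
import numpy as np
from scipy.optimize import differential_evolution

BETA_LEAK = 1.18                        # leakage exponent of Eq. 1
P_DES, P_MIN, M_EXP = 25.0, 10.0, 2.5   # solver parameters from the text
A, B = 1.0, 100.0                       # weights of Eq. 7

# Hypothetical toy zone behind one PRV: original pressures [m], fixed
# pressure drops between valve and node [m], and nodal demands [m^3/h].
P_ORIG = np.array([60.0, 55.0, 50.0])
DROP = np.array([0.0, 5.0, 10.0])
DEMAND = np.array([1.0, 1.0, 1.0])


def pressures(setting):
    """Toy stand-in for the hydraulic solver: the PRV caps downstream pressure."""
    return np.clip(np.minimum(P_ORIG, setting - DROP), 0.0, None)


def consumption(p):
    """Assumed Wagner-type pressure-dependent demand relation."""
    ratio = np.clip((p - P_MIN) / (P_DES - P_MIN), 0.0, 1.0)
    return DEMAND * ratio ** (1.0 / M_EXP)


def fitness(x):
    """Eq. 7: normalised leakage plus heavily weighted relative service loss."""
    p = pressures(x[0])
    l_norm = np.sum(p ** BETA_LEAK) / np.sum(P_ORIG ** BETA_LEAK)        # Eq. 8
    service_loss = (DEMAND.sum() - consumption(p).sum()) / DEMAND.sum()  # Eq. 9
    return A * l_norm + B * service_loss


result = differential_evolution(fitness, bounds=[(10.0, 70.0)], seed=1, tol=1e-6)
print(f"PRV setting = {result.x[0]:.2f} m, fitness = {result.fun:.4f}")
```

In this toy zone the most remote node needs a setting of 35 m to keep 25 m of pressure, so the search is expected to settle close to that value: lowering the setting further reduces leakage only slightly, while the weight b = 100 makes the resulting service loss dominate the fitness.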

The implementation workflow of the optimisation is presented in the supplementary material as Figs. S2 and S3.

3 Results

To present the technique with sufficient thoroughness, three different case studies are presented based on the following aspects:

  • Network 7 is selected as a small but easy-to-validate example due to the unique geodesy of the WDN.

  • Network 3 is a rural area with a larger extent and multiple sources.

  • Network 2 is selected as a suburban area with a large grid-like network.

The pressure map of these networks can be seen in Fig. S4 in the supplementary material. Since the resolution parameter (\(\gamma\)) precisely defines the number of the identified pressure zones, i.e. the number of the applied pressure-reducing valves, three different limits are applied. These different cases, named Limit 1, 2 and 3, represent the technique’s investment planning possibilities. Since the maximal number of identifiable pressure zones in a network is equal to the number of nodes, i.e. every node can be an individual pressure zone ideally, there is a Pareto optimum between the cost and the achieved leakage reduction. For the case studies, these cost limitations are selected according to the industrial best practice of the West Hungarian utilities. Naturally, these limitations are entirely arbitrary and modifiable to local requirements.

3.1 Case Study 1

Network 7 consists of 5096 m of pipeline and serves approximately 800 inhabitants with fresh water. According to the utilities, this village is located close to one of the industrial centres of the area. According to Fig. 2, the network has large pressure differences originating from the area’s geodetic properties.

Fig. 2 The leakage map of the original and the three optimised cases for Network 7

Figure 2 depicts the results of the optimisation, Table 2 shows the results in detail, and Fig. S5 in the supplementary material shows the identified pressure zones in the network. According to the "Original" inset of Fig. 2, most of the leakages are located on the northern side of the network. This area is almost 25 m lower than the reservoir level, which causes a relatively high pressure of approximately 57 m.w.c. in this section. In Limit 1, the algorithm identifies only two separable pressure zones in the network, with only one PRV. The effect of the modification can be seen in the "Limit 1" inset of Fig. 2: after applying the valve and modulating the pressure on the eastern side of the network, the nodal leakages decreased significantly. Numerically, the total leakage of the network decreased by \(22.95 \%\), which means a decrease of approximately 4962 cubic meters of water loss per year.

Although the Leiden algorithm identifies the boundaries of the communities, the differential evolution decides the final type of the valves. If the downstream pressure setting is high, the valve is simply open and acts as a normal pipeline. For example, in Limit 2, the "V1L2" valve has a preset pressure of 98.0 m.w.c., which is even higher than the maximum of the whole network. The opposite case occurs if the setting is very low, as in the case of "V3L2", which means that this valve can be a closed isolation valve. These results highlight that the technique is capable not only of placing PRVs into the network but also of pointing out pipes where a PRV is unnecessary or lines which should be closed to get a more feasible solution. The optimisation resulted in a \(27.8 \%\) decrease in total leakage, which means a saving of 6010 cubic meters yearly.

Similar behaviour is found for "Limit 3". According to Table 2, the valves "V3L3", "V4L3", and "V6L3" are in such critical places that the slightest change in pressure brings the service pressure below 25 m.w.c.; thus, these must remain open to avoid such a scenario. Moreover, "V2L3" should be closed with an isolation valve. The total leakage decrease in this case was \(36.81 \%\), which means 7958 cubic meters yearly. Since this is a simple network, manual analysis is possible. Industrial practice creates pressure zones based on the geodesy of the area together with the engineers’ subjective experience. The presented optimisation technique mimics this approach. The method first detaches the high-pressure area, and as the number of PRVs increases, the pressures of the individual zones become closer and closer to each other.
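The three reported pairs of relative reduction and yearly saving for Network 7 can be cross-checked with simple arithmetic, since each pair implies an original yearly leakage of saving divided by reduction. The baseline itself is not stated in the text; the values printed below are derived from the reported figures only.

```python
# Reported relative reduction [%] and yearly saving [m^3] for Network 7.
reported = {
    "Limit 1": (22.95, 4962.0),
    "Limit 2": (27.8, 6010.0),
    "Limit 3": (36.81, 7958.0),
}

# Baseline leakage implied by each pair: saving / (reduction / 100).
baselines = {
    name: saving / (pct / 100.0) for name, (pct, saving) in reported.items()
}

for name, baseline in baselines.items():
    print(f"{name}: implied original leakage of about {baseline:.0f} m^3 per year")
```

All three limits point to the same original leakage of roughly 21 600 cubic meters per year, so the reported percentages and volumes are mutually consistent.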

The selection of the limits followed the rule that Limit 1 should mean the fewest possible zones, with a resolution of 0.002. In contrast, Limit 3 represents the highest number of rationally placeable PRVs, which in the case of Network 7 corresponds to a resolution of 0.07. Limit 2 is selected between these two cases. It is important to note that the highest rationally placeable number of PRVs is always determined according to the resources and best practice of the area; thus, these limits are selected entirely subjectively and in a similar way for all of the case studies.

Table 2 Type and setting of the valves for different Limit scenarios in Network 2, 3 and 7

3.2 Case Study 2

The second presented network is picked due to its multiple sources, as Fig. 3 depicts. Two of these sources are pumping stations located in the centre of the area, while the other two are tanks in the northern and southern sections. Due to this structure, the definition of the pressure zones is not trivial. The algorithm identifies a resource-efficient solution with the closure of two isolation valves in Limit 1. The technique detaches the southern and northern sides of the network, decreasing the additional pressure from the central pumping stations. Similar phenomena appear in Limit 2 and Limit 3 scenarios. The detailed results of the optimisation can also be found in Table 2, and the identified pressure zones are presented in the supplementary material in Fig. S6.

Fig. 3 The leakage map of the original and the three optimised cases for Network 3

Fig. 4 The leakage histogram comparison of the original and Limit 3 for Network 3. The normalisation is created with the original state’s average leakage

To analyse the results in detail, Fig. 4 presents the distribution of nodal leakages across the network before and after the implementation of the pressure-reducing valves in the Limit 3 scenario. The leakage histogram shows an average leakage reduction of \(26.98 \%\) compared to the original state; Limit 3 saves 37555 cubic meters of water yearly. Moreover, the median value of the leakages also decreases by \(25.92 \%\), i.e. the PRV placement optimisation not only eliminates a few substantial leakages but also lowers the whole distribution by about one quarter. Limits 2 and 1 achieve average leakage decrements of \(12.75 \%\) and \(6.091 \%\), respectively. The median leakage shows a similar behaviour, decreasing by \(5.821 \%\) in Limit 1 and by \(11.49 \%\) in Limit 2.

3.3 Case Study 3

The largest presented network consists of 36240 m of pipe and has a demand of 50.47 \(m^3/h\). The motivation behind presenting these results is to show the limitations of the technique. As Fig. 5 depicts, the network has only one main pumping station in the centre of the area. It also shows that the pressures are relatively low in the southern section of the system compared to the east side, where the pressure reaches 74 m.w.c. This inhomogeneity of the pressure, which in this case originates from the geodesy and the localised source, constrains the maximally achievable leakage decrement.

Fig. 5 The leakage map of the original and the three optimised cases for Network 2

As Fig. 5 depicts, in the case of Limit 1 two PRVs are placed near the central pumping station, which causes a significant decrease in the leakages of the southeast section. Numerically, it means a \(19.65 \%\) decrease in the average leakage amount, which saves 29899 cubic meters of water per year. After this, despite the increase in the resolution of pressure zone detection, the maximal reachable decrease in leakages remains of the same magnitude: Limit 2 achieved an average decrease of \(19.81 \%\) and Limit 3 of \(20.01 \%\). Increasing the number of valves (or pressure zones) could not achieve a significant additional leakage reduction. Financially, it is not worth increasing the number of valves beyond Limit 3.

3.4 Summary of the Optimisation Results

With the leakage reduction technique, seven real-life WDNs were analysed from the region of Western Hungary. The summary of the results is presented in Fig. 6. In every case, a significant leakage decrement is achieved by implementing several pressure-reducing or isolation valves. The only WDN not reaching a 10% leakage decrement is Network 1, where the large extent and minor geodetic differences highly constrained the efficiency of the technique. However, the saved amount of water, which is 9853, 14677, and 26492 cubic meters yearly in the case of Limits 1, 2 and 3, respectively, can justify the implementation of PRVs. In all other cases, the required number of valves is relatively small considering the achieved leakage decrement.

Fig. 6 The leakage map of the original and the optimised case for Network 1

4 Leakage Model Uncertainty

The achieved leakage reduction depends on the values of the leakage exponent \(\beta\), which was taken as 1.18 for all nodes during the optimisation. Determining a more accurate value requires the localisation and investigation of every leak, which might not fit into the budget of the utility. However, the question is inevitable: how does the uncertainty of the leakage model parameters affect the results of the optimised networks? A sensitivity study is performed with the following steps.

  1. The network specific \(\beta\) is changed to a leakage-specific one. This means that in all nodes, a new leakage exponent is calculated based on the age and material of the connecting pipes.

  2. A new network calibration is completed to identify the new network-specific \(\alpha\) to fit the new model to the known water balance.

  3. The networks with the optimised PRV placement and setting are reanalysed with the new leakage distribution.

Table 3 The distribution of the different materials in the analysed networks

First the material specific \(\beta\) parameters are selected based on literature data (Yu et al. 2019; Latifi et al. 2022; Schwaller and van Zyl 2015; Mora-Rodríguez et al. 2014; Ávila et al. 2021; Greyvenstein and Van Zyl 2007; Fontana et al. 2016), see Table 3. To pick a value, a stochastic approach is used.

$$\begin{aligned} \beta _i = \beta _{min}+ \frac{(K_1 + K_2)}{2} \cdot (\beta _{max}-\beta _{min}), \end{aligned}$$
(10)

where \(\beta _i\) is the leakage exponent of the i-th pipe and \(K_1\) is a random number between 0 and 1. The values of \(\beta _{min}\) and \(\beta _{max}\) are taken from Table 3, while \(K_2\) is calculated as follows:

$$\begin{aligned} K_2 = \frac{Y_{i}-Y_{min}}{Y_{max}-Y_{min}}, \end{aligned}$$
(11)

where \(Y_i\) is the age of the i-th pipe, while \(Y_{max}\) is the maximal and \(Y_{min}\) the minimal age of this pipe type. This formulation means that the oldest pipe of a specific material is in the worst "virtual" condition, i.e. \(K_2\) equals 1. The opposite case is the youngest pipe of a specific material, for which this value is always 0. This stochastic approach is arbitrary and can be questioned or further developed.

This normalisation by material is important because polymer pipes appeared only in the last few decades, while asbestos cement (AC) was used in the 1960-1980 era. The formulation approximates the pipe’s condition with its age on the timescale between the youngest and oldest pipes of the same material. Thus, the value of \(K_2\) is one for the oldest AC pipe, the oldest polymer pipe and the oldest pipe of any other material.

Twelve scenarios are defined for every network. In Scenario 1, the value of \(K_1\) is 0, while in Scenario 12, it is 1. These scenarios indicate the extreme boundaries for the best and the worst conditions. In all the other cases, the \(K_1\) values are randomly selected using the Mersenne Twister (Matsumoto and Nishimura 1998). The redistribution of the leakage exponents \(\beta\) makes it necessary to recalibrate the networks to the overall leakage amount. This process was performed for each scenario of every network, i.e. the water balance, or the amount of leaked water, is the same for each scenario in a network, and only its distribution differs. This ensures the validity of the comparison between scenarios.
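The sampling of Eqs. 10 and 11 can be sketched in a few lines. The material range and pipe ages below are hypothetical placeholders (the actual ranges are those of Table 3), the function names are illustrative, and the sketch assumes one \(K_1\) value per scenario, which is only one possible reading of the description above. Python's `random` module is itself based on the Mersenne Twister.

```python
import random


def age_factor(age, age_min, age_max):
    """Eq. 11: position of the pipe age between the youngest and oldest pipe."""
    if age_max == age_min:
        return 0.0  # assumed convention when only one age is present
    return (age - age_min) / (age_max - age_min)


def leakage_exponent(k1, age, age_min, age_max, beta_min, beta_max):
    """Eq. 10: exponent between beta_min and beta_max from K1 and K2."""
    k2 = age_factor(age, age_min, age_max)
    return beta_min + 0.5 * (k1 + k2) * (beta_max - beta_min)


# Hypothetical range for one material; the paper's values are in Table 3.
BETA_MIN, BETA_MAX = 0.5, 1.5
AGE_MIN, AGE_MAX = 5, 60

rng = random.Random(42)  # seeded Mersenne Twister generator
# Scenario 1 uses K1 = 0, Scenario 12 uses K1 = 1, the rest are random.
k1_values = [0.0] + [rng.random() for _ in range(10)] + [1.0]

pipe_age = 40  # hypothetical pipe
for scenario, k1 in enumerate(k1_values, start=1):
    beta_i = leakage_exponent(k1, pipe_age, AGE_MIN, AGE_MAX, BETA_MIN, BETA_MAX)
    print(f"Scenario {scenario:2d}: K1 = {k1:.3f}, beta = {beta_i:.3f}")
```

Because \(K_1\) and \(K_2\) are averaged, the exponent reaches \(\beta _{max}\) only for the oldest pipe in Scenario 12 and \(\beta _{min}\) only for the youngest pipe in Scenario 1.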

Fig. 7 The boxplots of the sensitivity analysis results, where L1, L2 and L3 mark the different search limits

The results of the uncertainty analysis can be seen in Fig. 7. As the boxplots suggest, Scenario 1, when \(K_1\) is equal to 0, often gives the minimal effect. In contrast, the scenario when \(K_1\) is equal to 1 marks the maximal effect of the sensitivity analysis. The only exceptions are the Limit 1 cases of Networks 1 and 6, where the reached leakage reduction is so low that the effect of the different scenarios is hardly separable. The other indicator is the green diamond, which marks the result of the original topology optimisation, when the beta factors, and thus the leakages, were approximated by a homogeneous distribution throughout the network. According to the results, this approximation describes the actual behaviour of the network well: as Fig. 7 summarises, in the case of Networks 1, 2, 3 and 4, the approximated leakage reduction fits nicely into the sensitivity range. In the case of the other three networks, this approximation gives a robust upper estimate; however, in all cases, the pressure-reducing valve distribution identified with this leakage simplification leads to a significant decrease in total leakage.

5 Conclusions

This paper presents an efficient leakage reduction technique that combines the Leiden clustering algorithm with differential evolution; moreover, a stochastic sensitivity analysis quantifies the uncertainty through the analysis of seven real-life WDNs. As a first step of the analysis, a leakage constant calibration is performed with the Nelder-Mead technique to match the non-revenue water data. The next step is the identification of homogeneous pressure zones with the Leiden technique. To increase the accuracy, the graphs of the WDNs are weighted using the normalised pressure for the nodes and the normalised pressure loss for the pipes. Since the number of pressure zones can be scaled by modifying the resolution parameter \(\gamma\), the number of required valves can be adjusted to the available budget and resources. In the last step, the differential evolution finds the optimal valve setting and type. The final type of a valve is decided based on its actual setting: if the setting is low (below atmospheric), the valve is a closed isolation valve, while if the setting is high, it can be an open pipeline. The optimal solution avoids any consumption losses and minimises the leakage.

The applicability of the technique is presented in detail through three case studies. In the simplest case, Case study 1, the method suggested a valve placement that matches the industrial best practice, and it segregates the sections of the network from a geodetic perspective. Since Case study 2 has four different sources, the technique segregates, one by one, the sections where a dominant source can serve a standalone area, and the valves are placed on the connecting pipes between these areas. The last presented case, a more extensive network, has only one source located in the centre of the network. This combination of a large network extent and a single source limited the applicability of the technique: increasing the number of implemented PRVs could not achieve a significantly larger leakage reduction.

Aggregating the optimisation results of all seven analysed networks, we experienced a clear trend that pressure-based segregation can provide a sufficient placement technique for the valves. At the same time, the differential evolution evades the service outages with pressure management. Last, the reached results are supported by a stochastic sensitivity study, which showed that the results could change in the case of the complete network leakage redistribution; however, the reached leakage reduction remains significant.