Introduction

Numerous risk analysis techniques exist, ranging from simple qualitative analyses to methods for dynamic systems, and promising advances continue to be made on important issues in this field. Because of their versatility and the wide range of applications they can serve, these techniques have attracted particular attention, especially in the water sector, where monitoring the quality of the drinking water supply is a major concern for public health policymakers (Tixier et al. 2002).

Water supply systems are among the most critical infrastructures because of their considerable impact on public health (Maier, J., & Allende 1999; Swamee, P. K., & Tyagi 2000). They consist of infrastructure and devices that are susceptible to sabotage, and they share the same main components: raw water sources (usually a reservoir, river intake or aquifer), water treatment plants (comprising various treatment processes) and the networks that distribute drinking water to consumers. Contaminants can seep into the distribution system through storage tanks and pipes, and this can occur during or after maintenance and renovation work (Kirmeyer, G. J., & Martel 2001). Very low or insufficient pipe pressure greatly increases the likelihood of backflow contamination or leakage into the pipe. This can happen when the pipe pressure is reduced for repairs or during pressure transients (e.g. when a hydrant is used for fire suppression or during water hammer). In addition, internal degradation of metal pipes and plumbing equipment can increase the concentration of metal compounds in the water. Not all metals corrode by the same mechanism but, overall, water with low pH, high dissolved oxygen, very high temperature and high levels of dissolved solids results in a higher corrosion rate (Kleiner, Adams, and Rogers 1998). The chemical substances that seep into the water are frequently those originating from the internal lining material of the pipes; they cause a physico-chemical deterioration of water quality, with accompanying health risks.

There have been several cases in which failures at water treatment plants led to epidemics. One such event was the outbreak in Milwaukee, Wisconsin, USA, in April 1993, which affected more than 400,000 people, caused nearly 100 deaths and left the city completely paralysed. Other examples include the 1994 norovirus gastroenteritis outbreak, probably due to the consumption of faecally contaminated water in a nursing home in Albacete (Spain), which resulted in 341 deaths; the 1991 hepatitis E outbreaks in Kanpur (India), which affected 79,000 people; and the norovirus gastroenteritis outbreak in 1999, again probably associated with the consumption of faecally contaminated water in a nursing home in Albacete (Spain). Risk analysis methods are important for managing risks and hazardous events in drinking water production in a controlled manner. Such events may include biological or chemical pollution of the watershed, treatment process failures and distribution system failures such as leakage or water pollution. The field of risk analysis is composed of three main parts. The first is risk assessment, which focuses on measuring both the intensity of the possible hazard or harm and its quantity in the medium assessed, which could lead to the population being exposed to noxious substances and conditions (NRC 1983). The second is risk management, which weighs different policy options in light of the data obtained in the assessment, then selects and implements particular control measures and regulations where required. The third is risk communication, the exchange of vital information about the hazard, which is an essential component of the process. It comprises sending and receiving data, assessments of the hazard and the strategies to be put in place by the technical staff (managers and assessors), as well as feedback from users and other stakeholders (Terje Aven and Zio 2014; Reddi et al. 2015).
In this review article, the methods and tools are illustrated with examples that provide the information necessary to make informed decisions about the risks associated with a treatment and distribution process. We believe this review will be relevant not only to researchers in the area of water production, treatment and consumption but also to officials, legislators and those in charge of operating the system, since the assessment models can be adapted to meet their specific needs.

Review methodology

Our objective was to review the existing literature on advanced risk analysis methods for water supply systems, focusing on the fundamental ideas, and to assess their implementation at the tactical, organisational and executive levels of the chain of command and operations. From the numerous pioneering papers published in recent years, only work considered useful and fundamental to the field was selected, and works deemed of particular importance and current value were ranked higher. We began by summarising the major ideas of 141 published studies, sorting the papers into several groups and describing the particular characteristics of each. We then examined the information collected and drew conclusions about the current status of assessment techniques in water production, distribution, consumption and management.

The review proceeded as follows. First, a number of academic databases were identified as search tools, namely ScienceDirect, PubMed, Scopus, ISI Web of Science, EBSCOhost, Emerald, IEEE Xplore Digital Library, ProQuest, SpringerLink, Taylor and Francis, Wiley InterScience, the American Society of Mechanical Engineers (ASME) digital collection, OnePetro and Google Scholar. These databases were chosen because they cover most of the peer-reviewed scientific literature related to risk analysis. The selected indexed articles were published between 1981 and 2020. The search strategy was carried out in two stages. The first stage was a limited search of the databases for words in the text and the index terms used to describe the articles. The second stage was a keyword search of the databases, in which index terms were also used. The search was restricted to the title, the abstract and keyword tags, e.g., ‘Decision-making OR Water safety’ AND ‘Risk analysis’ ‘methodologies’ ‘Infection’ OR ‘Prevalence’ OR ‘Incidence’ OR ‘Occurrence’ OR ‘Burden’ AND ‘Human’ OR ‘Risk reduction’ ‘Avoidance plan’ OR ‘Water’ OR ‘Consumers’ ‘Causes of system failures’.

The results of the selection process are as follows. A total of 135 studies were selected on the basis of title screening, and six additional papers were identified through bibliographic searches of other sources, so that 141 papers were put forward for summary review. A total of 68 articles went through to a thorough and complete examination, followed by 39 articles chosen to present other methods of risk analysis and their strategic importance. From these articles, just 28 were deemed suitable for inclusion in the final analysis, and 36 articles were excluded because they did not meet the criteria and objectives. The findings of this study give an overview of the potential of different risk analysis methods. This synthesis does not present every element and aspect of the different procedures, which is not its aim; rather, it is intended to enable water utilities that may need the help of risk analysts to consult and apply most of the methods mentioned. Case studies are included to illustrate the application of the different methods in the water sector, and their results are presented in more detail in the cited references.

Figure 1 presents the framework developed for the classification of risk analysis in the water sector. It is divided into five distinct features, which were defined after assessing scientific and industrial contributions published in scientific journals and conference proceedings.

Fig. 1 Classification framework for risk analysis in the water sector

Risk analysis methods

The field of risk has two main tasks: first, to use risk assessment and risk management to study and deal with the risks associated with particular processes; and second, to better understand, evaluate, characterise, communicate and (broadly) manage or control risks (NRC 1983). A number of questions on the matter have emerged recently; for discussions that help clarify the field of risk management, see Hansson and Aven (2014), Hollnagel (2014), Hale (2014), Jean-Christophe Le Coze (2014) and Terje Aven and Zio (2014). The risk field can be characterised by its associated education programmes, journals, papers, researchers, research groups and societies. Hansson and Aven (2014) pointed out that the role of science is to provide us with the epistemically most warranted statements that can be made, at the time being, on subject matters covered by the community of knowledge disciplines: on nature, ourselves as human beings, our societies, our physical constructions and our thought constructions.

Every paper published in an academic or scientific journal contributes to the growth of the field of risk assessment. Some works are general overviews; they remain relevant to various applications, and their general character does not prevent them from examining certain aspects in depth. Other studies mainly cover particular application areas, or cover a single area while remaining applicable to many different sectors. A study may, for example, address the best way to determine risk in a commercial environment, although interest outside that area may be very limited. Such is the case of supply chain risk management, which has recently moved from an emerging focus to an evolving field of research (Hansson and Aven 2014). Fahimnia et al. (2015) review the magnitudes considered in the different models for supply chain risk analysis and identify productive research areas that have been the source of some of the most fundamental knowledge, concepts, theories, tools and techniques. Particularly relevant works include Blackhurst and Wu (2009); Brandenburg et al. (2014); Heckmann, Comes, and Nickel (2015); Jüttner, Peck, and Christopher (2003); Peck (2006); Tang and Zhou (2012); and Zsidisin (2003). The present review covers such contributions (Heckmann, Comes, and Nickel 2015) in light of the paths the industry has recently taken in risk assessment, as summarised by the SRA (SRA 2015).

The transfer of knowledge and experience between fields is difficult to achieve, and branches of an industry commonly develop tailor-made concepts that are not necessarily state of the art in the field. Researchers who devise their own solutions to the particular issues they face may nevertheless deserve more credit and circulation among specialists. Conversely, even limited work in a specific field generally stimulates and influences generic risk research. We will then look in more detail at how science is linked to risk analysis and management, especially where it informs the everyday decision-making of operators and managers.

The scientific community regards research papers as more credible when they clearly state the main problem being tested, when they use established methods of data collection and analysis, and when they are examined by other researchers. Such studies can provide useful information, for example by identifying public health or environmental risks that should be monitored as they develop. This brings us to the risk assessment stage, shown in Fig. 2, which is based on the model of Hertz and Thomas (1983) and illustrates the connection between facts and values in risk decision-making.

Fig. 2 A model for linking the various stages in risk-informed decision-making

During this phase, the body of knowledge is reviewed and a preliminary assessment is made of the risk and of any unforeseen variables that might come into play, in particular those related to the case under scrutiny. The specific values of the decision-makers should be understood and incorporated into the project. At the same time, the burden of proof (onus probandi) based on science, that is, the amount of evidence sufficient for a claim to be considered part of the most accepted scientific thesis, must be distinguished from that based on the practical application of a specific model. Assessment can, however, be so closely linked to specific aspects that it must be carried out by specialists in the field.

As early as 1981, Alvin M. Weinberg and Robert B. Cumming debated the requirements of traditional scientific methods in a number of articles published in the journal Risk Analysis, which is linked with the creation of the Society for Risk Analysis. Risk evaluation can indicate the level of technical know-how, but other factors must be included to assess the dependability of the system within the scientific method. T Aven and Kørte (2003) addressed the issue by showing that decisions made on the basis of science are supported by:

  • Characterisation of the impact of interconnection between nature, technology and social aspects in the system.

  • Characteristics of uncertainty and robustness of different types of know-how related to risk management, and methods for reducing these uncertainties and building knowledge capacity.

  • Studies that identify particular weaknesses or gaps in the knowledge base on which risk assessment and management rest.

We ought to mention the tension between the idea that risk evaluation does not fully meet the requirements of traditional scientific methodology and the idea that it should be a means of facilitating risk management and policymaking. As a consequence of these ideas, the field has shifted from an emphasis on precise risk assessment to an approach in which the presence or absence of the necessary technical know-how and science base gradually takes a predominant role.

Risk has many dimensions, and it can be difficult to give an unequivocal definition that fits different types and contexts. The literature offers many different conceptualisations (e.g. Terje Aven 2009), which vary with the context: the approach of an engineer may differ greatly from that of a social scientist or a laboratory researcher, and the definition also depends on whether the risks considered relate to public health, environmental issues or purely technical aspects. Because of the many definitions, it is necessary to state clearly how the terms are applied in their context. In some cases, risk is simply equated with the possibility of an adverse event affecting the process. Nevertheless, the literature agrees that the classic definition of risk combines probability and consequence (e.g. EC, 2000; IEC, 1995; ISO/IEC, 2002). Stanley Kaplan and Garrick (1981) indicate that the answer to ‘What is risk?’ involves three questions (also addressed by Stan Kaplan 1997): What could happen? How likely is it to happen? What consequences will there be?
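
The three questions are commonly condensed into a set of triplets. The expression below is a sketch of that notation; the symbols are introduced here for illustration, with $s_i$ a scenario (what could happen), $p_i$ its likelihood (how likely it is) and $x_i$ its consequence (what follows), for $N$ identified scenarios:

$$R = \{ \langle s_i, p_i, x_i \rangle \}, \quad i = 1, 2, \ldots, N$$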

In this synthesis, the definition of Stanley Kaplan and Garrick (1981) is used as a starting point for the study of risk and unpredictability. Outcomes and multiple other factors can be used to describe risk fairly. Risk is therefore considered here as a situation that exceeds society’s ability to cope with it. It is not just a physical or technical problem that disrupts a water supply system; it can also stem from other factors such as poor planning, poor management or failure to incorporate the uncertainties of climate change. Indeed, there is now evidence that the climate is changing, that the change is largely man-made and that it will have, or is already having, significant consequences for human civilisation, including the water supply sector. It is a physical event that has deleterious effects on humans and their activities.

Risk must be properly distinguished from uncertainty. Risk involves a known probability distribution that can be used to assess it, whereas deep uncertainty involves problems for which the probability distribution is unknown. As noted in the definitions of risk above, uncertainty is an integral element of risk and should therefore be considered in risk management and decision-making. Although part of the literature treats probability in a risk study simply as a description of uncertainty, that is not the sense followed here: a thorough risk and decision assessment must take into account uncertainties in the probabilities as well as in other aspects of the process. Uncertainty has several sources: some lie in natural factors (aleatory or random uncertainty), while others lie in the absence of the necessary theoretical foundation (epistemic uncertainty), as described in the literature (e.g. Aven and Kørte 2003; Velásquez Guerino 2017).

Dividing the model into further categories can cover a wide variety of potential occurrences within the process. For instance, some studies split the tree into 10 different sources of uncertainty, which can prove essential in the system design and risk analysis phases, to cite just two examples. Point estimates, for example, can be replaced by probability distributions that describe the uncertainty in the different variables, and a Bayesian approach is commonly applied in risk analysis (Czado and Brechmann 2014). Deep uncertainty affects the vast majority of risk analyses of water systems today, as climate change and policy decisions leave water managers with less clarity about future conditions and the impacts of their interventions. This is explained by a limited understanding of the actual sources of risk and by the dire consequences of global warming. A fully developed and standardised methodology for conducting risk assessment studies would narrow the existing gap and might help mitigate negative impacts in the water sector. The risks of global warming and natural hazards should rank much higher in the minds of policymakers. The water sector requires technical means and know-how, as well as funding, to build public capabilities to analyse and respond quickly to natural hazards. If action is not taken to assess and prevent the risks caused by global warming, the damage could be irreversible, with serious repercussions for all aspects of human activity.
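
To illustrate how a point estimate can be replaced by a probability distribution in a Bayesian setting, the following minimal Python sketch updates a Beta prior on a failure probability with observed failure counts. The data (3 failures in 50 demands), the uniform prior and the function names are assumptions made for illustration only; they do not come from the reviewed studies.

```python
import random


def beta_posterior(prior_alpha, prior_beta, failures, trials):
    """Conjugate Beta-Binomial update of a failure probability."""
    return prior_alpha + failures, prior_beta + (trials - failures)


def summarise(alpha, beta, n_samples=20_000, seed=1):
    """Posterior mean (closed form) and a sampled 90% credible interval."""
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(alpha, beta) for _ in range(n_samples))
    lower = samples[int(0.05 * n_samples)]
    upper = samples[int(0.95 * n_samples)]
    return alpha / (alpha + beta), (lower, upper)


# Hypothetical data: 3 pump failures observed in 50 demands, uniform Beta(1, 1) prior.
alpha, beta = beta_posterior(1.0, 1.0, failures=3, trials=50)
mean, (low, high) = summarise(alpha, beta)
print(f"posterior Beta({alpha:g}, {beta:g}), mean = {mean:.3f}")
print(f"90% credible interval ~ ({low:.3f}, {high:.3f})")
```

In this sketch the point estimate 3/50 is replaced by a full posterior distribution whose spread expresses the uncertainty that remains after the observations.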

Finally, the different methods can be classified according to four features: quantitative, qualitative, probabilistic and deterministic. Some studies divide them into main categories: purely deterministic, purely probabilistic and a combination of the two. Deterministic methods comprise an assessment of the product (in our case water as a commodity), the equipment used in the system and the count and measurement of various targets such as the population or the natural environment. The probabilistic approach focuses on the probability or recurrence of hazardous events, and the probability of industrial failure and human error can also be included. Most applications of these techniques focus on a specific sector of the system.

Risk analysis tools, data sources and applications

A significant number of tools are currently available in the field of risk analysis that have been applied to qualitatively or quantitatively compute and combine ‘probability of failure’ and ‘consequences’ at different levels in water supply systems.

Figure 3 shows the ones selected and reviewed in this chapter within the framework of risk analysis and risk assessment, while Table 1 summarises the main references, including the purpose of the analysis and some main features.

Fig. 3 Risk analysis tools within the risk assessment flow chart

Table 1 Overview of risk analysis tools and examples of application

A detailed analysis of each of the tools, including case studies, is now presented sequentially:

Hazard and operability study (HAZOP)

HAZOP is an effective and thorough means of conducting a qualitative analysis of risks in industrial facilities. This detailed and rigorous technique makes it possible to identify hazards and malfunctions across an entire plant or treatment facility by examining every component of the system for deviations and determining whether they can develop into larger problems. The key objectives of the method are to describe the installation or operation fully, including the conditions foreseen at the design stage, to examine how deviations from them can arise, and to determine whether these deviations may endanger the planned system or cause it to malfunction. The essential steps are, first, to split the system into study nodes and, then, to define a number of relevant system parameters, namely pressure, temperature, water volume and flow, chemical composition and preset operating guidelines (see Table 2). This makes it possible to examine the process systematically with the aim of identifying possible deviations that could alter the quantity and purity of the water.

Table 2 Essential HAZOP components

HAZOP is carried out by brainstorming with a set of guide words (such as ‘none’, ‘more than’, ‘less than’, ‘part of’, etc.) that focus ideas and help connect them. One of the main features of the approach is that it provides a creative environment in which the skills of different team members can be used at different stages of the process. Deviations are assessed on the basis of the practical know-how of the team. A typical team consists of a leader, engineers, risk analysts, materials technicians, operations staff, designers, and tool developers and manufacturers (known as OEMs, original equipment manufacturers). HAZOP can be applied to assess safety risks within a model from different perspectives. Advantages of this approach include:

  • Hazards that are hard to quantify are handled better.

  • Neither the frequency and probability of risks nor their consequences need to be quantified mathematically.

  • Likewise, risks do not necessarily have to be ranked.

  • Brainstorming is a tool commonly applied to the process.

  • It is a fairly easy tool to use and apply compared to other risk assessment techniques.

On the other hand, one of its main drawbacks is that it is of little use for multifaceted systems in which many elements intervene and influence one another (Marhavilas, Koulouriotis, and Gemeni 2011).

HAZOP offers an opportunity to think creatively about how risks or operational issues can arise, and its systematic and thorough application helps minimise hazards. Applying ‘guide words’ to each element of the design and its possible risks helps connect those ideas into fruitful solutions. This approach to risk analysis is most relevant to the treatment and distribution of the water supply. If the likelihood of the identified events needs to be quantified, a reliability assessment is also necessary; for minor hazards, the course of action can be decided on the basis of know-how and common judgement. The results of a HAZOP may vary widely with the technical know-how and attitude of the team involved and the management models applied. In other words, it is essential that HAZOP team members have the necessary experience, knowledge and authority to approve the actions decided upon and not to falter when making choices in critical areas. Each team member bears their share of responsibility for the areas they manage and during discussion meetings (Marhavilas, Koulouriotis, and Gemeni 2011).
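
As a rough sketch of how guide words and parameters combine into candidate deviations for each study node, the Python snippet below enumerates every combination as a prompt for the team. The guide words are those quoted in the text; the study nodes, the shortened parameter list and the function name are hypothetical and serve only as an illustration.

```python
from itertools import product

# Guide words quoted in the text; parameters paraphrase those listed for the study nodes.
GUIDE_WORDS = ["none", "more than", "less than", "part of"]
PARAMETERS = ["pressure", "temperature", "flow", "chemical composition"]

# Hypothetical study nodes for a treatment plant (assumed for illustration).
NODES = ["raw water intake", "chlorination unit"]


def candidate_deviations(nodes, parameters, guide_words):
    """Enumerate every node/parameter/guide-word combination as a prompt for the team."""
    return [
        {"node": node, "parameter": param, "guide_word": word,
         "deviation": f"{word} {param}"}
        for node, param, word in product(nodes, parameters, guide_words)
    ]


deviations = candidate_deviations(NODES, PARAMETERS, GUIDE_WORDS)
print(len(deviations), "candidate deviations")
print(deviations[0])
```

In practice the team would discard combinations that are not meaningful and record causes, consequences and safeguards for the remaining deviations; that judgement is the part of HAZOP that the enumeration does not replace.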

Case study examples

Kletz (2018) confirmed the success of HAZOP for incident analysis, explaining four preventable accident cases. After presenting four separate CHAZOP schemes, the authors found that a holistic view of the system was essential and proposed a systematic approach to developing a hazard identification methodology to assess the safety of the system and improve its overall quality. Sikandar, S., Ishtiaque, S., & Soomro (2016) likewise demonstrated the relevance of HAZOP methods in the water sector. The objective of that project was to apply the tool to a hypothetical scenario based on a typical modern water treatment plant. The whole system was described using a ‘process flow diagram’ (PFD), a ‘piping and instrumentation diagram’ (PID) and a set of guide words. The hazards of the whole process were then identified, and some basic safety techniques were applied to address them. In summary, the overall assessment identified approximately 57 hazards, and about 63 actions were recommended for a safer system design, the majority of which (71%) required the installation of new devices.

The HAZOP method has again shown its effectiveness in the water sector. Mohammadfam, Mahmoudi, and Kianfar (2012) carried out a study using two clearly defined methodologies, each with its own capabilities and limitations. The motivation was the heavy use of chlorine, a potentially dangerous substance, in water system installations, which makes chlorination one of their most delicate aspects. The case study was the Tehran water treatment facility, where a system process analysis was carried out focusing on safety management in the chlorination unit. The most important issues to address concerned the installation and updating of equipment, which play an essential part in minimising risk. The results showed that the two techniques (HAZOP and Energy Trace and Barrier Analysis, ETBA) were able to analyse and critique consumption and production thoroughly. Both tools were considered effective in recognising and predicting hazards and could improve the level of safety, prevent harm to machinery and operators and improve the efficiency and consistency of the model by reducing operational issues. When a stage of a model is labelled critical, or when an undesired development is observed during implementation, the team assesses and determines the possible consequences and impact. Tools that can help in these circumstances include failure mode and effects analysis (FMEA) and failure mode, effects and criticality analysis (FMECA). FMEA has been used successfully to identify and prevent malfunctions (Mohammadfam, Mahmoudi, and Kianfar 2012; Ravi Sankar and Prabhu 2001).

Failure mode and effects analysis (FMEA)/failure mode, effects and criticality analysis (FMECA)

FMEA was first developed in the aerospace industry in the 1960s as a formal design approach and has since become a powerful and essential means of anticipating possible problems and preventing their development (Bowles and Peláez 1995). Applied to a process, it can be used to identify failure modes and their causes; a criticality assessment, where required, can also form part of the FMEA. Its main goal is to single out possible failures, to determine their causes and effects in the various components and to define the measures that can be taken to prevent hazards or alleviate their consequences. The results allow operators, managers and policymakers to identify and address failure modes that could have a harmful influence and lead to poor performance at any stage of system implementation and operation. An added benefit of FMEA is that decision-makers can propose ways to eliminate or minimise possible errors and malfunctions, thereby improving the safe operation and reliability of the system. The technique starts by conceptualising the system under examination. The system is then divided into smaller operational units and elements so that possible failures can be identified and addressed through an understanding of their origin. It is also important to determine the current controls (or solutions) for preventing or detecting risk factors and, finally, to evaluate the consequences of a malfunction. For a quantitative analysis, a rating scale has to be established for three factors:

(O) Occurrence: the probability that the failure occurs.

(S) Severity of the consequences.

(D) Detectability: how readily the failure can be detected.

A risk priority number (RPN) is then determined using the following formula:

$$\mathrm{RPN} = O \times S \times D$$
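
As a minimal sketch of how this formula can be applied, the Python snippet below ranks a few failure modes by RPN. The failure modes, the 1 to 10 rating scale and all scores are hypothetical and are not taken from the reviewed studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    """One FMEA row; names and ratings are illustrative assumptions."""
    name: str
    occurrence: int  # O: likelihood of the failure occurring (assumed 1-10 scale)
    severity: int    # S: severity of the consequences (assumed 1-10 scale)
    detection: int   # D: assumed 1-10 scale, 10 = hardest to detect

    def rpn(self) -> int:
        # RPN = O x S x D
        return self.occurrence * self.severity * self.detection


def rank_by_rpn(modes):
    """Return failure modes sorted from highest to lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn(), reverse=True)


# Hypothetical failure modes for a chlorination unit.
modes = [
    FailureMode("chlorine gas leak at cylinder valve", 3, 9, 4),
    FailureMode("dosing pump stops", 5, 6, 2),
    FailureMode("residual chlorine sensor drift", 4, 5, 7),
]

for mode in rank_by_rpn(modes):
    print(f"{mode.name}: RPN = {mode.rpn()}")
```

Because the three ratings are simply multiplied, quite different combinations can give the same RPN, so the ranking is best read as a screening aid rather than a precise measure.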

For a better understanding of how to apply this system, readers are referred to the study by Shafiee et al. (2019). Although applications of FMEA are increasing in sectors where safety is crucial, the limitations of the technique have long been identified and explained in academic studies. One major drawback is that the ratings assigned to O, S and D rest entirely on the experience and judgement of the operators and are therefore subjective. This can be a liability at the decision-making stage, since it may undermine trust in the data. In addition, the three factors are treated as equally important in many case studies. When the analysis also addresses how critical and how quantifiable the undesired events are, it is called FMECA; FMECA is therefore a type of FMEA with a criticality approach. Several levels of failure analysis can be carried out. The methodology disregards interactions between failures, which means that each one must be assessed separately. This may be the reason why, in many cases, operators and organisations do not disclose some of their results (see, for example, Gheibi, Karrabi, and Eftekhari 2019; Dinmohammadi, F and Shafiee 2013; Kabir and Papadopoulos 2018). FMEA has been applied widely in a number of sectors, water production and distribution among them.

The technique can also be used to assess the design and reach of distribution systems, for example with respect to the specific conditions and needs of a user. The data obtained from an FMECA make it possible to prevent risk at different levels or stages, for instance in design, procedures and maintenance. The results can also support changes to the system and the scheduling or modification of maintenance plans and other policies. The technique seeks to answer the following questions:

  • What is the quantified risk of failure for every element within the system?

  • Which are the possible technical sources of such failures?

  • What effects might there be if the failure occurs?

  • What is the criticality threshold?

  • Is the failure headed towards a safe or unsafe outcome?

  • What mechanisms are allowing us to spot a failure?

  • What’s the procedure to follow in case of failure?

The methodology is simple enough to understand and requires little formal training. However, it is advisable to use a facilitator with a certain amount of technical know-how and past experience of the technicalities involved. The combined skills of the participants in the analysis form the common knowledge that can be applied to the system. When an FMECA study is applied to a whole water facility and its piping from source to user, human resource requirements will be fairly high. The study can be reported and recorded in a spreadsheet, but plenty of dedicated software is also available on the market and can be used to arrange risks and procedures differently. There are multiple versions of FMECA sheets, although this is a fairly typical example. The data obtained can be used to reduce risk at all stages of the process. The results of the analysis allow managers to identify and correct the different types of failure that have a negative effect on the system, and so to improve it. These methodologies can be applied to an entire water system or to a particular part of the process that needs to be improved or corrected (Kabir and Papadopoulos 2018).
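
As a rough illustration of how an FMECA worksheet can be turned into a ranking, the sketch below multiplies the O, S and D ratings into a risk priority number (RPN) and sorts the failure modes by it. The failure modes, the 1 to 10 rating scales and the ratings themselves are hypothetical values chosen for illustration; they are not taken from any of the studies cited here.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    occurrence: int  # O: 1 = rare, 10 = frequent (assumed scale)
    severity: int    # S: 1 = negligible, 10 = catastrophic (assumed scale)
    detection: int   # D: 1 = always detected, 10 = undetectable (assumed scale)

    def rpn(self) -> int:
        """Risk priority number, the product O x S x D."""
        return self.occurrence * self.severity * self.detection


# Hypothetical worksheet rows for a small treatment and distribution system.
worksheet = [
    FailureMode("Chlorine dosing pump stops", occurrence=4, severity=8, detection=3),
    FailureMode("Pipe burst in main line", occurrence=3, severity=7, detection=2),
    FailureMode("Backflow during low pressure", occurrence=2, severity=9, detection=7),
]

# Highest RPN first: these failure modes would be addressed with priority.
ranked = sorted(worksheet, key=lambda fm: fm.rpn(), reverse=True)
for fm in ranked:
    print(f"{fm.name}: RPN = {fm.rpn()}")
```

Because the three ratings rest on expert judgement, a ranking of this kind is only as reliable as the ratings that feed it.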

Case study examples

The success of FMEA was demonstrated in a field study carried out at a water facility in Razavi Khorasan, Iran (Gheibi, Karrabi, and Eftekhari 2019). First, the researchers applied brainstorming techniques with a modification of the Delphi method to identify possible risk factors in the gas chlorination modules. They split these factors into two groups reflecting their interconnection within the gas chlorination building. In a second step, they identified the specific variables and conditions involved by applying an FMEA. The resulting data were then compared for validity. The final stages of the project were dedicated to structuring the results into a Petri net model. The results showed that, based on the FMEA risk factors, those risks in rank 46 and higher had to be labelled as absolute priorities for the system and for the functioning of the unit under study.

A different approach to FMECA in a water system was applied in Tucson, AZ, USA, within the RWSS (Regional Water Supply System) (Hwang, Lansey, and Quintanar 2015). This system handles drinking water, reclaimed (reused) water and wastewater for a developing area of the city. The purpose of the study was to subject the model to a series of hypothetical but possible complete failures in terms of water volume. FMEA was then used to assess how critical the risks are among component malfunctions. The results showed that there was a resilience advantage to maintaining the CWF (Central Well Field). In the initial study, costs were not taken into account. Further studies ought to integrate costs into an FMECA structure so that decisions balance both economic and functional variables. A tree of events can present the tolerated amount of redundancy across several levels of degradation, so that the correct policy can be applied to preserve the component or prevent its malfunction. This sort of analysis can cover a multitude of failure scenarios by applying a technique called Markov analysis.

Markov analysis technique

Markov analysis is a thorough and complex approach based on a schematic model that allows gradual changes between various states. It permits more appropriate modelling of risk in systems where conditions change quickly. A Markov chain reflects how events may occur over time, and these can be calculated and predicted using statistical distributions. The structure of dependence between one state and the next has to be fairly simple in order to give results. In this sort of analysis, predictions are drawn from the present state, not from past events, if we agree with the conclusions of Kabir and Papadopoulos (2018). Data such as types of component malfunction, failure sequences and the availability of replacement parts can also be included in the Markov analysis to indicate the transition rates and so determine how safe and reliable a component is. This method was developed at the beginning of the twentieth century (Fu, Li, and Huang 2012). The technique is named after the Russian mathematician Andrei Andreyevich Markov, who studied stochastic processes, i.e. processes that include the operation of chance (Source). It was first used to describe and predict the behaviour of gas particles in a closed container. The analysis makes it possible to define a new sequence of random but related events that resembles the original sequence. Markov analysis is a probabilistic technique that facilitates decision-making by providing a probabilistic description of various outcomes (Fu, Li, and Huang 2012). As a management tool, it has been applied satisfactorily to a number of different industrial processes, thanks to its versatility and accuracy.

In a Markov analysis, it is possible to define different operating stages or levels of systemic deterioration. Example A: Two pumps are required to perform a certain task. The choice is between installing both pumps or having a third one in reserve. A Markov analysis can then be performed to compare the performance of these two options. The results of the analysis for each option are as follows:

  • The statistical probability that only one pump will have to be used in a yearly period.

  • The average time spent running on only one pump, together with the number of times a pump can malfunction and break down.

In example B, the analysis focuses on the decay and deterioration of industrial components and thus of the system. Say, for instance, that we are assessing wastewater pipes. Equipment is available for water systems that can easily measure the remaining wall thickness of the pipes. On the basis of these measurements, it is also possible to determine the state of deterioration of several water-conducting sections. Because the model captures the transitions between those states, it helps to prevent and address the issue. In conclusion, Markov analysis is considered a thorough and innovative technique. Skilled personnel are required to carry out the analysis. A certain amount of historical data is needed to describe both the states and the transitions correctly. As with any of the models presented above, as much appropriate data as possible needs to be collected. Note that when several changes of state are analysed, say three or four, a data management tool is advisable, as collecting and arranging the data can be time-consuming and tedious (Fu, Li, and Huang 2012).
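
The deterioration case in example B can be sketched as a discrete-time Markov chain. The minimal example below assumes four condition states for a pipe section and a yearly transition matrix; both the states and the probabilities are hypothetical and would in practice be estimated from inspection data such as wall-thickness measurements.

```python
# Hypothetical condition states for a pipe section.
STATES = ["good", "fair", "poor", "failed"]

# Assumed yearly transition probabilities; each row sums to 1.
# Row i gives the probabilities of moving from state i to every state.
P = [
    [0.90, 0.10, 0.00, 0.00],  # from good
    [0.00, 0.85, 0.15, 0.00],  # from fair
    [0.00, 0.00, 0.80, 0.20],  # from poor
    [0.00, 0.00, 0.00, 1.00],  # failed is absorbing (no repair modelled)
]


def step(dist, matrix):
    """Advance the state distribution by one time step (dist times matrix)."""
    n = len(dist)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def forecast(dist, matrix, years):
    """State distribution after the given number of yearly steps."""
    for _ in range(years):
        dist = step(dist, matrix)
    return dist


initial = [1.0, 0.0, 0.0, 0.0]  # a new pipe starts in the "good" state
after_10 = forecast(initial, P, 10)
for state, prob in zip(STATES, after_10):
    print(f"{state}: {prob:.3f}")
```

The forecast depends only on the current distribution and the transition matrix, which is why regular updates of the matrix matter more than a long event history.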

Case study examples

Sempewo and Kyokaali (2016) applied Markov analysis as a decision-making tool to forecast the current and probable future state of a water distribution network, in a case study at Kampala Water, Uganda. The study applied a Markov model to predict the future condition of the network. The results of this study suggest that this would ideally be the best solution. When combined with statistical data to predict the future of the system, the model can shed light on the condition of pipe cohorts in a water network. Although it can help to predict future failures within the system, it cannot be extended far into the future, since its accuracy decreases with time. In such cases, frequent and regular collection and storage of data, and updating of the transition matrices, are advisable where possible. This would greatly facilitate the work of managers and operators in terms of repair, maintenance and budgetary constraints. Chiam, Yih, and Mitchell (2009) derived a control policy for a water facility system using a Markov decision process. The model considered different systemic factors such as water shortages or cuts, lack of storage and how much water can be used or collected in a given period (one hour in this case). The results were compared with a system that used a basic (baseline) policy. They indicated that personal judgement and what ‘feels right’ are not necessarily beneficial for the system.

In cases where there is a public scare about the health risks to consumers caused by inadequate water quality, an evaluation of the health impacts of microbiological contaminants in drinking water may be required, for example a Quantitative Microbiological Risk Assessment (QMRA).

Quantitative Microbiological Risk Assessment (QMRA)

QMRA is a technique that has been generally known since 2004. It assesses source water quality and the treatment process, as well as the possible risks to users' well-being. This objective can be achieved in many ways; carrying out a microbiological contamination study is one of the ways to assess the presence of harmful microbes. Distribution hazards are not commonly cited among the elements of a QMRA (Medema and Ashbolt 2006; Haas, Rose, and Gerba 2014). QMRA uses pathogen concentrations in source water, together with treatment system information entered by the user and consumption and dose–response data for the different micro-organisms analysed and cited in the academic literature within our scope. These data are interpreted to determine the risk of infection, the risk of disease and disability-adjusted life years (DALYs). These measures are expressed on two time bases, daily and yearly. Results are displayed so that they can be compared not only with official standards (such as the ‘Health Canada’ target of 1 × 10⁻⁶ DALYs per person per year) but also with what the system can tolerate as an acceptable level of risk. The basis for QMRA was defined in Medema and Ashbolt (2006) and has since been implemented in the context of risk management (Medema and Ashbolt 2006).
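
The chain of calculations described above, from source concentration to DALYs, can be sketched as follows. This is a minimal illustration and not the model used in any of the cited tools: it assumes an exponential dose–response model, independent daily exposures, and entirely hypothetical values for the source concentration, log reduction, consumption, dose–response parameter r, illness probability and DALYs per case.

```python
import math


def daily_dose(source_conc_per_litre, log_reduction, litres_per_day):
    """Organisms ingested per day after treatment removes `log_reduction` log10 units."""
    treated_conc = source_conc_per_litre * 10 ** (-log_reduction)
    return treated_conc * litres_per_day


def infection_risk_exponential(dose, r):
    """Exponential dose-response model: P(infection) = 1 - exp(-r * dose)."""
    return -math.expm1(-r * dose)


def annual_risk(daily_risk, days=365):
    """Risk of at least one infection in a year, assuming independent daily exposures."""
    return 1.0 - (1.0 - daily_risk) ** days


def annual_dalys(annual_infection_risk, p_illness_given_infection, dalys_per_case):
    """Disease burden in DALYs per person per year."""
    return annual_infection_risk * p_illness_given_infection * dalys_per_case


# All inputs below are hypothetical and for illustration only.
dose = daily_dose(source_conc_per_litre=1.0, log_reduction=4, litres_per_day=1.0)
p_daily = infection_risk_exponential(dose, r=0.02)
p_annual = annual_risk(p_daily)
dalys = annual_dalys(p_annual, p_illness_given_infection=0.5, dalys_per_case=1.7e-3)

TARGET_DALYS = 1e-6  # health target quoted in the text, per person per year
meets_target = dalys <= TARGET_DALYS
print(f"daily risk = {p_daily:.3e}, annual risk = {p_annual:.3e}, DALYs = {dalys:.3e}")
print("meets target" if meets_target else "exceeds target")
```

Varying the log reduction in a sketch like this shows directly how much treatment is needed before the estimated burden falls below the health target.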

QMRA is a clear and consistent methodology for comparing the risks of a system with a health objective. It works by determining whether a model or route is safe, and can also identify the conditions under which safety may be compromised. To make the model accessible to a wide range of users, it has been developed as a spreadsheet tool that runs in Microsoft Office or in compatible free alternatives, which makes it much more accessible for the developing world. Mathematical models have been developed by international organisations (Smeets 2008; Teunis et al. 1997; Schijven et al. 2015), as well as by other groups in Canada (Benoit Barbeau, Pierre Payment, Josée Coallier 2000; Jaidi et al. 2009; Murphy and King 2016), to quantitatively assess potential microbiological hazards linked to the water system under assessment. These models cover potential risks associated with bacterial, protozoan and viral pathogens. Most of the models are not available in a format that is easy to download and use. However, the QMRA model developed for regulatory use in the Netherlands (QMRAspot) can be downloaded and used by anyone who wants to focus on risk assessment in their drinking water system (Medema and Ashbolt 2006). The model needs to be operated by someone with the necessary know-how and experience, who can carefully consider the input terms to be used in the particular case under scrutiny. Nevertheless, it is a fairly accessible tool. It should not be forgotten that the most crucial stage in applying the model to a water supply is the collection of the necessary data before the system is fully developed and put in place.

Case study examples

Case studies have shown how implementing the QMRA framework facilitates a deeper understanding of the system’s management and of its overall modelling, adapted to the specific needs of each case (Teunis et al. 1997; Jaidi et al. 2009). QMRA studies of small water systems are not abundant and are usually conducted in wealthy countries. Barker et al. (2013) estimated DALYs due to norovirus, Giardia and Campylobacter at a small, isolated research station far from any human settlement, where wastewater was reused directly for drinking. Davis Station, Antarctica, was the study site. A QMRA was performed using norovirus, Giardia and Campylobacter as reference pathogens to determine the level of treatment required to meet the tolerable annual disease burden of 10⁻⁶ DALYs per person per year. The isolation of the community made it all the more important to consider the number of inhabitants, the population density and their vulnerability to pathogens and disease before planning adequate treatment. The QMRA revealed that, in the event of an outbreak, more control barriers should be put in place to ensure access to safe fresh water.

Petterson (2016) also applied QMRA to drinking water management. The aim was to build a QMRA framework that could identify which option is more profitable and less harmful when choosing between different interventions and treatments at the design and development stages. The effects of these interventions on household drinking water were estimated in terms of the overall numbers of diarrhoeal illness cases or DALYs, without using the concentration of pathogens in source water as the basis of the analysis. The Nepalese case required the creation of a framework model and a digital tool to facilitate the task and the presentation of results. The model demonstrates how the QMRA framework can incorporate data from different studies to aid decision-making by managers and operators, and to identify the next steps for reducing the incidence of disease in a local setting. As a way of anticipating future developments, the tool overcomes many of the obstacles and challenges encountered when relying solely on epidemiological studies to compare interventions. QMRA has thus been regarded as a means of assessing the microbial safety of drinking water systems (Benoit Barbeau, Pierre Payment, Josée Coallier 2000). A fault tree analysis (FTA) would be necessary if the whole water facility were to be examined and possible failures of the network identified.

Fault Tree Analysis (FTA)

FTA is a logical and schematic tool. It is used to assess whether the occurrence, or not, of certain pre-established conditions would lead to the event that is to be foreseen and, in most cases, prevented (Rådbo, Svedung, and Andersson 2008; Ale et al. 2006). FTA allows the network to be translated into a logical diagram, making it one of the popular methods currently applied by safety and reliability professionals in industry (Volkanovski, Čepin, and Mavko 2009; Ale et al. 2006). FTA is a top-down deductive analysis tool in which the causes of an event are deduced from the appearance of certain indicators. It provides a clear picture of the many factors involved in failure modes and hazards, namely equipment, human, internal and external factors, which can lead to a fatal failure with dire consequences for both the network and its users. Technical errors may be treated as basic events, while human errors are considered intermediate events that may escalate until they become a technical issue. To carry out an FTA, a main (top) event must be identified. This main event is a critical situation that might cause the system to fail. The construction of the tree starts with the definition of the main event. Subsequently, the events leading immediately to each event are determined and linked by logic gates. This process continues until all root causes of the top event are found; see Fig. 4. FTA is a binary analysis, which means that every event either occurs or does not. In addition, an FTA can be qualitative, quantitative or both, depending on the information available and the purpose of the analysis. For more details on how to conduct an FTA, see Haimes (2009).

Fig. 4 Example of a fault tree (A Lindhe 2010)

FTA is most appropriate for the analysis of components such as pumping stations or treatment plants. It can also be applied to the analysis of networks, but this can become much more complex. In the context of an overall risk analysis for the water sector, it is possible to carry out a specific analysis of the factors that cause certain risk events. For instance, a sensitive safety barrier may need to be analysed in more detail to identify the events that could lead to its failure. For example, it may be necessary to analyse the performance of a specific treatment system (UV disinfection, filtration, CO2, etc.) in relation to the possible failures of its components. The advantage of this analysis is that it provides a complete picture of the system and of how the failure relationships work within it. In addition, weak points in the system can be identified, allowing managers to optimise operations by prioritising the causes of a major event and taking action to avoid them. FTA can also help to identify the sources of a failure in the system and to assess how a basic event may contribute to the higher-level event (Glickman and Erkut 2007; Ale et al. 2006).

Similarly, it can take into account events of different natures—mechanical, functional or natural. An FTA can be applied with the reliability data of the base events to approximate the reliability of the system; when the main event is system failure, the base events are normally component failures (Glickman and Erkut 2007). It is therefore important to point out that significant training and experience is also required in order to use an FTA correctly, as the success of its implementation depends heavily on the skills of the analyst (Haimes 2009).
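
A quantitative FTA of this kind can be sketched with two gate functions. The example below assumes independent basic events and a hypothetical pumping station in which the top event occurs if both pumps fail, or the power supply fails, or the control system fails; the tree structure and all probabilities are illustrative assumptions.

```python
from math import prod


def and_gate(probabilities):
    """Output event occurs only if all input events occur (independence assumed)."""
    return prod(probabilities)


def or_gate(probabilities):
    """Output event occurs if at least one input event occurs (independence assumed)."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# Hypothetical basic-event probabilities for a pumping station.
p_pump_a = 0.05
p_pump_b = 0.05
p_power = 0.01
p_control = 0.02

# Intermediate event: loss of both redundant pumps.
both_pumps_fail = and_gate([p_pump_a, p_pump_b])

# Top event: the station fails to deliver water.
top_event = or_gate([both_pumps_fail, p_power, p_control])
print(f"P(both pumps fail) = {both_pumps_fail:.4f}")
print(f"P(station fails)   = {top_event:.5f}")
```

The AND gate shows the benefit of redundancy: two pumps that each fail with probability 0.05 contribute far less to the top event than the single power supply.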

Case study examples

Although the use of FTA has its limitations, it has proven to be a powerful tool for risk analysis in water utilities. Ugarelli and Røstum (2012) demonstrated the use of FTA in a large pumping station in the western part of Oslo. The purpose of the FTA was to perform a thorough analysis of the station and to gain a better understanding of how an adverse event could occur. The hypothesis was based on a rough risk analysis performed earlier by the authors. The main event in this analysis was the malfunctioning of the pumping station. Thirty-six first-order cut sets (only one component failure or event is required for the system to fail) and four second-order cut sets (two component failures or two events are required) were identified using a qualitative approach. A quantitative approach was also used. The unreliability of each component was determined using the ratio of the time needed to repair the component to the time for the component to fail. Following this, the probability of the top event was calculated and the events that contributed most to the degradation of the pumping station were identified using the Birnbaum importance measure. These events included fire leading to a failure in the electrical cabinet, and sabotage. In the long term, FTA is an excellent tool for identifying the critical events that contribute most to system failure. It can also be used to identify risk reduction measures to enhance the reliability of the pumping station. Beauchamp, Lence, and Bouchard (2010) identified the technical and functional risks of a water treatment plant using a quantitative FTA technique with operator experience as input. The main objective of their research was the improvement of technical and operational factors. Lindhe et al. (2009) carried out a risk assessment of drinking water distribution systems by applying dynamic fault tree analysis in Gothenburg, Sweden. Taheriyoun and Moradinejad (2015) assessed the risk of a water treatment plant in the west of the city of Tehran by applying the FTA technique. The probability of failure of the top event was analysed on the basis of minimal cut sets.
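
The cut-set and importance calculations mentioned in these case studies can be sketched as follows. The example is not a reconstruction of any cited study: the components, repair and failure times, and cut sets are hypothetical, component unavailability is assumed to take the common steady-state form MTTR / (MTTF + MTTR), and components are assumed to fail independently. The top-event probability is computed exactly by enumerating all component states, which is only practical for small trees.

```python
from itertools import product


def unavailability(mttr_hours, mttf_hours):
    """Steady-state unavailability, assumed here to be MTTR / (MTTF + MTTR)."""
    return mttr_hours / (mttf_hours + mttr_hours)


def system_fails(failed, cut_sets):
    """The top event occurs if every component of at least one cut set has failed."""
    return any(cut_set <= failed for cut_set in cut_sets)


def top_probability(q, cut_sets):
    """Exact top-event probability by enumerating all component states."""
    names = list(q)
    total = 0.0
    for states in product([True, False], repeat=len(names)):
        failed = {name for name, is_failed in zip(names, states) if is_failed}
        if system_fails(failed, cut_sets):
            weight = 1.0
            for name, is_failed in zip(names, states):
                weight *= q[name] if is_failed else 1.0 - q[name]
            total += weight
    return total


def birnbaum(q, cut_sets, component):
    """Birnbaum importance: P(top | component failed) - P(top | component working)."""
    return (top_probability({**q, component: 1.0}, cut_sets)
            - top_probability({**q, component: 0.0}, cut_sets))


# Hypothetical component data (hours).
q = {
    "power": unavailability(mttr_hours=8, mttf_hours=792),
    "pump_a": unavailability(mttr_hours=24, mttf_hours=456),
    "pump_b": unavailability(mttr_hours=24, mttf_hours=456),
}

# One first-order cut set and one second-order cut set.
cut_sets = [frozenset({"power"}), frozenset({"pump_a", "pump_b"})]

print(f"P(top event) = {top_probability(q, cut_sets):.6f}")
for name in q:
    print(f"Birnbaum importance of {name}: {birnbaum(q, cut_sets, name):.4f}")
```

In this sketch the single-component cut set dominates the importance ranking, which mirrors the general point that first-order cut sets deserve attention first.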

Event Tree Analysis (ETA)

ETA was first applied in the field of atomic energy and has gradually gained acceptance in other fields such as chemical and mechanical engineering (Beim and Hobbs 1997). A triggering event is taken as the starting point, followed by the foreseeable subsequent events, such as a system process failure or a construction malfunction. These propagate sequentially from the triggering event and are illustrated graphically. ETA is a system model for determining the safety of a system from the safety of its sub-events. It is called an event tree because the graphical presentation of sequential events grows like a tree as the number of events increases. As shown in Fig. 5, it consists of an initial event, probable subsequent events and the final results caused by the sequence of events. The likely subsequent events are independent of each other, and the specific final outcome depends exclusively on the initiating and subsequent events.

Fig. 5 Event tree structure (Ang AH 1984)

In the design phase, ETA is used to verify the criteria for improving system performance, to obtain fundamental data on operations and to test management. It can also help to identify useful methods for protecting a system against failure. ETA is the most popular method of analysing the progression of a hazardous event from its origin to its final consequences. Event trees and fault trees are still widely applied in a number of sectors, particularly in the water sector. ETA has many advantages over other risk analysis tools; in particular, it can be used to identify all accidental events and processes that may occur in a complex system. The main steps in ETA are (i) identifying all relevant initiating events that may lead to an unacceptable risk, (ii) identifying all safety actions required to reduce the risk, (iii) constructing an event tree, (iv) describing the different potential accidental sequences, (v) assigning probabilities to each sequence of events and (vi) calculating the probability of success or failure of the system by summing the probabilities of the relevant sequences. On the other hand, ETA is limited by its inability to analyse several initiating events or safety incidents at the same time (Hong et al. 2009).
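
Steps (iii) to (vi) can be sketched in a few lines. The example below assumes one initiating event with a hypothetical yearly frequency, three independent safety barriers with hypothetical success probabilities, and an illustrative rule that a sequence counts as unsafe when at least two barriers fail; none of these values come from the cited studies.

```python
from itertools import product

# Hypothetical initiating event: contamination of the raw water source.
initiating_frequency = 0.5  # assumed events per year

# Hypothetical safety barriers as (name, probability that the barrier works).
barriers = [
    ("online monitoring alarm", 0.95),
    ("operator closes intake", 0.90),
    ("disinfection", 0.99),
]


def event_tree(frequency, barriers):
    """Enumerate every success/failure sequence and its yearly frequency.

    Barriers are assumed independent, so each branch multiplies the
    frequency by the success or failure probability of the barrier.
    """
    sequences = []
    for outcome in product([True, False], repeat=len(barriers)):
        freq = frequency
        for (_, p_success), works in zip(barriers, outcome):
            freq *= p_success if works else 1.0 - p_success
        sequences.append((outcome, freq))
    return sequences


sequences = event_tree(initiating_frequency, barriers)

# Illustrative rule: a sequence is unsafe if at least two barriers fail.
unsafe_frequency = sum(freq for outcome, freq in sequences if outcome.count(False) >= 2)

for outcome, freq in sequences:
    label = ", ".join("ok" if works else "FAIL" for works in outcome)
    print(f"[{label}] frequency = {freq:.6f} per year")
print(f"Unsafe sequences in total: {unsafe_frequency:.6f} per year")
```

The sequence frequencies always sum to the frequency of the initiating event, which is a useful consistency check on a hand-built tree.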

Case study examples

To date, few studies have been conducted on the risk assessment of water and wastewater treatment systems. Analouei, Taheriyoun, and Safavi (2020) assessed the probability of an effluent standard violation and its consequences in a wastewater treatment plant (WWTP). The objective of the study was to determine the risk of the treatment system violating effluent water quality parameters in relation to standard limits. The case study was a treatment plant located in Isfahan, Iran. The impacts of the adverse events in this study were classified into different scenarios using event tree analysis (ETA). The results showed a 41% risk of violation of the effluent standard limit in the plant. Furthermore, the risk factor analysis revealed that human error made the largest contribution to the calculated risk of failure of the WWTP. In this regard, a number of mitigation measures were proposed to improve the reliability of the WWTP.

Human Reliability Analysis (HRA)

HRA has been designed to perform a structured assessment of human reliability and human error probabilities; see, for example, Akyuz and Celik (2015), De Felice, Zomparelli, and Petrillo (2017) and Konstandinidou et al. (2006). HRA involves the use of qualitative and quantitative methods to identify and quantify the human contribution to the risk profile. HRA has three basic functions: (1) identifying human errors, (2) predicting their likelihood and (3) reducing their likelihood if necessary (see, for example, Swain and Guttmann 1983; Dhillon 1989; Park and Jung 1996; Kirwan 2017; Embrey, Kontogiannis, and Green 1994). Human reliability is defined as the ability of humans to perform well-defined tasks in a fully satisfactory manner, whether equipment maintenance, equipment or system operation, safety actions, analysis or other types of human tasks that influence system performance (Calixto 2016). Human error is the opposite of human performance and, essentially, of reliability. Generally, human error is defined as the failure to perform a defined task (or the performance of a prohibited action) that may cause disruption of planned tasks or damage to equipment and assets. The most common reasons for human error are inadequate training and skills, inappropriate maintenance instructions and operating methods, poor work layout, poor design of certain equipment and unsuitable work tools (McLeod 2015).

Therefore, in order to carry out an effective HRA, it is essential to have a good understanding of the different types of human error, such as error of commission, error of omission and intentional error, and the factors that influence them. It should be noted that there are several factors which have an influence in human error, such as internal (psychological and physiological) or external (technological and social) factors that determine human performance (Rollenhagen, Westerlund, and Näswall 2013). One of the most complex phases of HRA is the determination of human reliability. Human reliability values are difficult to determine and can be uncertain (Ravi Sankar and Prabhu 2001).

HRA focuses on the consequences of the actions of human operators and maintainers within the system. It can both quantify and qualify the influence that human failure can have on all aspects of water supply. HRA is not a single technique but a collective term for several methods, whose main steps are always task analysis, human error identification and human reliability quantification. HRA can, for instance, be applied to manned operations (Ravi Sankar and Prabhu 2001). Task analysis examines how suitable a team of operators is and what they are required to do to achieve their aims. It encompasses a variety of techniques and methods that shed light on the human–human and human–machine interactions within the network and water system. The tasks need to be described as thoroughly as possible in order to identify human error and be able to correct or prevent it. Human error identification finds and describes possible erroneous actions, while human reliability quantification estimates the probability of those actions. There are several methods for carrying out task analysis. The time required for an HRA depends on the scope of the analysis. It may be time-consuming to analyse quantitatively all work processes involving human actions in the water supply system. Access to relevant human reliability data as a basis for quantitative analysis may be a problem. If no specific human reliability data for water supply systems are available, the analysis will probably have to be based on generic human reliability data from other industries.
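
The quantification step can be illustrated with a very simple calculation. The sketch below assumes a task broken down into independent steps, each with a human error probability (HEP), and no recovery of errors once made; the task steps and HEP values are hypothetical and are not drawn from any published human reliability database.

```python
from math import prod

# Hypothetical task analysis: steps of a manual chlorine dosing adjustment,
# each with an assumed human error probability (HEP).
task_steps = {
    "read chlorine residual": 0.003,
    "adjust dosing setpoint": 0.010,
    "record value in log": 0.005,
}


def task_success_probability(heps):
    """Probability that every step is performed without error (independent steps)."""
    return prod(1.0 - hep for hep in heps)


reliability = task_success_probability(task_steps.values())
task_hep = 1.0 - reliability  # probability of at least one error in the task

print(f"Task reliability: {reliability:.5f}")
print(f"Task HEP:         {task_hep:.5f}")

# The step with the highest HEP is a natural first candidate for error reduction.
weakest_step = max(task_steps, key=task_steps.get)
print(f"Largest contributor: {weakest_step}")
```

Established HRA methods add refinements such as performance-shaping factors, dependence between steps and error recovery, none of which are modelled here.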

Case study examples

Wu, S., Hrudey (2009) examined human reliability analysis (HRA) and its role in reducing human error and improving drinking water safety, proposing the reduction of human error as a future objective of water research. Human errors contributed to 62 drinking water accidents in affluent countries between 1974 and 2001. The authors suggested carrying out a proof of concept for the application of HRA to water quality incidents. The gestation and life cycle of drinking water incidents, and the human errors in each life cycle period, were determined. Through a reanalysis of the case studies, the long time lag of drinking water incidents was reconfirmed and the active role of latent errors and third parties was reported.

Kančev (2020) presented a sensitivity study of human reliability analysis performed on a probabilistic evaluation model. The analysis was performed on a preselected set of post-initiator operator actions. The objective was to examine the impact of these operator actions on the risk in the production unit by varying the corresponding human error probabilities over a wide range. The results showed that future effort should focus on maintaining the current level of human reliability, i.e. not letting it deteriorate, rather than on improving it. Probabilistic risk studies have shown that the human factor can contribute significantly to overall risk. Human reliability analysis focuses on the potential for, and mechanisms of, human error that affect risk and safety. HRA involves both quantitative and qualitative aspects, aiming to design operator interfaces that minimise operator errors and provide for error detection and recovery.

There are several methods of risk analysis, varying in complexity and detail. Risk analysis of infrastructure assets is widespread in the sector and is an essential tool for asset management. Approaches range from the simple coupling of GIS techniques with infrastructure data to visually track utility assets and examine the risk factors associated with them (e.g. Doyle and Grabinsky, 2003; Booth and Rogers 2001) to the complex integration of data-intensive GIS with hydraulic simulations to assess the risk of intrusion into the distribution system (e.g. Lindley TR 2002).

This study presents the summary results of risk analysis methods for water systems. It provides an overview of the main risk analysis methods for a water utility. The aim is to define the tasks associated with risk analysis, to demonstrate the applicability and capabilities of the different methods and thus to support the implementation of the Generic framework and methods for integrated risk management in water safety plans.

As a result of our review, Table 3 summarises the advantages and disadvantages of the tools from our point of view.

Table 3 Advantages and disadvantages of risk analysis tools applied in the water supply sector

Further discussion

It can be seen that the vast majority of the publications on risk analysis methods mentioned in this review each have different characteristics regarding the management of risks and hazardous events in the water sector. The risk picture for water utilities is therefore quite complex and includes technical, biological and human aspects of a large and diverse system.

However, this paper reports on more formalised and explicitly sophisticated approaches that can be used to analyse service risk in the water sector in more detail. These methods demonstrate the developments and progress that have been made on risk issues in the water sector (Renn 2008). These methods are more advanced than the simple ‘coarse’ risk analysis (CRA). CRAs are usually limited to a semi-quantitative estimate of risk, indicating levels of probability and different categories of consequences. The various methods and tools available today, as well as possible future methods and tools, provide better means than before for risk analysis to provide useful decision support on risk issues.

It should be emphasised that the result of the risk analysis is never decision-making, but provides important information for the decision maker to use (e.g. Kammen and Hassenzahl 1999). In order to reach an effective management of risks a water system needs not only analysis and policymaking but also some structure and commitment. So, complementary methods which integrate the whole water system are mandatory if we want to guarantee a water supply free of dire risks for its users.

These are the reasons why WHO has proposed a risk-based approach and the preparation of water safety plans (WSPs) (WHO 2008; Viljoen et al. 2010).

Viljoen et al. (2010), reporting research conducted in South Africa, described the WHO recommendations as ‘a very useful guidance document that is relatively easy to follow and adapt to specific requirements’ (p. 179). Senior management commitment, qualified and fully focused resources, and in-depth knowledge of relevant water quality issues along the supply chain are essential for water safety governance. WSPs are mandated in countries such as Australia, Iceland, New Zealand, Serbia, Switzerland, Uganda and the UK (Roeger and Tavares 2018). The basic concept of a WSP is to assess the entire drinking water system, identify possible hazards and plan the monitoring and operation of the system to control the risks.

By 2016, the number of countries with WSPs had increased to almost 90 (García et al. 2018). The mandatory adoption of WSPs in Iceland has proven beneficial and is an important tool for improving water quality and minimising disease (Gunnarsdóttir, Gardarsson, and Bartram 2012). This study highlights the existence of several barriers related to the institutional relationships between policymakers, regulators and management agencies, which have been instrumental in determining the success or failure of water safety planning, particularly in rural areas. The study also shows that implementing this process influences inter-institutional relationships by building bridges between stakeholders, facilitating communication and supporting the management of relationships between them.

The application of risk analysis methods shows the progress made in providing access to improved water sources in urban areas, with beneficial effects. However, progress in rural areas has been less positive than might have been expected. Risk analysis methods are easily implemented in urban water supply systems but rarely applied to rural water supplies (Howard et al. 2005). Indeed, rural communities around the world remain at risk because of their vulnerability to drinking water hazards (Hunter, MacDonald, and Carter 2010).

Several factors hinder the application of risk analysis methods in rural areas. In deprived countries, people in rural areas use a wide variety of water sources: rainwater collected from the roofs of houses and stored in various containers (iron or plastic barrels, plastic jerry cans, aluminium pots, etc.); water from springs and rivers; and groundwater obtained from wells (simple holes dug in the ground) and traditional wells, which may or may not be equipped with sheet metal collecting barrels. These wells are usually located close to the houses, and many of them are open or fitted with a makeshift cover. The water is drawn with a bucket or a cut-open can attached to a rope. In this type of water supply system, the risk analysis methods discussed here are very difficult to apply. In other rural areas, the infrastructure and water supply systems are most vulnerable to failure and contamination (WHO 2008).

Distribution system operators are often untrained or undertrained and sometimes unpaid. They may work only part-time and may have other responsibilities within the community or privately. Yet applying risk analysis methods generally requires a certain amount of operational experience and skill, and for some methods knowledge and experience in the field are essential. These methods are carried out by experts in close cooperation with the staff of the water utility. Training is therefore a key factor in the application and implementation of such methods. In urban areas, water and sanitation protection is greater than in rural areas, a difference that has been reported in most countries on the continent (Sogbanmu et al. 2020).

Over the past decade, the number of piped drinking water supply projects in rural areas has grown rapidly. The water safety plan (WSP) has been widely recommended as a systematic way to improve drinking water safety. So far, however, this approach has been little used to ensure drinking water safety in areas where people are forced to travel far from their homes to fetch water. In rural areas, the majority of people still consume unsafe water, despite the implementation of WSPs and the provision of improved water sources by municipalities.

We carried out a thorough and systematic analysis of similar studies in the field (January 2000 to May 2020), covering a total of 100 studies. An example of unregulated water consumption was identified in 21% of them, while the remaining 79% did not specify this. Only 7% of the papers mentioned a rural community dependent on an unregulated water source. Among the other hazardous conditions mentioned, groundwater was reported in 67% and chemical hazards in 82%. As for assessment methods, 14% were stochastic and statistical, while the vast majority were deterministic. Publishing activity has increased by 57% in Asia (more than half of the total), while 47% of studies mentioned at least one publication gap in the areas of community risk analysis and management in rural areas (Pond et al. 2020).

Despite many comments on this issue, policies and programmes remain too focused on short-term infrastructure provision and neglect the need for medium- and long-term sector support in deprived rural areas. The environmental sustainability of rural water supply programmes, including adequate rural risk analysis methods, has so far not received sufficient attention. More environmentally integrated approaches to water supply and sanitation are needed, for instance in a context where management and water processes (treatment, depuration, distribution, etc.) are all integrated.

Overall conclusions

The water services sector has advanced considerably in the area of risk management and analysis. The present review provides a more complete picture of risk analysis methods for water facilities and water infrastructure in general. It aims to demonstrate the application of various risk analysis methods and tools for water service systems. It can serve as a guide for the operation of a water plant, allowing managers and operators to assess the most critical situations.

This study has sought to incorporate notable new work and technical breakthroughs, and to consider the fields from which they originate. Having reviewed many relevant papers on the subject, we draw the following main conclusions:

  1. The scientific foundation of risk assessment and risk management is still an open issue. The scientific basis for these studies needs much deeper investigation, since general assumptions and some technical know-how can sometimes be misleading.

  2. Principles, theories and basic methods applicable to water supply systems are in continuous development; existing tools are suitable, and a growing number of applications of great interest are available.

  3. Risk analysis methodologies are still on their way to gaining the broad technical, community and political acceptance they need in the water treatment sector. Some gaps (particularly in rural areas) and opportunities (particularly in governance implementation) have been included in the discussion.

As a final remark, responsible policy in the water supply sector requires a more thorough approach to baseline studies, monitoring, evaluation and knowledge management in general than has been adopted so far. Policies should make more effective use of existing knowledge in the decision-making framework to achieve the sustainable development goals of providing safe water to the general population. We hope that this review can further stimulate research and policy by creating a stronger platform to address current and future challenges, especially for rural water security.