1.1 Introduction

The traditional approach to the design, operation, and regulation of complex engineering systems in general, and nuclear systems in particular, has been deterministic in nature, with safety principles such as defense in depth, fail-safe criteria, redundancy, and diversity forming the basic framework. Given the competitive market conditions in this era of globalization, nuclear systems need a strategy that meets market demand effectively without compromising safety. The major goal here is to realistically understand the available engineering margins and associated uncertainties to meet safety and availability goals and make these systems more sustainable.

Even though the deterministic approach is time-tested and has worked well over the years, it has some limitations. It is conservative and prescriptive in nature and does not provide a measurable parameter for the safety and reliability of engineering systems. Experience and research have shown that the deterministic approach, with defense in depth as an integral part, reasonably assures safety but often leads to expensive systems and technologies that society and the market cannot afford. Further studies have shown that while some designs and regulations based on conservative approaches appear to reduce the risk of complex engineering systems, e.g., nuclear plants, this may come at exorbitant cost and still may not guarantee safety [1]. With advances in technology, for example, in computational techniques, understanding of materials, simulation methods, and availability of data and information, there is increasing interest in the application of best estimate and, further, risk-informed approaches [2]. In spite of these developments, the approach to addressing uncertainty remains by and large conservative, i.e., based on the application of relatively large safety factors. Safety factors provide a way to compensate for a lack of knowledge and data, but they often make systems more complex, costly, and unsustainable. Requirement 15 of the IAEA Safety Standard on safety assessment for facilities and activities states that both deterministic and probabilistic approaches should be used in safety demonstration [3]. Further, the integrated risk-informed approach employing a deterministic and probabilistic framework is an established tool in support of decision-making [4].

Probabilistic risk assessment (PRA), also called probabilistic safety assessment (PSA), and the deterministic approach, along with comprehensive consideration of uncertainty and human factors, provide the basic framework for a holistic risk-based approach. Apart from PRA, other methods, such as hazard and operability analysis (HAZOP) and failure mode and effect analysis (FMEA), which are often considered qualitative risk evaluation methods, are also used to derive risk insights in support of design and operation evaluation in the chemical and process industry.

1.2 Historical Perspective on Probabilistic Risk Assessment and Risk-Based Applications

Farmer’s paper in 1967 [5] provided a new approach for site selection. However, Rasmussen’s [6] comprehensive and landmark PRA study, known as the WASH-1400 safety study, and later the German risk assessment performed for European Union nuclear plants [7], laid the foundation for the risk-based/risk-informed approaches.

The evidence over six decades shows that nuclear power is a safe means of generating electricity; however, the three major nuclear plant accidents (Three Mile Island, USA, in 1979; Chernobyl, Ukraine, then part of the USSR, in 1986; and Fukushima Dai-Ichi, Japan, in 2011 [8]) raised some questions. These mainly concern the capability or tolerance of plant and systems for human and institutional failures, considerations related to combined events, multi-unit site issues, gap areas in utility–regulator relations, and public communications, particularly during emergency conditions.

After the Three Mile Island accident, the role of PRA was recognized as a tool for improved and systematic understanding of nuclear plant design and operational safety issues. It became clear that safety cannot be based on a few selected accident scenarios, referred to as maximum credible accidents in the traditional deterministic approach. Instead, it was found necessary to incorporate all the modes of individual component failures to construct a system model that facilitates quantitative prediction of all levels of safety.

The literature shows that over 200 Level 1 PRA studies had been performed for nuclear power plants (NPPs) up to 2011 [9]. In fact, the current literature indicates that PRA studies have been completed for most NPPs. Further, there is growing interest in applying the PRA process to chemical plants [10]. The publication of USNRC Regulatory Guide 1.200 [11] has encouraged the application of PRA studies in support of risk-informed decisions. The successful development and deployment of risk monitors in many advanced countries and the development of risk-based in-service inspection programs and risk-based maintenance management programs are just a few examples of applications of Level 1 PRA for addressing risk.

Some of the areas where the deterministic and probabilistic approaches work together at the component level are the reliability/risk-based approach to component design, the physics-of-failure approach to electronic system risk modeling, probabilistic fracture mechanics for mechanical component failure risk, damage modeling, and software reliability modeling. In each of these, uncertainty characterization forms part of the results alongside the estimate of component failure probability. There is increasing interest in applying the risk-based approach [12] to shipbuilding [13], off-shore drilling [14, 15], aircraft design [16], space systems [17], chemical and process systems [18], marine systems [19], flood risk mitigation and management [20, 21], and the banking and financial sector [22]. Most of these applications employ risk analysis tools, be it probabilistic risk assessment, failure mode and effect analysis, or event tree or fault tree models, to assess and manage risks. Basically, these applications focus on managing the hazard in question so that safety is addressed effectively while performance objectives are achieved.

For regulatory applications, the integrated risk-based approach can be applied at two levels. First, it can improve the inspection procedures that form an integral part of regulatory review by prioritizing activities and optimizing the resources that support regulatory inspections. Second, the review itself can be performed on a risk-based footing, with stipulated deterministic and probabilistic goals and criteria, an organizational and management framework for principles, etc.

1.3 Integrated Risk-Based Engineering Approach

There are concerns about the risks associated with system failure on the one hand and its impact on reliability and system availability on the other. Unfortunately, this information, particularly in terms of quantified estimates, is not available from the deterministic approach. Nevertheless, the deterministic approach has always provided intuitive and qualitative insights into the risk and reliability of systems. In fact, the traditional deterministic approach to decision-making has also been risk-informed in nature, as the “qualitative notion of risk” forms an integral part of decision-making even in the traditional approach. The only difference is that the current definition of the risk-informed approach uses quantified estimates of risk obtained from probabilistic risk assessment as input, along with the traditional deterministic considerations, for decision-making.

Therefore, it can be argued that the qualitative and quantitative risk insights available from deterministic and probabilistic approach, respectively, provide a robust foundation for an integrated risk-based engineering approach. The deterministic approach provides basic principles, goals, and criteria, while probabilistic approach provides a rational and integrated methodology for system modeling toward arriving at quantified estimates of risk and reliability. These estimates are based on the system process configuration, engineering details, understanding of root causes of failures, and component reliability data.

The conventional definition of risk-based approach is where decisions are based on the insights provided by probabilistic safety assessment only. The other definition of risk-based approach is “an approach where the decisions are based on risk assessment,” without going into the argument for probabilistic or deterministic. The conventional notion of a risk-based approach essentially means decisions are based on inputs and insights from risk assessment.

Here we define integrated risk-based engineering (IRBE) approach as “an approach where the deterministic and probabilistic methodology is employed in an integrated manner toward reducing uncertainty in arriving at the solution. The considerations of human reliability and quality assurance form a part of IRBE framework. The provision for prognostics and surveillance forms an integral part of solution metrics.”

There is no explicit and clearly demarcated boundary between deterministic and probabilistic considerations for engineering systems, as these two aspects are overlapping and to some extent integrate into any engineering problem. Figure 1.1 depicts the role of deterministic and probabilistic approach in IRBE. As can be seen, the basic building blocks that make this approach have both deterministic as well as probabilistic elements and have overlapping functions at all the levels, viz. structural level to component and through subsystem to system and finally plant level for hardware and software systems.

Fig. 1.1 Role of deterministic and probabilistic elements in IRBE

The requirement for PRA stems not only from the need for a quantified statement of safety but also from the need to characterize uncertainty so that safety margins are better understood. On the other hand, the fundamental scientific formulations and models, design criteria, design rules, and failure criteria are derived from the deterministic approach, and the probabilistic approach starts with the availability of this information.

The probabilistic approach in IRBE provides an integrated model of a system using sound and well-understood reliability engineering tools that further allows assessment of the performance of individual components and their impact on plant safety. It also provides a quantified statement of safety of a system (as core damage frequency for the case of nuclear plants) and at lower level the system unavailability, as well as a quantitative evaluation of both aleatory and epistemic uncertainties. Finally, the approach integrates human performance with the integrated model of a system. IRBE also provides an integrated framework for validation of results against stipulated goals and criteria and provision for monitoring and feedback for continued assurance for safety in support of any design, operation, and regulatory review. The major premise of IRBE is that deterministic and probabilistic approaches together have the capability to provide holistic solutions based on risk considerations in the present context of global and competitive market scenario.

1.4 Factor of Safety and Uncertainty

In a deterministic approach, the factor of safety is based on conservative assumptions that account for uncertainty in data and models, with the intent of ensuring safety margins. The factor of safety F is defined as:

$$ F = \frac{S_{m}}{s_{m}} $$
(1.1)

where \( s_{m} \) and \( S_{m} \) are the mean stress and mean strength, respectively. The factor of safety often has a relatively large value. This approach worked well when data on the mechanics of failure, advanced technologies, and computational systems were not available. With the advent of technology, availability of material databases, and best estimate codes, it became possible to characterize stress (s) and strength (S) quantitatively, not only as the point values \( s_{m} \) and \( S_{m} \) but also through the uncertainty associated with them, as shown in Fig. 1.2. Without these distributions, the ratio of \( S_{m} \) to \( s_{m} \) used to be relatively large. With the mean values \( s_{m} \) and \( S_{m} \) and the associated standard deviations \( \sigma_{s} \) and \( \sigma_{S} \) available, it became easier to provide effective and efficient designs. This formulation also provides an effective mechanism to evaluate the failure probability, i.e., a statement of safety or reliability.

Fig. 1.2 Stress–strength representation: a perspective on safety margin

Figure 1.2 highlights the importance of uncertainty assessment in optimizing the design stress and strength toward removing unnecessary conservatism. Stress and strength are both assumed to follow normal distributions with means \( s_{m} \) and \( S_{m} \), respectively. The overlapping shaded area indicates the failure probability region. The probability of failure, i.e., the probability that stress is greater than strength, is given as:

$$ F_{\text{pr}} = \int_{-\infty}^{\infty} f_{s}(s) \left\{ \int_{-\infty}^{s} f_{S}(S)\,{\text{d}}S \right\} {\text{d}}s $$
(1.2)

where \( f_{s} \left( s \right) \) and \( f_{S} \left( S \right) \) are the probability density functions of stress and strength, respectively. The design objective is to reduce the probability of failure to an acceptable value and at the same time to remove unnecessary conservatism.
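As a minimal sketch of how Eq. (1.2) can be evaluated, the following Python snippet assumes independent, normally distributed stress and strength, as in Fig. 1.2. Under that assumption the double integral reduces to the standard closed form \( \Phi(-\beta) \) with \( \beta = (S_{m} - s_{m})/\sqrt{\sigma_{S}^{2} + \sigma_{s}^{2}} \), which the code compares against a direct numerical integration. The numerical values (in MPa) and all function names are hypothetical and chosen only for illustration.

```python
import math


def normal_pdf(x, mean, sd):
    """Probability density of a normal distribution."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def failure_probability_closed_form(s_m, sd_s, S_m, sd_S):
    """P(stress > strength) for independent normal stress and strength."""
    beta = (S_m - s_m) / math.sqrt(sd_S ** 2 + sd_s ** 2)
    return normal_cdf(-beta, 0.0, 1.0)


def failure_probability_numeric(s_m, sd_s, S_m, sd_S, n=20000):
    """Evaluate Eq. (1.2) with the trapezoidal rule over stress values."""
    lo = s_m - 10.0 * sd_s
    hi = s_m + 10.0 * sd_s
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        s = lo + i * h
        weight = 0.5 if i in (0, n) else 1.0
        # Inner integral of Eq. (1.2): probability that strength is below s.
        total += weight * normal_pdf(s, s_m, sd_s) * normal_cdf(s, S_m, sd_S)
    return total * h


# Hypothetical design values (MPa), not taken from the text.
s_m, sd_s = 300.0, 30.0   # mean stress and its standard deviation
S_m, sd_S = 450.0, 40.0   # mean strength and its standard deviation

factor_of_safety = S_m / s_m  # Eq. (1.1)
p_closed = failure_probability_closed_form(s_m, sd_s, S_m, sd_S)
p_numeric = failure_probability_numeric(s_m, sd_s, S_m, sd_S)

print(f"factor of safety F = {factor_of_safety:.2f}")
print(f"failure probability: closed form {p_closed:.3e}, numeric {p_numeric:.3e}")
```

The same factor of safety can correspond to very different failure probabilities depending on \( \sigma_{s} \) and \( \sigma_{S} \), which is why the distributions, and not only the means, matter for removing unnecessary conservatism.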

The increasing need for highly efficient, cost-effective, and reliable systems is pushing design toward a probabilistic approach. The probabilistic design approach addresses the probability of failure directly and hence gives better design estimates. It helps move the conservative deterministic design toward a more realistic one. This, as part of IRBE, allows the designer to prepare an optimum design that meets safety requirements while being cost-effective and efficient. A quantitative notion of risk and uncertainties removes excessive conservatism from the design.

1.5 Basic Framework for Integrated Risk-Based Engineering

Figure 1.3 shows the basic IRBE framework. The first step in IRBE is the identification and formulation of requirement specifications. The requirements need to be formulated according to the type of application, e.g., system design, change evaluation in support of plant operation, or regulatory review. The next step is identification of the deterministic and probabilistic components of the problem or issue at hand, followed by detailed analysis. A deterministic analysis, for example, a thermal-hydraulic analysis to assess passive system reliability, involves characterizing uncertainty in the deterministic variables through one of the available simulation approaches, such as Monte Carlo simulation of the governing equation, to arrive at the final estimate of the parameter with uncertainty bounds. Similarly, modeling the frequency of pipe rupture might require a probabilistic fracture mechanics approach, wherein the structural inputs and the crack initiation and growth parameters are treated as random variables. The reliability parameters for hardware, software, and human error are derived from either generic or plant-specific sources, along with the associated uncertainty, and form the input for the PRA modeling. A PRA approach is required to create an integrated model to estimate the failure probability of the system. The PRA model requires a fault tree or event tree approach, depending on the nature of the problem.
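The Monte Carlo treatment of a deterministic model mentioned above can be sketched as follows. The governing equation used here (a steady-state energy balance for coolant temperature rise), the input distributions, and all names are hypothetical stand-ins; a real thermal-hydraulic assessment would use a validated best estimate code and justified distributions in their place.

```python
import random
import statistics


def coolant_temperature_rise(power_kw, flow_kg_s, cp_kj_kg_k=4.18):
    """Hypothetical governing equation: steady-state energy balance (K)."""
    return power_kw / (flow_kg_s * cp_kj_kg_k)


def propagate_uncertainty(n_samples=20000, seed=1):
    """Sample the uncertain inputs and summarize the output distribution."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    results = []
    for _ in range(n_samples):
        power = rng.gauss(1000.0, 50.0)   # kW, assumed normal
        flow = rng.uniform(9.0, 11.0)     # kg/s, assumed uniform
        results.append(coolant_temperature_rise(power, flow))
    results.sort()
    return {
        "mean": statistics.fmean(results),
        "p05": results[int(0.05 * n_samples)],  # lower uncertainty bound
        "p95": results[int(0.95 * n_samples)],  # upper uncertainty bound
    }


point_estimate = coolant_temperature_rise(1000.0, 10.0)
summary = propagate_uncertainty()
print(f"point estimate: {point_estimate:.2f} K")
print(f"mean {summary['mean']:.2f} K, "
      f"90% interval [{summary['p05']:.2f}, {summary['p95']:.2f}] K")
```

The point estimate alone hides the spread; the percentile bounds are the quantity that IRBE carries forward into the comparison against goals and criteria.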

Fig. 1.3 Basic framework of integrated risk-based engineering

Human factor considerations form part of the deterministic as well as the probabilistic evaluation. For a given design, deterministic considerations might provide information on the type of human–machine interface, the time required for a given action, and plant procedures such as technical specifications and emergency procedures, while the probabilistic approach provides an effective framework for integrating human performance in the risk model of the plant.

Further, the IRBE procedure requires input on plant surveillance provisions, which include testing frequency, online monitoring, condition monitoring, and prognostic capability at plant, system, and component levels. This information is required to evaluate the plant’s capability to identify a failure in advance so that corrective actions can be initiated. The surveillance and monitoring capability also plays a critical role after the change has been effected, by providing feedback on the performance of the subject system.

Integrated risk assessment is an iterative process in which all the assumptions and uncertainty bounds at component, system, and plant levels are evaluated to ensure that the change complies with the risk and performance goals, so that the subject change can be accepted. The results obtained from the integrated model are validated against the applicable deterministic and probabilistic performance criteria, achievable goals, or regulatory stipulations. If the evaluation does not meet the set goals and criteria, the complete process is revisited until an acceptable solution is obtained.

1.6 Major Elements of Integrated Risk-Based Engineering

This section deals with the typical procedural steps in IRBE to introduce and demonstrate the concepts in this approach. These steps are indicative only and not exhaustive, and depending on the type of application, it may be necessary to add or remove a few steps. For example, the following steps are typical for implementing a change in specifications in the plant.

  1. Define the problem,
  2. Define the objective and scope of the analysis,
  3. Identify major assumptions,
  4. Identify system boundary and limitations/constraints,
  5. Identify quality attributes for specific tasks or applicable code/standard,
  6. Identify initiating events and set of input variables/data,
  7. Identify safety provisions or engineering safety features,
  8. Gather design and operational/expected performance data and information,
  9. Perform deterministic and probabilistic modeling analysis,
  10. Create an integrated (preferably dynamic) model for effective sensitivity analysis,
  11. Ensure that human performance aspects are integrated in the model,
  12. Estimate uncertainty in input variables,
  13. Simulation and modeling/analysis and sensitivity analysis,
  14. Compare the set of probabilistic and deterministic established goals and criteria in conjunction with insights on uncertainty,
  15. Provide independent and/or regulatory review,
  16. Ensure regulatory compliance and oversight,
  17. Implement change, retrofit, or modification,
  18. Follow-up and monitor trends through PHM procedures,
  19. Document and record.

Even though the steps in IRBE have been listed in chronological order, the complete implementation involves many parallel activities with recursive tasks. The problem at hand should be formulated crisply so that the objective and scope can be defined unambiguously.

Many steps in IRBE are governed by the nature of the issue at hand, e.g., whether the problem deals with a new design, supports existing plant operations (a change in configuration or operating policy), or supports regulatory review. Hence, the nature of the evaluation and the resources required will change depending on the scope and objective of the analysis.

The deterministic analysis provides failure criteria as an input for probabilistic analysis. For example, for a loss of coolant accident scenario, the coolability criteria and, accordingly, the injection flow requirements are derived from deterministic analysis. In IRBE, the specific requirement is to provide not only the point value of a parameter such as emergency flow but also its uncertainty bounds (such as minimum and maximum flow). The probabilistic model, in turn, is developed based on the given plant configurations that satisfy the objective and performance function of the plant. The probabilistic goal could deal with complying with a system-level unavailability target (e.g., \( 1 \times 10^{-3} \)/demand) and a goal on the change in core damage frequency (CDF) (e.g., change in CDF <1% of the reference value). The acceptable radiation level criteria may require consideration of the as low as reasonably achievable (ALARA) principle.

The analysis should state its assumptions, constraints, and the limitations of the analysis tools, methods, and data, and should validate them through sensitivity analysis. As mentioned, uncertainty modeling, in both its deterministic and probabilistic aspects, is an integral part of IRBE. Similarly, the system boundary should be clearly marked, and if there is an interfacing connection with another system, it needs to be considered as part of the analysis.

The quality of the deterministic and probabilistic analysis is critical and has a direct bearing on the results. For example, only validated best estimate codes should be used in the analysis. The analysis should take note of the codes and standards followed in designing these systems. If any part of a system does not meet the standards, then the provision of additional supporting safety criteria should be demonstrated in the analysis. If a change affects the plant technical specifications, then compliance should be checked. It should be ensured that the human error events which form part of the analysis are covered in the plant emergency operating procedures. If certain human actions are not considered in the available plant emergency operating procedures, then the recommendations of the analysis should reflect this requirement.

Similarly, quality assurance for PRA is also critical, particularly when a real-time application is being developed. Subjecting the various procedural and logical steps to a check against risk assessment quality attributes is vital to obtain the required confidence in decision-making. The available frameworks for implementing a quality checklist for each of the procedural elements, including initiating event selection, system modeling, human reliability analysis, and common cause failure analysis, include international standards and documents such as IAEA-TECDOC-1101 on quality assurance in PRA [23], IAEA-TECDOC-1804 on quality attributes for Level 1 PRA [24], and the ASME/ANS standards [25]. Apart from this, national guides and standards also enable characterization of the quality of the applications.

The plant system description, as in any other analysis, forms an important element of IRBE. This includes system descriptions, associated drawings, system modes of operation, a description of the plant/system logic, and the expected behavior of the system in various modes such as normal operation, transient, and emergency conditions.

The deterministic principles and criteria provide the basic framework for ensuring safety. The role of the probabilistic approach should be to consolidate, not dilute, these principles. For example, if the deterministic principles provide for a redundant system to cater to certain safety functions, the probabilistic approach should provide the probabilistic evaluation, i.e., a quantitative statement of the reliability or availability of these redundant provisions.

Other important deterministic or design/operation and maintenance aspects that need to be considered for evaluation include: (a) safety code or plant technical specification applicability, (b) assessment of consequences, e.g., in terms of leak size, containment scenario, and radioactivity release and, hence, adequacy of emergency provision in respect of on-site and off-site emergencies, (c) considerations of in-service inspection insights in respect of structural integrity assessment, (d) credit to be given for condition monitoring or prognostics provisions and status indications, (e) assurance against human performance and training-related aspects, (f) requirements related to the test override provision, (g) demonstration of built-in system capacity and plans and procedures to assess plant’s coping capability for a certain scenario, and (h) requirements of plans and provisions for regulatory review and oversight.

The objective of data collection and analysis is to generate reliability estimates for hardware components and common cause failures. The data should facilitate simulation with a range of test intervals to aid optimization. For example, steady-state unavailability data or data on demand failure probability do not allow dynamic modeling. Also, test interval optimization requires dynamic equations with respect to incremental changes in time intervals, and hence, use of steady-state values is not appropriate. Most of the components involved in the safety systems are modeled as standby-tested components. Accordingly, the standby-tested model is used to determine component failure probability [26].
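To illustrate why the test interval has to appear explicitly in the model, the sketch below uses one common textbook form of a standby-tested component, which is not necessarily the model of [26]. It assumes a constant standby failure rate, perfect tests that restore the component to as-good-as-new, and an outage of fixed duration during each test. Under these assumptions the unavailability grows as \( 1 - e^{-\lambda t} \) between tests, its interval average is approximately \( \lambda T/2 \), and the approximate optimum interval is \( \sqrt{2\tau/\lambda} \). All names and numbers are hypothetical.

```python
import math


def unavailability_since_test(hours_since_test, failure_rate):
    """Time-dependent unavailability between two tests: 1 - exp(-lambda * t)."""
    return -math.expm1(-failure_rate * hours_since_test)


def mean_unavailability(failure_rate, test_interval, test_duration=0.0):
    """Interval-averaged unavailability plus the test-outage contribution."""
    x = failure_rate * test_interval
    # Average of 1 - exp(-lambda * t) over one interval (about lambda * T / 2).
    undetected_failure = 1.0 + math.expm1(-x) / x
    # Fraction of time the component is out of service for testing.
    test_outage = test_duration / test_interval
    return undetected_failure + test_outage


def approximate_optimal_interval(failure_rate, test_duration):
    """Minimizer of lambda * T / 2 + tau / T, valid when lambda * T << 1."""
    return math.sqrt(2.0 * test_duration / failure_rate)


failure_rate = 1.0e-5   # per hour, hypothetical
test_duration = 2.0     # hours out of service per test, hypothetical

for interval in (168.0, 720.0, 2160.0):
    q = mean_unavailability(failure_rate, interval, test_duration)
    print(f"T = {interval:6.0f} h -> mean unavailability {q:.3e}")

t_opt = approximate_optimal_interval(failure_rate, test_duration)
print(f"approximate optimum test interval: {t_opt:.0f} h")
```

A single steady-state unavailability value would give no handle on the interval; here both the undetected-failure term and the test-outage term vary with it, which is what makes optimization possible.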

The insights obtained from deterministic analysis in the form of available system redundancy, diversity, cooling criteria, or system failure criteria form the inputs to the probabilistic methods for risk assessment. Keeping in view the complexity of the problem and the problem definition, the Level 1 PRA model of the plant should be used. There are two major aspects of PRA model simulation—event tree analysis and fault tree analysis. Event tree modeling is performed to generate accident sequences associated with given initiating event and plant response in terms of success or failure of actuation of safety systems or human action. The fault tree analysis is performed for arriving at safety system failure probability.
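The interplay of the two techniques can be shown with a deliberately small sketch. It assumes independent basic events (no common cause failures), a hypothetical injection system made of two redundant pump trains and one common valve, and a hypothetical recirculation system represented by a single probability. The fault tree gives the injection failure probability, and the event tree combines it with an initiating event frequency to give sequence frequencies. None of the numbers or names come from the text.

```python
def and_gate(*probabilities):
    """Output event occurs only if all independent input events occur."""
    result = 1.0
    for p in probabilities:
        result *= p
    return result


def or_gate(*probabilities):
    """Output event occurs if at least one independent input event occurs."""
    none_occur = 1.0
    for p in probabilities:
        none_occur *= 1.0 - p
    return 1.0 - none_occur


# Fault tree: injection fails if both pump trains fail OR the common valve fails.
pump_train_failure = 1.0e-2      # per demand, hypothetical
common_valve_failure = 1.0e-4    # per demand, hypothetical
injection_failure = or_gate(
    and_gate(pump_train_failure, pump_train_failure),
    common_valve_failure,
)

# Second safety function, represented here by a single hypothetical value.
recirculation_failure = 5.0e-3

# Event tree: initiating event followed by success/failure of each function.
initiating_event_frequency = 1.0e-3  # per year, hypothetical
f, qi, qr = initiating_event_frequency, injection_failure, recirculation_failure
sequence_frequencies = {
    "injection OK, recirculation OK": f * (1.0 - qi) * (1.0 - qr),
    "injection OK, recirculation fails": f * (1.0 - qi) * qr,
    "injection fails": f * qi,
}

# Sequences assumed, for this sketch, to end in an undesired state.
undesired_frequency = (
    sequence_frequencies["injection OK, recirculation fails"]
    + sequence_frequencies["injection fails"]
)

for name, frequency in sequence_frequencies.items():
    print(f"{name}: {frequency:.3e} per year")
print(f"undesired end-state frequency: {undesired_frequency:.3e} per year")
```

The deterministic analysis enters through the success criteria: whether one pump train is sufficient (as assumed by the AND gate) is decided by the coolability analysis, not by the probabilistic model.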

Judgment or decision-making should not be based on point estimates. It is advisable to perform uncertainty analysis for real-time applications. Uncertainty analysis accounts for randomness in the variables, lack of knowledge or data, and inadequacy of the model employed for probabilistic modeling. Uncertainty analysis provides the upper and lower bounds of the estimates along with the median and mean values of the parameter. However, for this illustration, uncertainty and sensitivity analyses have not been performed.
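A minimal sketch of such an uncertainty analysis is given below. It assumes, as is common practice in PRA but not stated in the text, that each basic event probability is lognormally distributed and specified by a median and an error factor (the ratio of the 95th percentile to the median). The small system model, the parameter values, and the independent sampling of the two train probabilities are all illustrative assumptions.

```python
import math
import random
import statistics


def sample_lognormal(rng, median, error_factor):
    """Draw a probability from a lognormal given its median and error factor."""
    sigma = math.log(error_factor) / 1.645  # 1.645 = z-value of the 95th percentile
    return min(rng.lognormvariate(math.log(median), sigma), 1.0)


def system_unavailability(train_a, train_b, valve):
    """Hypothetical system: fails if both trains fail or the common valve fails."""
    return 1.0 - (1.0 - train_a * train_b) * (1.0 - valve)


def uncertainty_summary(n_samples=20000, seed=7):
    """Propagate parameter uncertainty and report mean, median, and bounds."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    samples = sorted(
        system_unavailability(
            sample_lognormal(rng, 1.0e-2, 3.0),    # train A, hypothetical
            sample_lognormal(rng, 1.0e-2, 3.0),    # train B, hypothetical
            sample_lognormal(rng, 1.0e-4, 10.0),   # common valve, hypothetical
        )
        for _ in range(n_samples)
    )
    return {
        "mean": statistics.fmean(samples),
        "median": samples[n_samples // 2],
        "p05": samples[int(0.05 * n_samples)],
        "p95": samples[int(0.95 * n_samples)],
    }


summary = uncertainty_summary()
for key in ("p05", "median", "mean", "p95"):
    print(f"{key:>6}: {summary[key]:.3e}")
```

Because the distributions are right-skewed, the mean lies above the median, so a decision based on a single point value depends strongly on which point value is quoted; reporting the bounds avoids that ambiguity.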

Sensitivity analysis is performed to evaluate the impact of various assumptions on the overall results of PRA. Apart from this, sensitivity analysis is also performed considering the range of uncertainty bounds for selected components/human actions, for which the uncertainty is relatively high.

Technical review is performed at two levels, viz. peer review and regulatory review. The data and assumptions that form input for the analysis are important aspects of this review. The peer review is performed by experts dealing in the respective areas, but not the individual or group involved in the analysis. The role of the probabilistic expert is to assess the model and data accuracy and overall representation of the system. The design and operation and maintenance aspects are checked and verified, keeping in view the overall objective of the change. It is also verified that the fundamental safety aspects such as defense in depth are not diluted, and therefore, no compromise in redundancy or diversity level takes place. It is also ensured that the provision of fail-safe criteria is not compromised.

Finally, this review includes aspects such as the assessment and implications of the system unavailability and CDF estimates and their comparison with the target goals. A checklist procedure is used to ensure that all the safety issues have been addressed. For example, for a given LOCA scenario, the major metrics that form the inputs for decision-making could include the following:

  1. To ensure that the provision of defense in depth is not diluted, the following are typical checkpoints:
     (a) Provision of redundancy meets the design intent.
     (b) Provision of diversity meets the design intent.
     (c) Fail-safe criteria are maintained and meet the design intent.
     (d) System success criteria for the applicable postulated scenario are met.
     (e) Provision exists to detect latent critical failures, or the likelihood of this event is low.
     (f) Alternative provisions still provide the same backup as before the change.
     (g) The results of in-service inspection and condition monitoring have not shown any symptom of significant degradation.
     (h) No new issues arise in respect of human actions and procedures.
     (i) The provisions for surveillance and prognostics are commensurate with the change.
  2. Probabilistic criteria
     (a) The data used for the analysis are from a plant-specific source.
     (b) The increase in system unavailability is, e.g., <5%.
     (c) The increase in CDF is, e.g., <1.0%.
     (d) Sensitivity analysis has not shown any significant impact of any single assumption.
     (e) The uncertainty bounds for the system unavailability and CDF are well within the acceptable range.
     (f) PRA attributes analysis provides adequate confidence in the PRA model, data, assumptions, and uncertainty in final results.
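The quantitative items in the probabilistic criteria lend themselves to a simple automated check. The sketch below uses the example limits quoted in the checklist (increase in system unavailability <5%, increase in CDF <1%); the baseline and post-change values, the dictionary keys, and the function names are hypothetical. The qualitative checkpoints still require engineering and regulatory judgment and are not represented here.

```python
def relative_increase(baseline, proposed):
    """Fractional increase of a risk metric relative to its baseline value."""
    return (proposed - baseline) / baseline


def check_probabilistic_criteria(baseline, proposed, limits):
    """Compare the relative increase of each metric against its limit."""
    report = {}
    for metric, limit in limits.items():
        increase = relative_increase(baseline[metric], proposed[metric])
        report[metric] = {"increase": increase, "accepted": increase < limit}
    return report


# Example limits from the checklist, expressed as fractions.
limits = {"system_unavailability": 0.05, "cdf": 0.01}

# Hypothetical results before and after the proposed change.
baseline = {"system_unavailability": 1.00e-3, "cdf": 2.00e-5}
proposed = {"system_unavailability": 1.03e-3, "cdf": 2.05e-5}

report = check_probabilistic_criteria(baseline, proposed, limits)
change_acceptable = all(item["accepted"] for item in report.values())

for metric, item in report.items():
    status = "meets" if item["accepted"] else "exceeds"
    print(f"{metric}: increase {item['increase']:.1%} {status} the limit")
print(f"all quantitative criteria met: {change_acceptable}")
```

In this hypothetical case the unavailability criterion is met but the CDF criterion is not, so the process would be revisited, in line with the iterative nature of the framework.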

After obtaining the regulatory clearance, it is management’s responsibility to ensure that the recommended changes are implemented. This means modifying the technical specifications, changing system procedures and schedules, and training the staff, wherever required. Management should ensure regulatory compliance in letter and spirit. Generating plans and procedures for follow-up and trend monitoring is an important part of a change implementation program. The follow-up program should ensure that adequate provision exists for prognostics and health management or a condition monitoring program for critical functions in the system. Finally, the activities related to documentation and records should comply with various communication protocols.