# Dynamic fault tree analysis of dynamic positioning system using Monte Carlo approach

## Abstract

A Dynamic Positioning System (DPS) is a critical component of marine vessels and floating platforms used in offshore operations. It is a computer-controlled system that automatically maintains a ship’s position and heading using thrusters fitted to the ship. The reliability of a DPS can be assessed through conventional fault tree analysis. However, complex behavior such as sequence-dependent failures, redundancy management and priority among failure events cannot be captured by conventional fault tree analysis. The Dynamic Fault Tree (DFT) addresses these shortcomings by defining additional dynamic gates. In this paper, a Monte Carlo simulation approach is adopted for the analysis of a DFT that models a DPS. It focuses on computing the Loss of Position (LOP) (or unavailability) of the vessel during operation, the standard measures of mean times to failure and repair, and the reliability of the DPS.

## Keywords

Dynamic positioning system, Dynamic fault tree, Monte Carlo simulation, Loss of position, Reliability, Unavailability

## Introduction

Oil and gas exploration, drilling and production in offshore areas are performed from floating structures or marine vessels. These floating systems use a Dynamic Positioning System (DPS) for station keeping during exploration, drilling, and production operations. A DPS consists of hull-mounted controllable thrusters. It runs continuously during drilling and production operations so that the floating system holds a particular position. The failure of the DPS leads to Loss of Position (LOP) of the floating system which, in addition to suspending operations, may put the environment, human lives, and assets at risk. The offshore operating environment is often extreme and hostile, and accidents during drilling operations have had severe consequences for health, safety, and the environment (Hauff 2014; Pedersen 2015). The motivation for Quantitative Risk Assessment (QRA) studies of the DPS is to ensure safer operations of floating systems. As the complexity of the DPS increases, its reliability becomes important (Sørensen 2011).

An important objective of QRA studies is the prediction of the reliability of a system under operation. Fault Tree Analysis (FTA) is an established technique for predicting the reliability of a system. Dynamic Fault Tree Analysis (DFTA), on the other hand, extends the scope of the conventional fault tree (Lee et al. 1985; Vincoli 2014) by incorporating time-dependent behavior such as repairs and dynamic dependencies (Dugan et al. 1992; Boudali et al. 2007; Rao et al. 2009). In this paper, a Dynamic Fault Tree (DFT) approach has been applied to a ‘floating system-DPS’ combination that is designed to maintain the position of the floating system.

The DFT is generally solved by using the Markov model approach, but in this approach the state space becomes too large as the number of gates increases (Rao et al. 2009; Zhang and Chan 2012; Chiacchio et al. 2011). The Monte Carlo approach has been applied to DFTs of high-reliability systems such as a relay protection system, the phasor measurement unit of a smart grid (Zhang and Chan 2012), and the electrical power supply system of a nuclear power plant and its reactor regulation system (Rao et al. 2009).

The system ‘unavailability’ over time is an important measure of the robustness (and hence reliability) of a system; in a DPS-supported floating system, the system ‘unavailability’ is essentially the Loss of Position (LOP) of the system. The system’s LOP (i.e. unavailability) is computed in the present work, together with system measures such as Mean Time To Failure (MTTF), Mean Time To Repair (MTTR) and reliability. The methodology adopted for DFTA is Monte Carlo simulation, which is based on random generation of failure and repair times for the components of the DPS-integrated floating system.

## System definition, data, and its DFT

- (i) If the SCU fails followed by failure of TCU.
- (ii) If both TCU and MLU fail.
- (iii) If the TCU fails followed by failure of SCU.

The PAND gate in Fig. 3 has SCU and TCU as input events with SCU as the priority event (located left of the non-priority event TCU by convention). It implies that the gate output (see Fig. 1) occurs if both SCU (A) and TCU (B) fail and if SCU fails before or at the same time as TCU. This covers case (i) listed above.

The SPARE gate in Fig. 3 has TCU and MLU as input events. Of these two events, TCU (A) is the primary event and MLU (B) is a spare (or backup) event. The output of the gate occurs when both the primary event (TCU) and the spare (or backup) event (MLU) fail. Therefore, the LOP scenarios handled by this gate are those of case (ii) and case (iii) listed above.

The FDEP gate in Fig. 3 has SCU as input, and the output connects to MLU. This gate has a trigger input event, which is SCU (T) in the present DFT and a dependent event, which is MLU (A). It implies that when the trigger event occurs, the dependent basic event is forced to occur. In other words, if SCU (trigger event) fails, MLU also fails. The case where the failure of SCU forces the failure of MLU, i.e. case (iv), is handled by this gate.
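The gate semantics described above can be illustrated with a minimal sketch that works on a single failure time per event. It is a simplification under stated assumptions, not the model used in the paper: repairs are ignored, the spare is treated as a hot spare (the source does not state the spare type), and the function names `pand`, `spare` and `fdep` as well as the failure times are hypothetical.

```python
import math

NEVER = math.inf  # sentinel: the event does not occur


def pand(t_a: float, t_b: float) -> float:
    """Priority-AND: output occurs at t_b only if A fails before or with B."""
    return t_b if t_a <= t_b else NEVER


def spare(t_primary: float, t_spare: float) -> float:
    """SPARE (assumed hot spare): output occurs once both inputs have failed."""
    return max(t_primary, t_spare)


def fdep(t_trigger: float, t_dependent: float) -> float:
    """FDEP: the dependent event is forced to occur when the trigger occurs."""
    return min(t_trigger, t_dependent)


# Hypothetical failure times in hours (illustration only)
t_scu, t_tcu, t_mlu = 500.0, 800.0, 5000.0

t_mlu_effective = fdep(t_scu, t_mlu)      # SCU failure forces MLU failure
print(pand(t_scu, t_tcu))                 # PAND output time (SCU before TCU)
print(spare(t_tcu, t_mlu_effective))      # SPARE output time (TCU and MLU both failed)
```

With these numbers the SCU fails first, so the PAND output occurs when the TCU fails, and the forced MLU failure means the SPARE output also occurs at the TCU failure time.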

MTTF and MTTR data of basic events

Basic event | MTTF (yr) | MTTR (h) |
---|---|---|
TCU | 1.0959 | 100 |
SCU | 2.2831 | 100 |
MLU | 40.05 | 100 |

## Monte Carlo approach to DFT

A real-life process can be modeled mathematically by using statistics in a Monte Carlo analysis. In reliability analysis, the failure process of each basic event follows a particular distribution (usually assumed exponential). The time to failure and the time to repair of every basic event are sampled randomly from the cumulative distribution functions of the failure and repair processes of that event. Once these times are available for the basic events, they are combined according to the logic of the gate to which the events are connected to obtain the failure and repair timeline of the next event. This process is continued across all higher-level gates to compute the failure and repair timeline of the Top Event (TE). This constitutes a single iteration of the DFT. Monte Carlo simulation is based on a large number of realizations of the random process, resulting in a generalized random walk through the system that resembles the real process. It yields asymptotically convergent estimates of all numerical characteristics of the random process under consideration. A further major advantage of the Monte Carlo approach is its ability to model a variety of probability distribution functions for a given basic event, e.g. Weibull or lognormal (Chiacchio et al. 2011).

## DFT analysis using Monte Carlo approach

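The probability density function (PDF) of the failure process of a basic event, *f*_{F}(*t*), where *t* is the time to failure (TTF) of the event, is typically assumed to be an exponential distribution:

\( f_F(t) = \lambda e^{-\lambda t} \)  (2)

where *λ* is the failure rate of the event (*λ* = 1/MTTF). The corresponding cumulative distribution function (CDF) *F*_{F}(*t*) is given by

\( F_F(t) = 1 - e^{-\lambda t} \)  (3)

In the simulation, a value of *F*_{F}(*t*) is generated randomly in every iteration and the corresponding time *t* is obtained from (3) as

\( t = -\frac{1}{\lambda}\ln\left(1 - F_F(t)\right) \)  (4)

The PDF of the repair process, *f*_{R}(*t*), where *t* is the time to repair (TTR), is also assumed to be an exponential distribution as TTF (see Eq. (2)) and the corresponding CDF is denoted *F*_{R}(*t*) in a similar manner. With *μ* denoting the repair rate (*μ* = 1/MTTR), the expressions are:

\( f_R(t) = \mu e^{-\mu t} \)  (5)

\( F_R(t) = 1 - e^{-\mu t} \)  (6)

\( t = -\frac{1}{\mu}\ln\left(1 - F_R(t)\right) \)  (7)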
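The sampling of TTF and TTR from exponential distributions can be sketched with inverse-transform sampling: a uniform random number in [0, 1) is taken as the CDF value *F*, and the time follows as *t* = −(mean) · ln(1 − *F*). The sketch below uses the TCU data (MTTF = 1.0959 yr, MTTR = 100 h); the helper name `sample_exponential`, the seed and the conversion of 8760 h per year are assumptions made for illustration.

```python
import math
import random

HOURS_PER_YEAR = 8760.0  # assumption: 365-day year


def sample_exponential(mean: float, rng: random.Random) -> float:
    """Invert F(t) = 1 - exp(-t / mean) for a uniform random F in [0, 1)."""
    u = rng.random()
    return -mean * math.log(1.0 - u)


rng = random.Random(12345)              # fixed seed, only for repeatability
mttf_tcu_h = 1.0959 * HOURS_PER_YEAR    # TCU MTTF converted to hours
mttr_tcu_h = 100.0                      # TCU MTTR in hours

ttf = [sample_exponential(mttf_tcu_h, rng) for _ in range(100_000)]
ttr = [sample_exponential(mttr_tcu_h, rng) for _ in range(100_000)]

# The sample means should approach the input MTTF and MTTR.
print(sum(ttf) / len(ttf) / HOURS_PER_YEAR, "yr")
print(sum(ttr) / len(ttr), "h")
```

Other distributions (e.g. Weibull) could be substituted by replacing the inversion formula inside the sampler while leaving the rest unchanged.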

For all basic events, i.e. TCU, SCU and MLU, MTTF > MTTR. The largest MTTF among the three events is that of MLU, about 40 yr. Therefore, every event is likely to fail at least once if the simulation (or mission) time *T*_{S} ≥ 40 yr ≈ 3.5 × 10^{5} h. In Monte Carlo simulation, where the system MTTF is obtained by averaging over many failures, the simulation time must be long enough to admit many event failures. For example, if about 1000 failures are deemed sufficient, the simulation time will be 1000 × 3.5 × 10^{5} h, or *T*_{S} ≈ 3.5 × 10^{8} h. The convergence of the system parameters (e.g. MTTF, MTTR) must be established to select a reasonable value of *T*_{S}.

Starting from *t* = 0, the successive TTF_{j} (*j* = 1 to *n*) and TTR_{j} (*j* = 1 to *n* − 1) are marked such that Eq. (8) is satisfied, where *n* is the number of failures in *T*_{S}. The downtime (*t*_{d,j}) and uptime (*t*_{u,j}) instances are marked in this diagram such that they are related to TTF_{j} and TTR_{j} through Eqs. (9) and (10).
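The construction of one basic-event timeline can be sketched as follows. This is an illustration under assumptions rather than the authors' implementation: failures and repairs are taken to alternate from *t* = 0 until the simulation time *T*_{S} is reached, each downtime instant is the running sum of all preceding TTF and TTR values, and each uptime instant adds the following TTR. The function names and the example inputs (MTTF = 9600 h, MTTR = 100 h, *T*_{S} = 10^{6} h) are hypothetical.

```python
import math
import random


def simulate_timeline(mttf: float, mttr: float, t_s: float, rng: random.Random):
    """Return (down_times, up_times) for one basic event over [0, t_s].

    down_times[j] is the instant of a failure; up_times[j] is the instant
    at which the repair of that failure completes.
    """
    down_times, up_times = [], []
    t = 0.0
    while True:
        t += -mttf * math.log(1.0 - rng.random())   # sampled time to failure
        if t >= t_s:
            break
        down_times.append(t)
        t += -mttr * math.log(1.0 - rng.random())   # sampled time to repair
        if t >= t_s:
            break
        up_times.append(t)
    return down_times, up_times


def durations(down_times, up_times):
    """Recover the TTF and TTR values from the marked instants."""
    starts = [0.0] + up_times                        # each uptime period starts here
    ttf = [d - s for d, s in zip(down_times, starts)]
    ttr = [u - d for u, d in zip(up_times, down_times)]
    return ttf, ttr


rng = random.Random(1)
down, up = simulate_timeline(9600.0, 100.0, 1.0e6, rng)
ttf, ttr = durations(down, up)
print(len(down), "failures,", len(up), "completed repairs")
```

The number of completed repairs is either equal to the number of failures or one less, depending on whether the last repair finishes before *T*_{S}.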

The objective of DFTA is to obtain MTTF and MTTR of the top event (TE) which is LOP. The algorithm is presented below.

1. Let the iteration number be *k*. Begin with *k* = 1.
2. Consider the basic event *X*_{i} (*i* = 1 for TCU, 2 for SCU and 3 for MLU). Start with *i* = 1.
   - (a) Generate two random numbers between 0 and 1; these represent *F*_{F}(*t*) and *F*_{R}(*t*), since 0 ≤ *F*_{F}(*t*), *F*_{R}(*t*) ≤ 1. Find TTF_{1} (= *t*) from Eq. (4) and TTR_{1} (= *t*) from Eq. (7).
   - (b) Repeat step (a) to get ‘TTF_{2} and TTR_{2}’, ‘TTF_{3} and TTR_{3}’, and so on up to ‘TTF_{n} and TTR_{m}’, until Eq. (8) is satisfied.
   - (c) Obtain the down and uptimes from Eq. (10). Thus we have the vectors \( t_{d,j}^{i} = \langle t_{d,1}, t_{d,2}, \ldots, t_{d,n} \rangle^{T} \) and \( t_{u,j}^{i} = \langle t_{u,1}, t_{u,2}, \ldots, t_{u,n} \rangle^{T} \) for the basic event *X*_{i}.
3. Repeat step 2 for the remaining (*i* = 2 and 3) basic events.
4. From the uptime and downtime vectors of the basic events, construct the output uptime and downtime vectors based on gate type as shown in Fig. 5 and, following the DFT connectivity, obtain the uptime and downtime vectors for the top event (TE). From these vectors, compute TTF_{j} and TTR_{j} for the top event using Eq. (9). Then, using the basic definition, compute the MTTF and MTTR of the top event at the *k*-th iteration, Eq. (12). The unavailability *U* is then computed from its definition.
5. Check convergence of the MTTF and MTTR of TE in Eq. (12) at iteration *k* by comparing them with those at iteration *k* − 1 to any reasonable degree of accuracy. In the present calculations, the accuracy of MTTF is approximately 100 h, and that of MTTR is about 0.0001 h.

On convergence, set *N* = *k* and obtain the system parameters (the MTTF and MTTR of the top event). The LOP (or unavailability *U*) follows from these, as do the system failure rate *λ*_{S} and the system reliability *R* at the end of time *t*.
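The final steps of the algorithm can be sketched as follows. The formulas used here are standard definitions taken as assumptions, since the original expressions are not reproduced in this text: unavailability *U* = MTTR / (MTTF + MTTR), system failure rate *λ*_{S} = 1/MTTF, and reliability *R*(*t*) = exp(−*λ*_{S} *t*). The convergence test on the running mean is likewise one plausible reading of the comparison between iterations *k* and *k* − 1. Function names are hypothetical.

```python
import math

HOURS_PER_YEAR = 8760.0  # assumption: 365-day year


def running_mean_converged(values, tol):
    """True when the running mean changes by less than tol at the last iteration."""
    if len(values) < 2:
        return False
    previous = sum(values[:-1]) / (len(values) - 1)
    current = sum(values) / len(values)
    return abs(current - previous) < tol


def system_measures(mttf_h: float, mttr_h: float, t_h: float):
    """Unavailability, failure rate (per hour) and reliability at time t_h (hours)."""
    unavailability = mttr_h / (mttf_h + mttr_h)
    failure_rate = 1.0 / mttf_h
    reliability = math.exp(-failure_rate * t_h)
    return unavailability, failure_rate, reliability


# Converged top-event values reported in the results (MTTF in yr, MTTR in h)
mttf_h = 200.2055 * HOURS_PER_YEAR
mttr_h = 50.0004

u, lam_s, r_10yr = system_measures(mttf_h, mttr_h, 10.0 * HOURS_PER_YEAR)
print(f"LOP    = {u * HOURS_PER_YEAR:.3f} h/yr")
print(f"R(10y) = {100.0 * r_10yr:.2f} %")
```

Under these assumed definitions the reported MTTF and MTTR give an LOP of roughly 0.25 h/yr and a 10-year reliability of about 95.1%, in line with the tabulated results.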

## Results and discussion

The variation of the system MTTF with *T*_{S} is shown in Fig. 6, that of MTTR in Fig. 7 and that of LOP in Fig. 8. It is clear that for convergence, one needs to adopt a *T*_{S} value of at least 10^{8} h. All calculations are performed with *T*_{S} = 10^{9} h. Fig. 9 shows the convergence of the system MTTF with iterations, showing that about 8000 iterations are sufficient (*N* ≈ 8000) for converged results. The number of failures (*n*, see Eq. 8) for various values of *T*_{S} is shown in Fig. 10, showing, as expected, a linear relation for ‘sufficiently large’ values of *T*_{S}.

Results of Monte Carlo Simulation of DFT (Simulation time (*T*_{S}) = 10^{9} h, No. of iterations (*N*) = 10000)

Parameter | Value |
---|---|
LOP | 0.25 h/yr |
MTTF | 200.2 yr |
MTTR | 50 h |
Reliability after 1 yr | 99.5% |
Reliability after 10 yr | 95.13% |

MTTF_{3} (i.e. MTTF of MLU) is overwhelmingly larger (≈ 40 yr) than MTTF_{1} (≈ 1.1 yr) and MTTF_{2} (≈ 2.3 yr). Arguably, it can be subject to a large uncertainty as well. Thus, to establish the dependence of the system MTTF (i.e. MTTF^{(TE)}) on MTTF_{3}, DFT analyses have been carried out for three more values of MTTF_{3}, namely 10 yr, 20 yr and 30 yr. The results are summarised in Table 3, and the variation of the system MTTF as a function of the MTTF of MLU is shown in Fig. 12.

System parameters as functions of MTTF of MLU (Simulation time (*T*_{S}) = 10^{9} h, No. of iterations (*N*) = 10000)

MTTF (MLU) (yr) | MTTF (yr) | MTTR (h) | Reliability after 10 years (%) | LOP (h/yr) |
---|---|---|---|---|
40 | 200.2055 | 50.0004 | 95.13 | 0.2502 |
30 | 193.6095 | 49.9875 | 94.97 | 0.2586 |
20 | 181.5404 | 50.0093 | 94.64 | 0.2759 |
10 | 153.1152 | 50.0039 | 93.68 | 0.327 |
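The reliability column of this table can be cross-checked against the system MTTF column. The sketch below assumes a constant system failure rate, i.e. *R*(*t*) = exp(−*t* / MTTF), which is an assumption about how the reliability was obtained rather than a statement of the authors' procedure; the row values are copied from the table.

```python
import math

# (MTTF of MLU in yr, system MTTF in yr, tabulated reliability after 10 yr in %)
rows = [
    (40, 200.2055, 95.13),
    (30, 193.6095, 94.97),
    (20, 181.5404, 94.64),
    (10, 153.1152, 93.68),
]


def reliability_percent(system_mttf_yr: float, t_yr: float) -> float:
    """Reliability in percent at time t_yr, assuming a constant failure rate."""
    return 100.0 * math.exp(-t_yr / system_mttf_yr)


for mlu_mttf, system_mttf, tabulated in rows:
    computed = reliability_percent(system_mttf, 10.0)
    print(f"MLU MTTF {mlu_mttf:>2} yr: computed {computed:.2f} %, tabulated {tabulated:.2f} %")
```

For all four rows the computed value agrees with the tabulated one to within rounding, which suggests that the exponential relation is a reasonable reading of the reported reliabilities.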

## Conclusion

The dynamic positioning system is one of the most crucial subsystems in a floating offshore platform because the LOP of the platform can lead to either a catastrophic event or serious economic loss due to the suspension of operation. This paper proposes DFTA as a method to compute the system unavailability (LOP) and key system parameters such as MTTF, MTTR, and reliability. This approach models a ‘real’ system better because it can account for the timing of failure, the redundancies available in the system and the functional dependencies of the events of the fault tree, which cannot be considered in conventional FTA. The Monte Carlo approach to DFTA is indeed very powerful and flexible because it can easily accommodate any failure distribution. It also overcomes many problems of analytical approaches such as Markov models, Bayesian belief networks etc. where the complexity of the problem is directly related to the number of events in the fault tree.

The extension and future scope of this work lie in building DFT models for each of the units involved in the DPS and interlinking them. Furthermore, this approach can take into account the uncertainties present in the failure rate data, which will aid practical engineering decisions. Such DFT studies can also help in system design by identifying the weakest link of the modeled system and introducing appropriate redundancies to prevent failure.

## References

- ABS (2013) Guide for dynamic positioning systems
- Boudali H, Crouzen P, Stoelinga M (2007) Dynamic fault tree analysis using input/output interactive Markov chains. DSN '07 proceedings of the 37th annual IEEE/IFIP international conference on dependable systems and networks, 708–717
- Chiacchio F, Compagno L, D'Urso D, Manno G, Trapani N (2011) Dynamic fault trees resolution: a conscious trade-off between analytical and simulative approaches. Reliab Eng Syst Saf 96(11):1515–1526
- Coppit D (2003) Engineering modeling and analysis: sound methods and effective tools. PhD thesis, The University of Virginia
- Desai N (2015) Dynamic positioning: method for disaster prevention and risk management. Procedia Earth Planet Sci 11:216–223
- Drori G (2015) Underlying causes of mooring lines failures across the industry, BP Systems. http://mcedd.com/wp-content/uploads/2014/04/00_Guy-Drori-BP.pdf
- Dugan JB, Bavuso SJ, Boyd MA (1992) Dynamic fault tree models for fault-tolerant computer systems. IEEE Trans Reliab 41(3):363–377
- Fussel J, Aber E, Rahl R (1976) On the quantitative analysis of priority-AND failure logic. IEEE Trans Reliab R25(5):324–326
- Hauff KS (2014) Analysis of loss of position incidents for dynamically operated vessels. Master thesis, Norwegian University of Science and Technology
- Lee WS, Grosh DL, Tillman FA, Lie CH (1985) Fault tree analysis, methods, and applications: a review. IEEE Trans Reliab 34(3):194–203
- Pedersen RN (2015) QRA techniques on dynamic positioning systems during drilling operations in the Arctic: with emphasis on the dynamic positioning operator. Master’s thesis, The Arctic University of Norway
- Phillips D, Stanbery R, Weisinger D (1997). Marine Technology Society https://dynamic-positioning.com/proceedings/dp1997/249_reliability_shatto_phillips.pdf
- Rao KD, Gopika V, Rao VVSS, Kushwaha HS, Verma AK, Srividya A (2009) Dynamic fault tree analysis using Monte Carlo simulation in probabilistic safety assessment. Reliab Eng Syst Saf 94(4):872–883
- Sørensen AJ (2011) A survey of dynamic positioning control systems. Annu Rev Control 35(1):123–136
- Spouge J (2004) Review of methods for demonstrating redundancy in dynamic positioning systems for the offshore industry. DNV consulting for HSE, ISBN 0 7176 2814 0
- Vincoli JW (2014) Fault tree analysis. Basic guide to system safety, Third Edition
- Zhang P, Chan KW (2012) Reliability evaluation of phasor measurement unit using Monte Carlo dynamic fault tree method. IEEE Transactions on Smart Grid 3(3):1235–1243